๐ฅ Silver layer โ clean, conform, validate¶
Deep-dive companion to Medallion architecture with ELTMaestro. Silver turns raw Bronze into clean, conformed, deduplicated, validated data that the business can trust. Everything here should be idempotent and re-runnable.
Cleaning & shaping¶
| Need | Step | Notes |
|---|---|---|
| Filter rows | Filter |
Drop/keep rows by a condition expression. |
| Deduplicate | Dedupe |
Remove duplicates by key โ define the dedupe key carefully (it decides which row wins). |
| Derive / clean columns | Function / Function2 |
Per-row scalar (F(x)) and aggregate (ฮฃ(x)) expressions in the expression builder; functions come from the function catalog. |
| Mask PII | Datamask |
Redact/obfuscate sensitive columns as data enters the trusted zone. |
| Re-expose | View, Pivot |
Logical views / pivot reshaping. |
Conforming & integrating¶
| Need | Step |
|---|---|
| Join sources | Join (column mapping + join condition in the expression builder) |
| Combine sets | Union, Minus |
| Generic transform | Transformer, SparkDataCache (cache an intermediate for reuse) |
Conform to canonical types and keys here โ cast the raw Bronze strings to real types, standardize codes, and resolve natural keys so every downstream join lines up.
Historization (slowly-changing dimensions)¶
| Step | Use |
|---|---|
SCD1 |
Overwrite-in-place โ keep only the current value (no history). |
SCD2 |
Versioned history โ new row per change, with effective dating. |
SCD steps detect change via the platform's hash template ($HASH_TEMPLATE / $HASH_RETURNS / $HASH_EXPRESSION, e.g. NVL(MD5($ARG::TEXT),'')) โ only rows whose hash changed are versioned, so reruns are stable.
Upsert / merge into Silver¶
| Step | Pattern |
|---|---|
OnstageDelta |
Apply a staged delta to the Silver table. |
Onstage group loaders + isCDC |
Parallel delete-then-insert merge via $QUERY_CDC_DELETE + $QUERY_CDC_INSERT (the engine applies it through its internal parallel JDBC processor). |
These are the idempotent merge primitives โ re-applying the same delta converges to the same Silver state.
The quality gate (the heart of Silver)¶
The Bronze โ Silver promotion is where you enforce trust:
| Step / feature | Role |
|---|---|
ControlTest |
Assert invariants before promoting โ row counts, key uniqueness, referential integrity, value ranges. |
| Data Quality Management (Administration โธ Metrics Configuration) | Define and manage reusable DQ rules. |
WriteIncident |
Quarantine failing rows to an incident table instead of dropping them or letting them poison Silver. |
SQLMetrics |
Capture per-run metrics (counts, timings) for observability. |
MetaProfile / column profiler |
Profile column distributions to catch drift. |
Quarantine, don't drop. Route rejects through
WriteIncidentso bad data is auditable and fixable โ silently dropping rows hides upstream problems.
Incremental Silver¶
Drive Silver from a Watermark so each run only reprocesses new/changed Bronze rows, then merge with OnstageDelta / SCD. This keeps Silver cheap and idempotent.
Gotchas¶
- Dedupe key correctness โ the wrong key silently keeps the wrong row; verify with a
ControlTeston uniqueness. - SCD2 effective-dating โ make sure change detection (hash) covers exactly the tracked columns, or you'll over- or under-version.
- Idempotency โ a rerun must converge; rely on watermarks + upsert/SCD, never blind inserts.
Next¶
Build marts in ๐ฅ Gold โ aggregate & serve. Back to the Medallion overview ยท ๐ฅ Bronze.