Tag

dbt

4 posts tagged “dbt”.

Playbook · Data validation · May 21, 2026 · 28 min read

Data validation in the lakehouse

Most data pipelines fail silently when a source schema drifts. dbt tests run AFTER the model: they catch the broken state, but they do not prevent it from being written. We wire Great Expectations as the OSS validation engine on every engagement, with a clear-eyed view of where it shines, where it doesn't, what we are NOT doing after GX Cloud's May-2026 shutdown announcement, and which alternatives (Soda, Pandera, dbt-native tests, Elementary) we layer alongside it. The post includes the current GX 1.x Fluent-API code, the integration patterns that actually work in production, the real performance bottlenecks (with citations), the competitive landscape (GX vs Soda vs Pandera vs Anomalo vs Monte Carlo vs Bigeye), and the anti-patterns we audit in client engagements.
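The difference between catching a broken state and preventing it can be sketched in a few lines. This is a minimal illustration in plain Python, not the Great Expectations API and not the code from the post; `EXPECTED_SCHEMA`, `check_schema`, `write_if_valid`, and the in-memory `sink` list are all hypothetical names chosen for the example. The point is only the ordering: the batch is checked before anything is written.

```python
# Hypothetical expected schema for an incoming batch: column name -> Python type.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "currency": str}


class SchemaDriftError(Exception):
    """Raised when a batch does not match the expected schema."""


def check_schema(rows, expected):
    """Return a list of human-readable problems; an empty list means the batch is clean."""
    problems = []
    for i, row in enumerate(rows):
        missing = expected.keys() - row.keys()
        extra = row.keys() - expected.keys()
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            problems.append(f"row {i}: unexpected columns {sorted(extra)}")
        for col, typ in expected.items():
            if col in row and not isinstance(row[col], typ):
                problems.append(
                    f"row {i}: {col} is {type(row[col]).__name__}, expected {typ.__name__}"
                )
    return problems


def write_if_valid(rows, expected, sink):
    """Validate first, write second: a drifted batch never reaches the sink."""
    problems = check_schema(rows, expected)
    if problems:
        raise SchemaDriftError("; ".join(problems))
    sink.extend(rows)  # stand-in for the real table write
    return len(rows)
```

A test that runs after the model would see the drifted rows already in the table. Here the write is skipped entirely when `check_schema` reports a problem, so the sink keeps its last known-good state.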


Playbook · Reference engagement · May 21, 2026 · 29 min read

From a fresh AWS account to dashboards and AI chat — a TwiceData engagement walked end-to-end.

The canonical TwiceData engagement: the customer starts with a freshly provisioned AWS account and the vendor's Postgres database. Twelve weeks later they have an Iceberg lakehouse on S3, dbt-modeled metrics, a governed semantic layer, Looker dashboards their team designs, AND an AI chat surface that answers natural-language questions on the same data. This post walks through every layer of the build (the choices, the tradeoffs, the seams between layers) and the day-91 handoff where you keep the keys.


Deep dive · Dimensional modeling · May 21, 2026 · 37 min read

Slowly Changing Dimensions — a diagnostic walkthrough of all eight types, the hybrids, and when to build your own.

Most data teams default to SCD Type 2 because it's the only pattern they remember from Kimball. There are eight types, three modern variants, and three hybrid systems — and the right one for your pipeline is determined by signals in your incoming data, not by tradition. This article walks the diagnostic loop end-to-end: identify the data pattern, identify the query need, pick the type (or compose a hybrid), implement in dbt + Iceberg. Every type gets its own worked example.


Engineering · May 14, 2026 · 2 min read

A governed ARR rollup in 47 lines of dbt.

The exact dbt model we drop into mid-market SaaS engagements — the four contract types it normalizes, the three tests that gate it, and the lineage hook that keeps it honest.
