dbt’s power comes from its composability — models reference other models, tests validate the graph, and metadata from one project feeds downstream consumers. Advanced dbt involves exposures to track dashboards powered by your models, packages to share logic across projects, the semantic layer for metric consistency, and data contracts to guarantee interface stability. Claude Code implements all of these and builds CI workflows that only test what changed.
## CLAUDE.md for Advanced dbt Projects
```markdown
## dbt Project Structure
- Version: dbt-core 1.8+
- Adapter: dbt-snowflake (or dbt-bigquery)
- Packages: dbt-utils, dbt-expectations, dbt-audit-helper
- Repository structure: single project for analytics, shared package in separate repo
- Staging models: 1:1 with source tables — no transformations beyond type casting
- Intermediate: joins and filtering logic
- Marts: business-ready, used by BI tools and downstream applications
- Semantic layer: metrics defined in _metrics.yml, exposed via dbt Cloud Semantic Layer API
- Contracts: core mart models have enforced contracts (column types + constraints)
```
## Exposures: Tracking Downstream Usage
```yaml
# models/marts/_exposures.yml
version: 2

exposures:
  - name: revenue_dashboard
    label: "Revenue Dashboard"
    type: dashboard
    maturity: high  # high = production, medium = beta, low = experimental
    url: https://looker.mycompany.com/dashboards/42
    description: "Executive dashboard showing daily and weekly revenue metrics."
    owner:
      name: Revenue Analytics Team
      email: [email protected]
    # Which models does this exposure depend on?
    depends_on:
      - ref('fct_orders')
      - ref('dim_customers')
      - ref('fct_revenue_daily')
    tags: ['finance', 'executive']

  - name: fraud_detection_model
    label: "Fraud Detection ML Model"
    type: ml
    maturity: medium
    description: "ML model using transaction features to predict fraud probability."
    owner:
      name: ML Platform Team
      email: [email protected]
    depends_on:
      - ref('fct_transactions')
      - ref('int_customer_risk_features')
```
Exposures let you run `dbt ls --select +exposure:revenue_dashboard` to list every model that feeds a dashboard, and to alert stakeholders when upstream models change.
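Under the hood, each exposure lands in `manifest.json` with a `depends_on.nodes` list, which is what selectors like `+exposure:NAME` walk. Here is a minimal Python sketch of that traversal — the manifest dict below is a hand-built, heavily simplified stand-in for dbt's real artifact:

```python
# Walk an exposure's upstream lineage the way +exposure:NAME does.
# The manifest here is a simplified stand-in for dbt's manifest.json,
# which has many more fields per node.

def upstream_models(manifest: dict, exposure_id: str) -> set[str]:
    """Return every node reachable upstream of an exposure."""
    seen: set[str] = set()
    stack = list(manifest["exposures"][exposure_id]["depends_on"]["nodes"])
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        node = manifest["nodes"].get(node_id)
        if node:  # sources have no upstream nodes in this sketch
            stack.extend(node["depends_on"]["nodes"])
    return seen

manifest = {
    "exposures": {
        "exposure.analytics.revenue_dashboard": {
            "depends_on": {"nodes": ["model.analytics.fct_orders"]},
        },
    },
    "nodes": {
        "model.analytics.fct_orders": {
            "depends_on": {"nodes": ["model.analytics.stg_orders"]},
        },
        "model.analytics.stg_orders": {"depends_on": {"nodes": []}},
    },
}

print(sorted(upstream_models(manifest, "exposure.analytics.revenue_dashboard")))
# ['model.analytics.fct_orders', 'model.analytics.stg_orders']
```

The real selector resolves sources and metrics too; the point is that exposure lineage is just a graph walk over `depends_on` edges.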
## dbt Packages
```yaml
# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.1.0", "<2.0.0"]
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<0.11.0"]
  # Internal shared package from your own repo
  - git: "https://github.com/myorg/dbt-shared-macros.git"
    revision: v2.1.0
```
```sql
-- Using dbt_utils macros
-- models/marts/fct_orders.sql
with orders as (
    select * from {{ ref('int_orders__enriched') }}
),

final as (
    select
        {{ dbt_utils.generate_surrogate_key(['order_id', 'customer_id']) }} as order_sk,
        order_id,
        customer_id,
        -- Null-safe division: returns null instead of erroring on divide-by-zero
        {{ dbt_utils.safe_divide('revenue_cents', 'item_count') }} as avg_item_price_cents,
        -- Cross-database date truncation (moved from dbt_utils to the dbt namespace)
        {{ dbt.date_trunc('week', 'ordered_at') }} as order_week,
        total_cents,
        status,
        ordered_at
    from orders
)

select * from final
```
```yaml
# Using dbt_expectations for data quality
# models/marts/fct_orders.yml
version: 2

models:
  - name: fct_orders
    columns:
      - name: total_cents
        tests:
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 10000000  # $100k max
          - dbt_expectations.expect_column_quantile_values_to_be_between:
              quantile: 0.95
              min_value: 0
              max_value: 500000  # 95th percentile < $5k
      - name: status
        tests:
          - dbt_expectations.expect_column_values_to_be_in_set:
              value_set: ['PENDING', 'CONFIRMED', 'SHIPPED', 'DELIVERED', 'CANCELLED']
          - dbt_expectations.expect_column_proportion_of_unique_values_to_be_between:
              min_value: 0.0001  # At least some variety
```
## Data Contracts
```yaml
# models/marts/fct_orders.yml — enforce schema contract
version: 2

models:
  - name: fct_orders
    config:
      contract:
        enforced: true  # dbt will fail if the model doesn't match this spec
    columns:
      - name: order_sk
        data_type: varchar
        constraints:
          - type: not_null
          - type: primary_key
      - name: order_id
        data_type: varchar
        constraints:
          - type: not_null
          - type: unique
      - name: total_cents
        data_type: bigint
        constraints:
          - type: not_null
      - name: status
        data_type: varchar
        constraints:
          - type: not_null
      - name: ordered_at
        data_type: timestamp
        constraints:
          - type: not_null
```
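Conceptually, contract enforcement is a schema diff: dbt compiles the model, compares the resulting columns against the contracted spec, and fails the build on any mismatch. A rough Python sketch of that comparison — this mirrors the idea, not dbt's actual implementation:

```python
# Compare a contracted spec against the columns a model actually produced.
# Illustrative only — dbt performs this check against the warehouse's
# reported types at build time.

def contract_violations(spec: dict[str, str], actual: dict[str, str]) -> list[str]:
    """spec/actual map column name -> data type; return human-readable diffs."""
    problems = []
    for col, dtype in spec.items():
        if col not in actual:
            problems.append(f"missing column: {col}")
        elif actual[col] != dtype:
            problems.append(f"type mismatch on {col}: {actual[col]} != {dtype}")
    for col in actual:
        if col not in spec:
            problems.append(f"undeclared column: {col}")
    return problems

spec = {"order_sk": "varchar", "total_cents": "bigint", "ordered_at": "timestamp"}
actual = {"order_sk": "varchar", "total_cents": "numeric", "status": "varchar"}
for p in contract_violations(spec, actual):
    print(p)
# type mismatch on total_cents: numeric != bigint
# missing column: ordered_at
# undeclared column: status
```

Any non-empty result would correspond to a failed `dbt run` on a contracted model.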
## Semantic Layer: Metrics
```yaml
# models/_metrics.yml — define metrics once, use everywhere
semantic_models:
  - name: orders
    description: "Order transactions"
    model: ref('fct_orders')
    defaults:
      agg_time_dimension: ordered_at  # required for aggregating measures over time
    entities:
      - name: order
        type: primary
        expr: order_sk
      - name: customer
        type: foreign
        expr: customer_id
    dimensions:
      - name: status
        type: categorical
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: order_count
        agg: count
        expr: order_sk
      - name: revenue
        agg: sum
        expr: total_cents
      - name: avg_order_value
        agg: average
        expr: total_cents

metrics:
  - name: revenue
    label: "Revenue"
    type: simple
    type_params:
      measure: revenue

  - name: monthly_revenue
    label: "Monthly Revenue"
    type: simple
    type_params:
      measure: revenue
    filter: |
      {{ TimeDimension('order__ordered_at', 'month') }} = date_trunc('month', current_date)

  # Week-over-week growth needs an offset, which ratio metrics don't support —
  # use a derived metric with offset_window instead
  - name: revenue_growth_wow
    label: "Revenue Growth Week-over-Week"
    type: derived
    type_params:
      expr: (revenue - revenue_prev_week) / nullif(revenue_prev_week, 0)
      metrics:
        - name: revenue
        - name: revenue
          offset_window: 1 week
          alias: revenue_prev_week
```
## CI: Slim CI with State
```yaml
# .github/workflows/dbt-ci.yml
# Only build and test changed models and their children
- name: Get dbt state from production
  run: |
    # Fetch the production manifest to compare against
    dbt-cloud run list --job-id $PROD_JOB_ID --limit 1 --output json \
      | jq -r '.data[0].id' > /tmp/run_id.txt
    mkdir -p /tmp/prod-state
    dbt-cloud run get-artifact --run-id $(cat /tmp/run_id.txt) \
      --path manifest.json --output /tmp/prod-state/manifest.json

- name: Install packages
  run: dbt deps

- name: Build changed models only
  run: |
    # state:modified+ = changed models plus their downstream children;
    # --defer reads unmodified upstream dependencies from production
    dbt build \
      --select state:modified+ \
      --state /tmp/prod-state \
      --defer

    # Also run tests downstream of sources with fresher data
    # (requires sources.json from a source freshness run in the state dir)
    dbt test \
      --select source_status:fresher+ \
      --state /tmp/prod-state

- name: Check for breaking contract changes
  run: |
    # List contracted models whose contract was edited; breaking changes
    # to enforced, versioned contracts fail dbt's compile step
    dbt ls --select state:modified.contract --state /tmp/prod-state
```
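Conceptually, `state:modified+` diffs the two manifests by content checksum and expands the selection to downstream descendants. A simplified Python sketch of that selection — the manifests below are tiny hypothetical stand-ins, and dbt's real comparison also considers configs, macros, and more:

```python
# Select modified nodes plus their downstream children by comparing
# checksums between a production manifest and the current one.
# Hand-built minimal manifests; real manifest.json also carries
# parent_map and richer node metadata.

def modified_plus(prod: dict, current: dict) -> set[str]:
    # A node is "modified" if its checksum differs from production
    # (or it is new and has no production counterpart)
    changed = {
        node_id
        for node_id, node in current["nodes"].items()
        if prod["nodes"].get(node_id, {}).get("checksum") != node["checksum"]
    }
    # Expand to all downstream children via child_map
    selected, stack = set(), list(changed)
    while stack:
        node_id = stack.pop()
        if node_id in selected:
            continue
        selected.add(node_id)
        stack.extend(current["child_map"].get(node_id, []))
    return selected

prod = {"nodes": {
    "model.stg_orders": {"checksum": "aaa"},
    "model.fct_orders": {"checksum": "bbb"},
}}
current = {
    "nodes": {
        "model.stg_orders": {"checksum": "CHANGED"},  # edited in this PR
        "model.fct_orders": {"checksum": "bbb"},      # untouched, selected as a child
    },
    "child_map": {"model.stg_orders": ["model.fct_orders"]},
}
print(sorted(modified_plus(prod, current)))
# ['model.fct_orders', 'model.stg_orders']
```

This is why Slim CI needs the production `manifest.json` on disk: without a baseline to diff against, every node looks new.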
## Cross-Project References
```yaml
# dependencies.yml — declare the upstream dbt project
# (dbt Cloud only, via dbt Mesh)
projects:
  - name: data_platform

# In a model, use a two-argument ref to read the other project's
# published (public, contracted) models:
#   select * from {{ ref('data_platform', 'dim_dates') }}
```
```sql
-- Cross-project ref: use the public model from the data platform project
with calendar as (
    -- dim_dates is owned by the data platform team
    select * from {{ ref('data_platform', 'dim_dates') }}
),

orders as (
    select * from {{ ref('fct_orders') }}
),

final as (
    select
        d.date_day,
        d.day_of_week,
        d.is_weekend,
        coalesce(sum(o.total_cents), 0) as daily_revenue_cents,
        count(o.order_sk) as daily_order_count
    from calendar d
    left join orders o on d.date_day = o.ordered_at::date
    group by 1, 2, 3
)

select * from final
```
For the foundational dbt patterns including staging models and incremental builds, see the dbt analytics guide. For the Flink and Kafka Streams pipelines that produce the raw data that dbt transforms, the Apache Flink guide covers stream processing. The Claude Skills 360 bundle includes advanced dbt skill sets covering exposures, packages, contracts, and semantic layer metrics. Start with the free tier to try dbt exposure and contract generation.