Recce is a data impact review tool for dbt: it compares your PR branch against production before you merge. Install it with pip install recce. recce server starts an interactive UI that compares two dbt environments, and recce run executes all defined checks non-interactively for CI. Configuration lives in recce.yml in the project root, with a checks array defining what to verify. A row count check, { type: row_count_diff, model: orders_daily, threshold: 0.1 }, fails if the row count changes by more than 10%. A schema diff, { type: schema_diff, model: orders_daily }, fails when a column is added, removed, or renamed. A value diff, { type: value_diff, model: orders_daily, primary_key: order_id, columns: [amount_usd, status] }, samples rows and compares the listed columns, matching rows on the primary key. A profile diff, { type: profile_diff, model: orders_daily }, compares statistical profiles (min, max, avg, null_rate). A query diff, { type: query_diff, query: "SELECT COUNT(*) FROM orders_daily WHERE status = 'completed'" }, runs the same query against both environments and compares the results. For Recce Cloud, set the RECCE_CLOUD_TOKEN env var to push results to Recce Cloud for review in GitHub PR comments. In GitHub Actions, run dbt build on production with --target prod to capture the base state, then dbt build on the PR branch, then recce run. recce summary outputs a markdown summary for the PR comment, and the --select flag filters to specific models. The Recce state file, recce_state.json, captures all check results for review. recce check add interactively adds checks. Claude Code generates Recce configuration, GitHub Actions workflows, check definitions, and PR comment integrations.
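To make the threshold semantics above concrete, here is a minimal sketch of how a 10% row count threshold could be evaluated. It illustrates the rule as described in this article, not Recce's actual implementation; the function names `row_count_change` and `exceeds_threshold` are hypothetical.

```python
def row_count_change(base_rows: int, current_rows: int) -> float:
    """Return the relative change in row count between base and current."""
    if base_rows == 0:
        # No baseline rows: treat any new rows as an unbounded change.
        return 0.0 if current_rows == 0 else float("inf")
    return abs(current_rows - base_rows) / base_rows


def exceeds_threshold(base_rows: int, current_rows: int, threshold: float) -> bool:
    """True when the relative change is strictly greater than the threshold."""
    return row_count_change(base_rows, current_rows) > threshold


if __name__ == "__main__":
    # 10,000 -> 10,500 rows is a 5% change, within a 0.1 threshold.
    print(exceeds_threshold(10_000, 10_500, 0.1))
    # 10,000 -> 11,500 rows is a 15% change, above a 0.1 threshold.
    print(exceeds_threshold(10_000, 11_500, 0.1))
```

The change is measured in both directions, so a large drop in rows trips the check just as a large increase does.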
CLAUDE.md for Recce
## Recce Stack
- Version: recce >= 0.40
- Config: recce.yml in dbt project root — checks array with type + model + threshold
- Run: recce run — executes all checks, outputs recce_state.json
- CI: dbt build --target prod (base state) → dbt build (PR branch) → recce run
- Cloud: RECCE_CLOUD_TOKEN env var → pushes PR review to Recce Cloud
- Types: row_count_diff, schema_diff, value_diff, profile_diff, query_diff
- Summary: recce summary --output-file recce_summary.md for PR comment
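As a rough picture of what the schema_diff type listed above looks for, the sketch below compares two column-to-type mappings. It is an illustration only, not how Recce computes its diff; the function name `schema_diff` and the sample columns are assumptions made for this example.

```python
def schema_diff(base: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {column_name: data_type} mappings."""
    added = sorted(current.keys() - base.keys())
    removed = sorted(base.keys() - current.keys())
    # Columns present on both sides whose declared type differs.
    type_changed = sorted(
        col for col in base.keys() & current.keys() if base[col] != current[col]
    )
    return {"added": added, "removed": removed, "type_changed": type_changed}


if __name__ == "__main__":
    base = {"order_id": "INT64", "amount": "FLOAT64", "status": "STRING"}
    current = {"order_id": "INT64", "amount_usd": "FLOAT64", "status": "INT64"}
    print(schema_diff(base, current))
```

A rename shows up here as one removed column plus one added column, which is why a rename counts as a schema change.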
Recce Configuration
# recce.yml — Recce check definitions for dbt project
checks:
  # ── Core tables: zero tolerance for row count changes > 1% ───────────────
  - name: orders row count stable
    type: row_count_diff
    model: orders_daily
    threshold: 0.01  # fail if >1% change

  - name: users row count stable
    type: row_count_diff
    model: stg_users
    threshold: 0.05

  # ── Schema checks: no unexpected column changes ───────────────────────────
  - name: orders schema unchanged
    type: schema_diff
    model: orders_daily

  - name: revenue metrics schema
    type: schema_diff
    model: revenue_hourly

  # ── Value sampling: key metric columns unchanged ──────────────────────────
  - name: order amounts match
    type: value_diff
    model: orders_daily
    primary_key: order_id
    columns:
      - amount_usd
      - status
    limit: 2000  # Sample 2000 rows for comparison

  - name: user enrichment correct
    type: value_diff
    model: orders_daily
    primary_key: order_id
    columns:
      - user_plan
      - user_country
    limit: 500

  # ── Statistical profiles ──────────────────────────────────────────────────
  - name: amount distribution stable
    type: profile_diff
    model: orders_daily
    columns:
      - amount_usd
      - days_since_last_order

  # ── Critical business metrics ──────────────────────────────────────────────
  - name: total completed revenue
    type: query_diff
    query: |
      SELECT
        SUM(amount_usd) AS total_revenue,
        COUNT(*) AS order_count
      FROM {{ model }}
      WHERE status = 'completed'
    model: orders_daily
    threshold: 0.001  # 0.1% tolerance

  - name: revenue by plan
    type: query_diff
    query: |
      SELECT
        user_plan,
        SUM(amount_usd) AS revenue,
        COUNT(*) AS orders
      FROM {{ model }}
      WHERE status = 'completed'
      GROUP BY 1
      ORDER BY 1
    model: orders_daily

  # ── Downstream impact ──────────────────────────────────────────────────────
  - name: churn features distribution
    type: profile_diff
    model: churn_features
    columns:
      - days_since_last_order
      - order_count_30d
      - churn_label
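Because a mistyped check type or a missing key is easy to overlook in a long checks array, a small lint pass before CI can help. The sketch below validates an already-parsed list of checks (for example, the `checks` value returned by PyYAML's `yaml.safe_load` on recce.yml). The required keys are inferred from the examples in this article and are an assumption, as is the helper name `validate_checks`; Recce itself may accept or require other fields.

```python
# Required keys per check type, inferred from the examples in this article.
# Assumption: Recce itself may accept or require other fields.
REQUIRED_KEYS = {
    "row_count_diff": {"model"},
    "schema_diff": {"model"},
    "value_diff": {"model", "primary_key"},
    "profile_diff": {"model"},
    "query_diff": {"query"},
}


def validate_checks(checks: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    seen_names = set()
    for index, check in enumerate(checks):
        label = check.get("name") or f"check #{index}"
        if "name" not in check:
            problems.append(f"{label}: missing 'name'")
        elif check["name"] in seen_names:
            problems.append(f"{label}: duplicate name")
        seen_names.add(check.get("name"))

        check_type = check.get("type")
        if check_type not in REQUIRED_KEYS:
            problems.append(f"{label}: unknown type {check_type!r}")
            continue

        # Report each required key that the check does not define.
        for key in sorted(REQUIRED_KEYS[check_type] - set(check)):
            problems.append(f"{label}: missing '{key}'")

        # Thresholds in this article are fractions, e.g. 0.01 for 1%.
        threshold = check.get("threshold")
        if isinstance(threshold, (int, float)) and not 0 <= threshold <= 1:
            problems.append(f"{label}: threshold {threshold} outside 0..1")
    return problems


if __name__ == "__main__":
    sample = [
        {"name": "orders row count stable", "type": "row_count_diff",
         "model": "orders_daily", "threshold": 0.01},
        {"name": "order amounts match", "type": "value_diff", "model": "orders_daily"},
    ]
    for problem in validate_checks(sample):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a single CI run report every misconfigured check.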
GitHub Actions Workflow
# .github/workflows/dbt-recce.yml — Recce CI in GitHub Actions
name: dbt CI with Recce Review
on:
  pull_request:
    branches: [main]
    paths:
      - "models/**"
      - "macros/**"
      - "seeds/**"
      - "recce.yml"
permissions:
  contents: read
  pull-requests: write  # required to post the summary comment
jobs:
  recce-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.11" }
      - name: Install dependencies
        run: pip install dbt-bigquery recce
      - name: Configure dbt profiles
        env:
          DBT_BIGQUERY_KEYFILE_JSON: ${{ secrets.GCP_SA_JSON }}
        run: |
          mkdir -p ~/.dbt
          cat > ~/.dbt/profiles.yml << EOF
          my_project:
            outputs:
              dev:
                type: bigquery
                method: service-account-json
                project: ${{ vars.GCP_PROJECT }}
                dataset: dbt_pr_${{ github.event.pull_request.number }}
                keyfile_json: $(echo $DBT_BIGQUERY_KEYFILE_JSON)
              prod:
                type: bigquery
                method: service-account-json
                project: ${{ vars.GCP_PROJECT }}
                dataset: analytics
                keyfile_json: $(echo $DBT_BIGQUERY_KEYFILE_JSON)
            target: dev
          EOF
      # Caution: the workspace holds the PR code, so this step rebuilds prod
      # from the PR branch. To compare against true production, build from the
      # base branch or reuse artifacts from the last production run.
      - name: Run dbt on production (base state)
        run: |
          dbt build --target prod --profiles-dir ~/.dbt
          # catalog.json is written by dbt docs generate, not dbt build
          dbt docs generate --target prod --profiles-dir ~/.dbt
        env:
          DBT_BIGQUERY_KEYFILE_JSON: ${{ secrets.GCP_SA_JSON }}
      - name: Store production artifacts
        run: |
          cp target/manifest.json base_manifest.json
          cp target/catalog.json base_catalog.json
      - name: Run dbt on PR branch (current state)
        run: |
          dbt build --target dev --profiles-dir ~/.dbt
          dbt docs generate --target dev --profiles-dir ~/.dbt
        env:
          DBT_BIGQUERY_KEYFILE_JSON: ${{ secrets.GCP_SA_JSON }}
      - name: Run Recce checks
        env:
          RECCE_CLOUD_TOKEN: ${{ secrets.RECCE_CLOUD_TOKEN }}
        run: |
          recce run \
            --base-manifest base_manifest.json \
            --base-catalog base_catalog.json \
            --target-manifest target/manifest.json \
            --target-catalog target/catalog.json
      - name: Generate Recce summary
        if: always()
        run: recce summary --output-file recce_summary.md
      - name: Post PR comment
        if: always()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs')
            const summary = fs.readFileSync('recce_summary.md', 'utf8')
            // Await the API call so a failed post fails the step.
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '## Recce Data Impact Review\n\n' + summary,
            })
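One practical wrinkle with the comment step above: GitHub rejects comment bodies longer than 65,536 characters, and a summary covering many models can exceed that. If the comment were assembled in Python instead of github-script, a guard along these lines could keep the post from failing. This is a sketch; `build_comment_body` and the truncation note are hypothetical names and wording, not part of Recce.

```python
MAX_COMMENT_CHARS = 65_536  # GitHub's maximum length for a comment body
HEADER = "## Recce Data Impact Review\n\n"
TRUNCATION_NOTE = "\n\n_Summary truncated; see the workflow logs for the full report._"


def build_comment_body(summary: str, limit: int = MAX_COMMENT_CHARS) -> str:
    """Prefix the summary with a header and trim it to fit the size limit."""
    body = HEADER + summary.strip()
    if len(body) <= limit:
        return body
    # Leave room for the note so the final body is exactly `limit` characters.
    keep = limit - len(TRUNCATION_NOTE)
    return body[:keep] + TRUNCATION_NOTE


if __name__ == "__main__":
    print(build_comment_body("All 12 checks passed."))
```

The same length check ports directly to the JavaScript in the github-script step if you prefer to keep everything in the workflow file.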
Python Integration
# scripts/recce_check.py — programmatic Recce execution
import json
import os
import subprocess
import sys


def run_recce_checks(
    base_manifest: str = "base_manifest.json",
    base_catalog: str = "base_catalog.json",
    target_manifest: str = "target/manifest.json",
    target_catalog: str = "target/catalog.json",
) -> dict:
    """Run Recce checks and return a summary of the results."""
    result = subprocess.run(
        [
            "recce", "run",
            "--base-manifest", base_manifest,
            "--base-catalog", base_catalog,
            "--target-manifest", target_manifest,
            "--target-catalog", target_catalog,
            "--output", "recce_state.json",
        ],
        capture_output=True,
        text=True,
    )

    # Relay Recce's own output so the CI log shows what happened.
    print(result.stdout)
    if result.stderr:
        print(result.stderr, file=sys.stderr)

    if not os.path.exists("recce_state.json"):
        return {"error": "recce_state.json not generated", "passed": False}

    with open("recce_state.json") as f:
        state = json.load(f)

    checks = state.get("checks", [])
    failures = [c for c in checks if c.get("status") in ("failed", "error")]
    warnings = [c for c in checks if c.get("status") == "warning"]

    return {
        # The exit code decides pass/fail; the counts are for reporting.
        "passed": result.returncode == 0,
        "total": len(checks),
        "failures": len(failures),
        "warnings": len(warnings),
        # .get() avoids a KeyError if a check entry lacks a name or type.
        "failed_checks": [
            {"name": c.get("name"), "type": c.get("type")} for c in failures
        ],
    }


if __name__ == "__main__":
    result = run_recce_checks()
    print(json.dumps(result, indent=2))
    sys.exit(0 if result["passed"] else 1)
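The dictionary returned by run_recce_checks is convenient for machines but less so for a PR reviewer. As a possible follow-on, the sketch below renders that same dictionary shape as markdown. The function name `format_result_markdown` and the sample data are illustrative assumptions.

```python
def format_result_markdown(result: dict) -> str:
    """Render the summary dict from run_recce_checks() as markdown."""
    if "error" in result:
        return f"**Recce did not complete:** {result['error']}"

    status = "passed" if result["passed"] else "failed"
    lines = [
        f"**Recce checks {status}**",
        "",
        f"- Total: {result['total']}",
        f"- Failures: {result['failures']}",
        f"- Warnings: {result['warnings']}",
    ]
    # Only add the table when there is at least one failed check to list.
    if result["failed_checks"]:
        lines += ["", "| Check | Type |", "| --- | --- |"]
        lines += [f"| {c['name']} | {c['type']} |" for c in result["failed_checks"]]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "passed": False,
        "total": 9,
        "failures": 1,
        "warnings": 0,
        "failed_checks": [{"name": "orders row count stable", "type": "row_count_diff"}],
    }
    print(format_result_markdown(sample))
```

Handling the error case first keeps the formatter safe to call even when recce_state.json was never written.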
If you want data quality checks built directly into the dbt project without a separate tool, dbt's built-in tests (not_null, unique, accepted_values, relationships) and source freshness are the alternative: they run against a single target without comparing environments, whereas Recce is designed for the PR review workflow of comparing "before this PR" with "after this PR." If you already use SQLMesh as your transformation tool, its built-in environment comparison and virtual schema promotion remove the need for a separate diff tool; use Recce with dbt projects that don't yet have virtual environment support. The Claude Skills 360 bundle includes Recce skill sets covering check configuration, GitHub Actions CI, and programmatic execution. Start with the free tier to try dbt review generation.