PostgreSQL’s advanced SQL features — window functions, recursive CTEs, lateral joins — can replace complex application logic with a single query that’s both faster and easier to reason about. Claude Code writes advanced PostgreSQL queries correctly, understands when to use each feature, and explains query execution plans.
This guide covers advanced PostgreSQL with Claude Code: window functions for analytics, recursive CTEs, lateral joins, index strategies, and query optimization.
CLAUDE.md for PostgreSQL Optimization
## PostgreSQL Query Patterns
- PostgreSQL 16 with pg_stat_statements enabled
- ORM: Drizzle (TypeScript) — but complex analytics queries written raw SQL
- Query budget: < 100ms for user-facing, < 1s for reports
## When to use what
- Window functions: rankings, running totals, moving averages, lead/lag comparisons
- CTEs: readability, recursive hierarchies, intermediate result caching (MATERIALIZED)
- Lateral joins: "top N per group", correlated subqueries with access to outer row
- Partial indexes: filter conditions match WHERE clauses in queries
- Materialized views: expensive aggregations queried frequently
Window Functions
Calculate monthly revenue with month-over-month growth percentage
and a 3-month moving average. One query.
WITH monthly_revenue AS (
SELECT
DATE_TRUNC('month', created_at) AS month,
SUM(total_cents) AS revenue_cents,
COUNT(*) AS order_count
FROM orders
WHERE status = 'completed'
AND created_at >= NOW() - INTERVAL '12 months'
GROUP BY 1
)
SELECT
month,
revenue_cents,
order_count,
-- Month-over-month change
revenue_cents - LAG(revenue_cents, 1) OVER (ORDER BY month) AS mom_change_cents,
ROUND(
100.0 * (revenue_cents - LAG(revenue_cents, 1) OVER (ORDER BY month))
/ NULLIF(LAG(revenue_cents, 1) OVER (ORDER BY month), 0),
1
) AS mom_growth_pct,
-- 3-month moving average (includes current month)
ROUND(
AVG(revenue_cents) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),
0
) AS moving_avg_3mo,
-- Cumulative revenue (running total)
SUM(revenue_cents) OVER (ORDER BY month) AS cumulative_revenue,
-- Rank by revenue (1 = highest)
RANK() OVER (ORDER BY revenue_cents DESC) AS revenue_rank
FROM monthly_revenue
ORDER BY month;
Find the top 3 products by revenue in each category,
using window functions (not subqueries).
SELECT category, product_id, product_name, revenue_cents, revenue_rank
FROM (
SELECT
p.category,
p.id AS product_id,
p.name AS product_name,
SUM(oi.price_cents * oi.quantity) AS revenue_cents,
RANK() OVER (
PARTITION BY p.category -- Rank within each category
ORDER BY SUM(oi.price_cents * oi.quantity) DESC
) AS revenue_rank
FROM order_items oi
JOIN products p ON p.id = oi.product_id
JOIN orders o ON o.id = oi.order_id
WHERE o.status = 'completed'
AND o.created_at >= NOW() - INTERVAL '90 days'
GROUP BY p.category, p.id, p.name
) ranked
WHERE revenue_rank <= 3
ORDER BY category, revenue_rank;
Recursive CTEs
Our categories table is a tree (each category has a parent_id).
Write a query that returns a category and all its descendants.
WITH RECURSIVE category_tree AS (
-- Base case: the root category we're starting from
SELECT
id,
name,
parent_id,
0 AS depth,
ARRAY[id] AS path, -- Track path for cycle detection
name AS full_path
FROM categories
WHERE id = $1 -- Starting category
UNION ALL
-- Recursive case: join children to each found category
SELECT
c.id,
c.name,
c.parent_id,
ct.depth + 1,
ct.path || c.id,
ct.full_path || ' > ' || c.name
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.id
WHERE NOT c.id = ANY(ct.path) -- Prevent infinite loops
AND ct.depth < 10 -- Safety depth limit
)
SELECT
id,
name,
depth,
full_path,
-- Count of products in this category (not recursive — just this level)
(SELECT COUNT(*) FROM products WHERE category_id = category_tree.id) AS product_count
FROM category_tree
ORDER BY path; -- Natural tree order
LATERAL Joins: Top N Per Group
For each user, get their last 3 orders. Standard GROUP BY won't work.
-- LATERAL allows the subquery to reference columns from the outer query
SELECT
u.id,
u.email,
recent_orders.id AS order_id,
recent_orders.total_cents,
recent_orders.created_at
FROM users u
JOIN LATERAL (
SELECT id, total_cents, status, created_at
FROM orders o
WHERE o.user_id = u.id -- References outer query's u.id
AND o.status != 'cancelled'
ORDER BY created_at DESC
LIMIT 3 -- Top 3 per user
) recent_orders ON true -- ON true because we want all users (even those with no orders use LEFT JOIN LATERAL)
WHERE u.created_at >= NOW() - INTERVAL '30 days';
Index Strategy
The query "WHERE status = 'active' AND created_at > NOW() - INTERVAL '7d'"
is slow. What indexes should I add?
-- Partial index: only indexes rows where status = 'active'
-- Much smaller than a full index — faster to scan, less storage
CREATE INDEX CONCURRENTLY idx_orders_active_recent
ON orders (created_at DESC)
WHERE status = 'active'; -- Partial index condition
-- If you also frequently filter by user_id:
CREATE INDEX CONCURRENTLY idx_orders_user_active_recent
ON orders (user_id, created_at DESC)
WHERE status = 'active';
-- GIN index for JSONB columns (full-text search in metadata)
CREATE INDEX CONCURRENTLY idx_orders_metadata
ON orders USING GIN (metadata);
-- Query: WHERE metadata @> '{"channel": "mobile"}'
-- Expression index for case-insensitive email lookup
CREATE INDEX CONCURRENTLY idx_users_email_lower
ON users (LOWER(email));
-- Query: WHERE LOWER(email) = LOWER($1)
EXPLAIN ANALYZE: Reading Query Plans
How do I read EXPLAIN ANALYZE output? My query is doing a Seq Scan
when I expect an index to be used.
-- Run with BUFFERS to see cache hit ratio
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders
WHERE user_id = $1 AND status = 'completed'
ORDER BY created_at DESC
LIMIT 20;
What to look for in EXPLAIN ANALYZE:
1. Seq Scan on large table → Missing or unusable index
- Common cause: low cardinality column (e.g., status with 3 values)
- Common cause: function on indexed column (WHERE LOWER(email) but index is on email)
- Common cause: planner chose Seq Scan because table is small (< 1000 rows) — OK
2. "actual rows=50000 rows=500" → Stale statistics
- Fix: ANALYZE table_name
- Or increase autovacuum frequency for fast-changing tables
3. Hash Join vs Nested Loop
- Hash Join: good for large tables
- Nested Loop: good when inner result is small (matching index)
- Sort Merge Join: good for pre-sorted data
4. Buffers: hit=9800 read=200
- hit = served from shared_buffers (cache)
- read = from disk — low read ratio is good
5. Planning Time vs Execution Time
- High planning time: complex query, many options to evaluate
- Can cache with PREPARE / prepared statements
Materialized Views
-- Expensive aggregation queried frequently
CREATE MATERIALIZED VIEW product_stats AS
SELECT
p.id,
p.name,
p.category,
COUNT(DISTINCT oi.order_id) AS orders_count,
SUM(oi.quantity) AS total_units_sold,
SUM(oi.price_cents * oi.quantity) AS total_revenue_cents,
AVG(r.rating) AS avg_rating,
COUNT(r.id) AS review_count
FROM products p
LEFT JOIN order_items oi ON oi.product_id = p.id
LEFT JOIN orders o ON o.id = oi.order_id AND o.status = 'completed'
LEFT JOIN reviews r ON r.product_id = p.id
GROUP BY p.id, p.name, p.category;
-- Index on the materialized view
CREATE UNIQUE INDEX ON product_stats (id);
CREATE INDEX ON product_stats (category, total_revenue_cents DESC);
-- Refresh (can be concurrent — no locking with UNIQUE index)
REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats;
-- Refresh on schedule (e.g., via pg_cron)
SELECT cron.schedule('0 * * * *', 'REFRESH MATERIALIZED VIEW CONCURRENTLY product_stats');
For database migrations that safely add columns and indexes to large tables, see the database migrations guide. For sharding strategies when a single PostgreSQL server isn’t enough, see the database sharding guide. The Claude Skills 360 bundle includes database skill sets covering advanced SQL patterns, query optimization, and schema design. Start with the free tier to try PostgreSQL query generation.