OpenTelemetry is the open standard for observability: you instrument your code once with its SDK and send the data to any compatible backend (Datadog, Jaeger, Grafana, Honeycomb). Claude Code generates instrumentation that goes beyond the auto-instrumentation libraries: custom spans for business operations, metrics that match your SLIs, and trace-log correlation that lets you jump from a trace to its log lines when debugging production.
This guide covers OpenTelemetry with Claude Code: tracing, custom spans, metrics, log correlation, and backend configuration.
OTel SDK Setup
CLAUDE.md for Instrumented Services
## Service Observability
- OpenTelemetry SDK: @opentelemetry/sdk-node
- Auto-instrumentation: HTTP, Express, gRPC, PostgreSQL, Redis — all enabled
- Custom spans: wrap all business operations (not just HTTP calls)
- Metrics: record latency histograms, error rates, business metrics (orders created, etc.)
- Exporter: OTLP to Datadog Agent (or Jaeger for local dev)
- Log correlation: inject traceId/spanId into all log lines
## Span naming conventions
- HTTP server spans: auto-instrumented (don't add custom)
- Database spans: auto-instrumented (don't add custom)
- Business operations: {domain}.{operation} format (orders.create, payments.charge)
- Use span attributes for context: user.id, order.id, payment.amount
- Error spans: recordException() + setStatus(SpanStatusCode.ERROR)
## Sampling
- Production: 10% trace sampling for high-volume paths
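- Errors: 100% sampling via Tail Sampling Processor (in the Collector; it can only keep traces the SDK actually exports)
- Development: 100% (set OTEL_SAMPLE_RATE=1; a sampler configured in code takes precedence over OTEL_TRACES_SAMPLER)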
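The `{domain}.{operation}` naming rule above is easy to check mechanically. The following is a minimal sketch of such a check, assuming a hypothetical helper `isBusinessSpanName` and assuming that both segments are lowercase identifiers (the convention itself does not spell out the allowed characters):

```typescript
// Hypothetical lint helper for the {domain}.{operation} span naming convention.
// Assumption: each segment starts with a lowercase letter and contains only
// lowercase letters, digits, and underscores.
const SPAN_NAME_PATTERN = /^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*$/;

function isBusinessSpanName(name: string): boolean {
  return SPAN_NAME_PATTERN.test(name);
}

console.log(isBusinessSpanName('orders.create'));   // true
console.log(isBusinessSpanName('payments.charge')); // true
console.log(isBusinessSpanName('GET /orders'));     // false (HTTP-style name)
console.log(isBusinessSpanName('orders'));          // false (no operation segment)
```

A check like this could run in a unit test over the span names used in a service, so that naming drift is caught before it reaches dashboards.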
Initialization (load first, before other imports)
// src/instrumentation.ts — MUST be loaded before any other imports
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';
import { Resource } from '@opentelemetry/resources';
import { SEMRESATTRS_SERVICE_NAME, SEMRESATTRS_SERVICE_VERSION } from '@opentelemetry/semantic-conventions';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { ParentBasedSampler, TraceIdRatioBased } from '@opentelemetry/sdk-trace-base';
const sdk = new NodeSDK({
resource: new Resource({
[SEMRESATTRS_SERVICE_NAME]: process.env.SERVICE_NAME ?? 'my-service',
[SEMRESATTRS_SERVICE_VERSION]: process.env.SERVICE_VERSION ?? '0.0.1',
'deployment.environment': process.env.NODE_ENV ?? 'development',
}),
traceExporter: new OTLPTraceExporter({
// `url` is used verbatim, so append the signal path to the base endpoint
url: `${process.env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318'}/v1/traces`,
}),
metricReader: new PeriodicExportingMetricReader({
exporter: new OTLPMetricExporter({
// `url` is used verbatim, so append the signal path to the base endpoint
url: `${process.env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318'}/v1/metrics`,
}),
exportIntervalMillis: 10000,
}),
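// 10% head sampling. The Collector's tail sampler only sees traces exported here,
// so keeping 100% of errors requires OTEL_SAMPLE_RATE=1 with sampling done in the Collector.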
sampler: new ParentBasedSampler({
root: new TraceIdRatioBased(
process.env.OTEL_SAMPLE_RATE ? parseFloat(process.env.OTEL_SAMPLE_RATE) : 0.1
),
}),
instrumentations: [
getNodeAutoInstrumentations({
'@opentelemetry/instrumentation-fs': { enabled: false }, // Too noisy
'@opentelemetry/instrumentation-http': {
requestHook: (span, request) => {
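// Add user context to HTTP spans if present. This hook runs when the span starts,
// before Express middleware executes, so `user` is only set if an earlier layer attached it.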
const userId = (request as any).user?.id;
if (userId) span.setAttribute('user.id', userId);
},
},
}),
],
});
sdk.start();
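// Flush pending telemetry on SIGTERM, then exit (a custom handler replaces Node's default exit)
process.on('SIGTERM', () => {
sdk
.shutdown()
.catch((error) => console.error('OpenTelemetry shutdown failed', error))
.finally(() => process.exit(0));
});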
// src/index.ts — load instrumentation FIRST
import './instrumentation.js'; // Must be before any other imports
import express from 'express';
// ... rest of app
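The sampler configuration above passes `parseFloat(process.env.OTEL_SAMPLE_RATE)` straight to the sampler, and a malformed value such as `ten` parses to `NaN`, which is not a meaningful sampling ratio. Below is a minimal sketch of a guard, assuming a hypothetical helper named `parseSampleRate` (it is not part of any OpenTelemetry package) and the same 0.1 default used in the setup code:

```typescript
// Hypothetical helper: parse a sampling ratio from an environment variable,
// falling back to a default when the value is missing, malformed, or out of range.
function parseSampleRate(raw: string | undefined, fallback = 0.1): number {
  if (raw === undefined || raw.trim() === '') return fallback;
  const rate = Number(raw);
  // Number('ten') is NaN; reject it along with anything outside [0, 1].
  if (!Number.isFinite(rate) || rate < 0 || rate > 1) return fallback;
  return rate;
}

// Example values a deployment might put in OTEL_SAMPLE_RATE.
console.log(parseSampleRate('0.25'));    // 0.25
console.log(parseSampleRate('1'));       // 1
console.log(parseSampleRate('ten'));     // 0.1 (fallback)
console.log(parseSampleRate(undefined)); // 0.1 (fallback)
```

With a helper like this, the sampler argument could become `new TraceIdRatioBased(parseSampleRate(process.env.OTEL_SAMPLE_RATE))`, so a typo in deployment config degrades to the default rate instead of an undefined one.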
Custom Spans
We have an order processing pipeline: validate → charge → fulfill → notify.
Add spans to track each step with timing and error state.
// src/lib/tracer.ts
import { trace } from '@opentelemetry/api';
export const tracer = trace.getTracer('order-service', '1.0.0');
// src/services/order-processor.ts
import { SpanStatusCode, SpanKind, type Span } from '@opentelemetry/api';
import { tracer } from '../lib/tracer';
export async function processOrder(orderId: string, userId: string) {
// Parent span for the entire operation
return tracer.startActiveSpan(
'orders.process',
{
kind: SpanKind.INTERNAL,
attributes: {
'order.id': orderId,
'user.id': userId,
},
},
async (parentSpan) => {
try {
const order = await withSpan('orders.validate', async (span) => {
const order = await getOrder(orderId);
span.setAttributes({
'order.item_count': order.items.length,
'order.total_cents': order.totalCents,
});
await validateOrder(order);
return order;
});
const charge = await withSpan('payments.charge', async (span) => {
span.setAttributes({
'payment.amount_cents': order.totalCents,
'payment.method': order.paymentMethod,
});
return await chargeCustomer(order);
});
await withSpan('orders.fulfill', async (span) => {
span.setAttribute('fulfillment.warehouse_id', order.warehouseId);
await fulfillOrder(order, charge.id);
});
await withSpan('notifications.send', async (span) => {
span.setAttribute('notification.channel', 'email');
await sendOrderConfirmation(userId, order);
});
parentSpan.setStatus({ code: SpanStatusCode.OK });
return order;
} catch (error) {
parentSpan.recordException(error as Error);
parentSpan.setStatus({
code: SpanStatusCode.ERROR,
message: (error as Error).message,
});
throw error;
} finally {
parentSpan.end();
}
}
);
}
// Helper to reduce boilerplate
async function withSpan<T>(
name: string,
fn: (span: Span) => Promise<T>,
attributes?: Record<string, string | number | boolean>,
): Promise<T> {
return tracer.startActiveSpan(name, { attributes }, async (span) => {
try {
const result = await fn(span);
span.setStatus({ code: SpanStatusCode.OK });
return result;
} catch (error) {
span.recordException(error as Error);
span.setStatus({
code: SpanStatusCode.ERROR,
message: (error as Error).message,
});
throw error;
} finally {
span.end();
}
});
}
Custom Metrics
Track business metrics: orders per minute, payment success rate,
and p95/p99 order processing time.
// src/lib/metrics.ts
import { metrics, ValueType } from '@opentelemetry/api';
const meter = metrics.getMeter('order-service', '1.0.0');
// Counter: total orders created (ever-increasing)
export const ordersCreated = meter.createCounter('orders.created', {
description: 'Total number of orders created',
unit: '1',
});
// Counter with labels: payment outcomes
export const paymentAttempts = meter.createCounter('payments.attempts', {
description: 'Payment attempt outcomes',
unit: '1',
});
// Histogram: latency distribution (p50, p95, p99)
export const orderProcessingDuration = meter.createHistogram('orders.processing_duration', {
description: 'Time to fully process an order',
unit: 'ms',
valueType: ValueType.INT,
});
// UpDownCounter: current queue depth (can go up and down)
export const orderQueueDepth = meter.createUpDownCounter('orders.queue_depth', {
description: 'Current number of orders in processing queue',
unit: '1',
});
// Observable gauge: register once, callback called on each collection
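const activeConnections = meter.createObservableGauge('database.connections.active', {
description: 'Active database connections',
});
// Callbacks are attached with addCallback(), not passed to createObservableGauge()
activeConnections.addCallback(async (observableResult) => {
const count = await getActiveConnectionCount();
observableResult.observe(count);
});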
// Usage in order processing
export async function processOrder(orderId: string) {
const startTime = Date.now();
let orderType = 'unknown'; // Replaced with the real type once processing succeeds
orderQueueDepth.add(1);
try {
const result = await processOrderInternal(orderId);
orderType = result.type;
ordersCreated.add(1, {
'order.type': result.type,
'payment.method': result.paymentMethod,
});
paymentAttempts.add(1, { 'payment.outcome': 'success' });
return result;
} catch (error) {
paymentAttempts.add(1, {
'payment.outcome': 'failure',
'error.type': (error as Error).name,
});
throw error;
} finally {
// Record duration for both success and failure, tagged with the actual order type
orderProcessingDuration.record(Date.now() - startTime, {
'order.type': orderType,
});
orderQueueDepth.add(-1);
}
}
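The start-time and `finally` pattern in the usage example above tends to be repeated for every timed operation. The following is a minimal sketch of factoring it out, assuming a hypothetical helper `timed` and a hypothetical `outcome` attribute name. `DurationRecorder` is a structural type introduced here for illustration; a histogram such as `orderProcessingDuration` should satisfy it because it exposes a `record()` method.

```typescript
// Structural type (an assumption for this sketch): anything with a record() method.
interface DurationRecorder {
  record(value: number, attributes?: Record<string, string>): void;
}

// Hypothetical helper: runs fn, then records the elapsed milliseconds with an
// 'outcome' attribute, whether fn resolved or threw. Errors are re-thrown.
async function timed<T>(
  recorder: DurationRecorder,
  attributes: Record<string, string>,
  fn: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  let outcome = 'success';
  try {
    return await fn();
  } catch (error) {
    outcome = 'failure';
    throw error;
  } finally {
    // Runs on both paths, so every call produces exactly one measurement.
    recorder.record(Date.now() - start, { ...attributes, outcome });
  }
}
```

A call site might then look like `await timed(orderProcessingDuration, { 'order.type': 'standard' }, () => processOrderInternal(orderId))`. Keeping the attribute set small matters here, since each distinct attribute combination becomes its own time series in most backends.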
Log Correlation
When we have a trace ID from Datadog, we can't find
the corresponding log lines. Add trace correlation.
// src/lib/logger.ts
import pino from 'pino';
import { trace } from '@opentelemetry/api';
const base = pino({
level: process.env.LOG_LEVEL ?? 'info',
formatters: {
log: (object) => {
// Inject current trace ID and span ID into every log line
const span = trace.getActiveSpan();
if (span) {
const spanContext = span.spanContext();
return {
...object,
trace_id: spanContext.traceId,
span_id: spanContext.spanId,
trace_flags: spanContext.traceFlags,
};
}
return object;
},
},
});
export const logger = base;
// Usage — trace_id appears automatically
logger.info({ orderId: '123' }, 'Order created');
// Output (abridged; pino emits numeric levels by default, 30 = info):
// {"level":30,"orderId":"123","trace_id":"abc123...","span_id":"def456...","trace_flags":1,"msg":"Order created"}
Now in Datadog (or any OTel backend), you can click on a trace, see the trace ID, and filter logs by that exact trace_id to see every log line from that specific request — across all services.
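The same filter can be reproduced locally when all you have is a file of JSON log lines. Here is a minimal sketch, assuming newline-delimited JSON with the `trace_id` field injected by the logger above. The function name `logsForTrace` and the sample IDs are hypothetical, and the IDs are shortened for readability.

```typescript
interface CorrelatedLog {
  trace_id?: string;
  span_id?: string;
  msg?: string;
  [key: string]: unknown;
}

// Hypothetical helper: return every log entry that belongs to one trace.
function logsForTrace(ndjson: string, traceId: string): CorrelatedLog[] {
  const matches: CorrelatedLog[] = [];
  for (const line of ndjson.split('\n')) {
    if (line.trim() === '') continue;
    try {
      const entry = JSON.parse(line) as CorrelatedLog;
      if (entry.trace_id === traceId) matches.push(entry);
    } catch {
      // Skip lines that are not valid JSON objects (for example, raw stack traces).
    }
  }
  return matches;
}

// Made-up sample data for illustration only.
const sample = [
  '{"level":30,"trace_id":"aaa","span_id":"01","msg":"Order created"}',
  '{"level":30,"trace_id":"bbb","span_id":"02","msg":"Other request"}',
  'not json',
  '{"level":50,"trace_id":"aaa","span_id":"03","msg":"Charge failed"}',
].join('\n');

console.log(logsForTrace(sample, 'aaa').map((entry) => entry.msg));
// [ 'Order created', 'Charge failed' ]
```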
For the broader observability setup including dashboards and alerting, see the observability guide. For distributed tracing across microservices where OTel context propagation is critical, see the microservices guide. The Claude Skills 360 bundle includes OpenTelemetry skill sets for production instrumentation. Start with the free tier to instrument your first service with distributed tracing.