OpenTelemetry is the vendor-neutral standard for distributed traces, metrics, and logs. In Node.js, @opentelemetry/sdk-node provides NodeSDK, and new NodeSDK({ traceExporter, metricReader, instrumentations: [getNodeAutoInstrumentations()] }) bootstraps everything.
Manual spans: const tracer = trace.getTracer("service-name", "1.0.0"), then tracer.startActiveSpan("operation", (span) => { span.setAttribute(SEMATTRS_HTTP_METHOD, "GET"); span.end() }). Spans started inside the callback inherit the active context automatically. context.with(trace.setSpan(context.active(), parentSpan), () => { /* child spans */ }) sets the parent explicitly. span.recordException(err) and span.setStatus({ code: SpanStatusCode.ERROR, message: err.message }) mark failures.
W3C propagation: propagation.inject(context.active(), headers) for outbound HTTP and propagation.extract(context.active(), inboundHeaders) for inbound; both accept an optional setter or getter for carriers that are not plain objects. Baggage: propagation.setBaggage(context.active(), propagation.createBaggage({ userId: { value: id } })) returns a new context whose key-value pairs are carried across service boundaries.
Semantic conventions: SEMATTRS_HTTP_METHOD, SEMATTRS_HTTP_URL, SEMATTRS_DB_STATEMENT, SEMATTRS_DB_SYSTEM from @opentelemetry/semantic-conventions.
Metrics: const meter = metrics.getMeter("service-name"), meter.createCounter("requests.total"), meter.createHistogram("request.duration", { unit: "ms" }), and meter.createObservableGauge("queue.depth").addCallback((result) => result.observe(getQueueDepth())).
OTLP export: new OTLPTraceExporter({ url: "http://collector:4318/v1/traces", headers: { "x-honeycomb-team": TOKEN } }).
Claude Code generates OTel SDK setup, manual span instrumentation, context propagation, and OTLP exporter configuration.
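Baggage set through propagation.setBaggage travels between services as a W3C `baggage` HTTP header: a comma-separated list of key=value members with percent-encoded values. The sketch below is a standalone illustration of that wire format, not the SDK's propagator. encodeBaggage and decodeBaggage are hypothetical helper names, and member properties and size limits are not modeled.

```typescript
// Hypothetical helpers illustrating the W3C `baggage` header format.
type BaggageEntries = Record<string, string>

/** Serialize entries as "k1=v1,k2=v2" with percent-encoded values. */
function encodeBaggage(entries: BaggageEntries): string {
  return Object.entries(entries)
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join(",")
}

/** Parse a baggage header back into entries, skipping members without a key or "=". */
function decodeBaggage(header: string): BaggageEntries {
  const entries: BaggageEntries = {}
  for (const member of header.split(",")) {
    // Drop optional ";property" metadata, which this sketch does not model
    const [pair] = member.split(";")
    const eq = pair.indexOf("=")
    if (eq <= 0) continue
    const key = pair.slice(0, eq).trim()
    const value = pair.slice(eq + 1).trim()
    entries[key] = decodeURIComponent(value)
  }
  return entries
}

const baggageHeader = encodeBaggage({ userId: "u 42", tenant: "acme" })
console.log(baggageHeader) // userId=u%2042,tenant=acme
console.log(decodeBaggage(baggageHeader)) // { userId: 'u 42', tenant: 'acme' }
```

Because every downstream service receives these values, baggage is best kept to a few small, non-sensitive identifiers.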
CLAUDE.md for OpenTelemetry
## OpenTelemetry Stack
- SDK: @opentelemetry/sdk-node + @opentelemetry/auto-instrumentations-node
- Init: NodeSDK in instrumentation.ts — loaded before app code via --require
- Tracer: const tracer = trace.getTracer("service-name") — one per logical component
- Spans: tracer.startActiveSpan("name", span => { ... span.end() }) — always call end()
- Attributes: use @opentelemetry/semantic-conventions constants (SEMATTRS_*)
- Export: OTLPTraceExporter to localhost:4318 (Collector) or direct to vendor
- Metrics: metrics.getMeter("service-name") — Counter/Histogram/ObservableGauge
- Propagation: W3C traceparent — auto for HTTP, manual inject/extract for message queues
NodeSDK Initialization
// instrumentation.ts — OTel SDK bootstrap (loaded via --require or Next.js instrumentation)
import { NodeSDK } from "@opentelemetry/sdk-node"
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http"
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-http"
import { PeriodicExportingMetricReader } from "@opentelemetry/sdk-metrics"
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node"
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-node"
import { Resource } from "@opentelemetry/resources"
import {
  SEMRESATTRS_SERVICE_NAME,
  SEMRESATTRS_SERVICE_VERSION,
  SEMRESATTRS_DEPLOYMENT_ENVIRONMENT,
} from "@opentelemetry/semantic-conventions"
import { W3CTraceContextPropagator } from "@opentelemetry/core"
const resource = new Resource({
  [SEMRESATTRS_SERVICE_NAME]: process.env.SERVICE_NAME ?? "my-app",
  [SEMRESATTRS_SERVICE_VERSION]: process.env.npm_package_version ?? "0.0.0",
  [SEMRESATTRS_DEPLOYMENT_ENVIRONMENT]: process.env.NODE_ENV ?? "production",
})
const traceExporter = new OTLPTraceExporter({
  url: process.env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT ?? "http://localhost:4318/v1/traces",
  headers: process.env.HONEYCOMB_API_KEY
    ? { "x-honeycomb-team": process.env.HONEYCOMB_API_KEY }
    : {},
})
const metricExporter = new OTLPMetricExporter({
  url: process.env.OTEL_EXPORTER_OTLP_METRICS_ENDPOINT ?? "http://localhost:4318/v1/metrics",
})
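export const sdk = new NodeSDK({
  resource,
  // Newer sdk-node releases take `spanProcessors: [...]` (an array) instead
  spanProcessor: new BatchSpanProcessor(traceExporter, {
    maxQueueSize: 2048,
    maxExportBatchSize: 512,
    scheduledDelayMillis: 5000,
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: metricExporter,
    exportIntervalMillis: 30_000,
  }),
  // Setting this replaces the SDK default (trace context + baggage);
  // with only W3CTraceContextPropagator, baggage headers are not propagated
  textMapPropagator: new W3CTraceContextPropagator(),
  instrumentations: [
    getNodeAutoInstrumentations({
      "@opentelemetry/instrumentation-http": {
        // Older releases used ignoreIncomingPaths; current ones expose this hook
        ignoreIncomingRequestHook: (req) =>
          ["/api/health", "/metrics"].includes((req.url ?? "").split("?")[0]),
        requestHook: (span, req) => {
          // The hook also fires for outbound ClientRequest objects, which have no `headers`
          if (!("headers" in req)) return
          const size = Number(req.headers["content-length"])
          if (Number.isFinite(size)) span.setAttribute("http.request.body.size", size)
        },
      },
      "@opentelemetry/instrumentation-pg": {
        // The pg config has no dbStatementSerializer option (that one belongs to the
        // redis/mongodb instrumentations), so truncate the statement in requestHook
        requestHook: (span, info) => {
          span.setAttribute("db.statement", info.query.text.slice(0, 200))
        },
      },
      "@opentelemetry/instrumentation-fs": { enabled: false }, // Too noisy
    }),
  ],
})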
sdk.start()
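// Flush pending telemetry, then exit: registering a signal handler
// suppresses Node's default termination, so exit explicitly
const shutdown = () => sdk.shutdown().catch(console.error).finally(() => process.exit(0))
process.on("SIGTERM", shutdown)
process.on("SIGINT", shutdown)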
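The three BatchSpanProcessor options in the configuration above interact: finished spans wait in a bounded queue, the queue is drained periodically (every scheduledDelayMillis), and each export call carries at most maxExportBatchSize spans. The following is a simplified model of that behavior, not the SDK's implementation. BatchQueue and its members are hypothetical names, and the timer is replaced by an explicit flush() call so the sketch stays deterministic.

```typescript
// Simplified, hypothetical model of a bounded batching queue.
class BatchQueue<T> {
  private queue: T[] = []
  private maxQueueSize: number
  private maxExportBatchSize: number
  private exportBatch: (batch: T[]) => void
  dropped = 0

  constructor(
    maxQueueSize: number,
    maxExportBatchSize: number,
    exportBatch: (batch: T[]) => void,
  ) {
    this.maxQueueSize = maxQueueSize
    this.maxExportBatchSize = maxExportBatchSize
    this.exportBatch = exportBatch
  }

  /** Called when a span ends: enqueue it, or drop it if the queue is full. */
  add(item: T): void {
    if (this.queue.length >= this.maxQueueSize) {
      this.dropped++
      return
    }
    this.queue.push(item)
  }

  /** Stands in for the timer tick: drain the queue in capped batches. */
  flush(): void {
    while (this.queue.length > 0) {
      this.exportBatch(this.queue.splice(0, this.maxExportBatchSize))
    }
  }
}

const batchSizes: number[] = []
const batchQueue = new BatchQueue<number>(5, 2, (batch) => { batchSizes.push(batch.length) })
for (let i = 0; i < 7; i++) batchQueue.add(i) // items 5 and 6 exceed maxQueueSize
batchQueue.flush()
console.log(batchSizes, batchQueue.dropped) // [ 2, 2, 1 ] 2
```

The practical consequence is that if spans finish faster than the exporter drains them, anything beyond maxQueueSize is dropped, so bursty services may want a larger queue or a shorter delay.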
Manual Span Instrumentation
// lib/tracing/spans.ts — manual tracing helpers
import {
  trace,
  SpanStatusCode,
  type Span,
  type SpanOptions,
} from "@opentelemetry/api"
import {
  SEMATTRS_DB_SYSTEM,
  SEMATTRS_DB_STATEMENT,
  SEMATTRS_DB_NAME,
  SEMATTRS_HTTP_METHOD,
  SEMATTRS_HTTP_URL,
  SEMATTRS_RPC_SERVICE,
  SEMATTRS_RPC_METHOD,
} from "@opentelemetry/semantic-conventions"
const tracer = trace.getTracer("app", process.env.npm_package_version ?? "0")
/** Wrap an async function in a span */
export async function withSpan<T>(
  name: string,
  fn: (span: Span) => Promise<T>,
  options?: SpanOptions,
): Promise<T> {
  return tracer.startActiveSpan(name, options ?? {}, async (span) => {
    try {
      const result = await fn(span)
      span.setStatus({ code: SpanStatusCode.OK })
      return result
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err))
      span.recordException(error)
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message })
      throw err
    } finally {
      span.end()
    }
  })
}
/** Instrument a database query with semantic conventions */
export async function withDbSpan<T>(
  operation: string,
  options: {
    system: string // "postgresql" | "redis" | "mongodb"
    dbName?: string
    statement: string
  },
  fn: () => Promise<T>,
): Promise<T> {
  return withSpan(`db.${operation}`, async (span) => {
    span.setAttributes({
      [SEMATTRS_DB_SYSTEM]: options.system,
      [SEMATTRS_DB_STATEMENT]: options.statement.slice(0, 500), // truncate long queries
      ...(options.dbName ? { [SEMATTRS_DB_NAME]: options.dbName } : {}),
    })
    return fn()
  }, { kind: 2 /* SpanKind.CLIENT */ })
}
/** Instrument an outbound HTTP call */
export async function withHttpSpan<T>(
  method: string,
  url: string,
  fn: () => Promise<T>,
): Promise<T> {
  return withSpan(`HTTP ${method} ${new URL(url).pathname}`, async (span) => {
    span.setAttributes({
      [SEMATTRS_HTTP_METHOD]: method,
      [SEMATTRS_HTTP_URL]: url,
    })
    return fn()
  }, { kind: 2 /* SpanKind.CLIENT */ })
}
/** Instrument an RPC / service call */
export async function withRpcSpan<T>(
  service: string,
  method: string,
  fn: () => Promise<T>,
): Promise<T> {
  return withSpan(`${service}/${method}`, async (span) => {
    span.setAttributes({
      [SEMATTRS_RPC_SERVICE]: service,
      [SEMATTRS_RPC_METHOD]: method,
    })
    return fn()
  })
}
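The try/catch/finally shape of withSpan can be exercised without an OTel SDK by substituting a recording stand-in for the span. The sketch below re-implements the same control flow against a hypothetical RecordingSpan class (not part of @opentelemetry/api) to show that the span ends exactly once on both paths and that the original error still reaches the caller.

```typescript
// Hypothetical test double; the real Span interface lives in @opentelemetry/api.
class RecordingSpan {
  status: "UNSET" | "OK" | "ERROR" = "UNSET"
  exceptions: Error[] = []
  endCalls = 0
  recordException(err: Error): void { this.exceptions.push(err) }
  setStatus(status: "OK" | "ERROR"): void { this.status = status }
  end(): void { this.endCalls++ }
}

/** Same control flow as withSpan, with the span passed in explicitly. */
async function runInSpan<T>(span: RecordingSpan, fn: () => Promise<T>): Promise<T> {
  try {
    const result = await fn()
    span.setStatus("OK")
    return result
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err))
    span.recordException(error)
    span.setStatus("ERROR")
    throw err // callers still see the original failure
  } finally {
    span.end() // runs on both the success and the error path
  }
}

async function demo(): Promise<[RecordingSpan, RecordingSpan]> {
  const okSpan = new RecordingSpan()
  await runInSpan(okSpan, async () => "done")

  const failedSpan = new RecordingSpan()
  // Swallow the rethrown error here so the demo can inspect the span
  await runInSpan(failedSpan, async () => { throw new Error("boom") }).catch(() => {})
  return [okSpan, failedSpan]
}

demo().then(([ok, failed]) => {
  console.log(ok.status, ok.endCalls)         // OK 1
  console.log(failed.status, failed.endCalls) // ERROR 1
})
```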
Context Propagation
// lib/tracing/propagation.ts — W3C traceparent header helpers
import { context, propagation, trace } from "@opentelemetry/api"
type HeaderMap = Record<string, string>
/** Inject current trace context into outbound fetch headers */
export function injectTraceHeaders(headers: HeaderMap = {}): HeaderMap {
  const carrier: HeaderMap = { ...headers }
  propagation.inject(context.active(), carrier)
  return carrier
}
/** Extract trace context from inbound request headers and run fn in that context */
export async function withExtractedContext<T>(
  headers: Headers | HeaderMap,
  fn: () => Promise<T>,
): Promise<T> {
  // Copy into a plain object with lowercase keys: the default getter
  // looks up "traceparent" by exact key
  const carrier: HeaderMap = {}
  if (headers instanceof Headers) {
    headers.forEach((value, key) => { carrier[key] = value }) // keys are already lowercase
  } else {
    for (const [key, value] of Object.entries(headers)) carrier[key.toLowerCase()] = value
  }
  const extractedCtx = propagation.extract(context.active(), carrier)
  return context.with(extractedCtx, fn)
}
/** Get current trace/span IDs for correlation with logs */
export function getCurrentTraceIds(): { traceId: string; spanId: string } | null {
  const span = trace.getActiveSpan()
  if (!span) return null
  const ctx = span.spanContext()
  return { traceId: ctx.traceId, spanId: ctx.spanId }
}
/** Build trace ID bindings for a pino child logger */
export function getTraceBindings(): Record<string, string> {
  const ids = getCurrentTraceIds()
  if (!ids) return {}
  return { traceId: ids.traceId, spanId: ids.spanId }
}
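The header that propagation.inject writes and propagation.extract reads is the W3C `traceparent` header, which has four dash-separated hex fields: version, a 16-byte trace ID, an 8-byte parent span ID, and trace flags. The standalone sketch below parses and formats that header for illustration only. parseTraceparent and formatTraceparent are hypothetical helpers, not SDK functions, and only version 00 is handled.

```typescript
// Hypothetical helpers illustrating the W3C traceparent layout:
//   00-<32 hex trace id>-<16 hex span id>-<2 hex flags>
interface TraceParent {
  traceId: string
  spanId: string
  sampled: boolean
}

const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/

function parseTraceparent(header: string): TraceParent | null {
  const match = TRACEPARENT_RE.exec(header.trim())
  if (!match) return null
  const [, traceId, spanId, flags] = match
  // All-zero IDs are invalid per the specification
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null
  // Bit 0 of the flags byte is the "sampled" flag
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 }
}

function formatTraceparent(tp: TraceParent): string {
  return `00-${tp.traceId}-${tp.spanId}-${tp.sampled ? "01" : "00"}`
}

const parsedHeader = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
console.log(parsedHeader?.sampled) // true
```

For message queues, the same string can be carried in whatever per-message metadata the broker offers, since the carrier passed to inject and extract is just a key-value map.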
OTel Metrics
// lib/tracing/metrics.ts — OTel Metrics API
import { metrics, type ObservableResult } from "@opentelemetry/api"
const meter = metrics.getMeter("app", process.env.npm_package_version ?? "0")
// ── Counters ──────────────────────────────────────────────────────────────
export const requestCounter = meter.createCounter("app.requests.total", {
  description: "Total HTTP requests processed",
  unit: "{request}",
})
export const errorCounter = meter.createCounter("app.errors.total", {
  description: "Total errors by type",
  unit: "{error}",
})
export const cacheCounter = meter.createCounter("app.cache.operations.total", {
  description: "Cache hit/miss counts",
})
// ── Histograms ─────────────────────────────────────────────────────────────
export const requestDuration = meter.createHistogram("app.request.duration", {
  description: "HTTP request duration",
  unit: "ms",
  advice: { explicitBucketBoundaries: [5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000] },
})
export const dbQueryDuration = meter.createHistogram("app.db.query.duration", {
  description: "Database query latency",
  unit: "ms",
  advice: { explicitBucketBoundaries: [1, 5, 10, 25, 50, 100, 250, 500] },
})
// ── Observable Gauges (pull-based) ─────────────────────────────────────────
let currentQueueDepth = 0
export function setQueueDepth(depth: number): void { currentQueueDepth = depth }
meter.createObservableGauge("app.queue.depth", {
  description: "Current job queue depth",
  unit: "{job}",
}).addCallback((result: ObservableResult) => {
  result.observe(currentQueueDepth)
})
// ── Helpers ────────────────────────────────────────────────────────────────
/** Time an async operation and record duration as histogram */
export async function timeOperation<T>(
  histogram: ReturnType<typeof meter.createHistogram>,
  attributes: Record<string, string>,
  fn: () => Promise<T>,
): Promise<T> {
  const start = performance.now()
  try {
    return await fn()
  } finally {
    histogram.record(performance.now() - start, attributes)
  }
}
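The explicitBucketBoundaries advice on the histograms above determines which bucket each recorded value is counted in. In the OpenTelemetry data model, each boundary is the inclusive upper bound of its bucket, and one extra overflow bucket catches everything above the last boundary. This sketch reproduces that assignment rule in plain TypeScript; bucketIndex and bucketCounts are hypothetical helper names, not SDK functions.

```typescript
// Hypothetical helpers; boundaries copied from the app.request.duration histogram.
const boundaries = [5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000]

/** Index of the bucket a value lands in: bucket i covers (b[i-1], b[i]]. */
function bucketIndex(value: number, bounds: number[]): number {
  const idx = bounds.findIndex((upper) => value <= upper)
  return idx === -1 ? bounds.length : idx // past the last bound: overflow bucket
}

/** Count values per bucket; the result has bounds.length + 1 entries. */
function bucketCounts(values: number[], bounds: number[]): number[] {
  const counts = new Array<number>(bounds.length + 1).fill(0)
  for (const v of values) counts[bucketIndex(v, bounds)]++
  return counts
}

console.log(bucketIndex(5, boundaries))    // 0  (5 is included in the first bucket)
console.log(bucketIndex(5.1, boundaries))  // 1
console.log(bucketIndex(9000, boundaries)) // 10 (overflow bucket)
console.log(bucketCounts([3, 7, 7, 120, 9000], boundaries))
// [ 1, 2, 0, 0, 0, 1, 0, 0, 0, 0, 1 ]
```

Boundaries are worth placing around the latencies that matter for alerting, because percentile estimates computed by the backend can only be as precise as the bucket widths.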
OTel Collector Config
# otel-collector.yaml — OpenTelemetry Collector configuration
receivers:
  otlp:
    protocols:
      grpc: { endpoint: "0.0.0.0:4317" }
      http: { endpoint: "0.0.0.0:4318" }

processors:
  batch:
    timeout: 5s
    send_batch_size: 512
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: upsert

exporters:
  # Grafana Tempo (traces)
  otlp/tempo:
    endpoint: "tempo:4317"
    tls: { insecure: true }
  # Grafana Mimir / Prometheus (metrics)
  prometheusremotewrite:
    endpoint: https://prometheus-prod-{id}.grafana.net/api/prom/push
    auth:
      authenticator: basicauth/grafana
  # Debug (dev only)
  debug:
    verbosity: normal
  # Honeycomb (alternative SaaS)
  otlp/honeycomb:
    endpoint: "api.honeycomb.io:443"
    headers:
      "x-honeycomb-team": "${HONEYCOMB_API_KEY}"

extensions:
  basicauth/grafana:
    client_auth:
      username: "${GRAFANA_METRICS_USER}"
      password: "${GRAFANA_METRICS_PASSWORD}"

service:
  extensions: [basicauth/grafana]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [prometheusremotewrite]
Datadog APM is the alternative when you want an all-in-one SaaS platform with APM, infrastructure monitoring, log management, synthetics, and RUM under one product with minimal configuration. Datadog is a unified, paid observability platform, while OpenTelemetry is vendor-neutral and can export to any OTLP-compatible backend without vendor lock-in.
Jaeger fits when you want to self-host a dedicated open-source trace store and UI. Jaeger is a standalone tracing backend that ingests OTLP or Jaeger formats, while OpenTelemetry is the instrumentation layer and Collector that routes telemetry to backends such as Jaeger, Tempo, or Honeycomb. The two work together: instrument with OpenTelemetry and have the OTel Collector forward traces to Jaeger.
The Claude Skills 360 bundle includes OpenTelemetry skill sets covering NodeSDK setup, manual spans, context propagation, and Collector configuration. Start with the free tier to try distributed tracing generation.