Oct 20, 2018 · 2 min read

Node.js Observability with OpenTelemetry

Observability gives you visibility into what your system is doing in production. With OpenTelemetry (OTel), you can standardize traces, metrics, and logs across services.

Core Signals

  • Traces: request flow across services
  • Metrics: time-series measurements (latency, throughput, error rate)
  • Logs: event details and context

Why OpenTelemetry?

  • Vendor-neutral instrumentation
  • Unified semantic conventions
  • Works with most backends (Jaeger, Tempo, Datadog, New Relic, etc.)

Basic Node.js Setup

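```ts
import { NodeSDK } from '@opentelemetry/sdk-node'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

const sdk = new NodeSDK({
  // Send traces over OTLP/HTTP to the endpoint given in the environment.
  traceExporter: new OTLPTraceExporter({
    url: process.env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT,
  }),
  // Enable the bundled auto-instrumentations (HTTP, database clients, and others).
  instrumentations: [getNodeAutoInstrumentations()],
})

// Start the SDK before the application loads the modules it should instrument.
sdk.start()
```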

Instrument HTTP + Database

Auto-instrumentation can capture:

  • inbound HTTP requests
  • outbound HTTP/fetch calls
  • PostgreSQL/MySQL/Redis operations

This shows where request time actually goes (network, database, external APIs, and so on).
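As a rough illustration of that kind of breakdown, the sketch below sums the child spans of a single request by category. The `ChildSpan` shape, the category names, and the sample durations are all hypothetical; real exported spans carry many more fields, and a tracing backend would normally do this grouping for you. It also assumes the child spans run one after another, so the leftover time is only an approximation.

```ts
// Hypothetical, simplified view of a child span inside one request trace.
interface ChildSpan {
  name: string
  category: string // e.g. 'db', 'cache', 'http.client' (illustrative labels)
  durationMs: number
}

// Group child-span time by category and report what the children do not explain.
// Assumes children run sequentially; overlapping spans would be double-counted.
function latencyBreakdown(rootDurationMs: number, children: ChildSpan[]) {
  const totals: Record<string, number> = {}
  let accountedMs = 0
  for (const span of children) {
    totals[span.category] = (totals[span.category] || 0) + span.durationMs
    accountedMs += span.durationMs
  }
  const byCategory = Object.keys(totals)
    .map((category) => ({ category, totalMs: totals[category] }))
    .sort((a, b) => b.totalMs - a.totalMs) // slowest category first
  return { byCategory, unaccountedMs: Math.max(rootDurationMs - accountedMs, 0) }
}

// Made-up numbers for a single 180 ms checkout request.
const report = latencyBreakdown(180, [
  { name: 'SELECT cart_items', category: 'db', durationMs: 95 },
  { name: 'GET cart:42', category: 'cache', durationMs: 5 },
  { name: 'POST payments API', category: 'http.client', durationMs: 60 },
])
console.log(report.byCategory, report.unaccountedMs)
```

In this made-up trace the database dominates, which is the sort of signal that tells you where to look first.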

Add Manual Spans for Business Logic

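```ts
import { trace, SpanStatusCode } from '@opentelemetry/api'

// The tracer name is arbitrary; it identifies the instrumented module.
const tracer = trace.getTracer('checkout')

// calculateTotals is the application's own function, defined elsewhere.
async function calculateTotalsTraced(cart: { items: unknown[] }) {
  const span = tracer.startSpan('checkout.calculateTotals')
  try {
    span.setAttribute('cart.items', cart.items.length)
    return await calculateTotals(cart)
  } catch (err) {
    // Mark the span as failed so the error is visible in the trace.
    span.recordException(err as Error)
    span.setStatus({ code: SpanStatusCode.ERROR })
    throw err
  } finally {
    span.end() // Always end the span, even on failure.
  }
}
```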

Manual spans make traces meaningful beyond framework-level events.

Propagation Across Services

Ensure `traceparent` headers are forwarded so traces remain connected end-to-end. Without propagation, each service starts its own trace and its spans appear isolated from the rest of the request.
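The HTTP auto-instrumentation normally injects and extracts this header for you, so the following is only a sketch of what travels on the wire. It assumes the W3C Trace Context layout `version-traceId-parentId-flags`; the helper names `parseTraceparent` and `formatTraceparent` are made up for illustration and are not part of any OpenTelemetry package. The sample IDs are illustrative values.

```ts
interface TraceParent {
  version: string
  traceId: string // 32 lowercase hex characters
  parentId: string // 16 lowercase hex characters (the caller's span ID)
  sampled: boolean
}

// Strict four-field shape; a production parser must be more lenient with future versions.
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/

function parseTraceparent(header: string): TraceParent | null {
  const match = TRACEPARENT_RE.exec(header.trim())
  if (!match) return null
  const version = match[1]
  const traceId = match[2]
  const parentId = match[3]
  const flags = match[4]
  if (version === 'ff') return null // reserved, invalid version
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null // all-zero IDs are invalid
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 1 }
}

// Build the header for an outgoing call: same trace ID, the current span's ID as parent.
function formatTraceparent(traceId: string, spanId: string, sampled: boolean): string {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`
}

const incoming = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01')
// 'b7ad6b7169203331' stands in for the span ID this service created for its own work.
const outgoing = incoming
  ? formatTraceparent(incoming.traceId, 'b7ad6b7169203331', incoming.sampled)
  : null
console.log(outgoing)
```

The point to notice is that the trace ID is carried through unchanged while the parent ID changes at every hop. That is what lets a backend stitch the spans from different services into one trace.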

Production Tips

  • Sample intelligently (e.g., 10% baseline, 100% on errors)
  • Tag spans with tenant and user identifiers that are safe to expose (for example, opaque IDs rather than emails)
  • Redact PII in attributes/logs
  • Set SLO-based alerts from metrics
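Two of these tips can be sketched in plain TypeScript. The first function is a simplified stand-in for ratio-based sampling, not the algorithm any particular SDK uses. Keeping every errored trace needs a decision made after the outcome is known, which is usually done by a tail-sampling stage such as a collector rather than in the application. The second function redacts attributes by key name. The `SENSITIVE_KEYS` list and both function names are assumptions made for illustration.

```ts
// Deterministic baseline: the same trace ID gives the same answer in every service.
function inBaseline(traceId: string, ratio: number): boolean {
  // Treat the last 8 hex characters (32 bits) as a number in [0, 1).
  const bucket = parseInt(traceId.slice(-8), 16) / 0x100000000
  return bucket < ratio
}

// Keep every trace that saw an error, plus a fixed share of the rest.
function shouldKeepTrace(traceId: string, hadError: boolean, baselineRatio = 0.1): boolean {
  return hadError || inBaseline(traceId, baselineRatio)
}

// Hypothetical deny-list; adapt it to the data your services actually handle.
const SENSITIVE_KEYS = ['email', 'password', 'authorization', 'card']

type AttributeValue = string | number | boolean

// Return a copy of the attributes with sensitive values masked.
function redactAttributes(attrs: Record<string, AttributeValue>): Record<string, AttributeValue> {
  const safe: Record<string, AttributeValue> = {}
  for (const key of Object.keys(attrs)) {
    const lower = key.toLowerCase()
    const sensitive = SENSITIVE_KEYS.some((s) => lower.indexOf(s) !== -1)
    safe[key] = sensitive ? '[REDACTED]' : attrs[key]
  }
  return safe
}

const redacted = redactAttributes({ 'user.email': 'a@example.com', 'cart.items': 3 })
console.log(shouldKeepTrace('000000000000000000000000ffffffff', true), redacted)
```

Deriving the decision from the trace ID, instead of calling a random number generator in each service, keeps sampled traces complete because every hop reaches the same verdict.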

Golden Dashboard

Start with a dashboard including:

  • P50/P95/P99 latency
  • error rate by route
  • top slow spans
  • upstream dependency health
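For intuition about the first two panels, here is a minimal sketch that computes nearest-rank percentiles and a per-route error rate from raw request samples. The `RequestSample` shape and the sample data are hypothetical. In practice a metrics backend derives these numbers from histogram metrics instead of raw lists, so treat this as an explanation of the figures, not a recommended pipeline.

```ts
// Hypothetical raw sample of one handled request.
interface RequestSample {
  route: string
  durationMs: number
  statusCode: number
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN
  const sorted = values.slice().sort((a, b) => a - b)
  const rank = Math.ceil((p * sorted.length) / 100)
  return sorted[Math.max(rank, 1) - 1]
}

// Share of requests per route that ended in a 5xx response.
function errorRateByRoute(samples: RequestSample[]): Record<string, number> {
  const totals: Record<string, { all: number; errors: number }> = {}
  for (const s of samples) {
    const entry = totals[s.route] || (totals[s.route] = { all: 0, errors: 0 })
    entry.all += 1
    if (s.statusCode >= 500) entry.errors += 1
  }
  const rates: Record<string, number> = {}
  for (const route of Object.keys(totals)) {
    rates[route] = totals[route].errors / totals[route].all
  }
  return rates
}

const samples: RequestSample[] = [
  { route: '/checkout', durationMs: 120, statusCode: 200 },
  { route: '/checkout', durationMs: 2000, statusCode: 500 },
  { route: '/checkout', durationMs: 300, statusCode: 200 },
  { route: '/checkout', durationMs: 95, statusCode: 200 },
  { route: '/cart', durationMs: 80, statusCode: 200 },
]
const durations = samples.map((s) => s.durationMs)
const latency = { p50: percentile(durations, 50), p95: percentile(durations, 95), p99: percentile(durations, 99) }
const errorRates = errorRateByRoute(samples)
console.log(latency, errorRates)
```

With so few samples the tail percentiles collapse onto the single slowest request, which is a reminder that P95 and P99 only become meaningful once there is enough traffic behind them.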

Final Takeaway

Observability is not just tooling; it is feedback for architecture decisions. OpenTelemetry gives a practical baseline to understand performance, reliability, and failure behavior in Node.js systems.

Written by Anant Kumar

Systems Engineer & Full Stack Developer
