OpenTelemetry Exemplars for Metrics↔Traces Correlation (Production Playbook)
Date: 2026-03-23
Category: knowledge
Scope: Practical rollout guide for linking latency metrics to concrete traces using OpenTelemetry exemplars, Prometheus, and Grafana.
1) Why this matters
Dashboards tell you what is slow (p95/p99 spikes). Traces tell you why it was slow (DB lock, retry storm, cold cache, etc.).
Exemplars are the bridge: they attach a specific trace context to selected metric observations, so you can jump directly from a bad point on a graph to the corresponding trace.
2) Mental model (what an exemplar actually is)
From the OpenTelemetry metrics SDK spec, an exemplar records a measurement sample with:
- measured value,
- timestamp,
- measurement attributes not already preserved in the aggregate point,
- active trace/span context (trace_id, span_id).
Two-stage decision path:
- ExemplarFilter decides whether a measurement is eligible (trace_based, always_on, always_off).
- ExemplarReservoir performs the final sampling and storage of exemplars per timeseries.
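The two-stage path above can be sketched in plain Python. This is a toy illustration of the concept only, not the real OTel SDK API; every class and method name here is invented for the sketch.

```python
# Toy sketch of the two-stage exemplar decision path: filter, then reservoir.
# Names are illustrative, not the actual OpenTelemetry SDK surface.
import random

class TraceBasedExemplarFilter:
    """Stage 1: a measurement is eligible only if recorded inside a sampled span."""
    def should_sample(self, measurement):
        return measurement.get("span_sampled", False)

class FixedSizeExemplarReservoir:
    """Stage 2: keep at most `size` exemplars per timeseries (reservoir sampling)."""
    def __init__(self, size=4):
        self.size = size
        self.seen = 0
        self.exemplars = []

    def offer(self, measurement):
        self.seen += 1
        if len(self.exemplars) < self.size:
            self.exemplars.append(measurement)
        else:
            # Classic reservoir sampling: replace with probability size/seen.
            i = random.randrange(self.seen)
            if i < self.size:
                self.exemplars[i] = measurement

filt = TraceBasedExemplarFilter()
reservoir = FixedSizeExemplarReservoir(size=2)
for n in range(10):
    m = {"value": n * 0.1, "trace_id": f"{n:032x}", "span_sampled": n % 2 == 0}
    if filt.should_sample(m):   # stage 1: eligibility check
        reservoir.offer(m)      # stage 2: sampling/storage per timeseries
print(len(reservoir.exemplars))  # at most 2 exemplars survive
```

The key property the sketch demonstrates: the filter bounds *which* measurements can become exemplars, while the reservoir bounds *how many* are kept per timeseries.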
Key default behavior to remember:
- OTel default exemplar filter is TraceBased.
- Env var: OTEL_METRICS_EXEMPLAR_FILTER (values: trace_based, always_on, always_off).
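For SDKs that honor the spec-defined environment variables, the default can be pinned explicitly rather than relied on implicitly:

```shell
# trace_based is already the spec default; setting it explicitly documents intent.
export OTEL_METRICS_EXEMPLAR_FILTER=trace_based
```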
3) Pipeline prerequisites (end-to-end)
Correlation only works if every hop preserves exemplar data:
- App instrumentation emits exemplars (typically on histograms).
- Scrape/protocol path uses OpenMetrics-capable exposition when needed.
- Metrics backend stores exemplars.
- UI is configured to link the exemplar label (trace_id) to a tracing datasource (Tempo/Jaeger/etc.).
If any one of these is missing, you will see metrics and traces separately but no clickable bridge.
4) Practical implementation pattern
4.1 Instrument where exemplars are highest value
Prioritize latency histograms for user-facing critical paths:
- http_request_duration_seconds
- db_query_duration_seconds
- queue_consume_duration_seconds
Why histograms first: you care most about outliers in tail buckets, and exemplars are perfect for drilling into those.
4.2 Ensure stable cross-signal identity labels
Keep service.name, environment, and (if relevant) cluster consistent across metrics/logs/traces. Correlation UX is much smoother when all signals share the same identity spine.
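One way to enforce that identity spine is through the spec-defined resource environment variables, so all SDKs in a deployment pick up the same identity. The attribute values below (checkout, prod, eu-west-1a) are placeholders:

```shell
# Spec-defined env vars; all signals emitted by this process share one identity.
export OTEL_SERVICE_NAME=checkout
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=prod,cluster=eu-west-1a"
```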
4.3 Prometheus storage and capacity knobs
- Enable exemplar storage (feature flag path documented by Prometheus):
--enable-feature=exemplar-storage
- Configure the buffer size in config: storage.exemplars.max_exemplars (default: 100000).
- Prometheus documents roughly ~100 bytes of memory per exemplar when only a trace_id label is attached.
Rule-of-thumb memory estimate: max_exemplars * 100 bytes as a rough floor, plus overhead from any extra labels.
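Putting the two knobs together, a minimal sketch of the Prometheus side (startup flag plus config fragment; the value shown is the documented default):

```yaml
# prometheus.yml fragment; requires starting Prometheus with:
#   --enable-feature=exemplar-storage
storage:
  exemplars:
    max_exemplars: 100000   # default; ~100k * ~100 B ≈ 10 MB floor with trace_id only
```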
4.4 Remote write forwarding
If you forward to Grafana Cloud/Mimir-compatible backends, explicitly enable exemplar forwarding (send_exemplars: true) on the remote write/export path; it is off by default.
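A minimal remote_write fragment with forwarding enabled; the endpoint URL is a placeholder:

```yaml
remote_write:
  - url: https://example.com/api/prom/push   # placeholder endpoint
    send_exemplars: true                     # off by default; must be explicit
```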
5) Rollout strategy (safe in production)
Phase A — One service, one histogram
- Turn on trace-based exemplars for a single service and one latency histogram.
- Validate end-to-end click-through from dashboard point → trace.
- Confirm no scrape/parser incompatibility.
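For the dashboard point → trace validation, a typical panel query over the histogram from section 4.1 looks like the following; in Grafana, enable the "Exemplars" toggle on the query so matching exemplars render as diamonds on the series:

```promql
histogram_quantile(
  0.99,
  sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
)
```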
Phase B — Expand by SLO criticality
- Add ingress/API critical paths.
- Add top DB/cache operations tied to latency incidents.
- Keep the exemplar label set minimal (trace_id is usually enough).
Phase C — Tighten cost controls
- Keep the exemplar filter at trace_based unless actively debugging.
- Avoid always_on globally on high-QPS paths.
- Cap the storage buffer deliberately; do not assume unlimited exemplar retention.
6) Common failure modes and fixes
No diamonds in panel / no exemplars visible
- Verify backend storage enabled.
- Verify panel type supports exemplars (Grafana Time series panel).
- Verify query actually targets instrument with exemplars.
Diamonds exist but link is broken/404
- Check the datasource correlation mapping (trace_id label name, Tempo/Jaeger datasource selection).
- Ensure trace backend retention still includes the linked traces.
Exemplars exist but traces missing
- Tail sampling may have dropped the trace even though the metric exemplar was emitted.
- Align tracing sampling/retention policy with exemplar expectations.
Cardinality/cost surprises
- Don’t attach many custom exemplar labels.
- Keep shared identity labels controlled and consistent.
Format/protocol mismatch
- Ensure scrape/export path supports exemplar-capable format (OpenMetrics/proto paths as appropriate in your stack).
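To sanity-check the exposition side, it helps to know what an exemplar looks like on the wire. In the OpenMetrics text format, an exemplar trails the sample after " # "; the snippet below parses one such line (all values are illustrative):

```python
# Minimal sketch: extracting the trace_id and exemplar value from an
# OpenMetrics text-format sample line. The line content is illustrative.
line = ('http_request_duration_seconds_bucket{le="0.5"} 129 '
        '# {trace_id="0af7651916cd43dd8448eb211c80319c"} 0.437 1700000000.0')

sample, exemplar = line.split(" # ", 1)       # exemplar follows " # "
labels_part, rest = exemplar.split("} ", 1)   # '{trace_id="..."' then value [+ ts]
trace_id = labels_part.split('trace_id="')[1].rstrip('"')
exemplar_value = float(rest.split()[0])

print(trace_id)        # 0af7651916cd43dd8448eb211c80319c
print(exemplar_value)  # 0.437
```

If scraped lines never carry the " # {…}" suffix, the exposition path is dropping exemplars before Prometheus ever sees them.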
7) Opinionated defaults (good starting point)
- Exemplar filter: trace_based
- Exemplar labels: only trace_id (add span_id only if needed)
- Start scope: p95/p99 latency histograms on the top 3 critical services
- Prometheus exemplar buffer: start near the default (100k) and tune based on observed usage
- Operational check: once per deploy, verify that one-click metric→trace still works
8) Bottom line
Exemplars are one of the highest-ROI observability upgrades because they remove the manual search step between "SLO graph looks bad" and "which exact request was bad?".
Treat exemplar correlation as a productized path (instrumentation + storage + UI mapping + sampling policy), not a one-off dashboard trick.
References
- OpenTelemetry Metrics SDK (Exemplar, ExemplarFilter, ExemplarReservoir): https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md
- OpenTelemetry SDK environment variables (OTEL_METRICS_EXEMPLAR_FILTER): https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-environment-variables.md
- OpenTelemetry .NET exemplars guide (trace-based exemplar usage): https://opentelemetry.io/docs/languages/dotnet/metrics/exemplars/
- OpenTelemetry metrics data model (transformations, temporality): https://opentelemetry.io/docs/specs/otel/metrics/data-model/
- Prometheus feature flags (exemplar storage, memory note): https://prometheus.io/docs/prometheus/latest/feature_flags/#exemplars-storage
- Prometheus configuration (storage.exemplars.max_exemplars): https://prometheus.io/docs/prometheus/latest/configuration/configuration/
- Prometheus exposition formats / OpenMetrics exemplar support: https://prometheus.io/docs/instrumenting/exposition_formats/#exemplars-experimental
- Grafana correlation setup (exemplars, derived fields, cross-signal labels): https://grafana.com/docs/grafana-cloud/telemetry-signals/use-signals-together/setup-correlations/
- Prometheus native histograms overview (mergeability/sparse buckets context): https://prometheus.io/docs/specs/native_histograms/