Event-Time Watermarks & Late Data Playbook
Date: 2026-02-27
Category: software
Purpose: Practical guide for designing stream pipelines that stay correct under out-of-order/late events without exploding latency or state cost.
1) Why this matters
Most real-time bugs are time bugs.
If you aggregate by processing time only, you get low latency but non-deterministic results during backpressure, retries, partition skew, or replay. If you aggregate by event time with no guardrails, you can get unbounded state, stalled outputs, or silent data drops.
The core engineering trade-off is:
- Correctness (include late/out-of-order events)
- Freshness (emit results quickly)
- Cost (state size, checkpoint/recovery overhead)
You can’t maximize all three simultaneously. Watermark strategy is the control surface.
2) Mental model (keep this in your head)
Event time vs processing time
- Event time: when the event actually happened at the source.
- Processing time: when your engine observes it.
A healthy pipeline treats processing time as runtime reality, and event time as analytical truth.
Watermark
Operationally, a watermark is a statement like:
“I believe events older than T are mostly complete.”
Important: a watermark is not a perfect completeness guarantee. It is a bounded-risk contract.
Late data
Any event whose timestamp is older than the current watermark is late. The late-handling policy decides whether it is:
- Included via allowed lateness,
- Routed to side output / dead-letter stream,
- Dropped.
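The watermark contract and the three-way late-handling decision can be sketched in a few lines of engine-agnostic Python. This is an illustrative bounded-out-of-orderness model (watermark = max event time seen minus a fixed lag); the 60 s lag and 30 s allowed lateness are example values, not recommendations:

```python
from dataclasses import dataclass

@dataclass
class WatermarkTracker:
    """Bounded-out-of-orderness watermark: max event time seen minus a fixed lag."""
    lag_s: float                      # assumed out-of-order bound
    allowed_lateness_s: float = 0.0   # extra grace after the watermark
    _max_event_time: float = float("-inf")

    @property
    def watermark(self) -> float:
        return self._max_event_time - self.lag_s

    def classify(self, event_time_s: float) -> str:
        """Label the event against the current watermark, then advance it."""
        wm_before = self.watermark
        self._max_event_time = max(self._max_event_time, event_time_s)
        if event_time_s >= wm_before:
            return "on_time"
        if event_time_s >= wm_before - self.allowed_lateness_s:
            return "late_included"   # still allowed to update its window
        return "very_late"           # route to a side output, never drop silently

tracker = WatermarkTracker(lag_s=60, allowed_lateness_s=30)
labels = [tracker.classify(t) for t in [100, 160, 130, 90, 50]]
```

Note how the event at t=90 is late but within the grace window, while t=50 falls behind even the allowed lateness and goes to the late lane.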
3) The 5 knobs that decide everything
1. Timestamp quality
- Use source event timestamp if trustworthy.
- If upstream clocks are noisy, add ingest timestamp too (dual timestamp schema).
- Monitor clock drift distribution (p50/p95/p99).
2. Watermark lag
- Larger lag -> better completeness, higher latency/state.
- Smaller lag -> lower latency, more late drops.
Rule of thumb: start from observed inter-arrival delay percentiles, not gut feeling.
3. Window type and size
- Tumbling for stable reporting,
- Sliding for smoother signals,
- Session for bursty user/activity behavior.
Window size adds to watermark lag (plus allowed lateness) to define how long state stays resident.
4. Allowed lateness
- Extra grace after watermark for updates.
- Should be short and explicit.
- Pair with update/upsert sink semantics when possible.
5. Late-data routing
Never drop silently. Route very-late records to:
- a late_events topic/table,
- a periodic correction batch,
- a quality dashboard.
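Knob 2’s rule of thumb (derive watermark lag from observed delay percentiles) is easy to make concrete. A hypothetical helper below picks an initial lag from samples of (ingest_time − event_time); the nearest-rank percentile, p99 target, and 20% headroom are all illustrative assumptions:

```python
def percentile(sorted_vals, p):
    """Nearest-rank percentile (no interpolation; assumes sorted input)."""
    k = round(p / 100 * len(sorted_vals)) - 1
    return sorted_vals[max(0, min(len(sorted_vals) - 1, k))]

def suggest_watermark_lag(delays_s, pct=99, headroom=1.2):
    """Initial watermark lag = observed delay percentile plus ~20% headroom.

    delays_s: samples of (ingest_time - event_time) from the raw stream.
    Measured percentiles beat gut feeling; tune pct/headroom per stream.
    """
    return percentile(sorted(delays_s), pct) * headroom

# Illustrative delay samples in seconds
delays = [0.5, 1.0, 2.0, 3.0, 5.0, 8.0, 12.0, 20.0, 45.0, 90.0]
lag_s = suggest_watermark_lag(delays)
```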
4) Architecture pattern (production default)
- Bronze (raw immutable): append-only source events with event_time + ingest_time.
- Silver (streaming aggregate): event-time windows + watermark + bounded lateness.
- Late lane: side output for too-late data.
- Correction job: periodic backfill/reconciliation from bronze + late lane.
- Gold serving: upsert/merge serving table with versioned updates.
This pattern separates low-latency serving from eventual correction, avoiding false “exactly right now” promises.
5) Failure modes you should expect
A) Watermark freeze (no progress)
Symptoms:
- Window outputs stop even though ingestion continues.
- Checkpoint size grows.
Likely causes:
- An idle partition or source pinning the global watermark.
- Skewed key or stalled upstream shard.
Mitigations:
- Enable idle source handling.
- Per-partition watermarking where supported.
- Alert on watermark stagnation duration.
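The freeze mechanic is easiest to see in code: a global watermark is typically the minimum over per-partition watermarks, so one silent partition pins it indefinitely. The sketch below mirrors the idea behind idleness detection (as in Flink’s idle-source handling) without being any engine’s actual implementation:

```python
def global_watermark(partition_wms, last_seen, now_s, idle_timeout_s):
    """Minimum watermark over *active* partitions.

    partition_wms: {partition: current watermark}
    last_seen:     {partition: processing time of its last event}
    Partitions silent longer than idle_timeout_s are excluded so they
    cannot pin the global watermark.
    """
    active = [wm for p, wm in partition_wms.items()
              if now_s - last_seen[p] <= idle_timeout_s]
    # If everything is idle, fall back to the most advanced partition.
    return min(active) if active else max(partition_wms.values())

wms = {"p0": 1000, "p1": 990, "p2": 400}    # p2 stalled long ago
seen = {"p0": 5000, "p1": 5001, "p2": 3000}
wm = global_watermark(wms, seen, now_s=5002, idle_timeout_s=60)
```

Without the idleness check, the result would be 400 and every downstream window would stall; with it, p2 is excluded and the watermark advances to 990.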
B) Catch-up replay drops historical truth
Symptoms:
- After outage replay, old events are discarded as “too late”.
Mitigations:
- Temporarily widen watermark/allowed lateness for replay mode,
- or run dedicated backfill pipeline and merge corrections.
C) Sink semantics mismatch
Symptoms:
- Late updates are computed, but the sink is append-only -> duplicates/wrong totals.
Mitigations:
- Use upsert-capable sink for mutable windows,
- or explicitly model revision records (window_id, version, is_final).
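A minimal sketch of what those revision records buy you: merging by highest version per window recovers upsert semantics even from an append-only log. The record fields follow the (window_id, version, is_final) model above; the "total" field is a hypothetical aggregate:

```python
def latest_revisions(records):
    """Collapse revision records to the newest version per window (upsert sketch).

    Each record: {"window_id": ..., "version": int, "is_final": bool, "total": ...}
    An append-only sink keeps every row; merging by highest version yields
    the corrected totals instead of duplicates.
    """
    latest = {}
    for rec in records:
        cur = latest.get(rec["window_id"])
        if cur is None or rec["version"] > cur["version"]:
            latest[rec["window_id"]] = rec
    return latest

rows = [
    {"window_id": "w1", "version": 1, "is_final": False, "total": 10},
    {"window_id": "w1", "version": 2, "is_final": True,  "total": 12},  # late update
    {"window_id": "w2", "version": 1, "is_final": True,  "total": 7},
]
merged = latest_revisions(rows)
```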
6) SLOs and observability checklist
Track these by stream and by key segment:
- Watermark lag (event_time_now - watermark)
- Late-event ratio (overall + by lateness bucket)
- Side-output late volume
- Window finalize delay
- State bytes / key cardinality
- Reprocessing correction delta (% changed records)
Set guardrails:
- Watermark lag p95 > threshold for N minutes -> page
- Late ratio spike > baseline x2 -> investigate upstream timing/drift
- Correction delta > tolerance -> data quality incident
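Those three guardrails map directly to alert actions. A hypothetical evaluator (thresholds and message strings are placeholders; a real setup would live in your alerting system, not application code):

```python
def evaluate_guardrails(wm_lag_p95_s, lag_threshold_s,
                        late_ratio, late_baseline,
                        correction_delta, delta_tolerance):
    """Turn the three stream-SLO guardrails into alert actions."""
    alerts = []
    if wm_lag_p95_s > lag_threshold_s:
        alerts.append("page: watermark lag p95 over threshold")
    if late_ratio > 2 * late_baseline:
        alerts.append("investigate: late ratio spiked above 2x baseline")
    if correction_delta > delta_tolerance:
        alerts.append("incident: reprocessing correction delta over tolerance")
    return alerts

alerts = evaluate_guardrails(wm_lag_p95_s=900, lag_threshold_s=600,
                             late_ratio=0.05, late_baseline=0.04,
                             correction_delta=0.001, delta_tolerance=0.01)
```

In this example only the watermark-lag guardrail fires; the late ratio is elevated but still under twice its baseline.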
7) Tuning playbook (step-by-step)
- Measure the baseline delay distribution from the raw stream (p50/p90/p99/p99.9 by source).
- Set initial watermark lag near p99 delay.
- Choose allowed lateness based on business criticality (often smaller than you think).
- Define sink semantics (append vs upsert vs retract-aware).
- Run shadow pipeline with stricter and looser configs.
- Compare:
- freshness,
- late loss,
- cost (state/checkpoint).
- Promote config via staged rollout; keep replay mode documented.
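The shadow-comparison step can be sketched as a per-axis scoreboard. This is a hypothetical helper with made-up metric names and numbers; a real rollout decision would weigh these axes against SLOs rather than picking per-axis winners:

```python
def compare_shadow_runs(strict, loose):
    """Compare two shadow-pipeline runs on freshness, late loss, and cost.

    Each run: {"freshness_p95_s": ..., "late_loss_ratio": ..., "state_gb": ...}
    Lower is better on every axis; ties go to the strict config.
    """
    return {axis: ("strict" if strict[axis] <= loose[axis] else "loose")
            for axis in ("freshness_p95_s", "late_loss_ratio", "state_gb")}

strict_run = {"freshness_p95_s": 30,  "late_loss_ratio": 0.02,  "state_gb": 4}
loose_run  = {"freshness_p95_s": 120, "late_loss_ratio": 0.001, "state_gb": 11}
winners = compare_shadow_runs(strict_run, loose_run)
```

Here the strict (smaller-lag) config wins on freshness and cost but loses data to the late lane, which is exactly the trade-off the playbook asks you to quantify before promoting.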
8) Minimal policy template
- Event timestamp source: event_time (fallback: ingest_time + quality flag)
- Watermark lag: X minutes
- Allowed lateness: Y minutes
- Very-late handling: side output late_events
- Sink mode: upsert by (window_start, window_end, entity_key)
- Reconciliation cadence: hourly/daily
- Incident trigger: watermark stagnation > Z minutes
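The template could also be captured as machine-readable config. A hypothetical Python rendering (field names are my own; X, Y, and Z are deliberately left unset so each stream must fill them explicitly):

```python
LATE_DATA_POLICY = {
    "event_timestamp_source": "event_time",
    "timestamp_fallback": {"field": "ingest_time", "quality_flag": True},
    "watermark_lag_minutes": None,        # X: set from observed delay percentiles
    "allowed_lateness_minutes": None,     # Y: business-criticality decision
    "very_late_handling": {"mode": "side_output", "target": "late_events"},
    "sink_mode": {"type": "upsert",
                  "keys": ["window_start", "window_end", "entity_key"]},
    "reconciliation_cadence": "hourly",   # or "daily"
    "watermark_stagnation_alert_minutes": None,  # Z: incident trigger
}
```

Keeping the unset values as explicit None forces a per-stream decision at deploy time instead of a silent default.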
9) What “exactly-once” does not save you from
Exactly-once guarantees in stream engines usually protect state/sink consistency under retries/checkpoint recovery. They do not decide your event-time truth boundary.
You can still be exactly-once and wrong if watermark/lateness policies are misconfigured.
10) Practical defaults (good first production pass)
- Prefer event-time windows for analytics/business metrics.
- Keep watermark policy explicit in code and runbook.
- Treat late events as first-class data, not noise.
- Add correction lane early; retrofitting later is expensive.
- Review delay distribution quarterly (traffic patterns drift).
References
- Databricks Structured Streaming docs — watermarks and state/output semantics:
  https://docs.databricks.com/aws/en/structured-streaming/watermarks
- Confluent Cloud for Flink docs — event time, processing time, watermark behavior:
  https://docs.confluent.io/cloud/current/flink/concepts/timely-stream-processing.html
- Apache Flink docs — watermark strategy and window lifecycle/allowed lateness:
  https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/event-time/generating_watermarks/
  https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#allowed-lateness