HTTP 0-RTT + 425 Too Early: Replay-Safe API Gateway Playbook
Date: 2026-03-26
Category: knowledge
Audience: API platform / edge / security engineers running low-latency HTTPS services
1) Why this matters
TLS 1.3 and QUIC 0-RTT can reduce connect latency by letting clients send application data immediately on resumed sessions.
That speed comes with a trade-off: 0-RTT data can be replayed. If replayed requests hit state-changing endpoints, you can get duplicate side effects (duplicate orders, duplicate transfers, repeated writes).
For high-impact APIs, the right operating model is:
- allow 0-RTT only where replay is harmless,
- signal risk via Early-Data: 1,
- reject unsafe early requests with 425 Too Early,
- enforce idempotency/dedup at the application layer.
2) Protocol truth table (what standards actually imply)
TLS / QUIC layer
- 0-RTT improves first-request latency on resumption.
- Replay risk is inherent to 0-RTT; anti-replay mechanisms reduce risk but do not eliminate it.
- QUIC inherits the same replay caveat for 0-RTT application data.
HTTP layer (RFC 8470)
- Intermediaries that forward a request received in early data mark it with Early-Data: 1, so the origin can decide how to handle it.
- If a request is not safe to process in early data, the server should reject it with 425 Too Early.
- The client retries after handshake completion (1-RTT path), where the replay concern from early data no longer applies.
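The retry step can be sketched from the client side. This is a minimal sketch, not any real HTTP library's API: send is a hypothetical transport hook that sends the request as early data when called with True, waits for the handshake when called with False, and returns the status code. Capping the fallback at a single retry is also an assumption of this sketch.

```python
from typing import Callable, Tuple

TOO_EARLY = 425


def send_with_early_data_fallback(send: Callable[[bool], int]) -> Tuple[int, int]:
    """Try once in early data; on 425, retry once on the 1-RTT path.

    `send(early)` is a hypothetical hook (an assumption, not a library call).
    Returns (final_status, attempts).
    """
    status = send(True)
    if status != TOO_EARLY:
        return status, 1
    # The retry waits for handshake completion; it is never sent as early data.
    return send(False), 2
```

An SDK test can drive this with a fake send that returns 425 for early attempts and check that exactly one non-early retry follows.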
Practical implication
GET is not automatically safe in your system, and POST is not automatically unsafe if strictly idempotent.
The real classifier is: "Can replay of this exact request create harmful additional effects?"
3) Endpoint policy model (ship this first)
Define replay policy per endpoint:
- ALLOW_0RTT: Safe reads / cache lookups / health checks.
- REQUIRE_1RTT: State changes, order placement, payment, inventory reservations, workflow transitions.
- ALLOW_0RTT_WITH_IDEMPOTENCY: Mutations only if strong idempotency key + dedup window is enforced.
Example policy table:
- GET /instruments -> ALLOW_0RTT
- POST /orders -> REQUIRE_1RTT (or ALLOW_0RTT_WITH_IDEMPOTENCY if truly hardened)
- POST /transfers -> REQUIRE_1RTT
- PUT /profile -> ALLOW_0RTT_WITH_IDEMPOTENCY
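One way to hold this table in code is a plain lookup keyed by method and path. The sketch below mirrors the four example routes; the names ReplayPolicy, POLICIES, and policy_for are illustrative, and defaulting unclassified routes to REQUIRE_1RTT (fail closed) is an assumption rather than something the policy model mandates.

```python
from enum import Enum


class ReplayPolicy(Enum):
    ALLOW_0RTT = "ALLOW_0RTT"
    REQUIRE_1RTT = "REQUIRE_1RTT"
    ALLOW_0RTT_WITH_IDEMPOTENCY = "ALLOW_0RTT_WITH_IDEMPOTENCY"


# Route table mirroring the example policy table.
POLICIES = {
    ("GET", "/instruments"): ReplayPolicy.ALLOW_0RTT,
    ("POST", "/orders"): ReplayPolicy.REQUIRE_1RTT,
    ("POST", "/transfers"): ReplayPolicy.REQUIRE_1RTT,
    ("PUT", "/profile"): ReplayPolicy.ALLOW_0RTT_WITH_IDEMPOTENCY,
}


def policy_for(method: str, path: str) -> ReplayPolicy:
    """Look up the replay policy for a route.

    Assumption: routes missing from the table fail closed to REQUIRE_1RTT.
    """
    return POLICIES.get((method.upper(), path), ReplayPolicy.REQUIRE_1RTT)
```

Failing closed means a newly shipped endpoint that nobody classified is never processed from early data by accident.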
4) Gateway behavior blueprint
At edge/proxy:
- Detect early data signal (Early-Data: 1 from CDN/proxy, or local TLS early-data state).
- Resolve endpoint replay policy.
- If request is early and policy forbids it:
  - return 425 Too Early immediately.
- Otherwise, forward with replay metadata for observability.
Minimal NGINX-ish shape:
# TLS 1.3 early data enabled
ssl_early_data on;
# Pass signal to app
proxy_set_header Early-Data $ssl_early_data;
App-side logic (pseudo):
if header("Early-Data") == "1":
    if policy(endpoint) == REQUIRE_1RTT:
        return 425
    if policy(endpoint) == ALLOW_0RTT_WITH_IDEMPOTENCY:
        key = require_header("Idempotency-Key")   # reject the request if missing
        if duplicate_seen(key, hash(body), window=24h):
            return cached_or_conflict_response
process_normally()
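The gating part of that logic can be written as a small runnable function. This is a sketch under stated assumptions: headers arrive as a mapping with lower-cased names, the policy is passed as one of the three policy-name strings, a missing idempotency key is answered with 400, and any unknown policy is treated like REQUIRE_1RTT. The function name early_data_gate is illustrative.

```python
from typing import Mapping, Optional


def early_data_gate(headers: Mapping[str, str], policy: str) -> Optional[int]:
    """Return a status code to short-circuit with, or None to keep processing.

    Assumptions: header names are already lower-cased; `policy` is one of
    ALLOW_0RTT, REQUIRE_1RTT, ALLOW_0RTT_WITH_IDEMPOTENCY.
    """
    if headers.get("early-data") != "1":
        return None  # Not early data: normal processing.
    if policy == "ALLOW_0RTT":
        return None
    if policy == "ALLOW_0RTT_WITH_IDEMPOTENCY":
        # Assumption: a missing key on an early mutation is a client error (400).
        return None if headers.get("idempotency-key") else 400
    # REQUIRE_1RTT, and any unrecognized policy, fail closed with 425 Too Early.
    return 425
```

Dedup lookup is deliberately left out here; the gate only decides whether the request may proceed to it.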
5) Idempotency design that survives replay
If you allow any mutation in 0-RTT, implement all of:
- Client-supplied idempotency key (high entropy, request scoped)
- Canonical request fingerprint (method + path + normalized body hash)
- Atomic dedup store (SETNX/conditional write) with TTL window
- Response pinning (same key returns same result contract)
- Mismatch handling (same key + different payload -> explicit 409/error)
Without this, 0-RTT on mutating endpoints is usually an operational liability.
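The pieces above fit together roughly as follows. This is an in-memory sketch, not a production store: DedupStore stands in for a shared store with an atomic conditional write and TTL, the fingerprint assumes a JSON-serializable body, and holding one lock across the action is a simplification that serializes all requests. All names here are hypothetical.

```python
import hashlib
import json
import threading
import time
from typing import Any, Callable, Dict, Tuple

Response = Tuple[int, Any]


def fingerprint(method: str, path: str, body: Any) -> str:
    """Canonical fingerprint: method + path + hash of the key-sorted JSON body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    raw = f"{method.upper()} {path}\n{canonical}".encode()
    return hashlib.sha256(raw).hexdigest()


class DedupStore:
    """In-memory stand-in (assumption) for an atomic dedup store with a TTL."""

    def __init__(self, ttl_seconds: float = 24 * 3600) -> None:
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[float, str, Response]] = {}

    def execute(self, key: str, fp: str,
                action: Callable[[], Response]) -> Tuple[str, Response]:
        """Run `action` at most once per live key; return (decision, response)."""
        now = time.monotonic()
        with self._lock:  # check, run, and record happen as one atomic step
            entry = self._entries.get(key)
            if entry is not None and entry[0] > now:
                _, stored_fp, response = entry
                if stored_fp != fp:
                    # Same key, different payload: explicit conflict.
                    return "mismatch", (409, {"error": "idempotency key reused"})
                return "hit", response  # Response pinning: same key, same result.
            response = action()
            self._entries[key] = (now + self._ttl, fp, response)
            return "miss", response
```

The decision string (miss, hit, mismatch) is returned alongside the response so it can be logged directly. A real deployment would replace the lock with the store's own conditional write and handle in-flight requests explicitly.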
6) Observability: metrics to add on day one
Track separately for early-data traffic:
- request rate with Early-Data: 1
- 425 response rate by endpoint
- retry success rate after 425
- duplicate-effect prevention count (idempotency hits)
- suspected replay indicators (same key/fingerprint across distinct connections/edges)
- latency deltas: 0-RTT accepted vs 1-RTT retried
Add structured log fields:
- early_data=true|false
- endpoint_policy
- idempotency_key_present
- dedup_decision=miss|hit|mismatch
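As an illustration, the four fields can be emitted as one JSON line per request. The helper name early_data_log_record is hypothetical, and emitting null for dedup_decision when the dedup store was never consulted is an assumption of this sketch.

```python
import json
from typing import Optional

_DEDUP_DECISIONS = {"miss", "hit", "mismatch"}


def early_data_log_record(early_data: bool, endpoint_policy: str,
                          idempotency_key_present: bool,
                          dedup_decision: Optional[str] = None) -> str:
    """Serialize the structured log fields as a single JSON line.

    Assumption: dedup_decision is None when dedup was not consulted.
    """
    if dedup_decision is not None and dedup_decision not in _DEDUP_DECISIONS:
        raise ValueError(f"unknown dedup_decision: {dedup_decision!r}")
    return json.dumps({
        "early_data": early_data,
        "endpoint_policy": endpoint_policy,
        "idempotency_key_present": idempotency_key_present,
        "dedup_decision": dedup_decision,
    }, sort_keys=True)
```

Validating dedup_decision at the logging boundary keeps dashboards from silently splitting on misspelled values.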
7) Rollout plan (low-drama)
Phase 0: Inventory
Classify all public endpoints into the three policy buckets.
Phase 1: Safe-only enablement
Enable 0-RTT only for ALLOW_0RTT routes. Force 425 for all risky paths.
Phase 2: Hardened mutation pilots
For one low-risk mutation endpoint, add strict idempotency + replay observability, then cautiously allow 0-RTT.
Phase 3: Continuous verification
Run chaos/replay simulations in staging and periodically in prod canaries.
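A replay simulation can start as small as firing identical requests concurrently and counting side effects. The sketch below is self-contained and uses a toy in-process handler; ToyOrderHandler and simulate_replay are hypothetical names, and a real test would target a staging endpoint over the network instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def simulate_replay(handler: Callable[[str], int], key: str,
                    copies: int = 20) -> List[int]:
    """Send `copies` identical requests concurrently; return their status codes."""
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(handler, [key] * copies))


class ToyOrderHandler:
    """Hypothetical handler under test that dedups on an idempotency key."""

    def __init__(self) -> None:
        self.orders_created = 0
        self._seen = set()
        self._lock = threading.Lock()

    def __call__(self, key: str) -> int:
        with self._lock:  # check-and-record must be one atomic step
            if key in self._seen:
                return 200  # replay: no new side effect
            self._seen.add(key)
            self.orders_created += 1
            return 201


handler = ToyOrderHandler()
statuses = simulate_replay(handler, "order-123")
# The invariant to verify: many replays, exactly one side effect.
assert handler.orders_created == 1
assert statuses.count(201) == 1
```

Removing the lock from the toy handler makes the check-then-record step racy, so the invariant is no longer guaranteed to hold; that is the class of defect this kind of test is meant to surface.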
8) Common failure modes
- Treating HTTP method safety as sufficient policy
- Enabling 0-RTT globally at CDN without origin replay controls
- Missing dedup storage atomicity (race allows double execution)
- Returning 425 but not verifying client retry behavior in SDKs
- No endpoint-level dashboards, so replay symptoms look like random duplicates
9) Bottom line
0-RTT is a latency optimization with a replay cost, not a free speed boost.
If your API has side effects, pair it with an explicit replay policy (ALLOW_0RTT / REQUIRE_1RTT / ALLOW_0RTT_WITH_IDEMPOTENCY), enforce 425 Too Early where needed, and prove correctness with dedup telemetry.
That turns 0-RTT from a security footgun into a controlled performance optimization.
References
- RFC 8446: The Transport Layer Security (TLS) Protocol Version 1.3
  https://www.rfc-editor.org/rfc/rfc8446
- RFC 8470: Using Early Data in HTTP
  https://www.rfc-editor.org/rfc/rfc8470
- RFC 9001: Using TLS to Secure QUIC
  https://www.rfc-editor.org/rfc/rfc9001
- NGINX docs: ssl_early_data and replay caveat
  https://nginx.org/en/docs/http/ngx_http_ssl_module.html
- Cloudflare blog: QUIC 0-RTT resumption, Early-Data: 1, and 425 pattern
  https://blog.cloudflare.com/even-faster-connection-establishment-with-quic-0-rtt-resumption/
- AWS NLB TLS listener note (0-RTT / early_data not implemented)
  https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-listeners.html