HTTP 0-RTT + 425 Too Early — Replay-Safe API Gateway Playbook

Date: 2026-03-26
Category: knowledge
Audience: API platform / edge / security engineers running low-latency HTTPS services

1) Why this matters

TLS 1.3 and QUIC 0-RTT cut connection latency by letting a client send application data in its first flight when resuming a session, before the handshake completes.

That speed comes with a trade-off: 0-RTT data can be replayed. If replayed requests hit state-changing endpoints, you can get duplicate side effects (duplicate orders, duplicate transfers, repeated writes).

For high-impact APIs, the right operating model is:

allow 0-RTT only where replay is harmless,
signal risk via Early-Data: 1,
reject unsafe early requests with 425 Too Early,
enforce idempotency/dedup at the application layer.

2) Protocol truth table (what standards actually imply)

TLS / QUIC layer

0-RTT improves first-request latency on resumption.
Replay risk is inherent to 0-RTT; anti-replay mechanisms reduce risk but do not eliminate it.
QUIC inherits the same replay caveat for 0-RTT application data.

HTTP layer (RFC 8470)

Intermediaries that forward a request received in early data add Early-Data: 1, so the origin knows the request may be a replay.
If a request is not safe to process in early data, the server should reject it with 425 Too Early.
The client retries after the handshake completes (1-RTT path), where the early-data replay concern no longer applies.
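
The retry step can be sketched from the client side. This is a minimal illustration, not a real client: `send_with_425_retry`, the `Response` type, and the `send(early_data)` transport hook are all hypothetical names. HTTP stacks that support 0-RTT typically perform this retry themselves, so treat the sketch as a model of the behavior to verify in your SDKs.

```python
from dataclasses import dataclass
from typing import Callable

TOO_EARLY = 425  # HTTP status defined by RFC 8470


@dataclass
class Response:
    status: int
    body: str = ""


def send_with_425_retry(send: Callable[[bool], Response]) -> Response:
    """Try the request in early data; on 425, resend it once after the handshake.

    `send(early_data)` is a hypothetical transport hook: True means "send in
    0-RTT", False means "wait for the handshake to complete, then send".
    """
    response = send(True)
    if response.status == TOO_EARLY:
        # Retry exactly once on the 1-RTT path; never loop on 425.
        response = send(False)
    return response


# Fake origin that refuses early data, to exercise the retry path.
calls = []


def fake_send(early_data: bool) -> Response:
    calls.append(early_data)
    return Response(TOO_EARLY) if early_data else Response(200, "ok")


result = send_with_425_retry(fake_send)
```

The single retry is deliberate: a 425 on the 1-RTT path would indicate a server or intermediary problem, not something another resend fixes.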

Practical implication

GET is not automatically safe in your system (some GET handlers have side effects), and POST is not automatically unsafe if it is strictly idempotent.

The real classifier is: “Can replay of this exact request create harmful additional effects?”

3) Endpoint policy model (ship this first)

Define replay policy per endpoint:

ALLOW_0RTT: safe reads, cache lookups, health checks.
REQUIRE_1RTT: state changes, order placement, payment, inventory reservations, workflow transitions.
ALLOW_0RTT_WITH_IDEMPOTENCY: mutations, only if a strong idempotency key and a dedup window are enforced.

Example policy table:

GET /instruments → ALLOW_0RTT
POST /orders → REQUIRE_1RTT (or ALLOW_0RTT_WITH_IDEMPOTENCY if truly hardened)
POST /transfers → REQUIRE_1RTT
PUT /profile → ALLOW_0RTT_WITH_IDEMPOTENCY
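
One way to encode the example table is a small lookup keyed by method and path. This is a sketch under assumptions: the names `ReplayPolicy`, `POLICIES`, and `policy_for` are illustrative, exact-path matching stands in for whatever router you use, and defaulting unclassified routes to REQUIRE_1RTT is a fail-closed choice made here, not something the table itself specifies.

```python
from enum import Enum


class ReplayPolicy(Enum):
    ALLOW_0RTT = "ALLOW_0RTT"
    REQUIRE_1RTT = "REQUIRE_1RTT"
    ALLOW_0RTT_WITH_IDEMPOTENCY = "ALLOW_0RTT_WITH_IDEMPOTENCY"


# (method, path) -> policy, mirroring the example policy table.
POLICIES = {
    ("GET", "/instruments"): ReplayPolicy.ALLOW_0RTT,
    ("POST", "/orders"): ReplayPolicy.REQUIRE_1RTT,
    ("POST", "/transfers"): ReplayPolicy.REQUIRE_1RTT,
    ("PUT", "/profile"): ReplayPolicy.ALLOW_0RTT_WITH_IDEMPOTENCY,
}


def policy_for(method: str, path: str) -> ReplayPolicy:
    """Look up the replay policy; unclassified routes fail closed (assumption)."""
    return POLICIES.get((method.upper(), path), ReplayPolicy.REQUIRE_1RTT)
```

Failing closed means a newly added endpoint costs one extra round trip until someone classifies it, instead of silently accepting replayable requests.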

4) Gateway behavior blueprint

At edge/proxy:

Detect early data signal (Early-Data: 1 from CDN/proxy, or local TLS early-data state).
Resolve endpoint replay policy.
If the request is early and the policy forbids it, return 425 Too Early immediately.
Otherwise, forward with replay metadata for observability.

Minimal NGINX-ish shape:

# TLS 1.3 early data enabled
ssl_early_data on;

# Pass the signal to the app: $ssl_early_data is "1" while the handshake is incomplete, empty otherwise
proxy_set_header Early-Data $ssl_early_data;

App-side logic (pseudo):

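if header("Early-Data") == "1":
  if policy(endpoint) == REQUIRE_1RTT:
    return 425
  if policy(endpoint) == ALLOW_0RTT_WITH_IDEMPOTENCY:
    key = header("Idempotency-Key")
    if key is missing:
      return 400  # cannot deduplicate without a key (425 is also defensible)
    # claim() must be an atomic set-if-absent with a 24h TTL
    decision = claim(key, fingerprint(request), ttl=24h)
    if decision == HIT:
      return cached_response(key)
    if decision == MISMATCH:
      return 409
response = process_normally()
pin_response(key, response) if key was claimed
return response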
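
The gate portion of that logic can be written as a small runnable function. This is a sketch, assuming headers arrive as a plain mapping and policies as the string names used in this playbook; `early_data_gate` is a hypothetical name, and answering a missing Idempotency-Key with 400 is an assumption (425 would also be defensible).

```python
from typing import Mapping, Optional

ALLOW_0RTT = "ALLOW_0RTT"
REQUIRE_1RTT = "REQUIRE_1RTT"
ALLOW_0RTT_WITH_IDEMPOTENCY = "ALLOW_0RTT_WITH_IDEMPOTENCY"


def early_data_gate(headers: Mapping[str, str], policy: str) -> Optional[int]:
    """Return an HTTP status to reject with, or None to keep processing."""
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if normalized.get("early-data") != "1":
        return None  # not early data: no 0-RTT replay concern
    if policy == REQUIRE_1RTT:
        return 425  # Too Early: the client should retry after the handshake
    if policy == ALLOW_0RTT_WITH_IDEMPOTENCY and not normalized.get("idempotency-key"):
        return 400  # assumption: a missing key is treated as a client error
    return None


rejected = early_data_gate({"Early-Data": "1"}, REQUIRE_1RTT)
allowed = early_data_gate({"Early-Data": "1"}, ALLOW_0RTT)
```

Deduplication itself is left out on purpose: the gate only decides whether an early request may proceed, and the dedup check runs afterwards for requests that carry a key.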

5) Idempotency design that survives replay

If you allow any mutation in 0-RTT, implement all of:

Client-supplied idempotency key (high entropy, request scoped)
Canonical request fingerprint (method + path + normalized body hash)
Atomic dedup store (SETNX/conditional write) with TTL window
Response pinning (same key returns same result contract)
Mismatch handling (same key + different payload ⇒ explicit 409/error)

Without this, 0-RTT on mutating endpoints is usually an operational liability.
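
The pieces above can be sketched together in a few dozen lines. This is an in-memory illustration only: `fingerprint` and `DedupStore` are hypothetical names, the body is assumed to be JSON-serializable, and the lock stands in for the atomic conditional write a shared store would provide in production.

```python
import hashlib
import json
import threading
import time
from typing import Dict, Optional, Tuple


def fingerprint(method: str, path: str, body: dict) -> str:
    """Canonical request fingerprint: method + path + normalized body hash."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    raw = f"{method.upper()}\n{path}\n{canonical}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class DedupStore:
    """In-memory stand-in for an atomic set-if-absent store with a TTL."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        # key -> (fingerprint, pinned response or None while in flight, expiry)
        self._entries: Dict[str, Tuple[str, Optional[dict], float]] = {}

    def claim(self, key: str, fp: str) -> Tuple[str, Optional[dict]]:
        """Atomically claim a key; returns ("miss" | "hit" | "mismatch", response)."""
        now = time.monotonic()
        with self._lock:  # check and set under one lock so two racers cannot both miss
            entry = self._entries.get(key)
            if entry is None or entry[2] <= now:
                self._entries[key] = (fp, None, now + self._ttl)
                return "miss", None
            if entry[0] != fp:
                return "mismatch", None
            return "hit", entry[1]

    def pin(self, key: str, response: dict) -> None:
        """Record the response so later duplicates get the same result."""
        with self._lock:
            fp, _, expiry = self._entries[key]
            self._entries[key] = (fp, response, expiry)


store = DedupStore(ttl_seconds=24 * 3600)
fp = fingerprint("PUT", "/profile", {"name": "a"})
first = store.claim("key-1", fp)
store.pin("key-1", {"status": 200})
replayed = store.claim("key-1", fp)
changed = store.claim("key-1", fingerprint("PUT", "/profile", {"name": "b"}))
```

A "hit" whose response is still None means the original request is in flight; the caller has to decide whether to wait or return a conflict, which this sketch leaves open.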

6) Observability: metrics to add on day one

Track separately for early-data traffic:

request rate with Early-Data: 1
425 response rate by endpoint
retry success rate after 425
duplicate-effect prevention count (idempotency hits)
suspected replay indicators (same key/fingerprint across distinct connections/edges)
latency deltas: 0-RTT accepted vs 1-RTT retried

Add structured log fields:

early_data=true|false
endpoint_policy
idempotency_key_present
dedup_decision=miss|hit|mismatch
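
A minimal sketch of emitting those fields as one JSON log line follows. The function name `early_data_log_record` is hypothetical, and logging only the presence of the idempotency key (not its value) is an assumption made here to keep client-supplied tokens out of logs.

```python
import json
from typing import Optional


def early_data_log_record(early_data: bool, endpoint_policy: str,
                          idempotency_key: Optional[str],
                          dedup_decision: Optional[str]) -> str:
    """Build one JSON log line carrying the early-data fields."""
    if dedup_decision not in (None, "miss", "hit", "mismatch"):
        raise ValueError(f"unknown dedup_decision: {dedup_decision!r}")
    record = {
        "early_data": early_data,
        "endpoint_policy": endpoint_policy,
        # Log presence only; the key value itself stays out of the logs.
        "idempotency_key_present": idempotency_key is not None,
        "dedup_decision": dedup_decision,
    }
    return json.dumps(record, sort_keys=True)


line = early_data_log_record(True, "ALLOW_0RTT_WITH_IDEMPOTENCY", "abc123", "hit")
```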

7) Rollout plan (low-drama)

Phase 0 — Inventory

Classify all public endpoints into the three policy buckets.

Phase 1 — Safe-only enablement

Enable 0-RTT only for ALLOW_0RTT routes. Force 425 for all risky paths.

Phase 2 — Hardened mutation pilots

For one low-risk mutation endpoint, add strict idempotency + replay observability, then cautiously allow 0-RTT.

Phase 3 — Continuous verification

Run chaos/replay simulations in staging and periodically in prod canaries.
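
A replay simulation can start as small as delivering the same request several times and asserting that the side effect happened once. The sketch below uses a toy in-process handler; `replay` and `OrderService` are hypothetical names, and a real simulation would resend captured early data against a staging gateway.

```python
from typing import Callable, Dict, List


def replay(handler: Callable[[Dict], int], request: Dict, times: int) -> List[int]:
    """Deliver the same request `times` times, as a replaying attacker would."""
    return [handler(dict(request)) for _ in range(times)]


class OrderService:
    """Toy handler: executes an order at most once per idempotency key."""

    def __init__(self) -> None:
        self.executed: List[str] = []
        self._pinned: Dict[str, int] = {}

    def handle(self, request: Dict) -> int:
        key = request["idempotency_key"]
        if key in self._pinned:
            # Duplicate: return the pinned status without repeating the side effect.
            return self._pinned[key]
        self.executed.append(request["order_id"])
        self._pinned[key] = 201
        return 201


service = OrderService()
statuses = replay(service.handle, {"idempotency_key": "k1", "order_id": "o-1"}, times=5)
```

The property worth asserting is on `service.executed`, not on the status codes: five deliveries should leave exactly one executed order.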

8) Common failure modes

Treating HTTP method safety as sufficient policy
Enabling 0-RTT globally at CDN without origin replay controls
Missing dedup storage atomicity (race allows double execution)
Returning 425 but not verifying client retry behavior in SDKs
No endpoint-level dashboards, so replay symptoms look like random duplicates

9) Bottom line

0-RTT trades replay exposure for latency; it is not a free speed boost.

If your API has side effects, pair it with an explicit replay policy (ALLOW_0RTT / REQUIRE_1RTT / ALLOW_0RTT_WITH_IDEMPOTENCY), enforce 425 Too Early where needed, and prove correctness with dedup telemetry.

That turns 0-RTT from a security footgun into a controlled performance optimization.

References

RFC 8446 — The Transport Layer Security (TLS) Protocol Version 1.3
https://www.rfc-editor.org/rfc/rfc8446
RFC 8470 — Using Early Data in HTTP
https://www.rfc-editor.org/rfc/rfc8470
RFC 9001 — Using TLS to Secure QUIC
https://www.rfc-editor.org/rfc/rfc9001
NGINX docs — ssl_early_data and replay caveat
https://nginx.org/en/docs/http/ngx_http_ssl_module.html
Cloudflare blog — QUIC 0-RTT resumption, Early-Data: 1, and 425 pattern
https://blog.cloudflare.com/even-faster-connection-establishment-with-quic-0-rtt-resumption/
AWS NLB TLS listener note (0-RTT / early_data not implemented)
https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-listeners.html