HTTP Caching Revalidation & Invalidation Playbook
Date: 2026-03-15
Category: knowledge
Why this matters
Most teams don't fail at HTTP caching because they forgot max-age.
They fail because cache behavior is inconsistent across browser, CDN, and origin, and because invalidation strategy is vague until incident day.
Typical symptoms:
- stale HTML after deploy,
- “works after hard refresh” bug reports,
- origin traffic spikes after broad cache purge,
- inconsistent locale/device variants due to weak cache keys.
This playbook gives a practical, production-first model for cache correctness.
Mental model: split cache policy by asset class
Treat these as different products, not one policy:
- Versioned static assets (hashed JS/CSS/images)
- HTML/document shells
- Personalized or sensitive responses
- API responses (public vs user-scoped)
If you apply one blanket policy, you'll either serve stale critical content or give up most cache efficiency.
Recommended baseline policies
1) Versioned static assets (best cache hit ratio)
Put a content hash in the filename and use a long TTL.
Example:
Cache-Control: public, max-age=31536000, immutable
Why:
- hashed URL guarantees new deploy => new URL,
- immutable (RFC 8246) avoids needless revalidation while fresh,
- origin load drops dramatically.
Rule: never mutate content behind the same hashed URL.
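A minimal sketch of how a build step might derive such a URL, assuming the hash is taken from the file bytes. The function name hashed_name, the 12-character digest length, and the sample paths are illustrative assumptions, not part of any specific bundler.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 12) -> str:
    """Return a name like 'app.<digest>.js', where <digest> depends only on the bytes."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    # Insert the digest before the final extension: app.js -> app.<digest>.js
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


# Hypothetical file contents: any change in bytes yields a different URL.
v1 = hashed_name("static/app.js", b"console.log('v1');")
v2 = hashed_name("static/app.js", b"console.log('v2');")
```

Because the name is a pure function of the content, an unchanged file keeps its URL across deploys and a changed file can never collide with a cached copy of the old one.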
2) HTML / app shell (fast updates, still cache-friendly)
Use short freshness, revalidation, and stale-serving directives (stale-while-revalidate, stale-if-error).
Example:
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=30, stale-if-error=600
ETag: "..."
Interpretation:
- browser freshness is short (max-age=60),
- shared caches/CDN can hold longer (s-maxage=300),
- stale window prevents request stampedes during refresh,
- conditional requests via ETag keep bytes low.
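To make the browser vs. shared-cache split concrete, here is a simplified sketch of how a cache could pick its freshness lifetime from the header above. The helper names are hypothetical, and the parser is deliberately naive: it splits on commas and ignores quoted-string edge cases and heuristic freshness.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(value: str, shared: bool) -> int:
    """Seconds the response counts as fresh for a private (browser) or shared cache."""
    d = parse_cache_control(value)
    # For shared caches, s-maxage takes precedence over max-age.
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0  # simplification: treat "no explicit lifetime" as immediately stale


HTML_POLICY = "public, max-age=60, s-maxage=300, stale-while-revalidate=30, stale-if-error=600"
browser_ttl = freshness_lifetime(HTML_POLICY, shared=False)  # 60
cdn_ttl = freshness_lifetime(HTML_POLICY, shared=True)       # 300
```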
3) Personalized/sensitive responses
Use strict no-store for truly sensitive data.
Cache-Control: no-store
For user-scoped responses that may be cached in the browser only:
Cache-Control: private, no-cache
(no-cache means revalidate before reuse, not “do not cache at all”).
4) Public API responses
For stable public data, combine moderate TTL + validators:
Cache-Control: public, max-age=120, s-maxage=300, stale-while-revalidate=30
ETag: "..."
Vary: Accept-Encoding
Avoid overusing Vary; each added dimension multiplies cache key cardinality.
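The multiplication can be shown with a few lines of arithmetic. The per-header value counts below are made-up illustrations, not measurements.

```python
from math import prod
from typing import Dict


def variant_count(distinct_values_per_header: Dict[str, int]) -> int:
    """Upper bound on cached variants per URL: the product of distinct values per Vary header."""
    return prod(distinct_values_per_header.values())


# Hypothetical counts of distinct request-header values seen in traffic.
encoding_only = variant_count({"Accept-Encoding": 3})
with_language = variant_count({"Accept-Encoding": 3, "Accept-Language": 20})
with_user_agent = variant_count(
    {"Accept-Encoding": 3, "Accept-Language": 20, "User-Agent": 500}
)
```

Under these assumed counts, one URL goes from 3 possible variants to 60 to 30000, which is why normalizing or dropping a Vary dimension usually matters more than tuning TTLs.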
Revalidation correctness checklist
A cache strategy is only as good as validator discipline.
- Emit strong validators where possible
- Prefer stable, deterministic ETag generation per representation.
- Honor conditional requests correctly
- If-None-Match / If-Modified-Since should return 304 only when the representation is truly unchanged.
- Keep validator scope aligned with variants
- If response varies by locale/device/encoding, validator must reflect that representation.
- Avoid weak timestamp-only logic for highly dynamic responses
- clock skew + coarse granularity can cause false 304s.
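The checklist can be sketched as two small functions: one that derives a deterministic ETag from the representation bytes plus a variant label, and one that decides between 200 and 304 for If-None-Match. The names make_etag and conditional_status are hypothetical, and the header parsing assumes entity tags that contain no commas (true for the hex tags generated here).

```python
import hashlib
from typing import Optional


def make_etag(body: bytes, variant: str = "") -> str:
    """Strong ETag from the exact bytes served, scoped by a variant label (e.g. locale)."""
    h = hashlib.sha256()
    h.update(variant.encode("utf-8"))
    h.update(b"\0")  # separator so variant and body cannot blur together
    h.update(body)
    return f'"{h.hexdigest()[:32]}"'


def _opaque(tag: str) -> str:
    """Drop the weak prefix; If-None-Match uses weak comparison."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_status(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 only when a supplied tag matches the current representation."""
    if if_none_match is None:
        return 200
    if if_none_match.strip() == "*":
        return 304  # "*" matches any existing representation
    candidates = [_opaque(t) for t in if_none_match.split(",")]
    return 304 if _opaque(current_etag) in candidates else 200


etag_en = make_etag(b"<h1>Hello</h1>", variant="en")
etag_fr = make_etag(b"<h1>Hello</h1>", variant="fr")
```

Keeping the variant in the hash input is one way to satisfy the "validator scope aligned with variants" item: identical bytes served under different locales still get different tags.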
Invalidation strategy: purge surgically, not globally
Global purge is a recovery tool, not a deploy routine.
Prefer tag/key-based purge:
- Fastly: Surrogate-Key
- Cloudflare: Cache-Tag
Pattern:
- tag responses by content entity (post:123, author:42, category:music),
- purge by tag on update,
- keep versioned assets immutable (usually no purge needed).
Benefits:
- small blast radius,
- reduced origin thundering-herd risk,
- easier auditing of what was purged and why.
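The pattern can be modeled in memory to show why the blast radius stays small. This is a toy model of a tag index, not the Fastly or Cloudflare API; the class TaggedCache and the sample URLs are assumptions for illustration.

```python
from collections import defaultdict
from typing import Iterable, List, Optional


class TaggedCache:
    """Toy cache that indexes entries by tag so a purge touches only related URLs."""

    def __init__(self) -> None:
        self._entries = {}                    # url -> body
        self._urls_by_tag = defaultdict(set)  # tag -> {url, ...}

    def store(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        self._entries[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._entries.get(url)

    def purge_tag(self, tag: str) -> List[str]:
        """Evict every entry carrying the tag; return the purged URLs for auditing."""
        urls = self._urls_by_tag.pop(tag, set())
        for url in urls:
            self._entries.pop(url, None)
        # Remove evicted URLs from the remaining tag sets as well.
        for remaining in self._urls_by_tag.values():
            remaining -= urls
        return sorted(urls)


cache = TaggedCache()
cache.store("/posts/123", b"post 123", ["post:123", "author:42"])
cache.store("/posts/456", b"post 456", ["post:456", "author:42"])
cache.store("/posts/789", b"post 789", ["post:789", "author:7"])
purged = cache.purge_tag("author:42")
```

Updating author 42 evicts only the two posts tagged with that author; the third entry stays cached, and the returned list doubles as an audit record.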
Anti-patterns that cause incidents
- no-store everywhere
- correctness is easy, performance collapses.
- Long TTL on mutable HTML without validators
- stale deploys and support tickets.
- Unbounded Vary dimensions
- cache fragmentation, low hit ratios.
- Purge-everything as default deploy step
- origin load spikes and unstable latency.
- ETag generated from non-deterministic metadata
- phantom cache misses and revalidation churn.
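Some of these anti-patterns can be caught mechanically from response headers. The sketch below is one possible lint, assuming hypothetical thresholds (a "long" TTL of one hour, at most two Vary dimensions); tune both to your own traffic.

```python
import re
from typing import Dict, List


def lint_policy(headers: Dict[str, str], is_versioned_asset: bool,
                long_ttl: int = 3600, max_vary: int = 2) -> List[str]:
    """Return warnings for one response's caching headers."""
    h = {name.lower(): value for name, value in headers.items()}
    cache_control = h.get("cache-control", "").lower()
    warnings = []

    # Match max-age but not a longer directive name that merely ends with it.
    match = re.search(r"(?<![\w-])max-age=(\d+)", cache_control)
    max_age = int(match.group(1)) if match else 0
    has_validator = "etag" in h or "last-modified" in h
    if not is_versioned_asset and max_age > long_ttl and not has_validator:
        warnings.append("long max-age on mutable content without a validator")

    if is_versioned_asset and "no-store" in cache_control:
        warnings.append("no-store on a versioned asset discards cache efficiency")

    vary = [v.strip() for v in h.get("vary", "").split(",") if v.strip()]
    if "*" in vary or len(vary) > max_vary:
        warnings.append("Vary is unbounded or has too many dimensions")
    return warnings
```

For example, a mutable page served with `Cache-Control: public, max-age=86400` and no ETag would produce the first warning, while the versioned-asset policy from earlier produces none.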
30-day rollout plan
Week 1 — Inventory + classification
- classify top endpoints into asset classes,
- map existing cache headers and hit ratios,
- identify stale bug hotspots.
Week 2 — Validator and policy hardening
- normalize ETag/Last-Modified behavior,
- apply class-based Cache-Control defaults,
- reduce unnecessary Vary dimensions.
Week 3 — Tag-based purge adoption
- introduce cache tags/surrogate keys for mutable content,
- wire publish/update events to selective purge API calls,
- keep global purge as emergency fallback only.
Week 4 — Observability and guardrails
Track:
- CDN hit ratio by route class,
- conditional request rate and 304 ratio,
- stale serve rate (including stale-while-revalidate and stale-if-error windows),
- purge volume (global vs tag-based),
- origin QPS spike after deploy.
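A rough sketch of how the first three metrics might be computed from CDN logs. The LogRecord fields and the cache status values ("HIT", "MISS", "STALE") are assumptions; real log schemas differ by provider.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LogRecord:
    route_class: str   # e.g. "html", "static", "api"
    cache_status: str  # assumed values: "HIT", "MISS", "STALE"
    conditional: bool  # request carried If-None-Match / If-Modified-Since
    status: int        # response status code


def hit_ratio_by_class(records: List[LogRecord]) -> Dict[str, float]:
    """Fraction of requests served from cache (HIT or STALE), per route class."""
    total, served = Counter(), Counter()
    for r in records:
        total[r.route_class] += 1
        if r.cache_status in ("HIT", "STALE"):
            served[r.route_class] += 1
    return {cls: served[cls] / n for cls, n in total.items()}


def stale_serve_rate(records: List[LogRecord]) -> float:
    """Fraction of all requests answered with a stale copy."""
    if not records:
        return 0.0
    return sum(r.cache_status == "STALE" for r in records) / len(records)


def not_modified_ratio(records: List[LogRecord]) -> float:
    """Among conditional requests, the fraction answered with 304."""
    conditional = [r for r in records if r.conditional]
    if not conditional:
        return 0.0
    return sum(r.status == 304 for r in conditional) / len(conditional)


sample = [
    LogRecord("html", "HIT", False, 200),
    LogRecord("html", "HIT", True, 304),
    LogRecord("html", "STALE", False, 200),
    LogRecord("html", "MISS", True, 200),
    LogRecord("static", "HIT", False, 200),
]
```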
Set guardrails:
- deploy blocked if HTML route lacks explicit cache policy,
- alert if global purge frequency exceeds threshold,
- alert on abrupt hit-ratio regression.
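These guardrails reduce to simple predicates that a CI job or alerting rule could evaluate. The function names and the default thresholds (a 10-point hit-ratio drop, one global purge per window) are placeholders, not recommendations.

```python
from typing import Dict, List


def routes_missing_policy(route_headers: Dict[str, Dict[str, str]]) -> List[str]:
    """Return HTML routes whose response headers lack an explicit Cache-Control."""
    missing = []
    for route, headers in sorted(route_headers.items()):
        names = {name.lower() for name in headers}
        if "cache-control" not in names:
            missing.append(route)
    return missing


def hit_ratio_regressed(baseline: float, current: float, max_drop: float = 0.10) -> bool:
    """True when the hit ratio fell by more than max_drop (absolute) from baseline."""
    return (baseline - current) > max_drop


def global_purges_exceeded(purge_count: int, threshold: int = 1) -> bool:
    """True when global purges in the observation window exceed the threshold."""
    return purge_count > threshold


# Hypothetical deploy check: block if any HTML route has no policy.
observed = {
    "/": {"Cache-Control": "public, max-age=60, s-maxage=300"},
    "/about": {"Content-Type": "text/html"},
}
blocked_routes = routes_missing_policy(observed)
deploy_allowed = not blocked_routes
```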
Practical header templates
Versioned static file
Cache-Control: public, max-age=31536000, immutable
SSR/HTML page (non-personalized)
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=30, stale-if-error=600
ETag: "<stable-representation-hash>"
Vary: Accept-Encoding
Authenticated profile page
Cache-Control: private, no-cache
Vary: Cookie, Accept-Encoding
Sensitive account/checkout endpoint
Cache-Control: no-store
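One way to keep these templates from drifting is to hold them in a single lookup keyed by asset class. The class names and the headers_for helper below are assumptions for illustration; the header values are copied from the templates above.

```python
from typing import Dict, Optional

POLICY_TEMPLATES: Dict[str, Dict[str, str]] = {
    "versioned_static": {
        "Cache-Control": "public, max-age=31536000, immutable",
    },
    "html": {
        "Cache-Control": "public, max-age=60, s-maxage=300, "
                         "stale-while-revalidate=30, stale-if-error=600",
        "Vary": "Accept-Encoding",
    },
    "authenticated": {
        "Cache-Control": "private, no-cache",
        "Vary": "Cookie, Accept-Encoding",
    },
    "sensitive": {
        "Cache-Control": "no-store",
    },
}


def headers_for(asset_class: str, etag: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of the caching headers for an asset class.

    Unknown classes raise instead of falling back to a default, so a new
    route cannot ship without an explicit policy.
    """
    if asset_class not in POLICY_TEMPLATES:
        raise ValueError(f"no cache policy defined for asset class {asset_class!r}")
    headers = dict(POLICY_TEMPLATES[asset_class])
    # Only the HTML template above carries an ETag, so attach it there only.
    if etag is not None and asset_class == "html":
        headers["ETag"] = etag
    return headers


html_headers = headers_for("html", etag='"abc123"')
```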
Bottom line
Good HTTP caching is an operations contract:
- deterministic validators,
- class-specific freshness policy,
- surgical invalidation,
- clear observability around stale risk vs origin load.
Do this well and you get both speed and correctness. Skip it and your CDN becomes a randomizer.
References
- RFC 9111: HTTP Caching (obsoletes RFC 7234)
https://www.rfc-editor.org/rfc/rfc9111.html
- RFC 5861: HTTP Cache-Control Extensions for Stale Content (stale-while-revalidate, stale-if-error)
https://www.rfc-editor.org/rfc/rfc5861
- RFC 8246: HTTP Immutable Responses (immutable)
https://www.rfc-editor.org/rfc/rfc8246.html
- MDN Cache-Control reference (practical directive behavior)
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Cache-Control
- Cloudflare docs: Purge by cache tags (Cache-Tag)
https://developers.cloudflare.com/cache/how-to/purge-cache/purge-by-tags/
- Fastly docs: Working with surrogate keys (Surrogate-Key)
https://www.fastly.com/documentation/guides/full-site-delivery/purging/working-with-surrogate-keys/