Kubernetes NodeLocal DNSCache + CoreDNS Scaling Playbook

2026-03-29 · software

How to stop DNS from becoming your hidden latency tax and intermittent outage source.

Why this matters

In many clusters, DNS is treated as “just plumbing” until one of these happens:

  - p99 latency quietly climbs because every request path pays a slow lookup,
  - deploys or node churn trigger bursts of resolution timeouts,
  - a CoreDNS overload or upstream flap turns into a cluster-wide outage.

In practice, DNS failures are often fan-out multipliers: one small resolver issue can hit every service path.

This playbook gives an operator-focused pattern for:

  1. scaling CoreDNS safely,
  2. reducing cross-node DNS hops via NodeLocal DNSCache,
  3. tuning cache behavior to reduce backend load without serving stale answers forever.

1) Mental model: where DNS latency actually comes from

Without NodeLocal DNSCache, a pod’s DNS request usually goes:

Pod -> kube-dns Service IP -> kube-proxy DNAT -> CoreDNS pod (often on a remote node) -> upstream

With NodeLocal DNSCache:

Pod -> node-local-dns on same node -> (cache hit: done) OR (miss: CoreDNS/upstream)
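To see why the cache hit ratio dominates, here is a toy expected-latency model. All numbers are illustrative assumptions, not measurements:

```python
def expected_latency_ms(hit_ratio: float, local_ms: float, remote_ms: float) -> float:
    """Expected lookup latency: hits are answered on-node, misses pay
    the cross-node CoreDNS (and possibly upstream) path."""
    return hit_ratio * local_ms + (1 - hit_ratio) * remote_ms

# Without NodeLocal every lookup pays the remote path; with a 90% hit
# ratio the expected latency drops roughly 7x under these assumptions.
no_cache = expected_latency_ms(0.0, local_ms=0.2, remote_ms=4.0)    # ~4.0 ms
with_cache = expected_latency_ms(0.9, local_ms=0.2, remote_ms=4.0)  # ~0.58 ms
```

The point is not the exact numbers but the shape: once most lookups are local hits, tail latency is driven by the miss path, so the miss rate is the metric to watch.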

Operational implications:

  - cache hits skip the kube-proxy/conntrack hop entirely, removing a class of UDP conntrack races,
  - node-local-dns forwards misses to CoreDNS over TCP, which behaves better under packet loss,
  - the failure domain shrinks to a node: if the local cache pod dies, only that node’s pods notice,
  - CoreDNS’s QPS profile changes after rollout, since it now serves mostly cache misses.


2) First principles for capacity planning

A) CoreDNS replicas (autoscaler)

Kubernetes DNS horizontal autoscaling commonly uses cluster-proportional-autoscaler (CPA).

Default linear model idea:

replicas = max( ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica) )

So:

  - replicas track whichever dimension (total cores or node count) demands more,
  - coresPerReplica covers dense clusters of large nodes; nodesPerReplica covers sparse, many-node clusters,
  - a minimum value guards against scaling down to a single replica.

Start conservative, then tune from observed saturation and SLOs.
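The linear model above can be sketched in a few lines. The parameter values here are placeholders to tune for your cluster, not upstream defaults:

```python
import math

def cpa_linear_replicas(cores: int, nodes: int,
                        cores_per_replica: int = 256,
                        nodes_per_replica: int = 16,
                        minimum: int = 2) -> int:
    """cluster-proportional-autoscaler linear mode: scale by whichever of
    total cores or node count demands more replicas, with a floor."""
    wanted = max(math.ceil(cores / cores_per_replica),
                 math.ceil(nodes / nodes_per_replica))
    return max(wanted, minimum)

# 100 nodes x 32 cores each: the cores dimension dominates here.
print(cpa_linear_replicas(cores=3200, nodes=100))  # -> 13
```

Running the numbers like this before changing the CPA ConfigMap makes it obvious which dimension is actually driving your replica count.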

B) CoreDNS memory sizing

A practical baseline from CoreDNS deployment guidance:

  - memory (MB, default settings) ≈ (Pods + Services) / 1000 + 54,
  - cache growth and query mix can push real usage well above this, so leave headroom in limits.

Treat these as starting priors, then calibrate on your workload/query mix.
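One practical prior from the CoreDNS Kubernetes scaling guidance is memory_MB ≈ (Pods + Services) / 1000 + 54 at default settings; as a quick estimator (cluster sizes below are hypothetical):

```python
def coredns_memory_mb(pods: int, services: int) -> float:
    """CoreDNS scaling guidance, default settings:
    MB required = (Pods + Services) / 1000 + 54."""
    return (pods + services) / 1000 + 54

# A 5000-pod, 500-service cluster: roughly 60 MB per replica before
# cache growth; set the memory limit with headroom above this.
print(coredns_memory_mb(5000, 500))  # -> 59.5
```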

C) NodeLocal DNSCache memory

NodeLocal runs per node (DaemonSet), so “small per-pod overhead” becomes cluster-wide overhead.

If NodeLocal pods OOMKill, you get brief DNS blackouts on affected nodes. Set realistic memory requests/limits from measured peaks.
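Because it is a DaemonSet, the per-pod cost multiplies by node count. A trivial sketch, where the 70 MB working set is a hypothetical measured peak:

```python
def nodelocal_overhead_mb(nodes: int, per_pod_mb: float) -> float:
    """Cluster-wide memory cost of a per-node cache DaemonSet."""
    return nodes * per_pod_mb

# 500 nodes at a 70 MB measured peak each: 35 GB of cluster memory.
print(nodelocal_overhead_mb(500, 70.0))  # -> 35000.0
```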


3) Rollout strategy (safe sequence)

Phase 0 — Observe before changing

Collect at least:

  - DNS QPS per CoreDNS pod and per node,
  - p95/p99 lookup latency measured from the application side,
  - error breakdown: timeouts, SERVFAIL rate, NXDOMAIN share,
  - CoreDNS CPU/memory usage, throttling, and restart counts.

Phase 1 — Stabilize CoreDNS first

Before NodeLocal rollout, make CoreDNS stable:

  - set CPU/memory requests and limits from observed peaks, not chart defaults,
  - enable cluster-proportional-autoscaler and verify it targets the right deployment,
  - spread replicas across nodes/zones and add a PodDisruptionBudget,
  - confirm no CrashLoopBackOff and no sustained CPU throttling.

Phase 2 — Introduce NodeLocal DNSCache

  - canary the DaemonSet on a small node pool first,
  - verify pods on those nodes resolve via the node-local address,
  - watch per-node cache hit ratio, memory, and restarts before rolling cluster-wide.

Phase 3 — Tune cache behavior

  - adjust cache capacity and TTL caps based on observed hit ratios,
  - enable prefetch for hot names and a bounded serve_stale,
  - re-check CoreDNS load, which should now consist mostly of misses.


4) CoreDNS cache tuning that usually works

CoreDNS cache plugin gives fine-grained controls:

  - separate capacity and TTL caps for positive (success) and negative (denial) answers,
  - prefetch, to refresh hot names before they expire,
  - serve_stale, to bridge short upstream outages,
  - servfail, to briefly cache upstream failures.

Example pattern (illustrative):

cache 300 {
  # positive answers: 20000 entries, TTL capped at 300s, floor of 5s
  success 20000 300 5
  # negative answers (NXDOMAIN): smaller cache, shorter 60s cap
  denial 10000 60 5
  # refresh names queried >= 20 times per minute once 20% of TTL remains
  prefetch 20 1m 20%
  # serve expired entries for up to 30s, refreshing in the background
  serve_stale 30s immediate
  # cache upstream SERVFAIL for 5s to avoid hammering a sick upstream
  servfail 5s
}

Guidance:

  - keep the denial TTL cap well below the success cap; a stale NXDOMAIN is worse than a re-query,
  - bound serve_stale to seconds, not minutes, unless your workloads tolerate long-stale answers,
  - keep servfail short so recovery is fast once upstream heals,
  - prefetch only genuinely hot names; an over-eager prefetch threshold just manufactures QPS.


5) Query amplification trap: ndots and search domains

Typical pod resolv.conf includes search suffixes and options ndots:5.

Operationally, short external hostnames may trigger multiple suffix attempts before absolute resolution, amplifying QPS and NXDOMAIN volume.
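A rough sketch of the glibc expansion logic makes the amplification concrete (the search list below is a hypothetical namespace-default one):

```python
def resolution_attempts(name: str, ndots: int, search: list[str]) -> list[str]:
    """Sketch of glibc search behavior: names with fewer than `ndots` dots
    are tried against each search suffix before being tried as-is;
    absolute names (trailing dot) skip the search list entirely."""
    if name.endswith("."):
        return [name]
    if name.count(".") >= ndots:
        # Tried as-is first; suffixes only on failure.
        return [name + "."] + [f"{name}.{s}." for s in search]
    # Below the ndots threshold: search suffixes first, then as-is.
    return [f"{name}.{s}." for s in search] + [name + "."]

search = ["prod.svc.cluster.local", "svc.cluster.local", "cluster.local"]
# "api.example.com" has only 2 dots, so ndots:5 forces 3 NXDOMAIN lookups
# before the name is finally tried as absolute: 4 queries for 1 answer.
print(len(resolution_attempts("api.example.com", 5, search)))   # -> 4
print(resolution_attempts("api.example.com.", 5, search))       # -> ['api.example.com.']
```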

Mitigations:

  - use fully-qualified names with a trailing dot for known-external hosts,
  - lower ndots per workload via the pod dnsConfig options,
  - trim unnecessary search domains where you control them,
  - let NodeLocal DNSCache absorb the NXDOMAIN churn you cannot eliminate.


6) SLOs and alerts (minimum set)

Track these as first-class reliability signals:

  1. DNS lookup success rate (cluster and namespace critical paths)
  2. p95/p99 DNS latency (app-side + DNS-side)
  3. CoreDNS saturation (CPU throttling, memory pressure, restarts)
  4. NodeLocal pod OOM/restart rate
  5. cache hit ratio and stale-served rate
  6. SERVFAIL and timeout rate

If you only track request count and average latency, you will miss most real incidents.
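As a starting point, signals 3 and 6 map onto Prometheus rules along these lines. Metric names follow CoreDNS’s Prometheus exporter; the thresholds are placeholders to tune, not recommendations:

```
groups:
  - name: dns-slo
    rules:
      - alert: CoreDNSServfailRateHigh
        expr: |
          sum(rate(coredns_dns_responses_total{rcode="SERVFAIL"}[5m]))
            / sum(rate(coredns_dns_responses_total[5m])) > 0.01
        for: 5m
        labels:
          severity: page
      - alert: CoreDNSCacheHitRatioLow
        expr: |
          sum(rate(coredns_cache_hits_total[10m]))
            / (sum(rate(coredns_cache_hits_total[10m]))
               + sum(rate(coredns_cache_misses_total[10m]))) < 0.7
        for: 15m
        labels:
          severity: ticket
```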


7) Incident playbook snippets

Symptom: app timeouts spike, CoreDNS CPU pinned

Likely causes:

  - ndots/search-domain amplification from external-heavy workloads,
  - CPA under-scaled or pointing at the wrong deployment,
  - a hot client resolving on every request instead of reusing answers,
  - cache capacity too small for the query mix.

Actions:

  1. scale CoreDNS replicas up immediately (temporary safety)
  2. verify CPA config and target deployment
  3. inspect top QNAMEs / NXDOMAIN contributors
  4. patch hot clients to reduce resolver churn
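Inspecting top QNAMEs can be as simple as counting names out of CoreDNS logs. This sketch assumes the default `log` plugin line format (sample lines below are fabricated):

```python
import re
from collections import Counter

# Assumes CoreDNS `log` plugin default format, e.g.:
# [INFO] 10.0.0.7:48084 - 1 "A IN foo.example. udp 60 false 512 NXDOMAIN qr,rd 54 0.0002s"
LOG_RE = re.compile(r'"(?P<qtype>\S+) IN (?P<qname>\S+) \S+ \d+ \S+ \d+ (?P<rcode>\S+)')

def top_qnames(lines, rcode=None, n=5):
    """Count query names, optionally filtering to one rcode (e.g. NXDOMAIN)."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m and (rcode is None or m.group("rcode") == rcode):
            counts[m.group("qname")] += 1
    return counts.most_common(n)

sample = [
    '[INFO] 10.0.0.7:48084 - 1 "A IN api.example.com.svc.cluster.local. udp 60 false 512 NXDOMAIN qr,rd 54 0.0002s"',
    '[INFO] 10.0.0.7:48085 - 2 "A IN api.example.com.svc.cluster.local. udp 60 false 512 NXDOMAIN qr,rd 54 0.0002s"',
    '[INFO] 10.0.0.8:39001 - 3 "A IN web.prod.svc.cluster.local. udp 55 false 512 NOERROR qr,rd 80 0.0001s"',
]
print(top_qnames(sample, rcode="NXDOMAIN"))
```

A suffix-expanded external name dominating the NXDOMAIN counts is the classic signature of the ndots trap from section 5.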

Symptom: NodeLocal pods restarting/OOMKilled

Likely causes:

  - memory limit set below the real cache working set,
  - cache capacity configured larger than the memory limit can hold,
  - a query-mix change (e.g. an NXDOMAIN storm) inflating the cache.

Actions:

  1. raise request/limit and redeploy
  2. temporarily reduce cache capacity if required
  3. validate no prolonged stale-serve side effects

Symptom: intermittent SERVFAIL during upstream issues

Actions:

  1. confirm whether upstream resolvers are actually unhealthy before touching cluster config
  2. rely on servfail caching to avoid hammering a struggling upstream
  3. use bounded serve_stale so cached names keep resolving through the blip
  4. verify the forward plugin’s health-check and policy settings


8) Anti-patterns to avoid

  - running the default CoreDNS replica count forever regardless of cluster growth,
  - unbounded serve_stale as a substitute for fixing upstream reliability,
  - leaving ndots:5 on external-heavy workloads and silently paying the NXDOMAIN tax,
  - setting NodeLocal memory limits by guesswork instead of measured peaks,
  - alerting only on request count and average latency.


9) Practical baseline checklist

  - CPA enabled with tuned coresPerReplica/nodesPerReplica and a sensible minimum
  - CoreDNS requests/limits set from measured peaks
  - NodeLocal DNSCache rolled out with measured memory limits
  - cache tuned: bounded denial TTL, bounded serve_stale, short servfail
  - ndots reviewed for external-heavy workloads
  - dashboards and alerts covering the six signals in section 6



If you do only one thing: stabilize CoreDNS autoscaling and add NodeLocal DNSCache with measured memory limits. That single move usually removes a surprising amount of tail-latency and timeout noise.