Kubernetes NodeLocal DNSCache + CoreDNS Scaling Playbook
How to stop DNS from becoming your hidden latency tax and intermittent outage source.
Why this matters
In many clusters, DNS is treated as “just plumbing” until one of these happens:
- p95 app latency spikes without obvious CPU or DB saturation
- random timeout storms appear during traffic peaks
- node drains/upgrades trigger a wave of UnknownHost/SERVFAIL errors
- CoreDNS CPU and restarts oscillate with cluster size
In practice, DNS failures are often fan-out multipliers: one small resolver issue can hit every service path.
This playbook gives an operator-focused pattern for:
- scaling CoreDNS safely,
- reducing cross-node DNS hops via NodeLocal DNSCache,
- tuning cache behavior to reduce backend load without serving stale answers forever.
1) Mental model: where DNS latency actually comes from
Without NodeLocal DNSCache, a pod’s DNS request usually goes:
Pod -> kube-dns Service IP -> kube-proxy translation -> CoreDNS pod (often remote node) -> upstream
With NodeLocal DNSCache:
Pod -> node-local-dns on same node -> (cache hit: done) OR (miss: CoreDNS/upstream)
Operational implications:
- fewer iptables/IPVS and conntrack side effects on hot paths
- fewer cross-node round trips for repeated names
- more stable latency under bursty name resolution patterns
- node-level DNS metrics become visible (not just cluster aggregate)
2) First principles for capacity planning
A) CoreDNS replicas (autoscaler)
Kubernetes DNS horizontal autoscaling commonly uses cluster-proportional-autoscaler (CPA).
Default linear model idea:
replicas = max( ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica) )
So:
- large-core clusters are dominated by coresPerReplica
- many-small-node clusters are dominated by nodesPerReplica
Start conservative, then tune from observed saturation and SLOs.
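The linear model above can be sketched in a few lines. The parameter values used here (coresPerReplica=256, nodesPerReplica=16, min=2) are illustrative starting points in the spirit of common kube-dns-autoscaler configs, not recommendations for your cluster:

```python
import math

def cpa_linear_replicas(cores: int, nodes: int,
                        cores_per_replica: float = 256,
                        nodes_per_replica: float = 16,
                        min_replicas: int = 2,
                        prevent_single_point_failure: bool = True) -> int:
    """Approximate cluster-proportional-autoscaler 'linear' mode.

    Replicas scale with whichever dimension (total cores or node count)
    is currently the larger driver; a floor avoids a singleton CoreDNS.
    """
    replicas = max(math.ceil(cores / cores_per_replica),
                   math.ceil(nodes / nodes_per_replica))
    if prevent_single_point_failure:
        replicas = max(replicas, 2)
    return max(replicas, min_replicas)

# 100 nodes x 8 cores each: node count dominates -> ceil(100/16) = 7
print(cpa_linear_replicas(cores=800, nodes=100))  # -> 7
```

Running the numbers like this before changing the CPA ConfigMap makes it obvious which term (cores or nodes) your cluster is actually scaling on.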
B) CoreDNS memory sizing
A practical baseline from CoreDNS deployment guidance:
- default config estimate: MB ~= (Pods + Services)/1000 + 54
- with autopath: MB ~= (Pods + Services)/250 + 56
Treat these as starting priors, then calibrate on your workload/query mix.
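The two estimates above are trivial to encode, which makes them easy to drop into a capacity-planning script (the formulas come from the CoreDNS deployment scaling notes referenced below; calibrate against observed usage):

```python
def coredns_mb_estimate(pods: int, services: int, autopath: bool = False) -> float:
    """Starting-point memory estimate (MB) per CoreDNS replica.

    These are priors for a default Kubernetes CoreDNS config; the autopath
    plugin trades memory for reduced search-path query amplification.
    """
    if autopath:
        return (pods + services) / 250 + 56
    return (pods + services) / 1000 + 54

print(coredns_mb_estimate(5000, 1000))                 # -> 60.0
print(coredns_mb_estimate(5000, 1000, autopath=True))  # -> 80.0
```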
C) NodeLocal DNSCache memory
NodeLocal runs per node (DaemonSet), so “small per-pod overhead” becomes cluster-wide overhead.
- default CoreDNS cache size (10k entries) is often ~30MB when full (per server block)
- query concurrency and cache policy can push usage higher
If NodeLocal pods OOMKill, you get brief DNS blackouts on affected nodes. Set realistic memory requests/limits from measured peaks.
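Because NodeLocal is a DaemonSet, its cache footprint multiplies by node count. A rough planning sketch, assuming ~3 KB per cached entry (an assumption consistent with the "~30MB at 10k entries" figure above):

```python
def nodelocal_overhead_mb(nodes: int,
                          entries_per_node: int = 10_000,
                          bytes_per_entry: int = 3_000,
                          server_blocks: int = 1) -> tuple[float, float]:
    """Estimate NodeLocal DNSCache memory at full cache.

    bytes_per_entry is an assumed average, not a measured constant;
    each server block in the Corefile gets its own cache.
    """
    per_node_mb = entries_per_node * bytes_per_entry * server_blocks / 1e6
    return per_node_mb, per_node_mb * nodes

per_node, cluster_wide = nodelocal_overhead_mb(nodes=200)
print(per_node, cluster_wide)  # -> 30.0 6000.0
```

The point of the exercise: a "small" 30MB per-node cache is 6GB of cluster-wide memory at 200 nodes, so size limits from measured peaks, not defaults.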
3) Rollout strategy (safe sequence)
Phase 0 — Observe before changing
Collect at least:
- CoreDNS CPU/memory/restarts
- request rate and cache hit ratio
- DNS error mix (NXDOMAIN, SERVFAIL, timeout)
- app-side lookup latency and timeout counts
Phase 1 — Stabilize CoreDNS first
Before NodeLocal rollout, make CoreDNS stable:
- set CPA min replicas (avoid singleton)
- set requests/limits based on current QPS
- verify disruption settings for kube-system workloads
Phase 2 — Introduce NodeLocal DNSCache
- deploy manifest with a non-colliding local IP (link-local range is common)
- in IPVS mode, update the kubelet --cluster-dns flag as required
- canary on a subset of nodes first
Phase 3 — Tune cache behavior
- raise hit ratio without over-serving stale data
- tune negative cache behavior based on your NXDOMAIN profile
- validate tail latency and timeout reduction, not just average latency
4) CoreDNS cache tuning that usually works
The CoreDNS cache plugin gives fine-grained controls:
- success/denial cache capacities and TTL caps
- prefetch for popular records before expiry
- serve_stale for resilience during upstream blips
- servfail cache duration (keep it short)
Example pattern (illustrative):
cache 300 {
    success 20000 300 5
    denial 10000 60 5
    prefetch 20 1m 20%
    serve_stale 30s immediate
    servfail 5s
}
Guidance:
- keep stale windows short unless you explicitly prioritize availability over freshness
- be careful with aggressive denial caching if service discovery state changes quickly
- avoid keepttl for recursive/caching use cases (it can propagate stale behavior downstream)
5) Query amplification trap: ndots and search domains
Typical pod resolv.conf includes search suffixes and options ndots:5.
Operationally, short external hostnames may trigger multiple suffix attempts before absolute resolution, amplifying QPS and NXDOMAIN volume.
Mitigations:
- use fully qualified names for high-QPS external dependencies
- where appropriate, consider trailing-dot absolute names for resolver-critical paths
- monitor NXDOMAIN volume before/after app DNS config changes
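The amplification is easy to see with a small simulation of glibc-style search-list expansion (simplified; real resolvers vary, and A/AAAA pairs double the query count again):

```python
def queries_attempted(name: str, search: list[str], ndots: int = 5) -> list[str]:
    """Simulate resolv.conf search-list expansion (simplified sketch).

    A trailing-dot name is absolute: one query. Otherwise, a name with
    fewer than `ndots` dots tries every search suffix before the literal
    name, so short external hostnames pay the full suffix tax.
    """
    if name.endswith("."):
        return [name]
    if name.count(".") < ndots:
        return [f"{name}.{s}." for s in search] + [name + "."]
    return [name + "."] + [f"{name}.{s}." for s in search]

search = ["prod.svc.cluster.local", "svc.cluster.local", "cluster.local"]
print(len(queries_attempted("api.example.com", search)))   # -> 4
print(len(queries_attempted("api.example.com.", search)))  # -> 1
```

Three of those four attempts for "api.example.com" are guaranteed NXDOMAINs against cluster suffixes, which is exactly the NXDOMAIN volume the monitoring bullet above is watching for.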
6) SLOs and alerts (minimum set)
Track these as first-class reliability signals:
- DNS lookup success rate (cluster and namespace critical paths)
- p95/p99 DNS latency (app-side + DNS-side)
- CoreDNS saturation (CPU throttling, memory pressure, restarts)
- NodeLocal pod OOM/restart rate
- cache hit ratio and stale-served rate
- SERVFAIL and timeout rate
If you only track request count and average latency, you will miss most real incidents.
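Deriving the headline ratios from raw counters is straightforward; the counter shapes here are illustrative of typical CoreDNS Prometheus exports (responses by rcode, cache hits/misses), not exact metric names:

```python
def dns_signals(responses_by_rcode: dict[str, int],
                cache_hits: int, cache_misses: int) -> dict[str, float]:
    """Turn raw counters into the ratios listed above.

    NXDOMAIN is counted as a served answer (it is a valid response);
    track its trend separately rather than folding it into failures.
    """
    total = sum(responses_by_rcode.values())
    servfail = responses_by_rcode.get("SERVFAIL", 0)
    return {
        "success_rate": (total - servfail) / total,
        "servfail_rate": servfail / total,
        "cache_hit_ratio": cache_hits / (cache_hits + cache_misses),
    }

sig = dns_signals({"NOERROR": 9600, "NXDOMAIN": 300, "SERVFAIL": 100},
                  cache_hits=8000, cache_misses=2000)
print(sig)
```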
7) Incident playbook snippets
Symptom: app timeouts spike, CoreDNS CPU pinned
Likely causes:
- sudden query amplification (retry storm, ndots/search expansion)
- insufficient CoreDNS replicas
- cache miss ratio jump due to low TTL/high cardinality names
Actions:
- scale CoreDNS replicas up immediately (temporary safety)
- verify CPA config and target deployment
- inspect top QNAMEs / NXDOMAIN contributors
- patch hot clients to reduce resolver churn
Symptom: NodeLocal pods restarting/OOMKilled
Likely causes:
- too-low memory limits for cache/query concurrency
- bursty per-node traffic patterns
Actions:
- raise request/limit and redeploy
- temporarily reduce cache capacity if required
- validate no prolonged stale-serve side effects
Symptom: intermittent SERVFAIL during upstream issues
Actions:
- ensure short servfail caching is enabled
- use bounded serve_stale to absorb transient upstream flaps
- verify upstream resolver health and the packet-loss path
8) Anti-patterns to avoid
- Running CoreDNS as an effective singleton in production
- Enabling NodeLocal everywhere without memory headroom testing
- Tuning cache TTLs aggressively without stale/freshness policy
- Ignoring NXDOMAIN trends (often an early warning for app mis-resolution)
- Treating DNS as “best effort” while holding strict app latency SLOs
9) Practical baseline checklist
- CoreDNS autoscaling enabled (CPA or equivalent)
- min replicas >= 2 for production clusters
- NodeLocal DNSCache canary tested and rolled out
- CoreDNS + NodeLocal resource limits calibrated from observed peak
- cache hit ratio, stale rate, SERVFAIL, timeout, NXDOMAIN on dashboards
- DNS-related alerts routed to oncall with runbook links
- app teams have documented DNS naming practices (FQDNs, retry behavior)
10) References
- Kubernetes: Using NodeLocal DNSCache in Kubernetes Clusters
  https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/
- Kubernetes: Autoscale the DNS Service in a Cluster
  https://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/
- Kubernetes: DNS for Services and Pods
  https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
- CoreDNS cache plugin docs
  https://coredns.io/plugins/cache/
- CoreDNS deployment sizing notes
  https://github.com/coredns/deployment/blob/master/kubernetes/Scaling_CoreDNS.md
If you do only one thing: stabilize CoreDNS autoscaling and add NodeLocal DNSCache with measured memory limits. That single move usually removes a surprising amount of tail-latency and timeout noise.