Linux Qdisc Selection Playbook (fq vs fq_codel vs CAKE)

2026-03-22 · software

Category: knowledge
Scope: Practical operator guide for choosing and tuning Linux qdiscs to control latency, fairness, and throughput at bottlenecks.


1) Why this matters

Many “random latency spikes” in production are simply queueing policy mistakes:

  - oversized FIFO buffers at the true bottleneck (classic bufferbloat),
  - no per-flow fairness, so one bulk transfer's queue delays everything else,
  - AQM or shaping installed on an interface that is not actually the bottleneck.

In practice, qdisc choice is a first-order SLO lever for p95/p99 network delay.


2) Mental model: what each qdisc is best at

fq (Fair Queue) — best for host egress pacing

Use when:

  - traffic is generated locally on this host (servers, proxies), so the TCP stack's pacing rate is available,
  - smooth per-flow pacing matters more than policy (avoid bursting into downstream buffers),
  - you run pacing-dependent congestion control such as BBR.

Key traits (from tc-fq(8)):

  - per-socket flow separation with round-robin scheduling across flows,
  - honors the pacing rate the kernel TCP stack sets (sk_pacing_rate),
  - tunables include maxrate, limit, flow_limit, quantum, initial_quantum,
  - no AQM of its own; short queues come from pacing plus TCP Small Queues.

fq_codel — best general low-latency fairness without heavy shaping policy

Use when:

  - the box forwards mixed traffic it did not originate (gateway, edge router, VPN concentrator),
  - you want low queueing delay plus per-flow fairness with minimal configuration,
  - the bottleneck is at (or just below) this interface.

Key traits (from tc-fq_codel(8) and RFC 8290):

  - DRR-style fairness across 1024 hashed flow queues,
  - CoDel AQM per queue; defaults are target 5ms, interval 100ms,
  - ECN marking supported and enabled by default,
  - cheap in CPU terms, which is why many distributions ship it as the default qdisc.

cake — best when you need shaping + fairness + practical ISP/link compensation

Use when:

  - you must enforce a bandwidth ceiling at or below the physical rate so the queue moves into this box,
  - the link has per-packet framing overhead to compensate (DSL, DOCSIS, VLANs, PPPoE),
  - you need per-host fairness behind NAT, or simple diffserv tiers.

Key traits (from tc-cake(8)):

  - integrated deficit-mode shaper, so no separate htb/tbf layer is needed,
  - COBALT AQM (CoDel-derived) with combined flow and host isolation (triple-isolate is the default),
  - overhead keywords (ethernet, docsis, pppoe-ptm, conservative, ...) and diffserv3/diffserv4/besteffort tiers,
  - nat mode resolves pre-NAT addresses via conntrack; optional ack-filter for asymmetric links.


3) Fast decision map

  1. Single server, locally generated egress, pacing quality is priority
    → start with fq.

  2. General gateway/edge, mixed traffic, want low latency + fairness with minimal policy
    → start with fq_codel.

  3. You must enforce a bandwidth ceiling at the bottleneck and account for link overhead
    → start with cake bandwidth ... (plus proper overhead mode).

  4. Datacenter ultra-low RTT microbursts
    → be careful with default Internet-scale constants; validate AQM constants against real RTT regime before rollout.
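The decision map above can be sketched as a tiny helper; the scenario names are our own shorthand for this sketch, not tc keywords:

```shell
# Illustrative mapping of the decision map to a shell helper.
# Scenario names are invented for this sketch.
recommend_qdisc() {
  case "$1" in
    server-egress)  echo "fq" ;;        # locally generated traffic, pacing quality first
    gateway-mixed)  echo "fq_codel" ;;  # fairness + AQM with minimal policy
    shaped-edge)    echo "cake" ;;      # bandwidth ceiling + overhead compensation
    *)              echo "fq_codel" ;;  # reasonable general default
  esac
}

recommend_qdisc shaped-edge   # prints: cake
```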


4) Baseline configs (safe starting points)

4.1 Host pacing baseline (fq)

tc qdisc replace dev eth0 root fq

When to add knobs:

  - maxrate, to cap any single flow's pacing rate,
  - limit / flow_limit, if drop counters show the default packet limits are too small for the link,
  - nopacing, only if you have measured that pacing itself hurts your workload (rare).
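For example, a hedged starting point when a single flow must not saturate the NIC; the rate and limit are placeholders to size for your hardware, not recommendations:

```shell
# Illustrative only: cap any one flow's pacing rate and raise the overall
# packet limit for a fat pipe. Values are placeholders.
tc qdisc replace dev eth0 root fq maxrate 1gbit limit 20000
```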

4.2 Fairness + AQM baseline (fq_codel)

tc qdisc replace dev eth0 root fq_codel

If your bottleneck RTT is far from “internet-ish”, tune conservatively: keep interval near your worst-case RTT and target at roughly 5-10% of interval (per RFC 8289). The stock values, stated explicitly for auditability:

tc qdisc replace dev eth0 root fq_codel target 5ms interval 100ms ecn
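For a datacenter RTT regime (hundreds of microseconds), a hedged sketch; these values are assumptions to validate against your own RTT distribution, not a recommendation:

```shell
# Hypothetical datacenter profile: interval near the worst-case in-DC RTT,
# target around 5% of interval. Validate against real RTTs before rollout.
tc qdisc replace dev eth0 root fq_codel target 500us interval 10ms ecn
```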

4.3 Shaping baseline (cake)

tc qdisc replace dev eth0 root cake bandwidth 500Mbit diffserv3 triple-isolate

If Ethernet framing overhead matters at the true bottleneck:

tc qdisc replace dev eth0 root cake bandwidth 500Mbit ethernet diffserv3 triple-isolate

If NAT fairness is needed at this host:

tc qdisc replace dev eth0 root cake bandwidth 500Mbit diffserv3 triple-isolate nat

5) Observability: what to watch in tc -s qdisc

Run:

tc -s qdisc show dev eth0

For fq

  - throttled: flows currently held back by pacing (normal; it should churn, not grow unboundedly),
  - flows_plimit drops: flows hitting flow_limit; consider raising it,
  - a backlog that never drains suggests the real queue is below the qdisc (driver ring, device).

For fq_codel

  - dropped vs ecn_mark: the AQM working as designed; sustained growth means persistent overload,
  - drop_overlimit / drop_overmemory: hard limits hit rather than AQM decisions; raise limit or memory_limit,
  - maxpacket near superpacket sizes hints that GSO aggregates are reaching the qdisc.

For cake

  - per-tin pk_delay / av_delay: the closest thing to a direct queue-delay SLO signal,
  - drops and marks per tin show where pressure is being applied,
  - sustained per-tin backlog means the shaper rate is the bottleneck, which is the intended design.

Operational rule: trend these metrics together with app p95/p99 latency and retransmit/ECN counters. Qdisc counters alone can look “healthy” while app tails burn.
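A minimal sketch for trending these counters. It parses a captured sample here so the logic is testable; in production you would pipe in live `tc -s qdisc show dev eth0` output instead:

```shell
# Extract drop and ECN-mark counters from tc -s qdisc output.
# The sample below is a captured fq_codel stats block (illustrative numbers).
sample='qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
 Sent 123456789 bytes 98765 pkt (dropped 42, overlimits 0 requeues 7)
 backlog 0b 0p requeues 7
  maxpacket 1514 drop_overlimit 0 new_flow_count 311 ecn_mark 9'

# "dropped N," appears on the Sent line; "ecn_mark N" on the stats line.
drops=$(printf '%s\n' "$sample" | sed -n 's/.*dropped \([0-9]*\),.*/\1/p')
marks=$(printf '%s\n' "$sample" | sed -n 's/.*ecn_mark \([0-9]*\).*/\1/p')
echo "drops=$drops marks=$marks"   # prints: drops=42 marks=9
```

Feed the two numbers into whatever time-series system already holds your app latency so they can be trended together, as the rule above suggests.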


6) Common failure modes (and fixes)

Failure mode A: “We enabled AQM but latency still spikes under load”

Likely causes:

  - the AQM is not at the true bottleneck: the queue builds downstream (modem, switch) or below the qdisc (NIC rings without effective BQL),
  - GSO/GRO superpackets traverse the AQM as single large units,
  - nothing shapes the link, so no standing queue ever forms where the AQM can see it.

Fix:

  - shape slightly below the real link rate (e.g. cake bandwidth) so the queue moves into this box,
  - check BQL under /sys/class/net/eth0/queues/tx-*/byte_queue_limits and avoid oversized ring buffers,
  - with cake, keep GSO splitting on (split-gso, the default when shaping) so aggregates do not defeat the AQM.

Failure mode B: “Throughput dropped after aggressive RTT profile”

Likely causes:

  - target/interval set well below the actual path RTT, so CoDel drops before TCP can grow its window,
  - ECN disabled, turning every AQM signal into a loss.

Fix:

  - set interval to roughly the worst-case RTT you serve and target to about 5-10% of interval,
  - enable ecn so compliant flows are marked rather than dropped, then re-measure throughput before judging.

Failure mode C: “One host still monopolizes link”

Likely causes:

  - flow-level fairness only: a host that opens many flows wins proportionally more capacity,
  - NAT upstream of the qdisc, so every internal host hashes to the same source address.

Fix:

  - use cake host isolation (triple-isolate, or dual-srchost / dual-dsthost for one-directional fairness),
  - add the nat keyword so cake consults conntrack and isolates on pre-NAT internal addresses.
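A hedged sketch for per-internal-host fairness on a NATed LAN egress; the rate is a placeholder:

```shell
# Illustrative: dual-srchost splits capacity fairly across source hosts,
# and nat makes cake use conntrack to see pre-NAT internal addresses.
tc qdisc replace dev eth0 root cake bandwidth 500Mbit dual-srchost nat
```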

Failure mode D: “DiffServ policy looks right but user experience is wrong”

Likely causes:

  - DSCP marks are bleached or remapped upstream of the bottleneck,
  - applications never set DSCP in the first place,
  - traffic lands in an unexpected tin (compare per-tin counters against expectations).

Fix:

  - verify DSCP on the wire at the shaping point, not just at the application,
  - if markings cannot be trusted, run cake besteffort and rely on flow/host isolation instead.
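A quick on-the-wire check for whether DSCP marks actually reach this interface (interface name is a placeholder):

```shell
# Show only IPv4 packets carrying a non-zero DSCP. ip[1] is the TOS byte;
# its top six bits are the DSCP field, hence the 0xfc mask.
tcpdump -n -i eth0 -c 20 -v 'ip and (ip[1] & 0xfc) != 0'
```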


7) Rollout pattern (low risk)

  1. Canary interface/host with mirrored workload profile.
  2. Baseline capture: app latency, retransmits, tc -s, CPU.
  3. One change at a time (qdisc type first, then params).
  4. Guardrail thresholds: rollback if p99 latency or drop/mark ratio breaches limits.
  5. Document known-good profile per link type (DC LAN, metro WAN, consumer ISP edge).
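The guardrail in the rollout steps above can be sketched with integer-only shell arithmetic; the threshold and counters are placeholders (in practice they would come from `tc -s qdisc` deltas and your latency pipeline):

```shell
# Hypothetical guardrail: roll back when the drop ratio exceeds 1%.
# Counters are hardcoded for the sketch; take real deltas in production.
dropped=42
sent_pkts=98765
ratio_bp=$(( dropped * 10000 / sent_pkts ))   # ratio in basis points (0.01%)
if [ "$ratio_bp" -gt 100 ]; then              # 100 bp == 1%
  echo "ROLLBACK"
else
  echo "OK"
fi
```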

8) Practical defaults (starting point, not dogma)

  - Server egress: fq with defaults; pacing does most of the work.
  - Mixed-traffic gateway or edge: fq_codel with defaults and ecn.
  - Shaped consumer/ISP edge: cake with bandwidth a few percent below the measured link rate, the correct overhead keyword, and nat where applicable.

If you can’t explain where the true bottleneck queue lives, tuning qdisc knobs is mostly placebo.


9) References

  - tc-fq(8), tc-fq_codel(8), tc-cake(8) man pages (iproute2)
  - RFC 8289 (CoDel) and RFC 8290 (FQ-CoDel)