Linux blk-mq I/O Scheduler Selection Playbook (none vs mq-deadline vs BFQ vs Kyber)

2026-03-17 · software


Why this matters

On modern Linux, storage performance failures are often tail-latency problems, not average-throughput problems.

Wrong scheduler choice can look like:

  • p99 read latency spiking while average latency and throughput still look healthy,
  • interactive tasks stalling whenever a backup, compaction, or scan runs in the background,
  • CPU burned on scheduling work that a fast NVMe device never needed.

The right choice is workload-specific. This playbook is a practical way to choose, test, and roll out safely.


1) Quick mental model

With blk-mq, each block device's request queue has a pluggable I/O scheduler.

Common options you’ll see in /sys/block/<dev>/queue/scheduler:

  • none: no scheduling policy; requests pass straight to the driver,
  • mq-deadline: deadline-based reordering that prefers reads,
  • kyber: self-tuning scheduler aimed at latency targets,
  • bfq: weight-based fair queueing with per-process budgets.

The active scheduler is the one in brackets.
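A small helper makes the bracket convention scriptable. This is a minimal sketch; `active_scheduler` is a name chosen here, not a standard tool, and it simply extracts the bracketed token from the sysfs file contents:

```shell
# Extract the active scheduler (the bracketed token) from the
# contents of /sys/block/<dev>/queue/scheduler.
active_scheduler() {
  # $1: file contents, e.g. "none [mq-deadline] kyber bfq"
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^\[\(.*\)\]$/\1/p'
}

# On a real host:
#   active_scheduler "$(cat /sys/block/nvme0n1/queue/scheduler)"
```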


2) Fast decision matrix

A) NVMe / SSD, single-tenant, throughput-first

Start with: none

Why:

  • fast NVMe devices have deep hardware queues and do their own internal scheduling,
  • software reordering adds CPU cost and latency with little to gain,
  • fewest moving parts to debug.

B) NVMe / SSD, mixed read+sync-write with tail SLOs

Start with: mq-deadline

Why:

  • per-request expiry times bound how long reads can wait behind writes,
  • reads are preferred over writes, which matches most sync-read SLOs,
  • overhead stays low enough for fast devices.

C) Multi-tenant/shared host where fairness matters

Start with: mq-deadline (server), bfq (desktop/interactive-heavy)

Why:

  • mq-deadline bounds starvation cheaply, which is usually enough on servers,
  • bfq adds genuine per-process fairness, at higher CPU cost.

D) Desktop/workstation interactivity under heavy background I/O

Start with: bfq

Why:

  • bfq gives each process a fair share of device time, not just request slots,
  • foreground reads stay responsive even under heavy background writeback.

E) You need latency targets as explicit knobs and Kyber is available

Try: kyber

Why:

  • kyber exposes read and write latency targets directly (read_lat_nsec, write_lat_nsec),
  • it adjusts internal queue depths to try to meet those targets.

Caveat: validate carefully on your kernel/distro; the operational ecosystem (docs, defaults, tooling) is usually richer around none/mq-deadline.
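The matrix above can be encoded as a starting-point function. A sketch only: `pick_scheduler` and its profile names are invented here, and the mapping is the playbook's heuristic, not a universal rule:

```shell
# pick_scheduler <rotational 0|1> <profile>
# profiles: throughput | mixed | shared | interactive | latency-knobs
pick_scheduler() {
  rot="$1"; profile="$2"
  if [ "$rot" = "1" ]; then
    echo "mq-deadline"   # HDDs almost always want request reordering
    return
  fi
  case "$profile" in
    throughput)    echo "none" ;;          # case A
    mixed)         echo "mq-deadline" ;;   # case B
    shared)        echo "mq-deadline" ;;   # case C (server)
    interactive)   echo "bfq" ;;           # case D
    latency-knobs) echo "kyber" ;;         # case E
    *)             echo "mq-deadline" ;;   # safe default
  esac
}
```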


3) Discovery commands (5 minutes)

# 1) list scheduler choices + current one
cat /sys/block/<dev>/queue/scheduler

# 2) rotational hint (1=HDD, 0=non-rotating)
cat /sys/block/<dev>/queue/rotational

# 3) queue depth-ish context
cat /sys/block/<dev>/queue/nr_requests

# 4) current i/o pressure and latency view (user-space)
iostat -x 1

Temporary switch (until reboot):

echo mq-deadline | sudo tee /sys/block/<dev>/queue/scheduler
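A guarded version of that switch avoids silent failures when a kernel does not offer the requested scheduler. This is a sketch; `scheduler_offered` and `set_scheduler` are helper names invented here:

```shell
# True if the scheduler name appears in the sysfs offer list
# (brackets around the active one are ignored).
scheduler_offered() {
  # $1: scheduler name, $2: contents of .../queue/scheduler
  printf '%s\n' "$2" | tr -d '[]' | tr ' ' '\n' | grep -qx "$1"
}

set_scheduler() {
  dev="$1"; sched="$2"
  f="/sys/block/$dev/queue/scheduler"
  if scheduler_offered "$sched" "$(cat "$f")"; then
    echo "$sched" | sudo tee "$f" > /dev/null
  else
    echo "ERROR: $sched not offered on $dev" >&2
    return 1
  fi
}
```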

4) Baseline-first policy (don’t tune blind)

For each candidate scheduler, collect the same KPI bundle:

  • p50/p95/p99 request latency (application-level if possible),
  • IOPS and throughput,
  • CPU utilization,
  • timeout/retry/error rates.

Run at least 3 load shapes:

  1. read-heavy,
  2. mixed read/write,
  3. bursty sync-write + background scan (tail killer scenario).

Rule: if improvement appears only in synthetic throughput but hurts app p99, reject it.
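The three load shapes can be generated consistently with fio. A sketch, assuming fio is available; the flags used are standard fio options, while the shape names, runtime, and depth are placeholders to adjust for your fleet:

```shell
# Build one fio command per load shape so every scheduler run
# uses identical parameters.
fio_cmd() {
  # $1: shape (read-heavy|mixed|tail-killer), $2: target file/device
  shape="$1"; target="$2"
  base="fio --name=$shape --filename=$target --direct=1 --bs=4k --ioengine=libaio --iodepth=32 --runtime=60 --time_based"
  case "$shape" in
    read-heavy)  echo "$base --rw=randread" ;;
    mixed)       echo "$base --rw=randrw --rwmixread=70" ;;
    tail-killer) echo "$base --rw=randwrite --fsync=1 --numjobs=2" ;;
  esac
}
```

Running the same three commands against each candidate scheduler keeps the comparison honest.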


5) Practical tuning anchors

mq-deadline

Key tunables (under /sys/block/<dev>/queue/iosched/):

  • read_expire / write_expire: per-request deadlines in milliseconds,
  • writes_starved: how many read batches may run before writes get a turn,
  • fifo_batch: requests dispatched per batch; larger favors throughput over latency,
  • front_merges: enable/disable front merging.

Operational heuristics:

  • lower read_expire when read p99 matters more than write throughput,
  • raise fifo_batch on throughput-oriented hosts, lower it for latency,
  • change one knob at a time and re-run the baseline bundle.

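Writes to these sysfs knobs can silently fail (wrong scheduler active, read-only value), so a verify step after each write is cheap insurance. A sketch; the helper names are invented here:

```shell
iosched_path() {
  # $1: device, $2: tunable name
  echo "/sys/block/$1/queue/iosched/$2"
}

set_iosched_knob() {
  dev="$1"; knob="$2"; val="$3"
  p="$(iosched_path "$dev" "$knob")"
  echo "$val" | sudo tee "$p" > /dev/null
  [ "$(cat "$p")" = "$val" ]   # fail if the write did not stick
}

# e.g. set_iosched_knob sda read_expire 250
```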
kyber

Primary knobs:

  • read_lat_nsec: target read latency, in nanoseconds,
  • write_lat_nsec: target (sync) write latency, in nanoseconds.

Interpretation: it throttles to hit target latency classes; too aggressive targets can tank throughput.
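Because the targets are in nanoseconds, unit mistakes are easy (a "2 ms" target written as 2000 is 2 µs and will crater throughput). Tiny conversion helpers, sketched here, remove that class of error:

```shell
# Kyber latency targets are nanoseconds; convert from human units.
ms_to_nsec() { echo $(( $1 * 1000000 )); }
us_to_nsec() { echo $(( $1 * 1000 )); }

# e.g. a 2 ms read target:
#   ms_to_nsec 2 | sudo tee /sys/block/<dev>/queue/iosched/read_lat_nsec
```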

bfq

Use when interactive fairness is worth overhead.

none

No scheduler-level fairness/reordering policy.


6) ioprio reality check

Kernel docs note that I/O priorities are scheduler-dependent, currently supported by BFQ and mq-deadline.

Implication:

  • ionice class/level only change ordering when bfq or mq-deadline is active,
  • under none or kyber, I/O priorities are effectively ignored at the scheduler level.

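Automation that relies on ionice should therefore check the active scheduler first. A sketch; `ioprio_effective` and the job name are invented here, while `ionice -c 2 -n 7` is the standard best-effort class at lowest priority:

```shell
# True only when the active scheduler honors I/O priorities
# (BFQ and mq-deadline, per current kernel docs).
ioprio_effective() {
  case "$1" in
    bfq|mq-deadline) return 0 ;;
    *)               return 1 ;;
  esac
}

# Usage sketch (hypothetical job):
#   sched=mq-deadline   # the active scheduler on the target device
#   ioprio_effective "$sched" && ionice -c 2 -n 7 background_scan_job
```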

7) Persistent configuration pattern

Set scheduler persistently via udev rule (example):

# /etc/udev/rules.d/60-ioscheduler.rules
ACTION=="add|change", KERNEL=="nvme*n1", ATTR{queue/scheduler}="none"
ACTION=="add|change", KERNEL=="sd*", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"

Then reload and trigger:

sudo udevadm control --reload-rules
sudo udevadm trigger

Always verify post-boot state in CI/boot checks.
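The boot check can be a pure function over expected/actual values, so it is testable off-host. A sketch; `check_scheduler` is a name invented here:

```shell
# Compare expected scheduler vs the bracketed one in the sysfs
# file contents; print a FAIL line and return non-zero on mismatch.
check_scheduler() {
  # $1: device name, $2: expected, $3: scheduler file contents
  actual="$(printf '%s\n' "$3" | tr ' ' '\n' | sed -n 's/^\[\(.*\)\]$/\1/p')"
  if [ "$actual" = "$2" ]; then
    return 0
  fi
  echo "FAIL: $1 expected=$2 actual=$actual"
  return 1
}

# On a real host, loop over your policy:
#   check_scheduler nvme0n1 none "$(cat /sys/block/nvme0n1/queue/scheduler)"
```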


8) Rollout plan (safe)

  1. Shadow benchmark on representative hardware (same kernel + FS + mount options).
  2. Canary hosts (5–10%) with rollback-ready automation.
  3. Compare 24h diurnal traffic:
    • p99 latency,
    • timeout/retry rate,
    • CPU usage,
    • incident count.
  4. Promote gradually if p99 improves and no reliability regressions.
  5. Keep explicit rollback command + runbook.

Rollback must be one command, not a wiki adventure.
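One way to make rollback a single command: snapshot the pre-change state once, then restore from the snapshot. A sketch; both function names and the sysfs-root parameter (which exists only to make the logic testable) are invented here, and real use needs root to write the sysfs files:

```shell
# Record "dev scheduler" pairs for every block device.
snapshot_schedulers() {
  # $1: snapshot file; $2: sysfs root (default /sys/block)
  out="$1"; root="${2:-/sys/block}"
  : > "$out"
  for f in "$root"/*/queue/scheduler; do
    dev="$(basename "$(dirname "$(dirname "$f")")")"
    cur="$(tr ' ' '\n' < "$f" | sed -n 's/^\[\(.*\)\]$/\1/p')"
    printf '%s %s\n' "$dev" "$cur" >> "$out"
  done
}

# Restore every device to its snapshotted scheduler.
rollback_schedulers() {
  # $1: snapshot file of "dev scheduler" lines; $2: sysfs root
  root="${2:-/sys/block}"
  while read -r dev sched; do
    printf '%s\n' "$sched" > "$root/$dev/queue/scheduler"
  done < "$1"
}
```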


9) Common mistakes

  1. Treating one scheduler as universally best

    • Device type + workload + contention pattern decides.
  2. Benchmarking only max throughput

    • Real incidents are usually p99 latency + retry storms.
  3. Ignoring scheduler-dependent controls

    • ionice and fairness behavior change with scheduler choice.
  4. Forgetting persistence

    • Runtime echo test passes, reboot silently reverts.
  5. Tuning before establishing a clean baseline

    • You can’t optimize what you didn’t measure.

10) Minimal "good enough" defaults

If you need a pragmatic starting point:

  • NVMe, single-tenant: none,
  • SATA/SAS SSD or mixed server workloads: mq-deadline,
  • HDD: mq-deadline,
  • desktop/interactive-heavy: bfq.

Then measure and adapt. No static rule beats your own traces.

