eBPF Event Channel Selection Playbook (Ring Buffer vs Perf Buffer)
Date: 2026-03-22
Category: knowledge
Scope: Practical operator guide for choosing between BPF_MAP_TYPE_RINGBUF and BPF_MAP_TYPE_PERF_EVENT_ARRAY when streaming eBPF events to userspace.
1) Why this matters
In many eBPF systems, your "event pipe" becomes the real bottleneck before probe cost does.
Typical production failures are not verifier failures—they are transport-shape failures:
- global event ordering assumptions break,
- per-CPU memory costs explode,
- userspace wakeup strategy burns CPU or increases latency,
- burst loss silently biases observability.
Choosing the wrong channel can make an otherwise good eBPF program look unstable.
2) Mental model
Perf buffer (BPF_MAP_TYPE_PERF_EVENT_ARRAY)
Think: per-CPU lanes backed by perf events.
- Mature and widely used.
- Natural per-CPU isolation.
- Events from different CPUs can be observed out of global time order.
- Requires perf-event setup/population in the map.
Ring buffer (BPF_MAP_TYPE_RINGBUF)
Think: shared MPSC queue across CPUs.
- Single shared ring (not per-CPU by default).
- Better memory efficiency for many workloads.
- Preserves reservation order across producers (helps cross-CPU event ordering use cases).
- Designed for simpler kernel→userspace event streaming.
3) Fast decision map
Need coherent cross-CPU event ordering (e.g., fork/exec/exit chains, cross-core causality traces)?
→ Prefer ring buffer.
Need strict per-CPU isolation and already have perf-event tooling/integration?
→ Perf buffer remains valid.
Memory pressure from per-CPU buffers is significant?
→ Prefer ring buffer.
Large existing codebase already stable on perf buffer and no ordering pain?
→ Keep perf buffer unless clear KPI upside justifies migration.
4) Core API differences (operator-relevant)
4.1 Kernel-side write path
Perf buffer
- Helper: bpf_perf_event_output()
- Destination map type: BPF_MAP_TYPE_PERF_EVENT_ARRAY
- Important caveat: writing to a perf event on a different CPU fails (-EOPNOTSUPP), so BPF_F_CURRENT_CPU is usually safest.
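A minimal BPF-side sketch of the perf write path (the event struct, tracepoint, and map name are illustrative assumptions, not a drop-in program):

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct event {
    __u32 pid;
    __u64 ts_ns;
};

struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(__u32));
    __uint(value_size, sizeof(__u32));
} events SEC(".maps");

SEC("tracepoint/sched/sched_process_exec")
int handle_exec(void *ctx)
{
    struct event e = {
        .pid   = (__u32)bpf_get_current_pid_tgid(),
        .ts_ns = bpf_ktime_get_ns(),
    };
    /* BPF_F_CURRENT_CPU routes the record to the current CPU's perf ring,
     * sidestepping the cross-CPU -EOPNOTSUPP failure noted above. */
    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &e, sizeof(e));
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```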
Ring buffer
- Map type: BPF_MAP_TYPE_RINGBUF
- Two write styles:
  - bpf_ringbuf_output() (copy path; allows more flexible record sizing),
  - bpf_ringbuf_reserve() + bpf_ringbuf_submit()/bpf_ringbuf_discard() (zero-copy style; reservation then commit/discard).
- The reserve() path requires verifier-checkable bounded access; the practical pattern is fixed-size structs.
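The reserve/submit pattern can be sketched as follows (fixed-size event struct and tracepoint are assumptions for illustration):

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct event {
    __u32 pid;
    __u64 ts_ns;
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1 << 24);
} events SEC(".maps");

SEC("tracepoint/sched/sched_process_exit")
int handle_exit(void *ctx)
{
    /* Fixed-size reservation keeps the access pattern verifier-checkable. */
    struct event *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e)
        return 0; /* ring full: count a drop here in production code */

    e->pid   = (__u32)bpf_get_current_pid_tgid();
    e->ts_ns = bpf_ktime_get_ns();
    bpf_ringbuf_submit(e, 0); /* or bpf_ringbuf_discard(e, 0) to abort */
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```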
4.2 Userspace consume path (libbpf)
Perf buffer
- perf_buffer__new(...), perf_buffer__poll(...)
- Typically register sample and lost callbacks.
Ring buffer
- ring_buffer__new(...), ring_buffer__poll(...)
- Single callback style for consumed records.
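A minimal userspace consume loop with libbpf might look like this (it assumes you already loaded a BPF object and have the ring buffer map's fd; the callback body is a placeholder):

```c
#include <bpf/libbpf.h>
#include <errno.h>

static int handle_event(void *ctx, void *data, size_t len)
{
    /* Parse your event struct here; return 0 to keep consuming. */
    (void)ctx; (void)data; (void)len;
    return 0;
}

int consume_loop(int ringbuf_map_fd)
{
    struct ring_buffer *rb =
        ring_buffer__new(ringbuf_map_fd, handle_event, NULL, NULL);
    if (!rb)
        return -1;

    for (;;) {
        /* 100 ms timeout: epoll-based wait, not busy polling. */
        int n = ring_buffer__poll(rb, 100);
        if (n < 0 && n != -EINTR)
            break;
    }
    ring_buffer__free(rb);
    return 0;
}
```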
5) Capacity planning and latency trade-offs
5.1 Perf buffer sizing
- Capacity is effectively multiplied by CPU count (per-CPU rings).
- Good for per-core burst isolation.
- Bad for memory footprint at high CPU counts.
- Cross-CPU "global timeline" reconstruction must be done in userspace.
5.2 Ring buffer sizing
- max_entries defines one shared ring size (must be a power of 2 and a multiple of the page size; key/value size must be 0).
- Better aggregate memory efficiency.
- Shared queue can experience contention/hot bursts from many producers.
- Large records can create head-of-line effects for downstream consumers if userspace falls behind.
6) Notification strategy (easy to get wrong)
Ring buffer supports adaptive wakeups by default; can be overridden with:
- BPF_RB_NO_WAKEUP
- BPF_RB_FORCE_WAKEUP
Practical guidance:
- Start with default adaptive behavior.
- Use BPF_RB_NO_WAKEUP only if you intentionally batch and can prove bounded latency.
- Use BPF_RB_FORCE_WAKEUP only when low-latency signaling beats CPU cost in your SLO.
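In code, the choice lands in the flags argument at submit time. A fragment-level sketch (here `e` is a record obtained from a successful bpf_ringbuf_reserve(), and `low_latency_path` is a hypothetical condition):

```c
if (low_latency_path)
    bpf_ringbuf_submit(e, BPF_RB_FORCE_WAKEUP); /* wake the consumer now */
else
    bpf_ringbuf_submit(e, BPF_RB_NO_WAKEUP);    /* defer; consumer drains on its own timer */
/* flags = 0 keeps the default adaptive notification behavior. */
```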
7) Failure modes and mitigations
Failure mode A: "Events seem reordered"
- Common with perf buffer across CPUs.
- Mitigation: add monotonic timestamps + CPU id + sequence fields; or migrate to ring buffer if global ordering is operationally required.
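The mitigation fields above can be packed into a small self-describing header (field names are illustrative, not a fixed schema):

```c
/* Prefix every event payload with this header. */
struct event_hdr {
    __u64 ts_ns;   /* bpf_ktime_get_ns(): monotonic timestamp */
    __u32 cpu;     /* bpf_get_smp_processor_id() */
    __u32 seq;     /* per-CPU counter incremented by the producer */
};
/* Userspace sorts by ts_ns and uses (cpu, seq) to detect gaps and drops. */
```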
Failure mode B: "Invisible drops under burst"
- Root cause: userspace cannot drain fast enough; reservations/output fail.
- Mitigation:
- add explicit dropped counters in BPF maps,
- monitor userspace consume lag,
- increase buffer size and/or reduce event volume (sample/aggregate/filter in-kernel).
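One way to make drops explicit is a per-CPU counter bumped whenever reservation fails (a sketch; map and function names are illustrative):

```c
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} dropped SEC(".maps");

static __always_inline void count_drop(void)
{
    __u32 key = 0;
    __u64 *val = bpf_map_lookup_elem(&dropped, &key);
    if (val)
        __sync_fetch_and_add(val, 1);
}

/* Usage in the producer path:
 *   e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
 *   if (!e) { count_drop(); return 0; }
 */
```

Userspace then reads and sums the per-CPU values periodically, turning "invisible drops" into a first-class metric.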
Failure mode C: "CPU burn from polling"
- Busy polling for ultra-low latency can overheat cores.
- Mitigation: begin with epoll/poll timeout strategy; only move to tight busy loops with measured SLO wins.
Failure mode D: "Migration breaks verifier assumptions"
- A reserve/submit migration can fail when variable-size record paths were previously used.
- Mitigation: use bpf_ringbuf_output() first for compatibility, then optimize hot paths to fixed-size reserve/submit patterns.
8) Migration playbook (perf → ring, low risk)
- Dual-instrument in staging: keep perf path, add ring path counters.
- Schema freeze for event payloads (version field + backward-compatible parser).
- Shadow consume ring buffer without acting on it; compare event rate, lag, drop counters.
- Canary cutover by host slice; track CPU, loss, end-to-end event latency.
- Rollback rule: auto-revert if loss/lag breaches threshold for N consecutive windows.
9) Minimal implementation skeletons
Ring buffer map (BPF side)
struct {
__uint(type, BPF_MAP_TYPE_RINGBUF);
__uint(max_entries, 1 << 24); // example: 16 MiB
} events SEC(".maps");
Perf event array map (BPF side)
struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
__uint(max_entries, 0); // loader usually sets to nr_cpus
} events SEC(".maps");
(Exact sizing and loader behavior should be standardized in your team template.)
10) Operational recommendation
For new event-streaming eBPF projects, default to ring buffer unless you have a specific per-CPU perf-event requirement.
For legacy stable perf-buffer deployments, migrate only when one of these is true:
- cross-CPU ordering correctness matters,
- per-CPU memory overhead is painful,
- measurable throughput/latency gain is expected from ring path.
11) References
- Linux kernel docs — BPF ring buffer (design, semantics, implementation): https://docs.kernel.org/6.6/bpf/ringbuf.html
- Linux kernel docs — BPF maps overview: https://docs.kernel.org/next/bpf/maps.html
- eBPF docs — BPF_MAP_TYPE_RINGBUF: https://docs.ebpf.io/linux/map-type/BPF_MAP_TYPE_RINGBUF/
- eBPF docs — BPF_MAP_TYPE_PERF_EVENT_ARRAY: https://docs.ebpf.io/linux/map-type/BPF_MAP_TYPE_PERF_EVENT_ARRAY/
- eBPF helper docs — bpf_perf_event_output: https://docs.ebpf.io/linux/helper-function/bpf_perf_event_output/
- eBPF helper docs — bpf_ringbuf_output / bpf_ringbuf_reserve: https://docs.ebpf.io/linux/helper-function/bpf_ringbuf_output/ and https://docs.ebpf.io/linux/helper-function/bpf_ringbuf_reserve/
- libbpf userspace API docs: https://libbpf.readthedocs.io/en/latest/api.html
- libbpf userspace function notes (ring_buffer__poll, perf_buffer__poll): https://docs.ebpf.io/ebpf-library/libbpf/userspace/ring_buffer__poll/ and https://docs.ebpf.io/ebpf-library/libbpf/userspace/perf_buffer__poll/