Execution Simulator Fidelity Ladder Validation Playbook

2026-03-15 · finance

Date: 2026-03-15
Category: knowledge
Audience: small quant teams deploying live execution logic with limited incident budget


Why this playbook exists

Most teams either:

  1. trust backtests too much (fills are unrealistically easy), or
  2. overbuild simulation too early (high complexity, low decision value).

The practical answer is a fidelity ladder: use the simplest simulator that can falsify the decision you are currently making, then climb only when error signals justify it.


Core principle

Use simulation as a risk filter, not a PnL oracle.

For each strategy/tactic candidate, ask:

If not, do not promote.


Fidelity ladder (L0 -> L4)

L0 - Deterministic Cost Skeleton

What it is:

Good for:

Do NOT use for:


L1 - Replay with Causal Latency Injection

What it is:

Good for:

Promotion gate to L2:


L2 - Queue-Aware MBP Simulator

What it is:

Good for:

Known limitation:


L3 - Agent-Based Regime Simulator

What it is:

Good for:

Anti-footgun:


L4 - Hybrid Digital Twin (Replay + Synthetic Shocks)

What it is:

Good for:


Minimum metrics per ladder run

Track at least:

For stress runs also track:


Promotion policy (practical)

A candidate moves upward only if all are true:

  1. Tail control: p95 slippage within budget in current ladder tier
  2. Completion safety: completion ratio above floor
  3. Operational stability: no runaway retry/cancel loops
  4. Robustness: ranking remains acceptable across seed/regime perturbations
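One way to make the four conditions above mechanical is a small gate function that returns a named pass/fail per condition. This is a minimal sketch, not a prescribed implementation: the class names, field names, and thresholds (`GateThresholds`, `RunSummary`, `max_cancels_per_order`, `max_rank_drop`) are hypothetical, "runaway retry/cancel loops" is approximated by a cap on cancels per order, and "ranking remains acceptable" is approximated by a maximum rank drop versus the baseline run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    # All thresholds are illustrative assumptions; set them per strategy.
    p95_slippage_budget_bps: float
    completion_floor: float
    max_cancels_per_order: int
    max_rank_drop: int


@dataclass(frozen=True)
class RunSummary:
    slippage_bps: list            # per-order slippage in bps for this tier
    completion_ratio: float       # filled quantity / target quantity
    max_cancels_per_order: int    # worst cancel/retry count seen on any order
    baseline_rank: int            # candidate rank in the unperturbed run (1 = best)
    perturbed_ranks: list         # candidate rank under each seed/regime perturbation


def p95(values):
    """Nearest-rank 95th percentile (integer arithmetic avoids float rounding)."""
    if not values:
        raise ValueError("need at least one slippage observation")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n)
    return ordered[rank - 1]


def promotion_gates(run, th):
    """Return one boolean per promotion condition so failures are attributable."""
    worst_rank = max(run.perturbed_ranks)
    return {
        "tail_control": p95(run.slippage_bps) <= th.p95_slippage_budget_bps,
        "completion_safety": run.completion_ratio >= th.completion_floor,
        "operational_stability": run.max_cancels_per_order <= th.max_cancels_per_order,
        "robustness": worst_rank - run.baseline_rank <= th.max_rank_drop,
    }


def may_promote(run, th):
    """A candidate moves upward only if all gates pass."""
    return all(promotion_gates(run, th).values())
```

Returning the per-gate dictionary rather than a single boolean keeps the failing condition visible in run reports, which matters more than the final yes/no when deciding whether to climb a tier.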

A candidate moves to limited live canary only if:


Data contract checklist

Without this, simulator fidelity claims are mostly theater:


Common failure modes

  1. Single-regime overfitting

    • Fix: force calm/fragile/panic scenario suite in every promotion cycle.
  2. Mean-only validation

    • Fix: gate on p95/CVaR-style tails, not just average bps.
  3. Ignoring no-fill branches

    • Fix: include opportunity cost and deadline penalties explicitly.
  4. Unreproducible simulator runs

    • Fix: seed control + immutable manifests.
  5. Policy complexity outpacing observability

    • Fix: keep action space small until diagnostics are reliable.
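As an illustration of the "seed control + immutable manifests" fix, the sketch below derives a content hash from a run's configuration, seed, and input-data hashes, and draws all randomness from a generator seeded by that manifest. It assumes a JSON-serializable config. The function names, the manifest fields, and the `noise_bps` parameter are hypothetical, and `run_simulation` is a stand-in for a real simulator.

```python
import hashlib
import json
import random


def _canonical_digest(body):
    # Sorted keys and fixed separators make the hash independent of dict order.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(config, seed, data_hashes):
    """Bundle everything that determines a run and stamp it with a content hash.

    data_hashes maps an input name to a hash of its contents (assumed to be
    computed elsewhere), so a silent data change yields a different manifest.
    """
    body = {"config": config, "seed": seed, "data_hashes": data_hashes}
    return {**body, "manifest_id": _canonical_digest(body)}


def verify_manifest(manifest):
    """True only if the manifest content still matches its recorded ID."""
    body = {k: v for k, v in manifest.items() if k != "manifest_id"}
    return _canonical_digest(body) == manifest["manifest_id"]


def run_simulation(manifest):
    """Placeholder simulator: every random draw comes from a manifest-seeded RNG."""
    if not verify_manifest(manifest):
        raise ValueError("manifest was modified after creation")
    rng = random.Random(manifest["seed"])   # local RNG, no global state
    noise = manifest["config"]["noise_bps"]
    return [rng.gauss(0.0, noise) for _ in range(5)]


manifest = build_manifest({"noise_bps": 2.0}, seed=42, data_hashes={"book": "abc123"})
first = run_simulation(manifest)
second = run_simulation(manifest)
print(first == second)   # identical draws when replayed from the same manifest
```

Storing the `manifest_id` next to each result makes it possible to tell whether two runs are comparable before comparing their metrics.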

30-day rollout template

Week 1:

Week 2:

Week 3:

Week 4:


Bottom line

Execution simulation should evolve like safety engineering:

If your simulator cannot predict every fill, that is fine. If it cannot expose brittle behavior before live capital does, it is not doing its job.


References (starting points)