Microprice + Order-Book Imbalance: A Practical Modeling Playbook
If you only look at mid-price, you miss the queue pressure that often drives the next short-horizon move.
For intraday execution and market making, microprice and imbalance features are a compact, high-signal baseline to establish before moving to deep learning.
One-Line Intuition
Mid-price tells you where price is; microprice + imbalance tell you where price pressure is.
Core Definitions You Actually Need
Let:
- Best bid price/size: ((P_b, Q_b))
- Best ask price/size: ((P_a, Q_a))
- Mid-price: (M = (P_b + P_a)/2)
- Spread: (S = P_a - P_b)
1) Top-of-book imbalance
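Two common forms, related by (I_t^{(signed)} = 2 I_t^{(ratio)} - 1), so the ratio form lies in [0, 1] and the signed form in [-1, 1]: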
[ I_t^{(ratio)} = \frac{Q_b}{Q_b + Q_a}, \qquad I_t^{(signed)} = \frac{Q_b - Q_a}{Q_b + Q_a} ]
Interpretation:
- Higher bid-side size (relative to ask) -> upward short-horizon pressure
- Higher ask-side size -> downward pressure
2) Classic level-1 microprice
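A common size-weighted estimator. Note the cross-weighting: bid size multiplies the ask price, so a heavy bid queue pulls the estimate toward the ask: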
[ \mu_t = \frac{P_a Q_b + P_b Q_a}{Q_b + Q_a} ]
Equivalent view:
[ \mu_t = M_t + \frac{S_t}{2} \cdot I_t^{(signed)} ]
So microprice is mid-price adjusted by spread and queue imbalance.
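A minimal sketch of the definitions above in plain Python. The function name `l1_features` and the example quote are illustrative assumptions; the arithmetic is exactly the level-1 formulas already given.

```python
def l1_features(bid_px, bid_sz, ask_px, ask_sz):
    """Return mid, spread, signed imbalance, and microprice for one L1 quote."""
    total = bid_sz + ask_sz
    if total <= 0:
        raise ValueError("both queues are empty")
    mid = 0.5 * (bid_px + ask_px)
    spread = ask_px - bid_px
    signed_imb = (bid_sz - ask_sz) / total
    # Cross-weighting: bid size multiplies the ask price.
    micro = (ask_px * bid_sz + bid_px * ask_sz) / total
    return {"mid": mid, "spread": spread,
            "imbalance": signed_imb, "microprice": micro}


# Hypothetical quote: 300 on the bid at 100.00, 100 on the ask at 100.02.
f = l1_features(bid_px=100.00, bid_sz=300, ask_px=100.02, ask_sz=100)
# The weighted form and the mid + (S/2) * I form agree up to float rounding.
gap = f["microprice"] - (f["mid"] + 0.5 * f["spread"] * f["imbalance"])
print(f, abs(gap) < 1e-9)
```

With three times as much size on the bid, the microprice sits three quarters of the way from the bid to the ask.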
3) Order-flow imbalance (OFI)
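At event level, track queue changes (new/cancel/execute) rather than only a static snapshot:
[ \text{OFI}_{[t,t+\Delta]} = \sum_{n:\, t \le t_n \le t+\Delta} e_n ]
where (e_n) is the signed contribution of the (n)-th best bid/ask queue update, occurring at time (t_n): size added at the best bid or removed from the best ask counts as positive, and the reverse as negative. In practice this often explains the immediate (\Delta P) better than plain traded volume.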
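One way to compute the per-update contribution is the best-quote definition from Cont, Kukanov and Stoikov (2014), sketched below. The tuple layout, the function names, and the use of integer tick prices are assumptions made for this example.

```python
def ofi_contribution(prev, curr):
    """Signed contribution e_n of one best-quote update.

    prev and curr are (bid_px, bid_sz, ask_px, ask_sz) tuples, with prices
    in integer ticks so that equality comparisons are exact.
    """
    pb0, qb0, pa0, qa0 = prev
    pb1, qb1, pa1, qa1 = curr
    e = 0
    if pb1 >= pb0:   # bid held or improved: current bid size is demand
        e += qb1
    if pb1 <= pb0:   # bid held or dropped: previous bid size is gone
        e -= qb0
    if pa1 <= pa0:   # ask held or improved: current ask size is supply
        e -= qa1
    if pa1 >= pa0:   # ask held or retreated: previous ask size is gone
        e += qa0
    return e


def ofi(quotes):
    """Sum e_n over consecutive best-quote snapshots."""
    return sum(ofi_contribution(a, b) for a, b in zip(quotes, quotes[1:]))


# Hypothetical sequence: bid queue grows, ask queue shrinks, bid improves.
quotes = [
    (10000, 300, 10002, 100),
    (10000, 350, 10002, 100),  # +50 added at the bid
    (10000, 350, 10002, 60),   # 40 removed from the ask
    (10001, 50, 10002, 60),    # new best bid with 50 shares
]
print(ofi(quotes))
```

When a price level is unchanged, the two matching terms reduce to the change in queue size, which is why unchanged sides contribute zero.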
Empirical Facts Worth Building Around
- Short-horizon price change is strongly linked to order-flow imbalance; a linear first-order approximation is surprisingly useful.
- Depth matters as a scaling denominator: same OFI has smaller impact in deeper books.
- Volume alone is noisier than queue-aware flow metrics at short horizons.
- State dependence is real: spread regime, queue shape, and event intensity alter the mapping from imbalance to future move.
Practical Feature Set (Strong Baseline)
For horizons like 100ms / 500ms / 1s:
- L1 snapshot: (S_t, I_t^{(signed)}, \mu_t - M_t)
- Multi-level imbalance (aggregated over L1 to L5, or L1 to L10)
- OFI windows: 50ms/100ms/500ms rolling
- Event intensity: updates/sec, trades/sec, cancels/sec
- Queue turnover: cancel-to-add ratio by side
- Volatility proxy: microprice variance over a short rolling window
- Time-of-day phase (open/close/midday)
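Two of the features above, multi-level imbalance and rolling OFI windows, can be sketched with the standard library alone. The helper names and the sample timestamps are hypothetical; the window is taken as the half-open interval (t - window, t] so each value uses only past and current events.

```python
from bisect import bisect_right


def multilevel_imbalance(bid_sizes, ask_sizes, levels=5):
    """Signed imbalance over the top `levels` price levels on each side."""
    b = sum(bid_sizes[:levels])
    a = sum(ask_sizes[:levels])
    return (b - a) / (b + a) if b + a > 0 else 0.0


def trailing_sum(times_ms, values, window_ms):
    """For each event i, sum values with timestamp in (t_i - window_ms, t_i].

    times_ms must be sorted ascending. Prefix sums keep this O(n log n).
    """
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    out = []
    for i, t in enumerate(times_ms):
        lo = bisect_right(times_ms, t - window_ms)  # first event inside window
        out.append(prefix[i + 1] - prefix[lo])
    return out


# Hypothetical event times (ms) and per-event OFI contributions.
times = [0, 40, 90, 120, 300]
e_n = [50, 40, 50, -30, 10]
print(trailing_sum(times, e_n, window_ms=100))
print(multilevel_imbalance([300, 200, 100], [100, 100, 100], levels=3))
```

Calling `trailing_sum` once per window length (50, 100, 500 ms) yields the rolling OFI columns.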
Minimal engineered target examples:
- Classification: (\operatorname{sign}(M_{t+h} - M_t))
- Regression: (M_{t+h} - M_t)
- Execution-oriented: expected markout at (h)
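The regression and classification targets can be built from a timestamped mid series as sketched here. This assumes the mid at t + h is the last quote observed at or before t + h, and it leaves rows unlabeled when the horizon runs past the end of the data; both choices, and the function names, are assumptions of this example.

```python
from bisect import bisect_right


def forward_mid_change(times_ms, mids, horizon_ms):
    """Regression target M_{t+h} - M_t for each row, or None if unavailable."""
    last_t = times_ms[-1]
    out = []
    for t, m in zip(times_ms, mids):
        if t + horizon_ms > last_t:
            out.append(None)  # horizon extends past the data: no label
            continue
        j = bisect_right(times_ms, t + horizon_ms) - 1  # last quote <= t + h
        out.append(mids[j] - m)
    return out


def sign(x):
    """Classification target: -1, 0, or +1."""
    return (x > 0) - (x < 0)


# Hypothetical mids in ticks.
times = [0, 40, 90, 120, 300]
mids = [10001.0, 10001.0, 10001.5, 10001.5, 10001.0]
dm = forward_mid_change(times, mids, horizon_ms=100)
labels = [None if d is None else sign(d) for d in dm]
print(dm, labels)
```

Dropping the unlabeled tail rows before training avoids silently treating missing futures as zero moves.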
Modeling Ladder (Start Simple, Then Escalate)
Stage A — Fast linear baseline
- Ridge/Lasso on ([I, \text{OFI}, S, \text{depth}])
- Separate models by spread regime (1 tick vs >1 tick)
- Refit frequently (intraday/rolling)
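A deliberately reduced sketch of Stage A: a one-feature ridge fit of future mid change on OFI/depth, with a separate slope per spread regime. A real baseline would use the full feature vector and a proper solver; the single feature, the missing intercept, the regime labels, and the row layout are simplifying assumptions for this example.

```python
from collections import defaultdict


def fit_ridge_by_regime(rows, lam=1.0):
    """Fit y ~ beta * (OFI / depth) separately per spread regime.

    rows: iterable of (spread_ticks, ofi, depth, future_mid_change).
    Closed form for one feature without intercept:
        beta = sum(x * y) / (sum(x * x) + lam)
    """
    sxx = defaultdict(float)
    sxy = defaultdict(float)
    for spread_ticks, ofi, depth, y in rows:
        regime = "one_tick" if spread_ticks <= 1 else "wide"
        x = ofi / depth  # depth as scaling denominator
        sxx[regime] += x * x
        sxy[regime] += x * y
    return {r: sxy[r] / (sxx[r] + lam) for r in sxx}


# Hypothetical rows: (spread in ticks, OFI, depth, future mid change).
rows = [
    (1, 100, 1000, 0.2),
    (1, -50, 1000, -0.1),
    (2, 100, 1000, 0.5),
]
betas = fit_ridge_by_regime(rows, lam=0.0)
print(betas)
```

Because the fit only needs two running sums per regime, refitting on a rolling intraday window is cheap.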
Stage B — Regime-aware nonlinear model
- Gradient boosting or small MLP
- Interaction terms: (\text{OFI}/\text{depth}), (I \times S), cancel intensity × spread
- Calibrate per symbol cluster (liquid/illiquid buckets)
Stage C — Sequence model
- CNN/LSTM/Transformer over event stream or LOB tensors
- Keep microprice/OFI features as explicit channels (helps robustness/debuggability)
- Use strict walk-forward evaluation and a latency-aware inference budget
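Walk-forward evaluation applies to every stage of the ladder, not only sequence models. A minimal index-based splitter might look like the following; the `embargo` gap (rows skipped between train and test so labels with a forward horizon do not leak) and the function name are assumptions of this sketch.

```python
def walk_forward_splits(n, train_size, test_size, embargo=0):
    """Yield (train, test) index ranges in chronological order.

    Each test block starts `embargo` rows after its train block ends, and
    the window then rolls forward by one test block.
    """
    start = 0
    while start + train_size + embargo + test_size <= n:
        train = range(start, start + train_size)
        test_start = start + train_size + embargo
        test = range(test_start, test_start + test_size)
        yield train, test
        start += test_size


# Hypothetical sizes: 10 rows, train on 4, skip 1, test on 2.
splits = list(walk_forward_splits(10, train_size=4, test_size=2, embargo=1))
for train, test in splits:
    print(list(train), list(test))
```

The embargo should cover at least the label horizon measured in rows, otherwise the last training labels overlap the test period.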
Calibration & Monitoring (Where Most Systems Fail)
1) Calibration
- Check reliability by predicted move-probability deciles
- Maintain horizon-specific calibration (100ms model calibration rarely transfers to 1s)
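The decile reliability check can be sketched as a small table of mean predicted probability against realized hit rate per bin. Equal-count bins by predicted rank, the function name, and the toy inputs are assumptions here.

```python
def reliability_table(probs, outcomes, n_bins=10):
    """Return rows of (bin, count, mean predicted prob, realized rate).

    probs: predicted probability of an up-move; outcomes: 1 if up, else 0.
    Rows are sorted by prediction and cut into equal-count bins.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    table = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer rows than bins
        mean_p = sum(p for p, _ in chunk) / len(chunk)
        rate = sum(y for _, y in chunk) / len(chunk)
        table.append((b, len(chunk), mean_p, rate))
    return table


# Hypothetical predictions and outcomes, two bins for brevity.
table = reliability_table([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], n_bins=2)
for row in table:
    print(row)
```

A well-calibrated model has the last two columns close to each other in every bin; running the table once per horizon keeps the 100ms and 1s checks separate.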
2) Drift diagnostics
Monitor at least:
- hit-rate by spread regime
- signed error conditional on imbalance decile
- realized markout vs predicted markout
- feature distribution drift (PSI/KS on (I), OFI, spread, depth)
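For the feature-drift item, a population stability index (PSI) can be computed as sketched below. Bin edges are taken from quantiles of the reference sample, and empty bins are floored at a small `eps` so the logarithm stays finite; both conventions, and the function name, are assumptions of this example.

```python
import math
from bisect import bisect_right


def psi(reference, live, n_bins=10, eps=1e-6):
    """PSI of `live` against `reference` using reference-quantile bins."""
    ref = sorted(reference)
    # n_bins - 1 interior edges at approximate reference quantiles.
    edges = [ref[min(len(ref) - 1, (k * len(ref)) // n_bins)]
             for k in range(1, n_bins)]

    def shares(xs):
        counts = [0] * n_bins
        for x in xs:
            counts[bisect_right(edges, x)] += 1
        return [max(c / len(xs), eps) for c in counts]

    p, q = shares(reference), shares(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical feature samples: the live window is shifted upward.
ref_sample = list(range(100))
live_sample = [x + 50 for x in ref_sample]
print(psi(ref_sample, ref_sample), psi(ref_sample, live_sample))
```

Every term in the sum is non-negative, so the index is zero only when the binned distributions match and grows as they diverge.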
3) Risk gating
Downweight or suspend the signal when:
- the spread widens abruptly
- a cancellation burst exceeds its threshold
- there is a market-data gap or sequence-number uncertainty
- an auction or halt-transition regime is detected
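These gates can be expressed as a multiplier applied to the forecast. In the sketch below, hard conditions (data gaps, auctions, halts) zero the signal and soft conditions halve it; the field names, thresholds, and the halving rule are all hypothetical placeholders to be tuned per venue.

```python
from dataclasses import dataclass


@dataclass
class MarketState:
    spread_ticks: float
    median_spread_ticks: float  # recent baseline for "abrupt" widening
    cancels_per_sec: float
    seq_gap: bool               # market-data gap or sequence uncertainty
    in_auction_or_halt: bool


def signal_weight(s, spread_mult=3.0, cancel_limit=500.0):
    """Return a weight in [0, 1] to multiply into the forecast."""
    if s.seq_gap or s.in_auction_or_halt:
        return 0.0  # suspend: the book itself is not trustworthy
    w = 1.0
    if s.spread_ticks > spread_mult * s.median_spread_ticks:
        w *= 0.5    # downweight on abrupt spread widening
    if s.cancels_per_sec > cancel_limit:
        w *= 0.5    # downweight on a cancellation burst
    return w


calm = MarketState(1, 1, 50, False, False)
stressed = MarketState(5, 1, 900, False, False)
gapped = MarketState(1, 1, 50, True, False)
print(signal_weight(calm), signal_weight(stressed), signal_weight(gapped))
```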
Execution Integration (Decision Layer)
Use the forecast as one input, not the sole trigger:
- If upward microprice pressure + low adverse-selection score -> lean passive bid / delay aggressive buy
- If downward pressure while long inventory -> accelerate passive unwind or cross smaller slices
- Tie decision to inventory, urgency, and remaining schedule budget
A practical control form:
[ \text{Aggressiveness} = f(\text{forecast edge}, \text{inventory risk}, \text{time urgency}, \text{slippage budget left}) ]
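The control form leaves f unspecified. Purely as an illustration, one toy choice is shown below: take the strongest of the three pressures and scale it by the remaining slippage budget. The functional form, the [0, 1] input scaling, and the `edge_scale` parameter are assumptions of this sketch, not a recommended policy.

```python
def clip01(x):
    """Clamp x to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def aggressiveness(edge_ticks, inventory_risk, urgency, budget_left,
                   edge_scale=1.0):
    """Toy mapping to [0, 1]: 0 = rest passively, 1 = cross the spread.

    edge_ticks: forecast move against the order if it waits (ticks);
    inventory_risk, urgency, budget_left: already scaled to [0, 1].
    """
    pressure = clip01(edge_ticks / edge_scale)
    raw = max(pressure, clip01(inventory_risk), clip01(urgency))
    # With no slippage budget left, stay passive regardless of pressure.
    return raw * clip01(budget_left)


print(aggressiveness(0.5, 0.2, 0.1, 1.0))
print(aggressiveness(0.5, 0.2, 0.9, 0.5))
print(aggressiveness(2.0, 0.0, 0.0, 0.0))
```

Keeping the forecast as only one argument of the max makes it easy to audit which input drove a given decision.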
Common Mistakes
- Training on snapshot imbalance only; ignoring event-flow dynamics
- Mixing horizons (features at 50ms, labels at 2s) without explicit rationale
- Using random CV instead of chronological walk-forward
- Ignoring queue position and fill uncertainty in execution evaluation
- Treating paper alpha as deployable alpha without inference-latency accounting
Minimal Implementation Checklist
- Build event-time LOB reconstruction with deterministic replay
- Compute L1/Lk imbalance + OFI features
- Train per-horizon baseline (linear + tree)
- Evaluate with walk-forward and markout-oriented metrics
- Add calibration + drift monitor
- Integrate with execution policy under risk gates
- Promote via champion/challenger rollout
One-Sentence Summary
Microprice and order-book imbalance are low-latency, high-value priors for short-horizon direction and execution timing; the edge comes from regime-aware calibration, drift control, and disciplined execution integration—not from a fancy model alone.
References (Starter Set)
- Cont, R., Kukanov, A., Stoikov, S. (2014). The Price Impact of Order Book Events. Journal of Financial Econometrics. arXiv:1011.6402 — https://arxiv.org/abs/1011.6402
- Huang, W., Lehalle, C.-A., Rosenbaum, M. (2015). Simulating and Analyzing Order Book Data: The Queue-Reactive Model. JASA. arXiv:1312.0563 — https://arxiv.org/abs/1312.0563
- Stoikov, S. (2018). The Micro-Price: A High-Frequency Estimator of Future Prices. SSRN 2970694 — https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2970694
- Zhang, Z., Zohren, S., Roberts, S. (2019). DeepLOB. IEEE TSP. arXiv:1808.03668 — https://arxiv.org/abs/1808.03668
- Blakely, C. D. (2024). High resolution microprice estimates... arXiv:2411.13594 — https://arxiv.org/abs/2411.13594