Corporate Actions + Point-in-Time Price Series Playbook
If a trading stack gets corporate actions wrong, everything downstream lies: backtests, risk, fills attribution, even position PnL explain.
This playbook is a practical guide for building point-in-time-correct price/volume histories and keeping live + research behavior aligned.
1) Why this is operationally critical
Corporate actions are not “data cleanup.” They are state transitions in the tradable instrument:
- Splits / reverse splits change share count and price scale.
- Cash dividends affect total-return reality and some benchmark logic.
- Stock dividends / bonus issues / rights / spin-offs alter economics and continuity assumptions.
- Ticker/symbol changes can break joins while the legal instrument remains continuous.
- Delistings / relistings break survivorship-biased datasets if mishandled.
If adjustments are inconsistent, you get fake signals:
- phantom momentum breaks,
- false gap events,
- wrong volatility regimes,
- bad slippage benchmarks from mis-scaled historical prices.
2) Golden rules
- Store both raw and adjusted data. Never throw raw away.
- Point-in-time first. Any query at T must only use info known at T.
- Version action events. Corrections happen; event history must be auditable.
- Separate instrument identity from display symbol. Symbol is a label, not identity.
- Deterministic replays. Same input snapshot + code hash => same adjusted output.
3) Data model (minimal but production-safe)
A. Instrument identity table
- instrument_id (stable internal key)
- primary_symbol
- isin / cusip (if available)
- listing_venue
- validity range (valid_from, valid_to)
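As a minimal sketch of how the validity range lets a symbol resolve to a stable identity, the Python below looks up an instrument_id for a symbol on a given date. The class name SymbolMapping, the function resolve_instrument, the inclusive treatment of valid_to, and the tickers OLDCO/NEWCO are illustrative assumptions, not part of any existing schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SymbolMapping:
    """One row of the identity table (hypothetical shape)."""
    instrument_id: str
    primary_symbol: str
    listing_venue: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still valid" (assumption)


def resolve_instrument(mappings, symbol, on_date):
    """Return the instrument_id that `symbol` referred to on `on_date`."""
    hits = [
        m.instrument_id
        for m in mappings
        if m.primary_symbol == symbol
        and m.valid_from <= on_date
        and (m.valid_to is None or on_date <= m.valid_to)  # inclusive end
    ]
    if len(hits) != 1:
        raise LookupError(f"{symbol} on {on_date}: {len(hits)} matches")
    return hits[0]


# Hypothetical symbol change: the same instrument trades as OLDCO, then NEWCO.
MAPPINGS = [
    SymbolMapping("inst-001", "OLDCO", "XNAS", date(2020, 1, 2), date(2022, 6, 8)),
    SymbolMapping("inst-001", "NEWCO", "XNAS", date(2022, 6, 9)),
]
```

Both symbols resolve to inst-001, so joins keyed on instrument_id survive the rename, while a lookup for NEWCO before its valid_from fails loudly instead of guessing.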
B. Corporate actions ledger (bitemporal)
- action_id
- instrument_id
- action_type (split, cash_dividend, rights, spin_off, symbol_change, etc.)
- effective_date (market-effective date)
- announcement_time
- ingested_at (system-time)
- payload fields (ratio, amount, currency, parent-child mapping)
- revision / supersedes_action_id
- status (active, canceled, corrected)
C. Price bars (raw)
- instrument_id
- ts
- open / high / low / close
- volume
- vwap (optional)
- source metadata (vendor, batch_id, checksum)
D. Adjustment factors table
- instrument_id
- ts (bar timestamp)
- price_factor
- volume_factor
- factor_version
- provenance (built_from_action_snapshot)
4) Adjustment algebra (practical conventions)
Define adjusted values as:
- adj_price = raw_price * price_factor
- adj_volume = raw_volume * volume_factor
For a split ratio N-for-M (e.g., 2-for-1):
- historical prices before effective date usually get multiplied by M/N
- historical volumes before effective date usually get multiplied by N/M
Keep conventions explicit and immutable in docs/code.
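The split convention above can be sketched as a small factor builder. This is an illustration under stated assumptions: the function name split_factors and the (effective_date, n, m) tuple layout are hypothetical, and bars dated on the effective date itself are treated as already post-split. Fractions are used so that compounded ratios stay exact.

```python
from datetime import date
from fractions import Fraction


def split_factors(bar_dates, splits):
    """Map each bar date to (price_factor, volume_factor).

    splits: iterable of (effective_date, n, m) for N-for-M splits.
    Bars strictly before an effective date are rescaled by M/N (price)
    and N/M (volume); later bars are left untouched.
    """
    factors = {}
    for d in bar_dates:
        price_factor = Fraction(1)
        for effective_date, n, m in splits:
            if d < effective_date:
                price_factor *= Fraction(m, n)
        factors[d] = (price_factor, 1 / price_factor)
    return factors


# Hypothetical 2-for-1 split effective 2024-01-03.
FACTORS = split_factors(
    [date(2024, 1, 2), date(2024, 1, 3)],
    [(date(2024, 1, 3), 2, 1)],
)
pre_price_factor, pre_volume_factor = FACTORS[date(2024, 1, 2)]
adj_price = 100 * pre_price_factor        # raw close 100 becomes 50
adj_volume = 1_000 * pre_volume_factor    # raw volume 1,000 becomes 2,000
```

Note that adj_price * adj_volume equals the raw price times the raw volume, so traded notional is preserved across the rescaling.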
Cash dividends
Two common tracks:
- Price-only adjusted (split-adjusted, no dividend total-return math)
- Total-return adjusted (dividends reinvested convention)
Do not mix these silently. Label series type in schema/API.
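To make the difference between the two tracks concrete, here is a hedged sketch. It assumes one common proportional convention for the total-return track, in which prices before the ex-date are scaled by 1 - dividend / previous close; your vendor or benchmark may define it differently. The function names and the series-type labels as dictionary keys are illustrative.

```python
def dividend_price_factor(prev_close, cash_dividend):
    """Back-adjustment factor for bars before the ex-date.

    Assumes the proportional convention 1 - D / P, where P is the last
    close before the ex-date. Other conventions exist.
    """
    if prev_close <= 0 or not 0 <= cash_dividend < prev_close:
        raise ValueError("need prev_close > 0 and 0 <= dividend < prev_close")
    return 1.0 - cash_dividend / prev_close


def labeled_closes(raw_close, split_factor, dividend_factor):
    """Return one close per explicitly named series type."""
    split_adj = raw_close * split_factor
    return {
        "raw": raw_close,
        "split_adj": split_adj,                           # price-only track
        "total_return_adj": split_adj * dividend_factor,  # dividends included
    }


# Hypothetical bar: close 100 the day before a 2.00 cash dividend goes ex.
div_factor = dividend_price_factor(prev_close=100.0, cash_dividend=2.0)
closes = labeled_closes(raw_close=100.0, split_factor=1.0, dividend_factor=div_factor)
```

Returning the values under explicit keys keeps a caller from receiving an "adjusted" number without knowing which track produced it.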
5) Point-in-time query contract
Given as_of_time = T, data retrieval must enforce:
- action records with ingested_at <= T
- latest active revision as of T
- only bars with ts <= query_end
- factor version derived from action snapshot known at T
This prevents hindsight leakage when vendors backfill or correct actions later.
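The action-selection part of this contract might look like the sketch below. It assumes an append-only ledger in which a correction or cancellation is a new record pointing at the one it replaces via supersedes_action_id; the record shape and the function name actions_as_of are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ActionRecord:
    action_id: str
    instrument_id: str
    ingested_at: datetime
    status: str  # "active" or "canceled" in this sketch
    supersedes_action_id: Optional[str] = None


def actions_as_of(records, as_of_time):
    """Return the actions a query at `as_of_time` is allowed to see."""
    # 1. Drop anything the system had not ingested yet.
    visible = [r for r in records if r.ingested_at <= as_of_time]
    # 2. Drop records replaced by a later revision that is itself visible.
    superseded = {r.supersedes_action_id for r in visible if r.supersedes_action_id}
    # 3. Keep only the latest revisions that are still active.
    return [
        r for r in visible
        if r.action_id not in superseded and r.status == "active"
    ]


# Hypothetical ledger: a split ingested on Jan 2, corrected by the vendor on Jan 10.
LEDGER = [
    ActionRecord("a1", "inst-001", datetime(2024, 1, 2, 8, 0), "active"),
    ActionRecord("a2", "inst-001", datetime(2024, 1, 10, 8, 0), "active", "a1"),
]
before_fix = actions_as_of(LEDGER, datetime(2024, 1, 5))   # sees only a1
after_fix = actions_as_of(LEDGER, datetime(2024, 1, 15))   # sees only a2
```

A backtest pinned to Jan 5 keeps seeing the uncorrected record, which is exactly what a live system would have acted on at that time.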
6) Live trading vs research consistency
Use one shared contract:
- Same adjustment engine/library for backtest and live analytics.
- Same identity mapping rules (instrument_id first, symbol second).
- Same event precedence when multiple actions coincide.
If live uses raw while research uses adjusted (or vice versa), your TCA and signal attribution diverge fast.
7) Edge cases that break systems
- Multiple actions same date (split + dividend + symbol change)
- Late corrections from vendor after market close
- Spin-off with partial historical reconstruction
- Cross-listing with venue-specific effective timing
- Fractional entitlements / odd-lot handling
- Delist/relist identity continuity confusion
Build explicit policy per case; “default guessing” causes silent model drift.
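One way to make the "multiple actions same date" policy explicit is a fixed precedence table applied before factors are built. The ordering below is purely illustrative, not a recommendation; the point is that the order is written down, shared by live and research code, and that unknown action types fail instead of being guessed. The names PRECEDENCE and ordered_actions are hypothetical.

```python
from datetime import date

# Illustrative precedence only; choose and document your own policy.
PRECEDENCE = {
    "symbol_change": 0,
    "split": 1,
    "spin_off": 2,
    "rights": 3,
    "cash_dividend": 4,
}


def ordered_actions(actions):
    """Sort actions deterministically: date, then policy rank, then id."""
    unknown = sorted({a["action_type"] for a in actions} - set(PRECEDENCE))
    if unknown:
        # Refuse to guess: an unmapped type needs an explicit policy first.
        raise ValueError(f"no precedence policy for: {unknown}")
    return sorted(
        actions,
        key=lambda a: (a["effective_date"], PRECEDENCE[a["action_type"]], a["action_id"]),
    )


# Hypothetical same-day cluster arriving in arbitrary vendor order.
SAME_DAY = [
    {"action_id": "a3", "action_type": "cash_dividend", "effective_date": date(2024, 3, 1)},
    {"action_id": "a1", "action_type": "split", "effective_date": date(2024, 3, 1)},
    {"action_id": "a2", "action_type": "symbol_change", "effective_date": date(2024, 3, 1)},
]
ORDERED_TYPES = [a["action_type"] for a in ordered_actions(SAME_DAY)]
```

The action_id tiebreaker keeps the sort stable across reruns even when two actions share both date and type.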
8) Quality controls (must-have monitors)
Structural checks
- Missing factor rows by date/instrument
- Non-positive adjusted prices
- Split-date discontinuity beyond tolerance
- Volume inversions after ratio events
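A minimal sketch of three of the structural checks above (missing factors, non-positive adjusted prices, split-date discontinuity). The function names, the dictionary inputs keyed by date, and the 25% tolerance are assumptions chosen for illustration; real tolerances depend on the instrument's volatility.

```python
from datetime import date


def structural_issues(raw_closes, price_factors):
    """Return (date, issue) pairs for bars that fail basic checks.

    raw_closes: {date: raw close}; price_factors: {date: price_factor}.
    """
    issues = []
    for d in sorted(raw_closes):
        if d not in price_factors:
            issues.append((d, "missing_factor"))
            continue
        if raw_closes[d] * price_factors[d] <= 0:
            issues.append((d, "non_positive_adjusted_price"))
    return issues


def split_discontinuity(adj_prev_close, adj_close, tolerance=0.25):
    """True if the adjusted return across a split date exceeds `tolerance`."""
    return abs(adj_close / adj_prev_close - 1.0) > tolerance


# Hypothetical data: one bar has no factor row, one has a bad (zero) factor.
RAW = {date(2024, 1, 2): 100.0, date(2024, 1, 3): 50.5, date(2024, 1, 4): 51.0}
FACTOR_ROWS = {date(2024, 1, 2): 0.5, date(2024, 1, 4): 0.0}
ISSUES = structural_issues(RAW, FACTOR_ROWS)
```

For a 2-for-1 split, a correctly adjusted pair such as 50.0 then 50.5 passes, while an unadjusted pair such as 100.0 then 50.5 shows a return near -50% and is flagged.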
Statistical checks
- Abnormal return spikes concentrated on action dates
- Cross-vendor factor disagreement rate
- Universe-level adjustment revision count per day
Operational checks
- Rebuild reproducibility hash mismatch alerts
- Action-ledger correction rate (7d rolling)
- Point-in-time replay diff count (should be zero for frozen snapshots)
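The reproducibility check can be sketched as a fingerprint over the inputs and outputs of a factor rebuild. This assumes the action snapshot and factors can be expressed as JSON-serializable values and that factors are serialized as strings to avoid float formatting drift; the function name build_fingerprint and the payload layout are hypothetical.

```python
import hashlib
import json


def build_fingerprint(action_snapshot, code_hash, factors):
    """Hash a rebuild so two runs can be compared byte-for-byte.

    sort_keys plus fixed separators gives a canonical JSON encoding,
    so dictionary ordering cannot change the digest.
    """
    payload = json.dumps(
        {"actions": action_snapshot, "code": code_hash, "factors": factors},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical frozen snapshot, replayed twice with different key order.
run_a = build_fingerprint(
    [{"action_id": "a1", "ratio": "2/1"}],
    "abc123",
    {"2024-01-02": "0.5", "2024-01-03": "1"},
)
run_b = build_fingerprint(
    [{"ratio": "2/1", "action_id": "a1"}],
    "abc123",
    {"2024-01-03": "1", "2024-01-02": "0.5"},
)
```

If run_a and run_b differ for the same frozen snapshot and code hash, the rebuild is not deterministic and the mismatch alert should fire.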
9) Rollout pattern
- Shadow mode: produce adjusted series in parallel, no trading impact.
- Diff dashboards: compare legacy vs new for return, vol, factor exposures, TCA baselines.
- Guardrail thresholds: pause promotion if drift exceeds limits.
- Canary universe: enable for subset of symbols/strategies.
- Full cutover with reversible toggle + snapshot pinning.
10) Practical implementation checklist
- Stable instrument identity table exists
- Corporate actions ledger is bitemporal and versioned
- Raw bars immutable and checksummed
- Factors rebuilt deterministically from action snapshot
- Point-in-time API supports as_of_time
- Series type is explicit (raw, split_adj, total_return_adj)
- Replay tests cover late correction scenarios
- Daily monitors + on-call runbook documented
Bottom line
Corporate actions handling is not a data-engineering side quest; it is core trading infrastructure.
If you enforce identity continuity + bitemporal actions + deterministic factors + point-in-time queries, you can trust that a backtest date and a live date mean the same market reality.