Model Validation

Statistical evidence that our edge is real, not luck. Bootstrap confidence intervals, calibration analysis, and risk metrics across 3 sports.

Generated 2026-03-24 | Model v4

Snapshot notice: This validation export is 19 days old. Use live results for the latest production performance.
KEY RESULTS

At a Glance

Backtested Trades: 1,000
Avg c/Trade (cents per trade): +12.9c
Win Rate: 73.0%
Statistically Significant: 2/3 sports
Profitable Sports: 3/3

All results reflect the current exported validation snapshot for the deployed strategy configuration, including any pregame filters baked into the backtest.

This snapshot covers 3 sports (NBA, NCAAMB, NHL) and 1,000 trades from model v4. For current production outcomes, see live results. For access to the live edge feed, see pricing.

NBA

Performance Summary

Trades: 169
Win Rate: 70.4%
Avg c/Trade: +12.6c
Total P&L: $21.22
Backtest Type: poly_realistic

Statistical Significance

Bootstrap Mean: +12.6c
95% CI: [+5.4c, +19.6c]
99% CI: [+3.1c, +21.5c]
p-value: 0.0007
Interpretation: Highly significant

Risk Metrics

Sharpe Ratio: 4.21
Max Drawdown: -624.0c
Profit Factor: 1.75
Best Streak: 33
Worst Streak: 12
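
The export does not define max drawdown or the streak counts, so the following is a minimal sketch under common conventions: max drawdown as the largest peak-to-trough fall of the cumulative P&L curve, and streaks as the longest runs of consecutive winning or losing trades. The function names and the sample trade list are hypothetical.

```python
def max_drawdown(pnl_cents):
    """Largest peak-to-trough drop of cumulative P&L (zero or negative)."""
    equity = 0.0
    peak = 0.0   # assumption: the curve starts at zero before the first trade
    worst = 0.0
    for pnl in pnl_cents:
        equity += pnl
        peak = max(peak, equity)
        worst = min(worst, equity - peak)
    return worst


def longest_streaks(pnl_cents):
    """Return (longest winning run, longest losing run), counted in trades."""
    best = worst = run_w = run_l = 0
    for pnl in pnl_cents:
        run_w = run_w + 1 if pnl > 0 else 0
        run_l = run_l + 1 if pnl < 0 else 0
        best = max(best, run_w)
        worst = max(worst, run_l)
    return best, worst


# Hypothetical per-trade P&L in cents, not taken from the backtest.
trades = [27.0, 46.0, -58.0, -53.0, 42.0, 47.0, 43.0]
drawdown = max_drawdown(trades)
streaks = longest_streaks(trades)
```

With this sample the curve peaks at +73c after two trades and then falls to -38c, so the drawdown is the 111c gap between those two points.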

Equity Curve (Cumulative c)

Calibration (Predicted vs Actual). Brier score: 0.2148

Performance by Edge Size

Edge | Trades | Win Rate | Avg c/Trade
5-10c | 43 | 74.4% | +6.7c
10-15c | 42 | 69.0% | +5.0c
15-20c | 25 | 72.0% | +12.9c
20+c | 59 | 67.8% | +22.1c

Pregame Filter Impact (all metrics above use the filtered set)

Without Filter: 299 trades | 64.2% WR | +6.4c
With Pregame ≥ 55c Filter: 169 trades | 70.4% WR | +12.6c

Example Trades

Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L
CHI @ HOU | home | 57-61 | 73.0c | 87.1c | +14.1c | WIN | +27.0c
PHX @ DET | home | 62-66 | 54.0c | 68.1c | +14.1c | WIN | +46.0c
IND @ PHI | home | 50-55 | 58.0c | 72.1c | +14.1c | WIN | +42.0c
WSH vs LAC | away | 72-67 | 53.0c | 67.2c | +14.2c | WIN | +47.0c
CHI @ DET | home | 57-61 | 57.0c | 72.1c | +15.1c | WIN | +43.0c

Performance by Month

Month | Trades | Win Rate | Avg c/Trade | Total P&L
2026-01 | 161 | 72.7% | +14.7c | +2368c

Model Architecture

Architecture: Split-Phase XGBoost (early-game + clutch-time models)
Features: 14 engineered features
Calibration: Isotonic regression
Training Data: 5,285 games

Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_court, pregame_wp, score_diff_x_tf, score_diff_sq, total_score, score_diff_x_elo, pace_diff, ortg_diff, drtg_diff
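
Several of the feature names read like simple transforms of the raw game state. The sketch below shows one plausible way to derive them. Every formula here is an assumption inferred from the names alone (for example, that score_diff is home minus away and that score_diff_x_tf is a plain product); the export does not give the definitions. The input keys and the function name are hypothetical, and the team-rating features (pace_diff, ortg_diff, drtg_diff) are left out because their inputs are not described.

```python
def build_derived_features(state):
    """Derive interaction features from one game-state snapshot.

    Assumed sign convention: differences are home minus away.
    """
    score_diff = state["home_score"] - state["away_score"]
    elo_diff = state["home_elo"] - state["away_elo"]
    tf = state["time_fraction"]  # assumed: 0.0 at tip-off, 1.0 at the final whistle
    return {
        "score_diff": score_diff,
        "time_fraction": tf,
        "elo_diff": elo_diff,
        "score_diff_x_tf": score_diff * tf,        # lead weighted by elapsed time
        "score_diff_sq": score_diff ** 2,          # non-linear effect of big leads
        "total_score": state["home_score"] + state["away_score"],
        "score_diff_x_elo": score_diff * elo_diff, # lead interacted with team strength
    }


# Hypothetical snapshot at halftime.
snapshot = {
    "home_score": 61, "away_score": 57,
    "home_elo": 1550.0, "away_elo": 1500.0,
    "time_fraction": 0.5,
}
features = build_derived_features(snapshot)
```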

NCAAMB

Performance Summary

Trades: 781
Win Rate: 73.5%
Avg c/Trade: +13.5c
Total P&L: $105.45
Backtest Type: poly_realistic

Statistical Significance

Bootstrap Mean: +13.5c
95% CI: [+10.3c, +16.6c]
99% CI: [+9.3c, +17.6c]
p-value: < 0.0001 (no bootstrap resample had a mean at or below zero)
Interpretation: Highly significant

Risk Metrics

Sharpe Ratio: 4.74
Max Drawdown: -1122.0c
Profit Factor: 1.88
Best Streak: 66
Worst Streak: 9

Equity Curve (Cumulative c)

Calibration (Predicted vs Actual). Brier score: 0.1903

Performance by Edge Size

Edge | Trades | Win Rate | Avg c/Trade
5-10c | 204 | 72.5% | +3.9c
10-15c | 240 | 71.7% | +4.3c
15-20c | 97 | 71.1% | +9.7c
20+c | 240 | 77.1% | +32.4c

Example Trades

Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L
DAY @ LAS | home | 51-41 | 61.0c | 72.3c | +11.3c | WIN | +37.0c
UGA @ TEX | home | 53-49 | 68.0c | 79.3c | +11.3c | WIN | +30.0c
GMU @ URI | home | 53-48 | 61.0c | 72.3c | +11.3c | WIN | +37.0c
WEB @ MTST | home | 45-42 | 68.0c | 79.3c | +11.3c | WIN | +30.0c
HC @ COLG | home | 37-40 | 61.0c | 72.3c | +11.3c | WIN | +37.0c

Performance by Month

Month | Trades | Win Rate | Avg c/Trade | Total P&L
2026-01 | 781 | 73.5% | +13.5c | +10545c

Model Architecture

Architecture: Split-Phase XGBoost (early-game + clutch-time models)
Features: 14 engineered features
Calibration: Isotonic regression
Training Data: 12,285 games

Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_court, pregame_wp, score_diff_x_tf, score_diff_sq, total_score, score_diff_x_elo, pace_diff, ortg_diff, drtg_diff
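
The architecture lists isotonic regression as the calibration step. As an illustration of what that step does, here is a minimal pool-adjacent-violators sketch that maps raw model scores to a non-decreasing set of calibrated probabilities. It is a simplified step-function version written for this page, not the production code; library implementations such as scikit-learn's typically interpolate between knots instead. The function names and sample data are hypothetical.

```python
def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators fit.

    Returns a list of (upper_score_bound, calibrated_prob) knots whose
    probabilities are non-decreasing in the score.
    """
    blocks = []  # each block: [sum_of_outcomes, count, highest_score_in_block]
    for score, y in sorted(zip(scores, outcomes)):
        blocks.append([float(y), 1, score])
        # Merge backwards while a block's mean exceeds the mean of the next one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, high = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = high
    return [(high, total / count) for total, count, high in blocks]


def calibrate(knots, score):
    """Look up the calibrated probability for a raw score (step function)."""
    for high, prob in knots:
        if score <= high:
            return prob
    return knots[-1][1]  # scores above the last knot reuse the top block


# Hypothetical raw scores and outcomes (1 = bet side won).
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
results = [0, 1, 0, 1, 1, 1]
knots = fit_isotonic(raw_scores, results)
```

In this sample the win at 0.2 followed by the loss at 0.3 violates monotonicity, so the two are pooled into one block with a calibrated probability of 0.5.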

NHL

Performance Summary

Trades: 50
Win Rate: 74.0%
Avg c/Trade: +5.2c
Total P&L: $2.60
Backtest Type: poly_realistic

Statistical Significance

Bootstrap Mean: +5.2c
95% CI: [-7.5c, +17.0c]
99% CI: [-11.8c, +20.9c]
p-value: 0.2000
Interpretation: Not significant

Risk Metrics

Sharpe Ratio: 1.87
Max Drawdown: -320.0c
Profit Factor: 1.30
Best Streak: 12
Worst Streak: 2

Equity Curve (Cumulative c)

Calibration (Predicted vs Actual). Brier score: 0.2053

Performance by Edge Size

Edge | Trades | Win Rate | Avg c/Trade
5-10c | 11 | 81.8% | +8.5c
10-15c | 22 | 72.7% | +1.0c
15-20c | 11 | 81.8% | +16.9c
20+c | 6 | 50.0% | -7.0c

Pregame Filter Impact (all metrics above use the filtered set)

Without Filter: 147 trades | 64.6% WR | -1.2c
With Pregame ≥ 55c Filter: 50 trades | 74.0% WR | +5.2c

Example Trades

Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L
NYR @ WSH | home | 2-1 | 65.0c | 77.7c | +12.7c | WIN | +35.0c
ANA vs TB | away | 0-1 | 75.0c | 88.0c | +13.0c | WIN | +25.0c
ANA @ EDM | home | 3-2 | 78.0c | 91.0c | +13.0c | WIN | +22.0c
CBJ @ VGK | home | 3-2 | 76.0c | 89.1c | +13.1c | WIN | +24.0c
NYR @ LA | home | 3-2 | 65.0c | 79.3c | +14.3c | WIN | +35.0c

Performance by Month

Month | Trades | Win Rate | Avg c/Trade | Total P&L
2026-01 | 42 | 73.8% | +5.3c | +221c

Model Architecture

Architecture: XGBoost + Isotonic calibration
Features: 12 engineered features
Calibration: Isotonic regression
Training Data: 4,225 games

Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_ice, pregame_wp, score_diff_x_tf, score_diff_sq, pace_diff, ortg_diff, drtg_diff

Benchmark Comparisons

How our model performs vs naive strategies. A model that can't beat simple baselines isn't worth using.

NBA

Strategy | Win Rate | c/Trade
Our Model | 70.4% | +12.6c
Random (50/50) | 50.0% | -2.0c
Market-Efficient | 57.9% | +0.0c

NCAAMB

Strategy | Win Rate | c/Trade
Our Model | 73.5% | +13.5c
Random (50/50) | 50.0% | -2.0c
Market-Efficient | 58.5% | -0.0c

NHL

Strategy | Win Rate | c/Trade
Our Model | 74.0% | +5.2c
Random (50/50) | 50.0% | -2.0c
Market-Efficient | 68.8% | +0.0c
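
To see why a market-efficient baseline sits at roughly zero cents per trade, consider the expected value of buying a contract that pays 100c on a win. The sketch below is a simplified illustration, not the benchmark code: it assumes a binary contract, and the cost_cents parameter and the 2c figure in the usage lines are hypothetical placeholders rather than values taken from this export.

```python
def expected_value_cents(win_prob, price_cents, cost_cents=0.0):
    """Expected P&L of buying one contract at price_cents.

    A win earns (100 - price), a loss forfeits the price paid.
    Algebraically this equals 100 * win_prob - price_cents - cost_cents.
    """
    win_leg = win_prob * (100.0 - price_cents)
    loss_leg = (1.0 - win_prob) * price_cents
    return win_leg - loss_leg - cost_cents


# If the true win probability equals the market price, the edge is zero.
efficient = expected_value_cents(win_prob=0.579, price_cents=57.9)

# A coin-flip bettor paying a hypothetical 2c cost at a 50c price loses that cost.
coin_flip = expected_value_cents(win_prob=0.50, price_cents=50.0, cost_cents=2.0)

# A bettor who wins more often than the price implies has a positive edge.
informed = expected_value_cents(win_prob=0.70, price_cents=58.0)
```

The last line shows the general pattern: expected cents per trade is the gap between the win probability (in cents) and the price paid, less any costs.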

Academic Foundation

Our approach is grounded in peer-reviewed research on sports prediction markets and probabilistic forecasting.

Kaunitz, Zhong, Kreiner (2017). "Beating the bookies with their own numbers - and how the online sports betting market is rigged." CLV validation: demonstrates that a positive closing line value strategy yields positive long-term returns.

Brier, Glenn W. (1950). "Verification of forecasts expressed in terms of probability." Foundation for calibration analysis: the Brier score measures probabilistic prediction accuracy.

Lock, Dennis; Nettleton, Dan (2014). "Using random forests to estimate win probability before each play of an NFL game." In-game WP modeling methodology: random forests on game state features for real-time prediction.

Levitt, Steven D. (2004). "Why are gambling markets organised so differently from financial markets?" Market efficiency analysis: sports markets exhibit inefficiencies exploitable by informed bettors.

Shin, Hyun Song (1991). "Optimal betting odds against insider traders." Theoretical foundation for bookmaker pricing models and adverse selection in betting markets.

Stern, Hal (1994). "A Brownian motion model for the progress of sports scores." Score-diff as Brownian motion: theoretical underpinning for WP models based on score differential and time.

Methodology & Anti-Overfitting Safeguards

Training / test split — Models are trained on historical ESPN game-state snapshots (multiple seasons), then tested on held-out recent-season data using real Polymarket prices the model never saw during training. No future data leaks into features.
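
A season-based holdout like the one described can be sketched as follows. This is an illustrative assumption about the mechanics, not the actual pipeline: the record keys (season, game_id) and the function name are hypothetical.

```python
def split_by_season(snapshots, test_season):
    """Train on seasons before test_season, test on test_season only."""
    train = [s for s in snapshots if s["season"] < test_season]
    test = [s for s in snapshots if s["season"] == test_season]

    # Leakage guard: a game must never contribute rows to both sets.
    train_games = {s["game_id"] for s in train}
    leaked = [s["game_id"] for s in test if s["game_id"] in train_games]
    if leaked:
        raise ValueError(f"games present in both train and test: {leaked}")
    return train, test


# Hypothetical game-state snapshots.
snapshots = [
    {"season": 2023, "game_id": "g1", "score_diff": 3},
    {"season": 2024, "game_id": "g2", "score_diff": -5},
    {"season": 2025, "game_id": "g3", "score_diff": 1},
    {"season": 2025, "game_id": "g3", "score_diff": 4},
]
train_rows, test_rows = split_by_season(snapshots, test_season=2025)
```

Splitting on season rather than shuffling rows keeps every snapshot of a given game on one side of the split, which is what prevents in-game states from leaking across it.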

Realistic backtesting — Poly-price backtests use actual market snapshots and modeled execution costs rather than idealized fills. Entry prices reflect live market conditions at the time of the backtest snapshot.
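
Per-trade P&L on a contract that settles at 100c or 0c can be sketched as below. The example trades in this export fit that pattern with no separate cost term (a 73.0c entry that wins shows +27.0c), so execution costs may already be reflected in the entry price; the cost_cents parameter is a hypothetical knob for modeling them explicitly.

```python
def trade_pnl_cents(entry_price_cents, won, cost_cents=0.0):
    """Realized P&L for one contract bought at entry_price_cents."""
    payout = 100.0 if won else 0.0
    return payout - entry_price_cents - cost_cents


# Entry prices from the example trades, all of which won.
winning_entries = [73.0, 54.0, 58.0, 53.0, 57.0]
pnls = [trade_pnl_cents(price, won=True) for price in winning_entries]

# A losing trade forfeits the full entry price.
loss = trade_pnl_cents(58.0, won=False)
```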

Bootstrap confidence intervals: 10,000 resamples with replacement. The p-value is the fraction of bootstrap means ≤ 0, testing H0: "the model has no edge." A p < 0.05 means fewer than 5% of the resampled means were at or below zero, so a non-positive true edge is hard to reconcile with the data.
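
The bootstrap procedure can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the validation code itself: it assumes percentile intervals (the export does not say which interval method is used), and the function name, seed, and sample data are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_mean_ci(pnl_cents, n_resamples=10_000, seed=0):
    """Bootstrap the mean per-trade P&L (cents) with percentile intervals."""
    rng = random.Random(seed)
    n = len(pnl_cents)
    # Each resample draws n trades with replacement and records their mean.
    means = sorted(
        fmean(rng.choices(pnl_cents, k=n)) for _ in range(n_resamples)
    )

    def percentile(q):
        # Nearest-rank lookup on the sorted bootstrap means.
        return means[round(q * (n_resamples - 1))]

    return {
        "mean": fmean(means),
        "ci95": (percentile(0.025), percentile(0.975)),
        "ci99": (percentile(0.005), percentile(0.995)),
        # One-sided p-value: share of resampled means at or below zero.
        "p_value": sum(m <= 0 for m in means) / n_resamples,
    }


# Hypothetical sample: 70 wins of +30c and 30 losses of -40c (mean +9c).
sample = [30.0] * 70 + [-40.0] * 30
result = bootstrap_mean_ci(sample, n_resamples=2_000, seed=42)
```

Because the p-value is a count divided by the number of resamples, its smallest non-zero value is 1 / n_resamples; a reported 0.0000 at 10,000 resamples is best read as "below 0.0001".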

Calibration: Predictions are bucketed into 5%-wide bins (min 5 trades each). A well-calibrated model's dots land on the diagonal; points below the line indicate overconfidence, and points above it indicate underconfidence.
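
The binning and the Brier score can be sketched as follows. The bin width and minimum count come from the description above; the function names, the small epsilon used to stabilize bin edges, and the sample data are assumptions made for illustration.

```python
def calibration_bins(probs, outcomes, bin_width=0.05, min_trades=5):
    """Return (mean predicted prob, actual win rate, count) per populated bin."""
    n_bins = round(1 / bin_width)
    buckets = {}
    for p, won in zip(probs, outcomes):
        # The epsilon keeps values such as 0.70 from slipping into the lower
        # bin through floating-point error; p == 1.0 goes to the top bin.
        idx = min(int(p / bin_width + 1e-9), n_bins - 1)
        buckets.setdefault(idx, []).append((p, won))

    rows = []
    for idx in sorted(buckets):
        items = buckets[idx]
        if len(items) < min_trades:
            continue  # too few trades for a stable win-rate estimate
        mean_pred = sum(p for p, _ in items) / len(items)
        win_rate = sum(w for _, w in items) / len(items)
        rows.append((mean_pred, win_rate, len(items)))
    return rows


def brier_score(probs, outcomes):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Hypothetical predictions: ten trades at 0.72 (7 wins), three at 0.88 (3 wins).
probs = [0.72] * 10 + [0.88] * 3
outcomes = [1] * 7 + [0] * 3 + [1] * 3
rows = calibration_bins(probs, outcomes)
brier = brier_score(probs[:10], outcomes[:10])
```

Here the 0.88 bin is dropped for having fewer than 5 trades, and the remaining point (predicted 0.72, actual 0.70) sits just below the diagonal, a mild case of overconfidence.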

Sharpe ratio: Mean per-trade P&L divided by its standard deviation, annualized with sqrt(252) scaling. Values above 1.0 indicate strong risk-adjusted returns; above 3.0 is exceptional.
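
A per-trade Sharpe ratio with sqrt(252) scaling can be sketched as below. The use of the sample standard deviation and a zero risk-free rate are assumptions, since the export does not specify either. As a rough consistency check under this formula, the NBA figures (mean +12.6c, Sharpe 4.21) would imply a per-trade standard deviation of about 47.5c.

```python
import math
from statistics import fmean, stdev


def annualized_sharpe(pnl_cents, periods=252):
    """Mean / sample std dev of per-trade P&L, scaled by sqrt(periods)."""
    sigma = stdev(pnl_cents)  # assumption: sample (n - 1) standard deviation
    if sigma == 0:
        return float("nan")   # undefined when every trade has the same P&L
    return fmean(pnl_cents) / sigma * math.sqrt(periods)


# Hypothetical sample: 70 wins of +30c and 30 losses of -40c.
sample = [30.0] * 70 + [-40.0] * 30
sharpe = annualized_sharpe(sample)
```

Applying sqrt(252) to per-trade P&L effectively treats each trade as one trading day, so the figure is most comparable across strategies that trade at a similar frequency.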

Profit factor: Gross wins / gross losses. Any value above 1.0 means wins outweigh losses. We treat above 1.25 as solidly profitable, above 1.5 as strong, and above 2.0 as excellent.
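
The ratio is simple enough to state directly in code. The sketch below is illustrative; the function name and the sample data are hypothetical.

```python
def profit_factor(pnl_cents):
    """Sum of winning P&L divided by the absolute sum of losing P&L."""
    gross_wins = sum(x for x in pnl_cents if x > 0)
    gross_losses = -sum(x for x in pnl_cents if x < 0)
    if gross_losses == 0:
        return float("inf")  # no losing trades in the sample
    return gross_wins / gross_losses


# Hypothetical sample: 70 wins of +30c (2100c) and 30 losses of -40c (1200c).
sample = [30.0] * 70 + [-40.0] * 30
pf = profit_factor(sample)
```

In this sample the strategy wins 2100c and loses 1200c in total, giving a profit factor of 1.75.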

Fee assumptions — Results are shown net of the execution-cost assumptions used when this validation export was generated. Live exchange fees and market microstructure can change over time.

Pregame filter: For NBA and NHL, the deployed strategy only takes a trade when the pregame market priced the model's bet side at ≥55c. This filters out trades where the model disagrees with market consensus, reducing adverse selection.
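
The filter rule can be sketched as a single predicate. The 55c threshold comes from the description above; the record keys, the function name, and the assumption that the away side's pregame price is 100c minus the home price (a binary market with no spread) are hypothetical.

```python
def passes_pregame_filter(trade, min_price_cents=55.0):
    """Keep a trade only if the pregame market favored the model's bet side."""
    home_price = trade["pregame_home_price_cents"]
    # Assumption: binary market, so the away price is the complement.
    side_price = home_price if trade["side"] == "home" else 100.0 - home_price
    return side_price >= min_price_cents


# Hypothetical candidate trades.
candidates = [
    {"side": "home", "pregame_home_price_cents": 62.0},  # favorite, kept
    {"side": "home", "pregame_home_price_cents": 48.0},  # underdog, dropped
    {"side": "away", "pregame_home_price_cents": 40.0},  # away priced at 60c, kept
    {"side": "away", "pregame_home_price_cents": 52.0},  # away priced at 48c, dropped
]
kept = [t for t in candidates if passes_pregame_filter(t)]
```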
