Statistical proof that our edge is real — not luck. Bootstrap confidence intervals, calibration analysis, and risk metrics across 7 sports.
Generated 2026-04-13 | Model v4
1562 resolved trades · 48.5% win rate · 1543 with measured CLV (98.8% coverage)
Backtest snapshot below is the formal validation slice used for bootstrap CIs and significance tests. The live ledger above is what's actually running in production right now.
All results reflect the current exported validation snapshot for the deployed strategy configuration, including any pregame filters baked into the backtest.
This snapshot covers 7 sports (ATP, CS2, LOL, MLB, NBA, NHL, SOCCER) and 361 trades from model v4. For current production outcomes, see live results. For access to the live edge feed, see pricing.
Backtests show what the model would have done. The CLV gap shows whether it's beating the market now. Across 950 trades in our public gap-analysis subset, CLV-positive entries win 89.9%; CLV-negative entries win 11.2%. Two-proportion Z = 24.27, p ≈ 10^-130. Reproducible from the public dataset.
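As a rough check on the two-proportion statistic quoted above, the pooled z-test can be sketched as follows. The page does not publish how the 950 trades split between CLV-positive and CLV-negative entries, so the group sizes and win counts below are hypothetical values chosen only to match the quoted win rates; the function names are also illustrative.

```python
import math

def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """Pooled two-proportion z statistic for H0: both groups share one win rate."""
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def log10_upper_tail_p(z):
    """log10 of the one-sided normal tail probability P(Z >= z)."""
    p = 0.5 * math.erfc(z / math.sqrt(2))
    if p > 0:
        return math.log10(p)
    # erfc underflowed to zero: fall back to the asymptotic tail phi(z) / z.
    return (-z * z / 2 - math.log(z * math.sqrt(2 * math.pi))) / math.log(10)

# HYPOTHETICAL split: 475 CLV-positive trades (427 wins, about 89.9%)
# and 475 CLV-negative trades (53 wins, about 11.2%).
z = two_proportion_z(427, 475, 53, 475)
print(f"z = {z:.2f}, log10(p) = {log10_upper_tail_p(z):.1f}")
```

With a different split between the two groups the statistic would change, so the exact figures should be recomputed from the public dataset.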
| Metric | Value |
|---|---|
| Trades | 30 |
| Win Rate | 60.0% |
| Avg c/Trade | +11.3c |
| Total P&L | $3.38 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +11.3c |
| 95% CI | [-5.9c, +27.5c] |
| 99% CI | [-11.2c, +32.5c] |
| p-value | 0.0941 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 10 | 90.0% | +34.4c |
| 10-15c | 9 | 22.2% | -19.4c |
| 15-20c | 3 | 66.7% | +14.7c |
| 20+c | 8 | 62.5% | +15.6c |
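The edge-bucket breakdown above can be reproduced from a per-trade list along these lines. This is a minimal sketch: the tuple layout, the half-open bucket boundaries, and the sample trades are assumptions for illustration. The win flag is carried separately from P&L because the trade tables on this page include rows marked WIN with +0.0c P&L.

```python
from collections import defaultdict

# Bucket bounds in cents, matching the table rows. Treating each bucket as
# half-open [lo, hi) is an assumption; the page does not state the convention.
BUCKETS = [
    (5.0, 10.0, "5-10c"),
    (10.0, 15.0, "10-15c"),
    (15.0, 20.0, "15-20c"),
    (20.0, float("inf"), "20+c"),
]

def bucket_label(edge_c):
    """Return the bucket label for an edge in cents, or None if below 5c."""
    for lo, hi, label in BUCKETS:
        if lo <= edge_c < hi:
            return label
    return None

def summarize_by_edge(trades):
    """trades: iterable of (edge_c, won, pnl_c).
    Returns {label: (n_trades, win_rate_pct, avg_pnl_c)}."""
    groups = defaultdict(list)
    for edge_c, won, pnl_c in trades:
        label = bucket_label(edge_c)
        if label is not None:
            groups[label].append((won, pnl_c))
    summary = {}
    for label, rows in groups.items():
        n = len(rows)
        wins = sum(1 for won, _ in rows if won)
        avg = sum(pnl for _, pnl in rows) / n
        summary[label] = (n, round(100 * wins / n, 1), round(avg, 1))
    return summary

# Illustrative trades only, not the validation data.
sample = [(8.9, True, 38.0), (9.1, True, 50.0), (12.0, False, -40.0),
          (18.1, True, 49.0), (22.5, True, 0.0)]
for label, stats in summarize_by_edge(sample).items():
    print(label, stats)
```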
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +8.9c | WIN | +38.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +9.1c | WIN | +50.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +9.1c | WIN | +40.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +9.7c | WIN | +63.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +18.1c | WIN | +49.0c |
| Metric | Value |
|---|---|
| Trades | 82 |
| Win Rate | 42.7% |
| Avg c/Trade | -6.2c |
| Total P&L | $-5.07 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | -6.2c |
| 95% CI | [-16.0c, +3.7c] |
| 99% CI | [-18.9c, +7.1c] |
| p-value | 0.8893 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 16 | 50.0% | -11.8c |
| 10-15c | 15 | 46.7% | -12.5c |
| 15-20c | 22 | 50.0% | -4.0c |
| 20+c | 29 | 31.0% | -1.5c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +13.5c | WIN | +36.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +14.0c | WIN | +30.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +15.3c | WIN | +0.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +15.5c | WIN | +47.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +15.6c | WIN | +60.0c |
| Metric | Value |
|---|---|
| Trades | 38 |
| Win Rate | 52.6% |
| Avg c/Trade | +7.6c |
| Total P&L | $2.90 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +7.6c |
| 95% CI | [-6.0c, +21.1c] |
| 99% CI | [-10.7c, +25.1c] |
| p-value | 0.1368 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 7 | 57.1% | +3.4c |
| 10-15c | 9 | 55.6% | +14.6c |
| 15-20c | 8 | 75.0% | +24.4c |
| 20+c | 14 | 35.7% | -4.3c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +13.6c | WIN | +28.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +13.8c | WIN | +44.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +13.8c | WIN | +53.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +15.3c | WIN | +40.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +15.9c | WIN | +32.0c |
| Metric | Value |
|---|---|
| Trades | 105 |
| Win Rate | 68.6% |
| Avg c/Trade | +6.9c |
| Total P&L | $7.23 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +6.9c |
| 95% CI | [-2.6c, +15.6c] |
| 99% CI | [-5.3c, +18.5c] |
| p-value | 0.0724 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 54 | 72.2% | +7.0c |
| 10-15c | 23 | 69.6% | +6.3c |
| 15-20c | 5 | 20.0% | -41.4c |
| 20+c | 23 | 69.6% | +17.7c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +11.5c | WIN | +25.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +12.1c | WIN | +42.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +12.1c | WIN | +47.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +12.4c | WIN | +35.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +12.8c | WIN | +55.0c |
| Metric | Value |
|---|---|
| Trades | 15 |
| Win Rate | 46.7% |
| Avg c/Trade | -14.7c |
| Total P&L | $-2.21 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | -14.7c |
| 95% CI | [-37.9c, +8.4c] |
| 99% CI | [-43.9c, +14.3c] |
| p-value | 0.8970 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 5 | 60.0% | -7.2c |
| 10-15c | 4 | 25.0% | -40.0c |
| 15-20c | 4 | 50.0% | -7.2c |
| 20+c | 2 | 50.0% | +2.0c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +8.2c | WIN | +45.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +10.1c | WIN | +22.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +17.2c | WIN | +31.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +17.4c | WIN | +33.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +24.3c | WIN | +39.0c |
| Property | Value |
|---|---|
| Architecture | Split-Phase XGBoost (early-game + clutch-time models) |
| Features | 14 engineered features |
| Calibration | Isotonic regression |
| Training Data | 5,285 games |
Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_court, pregame_wp, score_diff_x_tf, score_diff_sq, total_score, score_diff_x_elo, pace_diff, ortg_diff, drtg_diff
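Several names in the feature list look like derived terms built from the raw inputs. The page does not give their formulas, so the sketch below is a guess from the names alone: it assumes `elo_diff` is home minus away, that the `_x_` features are plain products, and that `score_diff_sq` is the square of the score difference.

```python
def derived_features(score_diff, time_fraction, home_elo, away_elo):
    """Build the interaction terms suggested by the feature names.

    ASSUMED formulas (not confirmed by the source):
      elo_diff         = home_elo - away_elo
      score_diff_x_tf  = score_diff * time_fraction
      score_diff_sq    = score_diff ** 2
      score_diff_x_elo = score_diff * elo_diff
    """
    elo_diff = home_elo - away_elo
    return {
        "elo_diff": elo_diff,
        "score_diff_x_tf": score_diff * time_fraction,
        "score_diff_sq": score_diff ** 2,
        "score_diff_x_elo": score_diff * elo_diff,
    }

# Home team up 6 at the halfway point, with a 50-point Elo advantage.
print(derived_features(score_diff=6, time_fraction=0.5, home_elo=1600, away_elo=1550))
```

Interaction terms like these let the model weigh a lead differently late in a game than early on. Tree ensembles can learn such interactions on their own, so supplying them explicitly is a design choice rather than a requirement.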
| Metric | Value |
|---|---|
| Trades | 66 |
| Win Rate | 66.7% |
| Avg c/Trade | +5.8c |
| Total P&L | $3.84 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +5.8c |
| 95% CI | [-5.3c, +16.5c] |
| 99% CI | [-8.6c, +19.6c] |
| p-value | 0.1529 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 15 | 66.7% | +4.0c |
| 10-15c | 20 | 65.0% | +0.9c |
| 15-20c | 17 | 82.4% | +20.5c |
| 20+c | 14 | 50.0% | -3.1c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +12.5c | WIN | +36.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +13.5c | WIN | +39.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +13.8c | WIN | +28.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +14.6c | WIN | +25.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +15.5c | WIN | +0.0c |
| Property | Value |
|---|---|
| Architecture | XGBoost + Isotonic calibration |
| Features | 12 engineered features |
| Calibration | Isotonic regression |
| Training Data | 4,225 games |
Features: score_diff, time_fraction, home_elo, away_elo, elo_diff, home_ice, pregame_wp, score_diff_x_tf, score_diff_sq, pace_diff, ortg_diff, drtg_diff
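Both model cards list isotonic regression as the calibration step. To show the idea, here is a small pure-Python pool-adjacent-violators sketch that maps raw scores to non-decreasing calibrated probabilities. It is a simplified step-function version written for illustration, not the production code, and the function names are made up here. Library implementations typically interpolate between points and handle tied scores more carefully.

```python
def isotonic_fit(scores, outcomes):
    """Pool-adjacent-violators fit.

    Returns a list of (upper_score, calibrated_value) blocks whose values
    are non-decreasing in score.
    """
    blocks = []  # each block: [sum_of_outcomes, count, max_score_in_block]
    for score, outcome in sorted(zip(scores, outcomes)):
        blocks.append([float(outcome), 1, score])
        # Merge backwards while the previous block's mean exceeds this one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]

def isotonic_predict(model, score):
    """Step lookup: value of the first block whose upper score covers `score`."""
    for upper, value in model:
        if score <= upper:
            return value
    return model[-1][1]  # above every fitted score: use the top block

# Illustrative raw scores and 0/1 outcomes, not real model output.
model = isotonic_fit([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 1, 0, 0, 1, 1])
print(model)
print(isotonic_predict(model, 0.35))
```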
| Metric | Value |
|---|---|
| Trades | 11 |
| Win Rate | 45.5% |
| Avg c/Trade | +0.6c |
| Total P&L | $0.07 |
| Backtest Type | poly_realistic |
| Bootstrap Mean | +0.6c |
| 95% CI | [-25.6c, +28.6c] |
| 99% CI | [-32.7c, +38.0c] |
| p-value | 0.5028 |
| Interpretation | Not significant |
| Edge | Trades | Win Rate | Avg c/Trade |
|---|---|---|---|
| 5-10c | 4 | 50.0% | +6.8c |
| 10-15c | 3 | 33.3% | -2.3c |
| 15-20c | 2 | 50.0% | +15.0c |
| 20+c | 2 | 50.0% | -21.5c |
| Matchup | Side | Score | Market | Model Fair | Edge | Result | P&L |
|---|---|---|---|---|---|---|---|
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +8.2c | WIN | +58.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +8.5c | WIN | +62.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +12.8c | WIN | +66.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +15.9c | WIN | +58.0c |
| ? vs ? | BUY | ?-? | 0.0c | 0.0c | +22.5c | WIN | +0.0c |
How our model performs vs naive strategies. A model that can't beat simple baselines isn't worth using.
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 60.0% | +11.3c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 42.7% | -6.2c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 52.6% | +7.6c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 68.6% | +6.9c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 46.7% | -14.7c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 100.0% | +27.2c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 75.0% | +20.0c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 66.7% | +5.8c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 45.5% | +0.6c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
| Strategy | Win Rate | c/Trade |
|---|---|---|
| Our Model | 100.0% | +35.0c |
| Random (50/50) | 50.0% | -2.0c |
| Market-Efficient | 0.0% | +0.0c |
Our approach is grounded in peer-reviewed research on sports prediction markets and probabilistic forecasting.
Training / test split — Models are trained on historical ESPN game-state snapshots (multiple seasons), then tested on held-out recent-season data using real Polymarket prices the model never saw during training. No future data leaks into features.
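A season-based holdout of the kind described can be sketched like this. The record schema (a dict with a `season` key) is a hypothetical stand-in for the real snapshot format. The point is that the split is made on time, not at random, so no snapshot from the test season can inform training.

```python
def split_by_season(snapshots, test_season):
    """Chronological split: earlier seasons train, the held-out season tests.

    snapshots: list of dicts with an integer 'season' key (hypothetical schema).
    Seasons later than test_season are dropped rather than leaked into training.
    """
    train = [s for s in snapshots if s["season"] < test_season]
    test = [s for s in snapshots if s["season"] == test_season]
    return train, test

# Illustrative records only.
data = [
    {"season": 2023, "game_id": "a"},
    {"season": 2024, "game_id": "b"},
    {"season": 2025, "game_id": "c"},
    {"season": 2025, "game_id": "d"},
]
train, test = split_by_season(data, test_season=2025)
print(len(train), len(test))
```

A random row-level split would scatter snapshots from the same game across both sets, which tends to inflate test accuracy for in-game models.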
Realistic backtesting — Poly-price backtests use actual market snapshots and modeled execution costs rather than idealized fills. Entry prices reflect live market conditions at the time of the backtest snapshot.
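Bootstrap confidence intervals: 10,000 resamples with replacement. The p-value is the fraction of bootstrap means ≤ 0, a one-sided test of H0: "the model has no edge." A p < 0.05 means fewer than 5% of resampled means fell at or below zero, so H0 is rejected at the 5% level. It is not the probability that the edge is real.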
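The bootstrap procedure described here can be sketched with the standard library alone. This sketch assumes simple percentile intervals; the page does not say whether percentile, BCa, or another interval type is used. The function name, seed, and sample P&L values are illustrative.

```python
import random
import statistics

def bootstrap_mean(pnl_cents, n_resamples=10_000, seed=0):
    """Percentile bootstrap of the mean per-trade P&L (in cents)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(pnl_cents)
    means = sorted(
        statistics.fmean(rng.choices(pnl_cents, k=n))  # resample with replacement
        for _ in range(n_resamples)
    )

    def percentile(q):
        return means[min(n_resamples - 1, int(q * n_resamples))]

    return {
        "bootstrap_mean": statistics.fmean(means),
        "ci95": (percentile(0.025), percentile(0.975)),
        "ci99": (percentile(0.005), percentile(0.995)),
        # Fraction of resampled means at or below zero (one-sided).
        "p_value": sum(1 for m in means if m <= 0) / n_resamples,
    }

# Illustrative per-trade P&L in cents, not the validation data.
sample_pnl = [38.0, 50.0, -45.0, 63.0, -30.0, 49.0, -52.0, 40.0, -41.0, 25.0]
print(bootstrap_mean(sample_pnl))
```

With small samples the interval is wide and the p-value is unstable, which is consistent with the per-sport tables on this page: each 95% CI spans zero and is labeled "Not significant".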
Calibration — Predictions are bucketed into 5%-wide bins (min 5 trades each). A well-calibrated model's dots land on the diagonal; points below the line indicate overconfidence.
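The binning behind a calibration plot can be sketched as follows. The 5% bin width and five-trade minimum come from the description above; the function name, the tuple layout of each row, and the sample inputs are assumptions.

```python
def calibration_bins(predictions, outcomes, width=0.05, min_trades=5):
    """Group predicted probabilities into fixed-width bins.

    Returns rows of (bin_start, mean_prediction, observed_win_rate, n_trades)
    for bins holding at least `min_trades` trades.
    """
    n_bins = int(round(1 / width))
    bins = {}
    for p, won in zip(predictions, outcomes):
        idx = min(int(p / width), n_bins - 1)  # p == 1.0 falls in the last bin
        bins.setdefault(idx, []).append((p, won))
    rows = []
    for idx in sorted(bins):
        members = bins[idx]
        if len(members) < min_trades:
            continue  # too few trades for a stable estimate
        mean_pred = sum(p for p, _ in members) / len(members)
        win_rate = sum(1 for _, won in members if won) / len(members)
        rows.append((round(idx * width, 2), mean_pred, win_rate, len(members)))
    return rows

# Illustrative predictions and outcomes, not the validation data.
preds = [0.61, 0.62, 0.63, 0.64, 0.62, 0.81, 0.82]
wins = [True, True, False, True, False, True, True]
for bin_start, mean_pred, win_rate, n in calibration_bins(preds, wins):
    flag = "overconfident" if win_rate < mean_pred else "ok or underconfident"
    print(bin_start, round(mean_pred, 3), win_rate, n, flag)
```

Each row is one dot on the plot: the mean prediction is the x-coordinate and the observed win rate is the y-coordinate, so a dot below the diagonal has a win rate lower than the model predicted.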
Sharpe ratio — Annualized (sqrt(252) scaling) on per-trade P&L. Values above 1.0 indicate strong risk-adjusted returns; above 3.0 is exceptional.
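A minimal sketch of the Sharpe calculation as described: mean per-trade P&L divided by its standard deviation, scaled by sqrt(252). The sqrt(252) factor is the conventional scaling for daily returns, so applying it to per-trade P&L implicitly treats each trade as one trading day. The sketch also assumes a zero risk-free rate and the sample (n-1) standard deviation, neither of which the page specifies.

```python
import math
import statistics

def annualized_sharpe(pnl_per_trade, periods_per_year=252):
    """Mean / sample standard deviation of per-trade P&L, scaled by sqrt(periods)."""
    mean = statistics.fmean(pnl_per_trade)
    sd = statistics.stdev(pnl_per_trade)  # sample (n-1) standard deviation
    if sd == 0:
        return float("nan")  # undefined when every trade has the same P&L
    return mean / sd * math.sqrt(periods_per_year)

# Illustrative per-trade P&L in cents.
print(round(annualized_sharpe([10.0, -5.0, 20.0, -10.0, 15.0]), 2))
```

If trades arrive more or less often than once per trading day, `periods_per_year` would need to reflect the actual trade frequency for the annualized figure to be comparable with conventional Sharpe ratios.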
Profit factor: gross wins / gross losses. Any value above 1.0 means total wins exceed total losses. The thresholds used on this page: above 1.25 = profitable, above 1.5 = strong, above 2.0 = excellent.
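Profit factor and the rating thresholds can be computed from per-trade P&L as in the sketch below. The function names and the label returned for values at or below 1.25 are illustrative choices, not taken from the page.

```python
def profit_factor(pnl_per_trade):
    """Gross wins divided by gross losses (both as positive amounts)."""
    gross_wins = sum(p for p in pnl_per_trade if p > 0)
    gross_losses = -sum(p for p in pnl_per_trade if p < 0)
    if gross_losses == 0:
        # No losing trades: unbounded if there were wins, undefined otherwise.
        return float("inf") if gross_wins > 0 else float("nan")
    return gross_wins / gross_losses

def rating(pf):
    """Map a profit factor to the page's threshold labels."""
    if pf > 2.0:
        return "excellent"
    if pf > 1.5:
        return "strong"
    if pf > 1.25:
        return "profitable"
    return "below threshold"  # illustrative label for everything else

# Illustrative per-trade P&L in cents: wins total 151c, losses total 60c.
pf = profit_factor([38.0, 50.0, -40.0, -20.0, 63.0])
print(round(pf, 2), rating(pf))
```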
Fee assumptions — Results are shown net of the execution-cost assumptions used when this validation export was generated. Live exchange fees and market microstructure can change over time.
Pregame filter — For NBA and NHL, the deployed strategy requires the pregame market price to agree with the model's bet side at ≥55c. This filters out trades where the model disagrees with market consensus, reducing adverse selection.
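One reading of the pregame filter is sketched below. It assumes "agree at ≥55c" means the pregame market price of the side the model wants to buy must be at least 55 cents, and that other sports pass through unfiltered. The function and parameter names are hypothetical.

```python
FILTERED_SPORTS = {"NBA", "NHL"}

def passes_pregame_filter(sport, pregame_price_c, threshold_c=55.0):
    """Return True if a candidate trade survives the pregame filter.

    pregame_price_c: pregame market price, in cents, of the side the model
    wants to buy (ASSUMED interpretation of "agree with the bet side").
    """
    if sport not in FILTERED_SPORTS:
        return True  # filter only applies to NBA and NHL
    return pregame_price_c >= threshold_c

print(passes_pregame_filter("NBA", 62.0))  # market favored this side pregame
print(passes_pregame_filter("NHL", 48.0))  # market disagreed: trade skipped
print(passes_pregame_filter("MLB", 40.0))  # sport not subject to the filter
```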