Module 4: Backtesting for Sports Betting
**Build a Polymarket Prediction Bot from Scratch**
---
This is where most people blow up. They build a model that looks incredible in testing, backtest it, see 75% win rates, go live, and proceed to lose money for three straight weeks.
The problem is almost never the model. It's the backtest.
A bad backtest doesn't just give you wrong numbers -- it gives you *confidence* in wrong numbers. You size up, you run it longer, you double down when the losses start because "the backtest said 73% win rate." By the time you realize the backtest was flawed, you've lost real money.
This module covers:
1. How sports betting backtests differ from standard ML evaluation
2. The hold-to-settlement strategy and its economics
3. Building a rigorous backtester from scratch
4. The five mistakes that make every backtest look amazing (and lose money live)
5. Analyzing results the right way
6. Parameter sensitivity -- finding the real sweet spot vs. overfitting
7. Execution cost modeling -- what your backtest forgets
How Sports Betting Backtesting Differs from ML Evaluation
In ML, you care about accuracy, precision, recall, AUC, Brier score. You split train/test, evaluate, done.
In sports betting, **a model with worse Brier score can make more money**. This is not a paradox -- it's because you don't bet on every game. You only bet when your model disagrees with the market by enough to cover costs. The question isn't "how accurate is the model overall?" It's "how accurate is the model *on the subset of games where it disagrees with the market?*"
A model that's slightly miscalibrated but identifies genuine edges will crush a perfectly calibrated model that agrees with the market on everything.
This means:
**Train/test split isn't enough.** You need to simulate the actual trading logic: filters, position limits, edge thresholds.
**Accuracy on the full dataset is irrelevant.** Only accuracy on *triggered trades* matters.
**You must model costs.** A 3c edge is real in a backtest and fake in production after fees + slippage.
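To make the cost point concrete, here is a small sketch of gross vs. net edge. The 2c taker fee and 1c slippage figures are illustrative assumptions for this example, not an actual fee schedule:

```python
# Sketch: gross vs. net edge after execution costs.
# The fee and slippage numbers are ILLUSTRATIVE assumptions,
# not Polymarket's actual fee schedule.

def net_edge_cents(fair_wp, entry_price_c, taker_fee_c=2.0, slippage_c=1.0):
    """Gross edge minus assumed per-share costs, all in cents."""
    gross = fair_wp * 100 - entry_price_c
    return gross - taker_fee_c - slippage_c

# A "3c edge" in a frictionless backtest (model 66%, entry 63c)...
gross = 0.66 * 100 - 63
# ...is ~0c after an assumed 2c fee + 1c slippage
net = net_edge_cents(0.66, 63)
print(f"gross: {gross:+.1f}c, net: {net:+.1f}c")
```

The same filter applied inside a backtest (only count trades whose *net* edge clears the threshold) is what separates backtests that survive contact with production from ones that don't.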
# ── Install required packages (run this cell first!) ──────────────────────────
# Uncomment the line below and run if you haven't installed these yet:
# !pip install pandas numpy scikit-learn xgboost matplotlib
How to use AI with this notebook
**New to Python? No problem.** Every cell in this notebook is designed to work with AI coding assistants.
If you get stuck on any cell:
1. **Copy the cell** into Claude, ChatGPT, or any AI assistant
2. **Ask:** "Explain this code line by line"
3. **To customize:** "Help me modify this for soccer instead of NBA"
4. **To debug:** Paste the error message and ask "How do I fix this?"
5. **To extend:** "Add a feature that tracks home/away win streaks"
Think of the AI as a patient tutor sitting next to you. The notebooks give you working code — the AI helps you understand and extend it.
> **Pro tip:** If a cell is confusing, ask the AI: "Explain this to me like I've never written Python before." It will break down every line.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import SplineTransformer
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.isotonic import IsotonicRegression
import warnings
warnings.filterwarnings('ignore')
plt.rcParams['figure.dpi'] = 120
plt.rcParams['font.size'] = 11
plt.rcParams['axes.grid'] = True
plt.rcParams['grid.alpha'] = 0.3
print("Module 4: Backtesting for Sports Betting")
print("Imports loaded.")
---
1. The Hold-to-Settlement Strategy
On Polymarket, sports contracts resolve to **$0.00 or $1.00**. There is no partial payout. This is fundamentally different from stock trading where you exit at some intermediate price.
**The economics are simple:**
- You BUY a contract at some price (the "entry"), e.g., **63 cents**
- If the team wins: the contract resolves to $1.00 and you profit **(100 - 63) = 37 cents**
- If the team loses: the contract resolves to $0.00 and you lose **63 cents**
- No exit fees, no slippage on resolution -- settlement is free and exact
**Expected Value:**
$$\text{EV} = p_{\text{fair}} \times (1 - \text{price}) - (1 - p_{\text{fair}}) \times \text{price} = p_{\text{fair}} - \text{price}$$
If your model says the true probability is **72%** and you buy at **63 cents**:
$$\text{EV} = 0.72 - 0.63 = +0.09 = +9\text{c per share}$$
That's the edge. It's additive and it's the only number that matters.
Why Hold-to-Settlement?
You could try to trade in and out -- buy at 63c, sell at 70c when the game swings. But this introduces:
- Exit slippage (the orderbook may be thin)
- A taker fee on exit (~2c)
- Timing risk (when do you sell?)
- The need to model *price movement*, not just *outcome probability*
Hold-to-settlement removes all of these. You only need to answer one question: **"What is the true probability that this team wins?"** If you're right more often than the market, you make money. Period.
# Demonstrate the hold-to-settlement EV math
def calculate_ev(fair_wp, entry_price):
    """Calculate expected value in cents.

    fair_wp: model's estimated probability (0-1)
    entry_price: what we pay in cents (0-100)
    """
    entry_frac = entry_price / 100
    ev = fair_wp - entry_frac
    profit_if_win = 100 - entry_price
    loss_if_lose = -entry_price
    ev_check = fair_wp * profit_if_win + (1 - fair_wp) * loss_if_lose
    return ev * 100, ev_check  # both in cents, should match

# Example scenarios
scenarios = [
    (0.72, 63, "Model says 72%, buy at 63c"),
    (0.72, 72, "Model says 72%, buy at 72c (no edge)"),
    (0.72, 80, "Model says 72%, buy at 80c (negative EV!)"),
    (0.55, 45, "Model says 55%, buy at 45c"),
    (0.85, 75, "Model says 85%, buy at 75c"),
]

print("Hold-to-Settlement Economics")
print("=" * 70)
print(f"{'Scenario':<42} {'EV (c)':>8} {'Win P/L':>8} {'Lose P/L':>9}")
print("-" * 70)
for fair, entry, desc in scenarios:
    ev, _ = calculate_ev(fair, entry)
    win_pl = 100 - entry
    lose_pl = -entry
    marker = " <-- edge" if ev > 0 else (" <-- NO edge" if ev == 0 else " <-- LOSING")
    print(f"{desc:<42} {ev:>+7.1f}c {win_pl:>+7.1f}c {lose_pl:>+8.1f}c{marker}")
# Visualize: How edge scales with fair_wp - market_price
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: EV surface
fair_wps = np.arange(0.50, 0.95, 0.01)
entry_prices = np.arange(40, 85, 1)
FW, EP = np.meshgrid(fair_wps, entry_prices)
EV = (FW - EP / 100) * 100  # in cents

ax = axes[0]
c = ax.contourf(FW * 100, EP, EV, levels=np.arange(-30, 35, 5), cmap='RdYlGn')
plt.colorbar(c, ax=ax, label='EV (cents/share)')
ax.contour(FW * 100, EP, EV, levels=[0], colors='black', linewidths=2)
ax.set_xlabel('Model Fair WP (cents)')
ax.set_ylabel('Entry Price (cents)')
ax.set_title('Expected Value per Share')
ax.annotate('Break-even line\n(fair_wp = entry)', xy=(70, 70), fontsize=9,
            ha='center', bbox=dict(boxstyle='round', fc='white', alpha=0.8))

# Right: Profit distribution for a realistic edge
ax = axes[1]
np.random.seed(42)
n_trades = 500
fair_wp_sim = 0.72
entry_sim = 63
outcomes = np.random.binomial(1, fair_wp_sim, n_trades)
profits = np.where(outcomes == 1, 100 - entry_sim, -entry_sim)

# hist() takes one color per dataset, not per bin -- draw first, then recolor patches
_, _, patches = ax.hist(profits, bins=[-65, -60, 35, 40], edgecolor='white', rwidth=0.6)
patches[0].set_facecolor('#d32f2f')  # losses (-63c) land in the first bin
patches[2].set_facecolor('#388e3c')  # wins (+37c) land in the last bin
ax.set_xlabel('Profit per Trade (cents)')
ax.set_ylabel('Count')
ax.set_title(f'Profit Distribution (fair=72c, entry=63c, n={n_trades})')
ax.axvline(x=np.mean(profits), color='blue', linestyle='--', linewidth=2,
           label=f'Avg: {np.mean(profits):.1f}c')
ax.legend()
plt.tight_layout()
plt.show()

print(f"\nSimulation: {n_trades} trades at 72% fair / 63c entry")
print(f"  Wins: {outcomes.sum()} ({outcomes.mean()*100:.1f}%)")
print(f"  Total PnL: {profits.sum():.0f}c ({profits.mean():.1f}c/trade)")
print(f"  Theoretical EV: {(fair_wp_sim - entry_sim/100)*100:.1f}c/trade")
---
2. Building the Backtester
The backtester simulates exactly what the live bot does:
1. For each in-game snapshot, predict the fair win probability
2. Compare model's estimate to the "market price" (we use ESPN WP as a proxy)
3. If the edge exceeds our threshold AND passes all filters, log a trade
4. The trade resolves to +profit or -loss based on who actually won
**Key design decisions:**
- We check **both sides** (home AND away) on every snapshot. If the model says home is 72%, that also means away is 28%. If the market says home is 80%, we have edge on away (28c fair vs 20c market = 8c edge).
- We limit trades per game (`max_per_game`) to avoid overconcentration
- We only trade from `min_period` onward (Period 1 data is too noisy in most sports)
- We apply `min_fair_wp` to avoid underdog bets (more on why later)
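The loop above can be sketched in a few dozen lines. The column names (`game_id`, `period`, `model_wp`, `market_wp`, `home_won`) and filter defaults here are illustrative assumptions, not the module's exact implementation -- adapt them to your own snapshot data:

```python
# Sketch of the core backtest loop. Column names and defaults are
# ILLUSTRATIVE -- adapt to your own snapshot data.
import pandas as pd

def backtest(snapshots, edge_threshold=5.0, min_period=2,
             min_fair_wp=0.50, max_per_game=2):
    trades = []
    per_game = {}  # trades taken so far, per game_id
    for row in snapshots.itertuples():
        if row.period < min_period:            # skip noisy early periods
            continue
        if per_game.get(row.game_id, 0) >= max_per_game:
            continue                           # avoid overconcentration
        # Check BOTH sides: away fair/price are complements of home's
        for side, fair, price in [
            ("home", row.model_wp, row.market_wp),
            ("away", 1 - row.model_wp, 1 - row.market_wp),
        ]:
            edge_c = (fair - price) * 100      # edge in cents
            if edge_c < edge_threshold or fair < min_fair_wp:
                continue
            won = row.home_won if side == "home" else not row.home_won
            pnl_c = (100 - price * 100) if won else -price * 100
            trades.append({"game_id": row.game_id, "side": side,
                           "edge_c": edge_c, "pnl_c": pnl_c})
            per_game[row.game_id] = per_game.get(row.game_id, 0) + 1
            break  # the two edges are negatives of each other;
                   # at most one side can clear the threshold
    return pd.DataFrame(trades)

# Tiny synthetic example: one game, two snapshots
snaps = pd.DataFrame({
    "game_id": [1, 1],
    "period": [2, 3],
    "model_wp": [0.72, 0.75],   # model's home win probability
    "market_wp": [0.63, 0.80],  # market's home price, as a probability
    "home_won": [True, True],
})
print(backtest(snaps))
```

In this toy data, the first snapshot triggers a home trade (9c edge). The second snapshot has 5c of edge on the away side, but the `min_fair_wp` filter blocks it -- exactly the underdog filter described above.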
Using ESPN WP as a Market Proxy
We don't have historical Polymarket orderbook data at second-level granularity. But ESPN publishes a real-time win probability for every game, and Polymarket prices track ESPN WP closely (correlation ~0.95 for NBA/NCAAMB). It's not perfect, but it's the best available proxy for backtesting.
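If you can collect even a small paired sample of ESPN WP and Polymarket mid prices yourself, it's worth measuring how tightly they track before trusting the proxy. A minimal sketch, using synthetic data purely for illustration:

```python
# Sanity-checking a market proxy. The data here is SYNTHETIC --
# substitute your own paired (espn_wp, poly_mid) observations.
import numpy as np

rng = np.random.default_rng(0)
espn_wp = rng.uniform(0.3, 0.9, 200)
# Assume the market tracks ESPN WP with small noise (illustrative)
poly_mid = np.clip(espn_wp + rng.normal(0, 0.02, 200), 0.01, 0.99)

corr = np.corrcoef(espn_wp, poly_mid)[0, 1]
mae_c = np.mean(np.abs(espn_wp - poly_mid)) * 100
print(f"correlation: {corr:.3f}, mean abs gap: {mae_c:.1f}c")
```

A large mean gap is a warning sign: "edges" your backtest finds against ESPN WP may not exist against the actual market.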
This preview covers 8 of the module's 45 cells. The full module continues with hands-on exercises and working code.
Get All 6 Modules — $49