Every NBA broadcast shows a "win probability" graphic that updates after each play. Behind it is a machine learning model that takes the current game state (score, time remaining, who has the ball) and outputs the probability that each team wins.
In this tutorial, you'll build one from scratch in Python. By the end, you'll have a model that takes any NBA game state and outputs a calibrated win probability that you can use for analysis, betting, or building a live dashboard.
## What We're Building

Input:

```python
predict_wp(score_diff=8, seconds_remaining=720, period=3, elo_diff=150)
# → 0.78 (home team has 78% chance of winning)
```
Output: a number between 0 and 1 that estimates the probability the home team wins from this game state. Not just "who's ahead" but "how likely is the outcome": calibrated so that when the model says 70%, the team actually wins about 70% of the time.
## Step 1: Get the Training Data
NBA win probability models are trained on in-game snapshots — thousands of (game_state, did_home_win) pairs from historical games.
### Option A: Use ESPN Play-by-Play Data
```python
import requests

def fetch_espn_pbp(game_id: str) -> list:
    """Fetch play-by-play data from ESPN's public API."""
    url = "https://site.api.espn.com/apis/site/v2/sports/basketball/nba/summary"
    resp = requests.get(url, params={"event": game_id})
    resp.raise_for_status()
    data = resp.json()
    plays = []
    for play in data.get("plays", []):
        plays.append({
            "clock": play.get("clock", {}).get("displayValue", ""),
            "period": play.get("period", {}).get("number", 0),
            "home_score": play.get("homeScore", 0),
            "away_score": play.get("awayScore", 0),
            "text": play.get("text", ""),
        })
    return plays
```
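ESPN serves the clock as a display string like `"7:32"`, so you still need to turn it and the period number into a single `seconds_remaining` value before modeling. A minimal sketch (the helper name and the sub-minute `"45.0"` format are my assumptions about the feed, not something the API guarantees):

```python
def clock_to_seconds_remaining(clock_str: str, period: int) -> int:
    """Convert a display clock ("7:32") plus period into total
    regulation seconds remaining (48 min = 2880 s)."""
    if ":" in clock_str:
        minutes, seconds = clock_str.split(":")
    else:
        # Under a minute the clock may appear as "45.0" (assumed format)
        minutes, seconds = "0", clock_str
    in_period = int(minutes) * 60 + int(float(seconds))
    # Full regulation periods still to play after this one
    periods_left = max(0, 4 - period)
    return periods_left * 720 + in_period
```

For example, `clock_to_seconds_remaining("12:00", 1)` returns 2880 and `clock_to_seconds_remaining("0:00", 4)` returns 0; overtime periods would need extra handling.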
### Option B: Use Pre-Built Training Parquets
If you want to skip the data collection and jump to modeling, ZenHodl publishes training-ready parquet files with 5 seasons of NBA snapshot data (2021-22 through 2025-26, ~2M rows).
```python
import pandas as pd

# Load training data (each row = one game state snapshot)
df = pd.read_parquet("wp_training_NBA_2024-25.parquet")
print(f"Rows: {len(df)}, Columns: {list(df.columns)}")
```
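Whatever the source, it's worth failing fast if the frame lacks the columns the rest of this tutorial assumes. A small guard (the column names are the ones used below; a vendor file may name them differently, and the helper is my own, not part of any published schema):

```python
import pandas as pd

# Columns the feature engineering and training steps below rely on
REQUIRED_COLS = {"home_score", "away_score", "seconds_remaining",
                 "elo_diff", "home_win"}

def check_snapshot_frame(df: pd.DataFrame) -> None:
    """Raise early if the training frame is missing expected columns."""
    missing = REQUIRED_COLS - set(df.columns)
    if missing:
        raise ValueError(f"training frame is missing columns: {sorted(missing)}")
```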
## Step 2: Feature Engineering
The raw play-by-play data needs to be transformed into features the model can learn from:
```python
import numpy as np

def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Transform raw game snapshots into model-ready features."""
    # Core game state
    df["score_diff"] = df["home_score"] - df["away_score"]
    df["total_score"] = df["home_score"] + df["away_score"]
    # Time features (48 min of regulation = 2880 s)
    df["time_fraction"] = 1 - (df["seconds_remaining"] / 2880)
    df["score_diff_x_tf"] = df["score_diff"] * df["time_fraction"]
    df["score_diff_sq"] = df["score_diff"] ** 2
    # Elo interaction (pre-game strength × current lead)
    df["score_diff_x_elo"] = df["score_diff"] * df["elo_diff"]
    return df
```
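To make those transformations concrete, here is one snapshot worked by hand (the numbers are illustrative, not from a real game):

```python
# One snapshot: home up 8 at halftime (1440 of 2880 seconds remaining)
home_score, away_score = 58, 50
seconds_remaining, elo_diff = 1440, 150

score_diff = home_score - away_score           # 8
time_fraction = 1 - seconds_remaining / 2880   # 0.5: half the game elapsed
score_diff_x_tf = score_diff * time_fraction   # 4.0: the lead, weighted by time
score_diff_sq = score_diff ** 2                # 64
score_diff_x_elo = score_diff * elo_diff       # 1200
```

Note how `score_diff_x_tf` discounts the same 8-point lead early in the game and amplifies it late, which is exactly the interaction the model needs.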
### The Most Important Features (by XGBoost importance)

From our production model:
| Feature | Importance | Why it matters |
|---|---|---|
| `score_diff` | 28% | The lead is the single strongest signal |
| `time_fraction` | 19% | The same lead means different things in Q1 vs Q4 |
| `score_diff_x_tf` | 15% | Interaction: a 10-point lead at 90% elapsed is near-certain |
| `elo_diff` | 12% | A better team is more likely to come back from a deficit |
| `score_diff_sq` | 8% | Nonlinear: a 20-point lead is more than 2x as safe as a 10-point lead |
| `ortg_diff` | 5% | Team offensive efficiency differential |
| `pace_diff` | 4% | Faster pace → more possessions → more variance → a deficit is less safe |
## Step 3: Train the Model
```python
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import brier_score_loss, roc_auc_score

FEATURE_COLS = [
    "score_diff", "seconds_remaining", "period", "time_fraction",
    "elo_diff", "score_diff_x_tf", "score_diff_sq", "total_score",
    "score_diff_x_elo",
]

# Prepare data (df must be sorted chronologically for the split below)
X = df[FEATURE_COLS].values
y = df["home_win"].values  # 1 if home team won, 0 if away

# Walk-forward split (train on older seasons, test on newest)
# DON'T use a random split — it causes temporal leakage
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False  # shuffle=False preserves time order
)

# Train XGBoost
model = XGBClassifier(
    n_estimators=300,
    max_depth=5,
    learning_rate=0.05,
    subsample=0.8,
    colsample_bytree=0.8,
    objective="binary:logistic",
    eval_metric="logloss",
    random_state=42,
)
model.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    verbose=50,
)

# Evaluate
y_pred = model.predict_proba(X_test)[:, 1]
brier = brier_score_loss(y_test, y_pred)
auc = roc_auc_score(y_test, y_pred)
print(f"Brier score: {brier:.4f}")
print(f"ROC-AUC: {auc:.4f}")
```
A good NBA win probability model should achieve:

- Brier score < 0.15 (ours is 0.124 after calibration)
- ROC-AUC > 0.85 (ours is 0.897)
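For context on those targets, it helps to know what naive baselines score. Always predicting 0.5 gives a Brier score of exactly 0.25, and always predicting the home team's base rate (roughly 55%, an approximate historical figure) barely improves on that, so the model's edge is the gap from ~0.25 down to ~0.12. A quick simulation:

```python
import numpy as np

# Simulate 10,000 game outcomes with the home team winning ~55% of the time
rng = np.random.default_rng(0)
y = (rng.random(10_000) < 0.55).astype(float)

# Baseline 1: always predict 0.5, which scores exactly 0.25
brier_coinflip = np.mean((0.5 - y) ** 2)

# Baseline 2: always predict the observed base rate (slightly better)
p = y.mean()
brier_base_rate = np.mean((p - y) ** 2)

print(f"coin flip: {brier_coinflip:.4f}, base rate: {brier_base_rate:.4f}")
```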
## Step 4: Calibrate (The Critical Step)
Raw XGBoost outputs are NOT well-calibrated. The model might say "70%" but the team actually wins 75% of the time in that range. This miscalibration costs real money if you're trading on the probabilities.
```python
from sklearn.isotonic import IsotonicRegression

# Split off a calibration set (separate from train and test)
X_cal = X_test[:len(X_test) // 2]
y_cal = y_test[:len(y_test) // 2]
X_final_test = X_test[len(X_test) // 2:]
y_final_test = y_test[len(y_test) // 2:]

# Get raw predictions on the calibration set
raw_cal_probs = model.predict_proba(X_cal)[:, 1]

# Fit isotonic regression (maps raw probs → calibrated probs)
calibrator = IsotonicRegression(y_min=0.005, y_max=0.995, out_of_bounds="clip")
calibrator.fit(raw_cal_probs, y_cal)

# Apply calibration to the held-out test set
raw_test_probs = model.predict_proba(X_final_test)[:, 1]
calibrated_probs = calibrator.transform(raw_test_probs)

# Compare
brier_raw = brier_score_loss(y_final_test, raw_test_probs)
brier_cal = brier_score_loss(y_final_test, calibrated_probs)
print(f"Raw Brier: {brier_raw:.4f}")
print(f"Calibrated Brier: {brier_cal:.4f}")
print(f"Improvement: {(brier_raw - brier_cal) / brier_raw * 100:.1f}%")
```
### Measuring Calibration: ECE
Expected Calibration Error (ECE) measures how well your predicted probabilities match reality:
```python
def compute_ece(y_true, y_pred, n_bins=10):
    """Expected Calibration Error — lower is better."""
    bins = np.linspace(0, 1, n_bins + 1)
    ece = 0.0
    for i in range(n_bins):
        mask = (y_pred >= bins[i]) & (y_pred < bins[i + 1])
        if i == n_bins - 1:
            # Make the last bin inclusive so predictions of exactly 1.0 count
            mask = (y_pred >= bins[i]) & (y_pred <= bins[i + 1])
        if mask.sum() > 0:
            bin_pred = y_pred[mask].mean()
            bin_true = y_true[mask].mean()
            ece += abs(bin_pred - bin_true) * mask.sum() / len(y_true)
    return ece

ece_raw = compute_ece(y_final_test, raw_test_probs)
ece_cal = compute_ece(y_final_test, calibrated_probs)
print(f"Raw ECE: {ece_raw:.4f}")
print(f"Calibrated ECE: {ece_cal:.4f}")
```
Target: ECE < 0.01. Our production model achieves 0.002.
## Step 5: Save and Use in Production
```python
import pickle

model_package = {
    "model": model,
    "calibrator": calibrator,
    "feature_names": FEATURE_COLS,
    "metrics": {"brier": brier_cal, "auc": auc, "ece": ece_cal},
}
with open("wp_model_NBA.pkl", "wb") as f:
    pickle.dump(model_package, f)

# Load and predict
def predict_wp(score_diff, seconds_remaining, period, elo_diff=0):
    """Predict home team win probability from a game state."""
    with open("wp_model_NBA.pkl", "rb") as f:
        pkg = pickle.load(f)
    time_fraction = 1 - (seconds_remaining / 2880)
    features = {
        "score_diff": score_diff,
        "seconds_remaining": seconds_remaining,
        "period": period,
        "time_fraction": time_fraction,
        "elo_diff": elo_diff,
        "score_diff_x_tf": score_diff * time_fraction,
        "score_diff_sq": score_diff ** 2,
        "total_score": 0,  # unknown from these inputs; 0 is a crude placeholder
        "score_diff_x_elo": score_diff * elo_diff,
    }
    X = np.array([[features[f] for f in pkg["feature_names"]]])
    raw_prob = pkg["model"].predict_proba(X)[0][1]
    calibrated = pkg["calibrator"].transform([raw_prob])[0]
    return round(calibrated, 4)

# Example
wp = predict_wp(score_diff=8, seconds_remaining=720, period=3, elo_diff=150)
print(f"Home team win probability: {wp:.1%}")
```
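One practical note: `predict_wp` above reopens the pickle on every call, which is wasteful in a live loop scoring hundreds of snapshots per game. A minimal module-level cache (the function name is my own, not part of the tutorial's API):

```python
import pickle

_MODEL_CACHE = {}

def load_model_package(path="wp_model_NBA.pkl"):
    """Load the pickled model package once, then reuse it on later calls."""
    if path not in _MODEL_CACHE:
        with open(path, "rb") as f:
            _MODEL_CACHE[path] = pickle.load(f)
    return _MODEL_CACHE[path]
```

`predict_wp` would then call `load_model_package()` instead of opening the file itself.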
## Step 6: Add Live Data Overlays
A static model is good. A model with live adjustments is better. Three overlays that matter:
### Injury Adjustment
```python
# If a star player is out, adjust the win probability
STAR_IMPACT = {
    "Nikola Jokic": 0.12,  # Jokic out → team loses ~12% win prob
    "Luka Doncic": 0.11,
    "Giannis Antetokounmpo": 0.11,
    "Jayson Tatum": 0.09,
    # ... 58 players tracked in our production system
}
```
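The table alone doesn't change any prediction; you still need to apply it. One simple way, a sketch of my own rather than the production logic, shifts the probability directly and clips to a sane range (a more principled version would adjust in log-odds space):

```python
# Assumed two-player subset of the impact table above
STAR_IMPACT = {"Nikola Jokic": 0.12, "Luka Doncic": 0.11}

def adjust_for_injuries(home_wp, home_out=(), away_out=()):
    """Shift home win probability down for injured home stars,
    up for injured away stars, then clip to [0.01, 0.99]."""
    for player in home_out:
        home_wp -= STAR_IMPACT.get(player, 0.0)
    for player in away_out:
        home_wp += STAR_IMPACT.get(player, 0.0)
    return min(max(home_wp, 0.01), 0.99)

# Home favorite loses its star center: 0.60 drops to roughly 0.48
print(adjust_for_injuries(0.60, home_out=["Nikola Jokic"]))
```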
### Team Stats (Offensive/Defensive Efficiency)
```python
# ORtg = points per 100 possessions (offense)
# DRtg = points per 100 possessions allowed (defense)
# Pace = possessions per game
team_stats = {
    "BOS": {"ortg": 117.2, "drtg": 110.0, "pace": 97.7},
    "DET": {"ortg": 114.3, "drtg": 106.5, "pace": 102.9},
}
# A team with higher ORtg and lower DRtg is better
# The differential feeds into the model as additional features
```
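To turn those numbers into the `ortg_diff` and `pace_diff` features from the importance table, compute home-minus-away differentials. A sketch (the helper name is mine; `team_stats` repeats the two-team table above so the example is self-contained):

```python
team_stats = {
    "BOS": {"ortg": 117.2, "drtg": 110.0, "pace": 97.7},
    "DET": {"ortg": 114.3, "drtg": 106.5, "pace": 102.9},
}

def matchup_features(home: str, away: str, stats=team_stats) -> dict:
    """Home-minus-away differentials; DRtg is flipped because lower is better."""
    h, a = stats[home], stats[away]
    return {
        "ortg_diff": round(h["ortg"] - a["ortg"], 1),
        "drtg_diff": round(a["drtg"] - h["drtg"], 1),
        "pace_diff": round(h["pace"] - a["pace"], 1),
    }

print(matchup_features("BOS", "DET"))
```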
### Live Recalibration
```python
# Rolling isotonic refit on the last 500 resolved predictions
# Auto-corrects calibration drift without full retraining
from sklearn.isotonic import IsotonicRegression

class LiveRecalibrator:
    def __init__(self, buffer_size=500):
        self.buffer_size = buffer_size
        self.preds = []
        self.outcomes = []
        self.calibrator = None

    def update(self, predicted_prob, actual_outcome):
        self.preds.append(predicted_prob)
        self.outcomes.append(actual_outcome)
        if len(self.preds) > self.buffer_size:
            self.preds.pop(0)
            self.outcomes.pop(0)
        if len(self.preds) >= 50:
            self.calibrator = IsotonicRegression(out_of_bounds="clip")
            self.calibrator.fit(self.preds, self.outcomes)

    def calibrate(self, raw_prob):
        if self.calibrator is not None:
            return self.calibrator.transform([raw_prob])[0]
        return raw_prob
```
## What This Model Can Do
Once built, this model powers:
- Live dashboards — show real-time win probability curves during games
- Trading bots — compare your fair probability to Polymarket/Kalshi prices and trade the gap
- Research — analyze which game situations are most commonly mispriced
- Content — "the model gives X a 73% chance with 4 minutes left" makes great commentary
## Next Steps
- Get training data: Download from ESPN's public API or use ZenHodl's pre-built parquets
- Try the pre-built model: ZenHodl's API serves live calibrated probabilities for NBA + 6 other sports. 7-day free trial.
- Take the full course: Our 6-module course walks through this entire pipeline with working code, from ESPN scraping to live deployment on Polymarket. $49 one-time.
- See live results: zenhodl.net/results shows how this model performs in real-money trading.