How We Predict NBA Games with Machine Learning

2026-04-13 nba machine-learning behind-the-scenes xgboost calibration

Our NBA prediction model processes live game states and outputs a calibrated win probability, updated on every score change, timeout, and substitution. Here's exactly how it works under the hood.

The Architecture

The prediction pipeline has four layers, each adjusting the output of the one before it:

  1. Base XGBoost model — trained on 40,000+ NBA game snapshots (2021-2026)
  2. Team stats overlay — live offensive/defensive ratings (ORtg, DRtg, pace)
  3. Injury overlay — 58 tracked star players via ESPN's injury API
  4. Isotonic calibration — post-hoc probability correction (ECE = 0.002)

When LeBron gets ruled out 30 minutes before tip-off, our system automatically adjusts the Lakers' win probability by his impact factor (8%) before the game even starts.
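As a rough illustration of how four layers like these can chain together, here is a minimal sketch. Everything in it is an assumption made for the example: the function names, the `state` dictionary keys, the additive way the overlays shift the probability, and the 0.62 base probability. Only the 8% impact figure comes from the text above.

```python
from typing import Callable, Dict, Sequence

# A layer takes the current probability plus game state and returns a new probability.
Layer = Callable[[float, Dict[str, float]], float]


def clamp(p: float) -> float:
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def team_stats_overlay(p: float, state: Dict[str, float]) -> float:
    # Hypothetical: a shift derived upstream from ORtg, DRtg, and pace.
    return p + state.get("rating_shift", 0.0)


def injury_overlay(p: float, state: Dict[str, float]) -> float:
    # Hypothetical: total impact of the team's unavailable players.
    return p - state.get("injury_penalty", 0.0)


def calibration_layer(p: float, state: Dict[str, float]) -> float:
    # Identity placeholder standing in for a fitted isotonic mapping.
    return p


def run_pipeline(base_prob: float, state: Dict[str, float],
                 layers: Sequence[Layer]) -> float:
    """Apply each layer in order to the base model's output."""
    p = base_prob
    for layer in layers:
        p = clamp(layer(p, state))
    return p


LAYERS = [team_stats_overlay, injury_overlay, calibration_layer]

# Assumed base probability of 0.62, with an 8% star-player penalty applied.
lakers_prob = run_pipeline(0.62, {"injury_penalty": 0.08}, LAYERS)
print(round(lakers_prob, 2))  # 0.54
```

Keeping each layer as a plain function with the same signature makes it easy to test one layer in isolation or to switch one off when diagnosing a bad prediction.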

The Training Data

We train on every NBA game from 2021-22 through 2025-26 — approximately 5,285 games with ~300 snapshots each. Each snapshot captures:

The model is a split-phase XGBoost: it learns different patterns for the early game (where the Elo rating gap between the teams matters most) and the late game (where score differential dominates).
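One common way to realize a split-phase design is to train two models and route each snapshot to one of them based on how much of the game has elapsed. The sketch below shows only that routing step and is an assumption about the general technique, not a description of the production model. The two stub models, the 0.5 split point, and the score scale are all hypothetical stand-ins for trained XGBoost models.

```python
import math
from dataclasses import dataclass
from typing import Dict, Protocol


class WinModel(Protocol):
    def predict_home_win(self, features: Dict[str, float]) -> float: ...


class EarlyGameStub:
    """Stand-in for an early-game model: driven by the Elo rating gap."""

    def predict_home_win(self, features: Dict[str, float]) -> float:
        # Standard Elo expected-score formula.
        return 1.0 / (1.0 + 10 ** (-features["elo_diff"] / 400.0))


class LateGameStub:
    """Stand-in for a late-game model: driven by score differential."""

    def predict_home_win(self, features: Dict[str, float]) -> float:
        # Logistic curve on the lead; the scale of 5 points is arbitrary.
        return 1.0 / (1.0 + math.exp(-features["score_diff"] / 5.0))


@dataclass
class SplitPhaseModel:
    early: WinModel
    late: WinModel
    split_fraction: float = 0.5  # assumed boundary between phases

    def predict(self, features: Dict[str, float], fraction_elapsed: float) -> float:
        """Route the snapshot to the model that owns this phase of the game."""
        model = self.early if fraction_elapsed < self.split_fraction else self.late
        return model.predict_home_win(features)


model = SplitPhaseModel(early=EarlyGameStub(), late=LateGameStub())
snapshot = {"elo_diff": 100.0, "score_diff": -5.0}  # stronger home team, trailing by 5

early_p = model.predict(snapshot, fraction_elapsed=0.10)
late_p = model.predict(snapshot, fraction_elapsed=0.90)
print(round(early_p, 3), round(late_p, 3))
```

The same snapshot favors the home team early (the rating gap dominates) and the away team late (the scoreboard dominates), which is the behavior the split is meant to capture.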

Why Calibration Matters More Than Accuracy

A model can be highly accurate (picks the winner 75% of the time) but terribly calibrated (when it says 70%, teams actually win 55%). For trading on prediction markets like Polymarket, calibration is everything — because you're not picking winners, you're pricing probabilities.
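To make the pricing point concrete, here is a small worked example. It assumes a binary market where a YES share pays $1 if the team wins and ignores fees and slippage. The 0.62 market price is made up for illustration, while the 70% and 55% figures are the ones from the paragraph above.

```python
def expected_profit_per_share(win_prob: float, market_price: float) -> float:
    """Expected profit from buying one YES share that pays $1 on a win.

    Win: gain (1 - price). Loss: lose the price paid. Fees are ignored.
    """
    return win_prob * (1.0 - market_price) - (1.0 - win_prob) * market_price


market_price = 0.62   # hypothetical market price for the team to win
stated_prob = 0.70    # what the miscalibrated model claims
true_win_rate = 0.55  # how often teams actually win at that stated level

perceived_edge = expected_profit_per_share(stated_prob, market_price)
actual_edge = expected_profit_per_share(true_win_rate, market_price)

print(f"perceived edge: {perceived_edge:+.2f} per share")
print(f"actual edge:    {actual_edge:+.2f} per share")
```

The model believes it is buying 8 cents of edge per share while actually giving up 7 cents, even though it may still pick the winner more often than not.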

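Our calibration metric, Expected Calibration Error (ECE), is 0.002. ECE is the gap between predicted probability and observed win rate, averaged across probability bins and weighted by how many predictions fall in each bin. On average, then, the model's stated probabilities are off by about 0.2 percentage points: a 70% prediction corresponds to an observed win rate very close to 70%, although ECE is an average and does not bound the error in any single bin. We achieve this through: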

  1. Post-hoc isotonic regression on a held-out 2024-25 season calibration set
  2. Live rolling recalibration that auto-corrects every 25 resolved predictions
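For readers who want to see the mechanics of the first step, the sketch below fits an isotonic (non-decreasing) mapping with the pool-adjacent-violators algorithm and measures ECE before and after. It is a simplified, dependency-free illustration under stated assumptions: the eight data points are invented, the mapping is applied as a step function, and ECE is measured on the same data used for fitting. In practice the fit and the evaluation would use separate held-out data, and a library implementation such as scikit-learn's `IsotonicRegression` would be the usual choice.

```python
import bisect
from collections import defaultdict


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Bin-weighted average of |mean predicted prob - observed win rate|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(p for p, _ in b) / len(b)
            win_rate = sum(y for _, y in b) / len(b)
            ece += len(b) / len(probs) * abs(avg_conf - win_rate)
    return ece


def fit_isotonic(raw_probs, outcomes):
    """Pool-adjacent-violators fit; returns (upper_x, fitted_value) knots."""
    grouped = defaultdict(lambda: [0.0, 0])  # pool identical raw probs first
    for x, y in zip(raw_probs, outcomes):
        grouped[x][0] += y
        grouped[x][1] += 1
    blocks = []  # each block: [sum_of_outcomes, count, largest_x_in_block]
    for x in sorted(grouped):
        s, c = grouped[x]
        blocks.append([s, c, x])
        # Merge backwards while the previous block's mean exceeds this one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s2, c2, x2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += c2
            blocks[-1][2] = x2
    return [(x, s / c) for s, c, x in blocks]


def apply_isotonic(knots, p):
    """Step-function lookup: use the first block whose upper x covers p."""
    xs = [x for x, _ in knots]
    return knots[min(bisect.bisect_left(xs, p), len(knots) - 1)][1]


# Invented calibration data: raw model outputs and whether the team won.
raw = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
won = [0, 0, 1, 0, 1, 0, 1, 1]

knots = fit_isotonic(raw, won)
calibrated = [apply_isotonic(knots, p) for p in raw]

ece_before = expected_calibration_error(raw, won)
ece_after = expected_calibration_error(calibrated, won)
print(f"ECE before: {ece_before:.3f}, after: {ece_after:.3f}")
```

The in-sample ECE drops to zero here because each fitted value is exactly the win rate of its pooled block. A held-out season gives a far more honest estimate.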

If the model starts drifting (maybe a rule change shifts scoring patterns), the live recalibrator catches it within days and corrects automatically. No manual intervention needed.
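The refit-every-25 behavior can be sketched as a small buffer that collects resolved predictions and triggers a refit when the counter reaches the threshold. This is an illustrative sketch only: the class name, the 500-prediction window, and the crude mean-shift fit are assumptions. A real recalibrator would refit an isotonic mapping rather than a single shift.

```python
from collections import deque


def fit_mean_shift(probs, outcomes):
    """Crude stand-in for a refit: shift predictions by the average miss."""
    shift = sum(outcomes) / len(outcomes) - sum(probs) / len(probs)
    return lambda p: min(1.0, max(0.0, p + shift))


class RollingRecalibrator:
    def __init__(self, fit_fn, refit_every=25, window=500):
        self.fit_fn = fit_fn
        self.refit_every = refit_every
        self.history = deque(maxlen=window)  # window size is an assumption
        self.since_refit = 0
        self.refits = 0
        self.mapping = lambda p: p  # identity until the first refit

    def record(self, raw_prob, outcome):
        """Store one resolved prediction; refit once enough have accumulated."""
        self.history.append((raw_prob, outcome))
        self.since_refit += 1
        if self.since_refit >= self.refit_every:
            probs, outcomes = zip(*self.history)
            self.mapping = self.fit_fn(list(probs), list(outcomes))
            self.since_refit = 0
            self.refits += 1

    def calibrate(self, raw_prob):
        return self.mapping(raw_prob)


recal = RollingRecalibrator(fit_mean_shift)

# Simulated drift: the model keeps saying 70% but only 15 of 25 teams win.
results = [1] * 15 + [0] * 10
for outcome in results[:24]:
    recal.record(0.70, outcome)
refits_after_24 = recal.refits

recal.record(0.70, results[24])
print(recal.refits, round(recal.calibrate(0.70), 2))
```

Nothing changes for the first 24 resolved predictions. The 25th triggers a refit, and the stated 70% is pulled down to the observed 60%.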

Real-Time Injury Adjustments

We track 58 NBA star players through ESPN's injury API with a 10-minute cache. Each player has a pre-computed impact factor:

When a player is listed as OUT, the model subtracts their impact from the team's win probability. When they're QUESTIONABLE, the adjustment is halved. The total adjustment per team is capped at ±15% to prevent extreme swings.
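The adjustment rules above translate into a few lines of arithmetic. In this sketch the 8% figure for LeBron James comes from the example earlier in the post. The other players and their impact values are hypothetical. Treating the percentages as additive percentage points is an assumption, and so is the choice to let the opponent's injuries raise the home team's probability.

```python
STATUS_WEIGHT = {"OUT": 1.0, "QUESTIONABLE": 0.5}  # QUESTIONABLE is halved
MAX_TEAM_ADJUSTMENT = 0.15                          # cap per team

# Impact factors as probability points. Only the first value is from the post.
IMPACT = {
    "LeBron James": 0.08,
    "Hypothetical Player B": 0.06,
    "Hypothetical Player C": 0.05,
}


def team_injury_penalty(injuries, impact=IMPACT):
    """Sum status-weighted impact for one team, capped at the maximum."""
    total = sum(
        impact.get(player, 0.0) * STATUS_WEIGHT.get(status, 0.0)
        for player, status in injuries.items()
    )
    return min(total, MAX_TEAM_ADJUSTMENT)


def adjust_home_win_prob(p_home, home_injuries, away_injuries):
    """Lower p_home for home injuries, raise it for away injuries (assumed)."""
    shift = team_injury_penalty(away_injuries) - team_injury_penalty(home_injuries)
    return min(1.0, max(0.0, p_home + shift))


home = {"LeBron James": "OUT", "Hypothetical Player B": "QUESTIONABLE"}
penalty = team_injury_penalty(home)             # 0.08 + 0.06 / 2
adjusted = adjust_home_win_prob(0.60, home, {})

# Adding a third absence pushes the raw total to 0.16, so the cap applies.
home_capped = dict(home, **{"Hypothetical Player C": "OUT"})
capped_penalty = team_injury_penalty(home_capped)

print(round(penalty, 2), round(adjusted, 2), round(capped_penalty, 2))
```

Players with any other status, or players not in the tracked list, contribute nothing in this sketch.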

The Results

Over 175 live moneyline bot trades:

The model is genuinely better at predicting NBA games than the Polymarket crowd in certain scenarios: injury-adjusted situations, late-game states with large leads, and games involving teams with extreme offensive/defensive rating differentials.

Try It Yourself

The full prediction API is available at zenhodl.net/docs with a 7-day free trial. You get:

Or build your own from scratch with our free 6-module bot course — Module 3 covers the exact XGBoost training pipeline described here.

See our live trading results for verified, on-chain performance across all 10 sports.
