Our NBA prediction model processes live game states and outputs a calibrated win probability — updated every score change, every timeout, every substitution. Here's exactly how it works under the hood.
The Architecture
The prediction pipeline has four layers, each refining the output of the one before it:
- Base XGBoost model — trained on 40,000+ NBA game snapshots (2021-2026)
- Team stats overlay — live offensive/defensive ratings (ORtg, DRtg, pace)
- Injury overlay — 58 tracked star players via ESPN's injury API
- Isotonic calibration — post-hoc probability correction (ECE = 0.002)
When LeBron gets ruled out 30 minutes before tip-off, our system automatically adjusts the Lakers' win probability by his impact factor (8%) before the game even starts.
The Training Data
We train on every NBA game from 2021-22 through 2025-26 — approximately 5,285 games with ~300 snapshots each. Each snapshot captures:
- Score differential (home - away)
- Time remaining (seconds + period)
- Elo rating difference (our continuously-updated team strength metric)
- Pregame win probability (pre-tip baseline)
- Team stats: offensive rating, defensive rating, pace
- Momentum features: scoring runs over last 2 and 5 minutes
- Interaction terms: score_diff × time_fraction (a 10-point lead matters more with 2 minutes left than with 40 minutes left)
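As a rough illustration of how one snapshot could become a feature row, here is a minimal sketch. The field names, the `build_features` helper, and the definition of `time_fraction` as the share of regulation already played are assumptions made for this example, not the production schema; overtime is ignored.

```python
from dataclasses import dataclass

REGULATION_SECONDS = 48 * 60  # four 12-minute NBA quarters


@dataclass
class Snapshot:
    # Hypothetical field names, chosen for illustration only.
    home_score: int
    away_score: int
    seconds_remaining: float  # seconds left in regulation
    elo_diff: float           # home Elo minus away Elo
    pregame_prob: float       # pre-tip home win probability


def build_features(s: Snapshot) -> dict:
    score_diff = s.home_score - s.away_score
    # Assumed definition: 0.0 at tip-off, 1.0 at the end of regulation.
    time_fraction = 1.0 - s.seconds_remaining / REGULATION_SECONDS
    return {
        "score_diff": score_diff,
        "seconds_remaining": s.seconds_remaining,
        "elo_diff": s.elo_diff,
        "pregame_prob": s.pregame_prob,
        "time_fraction": time_fraction,
        # Interaction term: the same lead carries more weight late in the game.
        "score_diff_x_time": score_diff * time_fraction,
    }


early = build_features(Snapshot(20, 10, 40 * 60, 50.0, 0.60))
late = build_features(Snapshot(100, 90, 2 * 60, 50.0, 0.60))
```

Both snapshots have a 10-point home lead, but under this definition the interaction term is far larger with 2 minutes left than with 40 minutes left, which gives the trees a direct signal for how much a lead is worth at a given moment.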
The model is a Split-Phase XGBoost — it learns different patterns for early-game (where Elo dominance matters most) and late-game (where score differential dominates).
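One way to picture the split-phase idea is a thin router in front of two separately trained models. The sketch below assumes exactly that; the `SplitPhaseModel` class, the `boundary` cut point at half of regulation, and the two stand-in models are hypothetical, and in practice each slot would hold a trained XGBoost booster rather than a hand-written function.

```python
import math
from typing import Callable, Dict

Features = Dict[str, float]
Model = Callable[[Features], float]  # returns a home win probability


class SplitPhaseModel:
    """Route each snapshot to an early-game or late-game model."""

    def __init__(self, early: Model, late: Model, boundary: float = 0.5):
        self.early = early
        self.late = late
        self.boundary = boundary  # hypothetical cut point on time_fraction

    def predict(self, features: Features) -> float:
        is_early = features["time_fraction"] < self.boundary
        model = self.early if is_early else self.late
        return model(features)


# Stand-in models for illustration only (not trained models).
def early_stub(f: Features) -> float:
    # Early game: lean on the pregame baseline.
    return f["pregame_prob"]


def late_stub(f: Features) -> float:
    # Late game: lean on the score differential via a logistic curve.
    return 1.0 / (1.0 + math.exp(-0.3 * f["score_diff"]))


model = SplitPhaseModel(early_stub, late_stub)
p_early = model.predict({"time_fraction": 0.1, "pregame_prob": 0.62, "score_diff": -4})
p_late = model.predict({"time_fraction": 0.9, "pregame_prob": 0.62, "score_diff": 10})
```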
Why Calibration Matters More Than Accuracy
A model can be highly accurate (picks the winner 75% of the time) but terribly calibrated (when it says 70%, teams actually win 55%). For trading on prediction markets like Polymarket, calibration is everything — because you're not picking winners, you're pricing probabilities.
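Our Expected Calibration Error (ECE) is 0.002: averaged across probability bins and weighted by the number of predictions in each bin, the predicted probability differs from the observed win rate by about 0.2 percentage points. In practice, when the model says 70%, teams win very close to 70% of the time. We achieve this through: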
- Post-hoc isotonic regression on a held-out 2024-25 season calibration set
- Live rolling recalibration that auto-corrects every 25 resolved predictions
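For readers who want to compute the metric themselves, here is a minimal sketch of Expected Calibration Error with equal-width bins. The 10-bin choice and the function name are assumptions for this example; the source does not say how the 0.002 figure was binned.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted average gap between predicted probability and observed win rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_prob = sum(p for p, _ in b) / len(b)
        win_rate = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(mean_prob - win_rate)
    return ece


# The badly calibrated case from above: the model says 70%, teams win 55%.
miscalibrated = expected_calibration_error([0.7] * 20, [1] * 11 + [0] * 9)
# A well calibrated case: the model says 70%, teams win 70%.
calibrated = expected_calibration_error([0.7] * 10, [1] * 7 + [0] * 3)
```

The first call comes out at roughly 0.15 and the second at roughly 0, which is the gap the isotonic step is meant to close.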
If the model starts drifting (maybe a rule change shifts scoring patterns), the live recalibrator catches it within days and corrects automatically. No manual intervention needed.
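The rolling recalibrator can be sketched as a small class that refits a monotone correction map every 25 resolved predictions. This is an illustration under assumptions: it reuses isotonic regression (via the pool-adjacent-violators algorithm) for the live refit, and the 500-prediction window, class name, and method names are hypothetical.

```python
from bisect import bisect_left
from collections import deque


def fit_isotonic(probs, outcomes):
    """Pool-adjacent-violators fit; returns (xs, ys) with ys non-decreasing."""
    # Pool identical raw probabilities first so ties share one fitted value.
    totals = {}
    for x, y in zip(probs, outcomes):
        s, c = totals.get(x, (0.0, 0))
        totals[x] = (s + y, c + 1)

    blocks = []  # each block: [sum of outcomes, count, largest x in block]
    for x in sorted(totals):
        s, c = totals[x]
        blocks.append([s, c, x])
        # Merge backwards while an earlier block has a higher mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s2, c2, x2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += c2
            blocks[-1][2] = x2
    return [b[2] for b in blocks], [b[0] / b[1] for b in blocks]


class RollingRecalibrator:
    def __init__(self, refit_every=25, window=500):
        self.refit_every = refit_every
        self.history = deque(maxlen=window)  # window size is an assumption
        self.since_refit = 0
        self.xs, self.ys = [], []

    def record(self, raw_prob, won):
        """Store one resolved prediction; refit after every `refit_every`."""
        self.history.append((raw_prob, int(won)))
        self.since_refit += 1
        if self.since_refit >= self.refit_every:
            probs, outcomes = zip(*self.history)
            self.xs, self.ys = fit_isotonic(probs, outcomes)
            self.since_refit = 0

    def correct(self, raw_prob):
        """Map a raw probability through the latest fitted step function."""
        if not self.xs:
            return raw_prob  # nothing fitted yet
        i = min(bisect_left(self.xs, raw_prob), len(self.xs) - 1)
        return self.ys[i]


recal = RollingRecalibrator()
for i in range(25):
    recal.record(0.7, won=(i < 14))  # the model says 70%, 14 of 25 win
```

After the 25th resolved prediction, `recal.correct(0.7)` returns the observed 14/25 win rate instead of the raw 0.7.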
Real-Time Injury Adjustments
We track 58 NBA star players through ESPN's injury API with a 10-minute cache. Each player has a pre-computed impact factor:
- Nikola Jokic: 10% impact (MVP-level)
- LeBron James: 8%
- Stephen Curry: 9%
- Luka Doncic: 9%
- Role players: 2% default
When a player is listed as OUT, the model subtracts their impact from the team's win probability. When they're QUESTIONABLE, the adjustment is halved. The total adjustment per team is capped at ±15% to prevent extreme swings.
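The adjustment rule can be written out in a few lines. This sketch assumes the impact factors are percentage points of win probability, that an opponent's injuries raise the other team's probability by the same amount, and that the result is clamped to the 0 to 1 range; the function and constant names are made up for the example.

```python
# Impact factors from the list above, as fractions of win probability.
IMPACT = {
    "Nikola Jokic": 0.10,
    "LeBron James": 0.08,
    "Stephen Curry": 0.09,
    "Luka Doncic": 0.09,
}
DEFAULT_IMPACT = 0.02                              # role-player default
STATUS_WEIGHT = {"OUT": 1.0, "QUESTIONABLE": 0.5}  # QUESTIONABLE is halved
MAX_TEAM_ADJUSTMENT = 0.15                         # per-team cap


def team_penalty(injuries):
    """Total capped penalty for one team; injuries is a list of (player, status)."""
    total = sum(
        IMPACT.get(player, DEFAULT_IMPACT) * STATUS_WEIGHT.get(status, 0.0)
        for player, status in injuries
    )
    return min(total, MAX_TEAM_ADJUSTMENT)


def adjust_home_prob(home_prob, home_injuries, away_injuries):
    """Shift the home win probability for injuries on both sides (assumed symmetric)."""
    adjusted = home_prob - team_penalty(home_injuries) + team_penalty(away_injuries)
    return min(max(adjusted, 0.0), 1.0)


lebron_out = adjust_home_prob(0.60, [("LeBron James", "OUT")], [])
lebron_questionable = adjust_home_prob(0.60, [("LeBron James", "QUESTIONABLE")], [])
```

With LeBron OUT, a 60% home probability drops to about 52%; listed as QUESTIONABLE, it drops to about 56%.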
The Results
Over 175 live moneyline bot trades:
- Win rate: 66.3%
- Average edge at entry: 8-12 cents
- Best performing entry range: 55-70c (slight favorites)
- Worst performing range: 45-55c (toss-ups — the model has no edge here)
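To make "edge" concrete: it is the gap between the model's probability and the market price, in cents on a contract that pays 100c. The sketch below turns the figures above into a hypothetical entry filter. The 8-cent minimum and the 55-70c band are taken from the results list as illustrative thresholds, not a description of the bot's actual rules.

```python
def edge_cents(model_prob, market_price_cents):
    """Model fair value minus market price, in cents per 100c contract."""
    return model_prob * 100.0 - market_price_cents


def should_enter(model_prob, market_price_cents,
                 min_edge=8.0, low=55.0, high=70.0):
    """Hypothetical filter: require enough edge inside the chosen price band."""
    in_band = low <= market_price_cents <= high
    return in_band and edge_cents(model_prob, market_price_cents) >= min_edge


favorite = should_enter(0.72, 62.0)  # about 10c of edge on a slight favorite
toss_up = should_enter(0.60, 50.0)   # about 10c of edge, but outside the band
```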
The model is genuinely better at predicting NBA games than the Polymarket crowd in certain scenarios: injury-adjusted situations, late-game states with large leads, and games involving teams with extreme offensive/defensive rating differentials.
Try It Yourself
The full prediction API is available at zenhodl.net/docs with a 7-day free trial. You get:
- Live win probabilities for every NBA game via /v1/games?sport=NBA
- Edge detection signals via /v1/edges
- Historical prediction accuracy via /v1/predictions/latest
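As a starting point, the request URLs for these endpoints can be assembled like this. The base URL is an assumption (the API host may differ from the docs host), and authentication and response formats are not shown because they are not described here; check zenhodl.net/docs for the real details.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://zenhodl.net"  # assumption: the API may live on another host


def endpoint_url(path, **params):
    """Join the base URL and an endpoint path, adding any query parameters."""
    url = urljoin(BASE_URL, path)
    if params:
        url += "?" + urlencode(params)
    return url


games_url = endpoint_url("/v1/games", sport="NBA")
edges_url = endpoint_url("/v1/edges")
latest_url = endpoint_url("/v1/predictions/latest")
```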
Or build your own from scratch with our free 6-module bot course — Module 3 covers the exact XGBoost training pipeline described here.
See our live trading results for verified, on-chain performance across all 10 sports.