We run 7 automated prediction bots on Polymarket. They trade NBA, NFL, MLB, NHL, tennis, CS2, and LoL markets using calibrated win probability models. Every trade is settled on-chain and verifiable on PolygonScan.
This post covers our actual results, the strategy architecture, what works, and what we learned losing money before we started making it.
The Numbers
As of April 2026, across all sports:
- 938+ resolved trades (shadow mode and backfill excluded)
- 58% win rate, from a calibrated model (when we predict 60%, the team wins ~60% of the time)
- +$2,400 cumulative P&L from a $500 starting bankroll
- +5.4 cents per trade average edge (net of Polymarket fees)
- 7 sports actively traded
These aren't backtest numbers. Every trade has a Polymarket order ID and an on-chain transaction hash. You can verify them at zenhodl.net/results or directly on the blockchain.
The Strategy: Calibrated Edge Detection
Most sports bettors think about "picking winners." That's the wrong frame. On a prediction market, the question isn't "who wins?" but "is this price wrong?"
How We Find Mispricings
- Build a win probability model for each sport using historical play-by-play data (40,000+ games across all sports)
- Calibrate it so that predictions match reality (when the model says 70%, the team actually wins 70% of the time)
- Compare our fair probability to the market price in real time
- Trade when the gap exceeds our minimum edge (typically 5-8 cents depending on the sport)
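The core comparison is only a few lines. Here is a minimal sketch of the edge check described above, not ZenHodl's actual implementation; the 5-cent default threshold matches the NBA figure later in the post, and the function names are ours:

```python
def edge_cents(fair_prob: float, yes_price: float) -> float:
    """Edge in cents when buying YES: model fair probability minus market price."""
    return (fair_prob - yes_price) * 100


def should_trade(fair_prob: float, yes_price: float,
                 min_edge_cents: float = 5.0) -> bool:
    """Trade only when the model-vs-market gap clears the sport's
    minimum edge threshold."""
    return edge_cents(fair_prob, yes_price) >= min_edge_cents
```

For example, a model fair probability of 68% against a YES price of $0.60 is 8 cents of edge and passes a 5-cent bar; 63% against $0.60 is only 3 cents and gets skipped.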
The key insight: calibration matters more than accuracy. A model with 65% accuracy but poor calibration will lose money because it confidently mispredicts the probabilities. A model with 58% accuracy but excellent calibration will make money because every probability estimate is trustworthy.
We learned this the hard way. Our NBA model had 65% accuracy on backtests and was losing money in live trading. The fix wasn't a better model. It was fixing a calibration bug that had our ECE (Expected Calibration Error) at 0.074 instead of 0.002. After recalibration, the same model went from losing money to our best performer. We wrote about this in detail: Calibration Beats Accuracy.
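ECE, the metric referenced here, has a standard equal-width-bin definition: bucket predictions by probability, then take the weighted average gap between each bucket's mean prediction and its observed win rate. A self-contained sketch:

```python
import numpy as np


def expected_calibration_error(probs, outcomes, n_bins=10):
    """ECE: weighted average gap between mean predicted probability and
    observed win rate within equal-width probability bins."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i in range(n_bins):
        lo, hi = edges[i], edges[i + 1]
        # last bin is closed on the right so p == 1.0 is counted
        mask = (probs >= lo) & ((probs < hi) if i < n_bins - 1 else (probs <= hi))
        if mask.any():
            gap = abs(probs[mask].mean() - outcomes[mask].mean())
            ece += mask.mean() * gap
    return float(ece)
```

A model that predicts 70% on games it actually wins 70% of the time scores near zero; one that predicts 90% on games it wins half the time scores around 0.4.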
The Edge Lifecycle
Every trade goes through this lifecycle:
1. Our ESPN API poller detects a live game-state change (score update, injury report)
2. Our model recalculates fair win probability in <100ms
3. We compare to Polymarket's current YES/NO token prices
4. If edge > minimum threshold → place a limit order
5. If filled → hold to settlement (we never sell early)
6. Game ends → Polymarket resolves → we collect $1 per winning token, $0 per losing token
Hold to settlement is a deliberate strategy choice. We tested selling early when our model updates mid-game, but the transaction costs and timing risk ate more edge than they captured. Holding to settlement is simpler and more profitable for our edge profile.
What Works By Sport
NBA (Best performer)
Our NBA model uses 16 features including live score differential, time remaining, Elo ratings, and team-level offensive/defensive efficiency (ORtg, DRtg, pace). The injury overlay tracks 58 specific players and adjusts probabilities when key stars sit out.
- Brier score: 0.124 (after isotonic recalibration)
- ECE: 0.002 (nearly perfect calibration)
- The model excels in the 3rd and 4th quarters when the market is slowest to react to momentum shifts
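The Brier score quoted above is just mean squared error between predicted win probabilities and binary outcomes. A sketch, for readers who want to reproduce the metric on their own predictions:

```python
import numpy as np


def brier_score(probs, outcomes) -> float:
    """Brier score: mean squared error between predicted win probability
    and the 0/1 outcome. Lower is better; always predicting 50% scores 0.25."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((probs - outcomes) ** 2))
```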
NFL (Seasonal)
NFL games are fewer but higher-value. Each game has more market liquidity because of public interest. Our edge comes from in-game adjustments that the market is slow to price: turnovers, field position after punts, and red zone efficiency.
MLB (Growing edge)
Baseball is our newest sport. The model incorporates starting pitcher ERA, bullpen quality (30-team rolling ERA), park factors (Coors Field vs. Oracle Park makes a real difference), and the run expectancy matrix from the Tango tables.
Esports (CS2 + LoL)
Esports markets are less efficient than traditional sports because there are fewer sophisticated traders. Our CS2 model includes team map-specific win rates, recent form, and head-to-head history. LoL includes dragon soul probability, inhibitor state, and gold differential (the single most predictive feature at 39% importance).
What We Learned Losing Money
Lesson 1: Backtests Lie (Sometimes)
Our first 3 months were unprofitable despite strong backtests. The problem: backtests used historical closing prices, but live trading faces slippage and timing risk. A 5-cent edge on paper can become a 2-cent edge (or negative) after:
- Slippage: the price moves between signal and fill
- Market impact: your order itself moves the price (especially in thin esports markets)
- Stale data: ESPN score updates can lag 10-30 seconds behind reality
The fix: we added slippage modeling to our backtests and raised our minimum edge threshold from 3 cents to 5-8 cents. Fewer trades, but profitable ones.
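The cost haircut can be made explicit in a backtest. A sketch with illustrative slippage and fee figures rather than measured ones:

```python
def net_edge_cents(paper_edge_cents: float,
                   slippage_cents: float = 2.0,
                   fee_cents: float = 1.0) -> float:
    """Haircut a backtest ("paper") edge by assumed execution costs.
    The 2c slippage and 1c fee defaults are illustrative assumptions."""
    return paper_edge_cents - slippage_cents - fee_cents


def survives_costs(paper_edge_cents: float, min_edge_cents: float = 5.0) -> bool:
    """A signal is worth trading only if its cost-adjusted edge still
    clears the minimum threshold."""
    return net_edge_cents(paper_edge_cents) >= min_edge_cents
```

Under these assumptions a 5-cent paper edge nets only 2 cents, which is exactly the shrinkage described above.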
Lesson 2: Calibration Drift Is Real
A model trained on 2023-24 data slowly becomes less accurate as teams change rosters, coaching strategies evolve, and league rules shift. We now run a live recalibrator that continuously refits an isotonic regression on the most recent 500 resolved predictions. This auto-corrects drift without full retraining.
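A rolling isotonic recalibrator of this kind can be sketched with scikit-learn. The 500-prediction window matches the post; the class interface, the 50-point minimum before the first refit, and refitting on every new resolution are our own illustrative choices:

```python
from collections import deque

from sklearn.isotonic import IsotonicRegression


class LiveRecalibrator:
    """Continuously refit an isotonic regression on the most recent
    resolved predictions to correct calibration drift."""

    def __init__(self, window: int = 500):
        self.history = deque(maxlen=window)  # (raw_prob, outcome) pairs
        self.iso = None

    def record(self, raw_prob: float, outcome: int) -> None:
        """Add a resolved prediction and refit once enough data exists."""
        self.history.append((raw_prob, outcome))
        if len(self.history) >= 50:
            probs, outcomes = zip(*self.history)
            self.iso = IsotonicRegression(y_min=0.0, y_max=1.0,
                                          out_of_bounds="clip")
            self.iso.fit(probs, outcomes)

    def calibrate(self, raw_prob: float) -> float:
        """Map a raw model probability through the fitted curve."""
        if self.iso is None:
            return raw_prob  # not enough data yet; pass through
        return float(self.iso.predict([raw_prob])[0])
```

Fed a stream where the raw model says 90% but the teams only win 60% of the time, the recalibrator learns to pull 0.9 down toward 0.6 without retraining the underlying model.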
Lesson 3: Injuries Are Alpha
The market consistently underreacts to injury news. When a star player is ruled out 30 minutes before game time, the Polymarket price adjusts slowly (sometimes over 10-15 minutes). Our injury overlay detects ESPN injury report changes within 60 seconds and recalculates the fair probability immediately. That 10-minute window is where a lot of our edge lives.
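The detection step reduces to diffing successive polls of the injury report. A sketch, assuming a simple `{player: status}` snapshot format (the real ESPN payload is richer):

```python
def injury_report_changes(previous: dict, current: dict) -> list:
    """Return (player, old_status, new_status) for every player whose
    injury status changed between two polls. A newly listed player has
    an old_status of None."""
    return [
        (player, previous.get(player), status)
        for player, status in current.items()
        if previous.get(player) != status
    ]
```

Any non-empty result would trigger an immediate fair-probability recalculation for the affected games.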
Lesson 4: Don't Trade Everything
We initially traded every signal above 3 cents of edge. This led to lots of low-conviction trades that collectively lost money. The fix was simple: raise the bar. We now require:
- NBA: minimum 5c edge
- NHL: minimum 8c edge (less liquid, wider spreads)
- MLB: minimum 6c edge
- Esports: minimum 8c edge (thinner markets)
Higher thresholds mean fewer trades but dramatically better win rates and P&L per trade.
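These per-sport bars translate directly into a small config gate. A sketch; CS2 and LoL share the 8-cent esports bar from the list above, and the conservative fallback for sports without a listed threshold (e.g. NFL) is our own assumption:

```python
# Minimum edge per sport, in cents, from the thresholds listed above.
MIN_EDGE_CENTS = {"nba": 5.0, "nhl": 8.0, "mlb": 6.0, "cs2": 8.0, "lol": 8.0}


def passes_threshold(sport: str, edge_cents: float) -> bool:
    """Gate a signal on its sport's minimum edge; unknown sports fall
    back to the most conservative bar (an assumption, not from the post)."""
    return edge_cents >= MIN_EDGE_CENTS.get(sport, 8.0)
```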
The Technical Stack
For those interested in the engineering:
- Models: XGBoost + isotonic calibration, retrained monthly, live recalibrator runs continuously
- Data: ESPN live scores (5-second polling), The Odds API (real-time sportsbook lines), team stats cache (updated daily)
- Execution: Python asyncio bots running on a single VPS ($48/month)
- Settlement: Hold to settlement, no early exit
- Verification: Every trade logged to an append-only JSONL ledger, wallet address public on PolygonScan
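An append-only JSONL ledger like the one described can be as simple as one JSON object per line, written and never rewritten. A sketch with an illustrative record schema (the post does not specify the actual field names):

```python
import json
import time


def log_trade(path: str, trade: dict) -> None:
    """Append one trade record to the append-only JSONL ledger.
    Opening in mode "a" means existing lines are never modified."""
    record = {"ts": time.time(), **trade}
    with open(path, "a") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
```

Because each line is a complete JSON object, the ledger can be tailed, replayed, or audited line by line against the on-chain transaction hashes.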
The entire system runs on about 200 lines of core trading logic per sport. The hard part isn't the code. It's the calibration.
Can You Do This?
Yes, with caveats:
What you need:
1. A prediction model with ECE < 0.01 (calibration is non-negotiable)
2. Real-time data feeds (ESPN API, odds feeds)
3. A Polymarket account with USDC deposited
4. Patience to paper-trade for 50+ signals before risking real money
What you don't need:
- A PhD in statistics (XGBoost + isotonic regression is ~20 lines of scikit-learn)
- A powerful GPU (our models train on CPU in 5-10 minutes)
- A team (we're a solo operation)
Two ways to get started:
- Build it yourself: Our 6-module course walks through the full pipeline from ESPN scraping to live deployment. $49, one-time purchase, all code included.
- Use the API: ZenHodl's prediction API provides real-time fair probabilities for all 7 sports. Plug them into your own execution layer. 7-day free trial, $49/month after.
See our full live results: zenhodl.net/results — updated every 60 seconds, verified on-chain.