Betting Strategies With Calibrated ML Probabilities (Without Going Broke)

2026-05-11 strategy machine-learning kelly-criterion calibration prediction-markets intermediate

Most bettors who get their hands on machine-learning probabilities still lose money. The model is not the problem. The strategy on top of it is.

This post is the playbook we use to convert calibrated probabilities into trades on prediction markets. Nothing here is exotic. All of it is the result of pruning the strategies that did not work and ruthlessly keeping the ones that did. If you have a probability source you trust — yours or one of ours — and you want to know how to actually bet with it, this is the order of operations.

Step Zero: Trust the Probability Before You Bet On It

If your probability source is not calibrated, no strategy will save you. Calibration means that when the model says 62%, the team actually wins 62 out of 100 times. You verify this with Expected Calibration Error (ECE) — a single number that summarizes how much the predicted probabilities deviate from observed frequencies.

Our threshold is ECE under 0.05. Below that, the model is safe to size positions against. Above that, the probability is misleading and Kelly sizing will over-bet — which, as the Kelly Criterion post explains, is the single most common way to go broke even with a real edge.
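As a rough sketch of how that check can be computed, the snippet below bins predictions into equal-width probability buckets and takes the sample-weighted average gap between mean predicted probability and observed win rate. The function name, the 10-bin default, and the toy data are assumptions for illustration, not necessarily the exact procedure behind the 0.05 threshold.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |avg prediction - win rate| per bin."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_pred = sum(p for p, _ in bucket) / len(bucket)
        win_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_pred - win_rate)
    return ece


# Toy data (made up): ten predictions of 0.62, six of which actually won.
probs = [0.62] * 10
outcomes = [1] * 6 + [0] * 4
ece = expected_calibration_error(probs, outcomes)
safe_to_size = ece < 0.05  # threshold from the text above
```

With real data you would want thousands of settled predictions per sport before trusting the number; a handful of games per bin makes ECE noisy.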

If you are using a third-party API, ask the provider for their ECE per sport. If they do not publish it, assume it is not calibrated and treat the probabilities as point predictions only — useful for picking sides, useless for sizing.

Step One: Compute Edge Honestly

Edge is fair probability minus market price (in cents). If the model says 62% and the market is asking 54 cents, the edge is 8 cents per share.

That definition is correct in a vacuum. In practice you have to subtract costs:

effective_edge = fair_prob - market_price - taker_fee - expected_slippage

On Polymarket, the taker fee is around 2 cents on most markets and the typical slippage is 1-3 cents depending on market depth. So an 8-cent raw edge becomes a 3-5 cent effective edge after costs. Anything under 5 cents raw is rarely worth trading.

This is why our bots all have a min_edge_c floor between 5 and 12 cents depending on sport. Below the floor, the costs eat the edge faster than the edge accrues.
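The arithmetic above can be sketched in a few lines. The function names are hypothetical, prices are in cents, and the floor is assumed to be compared against the raw edge (the text quotes the floor in raw cents); adjust if your floor is meant to apply after costs.

```python
def raw_edge_c(fair_prob, market_price_c):
    """Raw edge in cents per share: fair probability (0-1) scaled to cents, minus the ask."""
    return round(fair_prob * 100.0 - market_price_c, 6)


def effective_edge_c(fair_prob, market_price_c, taker_fee_c, expected_slippage_c):
    """Raw edge minus per-share costs, all in cents."""
    raw = raw_edge_c(fair_prob, market_price_c)
    return round(raw - taker_fee_c - expected_slippage_c, 6)


def should_trade(fair_prob, market_price_c, min_edge_c):
    """Assumption: the min_edge_c floor is checked against the raw edge."""
    return raw_edge_c(fair_prob, market_price_c) >= min_edge_c


# Worked example from the text: 62% fair value against a 54-cent ask.
raw = raw_edge_c(0.62, 54.0)                         # 8 cents
best_case = effective_edge_c(0.62, 54.0, 2.0, 1.0)   # 5 cents with 1c slippage
worst_case = effective_edge_c(0.62, 54.0, 2.0, 3.0)  # 3 cents with 3c slippage
```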

Step Two: Filter With an Edge Band, Not Just a Floor

A 30-cent detected edge sounds great. It is usually a model error.

We learned this the painful way. Our tennis bot was happily entering high-edge trades for months — only to discover that ATP edges between 20 and 25 cents were running 31% win rate at -$10 P&L over 13 trades, while edges between 15 and 20 cents were running 59% WR at +$24. The high-edge bucket was full of stale model outputs from the moments when the live scoreboard had not yet caught up to a swing.

The fix was a max_edge_c ceiling. Below the floor: not enough margin. Above the ceiling: probably the model being wrong. The middle band is where real edge lives.
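One way to find and apply a band like this is to bucket settled trades by detected edge, look at win rate and P&L per bucket, and then filter new signals against the resulting floor and ceiling. The sketch below assumes 5-cent buckets, a ceiling that is exclusive, and made-up trades; none of these numbers are our production values.

```python
from collections import defaultdict


def in_edge_band(edge_c, min_edge_c, max_edge_c):
    """True when the edge clears the floor and stays under the ceiling."""
    return min_edge_c <= edge_c < max_edge_c


def summarize_by_edge_bucket(trades, width_c=5):
    """Group settled trades by edge bucket.

    trades: iterable of (edge_c, won, pnl_dollars).
    Returns {bucket_start_c: (n_trades, win_rate, total_pnl)}.
    """
    buckets = defaultdict(list)
    for edge_c, won, pnl in trades:
        start = int(edge_c // width_c) * width_c
        buckets[start].append((won, pnl))
    summary = {}
    for start in sorted(buckets):
        rows = buckets[start]
        wins = sum(1 for won, _ in rows if won)
        summary[start] = (len(rows), wins / len(rows), sum(pnl for _, pnl in rows))
    return summary


# Made-up trades for illustration: (edge in cents, won?, P&L in dollars)
toy_trades = [
    (16, True, 4.0), (18, True, 3.5), (17, False, -5.0),
    (22, False, -5.5), (24, False, -5.0), (21, True, 3.0),
]
summary = summarize_by_edge_bucket(toy_trades)
take_signal = in_edge_band(18, min_edge_c=8, max_edge_c=20)
skip_signal = in_edge_band(22, min_edge_c=8, max_edge_c=20)
```

Small buckets are noisy, so treat a ceiling derived from a dozen trades as a hypothesis to keep re-checking rather than a settled parameter.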

Every sport has a different sweet spot. Our published CLV dashboard shows the edge band that is currently profitable for each sport. In recent data the shape is consistent: positive CLV in the middle, negative on both wings.

Step Three: Size With Quarter-Kelly, Not Full Kelly

Kelly Criterion gives you the bet size that maximizes long-term geometric growth. Full Kelly is mathematically optimal and practically suicidal — it assumes your probability is exactly correct, which it never is.

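Quarter-Kelly (multiply the Kelly fraction by 0.25) is the professional standard. Under the usual continuous approximation, a fraction c of Kelly earns about c(2 - c) of the full-Kelly growth rate, so quarter-Kelly keeps roughly 44% of theoretical growth while cutting the variance of returns by roughly 94%. (Giving up only 25% of growth is the half-Kelly trade-off.) That reduction in swings is what takes most of the ruin risk off the table. The full math is in the Kelly post; the short version is: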

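def quarter_kelly_fraction(fair_prob, market_price):
    """Fraction of bankroll to stake on a binary contract; both inputs are on a 0-1 scale."""
    edge = fair_prob - market_price
    if edge <= 0:
        return 0.0  # no edge, no bet
    profit_per_dollar = 1.0 - market_price  # profit on each winning share
    # Full Kelly is edge / (1 - price); quarter-Kelly stakes 25% of that.
    return 0.25 * edge / profit_per_dollar

bankroll = 1000
fraction = quarter_kelly_fraction(0.62, 0.54)  # ~0.043
bet_size = bankroll * fraction                  # ~$43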

If your gut says bet $200, quarter-Kelly probably says $40. Trust the math.
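To see why full Kelly punishes a probability that is slightly off, here is a small sketch comparing expected log growth per bet when the model overstates the true win probability. The 5-point overstatement is a hypothetical scenario, the function names are ours for illustration, and fees and slippage are ignored.

```python
import math


def kelly_fraction(prob, price):
    """Full-Kelly stake for a binary contract bought at `price` (0-1 scale)."""
    return max(0.0, (prob - price) / (1.0 - price))


def log_growth(stake_fraction, true_prob, price):
    """Expected log growth per bet when staking `stake_fraction` of bankroll."""
    win_multiple = 1.0 + stake_fraction * (1.0 - price) / price  # each share pays $1
    lose_multiple = 1.0 - stake_fraction                         # the stake is lost
    return (true_prob * math.log(win_multiple)
            + (1.0 - true_prob) * math.log(lose_multiple))


price = 0.54
model_prob = 0.62  # what the model reports
true_prob = 0.57   # hypothetical: the model overstates by 5 points

full = kelly_fraction(model_prob, price)
quarter = 0.25 * full

growth_full = log_growth(full, true_prob, price)        # negative in this scenario
growth_quarter = log_growth(quarter, true_prob, price)  # still positive
```

In this scenario the trade still has a real 3-cent edge, yet sizing it at full Kelly off the overstated probability shrinks the bankroll over time, while the quarter-Kelly stake keeps growing it.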

Step Four: Pick a Strategy Archetype

Once you have an edge and a size, you still have to pick how to be in the market. Four archetypes work on prediction markets:

Pre-game value betting. Take the model's pre-game probability, compare it to the line a few hours before tipoff, fire if the edge clears your threshold, and hold to settlement. Lowest-touch strategy. Works best for sports where the model has a stable pre-game edge — NCAAMB and NHL in our data.

In-play moneyline. Poll live game state every 5-30 seconds. The model updates its win probability as the score and time evolve. When the market overreacts to a momentum swing — a long run, a goal, a turnover — the gap between fair probability and market price widens. Buy the side the market is leaving behind. Hold to settlement. Most of our 11 production bots run this archetype.

Hold-to-settlement. Once you enter, do not exit until the contract resolves to 0 or 100. Eliminates exit-side adverse selection (you are not trying to sell on Polymarket's thin late-game books) and sidesteps the temptation to cut winners early. This is our default. We have tested mean-reversion exits and they consistently lose to hold-to-settlement on the same entries.

Two-sided sync (hedge accumulator). Buy BOTH sides of a contract when both have dipped from a trailing high within a short window. If your pair cost, fees included, is under 100 cents, the profit is locked in at resolution regardless of outcome, because exactly one side pays out 100. Specialty strategy. Our hedge accumulator bot earns $7-13 per session this way on NCAAMB.
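The two-sided condition reduces to simple arithmetic: exactly one side resolves to 100 cents, so a pair is profitable only when both purchase prices plus fees sum to less than 100. A minimal sketch, assuming a hypothetical 3-cent dip trigger and a 2-cent fee per side (both illustrative, not our bot's actual parameters):

```python
def locked_profit_c(yes_price_c, no_price_c, fee_per_side_c=0.0):
    """Profit in cents locked in per YES+NO pair; negative means a guaranteed loss."""
    pair_cost_c = yes_price_c + no_price_c + 2 * fee_per_side_c
    return 100.0 - pair_cost_c


def dipped_from_high(recent_prices_c, min_dip_c):
    """True when the latest price sits at least min_dip_c below the window's high."""
    return max(recent_prices_c) - recent_prices_c[-1] >= min_dip_c


def should_buy_pair(yes_window_c, no_window_c, min_dip_c=3.0, fee_per_side_c=2.0):
    """Both sides have dipped AND the pair still locks in a profit after fees."""
    both_dipped = (dipped_from_high(yes_window_c, min_dip_c)
                   and dipped_from_high(no_window_c, min_dip_c))
    profit = locked_profit_c(yes_window_c[-1], no_window_c[-1], fee_per_side_c)
    return both_dipped and profit > 0


# Made-up price windows in cents, oldest first.
buy = should_buy_pair([52, 50, 47], [50, 49, 46])   # pair costs 93 + 4 in fees
skip = should_buy_pair([52, 50, 49], [51, 50, 50])  # NO side has barely dipped
```

The sketch assumes both legs fill at the last observed price; in a thin book the second leg can slip enough to erase the margin.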

Pick one. Master it. Then add a second.

Step Five: Risk Controls That Compound

Edge filters and position sizing are not enough. You also need controls that limit how bad a bad day can get.

Drawdown-aware sizing. When session P&L is negative, scale position sizes down. Recovery rate stays the same; the size of the hole you can dig stays bounded. We use a piecewise scaler — full size above $0, 0.75x below -$10, 0.5x below -$25.
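A piecewise scaler like the one described can be sketched as a single function. The text does not say what happens between $0 and -$10, so this version assumes full size until the -$10 threshold is crossed; the function name is hypothetical.

```python
def drawdown_scale(session_pnl):
    """Position-size multiplier based on session P&L in dollars.

    Assumption: sizes stay at 1.0x until P&L falls below -$10.
    """
    if session_pnl < -25.0:
        return 0.5
    if session_pnl < -10.0:
        return 0.75
    return 1.0


base_bet = 40.0
scaled_bet = base_bet * drawdown_scale(-12.0)  # 0.75x of the base size
```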

Circuit breaker. When rolling 30-day ROI for a sport drops below -5%, automatically disable the sport. Re-enable when it recovers above 0%. Self-healing. Our public circuit breaker dashboard shows current state.
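The two thresholds form a hysteresis loop: a sport that trips at -5% does not come back at -4%, only once ROI is above 0%. Here is a minimal sketch with a hypothetical class name; it assumes a rolling 30-day ROI keeps being computed while the sport is disabled (for example as old trades age out of the window or from paper trades).

```python
class SportCircuitBreaker:
    """Disable a sport below one ROI threshold, re-enable above a higher one."""

    def __init__(self, disable_below=-0.05, enable_above=0.0):
        self.disable_below = disable_below
        self.enable_above = enable_above
        self.enabled = True

    def update(self, rolling_roi):
        """Feed the latest rolling ROI (as a fraction); returns whether trading is allowed."""
        if self.enabled and rolling_roi < self.disable_below:
            self.enabled = False
        elif not self.enabled and rolling_roi > self.enable_above:
            self.enabled = True
        return self.enabled


breaker = SportCircuitBreaker()
# Made-up ROI readings: healthy, tripped, partial recovery, full recovery.
states = [breaker.update(roi) for roi in (0.02, -0.06, -0.02, 0.01)]
```

The gap between the two thresholds is what stops the breaker from flapping on and off when ROI hovers near a single cutoff.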

Daily kill-switch. Hard cutoff if a single session loses more than a defined amount. Pairs with the circuit breaker — one is intra-session, the other is multi-day.

CLV monitoring. Closing Line Value, the gap between your entry price and the market's closing price, is the leading indicator that win rate confirms in retrospect. If your 7-day CLV drops below your 30-day CLV by more than 2 cents, your model is probably degrading. Retrain or recalibrate before P&L confirms it.
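The 7-day versus 30-day comparison can be sketched as follows. CLV here is taken as closing price minus entry price for a buy, in cents; the function names and the sample values are illustrative assumptions.

```python
from statistics import mean


def clv_c(entry_price_c, closing_price_c):
    """CLV for a buy: positive when the market closed above your entry price."""
    return closing_price_c - entry_price_c


def clv_degrading(clv_7d, clv_30d, max_gap_c=2.0):
    """True when the 7-day average CLV trails the 30-day average by more than max_gap_c."""
    return mean(clv_30d) - mean(clv_7d) > max_gap_c


# Made-up per-trade CLV values in cents.
clv_last_30d = [3, 4, 5, 4]
clv_last_7d = [1, 2, 1]
needs_recalibration = clv_degrading(clv_last_7d, clv_last_30d)
```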

Step Six: Stop Trading Manually Where Bots Can Run

Once a strategy works, automate it. Manual execution introduces three failure modes:

You miss signals. Markets move in milliseconds; humans react in seconds. By the time you click, the price is gone or worse.

You override. The single biggest source of bot underperformance among our trial users is humans flipping the wrong switch — closing winners early, doubling down on losers, skipping signals because they "have a bad feeling." The strategy is the strategy. The whole point of writing it down was to remove the discretion.

You burn out. Watching markets for hours per day is unsustainable. The bots that run 24/7 do not get tired, do not eat dinner, do not need to sleep.

We sell a $49 bot course that walks through this exact pipeline if you want to build your own.

What Does Not Work

We have killed more strategies than we run. The graveyard:

Mean-reversion taker bots. Buy dips, sell spikes. Negative EV on prediction markets because the moves are usually information-driven, not noise.

Compression sniping. Wait for the spread to compress, fade the breakout. Adverse selection. The traders compressing the spread know more than you do.

Double-down on losers. Martingale variants are catastrophe machines. Always.

SPREAD/TOTAL taker. Mean-reversion exits do not work on SPREAD/TOTAL because points already scored are permanent: a 10-point swing is information, not noise. Fix: hold-to-settlement on SPREAD only. We excluded TOTAL after a backtest showed -5.2c per trade on NBA.

Trading without published ECE. If you do not know your model's calibration, you are gambling with extra steps.

The Bottom Line

A calibrated probability is necessary but not sufficient. The strategy on top of it has to respect costs, cap edge bands at both ends, size with quarter-Kelly, hold to settlement, and survive bad days through circuit breakers and drawdown scalers. The strategies that survive are usually boring. The exciting ones lose money.

If you have a probability source you trust, work this checklist top to bottom. The compounding starts when the entire stack is in place — not when any single piece is perfect.


Live calibrated probabilities for 11 sports at zenhodl.net/v1/try. Real-time CLV monitoring at zenhodl.net/clv. Full bot course included with every API plan at zenhodl.net/pricing.
