Collecting · Pre-committed · NBA

ZenHodl vs Polymarket Consensus

This calibration benchmark covers the NBA Playoffs 2026, from the Conference Semifinals through the Finals (inclusive). Both sources are scored on the same eligible games, with the same snapshot rule and the same metrics. The page is updated as predictions and results are written.

Sibling benchmark on the same NBA games: ZenHodl vs Pinnacle Closing Line →

Pre-commitment proof

Served file hash matches on-chain commit
Manifest SHA-256 (served right now)
e7b5870fbc6e961a717d25aec9625c0ee050414fb909ebf030702c1411335b2e

Reproduce: curl -s https://zenhodl.net/benchmarks/nba-playoffs-2026/manifest.json | sha256sum

On-chain commitment
Transaction: 0x4cb91882e88052b05fec02a9d65859ca3b570f470da4693683dc14f5969042fb
on-chain SHA: e7b5870fbc6e961a717d25aec9625c0ee050414fb909ebf030702c1411335b2e

Block 85956686 on Polygon. Broadcast 2026-04-24T12:23:34 UTC. The hash above appears in the tx's data field.
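
The comparison described above can also be scripted. What follows is a minimal sketch, assuming the manifest bytes and the transaction's data field have already been fetched separately (for example with the curl command above and a Polygon block explorer). The helper names `sha256_hex` and `commit_matches` are hypothetical. The page does not say whether the hash is stored in the data field as raw bytes or as ASCII hex text, so the sketch accepts either.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex (what sha256sum prints)."""
    return hashlib.sha256(data).hexdigest()


def commit_matches(manifest_bytes: bytes, tx_data_hex: str) -> bool:
    """Return True if the manifest's SHA-256 appears in the transaction data.

    Assumption: the data field embeds either the 32 raw digest bytes or the
    64-character hex string encoded as ASCII. Both encodings are checked.
    """
    digest = sha256_hex(manifest_bytes)
    payload = tx_data_hex.lower()
    if payload.startswith("0x"):
        payload = payload[2:]
    ascii_form = digest.encode("ascii").hex()  # hex of the ASCII hex string
    return digest in payload or ascii_form in payload
```

If the served manifest is edited after the broadcast, `sha256_hex` returns a different digest and `commit_matches` returns False for the original transaction.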

Live leaderboard

n=22 resolved · raw=24 · last refresh 00:35:01 UTC
ZenHodl
Production NBA model · ZenHodl pregame win probability via internal SignalEngine.get_pregame_predictions('NBA')
ECE (lower is better): 0.235 · 95% CI [0.058, 0.397]
Brier: 0.230
Log loss: 0.651
Accuracy: 72.7%
Polymarket Consensus
Mid-price at snapshot time · YES-side (best bid + best ask) / 2 on the Polymarket NBA game-winner market
ECE (lower is better): 0.197 · 95% CI [0.121, 0.368]
Brier: 0.192
Log loss: 0.562
Accuracy: 68.2%

Reliability diagram

Predicted probability vs actual home-win rate, in 10 equal-width bins. Diagonal = perfect calibration.

Each marker is one bin's average. Marker size scales with the number of games in the bin. Points above the diagonal mean the home team won more often than predicted in that bucket (predictions too low); points below mean it won less often than predicted (predictions too high). The closer the points hug the diagonal across the chart, the better calibrated the model. A small y-jitter (±0.012) is applied so that ZenHodl (offset up) and Polymarket Consensus (offset down) markers remain distinguishable when both share the same observed rate in a bin; hover over any point for the true value.
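
The binning behind a diagram like this can be sketched as follows. This is an illustration under stated assumptions, not the site's plotting code: the function name `reliability_bins` is hypothetical, and the sketch assumes each bin includes its lower edge, with a probability of exactly 1.0 placed in the last bin.

```python
def reliability_bins(probs, outcomes, n_bins=10):
    """Group predictions into equal-width bins.

    probs:    predicted home-win probabilities in [0, 1]
    outcomes: 1 if the home team won, 0 otherwise
    Returns one (mean_prediction, observed_home_win_rate, count) tuple
    per non-empty bin, ordered from the lowest bin to the highest.
    """
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        acc[idx][0] += p
        acc[idx][1] += y
        acc[idx][2] += 1
    return [(sp / c, sy / c, c) for sp, sy, c in acc if c > 0]
```

Each returned tuple corresponds to one marker: the first value is the x position, the second is the y position before jitter, and the count drives the marker size.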

Resolved games

Game       Date        Score (away-home)  ZenHodl WP  Polymarket Consensus WP  Outcome  ZH Brier  Polymarket Consensus Brier
CLE @ NYK  2026-05-20  104-115            0.478 ✗     0.675 ✓                  NYK W    0.273     0.106
SAS @ OKC  2026-05-19  122-115            0.294 ✓     0.675 ✗                  SAS W    0.086     0.456
CLE @ DET  2026-05-18  125-94             0.456 ✓     0.635 ✗                  CLE W    0.207     0.403
SAS @ MIN  2026-05-16  139-109            0.456 ✓     0.325 ✓                  SAS W    0.208     0.106
DET @ CLE  2026-05-15  115-94             0.456 ✓     0.615 ✗                  DET W    0.208     0.378
CLE @ DET  2026-05-13  117-113            0.470 ✓     0.615 ✗                  CLE W    0.221     0.378
MIN @ SAS  2026-05-12  97-126             0.459 ✗     0.485 ✗                  SAS W    0.292     0.265
OKC @ LAL  2026-05-12  115-110            0.459 ✓     0.175 ✓                  OKC W    0.211     0.031
DET @ CLE  2026-05-12  103-112            0.460 ✗     0.585 ✓                  CLE W    0.292     0.172
SAS @ MIN  2026-05-10  109-114            0.468 ✗     0.355 ✗                  MIN W    0.283     0.416
NYK @ PHI  2026-05-10  144-114            0.484 ✓     0.475 ✓                  NYK W    0.234     0.226
OKC @ LAL  2026-05-10  131-108            0.433 ✓     0.235 ✓                  OKC W    0.188     0.055
DET @ CLE  2026-05-09  108-116            0.480 ✗     0.615 ✓                  CLE W    0.270     0.148
SAS @ MIN  2026-05-08  115-108            0.484 ✓     0.345 ✓                  SAS W    0.234     0.119
NYK @ PHI  2026-05-08  108-94             0.484 ✓     0.555 ✗                  NYK W    0.234     0.308
OKC @ LAL  2026-05-07  125-107            0.433 ✓     0.125 ✓                  OKC W    0.188     0.016
DET @ CLE  2026-05-07  107-97             0.480 ✓     0.395 ✓                  DET W    0.231     0.156
NYK @ PHI  2026-05-06  108-102            0.484 ✓     0.375 ✓                  NYK W    0.234     0.141
SAS @ MIN  2026-05-06  133-95             0.484 ✓     0.225 ✓                  SAS W    0.234     0.051
DET @ CLE  2026-05-05  111-101            0.480 ✓     0.455 ✓                  DET W    0.231     0.207
OKC @ LAL  2026-05-05  108-90             0.433 ✓     0.115 ✓                  OKC W    0.188     0.013
NYK @ PHI  2026-05-05  137-98             0.552 ✗     0.285 ✓                  NYK W    0.305     0.081
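
The per-game columns above can be reproduced with a few lines of arithmetic. The sketch below assumes the WP columns are home-win probabilities, which is consistent with the Brier values shown in the table; the function names are hypothetical. Because the page displays probabilities rounded to three decimals, recomputed Brier scores may differ from the displayed ones in the last digit. How a prediction of exactly 0.5 is scored for accuracy is not stated on the page; here it counts as a pick for the away team.

```python
import math


def brier(p_home: float, home_won: int) -> float:
    """Squared error between the predicted home-win probability and the 0/1 outcome."""
    return (p_home - home_won) ** 2


def log_loss(p_home: float, home_won: int, eps: float = 1e-15) -> float:
    """Negative log-likelihood of the outcome; p is clipped to avoid log(0)."""
    p = min(max(p_home, eps), 1 - eps)
    return -(home_won * math.log(p) + (1 - home_won) * math.log(1 - p))


def correct(p_home: float, home_won: int) -> bool:
    """True when the side given more than 50% is the side that won."""
    return (p_home > 0.5) == bool(home_won)


# (home WP, home_won) pairs taken from the first three ZenHodl rows of the table
rows = [(0.478, 1), (0.294, 0), (0.456, 0)]
mean_brier = sum(brier(p, y) for p, y in rows) / len(rows)
accuracy = sum(correct(p, y) for p, y in rows) / len(rows)
```

The leaderboard's Brier, log loss, and accuracy figures are the same quantities averaged over all resolved games.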

Excluded games (1)

Per the manifest's tie-handling rule, any game where either source was unavailable at snapshot time is excluded from both models' metrics. Excluded games are listed here so that the exclusion is visible rather than silent.

Game       Tip-off           ZenHodl WP  Polymarket Consensus WP  Reason
MIN @ SAS  2026-05-17T04:00  0.450       n/a                      polymarket_unavailable · midpoint_fetch_failed_home

Snapshotted, awaiting result (1)

SAS @ OKC tip 2026-05-21T00:30 ZH 0.294 PM 0.685
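
The split between resolved, excluded, and awaiting-result games can be sketched as a simple partition over the rows of raw.jsonl. The two status strings come from the manifest's tie-handling rule; the field names `status` and `home_won`, and the function name `partition`, are hypothetical, since the row schema is not shown on this page.

```python
# Status values named in the manifest's tie_handling rule
EXCLUDED_STATUSES = {"polymarket_unavailable", "zenhodl_unavailable"}


def partition(rows):
    """Split raw rows into (resolved, excluded, pending) lists.

    Assumed row shape (hypothetical): a dict with an optional "status"
    string and a "home_won" value that is None until the game has a result.
    """
    resolved, excluded, pending = [], [], []
    for row in rows:
        if row.get("status") in EXCLUDED_STATUSES:
            excluded.append(row)   # dropped from both sources' metrics
        elif row.get("home_won") is None:
            pending.append(row)    # snapshotted, no result yet
        else:
            resolved.append(row)   # counts toward the leaderboard
    return resolved, excluded, pending
```

With the counts on this page, 22 resolved games plus 1 excluded plus 1 awaiting a result account for the 24 raw rows.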
Full manifest (the rules):
{
  "metrics": {
    "auxiliary": [
      "Brier score",
      "Log loss",
      "Accuracy"
    ],
    "confidence_interval": "95% bootstrap CI on ECE with 1000 resamples, published alongside point estimate",
    "ece_formula": "Sum over bins of |bin_avg_pred - bin_avg_outcome| weighted by bin sample fraction",
    "headline": "Expected Calibration Error (ECE), 10 equal-width bins"
  },
  "model_versioning": {
    "policy": "ZenHodl model weights as deployed at T-60 of each game are what counts. Each prediction row in raw.jsonl includes the model version ID so post-hoc retrains do not invalidate prior predictions.",
    "retrains_during_window": "Permitted. Disclosed in the per-game row\u0027s model_version field."
  },
  "publication": {
    "live_url": "https://zenhodl.net/benchmarks/nba-playoffs-2026",
    "manifest_file": "https://zenhodl.net/benchmarks/nba-playoffs-2026/manifest.json",
    "raw_data_jsonl": "https://zenhodl.net/benchmarks/nba-playoffs-2026/raw.jsonl",
    "we_publish_when_we_lose": true
  },
  "published_at": "2026-05-04T20:00:00Z",
  "rule_changes": "Once this manifest\u0027s SHA-256 hash is broadcast on Polygon, the rules above are frozen. If ZenHodl edits this file at any later point, the on-chain hash will not match the served file. Anyone can verify by hashing the served manifest.json and comparing to the on-chain transaction data field.",
  "scope": {
    "first_eligible_game_after": "2026-05-05T00:00:00Z",
    "last_eligible_game_before": "2026-06-25T00:00:00Z",
    "sport": "NBA",
    "window": "2026 Conference Semifinals through Finals (inclusive)"
  },
  "snapshot": {
    "matching": "Each NBA game matched to its Polymarket market by team names + game date. Matching script published in this repo so the join is auditable.",
    "polymarket_source": "Polymarket NBA game-winner market YES-side mid price (best bid + best ask) / 2, fetched from clob.polymarket.com",
    "tie_handling": "If either source is unavailable at T-60, the game is excluded from BOTH model\u0027s metrics. Recorded with status=\u0027polymarket_unavailable\u0027 or \u0027zenhodl_unavailable\u0027 in the public raw.jsonl.",
    "timing": "Both predictions captured no later than T-60 minutes before official tip-off",
    "zenhodl_source": "ZenHodl pregame win probability via internal SignalEngine.get_pregame_predictions(\u0027NBA\u0027)"
  },
  "title": "ZenHodl vs Polymarket Consensus \u2014 NBA Playoffs 2026 Calibration Benchmark",
  "version": "1.1",
  "why_polymarket": "Polymarket\u0027s market price represents the consensus probability of every smart-money trader actively wagering real capital. Beating it on calibration is the canonical hedge-fund-grade benchmark for any sports forecasting model, equivalent to closing-line value (CLV) in traditional sports analytics."
}
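
The manifest's `ece_formula` and `confidence_interval` fields can be sketched in code as follows. This is an illustration, not ZenHodl's scoring script: the function names are hypothetical, bins are assumed to include their lower edge, and the interval uses a percentile bootstrap, which is one common choice that the manifest does not name explicitly.

```python
import random


def ece(probs, outcomes, n_bins=10):
    """Expected Calibration Error over equal-width bins.

    Sum over bins of |mean prediction - mean outcome|, weighted by the
    fraction of samples that fall in the bin.
    """
    n = len(probs)
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        acc[idx][0] += p
        acc[idx][1] += y
        acc[idx][2] += 1
    return sum(abs(sp / c - sy / c) * (c / n) for sp, sy, c in acc if c > 0)


def bootstrap_ci(probs, outcomes, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ECE (assumed method, see lead-in)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(probs)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample games with replacement
        stats.append(ece([probs[i] for i in idx], [outcomes[i] for i in idx], 10))
    stats.sort()
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[min(int((1 - alpha / 2) * n_resamples), n_resamples - 1)]
    return lo, hi
```

With only a few dozen games, most bins are empty or hold a single game, so a wide interval around the point estimate is expected.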

Why this benchmark?

Polymarket's market price reflects the consensus probability implied by traders actively wagering real capital. Beating it on calibration is a demanding benchmark for any sports forecasting model, comparable to closing-line value (CLV) in traditional sports analytics.

The point is to find out, transparently, whether ZenHodl is better calibrated than the market. We may win, we may lose, but the rows and scoring rules stay published.

If we lose, the loss appears here, in the same rows, with the same Brier scores. The manifest commits us to publishing that outcome: no edit can change it without invalidating the on-chain hash.
