Collecting · Pre-committed · NBA

ZenHodl vs Polymarket Consensus

ZenHodl vs Polymarket Consensus — NBA Playoffs 2026 Calibration Benchmark tracks 2026 Conference Semifinals through Finals (inclusive). Same eligible games, same snapshot rule, same metrics. Updated as predictions and results are written.

Sibling benchmark on the same NBA games: ZenHodl vs Pinnacle Closing Line →

Pre-commitment proof

Served file hash matches on-chain commit

Manifest SHA-256 (served right now)

e7b5870fbc6e961a717d25aec9625c0ee050414fb909ebf030702c1411335b2e

Reproduce: curl -s https://zenhodl.net/benchmarks/nba-playoffs-2026/manifest.json | sha256sum

On-chain commitment

0x4cb91882e88052b05fec02a9d65859ca3b570f470da4693683dc14f5969042fb ↗

on-chain SHA: e7b5870fbc6e961a717d25aec9625c0ee050414fb909ebf030702c1411335b2e

Block 85956686 on Polygon. Broadcast 2026-04-24T12:23:34 UTC. The hash above appears in the tx's data field.
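The same check can be scripted. The following is a minimal sketch, not ZenHodl's own tooling: the function names are illustrative, and it assumes the on-chain value is the plain hex SHA-256 of the served file's raw bytes, as the page describes.

```python
import hashlib

# SHA-256 quoted on this page as the on-chain commitment.
ONCHAIN_SHA256 = "e7b5870fbc6e961a717d25aec9625c0ee050414fb909ebf030702c1411335b2e"


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of raw bytes, the same value `sha256sum` prints."""
    return hashlib.sha256(data).hexdigest()


def manifest_matches(manifest_bytes: bytes, committed_hex: str = ONCHAIN_SHA256) -> bool:
    """True if the served bytes hash to the committed value.

    Accepts the committed hash with or without a leading '0x', in any case.
    """
    committed = committed_hex.lower()
    if committed.startswith("0x"):
        committed = committed[2:]
    return sha256_hex(manifest_bytes) == committed


# Usage (needs network access, so it is left commented out):
# from urllib.request import urlopen
# url = "https://zenhodl.net/benchmarks/nba-playoffs-2026/manifest.json"
# print(manifest_matches(urlopen(url).read()))
```

Hash the bytes exactly as served. Parsing the JSON and re-serializing it can change whitespace or key order, which produces a different digest even when the rules are unchanged.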

manifest.json ↗ hash-check.json ↗ raw.jsonl (predictions) ↗ results.jsonl ↗

Live leaderboard

n=33 resolved · raw=34 · last refresh 07:05:10 UTC

ZenHodl

Production NBA model · ZenHodl pregame win probability via internal SignalEngine.get_pregame_predictions('NBA')

🟢

ECE (lower is better)

0.192

95% CI [0.083, 0.347]

Brier

0.254

Log loss

0.702

Accuracy

57.6%

POLYMARKET CONSENSUS

Live mid-price · the wisdom of every smart-money trader on the venue

🌀

ECE (lower is better)

0.178

95% CI [0.105, 0.326]

Brier

0.211

Log loss

0.605

Accuracy

63.6%

Reliability diagram

Predicted home-win probability vs actual home-win rate, in 10 equal-width bins. Diagonal = perfect calibration.

Each marker is one bin's average. Marker size scales with the number of games in the bin. Points above the diagonal mean the home team won more often than predicted in that bin (predictions too low); points below mean it won less often (predictions too high). The closer the points hug the diagonal across the chart, the better calibrated the model. A tiny y-jitter (±0.012) is applied so ZenHodl (offset up) and Polymarket Consensus (offset down) markers remain distinguishable when their bins share the same observed rate; hover any point for the true value.
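The points on the diagram and the headline ECE come from the same binning. Below is a minimal sketch of that computation, following the manifest's formula (per-bin gap between mean prediction and observed rate, weighted by the bin's share of games). The function names and the bin-edge convention are assumptions; the benchmark's actual scoring script may differ.

```python
from typing import List, Sequence, Tuple


def reliability_bins(preds: Sequence[float], outcomes: Sequence[int],
                     n_bins: int = 10) -> List[Tuple[float, float, int]]:
    """Return (mean predicted probability, observed home-win rate, count)
    for each non-empty equal-width bin."""
    buckets: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(preds, outcomes):
        # Assumed edges: [0.0, 0.1), ..., [0.9, 1.0]; p == 1.0 joins the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        buckets[idx].append((p, y))
    points = []
    for b in buckets:
        if b:
            mean_pred = sum(p for p, _ in b) / len(b)
            win_rate = sum(y for _, y in b) / len(b)
            points.append((mean_pred, win_rate, len(b)))
    return points


def ece(preds: Sequence[float], outcomes: Sequence[int], n_bins: int = 10) -> float:
    """Expected Calibration Error: sum over bins of |mean_pred - win_rate|,
    weighted by the fraction of games in the bin."""
    n = len(preds)
    return sum(count / n * abs(mean_pred - win_rate)
               for mean_pred, win_rate, count in reliability_bins(preds, outcomes, n_bins))


# Toy data (not from the benchmark): two bins, each off by 0.25.
toy_preds = [0.25, 0.25, 0.75, 0.75]
toy_outcomes = [0, 1, 1, 1]  # 1 = home team won
print(reliability_bins(toy_preds, toy_outcomes))
print(ece(toy_preds, toy_outcomes))
```

With only 33 resolved games spread over at most 10 bins, individual bins hold very few games, which is why the confidence intervals reported above are wide.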

Resolved games

Game (away @ home, date · away-home score)	ZenHodl home WP	Polymarket Consensus home WP	Outcome	ZenHodl Brier	Polymarket Consensus Brier
NYK @ SAS 2026-06-06 · 105-104	0.533 ✗	0.675 ✗	NYK W	0.284	0.456
NYK @ SAS 2026-06-04 · 105-95	0.533 ✗	0.635 ✗	NYK W	0.284	0.403
SAS @ OKC 2026-05-31 · 111-103	0.294 ✓	0.575 ✗	SAS W	0.086	0.331
OKC @ SAS 2026-05-29 · 91-118	0.404 ✗	0.595 ✓	SAS W	0.355	0.164
SAS @ OKC 2026-05-27 · 114-127	0.294 ✗	0.595 ✓	OKC W	0.499	0.164
NYK @ CLE 2026-05-26 · 130-93	0.453 ✓	0.465 ✓	NYK W	0.205	0.216
OKC @ SAS 2026-05-25 · 82-103	0.412 ✗	0.575 ✓	SAS W	0.346	0.181
NYK @ CLE 2026-05-24 · 121-108	0.539 ✗	0.555 ✗	NYK W	0.291	0.308
OKC @ SAS 2026-05-23 · 123-108	0.360 ✓	0.535 ✗	OKC W	0.130	0.286
CLE @ NYK 2026-05-22 · 93-109	0.413 ✗	0.645 ✓	NYK W	0.344	0.126
SAS @ OKC 2026-05-21 · 113-122	0.294 ✗	0.685 ✓	OKC W	0.499	0.099
CLE @ NYK 2026-05-20 · 104-115	0.478 ✗	0.675 ✓	NYK W	0.273	0.106
SAS @ OKC 2026-05-19 · 122-115	0.294 ✓	0.675 ✗	SAS W	0.086	0.456
CLE @ DET 2026-05-18 · 125-94	0.456 ✓	0.635 ✗	CLE W	0.207	0.403
SAS @ MIN 2026-05-16 · 139-109	0.456 ✓	0.325 ✓	SAS W	0.208	0.106
DET @ CLE 2026-05-15 · 115-94	0.456 ✓	0.615 ✗	DET W	0.208	0.378
CLE @ DET 2026-05-13 · 117-113	0.470 ✓	0.615 ✗	CLE W	0.221	0.378
MIN @ SAS 2026-05-12 · 97-126	0.459 ✗	0.485 ✗	SAS W	0.292	0.265
OKC @ LAL 2026-05-12 · 115-110	0.459 ✓	0.175 ✓	OKC W	0.211	0.031
DET @ CLE 2026-05-12 · 103-112	0.460 ✗	0.585 ✓	CLE W	0.292	0.172
SAS @ MIN 2026-05-10 · 109-114	0.468 ✗	0.355 ✗	MIN W	0.283	0.416
NYK @ PHI 2026-05-10 · 144-114	0.484 ✓	0.475 ✓	NYK W	0.234	0.226
OKC @ LAL 2026-05-10 · 131-108	0.433 ✓	0.235 ✓	OKC W	0.188	0.055
DET @ CLE 2026-05-09 · 108-116	0.480 ✗	0.615 ✓	CLE W	0.270	0.148
SAS @ MIN 2026-05-08 · 115-108	0.484 ✓	0.345 ✓	SAS W	0.234	0.119
NYK @ PHI 2026-05-08 · 108-94	0.484 ✓	0.555 ✗	NYK W	0.234	0.308
OKC @ LAL 2026-05-07 · 125-107	0.433 ✓	0.125 ✓	OKC W	0.188	0.016
DET @ CLE 2026-05-07 · 107-97	0.480 ✓	0.395 ✓	DET W	0.231	0.156
NYK @ PHI 2026-05-06 · 108-102	0.484 ✓	0.375 ✓	NYK W	0.234	0.141
SAS @ MIN 2026-05-06 · 133-95	0.484 ✓	0.225 ✓	SAS W	0.234	0.051
DET @ CLE 2026-05-05 · 111-101	0.480 ✓	0.455 ✓	DET W	0.231	0.207
OKC @ LAL 2026-05-05 · 108-90	0.433 ✓	0.115 ✓	OKC W	0.188	0.013
NYK @ PHI 2026-05-05 · 137-98	0.552 ✗	0.285 ✓	NYK W	0.305	0.081
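The per-game columns can be re-derived from the table itself. This sketch assumes each WP is the home team's win probability (consistent with the rows above), that log loss uses the natural logarithm, and that a prediction of exactly 0.5 counts as an away pick; the last two are assumptions, as the manifest does not specify them.

```python
import math


def brier(home_wp: float, home_won: int) -> float:
    """Squared error between predicted home-win probability and the 0/1 outcome."""
    return (home_wp - home_won) ** 2


def log_loss(home_wp: float, home_won: int, eps: float = 1e-15) -> float:
    """Negative natural log of the probability assigned to the actual outcome."""
    p = min(max(home_wp, eps), 1 - eps)  # clip to avoid log(0)
    return -math.log(p if home_won else 1 - p)


def is_correct(home_wp: float, home_won: int) -> bool:
    """A pick is the home team when WP > 0.5, otherwise the away team."""
    return (home_wp > 0.5) == bool(home_won)


# Three rows copied from the table: (game, ZenHodl WP, Polymarket WP, home_won).
rows = [
    ("NYK @ SAS 2026-06-06", 0.533, 0.675, 0),
    ("SAS @ OKC 2026-05-31", 0.294, 0.575, 0),
    ("OKC @ SAS 2026-05-29", 0.404, 0.595, 1),
]
for game, zh, pm, home_won in rows:
    print(game,
          round(brier(zh, home_won), 3), is_correct(zh, home_won),
          round(brier(pm, home_won), 3), is_correct(pm, home_won))
```

The printed Brier values reproduce the table to three decimals for these rows. Small last-digit differences elsewhere in the table are expected, since the displayed probabilities are themselves rounded.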

Excluded games (1)

Per the manifest's tie-handling rule, any game where either source was unavailable at snapshot time (T-60) is excluded from both models' metrics. Such games are listed here so you can see the rule was applied rather than silently hidden.

Game	Tip-off	ZenHodl WP	Polymarket Consensus WP	Reason
MIN @ SAS	2026-05-17T04:00	0.450	—	polymarket_unavailable · midpoint_fetch_failed_home
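A minimal sketch of how the exclusion rule could be applied when scoring. The mid-price formula and the two status strings come from the manifest; the field names `zenhodl_wp` and `polymarket_wp`, the function names, and the bid/ask numbers are hypothetical and may not match the real raw.jsonl schema.

```python
from typing import Dict, List, Optional, Tuple


def mid_price(best_bid: Optional[float], best_ask: Optional[float]) -> Optional[float]:
    """YES-side mid price, (best bid + best ask) / 2, or None if a side is missing."""
    if best_bid is None or best_ask is None:
        return None
    return (best_bid + best_ask) / 2


def split_eligible(rows: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate scoreable games from excluded ones.

    A game missing either source is dropped for BOTH models and tagged with
    the status named in the manifest.
    """
    eligible, excluded = [], []
    for row in rows:
        if row.get("zenhodl_wp") is None:
            excluded.append({**row, "status": "zenhodl_unavailable"})
        elif row.get("polymarket_wp") is None:
            excluded.append({**row, "status": "polymarket_unavailable"})
        else:
            eligible.append(row)
    return eligible, excluded


# Illustrative rows: the second has no bid, so no mid price can be formed.
rows = [
    {"game": "SAS @ MIN", "zenhodl_wp": 0.456, "polymarket_wp": mid_price(0.32, 0.33)},
    {"game": "MIN @ SAS", "zenhodl_wp": 0.450, "polymarket_wp": mid_price(None, 0.46)},
]
eligible, excluded = split_eligible(rows)
print(len(eligible), [r["status"] for r in excluded])
```

Dropping the game for both sides keeps the two models scored on an identical set of games, so neither metric benefits from a game the other never had a chance to predict.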

Read the full manifest (the rules) ↓

{
  "metrics": {
    "auxiliary": [
      "Brier score",
      "Log loss",
      "Accuracy"
    ],
    "confidence_interval": "95% bootstrap CI on ECE with 1000 resamples, published alongside point estimate",
    "ece_formula": "Sum over bins of |bin_avg_pred - bin_avg_outcome| weighted by bin sample fraction",
    "headline": "Expected Calibration Error (ECE), 10 equal-width bins"
  },
  "model_versioning": {
    "policy": "ZenHodl model weights as deployed at T-60 of each game are what counts. Each prediction row in raw.jsonl includes the model version ID so post-hoc retrains do not invalidate prior predictions.",
    "retrains_during_window": "Permitted. Disclosed in the per-game row\u0027s model_version field."
  },
  "publication": {
    "live_url": "https://zenhodl.net/benchmarks/nba-playoffs-2026",
    "manifest_file": "https://zenhodl.net/benchmarks/nba-playoffs-2026/manifest.json",
    "raw_data_jsonl": "https://zenhodl.net/benchmarks/nba-playoffs-2026/raw.jsonl",
    "we_publish_when_we_lose": true
  },
  "published_at": "2026-05-04T20:00:00Z",
  "rule_changes": "Once this manifest\u0027s SHA-256 hash is broadcast on Polygon, the rules above are frozen. If ZenHodl edits this file at any later point, the on-chain hash will not match the served file. Anyone can verify by hashing the served manifest.json and comparing to the on-chain transaction data field.",
  "scope": {
    "first_eligible_game_after": "2026-05-05T00:00:00Z",
    "last_eligible_game_before": "2026-06-25T00:00:00Z",
    "sport": "NBA",
    "window": "2026 Conference Semifinals through Finals (inclusive)"
  },
  "snapshot": {
    "matching": "Each NBA game matched to its Polymarket market by team names + game date. Matching script published in this repo so the join is auditable.",
    "polymarket_source": "Polymarket NBA game-winner market YES-side mid price (best bid + best ask) / 2, fetched from clob.polymarket.com",
    "tie_handling": "If either source is unavailable at T-60, the game is excluded from BOTH model\u0027s metrics. Recorded with status=\u0027polymarket_unavailable\u0027 or \u0027zenhodl_unavailable\u0027 in the public raw.jsonl.",
    "timing": "Both predictions captured no later than T-60 minutes before official tip-off",
    "zenhodl_source": "ZenHodl pregame win probability via internal SignalEngine.get_pregame_predictions(\u0027NBA\u0027)"
  },
  "title": "ZenHodl vs Polymarket Consensus \u2014 NBA Playoffs 2026 Calibration Benchmark",
  "version": "1.1",
  "why_polymarket": "Polymarket\u0027s market price represents the consensus probability of every smart-money trader actively wagering real capital. Beating it on calibration is the canonical hedge-fund-grade benchmark for any sports forecasting model, equivalent to closing-line value (CLV) in traditional sports analytics."
}
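The manifest's `confidence_interval` rule specifies a 95% bootstrap CI on ECE with 1000 resamples but does not name the interval method. The sketch below assumes a percentile bootstrap that resamples games with replacement; the function names, the seed, and the bin-edge convention are illustrative choices, not the benchmark's published code.

```python
import random
from typing import Sequence, Tuple


def ece(preds: Sequence[float], outcomes: Sequence[int], n_bins: int = 10) -> float:
    """ECE over equal-width bins (weighted |mean_pred - win_rate| per bin)."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]  # [sum preds, sum outcomes, count]
    for p, y in zip(preds, outcomes):
        b = sums[min(int(p * n_bins), n_bins - 1)]
        b[0] += p
        b[1] += y
        b[2] += 1
    n = len(preds)
    # (count / n) * |sum_p / count - sum_y / count| simplifies to |sum_p - sum_y| / n.
    return sum(abs(sp - sy) / n for sp, sy, count in sums if count)


def bootstrap_ece_ci(preds: Sequence[float], outcomes: Sequence[int],
                     n_resamples: int = 1000, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for ECE, resampling games with replacement."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(preds)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(ece([preds[i] for i in idx], [outcomes[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lo, hi


# Toy data (not from the benchmark).
toy_preds = [0.25, 0.35, 0.45, 0.55, 0.65, 0.75]
toy_outcomes = [0, 1, 0, 1, 1, 1]
print(ece(toy_preds, toy_outcomes), bootstrap_ece_ci(toy_preds, toy_outcomes))
```

Because ECE is an absolute-value statistic, resampled values tend to sit above the point estimate on small samples, so an asymmetric interval like the ones on the leaderboard is expected.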

Why this benchmark?

Polymarket's market price represents the consensus probability of every smart-money trader actively wagering real capital. Beating it on calibration is the canonical hedge-fund-grade benchmark for any sports forecasting model, equivalent to closing-line value (CLV) in traditional sports analytics.

The point is to find out transparently. We may win, we may lose, but the rows and scoring rules stay published.

If we lose, the loss appears here, in the same row, with the same Brier score. The manifest commits us to publishing that outcome — there is no edit path that changes it without invalidating the on-chain hash.
