Live · Pre-committed · NBA Playoffs 2026

ZenHodl vs Pinnacle Closing Line

A pre-committed, on-chain-anchored head-to-head calibration benchmark covering every eligible NBA Conference Semifinal, Conference Final, and NBA Finals game in the May 5 through June 25, 2026 playoff window (per the manifest, games tipping off after 2026-05-08T02:00:00Z). Our model versus the live betting market. Same games, same instant, same metrics. Updated after every game.

Sibling benchmark on the same NBA games: ZenHodl vs Polymarket Consensus →

Pre-commitment proof

Served file hash matches on-chain commit

Manifest SHA-256 (served right now)

7cc81b53e066d0820e2ae72e862fe3a94a35020881431646c50acc9a95c56d21

Reproduce: curl -s https://zenhodl.net/benchmarks/nba-playoffs-2026-vs-pinnacle/manifest.json | sha256sum

On-chain commitment

0x985b9e77083a6b504a0e4b0a849d5b0fae545ed0bf5a09f2d542d032c99a678a ↗

on-chain SHA: 7cc81b53e066d0820e2ae72e862fe3a94a35020881431646c50acc9a95c56d21

Block 86552366 on Polygon. Broadcast 2026-05-08T02:12:27 UTC. The hash above appears in the tx's data field.
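
The same check can be scripted. The following is a minimal sketch, assuming you have already saved the served manifest to disk as raw bytes (for example with curl -o manifest.json); the function names are ours for illustration and are not part of any ZenHodl tooling. The hash must be taken over the exact bytes served, since re-serializing the JSON would change it.

```python
import hashlib

# Hash copied from the "on-chain SHA" line on this page.
ONCHAIN_SHA256 = "7cc81b53e066d0820e2ae72e862fe3a94a35020881431646c50acc9a95c56d21"


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def manifest_matches(manifest_bytes: bytes, committed_hex: str = ONCHAIN_SHA256) -> bool:
    """True if the served manifest hashes to the committed value."""
    return sha256_hex(manifest_bytes) == committed_hex.strip().lower()


# Usage (hypothetical local file name):
# with open("manifest.json", "rb") as fh:
#     print(manifest_matches(fh.read()))
```

Reading the file in binary mode matters: decoding and re-encoding the text, or pretty-printing the JSON, would produce a different digest even if the rules were unchanged.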

manifest.json ↗ hash-check.json ↗ raw.jsonl (predictions) ↗ results.jsonl ↗

Live leaderboard

n=6 resolved · last refresh 10:35:01 UTC

Preliminary — n=6 of 10 required. Point estimates are shown but bootstrap confidence intervals are statistical theatre below this threshold and are hidden until enough games resolve.
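
For readers who want to see what the hidden intervals would involve, here is a rough sketch of a percentile bootstrap with 1000 resamples. The manifest specifies a 95% bootstrap CI on ECE but does not say which bootstrap variant is used, so the percentile method, the seed, and the function name are assumptions. The metric is passed in as a parameter; Brier is used below only to keep the example short.

```python
import random


def bootstrap_ci(preds, outcomes, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for metric(preds, outcomes), resampling games with replacement."""
    rng = random.Random(seed)
    n = len(preds)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # one resample of game indices
        stats.append(metric([preds[i] for i in idx], [outcomes[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[min(int((1 - alpha / 2) * n_resamples), n_resamples - 1)]
    return lo, hi


def brier(preds, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(preds, outcomes)) / len(preds)


# ZenHodl home-win probabilities and home-win outcomes from the resolved-games table.
zenhodl = [0.468, 0.468, 0.433, 0.480, 0.484, 0.484]
home_won = [1, 0, 0, 1, 0, 0]
low, high = bootstrap_ci(zenhodl, home_won, brier)
print(f"95% CI on Brier: [{low:.3f}, {high:.3f}]")
```

With only six games each resample draws from a handful of distinct rows, which is why intervals at this sample size say little.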

ZenHodl

Production NBA model · ECE 4.39% on NCAAMB regular season holdout

🟢

ECE (lower is better)

0.136

CI hidden · n < 10

Brier

0.238

Log loss

0.669

Accuracy

66.7%

PINNACLE

the canonically-sharpest sportsbook's closing line

🌀

ECE (lower is better)

0.330

CI hidden · n < 10

Brier

0.218

Log loss

0.623

Accuracy

66.7%

Reliability diagram

Predicted probability vs actual home-win rate, in 10 equal-width bins. Diagonal = perfect calibration.

Each marker is one bin's average. Marker size scales with the number of games in the bin. Points above the diagonal mean the home team won more often than predicted in that bucket (predictions too low); points below mean it won less often (predictions too high). The closer the points hug the diagonal across the chart, the better calibrated the model. A tiny y-jitter (±0.012) is applied so ZenHodl (offset up) and Pinnacle (offset down) markers remain distinguishable when both share the same observed rate in a bin; hover any point for the true value.
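
The binning behind the diagram is also what the headline ECE uses. Below is a minimal sketch of the manifest's formula (sum over bins of |bin average prediction - bin average outcome|, weighted by bin sample fraction, 10 equal-width bins), applied to the six rows of the resolved-games table. How a prediction sitting exactly on a bin edge is assigned is not stated in the manifest, so the edge rule here is an assumption; none of the six games is affected by it.

```python
def ece(preds, outcomes, n_bins=10):
    """Expected Calibration Error with equal-width bins over [0, 1]."""
    bins = {}
    for p, y in zip(preds, outcomes):
        # Assumed edge rule: bin i covers [i/n_bins, (i+1)/n_bins); p == 1.0 goes in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins.setdefault(idx, []).append((p, y))
    n = len(preds)
    total = 0.0
    for members in bins.values():
        avg_pred = sum(p for p, _ in members) / len(members)
        avg_outcome = sum(y for _, y in members) / len(members)
        total += (len(members) / n) * abs(avg_pred - avg_outcome)
    return total


# Home-win probabilities and outcomes (1 = home team won), in table order.
home_won = [1, 0, 0, 1, 0, 0]
zenhodl = [0.468, 0.468, 0.433, 0.480, 0.484, 0.484]
opponent = [0.351, 0.483, 0.238, 0.624, 0.345, 0.580]

print(f"ZenHodl ECE:  {ece(zenhodl, home_won):.3f}")   # 0.136
print(f"Opponent ECE: {ece(opponent, home_won):.3f}")  # 0.330
```

All six ZenHodl predictions fall in the 0.4 to 0.5 bin, so its ECE at this sample size is a single gap: an average prediction of about 0.470 against an observed home-win rate of 2 in 6.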

Resolved games

Game	ZenHodl WP	Pinnacle WP	Outcome	ZH Brier	Pinnacle Brier
SAS @ MIN 2026-05-10 · 109-114	0.468 ✗	0.351 ✗	MIN W	0.283	0.421
NYK @ PHI 2026-05-10 · 144-114	0.468 ✓	0.483 ✓	NYK W	0.219	0.234
OKC @ LAL 2026-05-10 · 131-108	0.433 ✓	0.238 ✓	OKC W	0.188	0.057
DET @ CLE 2026-05-09 · 108-116	0.480 ✗	0.624 ✓	CLE W	0.270	0.141
SAS @ MIN 2026-05-09 · 115-108	0.484 ✓	0.345 ✓	SAS W	0.234	0.119
NYK @ PHI 2026-05-08 · 108-94	0.484 ✓	0.580 ✗	NYK W	0.234	0.336
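
The auxiliary metrics on the leaderboard can be recomputed from these six rows. This is a sketch under two assumptions: the WP columns are home-win probabilities (the per-game Brier values are consistent with that reading), and a pick counts as "home" when WP is at least 0.5. Because the table shows probabilities rounded to three decimals, results match the leaderboard only to display precision.

```python
import math


def brier(preds, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(preds, outcomes)) / len(preds)


def log_loss(preds, outcomes):
    """Mean negative log-likelihood of the realized outcomes."""
    return -sum(math.log(p if y == 1 else 1.0 - p) for p, y in zip(preds, outcomes)) / len(preds)


def accuracy(preds, outcomes, threshold=0.5):
    """Share of games where the favored side (assumed threshold 0.5) won."""
    hits = sum((p >= threshold) == (y == 1) for p, y in zip(preds, outcomes))
    return hits / len(preds)


# Home-win probabilities and outcomes (1 = home team won), in table order.
home_won = [1, 0, 0, 1, 0, 0]
zenhodl = [0.468, 0.468, 0.433, 0.480, 0.484, 0.484]
opponent = [0.351, 0.483, 0.238, 0.624, 0.345, 0.580]

for name, preds in (("ZenHodl", zenhodl), ("Opponent", opponent)):
    print(name, f"{brier(preds, home_won):.3f}",
          f"{log_loss(preds, home_won):.3f}",
          f"{accuracy(preds, home_won):.1%}")
```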

Read the full manifest (the rules) ↓

{
  "metrics": {
    "auxiliary": [
      "Brier score",
      "Log loss",
      "Accuracy"
    ],
    "confidence_interval": "95% bootstrap CI on ECE with 1000 resamples, published alongside point estimate",
    "ece_formula": "Sum over bins of |bin_avg_pred - bin_avg_outcome| weighted by bin sample fraction",
    "headline": "Expected Calibration Error (ECE), 10 equal-width bins"
  },
  "model_versioning": {
    "policy": "ZenHodl model weights as deployed at T-60 of each game are what counts. Each prediction row in raw.jsonl includes the model version ID so post-hoc retrains do not invalidate prior predictions.",
    "retrains_during_window": "Permitted. Disclosed in the per-game row\u0027s model_version field."
  },
  "opponent": "pinnacle",
  "opponent_display_name": "Pinnacle",
  "opponent_tagline": "the canonically-sharpest sportsbook\u0027s closing line",
  "publication": {
    "live_url": "https://zenhodl.net/benchmarks/nba-playoffs-2026-vs-pinnacle",
    "manifest_file": "https://zenhodl.net/benchmarks/nba-playoffs-2026-vs-pinnacle/manifest.json",
    "raw_data_jsonl": "https://zenhodl.net/benchmarks/nba-playoffs-2026-vs-pinnacle/raw.jsonl",
    "we_publish_when_we_lose": true
  },
  "published_at": "2026-05-08T01:30:00Z",
  "rule_changes": "Once this manifest\u0027s SHA-256 hash is broadcast on Polygon, the rules above are frozen. If ZenHodl edits this file at any later point, the on-chain hash will not match the served file. Anyone can verify by hashing the served manifest.json and comparing to the on-chain transaction data field.",
  "scope": {
    "first_eligible_game_after": "2026-05-08T02:00:00Z",
    "last_eligible_game_before": "2026-06-25T00:00:00Z",
    "sport": "NBA",
    "window": "2026 Conference Semifinals through Finals (inclusive)"
  },
  "snapshot": {
    "matching": "Each NBA game matched to its Pinnacle market by team-name resolution (Odds API full team name \u2192 ZenHodl tricode via core/api/benchmarks/sports.py team_map). Matching script published in this repo so the join is auditable.",
    "pinnacle_source": "Pinnacle two-way moneyline (game-winner H2H market) via The Odds API at https://api.the-odds-api.com/v4. Decimal odds are multiplicatively devigged to remove Pinnacle\u0027s overround (typically 2-5pp on NBA), producing a true implied probability for the home side.",
    "tie_handling": "If either source is unavailable at T-60, the game is excluded from BOTH model\u0027s metrics. Recorded with status=\u0027zenhodl_unavailable\u0027 or \u0027polymarket_unavailable\u0027 (used as the generic opponent-unavailable status flag) in the public raw.jsonl, with the actual cause in the row\u0027s opponent_error field.",
    "timing": "Both predictions captured no later than T-60 minutes before official tip-off",
    "zenhodl_source": "ZenHodl pregame win probability via internal SignalEngine.get_pregame_predictions(\u0027NBA\u0027)"
  },
  "title": "ZenHodl vs Pinnacle Closing Line \u2014 NBA Playoffs 2026 Calibration Benchmark",
  "version": "1.0",
  "why_pinnacle": "Pinnacle is the canonically-sharpest sportsbook in the industry \u2014 every CLV claim in traditional sports analytics is implicitly benchmarked against Pinnacle\u0027s closing line. Their volume is dominated by syndicate bettors and their margins are the lowest in the business. Beating Pinnacle on calibration is the gold-standard demonstration of forecasting skill in a way actual bettors recognize."
}
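
The manifest's pinnacle_source entry says decimal odds are multiplicatively devigged. A minimal sketch of that step for a two-way market follows; the function name and the example odds (1.80 home, 2.10 away) are hypothetical and are not taken from any real Pinnacle line.

```python
def devig_two_way(home_odds: float, away_odds: float):
    """Multiplicative devig: scale raw implied probabilities so they sum to 1."""
    raw_home = 1.0 / home_odds  # implied probability including the bookmaker margin
    raw_away = 1.0 / away_odds
    booksum = raw_home + raw_away  # exceeds 1 by the overround
    return raw_home / booksum, raw_away / booksum


# Hypothetical decimal odds, for illustration only.
p_home, p_away = devig_two_way(1.80, 2.10)
overround_pp = (1.0 / 1.80 + 1.0 / 2.10 - 1.0) * 100
print(f"overround: {overround_pp:.2f}pp, home: {p_home:.3f}, away: {p_away:.3f}")
```

Multiplicative scaling removes the margin in proportion to each side's raw implied probability. Other devig methods exist and can give slightly different numbers on lopsided lines, which is why the manifest fixes the method in advance.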

Why benchmark against the market?

Pinnacle's closing line is the price left standing after syndicate bettors have wagered real capital on the outcome. It's the toughest opponent we could pick, not because it's a smarter model, but because it aggregates the opinion of every sharp bettor in the market.

A solo operator beating the market on calibration would be the canonical hedge-fund-grade demonstration that the underlying model has edge. We may not beat it. The point is to find out, transparently, in front of you.

If we lose, the loss appears here, in the same row, with the same Brier score. The manifest commits us to publishing that outcome — there is no edit path that changes it without invalidating the on-chain hash.

Try the same model live →

7-day free trial. Same NBA pregame WP feed. Same calibration we're being judged on right here.