Collecting · Pre-committed · NHL

ZenHodl vs Pinnacle Closing Line

The NHL Playoffs 2026 Calibration Benchmark tracks the 2026 Stanley Cup Playoffs from the Conference Semifinals through the Stanley Cup Finals (inclusive). Both forecasters are scored on the same eligible games, with the same snapshot rule and the same metrics. The page is updated as predictions and results are written.

Sibling benchmark on the same NHL games: ZenHodl vs Polymarket Consensus →

Pre-commitment proof

Served file hash matches on-chain commit
Manifest SHA-256 (served right now)
1df8da6b88916e380e546c565cdbc91a41c5da0566c1828ea5a2c700d62ffd54

Reproduce: curl -s https://zenhodl.net/benchmarks/nhl-playoffs-2026-vs-pinnacle/manifest.json | sha256sum

On-chain commitment
0x903d9f40b28124ac6c11fd4f7a764cd981c4c6ab2cf601dbc10b6e2e3293ddb8 ↗
on-chain SHA: 1df8da6b88916e380e546c565cdbc91a41c5da0566c1828ea5a2c700d62ffd54

Block 86576225 on Polygon. Broadcast 2026-05-08T13:48:21 UTC. The hash above appears in the tx's data field.
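
The check described above can also be scripted. The following is a minimal sketch, not the site's own tooling: the helper names are hypothetical, and how the digest is encoded inside the transaction data field (raw bytes or ASCII hex text) is an assumption, since the page does not say.

```python
import hashlib

# Digest shown on this page for both the served manifest and the on-chain commit.
COMMITTED_SHA256 = "1df8da6b88916e380e546c565cdbc91a41c5da0566c1828ea5a2c700d62ffd54"


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 of the exact bytes served, the same value sha256sum prints."""
    return hashlib.sha256(data).hexdigest()


def matches_commit(manifest_bytes: bytes, committed_hex: str = COMMITTED_SHA256) -> bool:
    """True when the served manifest hashes to the committed digest."""
    return sha256_hex(manifest_bytes) == committed_hex.strip().lower()


def digest_in_tx_data(tx_data_hex: str, committed_hex: str = COMMITTED_SHA256) -> bool:
    """Look for the digest inside a transaction's data field (hex string).

    Assumption: the digest is embedded either as 32 raw bytes or as ASCII hex
    text. The page does not state which encoding the transaction uses.
    """
    data = tx_data_hex.lower()
    if data.startswith("0x"):
        data = data[2:]
    digest = committed_hex.strip().lower()
    ascii_form = digest.encode("ascii").hex()  # digest written out as text
    return digest in data or ascii_form in data
```

To use it, download the manifest as raw bytes (for example with `urllib.request.urlopen(url).read()`) and pass those bytes to `matches_commit`. Hash the bytes exactly as served: parsing and re-serializing the JSON first would change whitespace or key order and produce a different digest.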

Live leaderboard

n=27 resolved · raw=27 · last refresh 10:35:14 UTC
ZenHodl: Production NHL model · ZenHodl NHL pregame win probability via internal SignalEngine.get_pregame_predictions('NHL')
Pinnacle: the canonically-sharpest sportsbook's closing line

| Forecaster | ECE (lower is better) | ECE 95% CI | Brier | Log loss | Accuracy |
| --- | --- | --- | --- | --- | --- |
| ZenHodl | 0.129 | [0.077, 0.326] | 0.230 | 0.653 | 55.6% |
| Pinnacle | 0.127 | [0.067, 0.327] | 0.249 | 0.690 | 48.1% |
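
The manifest specifies a 95% bootstrap CI with 1000 resamples but does not say which bootstrap variant is used. The sketch below assumes a plain percentile bootstrap over games. `bootstrap_ci` and `mean_brier` are hypothetical names, and the mean Brier score stands in as the example metric; any function with the same signature, such as an ECE implementation, can be passed instead.

```python
import random
from typing import Callable, List, Sequence, Tuple

Metric = Callable[[Sequence[float], Sequence[int]], float]


def mean_brier(preds: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between home-win probability and the 0/1 result."""
    return sum((p - y) ** 2 for p, y in zip(preds, outcomes)) / len(preds)


def bootstrap_ci(
    preds: Sequence[float],
    outcomes: Sequence[int],
    metric: Metric,
    n_resamples: int = 1000,
    level: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval (assumed variant) for a per-game metric."""
    if len(preds) == 0 or len(preds) != len(outcomes):
        raise ValueError("preds and outcomes must be non-empty and equal length")
    rng = random.Random(seed)
    n = len(preds)
    stats: List[float] = []
    for _ in range(n_resamples):
        # Resample games with replacement, keeping each prediction with its outcome.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([preds[i] for i in idx], [outcomes[i] for i in idx]))
    stats.sort()
    tail = (1.0 - level) / 2.0
    # Simple order-statistic percentiles of the resampled metric values.
    lo = stats[int(tail * (n_resamples - 1))]
    hi = stats[int((1.0 - tail) * (n_resamples - 1))]
    return lo, hi
```

With only 27 resolved games, each resample can land quite differently across the probability bins, which is one plausible reason the ECE intervals above are wide and overlap almost entirely.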

Reliability diagram

Predicted probability vs actual home-win rate, in 10 equal-width bins. Diagonal = perfect calibration.

Each marker is one bin's average, and marker size scales with the number of games in the bin. Points above the diagonal mean the home team won more often than predicted in that bucket (predictions too low); points below mean it won less often than predicted (predictions too high). The closer the points sit to the diagonal across the chart, the better calibrated the model. A small y-jitter (±0.012) keeps the ZenHodl (offset up) and Pinnacle (offset down) markers distinguishable when both bins share the same observed rate; hover any point for the true value.
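
As an illustration of how the diagram's points and the headline ECE relate, here is a minimal sketch. It follows the manifest's description (10 equal-width bins, absolute gap weighted by bin sample fraction); the function names and the bin-edge convention (each bin includes its lower edge, and 1.0 falls in the last bin) are assumptions.

```python
from typing import List, Sequence, Tuple

# (mean predicted probability, observed home-win rate, number of games)
Bin = Tuple[float, float, int]


def reliability_bins(
    preds: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> List[Bin]:
    """One point per non-empty bin, as plotted on a reliability diagram."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(preds, outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        sums[k][0] += p
        sums[k][1] += y
        sums[k][2] += 1
    return [(sp / c, sy / c, c) for sp, sy, c in sums if c > 0]


def ece(preds: Sequence[float], outcomes: Sequence[int], n_bins: int = 10) -> float:
    """Sum over bins of |mean prediction - observed rate| weighted by bin share."""
    n = len(preds)
    return sum(
        c / n * abs(mean_pred - rate)
        for mean_pred, rate, c in reliability_bins(preds, outcomes, n_bins)
    )
```

A point lies above the diagonal when its observed rate exceeds its mean prediction. Each bin's contribution to ECE is its vertical distance from the diagonal multiplied by the fraction of games it holds.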

Resolved games

| Game | Date | Score | ZenHodl WP | Pinnacle WP | Outcome | ZH Brier | Pinnacle Brier |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CAR @ VGK | 2026-06-10 | 5-3 | 0.512 ✗ | 0.491 ✓ | CAR W | 0.262 | 0.241 |
| CAR @ VGK | 2026-06-07 | 4-5 | 0.512 ✓ | 0.494 ✗ | VGK W | 0.238 | 0.256 |
| VGK @ CAR | 2026-06-05 | 3-4 | 0.539 ✓ | 0.595 ✓ | CAR W | 0.212 | 0.164 |
| VGK @ CAR | 2026-06-03 | 5-4 | 0.539 ✗ | 0.590 ✗ | VGK W | 0.291 | 0.348 |
| MTL @ CAR | 2026-05-30 | 1-6 | 0.743 ✓ | 0.692 ✓ | CAR W | 0.066 | 0.095 |
| CAR @ MTL | 2026-05-28 | 4-0 | 0.366 ✓ | 0.415 ✓ | CAR W | 0.134 | 0.173 |
| COL @ VGK | 2026-05-27 | 1-2 | 0.531 ✓ | 0.481 ✗ | VGK W | 0.220 | 0.270 |
| CAR @ MTL | 2026-05-26 | 3-2 | 0.366 ✓ | 0.412 ✓ | CAR W | 0.134 | 0.170 |
| COL @ VGK | 2026-05-25 | 3-5 | 0.531 ✓ | 0.427 ✗ | VGK W | 0.220 | 0.328 |
| MTL @ CAR | 2026-05-23 | 2-3 | 0.743 ✓ | 0.641 ✓ | CAR W | 0.066 | 0.129 |
| VGK @ COL | 2026-05-23 | 3-1 | 0.539 ✗ | 0.610 ✗ | VGK W | 0.291 | 0.372 |
| MTL @ CAR | 2026-05-22 | 6-2 | 0.743 ✗ | 0.641 ✗ | MTL W | 0.552 | 0.411 |
| VGK @ COL | 2026-05-21 | 4-2 | 0.539 ✗ | 0.609 ✗ | VGK W | 0.291 | 0.371 |
| MTL @ BUF | 2026-05-18 | 3-2 | 0.539 ✗ | 0.517 ✗ | MTL W | 0.291 | 0.267 |
| BUF @ MTL | 2026-05-17 | 8-3 | 0.512 ✗ | 0.613 ✗ | BUF W | 0.262 | 0.376 |
| VGK @ ANA | 2026-05-15 | 5-1 | 0.201 ✓ | 0.506 ✗ | VGK W | 0.041 | 0.256 |
| MTL @ BUF | 2026-05-14 | 6-3 | 0.538 ✗ | 0.518 ✗ | MTL W | 0.289 | 0.268 |
| MIN @ COL | 2026-05-14 | 3-4 | 0.538 ✓ | 0.664 ✓ | COL W | 0.213 | 0.113 |
| ANA @ VGK | 2026-05-13 | 2-3 | 0.799 ✓ | 0.598 ✓ | VGK W | 0.040 | 0.162 |
| BUF @ MTL | 2026-05-12 | 3-2 | 0.508 ✗ | 0.576 ✗ | BUF W | 0.259 | 0.332 |
| COL @ MIN | 2026-05-12 | 5-2 | 0.484 ✓ | 0.430 ✓ | COL W | 0.235 | 0.185 |
| VGK @ ANA | 2026-05-11 | 3-4 | 0.201 ✗ | 0.519 ✓ | ANA W | 0.638 | 0.231 |
| BUF @ MTL | 2026-05-10 | 2-6 | 0.508 ✓ | 0.533 ✓ | MTL W | 0.242 | 0.218 |
| COL @ MIN | 2026-05-10 | 1-5 | 0.484 ✗ | 0.452 ✗ | MIN W | 0.266 | 0.300 |
| CAR @ PHI | 2026-05-09 | 3-2 | 0.370 ✓ | 0.367 ✓ | CAR W | 0.137 | 0.135 |
| VGK @ ANA | 2026-05-09 | 6-2 | 0.201 ✓ | 0.497 ✓ | VGK W | 0.041 | 0.247 |
| MTL @ BUF | 2026-05-08 | 5-1 | 0.538 ✗ | 0.551 ✗ | MTL W | 0.289 | 0.303 |
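
The per-game columns above can be reproduced from the win probability and the result. The sketch below treats each WP as the home team's win probability, which matches the published Brier values. The function names are hypothetical, and counting a probability of exactly 0.5 as an away pick is an assumption, since the page does not define that case.

```python
import math
from typing import Dict, Sequence, Tuple


def brier(p_home: float, home_won: int) -> float:
    """Squared error between the home-win probability and the 0/1 result."""
    return (p_home - home_won) ** 2


def log_loss(p_home: float, home_won: int, eps: float = 1e-15) -> float:
    """Negative log of the probability assigned to what actually happened."""
    p = min(max(p_home, eps), 1.0 - eps)  # clip to avoid log(0)
    return -math.log(p if home_won else 1.0 - p)


def picked_winner(p_home: float, home_won: int) -> bool:
    """True when the favored side won (assumption: exactly 0.5 counts as away)."""
    return (p_home > 0.5) == bool(home_won)


def summarize(rows: Sequence[Tuple[float, int]]) -> Dict[str, float]:
    """Averages over (p_home, home_won) rows, as on the leaderboard."""
    n = len(rows)
    return {
        "brier": sum(brier(p, y) for p, y in rows) / n,
        "log_loss": sum(log_loss(p, y) for p, y in rows) / n,
        "accuracy": sum(picked_winner(p, y) for p, y in rows) / n,
    }
```

For the 2026-06-10 game (CAR @ VGK, home side VGK lost), `brier(0.512, 0)` rounds to 0.262 and `brier(0.491, 0)` rounds to 0.241, matching the first row. A few rows differ from this calculation by 0.001, presumably because the published Brier scores are computed from unrounded probabilities.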
Full manifest (the rules)
{
  "metrics": {
    "auxiliary": [
      "Brier score",
      "Log loss",
      "Accuracy"
    ],
    "confidence_interval": "95% bootstrap CI on ECE with 1000 resamples, published alongside point estimate",
    "ece_formula": "Sum over bins of |bin_avg_pred - bin_avg_outcome| weighted by bin sample fraction",
    "headline": "Expected Calibration Error (ECE), 10 equal-width bins"
  },
  "model_versioning": {
    "policy": "ZenHodl model weights as deployed at T-60 of each game are what counts. Each prediction row in raw.jsonl includes the model version ID so post-hoc retrains do not invalidate prior predictions.",
    "retrains_during_window": "Permitted. Disclosed in the per-game row\u0027s model_version field."
  },
  "opponent": "pinnacle",
  "opponent_display_name": "Pinnacle",
  "opponent_tagline": "the canonically-sharpest sportsbook\u0027s closing line",
  "publication": {
    "live_url": "https://zenhodl.net/benchmarks/nhl-playoffs-2026-vs-pinnacle",
    "manifest_file": "https://zenhodl.net/benchmarks/nhl-playoffs-2026-vs-pinnacle/manifest.json",
    "raw_data_jsonl": "https://zenhodl.net/benchmarks/nhl-playoffs-2026-vs-pinnacle/raw.jsonl",
    "we_publish_when_we_lose": true
  },
  "published_at": "2026-05-08T13:50:00Z",
  "rule_changes": "Once this manifest\u0027s SHA-256 hash is broadcast on Polygon, the rules above are frozen. If ZenHodl edits this file at any later point, the on-chain hash will not match the served file. Anyone can verify by hashing the served manifest.json and comparing to the on-chain transaction data field.",
  "scope": {
    "first_eligible_game_after": "2026-05-08T14:00:00Z",
    "last_eligible_game_before": "2026-06-25T00:00:00Z",
    "sport": "NHL",
    "window": "2026 Stanley Cup Playoffs Conference Semifinals through Stanley Cup Finals (inclusive)"
  },
  "snapshot": {
    "matching": "Each NHL game matched to its Pinnacle market by team-name resolution (Odds API full team name \u2192 ZenHodl tricode via core/api/benchmarks/sports.py NHL_TEAM_MAP). Matching script published in this repo so the join is auditable.",
    "pinnacle_source": "Pinnacle two-way moneyline (game-winner H2H market) via The Odds API at https://api.the-odds-api.com/v4. Decimal odds are multiplicatively devigged to remove Pinnacle\u0027s overround (typically 2-5pp on NHL), producing a true implied probability for the home side.",
    "tie_handling": "If either source is unavailable at T-60, the game is excluded from BOTH model\u0027s metrics. Recorded with status=\u0027zenhodl_unavailable\u0027 or \u0027polymarket_unavailable\u0027 (used as the generic opponent-unavailable status flag) in the public raw.jsonl, with the actual cause in the row\u0027s opponent_error field.",
    "timing": "Both predictions captured no later than T-60 minutes before official puck drop",
    "zenhodl_source": "ZenHodl NHL pregame win probability via internal SignalEngine.get_pregame_predictions(\u0027NHL\u0027)"
  },
  "title": "ZenHodl vs Pinnacle Closing Line \u2014 NHL Playoffs 2026 Calibration Benchmark",
  "version": "1.0",
  "why_pinnacle": "Pinnacle is the canonically-sharpest sportsbook in the industry \u2014 every CLV claim in traditional sports analytics is implicitly benchmarked against Pinnacle\u0027s closing line. Their volume is dominated by syndicate bettors and their margins are the lowest in the business. Beating Pinnacle on calibration is the gold-standard demonstration of forecasting skill in a way actual hockey bettors recognize."
}
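
The manifest's `pinnacle_source` field says decimal odds are multiplicatively devigged. Here is a minimal sketch of that step, assuming the usual meaning of the term: take the inverse of each decimal price and rescale the two values so they sum to 1. The function names and the example prices are hypothetical, not taken from the benchmark's code or data.

```python
from typing import Tuple


def overround(home_decimal: float, away_decimal: float) -> float:
    """Bookmaker margin: how far the raw implied probabilities exceed 1."""
    return 1.0 / home_decimal + 1.0 / away_decimal - 1.0


def devig_multiplicative(home_decimal: float, away_decimal: float) -> Tuple[float, float]:
    """Return (p_home, p_away) after proportional removal of the overround."""
    if home_decimal <= 1.0 or away_decimal <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    raw_home = 1.0 / home_decimal  # implied probability including margin
    raw_away = 1.0 / away_decimal
    total = raw_home + raw_away  # equals 1 + overround
    return raw_home / total, raw_away / total


# Hypothetical two-way price of 1.90 on each side: about a 5.3% overround,
# which devigs to an even 50/50.
p_home, p_away = devig_multiplicative(1.90, 1.90)
```

Proportional scaling preserves the ratio between the two raw implied probabilities. Other devig methods (additive, power, Shin) distribute the margin differently and would give slightly different home probabilities for lopsided prices.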

Why this benchmark?

Pinnacle is widely treated as the sharpest sportsbook in the industry: closing line value (CLV) claims in traditional sports analytics are implicitly benchmarked against Pinnacle's closing line. Its volume is dominated by syndicate bettors, and its margins are among the lowest in the business. Beating Pinnacle on calibration is a demonstration of forecasting skill that actual hockey bettors recognize.

The point is to find out transparently. We may win, we may lose, but the rows and scoring rules stay published.

If we lose, the loss appears here, in the same row, with the same Brier score. The manifest commits us to publishing that outcome — there is no edit path that changes it without invalidating the on-chain hash.
