This page presents empirical evidence that ZenHodl's pre-game model carries information beyond luck. When the model identifies value (a CLV-positive entry, defined precisely below), trades win 89.9% of the time. When its value call is wrong (a CLV-negative entry), trades win 11.2% of the time. The 78.8-percentage-point gap between the two is the test, and we publish both halves.
The full verification script (verify_clv_gap.py) is published. It reads trades.jsonl and reproduces every number on this page.
| Bucket | n | Win rate | Mean CLV |
|---|---|---|---|
| CLV+ (close − entry > +0.5c) | 457 | 89.9% | +28.3c |
| CLV= (|close − entry| ≤ 0.5c) | 5 | 60.0% | +0.2c |
| CLV− (close − entry < −0.5c) | 493 | 11.2% | −31.1c |
Sample composition: 1,642 trades on the public ledger have a settled outcome. 955 of those (~58.2%) had a recorded closing price; the gap is computed on the 950 with a non-trivial price move (excluding 5 trades with |close − entry| ≤ 0.5c). A two-proportion Z-test of the hypothesis that the two win rates are equal, run on the aggregate dataset, rejects it at any conventional significance level.
Every trade has an entry price (the market price when we fired) and a closing price (the market's final price just before the event resolved). CLV is the difference, measured in cents on a 0–100 implied-probability scale. We bucket each measured trade into one of three groups using a ±0.5c threshold; these are the three rows of the table above.
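In code, the bucketing rule is a one-liner. The sketch below follows the ledger's field semantics (prices in cents on the 0–100 scale); `clv_bucket` is an illustrative helper name, not part of the published script.

```python
def clv_bucket(entry_price_c: float, closing_price_c: float, eps: float = 0.5) -> str:
    """Bucket a trade by closing-line value on the 0-100c implied-probability scale."""
    clv = closing_price_c - entry_price_c  # positive = we beat the close
    if clv > eps:
        return "CLV+"
    if clv < -eps:
        return "CLV-"
    return "CLV="  # inside the ±0.5c dead band

print(clv_bucket(42.0, 55.0))  # entered at 42c, market closed at 55c -> CLV+
```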
CLV is a widely-used skill metric in sports analytics because it is judged by the market itself, not by who won the game. A bettor who consistently beats the close is identifying mispriced bets — even when those specific bets lose. A bettor who consistently loses to the close is paying retail prices regardless of how often the games go their way.
A common objection to a single headline number: "what if one sport is carrying the average?" The breakdown below shows the eight sports with at least 30 trades in the gap (n_CLV+ + n_CLV− ≥ 30). Sports below this cutoff and sports with thinner pipeline coverage are surfaced separately in the selection-bias section below.
| Sport | CLV+ n | CLV+ WR | CLV− n | CLV− WR | Gap | Ratio |
|---|---|---|---|---|---|---|
| MLB | 103 | 98.1% | 92 | 12.0% | +86 pp | 8.2× |
| WTA | 59 | 91.5% | 80 | 8.8% | +83 pp | 10.5× |
| CS2 | 87 | 86.2% | 114 | 10.5% | +76 pp | 8.2× |
| LoL | 47 | 83.0% | 45 | 6.7% | +76 pp | 12.4× |
| ATP | 62 | 88.7% | 76 | 14.5% | +74 pp | 6.1× |
| NBA (small sample) | 16 | 93.8% | 19 | 0.0% | +94 pp | n/a |
| NHL | 56 | 91.1% | 41 | 22.0% | +69 pp | 4.1× |
| Soccer | 15 | 73.3% | 20 | 10.0% | +63 pp | 7.3× |
Win-rate ratios across the seven sports with n_CLV− ≥ 20 range from 4.1× (NHL: 91% / 22%) to 12.4× (LoL: 83% / 7%). NBA's "n/a" ratio reflects 0 wins on n=19 CLV− trades — a small-sample artifact rather than a defined ratio. The gap is positive in every individual sport at this sample size; no single sport is carrying the headline 78.8-pp average. Per-sport win rates on samples this size carry meaningful uncertainty; the formal Z-test is computed on the aggregate dataset only, not on each sport separately.
If the 78.8-pp gap were an artifact of where we set the CLV± threshold, the gap would shrink as we move the boundary further from zero. It does the opposite: trades that beat the close by larger margins win more reliably, and trades that miss the close by larger margins lose more reliably. The trend is consistent rather than perfectly monotonic at the per-bucket win-rate level (89.9% → 89.9% → 90.0% → 90.9% on the CLV+ side as the threshold widens from ±0.5c to ±5c), but the gap grows from +78.2 pp at ±0c to +82.2 pp at ±5c.
| Threshold | CLV+ n | CLV+ WR | CLV− n | CLV− WR | Gap |
|---|---|---|---|---|---|
| ±0.0c | 460 | 89.6% | 494 | 11.3% | +78.2 pp |
| ±0.5c (default) | 457 | 89.9% | 493 | 11.2% | +78.8 pp |
| ±1.0c | 456 | 89.9% | 490 | 10.6% | +79.3 pp |
| ±2.0c | 451 | 90.0% | 487 | 10.5% | +79.5 pp |
| ±5.0c | 429 | 90.9% | 457 | 8.8% | +82.2 pp |
This is the pattern expected of real signal rather than threshold-induced bucketing: large CLV magnitudes predict outcomes more decisively than small ones.
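The threshold sweep itself is a few lines of code. The sketch below uses the same field names as the verification snippet further down (`entry_price_c`, `closing_price_c`, `won`); `gap_at` and the tiny synthetic ledger are illustrative, not the published script or real data.

```python
def gap_at(trades, eps):
    """Win-rate gap in percentage points between CLV+ and CLV- at threshold ±eps."""
    plus  = [t for t in trades if t["closing_price_c"] - t["entry_price_c"] > eps]
    minus = [t for t in trades if t["closing_price_c"] - t["entry_price_c"] < -eps]
    wr = lambda bucket: sum(t["won"] for t in bucket) / len(bucket)
    return 100 * (wr(plus) - wr(minus))

# tiny synthetic ledger, for illustration only
demo = [
    {"entry_price_c": 40, "closing_price_c": 55, "won": True},   # big CLV+, win
    {"entry_price_c": 60, "closing_price_c": 70, "won": True},   # big CLV+, win
    {"entry_price_c": 50, "closing_price_c": 52, "won": False},  # small CLV+, loss
    {"entry_price_c": 45, "closing_price_c": 30, "won": False},  # big CLV-, loss
    {"entry_price_c": 65, "closing_price_c": 50, "won": False},  # big CLV-, loss
]
for eps in (0.5, 5.0):
    print(f"±{eps}c gap: {gap_at(demo, eps):+.1f} pp")
```

Widening the threshold drops the small-CLV+ loser from the bucket, so the toy gap grows from +66.7 pp to +100.0 pp, the same direction of movement the real table shows.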
Of 1,642 settled trades on the public ledger, 955 (~58.2%) had a recorded closing price at settlement; the remaining 687 did not. The headline gap is computed on the measured subset only. Two questions follow:
1. Do measured and unmeasured trades win at different rates overall?
No detectable aggregate difference. Measured: 49.1% WR (n=955). Unmeasured: 48.3% WR (n=687). Two-proportion Z-test on the WR delta yields z = 0.31, p = 0.754 — well above any conventional significance threshold. The missingness mechanism does not preferentially keep winners or losers in the measured subset at the aggregate level.
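The quoted z and p follow from a standard two-proportion Z-test. In the sketch below, the win counts 469 and 332 are inferred from the published rates (49.1% of 955, 48.3% of 687) rather than read from the ledger, and `two_prop_ztest` is an illustrative helper, not part of the published script.

```python
import math

def two_prop_ztest(w1, n1, w2, n2):
    """Two-proportion Z-test: H0 is that both groups share one underlying win rate."""
    p1, p2 = w1 / n1, w2 / n2
    pooled = (w1 + w2) / (n1 + n2)                       # win rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided p-value
    return z, p

z, p = two_prop_ztest(469, 955, 332, 687)  # measured vs unmeasured trades
print(f"z = {z:.2f}, p = {p:.3f}")         # z = 0.31, p = 0.754
```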
2. Is missingness sport-correlated?
Yes, substantially. Coverage varies by sport. We define a sport as well-covered when ≥75% of its settled trades have a recorded closing price; partially covered when 25–75%; essentially absent when below 25%.
| Sport | Measured | Unmeasured | Coverage | Status |
|---|---|---|---|---|
| SOCCER | 35 | 0 | 100.0% | well-covered |
| UNKNOWN | 5 | 0 | 100.0% | well-covered |
| ATP | 139 | 1 | 99.3% | well-covered |
| WTA | 140 | 4 | 97.2% | well-covered |
| NHL | 97 | 8 | 92.4% | well-covered |
| LoL | 93 | 10 | 90.3% | well-covered |
| MLB | 196 | 60 | 76.6% | well-covered |
| CS2 | 201 | 192 | 51.1% | partially covered |
| TENNIS (uncat.) | 5 | 6 | 45.5% | partially covered |
| NBA | 36 | 95 | 27.5% | partially covered |
| NCAAWB | 4 | 80 | 4.8% | essentially absent |
| NCAAMB | 4 | 231 | 1.7% | essentially absent |
| TOTAL | 955 | 687 | 58.2% | |
Seven sports are well-covered (≥75% coverage), three are partially covered (25–75%), two are essentially absent (<25%). NCAAMB and NCAAWB fall out of the headline analysis because the closing-line capture pipeline came online after most of our NCAA trading window had already settled. The headline 78.8-pp gap is therefore evidence of skill on the measured subset; we cannot make the same claim for NCAA from this dataset.
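The status column is mechanical given the coverage bands above; a minimal sketch (`coverage_status` is an illustrative name):

```python
def coverage_status(measured: int, unmeasured: int) -> str:
    """Classify closing-line coverage: >=75% well-covered, 25-75% partial, <25% absent."""
    total = measured + unmeasured
    coverage = measured / total if total else 0.0
    if coverage >= 0.75:
        return "well-covered"
    if coverage >= 0.25:
        return "partially covered"
    return "essentially absent"

print(coverage_status(196, 60))   # MLB, 76.6% -> well-covered
print(coverage_status(201, 192))  # CS2, 51.1% -> partially covered
print(coverage_status(4, 231))    # NCAAMB, 1.7% -> essentially absent
```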
The full trade dataset and the verification script are published. The headline numbers are reproducible in seconds; the per-sport table, the formal Z-tests, the threshold-robustness table, and the selection-bias analysis above all come from the same script.
```bash
curl -s https://zenhodl.net/api/trades.jsonl > trades.jsonl
python3 -c '
import json

plus = minus = wp = wm = 0
for line in open("trades.jsonl"):
    r = json.loads(line)
    if r.get("won") is None:
        continue  # not settled yet
    if r.get("closing_price_c") is None or r.get("entry_price_c") is None:
        continue  # no recorded closing line
    diff = r["closing_price_c"] - r["entry_price_c"]  # CLV in cents
    won = bool(r["won"])
    if diff > 0.5:
        plus += 1; wp += int(won)
    elif diff < -0.5:
        minus += 1; wm += int(won)

print(f"CLV+: {wp}/{plus} = {wp/plus*100:.1f}%")
print(f"CLV-: {wm}/{minus} = {wm/minus*100:.1f}%")
print(f"Gap : {(wp/plus - wm/minus)*100:.1f} pp")
'
```
```bash
curl -O https://zenhodl.net/api/verify_clv_gap.py
curl -O https://zenhodl.net/api/trades.jsonl
python3 verify_clv_gap.py trades.jsonl
```
The dataset is updated continuously, so the exact numbers you see when you run the script will drift slightly from this page's snapshot. The page is dated and we re-publish a fresh snapshot roughly monthly. If your run produces materially different conclusions, please tell us.
What it proves
What it does not prove
ZenHodl. "The 78-Point Gap: Empirical Evidence of Forecasting Skill."
Snapshot 2026-05-08. Accessed 2026-05-08.
https://zenhodl.net/clv-evidence
Page content licensed CC BY 4.0. Underlying dataset is public.