Drawdown-Aware Position Sizing: When Kelly Tells You To Bet Less

Most position-sizing math assumes your edge is real. Kelly does. Fractional-Kelly does. They both take the edge as input and tell you what fraction of bankroll to commit. That's fine when the model is working. The problem is the day the model stops working — and you don't know yet whether the model has stopped working or you're just getting unlucky.

Six straight losing days in late April taught us this lesson cleanly. Our calibrated win-probability models were posting nominal edge. The Kelly sizer was happily allocating 4-6% of session bankroll per trade. Win rate held at 58% — well within the noise band of a healthy model. But the realized P&L was deeply red, and we had no good in-loop response: the circuit breaker hadn't tripped yet (sample size still below threshold), and the bot kept entering full-size positions until something gave.

This post is about what we built to fill that gap — a smoother sizing ramp that sits between full Kelly and a hard circuit-breaker stop, so the bot can keep trading (and keep generating data) at reduced exposure while a drawdown is still ambiguous. The code is short, the implementation took an afternoon, and the production behavior is meaningfully better than either the binary stop or the unmodified Kelly fraction.

The Gap Between Kelly and the Kill Switch

Production trading systems typically have two sizing tools:

A Kelly fraction (often quarter-Kelly or eighth-Kelly to handle estimation error in the edge) that scales position size up with edge confidence
A kill switch or circuit breaker that disables trading entirely when some loss threshold is crossed

These are at opposite ends of the response curve. Kelly says trade. The kill switch says stop. Between them is a wide region, the early stages of a drawdown, where neither tool fires. Kelly keeps sizing as if the edge is fully real. The kill switch waits for a multi-day pattern before deciding the strategy is broken. In that gap, the bot continues to bet full size into a regime that, given the losses so far, is more likely to be adverse than it looked at the start of the session.

You can argue this gap should be narrower (lower kill-switch threshold) or that it doesn't matter (just take the bleed). We disagree on both. Lowering the circuit-breaker threshold means more false trips on noise-driven losing streaks, which costs you the strategy's actual edge during normal variance. Taking the full bleed means your bankroll absorbs the worst case while you wait for confirmation, and bankroll is the resource that keeps you alive long enough to get more confirmation. The right shape isn't a single all-or-nothing step; it's a ramp.

The Ramp

Here is the production schedule we settled on, after walk-forward testing across a year of trading data:

session $ P&L              size multiplier
>= 0                       1.00   (full size: flat or winning today)
-10 to < 0                 0.70   (cool off: lukewarm)
-25 to < -10               0.40   (real drawdown: protect bankroll)
< -25                      0.00   (effectively CB-tripped)

Three observations on why this shape works:

1. The first cooling tier kicks in at zero, not at a loss. The moment the session's running P&L crosses below zero, the multiplier drops to 0.7. This sounds aggressive, since most traders would only respond after a meaningful loss, but the justification is simple: for a strategy whose typical session finishes positive, a session that has crossed below zero is sitting in the worse half of the distribution of sessions. Sizing at full Kelly there is sizing as if you're in the upper half. The 30% haircut adjusts for the fact that you've already learned something.

2. The middle tier is where most of the work happens. Between -$10 and -$25 of session P&L is the danger zone: too small for the kill switch, too large to ignore. A 40% multiplier means the bot is still in the market, still generating signals and CLV records, but each individual trade carries 40% of the dollar exposure of a full-Kelly trade. If the model is fine and we're just unlucky, the small-size trades will carry the session back into the upper tier. If the model is genuinely degraded, the small-size trades minimize how much we lose before the circuit breaker confirms the regime change.

3. The bottom tier is zero, not "circuit breaker." When session P&L drops below -$25, the multiplier becomes zero, which functionally pauses entries. This is intentional overlap with the sport-level circuit breaker — they catch different timescales (intra-session vs multi-session) and the redundancy is a feature. If the per-session ramp blocks first, the circuit breaker has more time to evaluate whether to suspend the sport entirely.

The Code

The whole helper is about 50 lines. Here's the live production version, lightly trimmed for readability:

"""Drawdown-aware position sizing helper.

Adds a smooth ramp between Kelly and the binary circuit breaker. When
the session is in drawdown, scale position size down before the CB
fully trips — keeps signal flowing while limiting bleed.
"""
from typing import Optional

DEFAULT_RAMP = (
    (0.0, 1.00),     # winning today — full size
    (-10.0, 0.70),   # mild drawdown — cool off
    (-25.0, 0.40),   # real drawdown — protect bankroll
    (-1e9, 0.00),    # deep drawdown — effectively CB-tripped
)


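def scale_for_drawdown(
    base_size: float,
    session_pnl_usd: float,
    ramp: Optional[tuple] = None,
) -> float:
    """Multiply Kelly-sized position by a drawdown-dependent factor.

    `ramp` is a tuple of (floor, multiplier) pairs ordered from the highest
    floor to the lowest; the first floor at or below the session P&L wins.
    """
    if base_size <= 0:
        return 0.0  # nothing to scale
    schedule = ramp or DEFAULT_RAMP  # None (or an empty ramp) falls back to the default
    for floor, mult in schedule:
        if session_pnl_usd >= floor:
            return float(base_size) * float(mult)
    return 0.0  # below every floor: fail closed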
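A quick way to check the tier boundaries is to run the helper at each breakpoint. The sketch below repeats a condensed copy of the schedule so it runs on its own; the sample P&L values and the base size of 100 are illustrative, not production figures.

```python
# Condensed copy of the helper above so this snippet is self-contained.
DEFAULT_RAMP = (
    (0.0, 1.00),
    (-10.0, 0.70),
    (-25.0, 0.40),
    (-1e9, 0.00),
)


def scale_for_drawdown(base_size, session_pnl_usd, ramp=None):
    if base_size <= 0:
        return 0.0
    for floor, mult in (ramp or DEFAULT_RAMP):
        if session_pnl_usd >= floor:
            return float(base_size) * float(mult)
    return 0.0


# Illustrative session P&L values, including the exact breakpoints.
for pnl in (5.0, 0.0, -0.01, -10.0, -10.01, -25.0, -25.01):
    print(f"{pnl:>8.2f} -> {scale_for_drawdown(100.0, pnl):.1f}")
```

The two edges worth remembering: a session at exactly $0 still gets full size, and a session at exactly -$25 still gets the 0.40 multiplier.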

The integration into the trading loop is one extra call plus a guard. After Kelly produces a base size, the drawdown scaler runs over it:

from drawdown_scaler import scale_for_drawdown

# Excerpt from inside the bot's entry method (hence `self` and the bare return).
base_size = kelly_size(edge, fair_prob, bankroll)
size = scale_for_drawdown(base_size, session_pnl_usd=self._session_pnl)

if size <= 0:
    return None  # ramp closed the entry — log and skip

submit_order(token, side, size)

self._session_pnl is tracked by every bot already (it's needed for the kill switch and for logging). No extra plumbing.

What It Caught

The afternoon we deployed this, one of our moneyline bots was three trades into a session that was running -$8.20. That was not enough to trip anything, but the ramp had already cooled the multiplier to 0.7. The next signal that fired was a 6c-edge MLB entry, base size 28 shares, ramp-scaled to 19 shares (28 × 0.7 = 19.6, rounded down to whole shares). Without the ramp it would have been 28 shares of a trade that resolved against us; the ramp saved roughly $2.50 on that single trade.

That sounds small in isolation. The point isn't the $2.50; it's that across hundreds of trades during ambiguous-drawdown sessions, the ramp consistently trims dollar losses from the worst part of the equity curve. We backtested the ramp against a year of trade data: on healthy days it costs essentially nothing (the multiplier stays at 1.0 while a session is at or above break-even, and a mildly negative session only drops to 0.7); on the worst-decile sessions it cut realized losses by 22-31% with no measurable impact on upside. That's the asymmetric trade we wanted.
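The asymmetry can be seen on a toy replay. The sketch below is not our backtest harness: `replay_session` is a hypothetical helper, the trade lists are made up, and it assumes each trade resolves before the next entry is sized.

```python
RAMP = ((0.0, 1.00), (-10.0, 0.70), (-25.0, 0.40), (-1e9, 0.00))


def ramp_multiplier(session_pnl_usd):
    """Stateless tier lookup using the default schedule."""
    for floor, mult in RAMP:
        if session_pnl_usd >= floor:
            return mult
    return 0.0


def replay_session(full_size_pnls, use_ramp=True):
    """Replay trades given the P&L each would have realized at full size.

    Assumption: trades settle one at a time, so the running session P&L is
    known before the next entry is sized.
    """
    session_pnl = 0.0
    for trade_pnl in full_size_pnls:
        mult = ramp_multiplier(session_pnl) if use_ramp else 1.0
        session_pnl += trade_pnl * mult
    return session_pnl


# Hypothetical sessions (dollars per trade at full Kelly size).
losing = [-6.0, -6.0, -6.0, 4.0, -6.0]
healthy = [5.0, 5.0, -3.0, 5.0]

print(replay_session(losing, use_ramp=False), round(replay_session(losing), 2))
print(replay_session(healthy, use_ramp=False), replay_session(healthy))
```

On the made-up losing session the ramp turns a -$20 day into roughly -$13.40, while the healthy session never dips below zero and is untouched. Real numbers depend entirely on the trade sequence.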

Why Ramps Beat Thresholds

Three practical reasons:

Smoother behavior under noise. Threshold-based response (size 100% above -$10, 0% below) creates a discontinuity right where the bot is most sensitive. A session bouncing between -$9 and -$11 P&L would be alternately full-size and zero-size, generating jittery operator dashboards and brittle log analysis. A tiered ramp still steps at each breakpoint, but the same session moves between 0.7 and 0.4 rather than between 1.0 and 0.0, so no single crossing swings the bot from full size to nothing.
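The -$9 / -$11 example can be checked directly. In this sketch `threshold_multiplier` is a hypothetical stand-in for a binary cutoff at -$10, and the P&L path is invented for illustration.

```python
def threshold_multiplier(pnl, cutoff=-10.0):
    """Binary response: full size at or above the cutoff, nothing below it."""
    return 1.0 if pnl >= cutoff else 0.0


def ramp_multiplier(pnl):
    """Tiered response using the default schedule from this post."""
    for floor, mult in ((0.0, 1.00), (-10.0, 0.70), (-25.0, 0.40)):
        if pnl >= floor:
            return mult
    return 0.0


# Hypothetical running session P&L hovering around the -$10 line.
path = [-9.0, -11.0, -9.0, -11.0]

print([threshold_multiplier(p) for p in path])  # [1.0, 0.0, 1.0, 0.0]
print([ramp_multiplier(p) for p in path])       # [0.7, 0.4, 0.7, 0.4]
```

The stateless ramp still flips between adjacent tiers on this path; the swing is 0.3 instead of 1.0. The hysteresis rule described under "What Not To Do" is what removes the remaining flip.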

Information preservation. A ramp at 40% size still places trades. Those trades resolve. We learn whether the model is broken or just unlucky. A hard threshold at the same point would block every entry, leaving you to guess. Information costs us 60% of the position size; it buys us evidence we use later.

Operator psychology. Hard cutoffs invite override. Operators see the bot stop trading, second-guess the threshold, and reach for the manual unlock. A ramp doesn't generate the same pressure because the bot is still trading, just smaller. We've had zero operator overrides of the ramp in production. We had several overrides of the binary kill switches it replaced.

Tuning Per Sport

The default ramp uses absolute dollar P&L, which is reasonable for a single-bankroll bot. For multi-sport setups where each sport has its own session-cap (some sports get $100 of session risk, others $400), the breakpoints should scale to the cap. We expose this via the ramp argument:

# Higher-allocation sport, ramp scaled to 4x default
NHL_RAMP = (
    (0.0, 1.00),
    (-40.0, 0.70),
    (-100.0, 0.40),
    (-1e9, 0.00),
)

size = scale_for_drawdown(base_size, session_pnl, ramp=NHL_RAMP)

Two heuristics for picking breakpoints on a new sport:

First-tier floor at roughly 10% of the sport's daily session cap (-$10 on a $100 cap, -$40 on a $400 cap). This is the threshold at which "today is meaningfully losing" stops being noise.
Hard-zero floor at roughly 25% of the cap. Below this, whichever of the ramp or the circuit breaker is stricter decides; either way the bot stops adding exposure.

Don't overengineer the middle. Two intermediate tiers (cool-off and drawdown) are enough — adding more tiers gives you false precision over a noisy signal.
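If you would rather derive the breakpoints than hand-write each tuple, a small builder does it. `ramp_for_cap` is a hypothetical helper, not part of the production module. Its default fractions (10% and 25% of the cap) are chosen because they reproduce the default and NHL schedules shown above for caps of $100 and $400; treat them as starting points.

```python
def ramp_for_cap(session_cap_usd, first_floor_frac=0.10, zero_floor_frac=0.25):
    """Build a (floor, multiplier) schedule scaled to a sport's session cap.

    The fractions are assumptions; the multipliers match the default ramp.
    """
    if session_cap_usd <= 0:
        raise ValueError("session_cap_usd must be positive")
    if not 0 < first_floor_frac < zero_floor_frac:
        raise ValueError("need 0 < first_floor_frac < zero_floor_frac")
    return (
        (0.0, 1.00),
        (-first_floor_frac * session_cap_usd, 0.70),
        (-zero_floor_frac * session_cap_usd, 0.40),
        (-1e9, 0.00),
    )


print(ramp_for_cap(100.0))
print(ramp_for_cap(400.0))
```

The result can be passed straight to the `ramp` argument of `scale_for_drawdown`.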

What Not To Do

A few patterns we tried and dropped:

Don't reset the ramp after a single winning trade. Recovery should require sustained positive P&L, not one trade. We initially had the multiplier rebound the moment session P&L crossed back above any threshold; this caused the bot to alternate between cool-off and full-size every other trade as the running P&L oscillated. Now the ramp uses a hard hysteresis: once you're below -$10, you have to climb back above -$5 before the multiplier increases. Same pattern as the sport circuit breaker.
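The trimmed helper shown earlier is stateless, so the hysteresis is not visible in it. A minimal sketch of one way to add it follows. The class name, the single $5 recovery band applied at every breakpoint, and the sample path are all assumptions; only the "-$10 down, -$5 back up" behavior comes from the rule described above.

```python
DEFAULT_RAMP = (
    (0.0, 1.00),
    (-10.0, 0.70),
    (-25.0, 0.40),
    (-1e9, 0.00),
)


class HysteresisRamp:
    """Tiered ramp that steps down immediately but steps up only after a recovery band."""

    def __init__(self, ramp=DEFAULT_RAMP, recovery_band_usd=5.0):
        self.ramp = ramp
        self.band = recovery_band_usd  # assumption: same band at every breakpoint
        self.tier = 0                  # index into ramp; 0 means full size

    def _raw_tier(self, pnl):
        """Tier the stateless lookup would pick for this P&L."""
        for i, (floor, _mult) in enumerate(self.ramp):
            if pnl >= floor:
                return i
        return len(self.ramp) - 1  # below every floor: deepest tier

    def multiplier(self, pnl):
        raw = self._raw_tier(pnl)
        if raw > self.tier:
            # Drawdown deepened: cool off right away.
            self.tier = raw
        else:
            # Recovery: move up a tier only once P&L clears the better tier's
            # floor by the full band (e.g. above -5 to leave the 0.40 tier).
            while self.tier > 0 and pnl > self.ramp[self.tier - 1][0] + self.band:
                self.tier -= 1
        return self.ramp[self.tier][1]


ramp = HysteresisRamp()
path = [-9.0, -11.0, -9.0, -11.0, -4.0, 1.0, 6.0]  # hypothetical running P&L
print([ramp.multiplier(p) for p in path])
# [0.7, 0.4, 0.4, 0.4, 0.7, 0.7, 1.0]
```

The bounce between -$9 and -$11 now holds at 0.4 until P&L climbs above -$5. One instance per bot per session keeps the state local.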

Don't apply the ramp to exit sizing. The ramp scales entries, not exits. If the bot already holds a position when the session goes deep red, the right move isn't to scale down the exit — that locks in more loss. Hold the existing position to settlement (which is the normal hold-to-settlement strategy) or take a deliberate stop, but don't let the ramp interfere with already-committed exposure.

Don't share the ramp across bots. Each bot tracks its own session P&L and runs its own ramp. A multi-bot portfolio that pools ramps would have weird coupling — a winning soccer bot would be subsidizing a losing tennis bot's risk appetite, and the failure mode would only show up when both started losing simultaneously. Per-bot ramps localize the failure.

Pairing With the Circuit Breaker

The drawdown ramp and the sport-level circuit breaker work on different time horizons and complement each other:

Mechanism              Time horizon            Trigger               Action
Drawdown ramp          Intra-session (hours)   Running session P&L   Scale entry size 100% → 0%
Sport circuit breaker  Multi-session (days)    Rolling 30d ROI       Block all entries until recovery

A bad afternoon hits the ramp first — entries scale down within the same session. If the bad afternoon turns into a bad week, the circuit breaker kicks in over the multi-day window and disables the sport entirely until ROI recovers. The two responses are in series, not parallel: ramp before breaker. We've seen sessions where the ramp caught a real degradation early enough that the circuit breaker never needed to trip — the bot self-paced through the rough patch and recovered before multi-day stats turned ugly.
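In code, the layering reduces to two gates in front of every entry. This is a sketch under stated assumptions: `gated_entry_size` is a hypothetical name, and the -5% `roi_floor` is a placeholder, not our production breaker threshold.

```python
from typing import Optional

RAMP = ((0.0, 1.00), (-10.0, 0.70), (-25.0, 0.40), (-1e9, 0.00))


def ramp_multiplier(session_pnl_usd: float) -> float:
    """Intra-session control: tier lookup on running session P&L."""
    for floor, mult in RAMP:
        if session_pnl_usd >= floor:
            return mult
    return 0.0


def gated_entry_size(
    base_size: float,
    session_pnl_usd: float,
    rolling_30d_roi: float,
    roi_floor: float = -0.05,  # placeholder threshold for the sport breaker
) -> Optional[float]:
    """Return the size to submit, or None when either control blocks the entry."""
    # Multi-session control: a suspended sport takes no entries at all.
    if rolling_30d_roi < roi_floor:
        return None
    # Intra-session control: scale the Kelly size by the drawdown tier.
    size = base_size * ramp_multiplier(session_pnl_usd)
    return size if size > 0 else None


print(gated_entry_size(28, -8.20, 0.02))   # ramp scales the entry
print(gated_entry_size(28, -30.0, 0.02))   # ramp blocks: None
print(gated_entry_size(28, 5.0, -0.10))    # breaker blocks: None
```

Because the ramp reacts within a session and the breaker only after days of data, the ramp is almost always the first of the two to change the bot's behavior.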

Size as Information

The deeper takeaway is one we keep coming back to: position size should respond to evidence as it accumulates, not just to a fixed formula computed at trade entry. Kelly is what to bet if the edge is real. The drawdown ramp is what to bet given new evidence about whether the edge is real. They're complementary, not competing.

Production trading systems that don't have this layered response tend to oscillate between two failure modes: undersizing healthy edges (so afraid of drawdowns they sandbag everything) or oversizing degraded ones (committed to the math even after the math has stopped applying). The fix is to let recent evidence — whatever measure you trust most, whether that's session P&L, closing-line value, realized win rate, or rolling ROI — modulate the static formula in real time.

The ramp is a 50-line implementation. The discipline to take its output seriously, even when you can talk yourself into "the model will recover next trade," is the harder part. We trust the ramp now. After enough sessions where the ramp cut a 4% drawdown to a 2% one, the trust gets easier.

Drawdown-aware sizing is part of the production stack we ship in our API and walk through in the bot course. Module 5 covers the full risk-control hierarchy: Kelly, fractional Kelly, drawdown ramps, sport-level circuit breakers, and daily kill switches — and how they compose into a system that survives bad days.