
How to Build a Polymarket Bot in Python: Architecture, Components, and What Actually Matters

2026-05-12 · polymarket · bot · python · architecture · trading

A working Polymarket trading bot in 2026 has eight components. Most beginner tutorials show you one or two (usually authentication + a market data fetch) and stop there. The result is people copy-pasting code into a script that calls the API but is missing the persistence, the risk layer, the settlement handling, and the observability — all the parts that determine whether the bot is profitable or just an expensive way to lose USDC.

This post is the architectural walkthrough. It does not include copy-pasteable code for every component — for that, see our Polymarket API Python tutorial. Instead, we walk through what each component does, why it matters, and how to think about the design choices.

If you want the conceptual one-pager on what a bot does at all, our how prediction market bots work post covers that. This post assumes you understand the concept and want the production architecture.

The Eight Components

A working Polymarket bot consists of:

  1. Market discovery — which markets are active right now and worth scanning?
  2. Price feed — real-time bid/ask for those markets.
  3. Fair value model — your independent estimate of the true probability.
  4. Signal generation — when does edge cross your threshold?
  5. Risk & sizing — how big should each position be?
  6. Order placement — submit to the CLOB and confirm fills.
  7. Position tracking — what do you currently own, at what cost?
  8. Settlement & redemption — collect winnings when markets resolve.

Plus two cross-cutting concerns: logging/observability (which of these is breaking right now?) and persistence (state survives restarts).

You cannot skip any of these and still have a working bot. We will walk through each in order of build priority.

1. Market Discovery

Your bot needs to know which markets are tradeable today. Use the Gamma API:

import requests

def list_active_sports_markets():
    resp = requests.get(
        "https://gamma-api.polymarket.com/markets",
        params={"tag": "sports", "active": "true", "limit": 500},
        timeout=10,
    )
    resp.raise_for_status()  # fail loudly on API errors instead of parsing garbage
    return resp.json()

For each returned market you get a condition_id and the YES/NO token_id values. Cache this — the active market list does not change second-to-second.

Our own bots keep a poly_token_cache.json file (~14K tokens, schema v3) that maps human-readable game slugs to token IDs. The cache is rebuilt every few hours by a separate script, not on every signal evaluation. This decouples discovery from the trading loop and keeps the trading loop fast.
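Building such a cache from the Gamma response can be sketched as a small pure function. The field names here (`slug`, `conditionId`, `clobTokenIds`) follow the Gamma market schema as we understand it — verify them against your actual payload:

```python
import json

def build_token_cache(markets: list[dict]) -> dict:
    """Map market slug -> condition_id and YES/NO token ids.

    Assumes each market dict carries `slug`, `conditionId`, and
    `clobTokenIds` (often a JSON-encoded [yes_id, no_id] pair);
    adjust the keys if your payload differs.
    """
    cache = {}
    for m in markets:
        token_ids = m["clobTokenIds"]
        if isinstance(token_ids, str):  # Gamma may return this as a JSON string
            token_ids = json.loads(token_ids)
        cache[m["slug"]] = {
            "condition_id": m["conditionId"],
            "yes_token": token_ids[0],
            "no_token": token_ids[1],
        }
    return cache
```

A separate cron-style script can call this every few hours and dump the result to poly_token_cache.json, keeping discovery out of the trading loop.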

2. Price Feed

For real-time prices you need the WebSocket stream — polling REST every second is rude and slow. Connect to wss://ws-subscriptions-clob.polymarket.com and subscribe to the tokens you care about:

# Pseudocode — see py-clob-client for the real implementation
ws = await connect(WS_URL)
await ws.send_json({"type": "subscribe", "channel": "market", "tokens": token_ids})
async for msg in ws:
    update_orderbook(msg)

You will hit operational reality fast:

  - The WS will drop connections silently. Build a heartbeat-based reconnection layer with re-subscription on reconnect.
  - Subscribing to too many tokens at once degrades message rate. Batch wisely.
  - The initial snapshot from a subscription is large; treat incremental updates differently from the snapshot.

Our polymarket_ws_client.py handles reconnects, snapshot-vs-delta semantics, and graceful degradation when the WS is unhealthy. A reasonable starting point: 15-second heartbeat, reconnect after 30 seconds of silence, automatic re-subscription with the original token list.
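The reconnect layer can be structured like this. The timing constants come from the text above; `connect`, `subscribe`, and `handle` are injected callables (assumptions, not a real client API), so the same skeleton works with any WS library — this is a sketch, not our polymarket_ws_client.py:

```python
import asyncio
import time

HEARTBEAT_INTERVAL_S = 15   # from the text: 15-second heartbeat
SILENCE_LIMIT_S = 30        # reconnect after 30 seconds of silence

def should_reconnect(last_msg_ts: float, now: float) -> bool:
    """True once the socket has been silent longer than the limit."""
    return now - last_msg_ts > SILENCE_LIMIT_S

async def run_feed(connect, subscribe, handle, token_ids):
    """Reconnect loop: on any drop or prolonged silence, back off briefly,
    reconnect, and re-subscribe with the original token list."""
    while True:
        try:
            ws = await connect()
            await subscribe(ws, token_ids)   # re-subscribe on every reconnect
            while True:
                # Treat prolonged silence as a dead socket (same check as
                # should_reconnect, expressed as a recv timeout).
                msg = await asyncio.wait_for(ws.recv(), timeout=SILENCE_LIMIT_S)
                handle(msg)
        except (asyncio.TimeoutError, ConnectionError):
            await asyncio.sleep(1)           # brief backoff, then reconnect
```

Separating the silence check from the transport makes the watchdog logic trivially unit-testable.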

3. Fair Value Model

The single most important component — and the only one that can have edge. Everything else is infrastructure.

Your fair value model takes whatever inputs you have (game state, team Elo, player stats, market microstructure) and produces a probability estimate. The critical constraint: the model cannot be trained on market prices. If it learns to agree with the market, you have nothing tradeable. Train against actual game outcomes.

Common approaches:

  - Win probability models trained on historical games. Inputs: score differential, time remaining, possession, Elo ratings, team stats. Output: P(home_wins).
  - Hierarchical models for sports with structured progression (tennis points → games → sets, soccer score over elapsed minutes).
  - Series models for multi-game/multi-map formats (BO3 esports).

Whatever you build, calibrate it. A model that says 70% should win 70% of the time. We covered calibration math in calibration beats accuracy; a calibrated model is the difference between a sized bet that works and a sized bet that blows up.
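A minimal calibration check is easy to run over your logged predictions and outcomes — bucket forecasts and compare the mean forecast in each bucket to the realized win rate (a sketch, not the calibration math from the linked post):

```python
from collections import defaultdict

def calibration_table(preds, outcomes, n_buckets=10):
    """Bucket predictions and compare mean forecast vs. realized win rate.
    For a calibrated model the two numbers roughly match in every bucket.
    Returns {bucket: (mean_pred, win_rate, n)}."""
    buckets = defaultdict(list)
    for p, won in zip(preds, outcomes):
        b = min(int(p * n_buckets), n_buckets - 1)  # clamp p == 1.0 into top bucket
        buckets[b].append((p, won))
    table = {}
    for b, rows in sorted(buckets.items()):
        mean_pred = sum(p for p, _ in rows) / len(rows)
        win_rate = sum(w for _, w in rows) / len(rows)
        table[b] = (round(mean_pred, 3), round(win_rate, 3), len(rows))
    return table
```

Run this weekly against trades.jsonl; a bucket that says 70% but wins 55% is a sizing disaster waiting to happen.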

4. Signal Generation

A signal fires when your fair probability and the market price disagree by enough to be tradeable after costs. The basic check:

def should_trade(fair_prob: float, market_ask_c: int, min_edge_c: int = 5) -> bool:
    # Work in integer cents: the model's fair value minus the market's ask.
    fair_implied_c = int(fair_prob * 100)
    edge_c = fair_implied_c - market_ask_c
    return edge_c >= min_edge_c

The real version is messier. You filter for:

  - Maximum edge (extreme edges are usually stale data or model bugs, not real opportunities — our bots cap somewhere between 15-35c per sport, tuned per sport based on observed calibration)
  - Minimum and maximum entry price (avoid buying at 5c or 90c on most strategies)
  - Cooldowns (do not trade the same market repeatedly within X seconds)
  - Per-sport circuit breakers (if a sport has been bleeding over the last 30 days, stop trading it)
  - Data freshness gates (if the price feed is stale, do not trade)

Most of these are about avoiding low-quality signals, not finding good ones. The good signal is just edge > threshold; everything else exists because real markets generate edge cases that look like signals but are not.
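The quality gates compose naturally into a single predicate. The thresholds below are illustrative defaults (the text tunes max edge and entry prices per sport), and the circuit-breaker state is omitted for brevity:

```python
def passes_filters(edge_c, price_c, last_trade_ts, feed_ts, now,
                   min_edge_c=5, max_edge_c=25, min_price_c=10,
                   max_price_c=85, cooldown_s=60, max_staleness_s=5):
    """Quality gates on a candidate signal; all timestamps in seconds."""
    if not (min_edge_c <= edge_c <= max_edge_c):
        return False  # too small to cover costs, or suspiciously large
    if not (min_price_c <= price_c <= max_price_c):
        return False  # avoid extreme entry prices
    if now - last_trade_ts < cooldown_s:
        return False  # recently traded this market
    if now - feed_ts > max_staleness_s:
        return False  # price feed is stale; the edge may be fictional
    return True
```

Note that every gate is a rejection: the function only ever vetoes, never generates, which keeps the "good signal" definition in one place.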

5. Risk & Sizing

Once a signal fires, how much do you bet? The honest answers, in increasing sophistication: flat sizing, a fixed fraction of bankroll, fractional Kelly, and Kelly modulated by drawdown state.

We use a dynamic_sizing.py module that combines edge-tiered Kelly with per-sport allocations (BASE_ALLOCATION per sport) and a drawdown_scaler that reduces size during losing streaks. Kelly without drawdown protection ruins bankrolls; flat sizing leaves money on the table. Pick a middle path and stress-test it.

For the math, see our Kelly Criterion for Prediction Markets post.
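The middle path can be sketched as fractional Kelly with a drawdown scaler and a hard cap. For a binary share bought at price p (in dollars) that pays $1, full Kelly is f* = (q − p)/(1 − p); the multipliers below are illustrative, not our dynamic_sizing.py:

```python
def kelly_fraction(fair_prob: float, price: float) -> float:
    """Full-Kelly bankroll fraction for a $1-payout binary share:
    f* = (q - p) / (1 - p), floored at zero when there is no edge."""
    if price >= 1.0:
        return 0.0
    return max(0.0, (fair_prob - price) / (1.0 - price))

def position_size(bankroll, fair_prob, price,
                  kelly_mult=0.25, drawdown_scaler=1.0, max_frac=0.05):
    """Fractional Kelly (quarter Kelly here) scaled down during losing
    streaks via drawdown_scaler in (0, 1], with a hard per-trade cap."""
    f = kelly_fraction(fair_prob, price) * kelly_mult * drawdown_scaler
    return bankroll * min(f, max_frac)
```

The hard cap is the piece people skip: even a correct model occasionally produces a huge f*, and the cap is what survives the bad data point.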

6. Order Placement

Use the py-clob-client library. It handles the EIP-712 signing, the HMAC authentication, the proxy wallet integration. Do not roll your own signing unless you have a specific reason.

from py_clob_client.order_builder.constants import BUY
from py_clob_client.clob_types import OrderArgs, OrderType

order_args = OrderArgs(
    price=0.55,         # in dollars, not cents
    size=100,           # shares
    side=BUY,
    token_id="0x...",
)
signed_order = client.create_order(order_args)
resp = client.post_order(signed_order, OrderType.FOK)  # fill-or-kill

Order types matter. For a strategy chasing transient edges, OrderType.FOK (fill-or-kill) makes sense — either the full size fills now or the order cancels. For a strategy posting limit orders, OrderType.GTC (good-till-canceled) is right. Match the order type to the use case.

Handle partial fills explicitly. A 100-share order that fills 47 shares is a real situation; your position tracker needs to know about both the 47 filled and the 53 canceled.
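Folding a fill report into position state can be as simple as the following (the dict shape and field names are illustrative, not the py-clob-client response format):

```python
def apply_fill(position: dict, order_size: int, filled_size: int,
               price_c: int) -> dict:
    """Fold a (possibly partial) fill into a position dict.
    A FOK order either fills fully or not at all, but GTC orders can
    return filled < requested; track the canceled remainder explicitly."""
    position = dict(position)  # copy: keeps the update easy to test
    old_size = position.get("size", 0)
    old_cost = position.get("cost_c", 0)
    position["size"] = old_size + filled_size
    position["cost_c"] = old_cost + filled_size * price_c
    position["unfilled"] = order_size - filled_size  # canceled remainder
    if position["size"]:
        position["avg_price_c"] = position["cost_c"] / position["size"]
    return position
```

The `avg_price_c` recompute matters when a position is built across several partial fills at different prices.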

7. Position Tracking

Your bot needs to know what it owns, at what entry price, and whether each position is still open. The naive approach (in-memory dict) breaks on restart. The right approach is an append-only log on disk.

Our bots write to trades.jsonl — one JSON record per trade event with fields like sport, entry_price_c, size, pnl_c, won, resolved, ts. The trade logger (trade_logger.py) is the source of truth. On bot restart, the bot reads the recent log to reconstruct open positions before resuming the main loop.

This matters more than people think. Without persistent position state: - Restarts wipe knowledge of open positions, so the bot cannot manage them. - The bot might re-buy the same position because it does not know it already owns it. - You cannot reconcile against actual on-chain holdings at startup.

Build the persistence layer second (after market discovery). Building it after the bot is "working in memory" means rewriting everything else around it.
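The append-and-reconstruct pattern is a few lines of stdlib Python. The record fields follow the text (entry_price_c, size, resolved, ...); the `market` key used to group records is a hypothetical addition for this sketch:

```python
import json
from pathlib import Path

def log_trade(path: str, record: dict) -> None:
    """Append one trade event as a JSON line (append-only source of truth)."""
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

def open_positions(path: str) -> dict:
    """Rebuild open positions on restart: replay the log, keep the latest
    record per market, and treat anything not yet resolved as open."""
    p = Path(path)
    if not p.exists():
        return {}
    latest = {}
    for line in p.read_text().splitlines():
        rec = json.loads(line)
        latest[rec["market"]] = rec
    return {m: r for m, r in latest.items() if not r.get("resolved")}
```

On startup, reconcile the output of `open_positions` against actual on-chain holdings before the trading loop resumes.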

8. Settlement & Redemption

When a market resolves, you need to:

  1. Detect that resolution happened (poll market metadata, or watch settlement events).
  2. Update your position records (resolved=True, won=True/False).
  3. Redeem the winning shares to receive USDC.

The redemption step is the gotcha. For standard markets it is a straightforward CTF.redeemPositions(conditionId, amounts) call. For neg-risk markets (multi-outcome events like championship YES/NO per team), you use the NegRiskAdapter.redeemPositions on the adapter contract at 0xd91E80cF2E7be2e162c6513ceD06f1dD0dA35296 — and the token IDs you pass are the wrapped neg-risk token IDs, not the CLOB token IDs. Get this wrong and the call reverts silently. We covered this in our Polymarket API documentation guide.

If you use a proxy wallet (recommended for bots), redemption goes through the Safe's execTransaction with EIP-712 signing rather than a direct EOA call. Plan for this in your wallet integration code from day one — bolting Safe support on later is significantly more painful than building it in from the start.

Cross-Cutting: Logging and Persistence

The two things that turn a script into a system:

Logging. Every signal, every order, every fill, every reject. We log with structured fields (sport, market, edge, price, decision, result) so we can query the log to answer "why did the bot do X yesterday?" Plain print() statements are not enough; use the logging module with file rotation and at least info-level by default.
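A minimal version with the stdlib logging module — structured fields ride along via `extra`, with file rotation so the log does not grow unbounded (the format string and field names are illustrative):

```python
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger("bot")
logger.setLevel(logging.INFO)
handler = RotatingFileHandler("bot.log", maxBytes=10_000_000, backupCount=5)
# The key=value suffix makes the log grep/awk-queryable after the fact.
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s %(message)s "
    "sport=%(sport)s market=%(market)s edge=%(edge)s decision=%(decision)s"
))
logger.addHandler(handler)

# Every decision event supplies its structured fields via `extra`:
logger.info("signal", extra={
    "sport": "nba", "market": "nba-lal-bos", "edge": 7, "decision": "buy",
})
```

One caveat of this approach: every log call through this handler must supply all the `extra` keys the formatter references, or formatting raises. A JSON-lines log formatter avoids that rigidity if you need it.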

Persistence. All state that matters across restarts goes to disk: trade log, position state, calibration buffer, cooldowns, circuit breaker state. We use JSONL files for append-only state and JSON files for replaceable state. SQLite is fine if you outgrow flat files; many production bots run for years on JSONL with periodic compaction.

A Sensible Build Order

Do not build all eight components in parallel. The order that worked for us:

  1. Market discovery (component 1) — confirm you can list markets and resolve tokens.
  2. Persistence (cross-cutting) — set up trades.jsonl and the trade logger before any trades happen.
  3. Price feed (component 2) — get real-time prices flowing into local state.
  4. A trivial fair value model (component 3) — even just "predict 50% for everything" so you can develop the rest of the loop. Replace with real model later.
  5. Signal generation (component 4) — wire fair value into edge calculation.
  6. Order placement (component 6) — submit one tiny test order, confirm fill.
  7. Position tracking (component 7) — load and reconcile positions on restart.
  8. Settlement & redemption (component 8) — collect winnings on the first resolved trade.
  9. Risk & sizing (component 5) — start with flat sizing, upgrade to Kelly once you have data on realized edge.

Each step compiles and runs end-to-end before you move on. You will have a working (terrible) bot after step 6. A working (mediocre) bot after step 8. A real bot once you replace the trivial fair value model with something calibrated.

Common Mistakes That Sink Bots

In rough order of how often we see them:

  1. Training the fair-value model on market prices. Bot learns to agree with the market. Has no edge. Loses fees while feeling smart.
  2. No max-edge cap. Bot trades on stale model outputs that produce huge spurious edges. Loses on every "amazing opportunity" because the opportunity wasn't real.
  3. Ignoring slippage in the backtest. Backtest shows 8c edge per trade; live bot loses money because real fills cost 3-5c that the backtest didn't model. Build realistic slippage into both backtest and live cost accounting.
  4. No drawdown protection. Bot blows up during the inevitable losing streak because Kelly sizing assumed you would never lose 8 in a row.
  5. Skipping settlement handling. Bot accumulates winning positions but never redeems them. Winnings sit as tokens instead of USDC; bot cannot reinvest.

Production Considerations

Once your bot works, the remaining questions are about deployment: where it runs, how you monitor it and get alerted when it breaks, and how you keep keys and funds secure. Treat these as part of the system, not an afterthought.

Bottom Line

A Polymarket bot is not hard to build. It is hard to build well, because "well" means correctly handling restart, settlement, partial fills, slippage, drawdowns, stale signals, and the dozen other edge cases that do not show up in tutorials.

The 80/20 advice: spend most of your engineering time on persistence, observability, and the fair value model. The other components are largely solved problems with well-known patterns. The fair value model is where edge actually lives, and the operational layers (persistence, logging, restart safety) are what keep that edge alive across the inevitable production failures.

If you want to see how we sequenced this in practice, our Polymarket trading bots P&L breakdown walks through actual results across the bots described above.

Related deeper reads:

  - Polymarket API Python Tutorial — the hands-on API code.
  - Polymarket API Documentation Guide — the docs orientation and operational gotchas.
  - How Prediction Market Bots Work — the conceptual one-pager.
  - Backtesting Polymarket Strategies — testing the bot before deploying.
  - Kelly Criterion for Prediction Markets — sizing math.
  - Calibration Beats Accuracy — why fair-value model calibration matters.

Want to build this yourself?

The ZenHodl course teaches you to build a complete prediction market bot in 6 notebooks.
