How to Build a Lightweight Odds Scanner: Finding Value Bets with a Sports Odds API


A practical, step-by-step build you can ship in a weekend—no heavy ML required.

TL;DR (what you’ll ship)

  • A tiny script that:

    1. fetches lines across multiple books,

    2. converts odds → implied probabilities,

    3. removes the vig (overround),

    4. builds a consensus “fair” price, and

    5. flags bets where an available price beats the fair price by a threshold (e.g., 1.5%+ EV).

  • Optional: alerts to Slack/email and a CSV log for post-mortems and CLV tracking.

Why this works (and what “value” actually means)

Winning long-term isn’t about “hot picks.” It’s about price. If you can routinely buy +110 when the fair price is +102, you’re capturing positive expected value (EV) even if the single outcome loses today. This scanner’s job is to separate signal (mispricing) from noise (random line drift)—and do it in a way that’s simple, auditable, and fast.

The data you need (and nothing you don’t)

  • Markets: Moneyline (2-way) and 3-way (soccer match result) are the cleanest for a first build. You can add spreads/totals later.

  • Fields: event id + start time, league, teams/sides, market type, odds per book, and a last-updated timestamp.

  • Cadence: pre-game updates every 15–60s are fine. Faster isn’t always better—respect rate limits and cache sensibly.

Source: For real-time line movement and multi-book coverage, pull prices via a Sports Odds API and convert to no-vig probabilities before screening for value.

Keep provider naming consistent across your stack (league slugs, team names, market keys). Normalization saves hours of debugging.

Odds → implied probabilities

You’ll meet three formats. Normalize to decimal internally, then compute implied probabilities.

American → decimal

  • If A ≥ 100: dec = 1 + (A / 100)

  • If A ≤ -100: dec = 1 + (100 / |A|)

Fractional → decimal
dec = (numerator / denominator) + 1

Decimal → implied prob (booked, includes vig)
p_book = 1 / dec
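The three conversions above fit in a few helper functions. This is a minimal sketch; the function names are ours, not from any particular library:

```python
def american_to_decimal(a: int) -> float:
    """Convert American odds to decimal odds."""
    if a >= 100:
        return 1 + a / 100
    if a <= -100:
        return 1 + 100 / abs(a)
    raise ValueError(f"invalid American odds: {a}")

def fractional_to_decimal(num: int, den: int) -> float:
    """Convert fractional odds (e.g. 5/2) to decimal odds."""
    return num / den + 1

def implied_prob(dec: float) -> float:
    """Booked implied probability (still includes the vig)."""
    return 1 / dec
```

For example, +110 becomes 2.10 decimal, and -110 becomes roughly 1.91.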

Removing the vig (no-vig probabilities)

Books bake in margin so that Σ p_book > 1. For scanning, you need fair (no-vig) probabilities.

Two-way market (A vs B)

  1. Compute pA_book = 1/decA, pB_book = 1/decB.

  2. Overround M = pA_book + pB_book.

  3. No-vig:

    • pA_fair = pA_book / M

    • pB_fair = pB_book / M

Three-way market (home/draw/away)

  1. Compute booked probabilities for each outcome.

  2. M = pH + pD + pA

  3. p*_fair = p*_book / M for each outcome.

This proportional normalization is simple and widely used for scanning. More advanced models (e.g., Shin) can come later.
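The same proportional normalization handles both 2-way and 3-way markets, since it just divides each booked probability by the overround. A minimal sketch:

```python
def no_vig(decimals):
    """Proportionally remove the overround from a full set of
    mutually exclusive outcome prices (2-way or 3-way)."""
    booked = [1 / d for d in decimals]   # with-vig probabilities
    m = sum(booked)                      # overround; > 1 when the book has margin
    return [p / m for p in booked]
```

Feeding it both sides of a standard -110/-110 market (1.91/1.91 decimal) returns 0.50/0.50, as expected.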

Building a consensus “fair” price

You have two solid options:

  1. Market consensus:

    • For the same event/market/outcome, take multiple books’ no-vig probabilities and average (mean or a robust trimmed mean).

    • Convert consensus probability back to a “fair” decimal price: dec_fair = 1 / p_fair.

    • Compare best available price vs. dec_fair.

  2. Your baseline model:

    • If you have a simple rating model (Elo/Poisson for soccer, etc.), treat the model probability as the fair probability.

    • Compare book prices to your model’s fair price.

For a first version, market consensus is easier and surprisingly strong.
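Option 1 can be sketched in a few lines; the trimming fraction here is an arbitrary choice, and with fewer than five books the trimmed mean simply falls back to a plain mean:

```python
from statistics import mean

def trimmed_mean(values, trim=0.2):
    """Robust average: drop the top/bottom `trim` fraction of samples
    when there are enough of them, otherwise use a plain mean."""
    vs = sorted(values)
    k = int(len(vs) * trim)
    core = vs[k:len(vs) - k] if len(vs) > 2 * k else vs
    return mean(core)

def consensus_fair_price(no_vig_probs):
    """Average several books' no-vig probabilities for one outcome
    and convert back to a fair decimal price."""
    p_fair = trimmed_mean(no_vig_probs)
    return p_fair, 1 / p_fair
```

For instance, no-vig probabilities of 0.50, 0.52, and 0.48 from three books yield a consensus of 0.50 and a fair price of 2.00.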

Expected value (EV) math

Given a fair probability p_fair and a book’s decimal price d:

  • Edge (EV%) when betting 1 unit:
    EV = (p_fair * (d – 1)) – ((1 – p_fair) * 1)
    Express as percent of stake: EV% = EV * 100

A quick heuristic: every 1% of price edge you can capture consistently is valuable over thousands of bets.
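The EV formula translates directly:

```python
def ev_percent(p_fair: float, dec: float) -> float:
    """Expected value of a 1-unit stake, as a percent of stake:
    win (dec - 1) with probability p_fair, lose 1 otherwise."""
    ev = p_fair * (dec - 1) - (1 - p_fair)
    return ev * 100
```

A fair coin flip offered at 2.10 decimal, for example, carries +5% EV, while the same flip at 2.00 is exactly break-even.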

Minimal Python reference (drop-in ready)

The snippet below is intentionally compact: it runs as-is on sample data, but expect to swap in your own feed, thresholds, and alerting before it's genuinely useful.
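A minimal consensus-first demo sketch. The event id, book names, and prices are made-up sample data standing in for an API feed; in production you would replace LINES with live pulls:

```python
from statistics import mean

# Sample multi-book prices for one 2-way market (decimal odds).
# In production these rows come from your odds API feed.
LINES = {
    ("NFL-2024-09-08-SF-NYJ", "MONEYLINE_2WAY"): {
        "HOME": {"book_a": 1.87, "book_b": 1.91, "book_c": 1.95},
        "AWAY": {"book_a": 2.02, "book_b": 1.98, "book_c": 2.15},
    }
}

EV_THRESHOLD = 1.5  # percent of stake

def scan(lines, threshold=EV_THRESHOLD):
    """Consensus-first demo: average booked probabilities per outcome
    across books, normalize across outcomes, then flag any available
    price that beats the fair price by `threshold` EV%."""
    alerts = []
    for (event, market), outcomes in lines.items():
        # 1. average booked (with-vig) probability per outcome
        avg_booked = {o: mean(1 / d for d in books.values())
                      for o, books in outcomes.items()}
        # 2. normalize across outcomes to strip the consensus overround
        m = sum(avg_booked.values())
        fair = {o: p / m for o, p in avg_booked.items()}
        # 3. compare each outcome's best available price to fair
        for o, books in outcomes.items():
            best_book, best_dec = max(books.items(), key=lambda kv: kv[1])
            ev = (fair[o] * (best_dec - 1) - (1 - fair[o])) * 100
            if ev >= threshold:
                alerts.append((event, market, o, best_book, best_dec,
                               round(ev, 2)))
    return alerts

if __name__ == "__main__":
    for alert in scan(LINES):
        print(alert)
```

With this sample data, book_c's 2.15 on AWAY sits well above the consensus and is flagged at roughly +3.75% EV, while every other price screens out.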

Notes on the demo:

  • In production, build per-book no-vig first (normalize within each book’s own outcomes) and only then compute a consensus average. The demo shows a simpler consensus-first approach to keep the example compact.

  • For spreads/totals, convert the pair of prices around a given line (e.g., -3.5) to two-way probabilities and normalize the same way.

Event and team normalization (save yourself pain later)

  • Event IDs: Different books label games differently. Map to a canonical event id (league + date + normalized teams).

  • Team names: “Niners” vs. “49ers,” “Man City” vs. “Manchester City.” Maintain a team alias dictionary.

  • Market keys: Lock to a small vocabulary: MONEYLINE_2WAY, MONEYLINE_3WAY, SPREAD, TOTAL.

  • Clock drift: Store both provider timestamps and your ingest time; use provider lastUpdated for freshness checks.
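The alias dictionary and canonical event id can be as simple as this sketch; the alias entries and slug format here are illustrative, not a standard:

```python
TEAM_ALIASES = {
    "niners": "san-francisco-49ers",
    "49ers": "san-francisco-49ers",
    "man city": "manchester-city",
    "manchester city": "manchester-city",
}

def canonical_team(name: str) -> str:
    """Map a book's team label to one canonical slug."""
    key = name.strip().lower()
    return TEAM_ALIASES.get(key, key.replace(" ", "-"))

def canonical_event_id(league: str, date: str, home: str, away: str) -> str:
    """League + date + normalized teams makes a stable cross-book key."""
    return f"{league.lower()}:{date}:{canonical_team(home)}@{canonical_team(away)}"
```

Two books listing "Man City" and "Manchester City" now collapse to the same event key, which is exactly what the consensus step needs.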

What’s a “good” EV threshold?

  • 1–2% EV for major markets with decent liquidity is a realistic starting band.

  • Raise thresholds for low-liquidity leagues or when latency is high.

  • Add a min sample filter: only consider consensus built from ≥2–3 independent books.

Alerting and logging

Alerts: Send only when:

  • EV ≥ threshold,

  • market freshness < 2 minutes,

  • kickoff > 10 minutes away,

  • you haven’t alerted this pick in the last X minutes (dedupe key = event + market + side + book + price).
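The four gates above compose into one predicate. This is a sketch with in-memory dedupe state; the cutoffs mirror the checklist, and the cooldown default is an assumption you should tune:

```python
import time

_last_alert = {}  # dedupe key -> timestamp of last alert

def should_alert(ev_pct, freshness_s, secs_to_kickoff, dedupe_key,
                 threshold=1.5, cooldown_s=900, now=None):
    """Return True only when all four alert gates pass."""
    now = time.time() if now is None else now
    if ev_pct < threshold:
        return False                   # edge too small
    if freshness_s > 120:
        return False                   # market data older than 2 minutes
    if secs_to_kickoff < 600:
        return False                   # kickoff within 10 minutes
    last = _last_alert.get(dedupe_key)
    if last is not None and now - last < cooldown_s:
        return False                   # already alerted this pick recently
    _last_alert[dedupe_key] = now
    return True
```

A production version would keep the dedupe map in Redis or a database so restarts don't re-fire every alert.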

Log: Append every candidate (alerted or not) to a CSV/DB with:

  • event_id, league, start_ts

  • market, side, best_book, best_price

  • consensus_fair_prob, EV%

  • seen_at and source update timestamps

This makes CLV analysis trivial later (compare your bet’s price at time of placement to the closing consensus price).
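Appending candidates to a CSV is a one-function job. The column names below follow the field list above but are our own naming choices (e.g. ev_pct, source_updated_at):

```python
import csv
from pathlib import Path

FIELDS = ["event_id", "league", "start_ts", "market", "side",
          "best_book", "best_price", "consensus_fair_prob",
          "ev_pct", "seen_at", "source_updated_at"]

def log_candidate(row: dict, path="candidates.csv"):
    """Append one candidate row, writing the header on first use."""
    p = Path(path)
    is_new = not p.exists()
    with p.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Once every candidate lands in this file, computing CLV is a join of your logged prices against the closing consensus per event.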

Rate limits, caching, and etiquette

  • Pull less, think more: If the API supports If-Modified-Since / ETags, use them.

  • Backoff: On 429s, exponential backoff + jitter.

  • Staggered polling: Alternate leagues or market types per tick to avoid bursts.

  • De-dup: Only process events that changed since last poll (hash event payloads by market).
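Exponential backoff with full jitter is a few lines; wrap your HTTP call with it and sleep this long after each 429:

```python
import random

def backoff_delay(attempt: int, base=1.0, cap=60.0) -> float:
    """Exponential backoff with full jitter: a random delay in
    [0, min(cap, base * 2**attempt)] seconds for the given retry."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

The jitter matters: without it, every instance of your scanner retries on the same schedule and hammers the API in synchronized bursts.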

Sanity checks (catch bad edges)

  • Symmetry: In 2-way markets, p_home_fair + p_away_fair ≈ 1. If not, drop the event or re-normalize.

  • Wild prices: If a price differs from consensus by >10–12% in probability terms, treat it as a stale/mistaken quote unless confirmed by multiple books.

  • Start times: Ignore events that start too soon; execution + limits matter more than theoretical EV.

  • Min books: Require at least 2–3 reputable books for consensus.

Extending to spreads and totals (quick recipe)

  • For a spread line (e.g., HOME -3.5), you’ll typically see two prices: HOME -3.5 @ -105, AWAY +3.5 @ -115.

  • Convert both to decimal, to booked probabilities, normalize to no-vig, and treat HOME-cover vs AWAY-cover like a 2-way market.

  • Consensus fair probabilities are built exactly as above.
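The recipe above, sketched end to end for one spread line quoted in American odds:

```python
def spread_no_vig(home_american: int, away_american: int):
    """Treat the two sides of one spread line (e.g. HOME -3.5 / AWAY +3.5)
    as a 2-way market: American -> decimal -> booked prob -> no-vig."""
    def dec(a):
        return 1 + a / 100 if a >= 100 else 1 + 100 / abs(a)
    ph, pa = 1 / dec(home_american), 1 / dec(away_american)
    m = ph + pa                      # overround across the pair
    return ph / m, pa / m
```

For the -105/-115 pair from the example, the no-vig cover probabilities come out near 0.489/0.511 and sum to exactly 1.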

Responsible betting & compliance

  • Add a prominent responsible betting disclaimer and local resources.

  • Respect data usage terms; do not republish raw odds at scale where ToS forbids it.

  • If operating in regions with marketing rules, include required disclaimers/age gates.

Roadmap (after v1 ships)

  • Per-book no-vig first, then consensus: Improves fairness and reduces book-mix bias.

  • Prop markets: Start with player points (NBA) or shots on target (soccer).

  • Auto-sizing stakes: Kelly fraction based on EV and variance.

  • CLV dashboard: Show distribution of (your price – closing fair price).

  • Live (in-play): Only after you’ve nailed latency, dedupe, and alerts.

Closing thought

A lightweight scanner won’t win every bet—it will win on price. By consistently buying +EV edges, logging everything, and iterating with data, you’ll build a durable advantage that scales from a script to a full analytics pipeline.

