Series Focus archive · 2026-05-15

The Crosstown Classic, From the South Side

The White Sox sit at 22-21, above .500 this deep into May for the first time in three seasons. The Cubs roll in at 29-16 with one of the five strongest team-strength ratings in the model. Three games at Rate Field this weekend. The model says CHC, the form table says CWS, and the pitching matchups split the difference. Here is what the data says is actually going on.

42 per cent.

That is the model's pre-series probability that the White Sox win this weekend — taking two of three or sweeping outright. It is higher than the season-long talent gap between these two teams would suggest, and lower than the form gap alone would imply, and the reason it sits in the middle is a story about three different signals pulling in three different directions.

Is the .500 real?

Before anything else: the question every White Sox fan has earned the right to ask. Is this team genuinely playing .500 baseball, or are they one cold week away from re-baselining to where the offseason projections had them?

The cleanest test is Pythagorean — comparing actual wins to what the team's run differential implies they "should" have won. The numbers across both teams:

Chicago White Sox · 22-21 · run diff -8 · Actual 0.529 · Pythag 0.529 · gap +0.1pp
Chicago Cubs · 29-16 · run diff +48 · Actual 0.563 · Pythag 0.569 · gap -0.6pp

Both teams sit within one percentage point of what their run differential implies: the White Sox a hair above, the Cubs a hair below. That's well inside normal variance. The usual rule of thumb is that anything inside ±5pp is essentially noise on a 40-game sample, and both gaps live comfortably in that zone. The White Sox's 0.529 record is not a luck story. It is a team that has scored 189 and allowed 197 runs and won about as many games as that run flow says it earned.

That said: the run differential itself (-8 through 85 games) is the picture of a team playing pure coin-flip baseball. The model's overall team-strength rating puts the Sox at -0.05 log-runs relative to league average, rank 25 of 30. That is the gap that does not get closed by being +6 in run differential over the last ten. The L10 number is real; the underlying ability is also real; they are not the same number.

The Cubs, meanwhile, are rank 5 overall at +0.10. The model thinks they are a 9-or-10-win-stronger team across a full season. That is the framing this series is happening inside.

Three games, three pitching stories

Every game in the series is at Rate Field. Park effects are a wash because the park is the same all three nights. Home-field advantage is the same six-percentage-point bump for the Sox in all three. What changes from night to night is the starter on the mound, and in this series the starter posterior tells a much cleaner story than the season ERA does. From the model's pitcher_strength table, last fit overnight:

Friday, 6:40 PM CT · Sean Burke (R) -0.046 vs Edward Cabrera (R) -0.094 · CWS 42% · proj 4.0–4.9
Saturday, 6:10 PM CT · Davis Martin (R) +0.109 vs Jameson Taillon (R) +0.097 · CWS 48% · proj 3.9–4.2
Sunday, 1:10 PM CT · Erick Fedde (R) +0.012 vs Colin Rea (R) -0.091 · CWS 43% · proj 4.2–4.7

Positive pitcher_quality suppresses opponent runs. Higher = better starter. The three games stack from worst-CWS-edge to best:

Starter posterior quality · 80% credible interval, CWS green vs CHC orange

Both rotations are all right-handed. That makes the platoon picture identical across all three games. By handedness: White Sox lineup is 4L / 3R / 2S; Cubs lineup is 3L / 5R / 1S.

The slash lines vs RHP this season are what matters — the lineup blocks below show each batter's actual season-to-date OPS against right-handers (highlighted) alongside their vs-LHP line for context. Pulled from predictions.feat_batter_splits, which the warehouse populates nightly from the MLB Stats API statSplits endpoint:

CWS lineup OPS vs RHP

Green bars are above league-average OPS (~.720), orange below. Dashed line marks the league benchmark.

CHC lineup OPS vs RHP

Green bars are above league-average OPS (~.720), orange below. Dashed line marks the league benchmark.

Friday — Sean Burke vs Edward Cabrera. Both starters' posteriors sit close to league average — Cabrera with a thin positive lean, Burke with a thin negative one. The 80% credible intervals overlap heavily, so the model is not confident that either side has a real starter edge. The CWS WP of 42% is mostly the team-strength gap working as expected, partially offset by home field. A toss-up game with the wrong team favoured.

Saturday — Davis Martin vs Jameson Taillon. This is the game. Davis Martin is the model's rank 2 starter in baseball by posterior quality_mean — +0.109 with an 80% interval of [+0.012, +0.209]. Taillon is solid but not in that tier. The starter delta of +0.012 is enough to drag the model's CWS WP up to 48% — almost a coin flip despite the -0.15 log-run team-strength gap. If the Sox steal the series, it runs through this start.

Sunday — Erick Fedde vs Colin Rea. Both starters sit a tick below league average, with Erick Fedde's posterior fractionally better than Colin Rea's. That is enough to nose the matchup edge in CWS's direction at the starter level, but not enough to overcome the underlying team gap. CWS WP 43%. The kind of game where one bullpen mistake decides the afternoon — Sunday games at Rate are also typically the offensive ones at this time of year, given afternoon wind-out conditions off the lake.

How each starter pitches

Six starters, six different shapes. Each heatmap below is built from every pitch each starter has thrown in 2026 — pulled from the new mlb_bronze.pitches ingestion and aggregated by zone in mlb_gold.feat_pitcher_zone. The 3x3 centre is the strike zone (catcher's perspective); the four outer cells are the corner zones outside the strike zone. Colour scales with pitch frequency — darker cells are where the pitcher most often locates.

Sean Burke RHP CWS · Friday 1,515 pitches '26

fastball · 58% · whiff/swing 14%
breaking · 39% · whiff/swing 14%
offspeed · 3% · whiff/swing 10%

Edward Cabrera RHP CHC · Friday 1,202 pitches '26

fastball · 39% · whiff/swing 12%
breaking · 31% · whiff/swing 19%
offspeed · 30% · whiff/swing 19%

Davis Martin RHP CWS · Saturday 1,419 pitches '26

fastball · 58% · whiff/swing 18%
breaking · 27% · whiff/swing 12%
offspeed · 15% · whiff/swing 8%
other · 0% · whiff/swing n/a

Jameson Taillon RHP CHC · Saturday 1,129 pitches '26

fastball · 57% · whiff/swing 17%
other · 17% · whiff/swing 13%
offspeed · 15% · whiff/swing 21%
breaking · 11% · whiff/swing 10%

Erick Fedde RHP CWS · Sunday 1,386 pitches '26

fastball · 48% · whiff/swing 10%
other · 38% · whiff/swing 12%
offspeed · 14% · whiff/swing 17%

Colin Rea RHP CHC · Sunday 1,490 pitches '26

fastball · 57% · whiff/swing 11%
offspeed · 18% · whiff/swing 18%
breaking · 16% · whiff/swing 9%
other · 9% · whiff/swing 18%

The Saturday pair is the cleanest tell. Davis Martin works the glove-side edge of the zone — the right column of the heatmap — heavier than the average right-hander, which is consistent with his model-rated quality_mean sitting top-five in MLB. His location density doesn't drift to the heart of the plate the way a more hittable starter's would. Taillon, by contrast, lives middle. That's a real reason Saturday's coin-flip WP is not actually a coin flip on the underlying mechanics.

Where the leverage lives

For each game in the series, we simulate 2,000 trajectories from the pre-game state (Poisson run-scoring per half-inning at each team's expected rate) and record the CWS win probability at the end of each half-inning. The dark line is the median; the shaded band is the 25th-to-75th percentile across simulations. Where the band fans out is where the game tends to actually be decided.

Friday · Sean Burke vs Edward Cabrera · pre-game CWS WP 42%

Saturday · Davis Martin vs Jameson Taillon · pre-game CWS WP 48%

Sunday · Erick Fedde vs Colin Rea · pre-game CWS WP 43%

The three trajectories share a shape: the median sits within five or six percentage points of 50% through five innings before starting to fan out in the 6th–8th, then collapse hard in the 9th. That late-game collapse is the entire reason the bullpen-leverage section below matters: the 25/75 band at end of T7 is already wider than the entire pre-game uncertainty, and at end of B8 the worst-case and best-case CWS outcomes are 60+ percentage points apart. Manager decisions in those innings are doing more for the outcome than anything that happened before.

The Skellam normal approximation used to compute conditional WP at each checkpoint isn't quite tight with the full PyMC per-game prediction — pre-game median on the chart can sit a couple of percentage points off the model's reported number — but the shape and the leverage timing are robust to that approximation. The chart is a where, not a what.
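A minimal sketch of the kind of simulation described above, not the production code. It assumes the Friday projection "4.0–4.9" reads as CWS runs then CHC runs, spreads each team's projected runs evenly over nine half-innings, plays exactly nine full innings (no walk-offs, no extras), and scores a tie after nine as 0.5. The function names are hypothetical, and the conditional WP uses a normal approximation to the remaining run margin.

```python
import math
import random
from statistics import quantiles


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def home_win_prob(lead, lam_home, lam_away, home_left, away_left):
    """Normal approximation to the Skellam-distributed remaining run margin."""
    mean = lam_home * home_left - lam_away * away_left
    var = lam_home * home_left + lam_away * away_left
    if var == 0:  # game over: a tie is scored as 0.5 in this sketch
        return 1.0 if lead > 0 else 0.0 if lead < 0 else 0.5
    z = (lead + mean) / math.sqrt(var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_bands(home_runs, away_runs, n_sims=2000, seed=0):
    """Return (q25, median, q75) of home WP after each of 18 half-innings."""
    rng = random.Random(seed)
    lam_home, lam_away = home_runs / 9.0, away_runs / 9.0
    checkpoints = [[] for _ in range(18)]
    for _ in range(n_sims):
        lead = 0  # home runs minus away runs
        for half in range(18):
            if half % 2 == 0:  # top half: away team bats
                lead -= poisson(rng, lam_away)
            else:  # bottom half: home team bats
                lead += poisson(rng, lam_home)
            away_left = 9 - (half // 2 + 1)
            home_left = 9 - (half + 1) // 2
            wp = home_win_prob(lead, lam_home, lam_away, home_left, away_left)
            checkpoints[half].append(wp)
    return [tuple(quantiles(wps, n=4)) for wps in checkpoints]


# Assumed reading of Friday's projection: CWS (home) 4.0, CHC (away) 4.9.
pregame = home_win_prob(0, 4.0 / 9, 4.9 / 9, 9, 9)
bands = simulate_bands(home_runs=4.0, away_runs=4.9)
```

Each entry of `bands` is one checkpoint, so plotting the middle value against the outer two gives a median line and a 25/75 band of the same general kind as the charts. The approximate `pregame` value will not match the PyMC number exactly, for the reason given in the paragraph above.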

The bullpen state going into Friday

Every Crosstown game ends up in the bullpens. Three nights in a row at Rate Field, neither starter likely to reach the 7th, and whichever club has more high-leverage arms left for the late innings of any given game holds an edge the per-game model doesn't fully price in. The new predictions.pitcher_strength table gives us the model's posterior on each available reliever; the mlb_gold.feat_pitcher_workload table gives us who is fresh enough to use. Joined:

White Sox · top-stack quality +0.021 · 9 fresh / 10 available
Grant Taylor (R) · fresh · 3d rest · +0.024
Bryan Hudson (L) · fresh · 4d rest · +0.022
Brandon Eisert (L) · taxed · 1d rest · +0.017
Cubs · top-stack quality +0.011 · 12 fresh / 13 available
Javier Assad (R) · fresh · 2d rest · +0.025
Ryan Rolison (L) · fresh · 2d rest · +0.006
Bryse Wilson (R) · fresh · 4d rest · +0.000

The top-stack metric is the average posterior pitcher_quality across each team's three highest-rated fresh-or-taxed relievers — the model's "high-leverage core" — with positive values representing above-league-average run suppression. On those numbers the White Sox come into Friday with a +0.010 late-inning edge, which would normally be worth two to three percentage points of late-game win probability per game if both clubs use their best arms. Over three nights, that's real.

Bullpen-leverage caveat: warehouse ingestion currently only captures pitchers who have actually appeared in a 2026 game, which leaves most bullpen rosters two-to-four arms deep instead of the eight-arm reality. The numbers above are correct given the available inputs but understate stack depth on both sides until the upstream stg_pitcher_game_log ingestion is widened. Worth treating the comparison directionally, not absolutely, for now.
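The top-stack definition can be sketched in a few lines. This is an illustration rather than the warehouse query: the tuples are the rounded values displayed above, and the function name and data layout are hypothetical. Because the inputs are already rounded, the Cubs figure can differ in the last digit from the published +0.011.

```python
from statistics import mean


def top_stack(relievers, k=3):
    """Mean quality of the k highest-rated relievers who are fresh or taxed."""
    usable = [quality for _, status, quality in relievers
              if status in ("fresh", "taxed")]
    return mean(sorted(usable, reverse=True)[:k])


# (name, availability status, posterior pitcher_quality) as displayed above.
cws = [("Grant Taylor", "fresh", 0.024),
       ("Bryan Hudson", "fresh", 0.022),
       ("Brandon Eisert", "taxed", 0.017)]
chc = [("Javier Assad", "fresh", 0.025),
       ("Ryan Rolison", "fresh", 0.006),
       ("Bryse Wilson", "fresh", 0.000)]

cws_stack, chc_stack = top_stack(cws), top_stack(chc)
edge = cws_stack - chc_stack  # positive means a White Sox late-inning edge
```

An "unavailable" reliever is filtered out before the sort, which is why the metric is expected to move as arms get used over the weekend.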

How the series resolves

Taking the three per-game win probabilities and treating the games as independent (defensible at this scope; the next iteration adds bullpen-leverage correlation across games), the Monte Carlo distribution over outcomes is:

Expected White Sox wins: 1.33 of 3. The single likeliest outcome (41%) is the Cubs taking it 2-1. The second likeliest (33%) is the Sox doing exactly the same to them. Together those two paths cover more than seven in ten simulations; the two sweep scenarios share what is left. This is a series where one game decides three.
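Under the independence assumption, the outcome distribution can also be computed exactly rather than by Monte Carlo. The sketch below enumerates all eight win/loss sequences from the three per-game CWS win probabilities quoted above (42%, 48%, 43%); the function name is hypothetical.

```python
from itertools import product


def series_distribution(win_probs):
    """P(team wins exactly k games), k = 0..n, for independent games."""
    dist = [0.0] * (len(win_probs) + 1)
    for outcome in product((0, 1), repeat=len(win_probs)):
        prob = 1.0
        for won, wp in zip(outcome, win_probs):
            prob *= wp if won else 1.0 - wp
        dist[sum(outcome)] += prob
    return dist


cws_wp = [0.42, 0.48, 0.43]  # Friday, Saturday, Sunday
dist = series_distribution(cws_wp)
expected_wins = sum(k * p for k, p in enumerate(dist))
series_win = dist[2] + dist[3]  # two of three, or a sweep

for k, p in enumerate(dist):
    print(f"CWS wins {k}: {p:.1%}")
print(f"expected wins {expected_wins:.2f}, series win {series_win:.1%}")
```

With these inputs the exact figures round to the ones in the text: roughly 41% for a 1-2 weekend, 33% for 2-1, 1.33 expected wins, and a series-win probability of about 42%.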

What the rating posterior does about it

Every game is information. The current overall ratings come from a posterior fit nightly on the season-to-date; a three-game series moves both numbers by a small but measurable amount. Under a conjugate-normal update with a per-game log-rate observation variance of σ²_g = 0.025, here is where each team's overall rating sits today and where each outcome of this weekend would push it by Sunday night:

Chicago White Sox — current overall -0.050 (-0.291 to +0.266)

Light band = current 80% CI · dark line = current posterior mean · dots = scenario means (green = up, orange = down)

scenario detail (record · run diff · new pythag)

sweep 3-0 · record becomes 48-40 · run diff +31 · rating +0.0425 · new pythag 0.534
win 2-1 · record becomes 47-41 · run diff +27 · rating +0.0142 · new pythag 0.530
lose 1-2 · record becomes 46-42 · run diff +23 · rating -0.0142 · new pythag 0.526
swept 0-3 · record becomes 45-43 · run diff +19 · rating -0.0425 · new pythag 0.521

Chicago Cubs — current overall +0.100 (-0.170 to +0.362)

Light band = current 80% CI · dark line = current posterior mean · dots = scenario means (green = up, orange = down)

scenario detail (record · run diff · new pythag)

sweep 3-0 · record becomes 52-38 · run diff +69 · rating +0.0419 · new pythag 0.573
win 2-1 · record becomes 51-39 · run diff +65 · rating +0.0140 · new pythag 0.569
lose 1-2 · record becomes 50-40 · run diff +61 · rating -0.0140 · new pythag 0.565
swept 0-3 · record becomes 49-41 · run diff +57 · rating -0.0419 · new pythag 0.561

The implied rating shifts are real but modest — a sweep either way moves the overall by roughly a sixth of the current credible-interval half-width, a 2-1 outcome moves it by about a twentieth. None of those are enough to re-rank the team in any meaningful sense, but the direction of the movement does compound: a White Sox sweep ends the weekend with the model rating them genuinely closer to league average than to the bottom third, and the next series sits on that updated posterior. The Pythagorean column on the right is the run-flow translation of the same numbers — less abstract for the non-Bayesian reader, identical signal.
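For readers who want the mechanics, here is a generic sketch of a conjugate-normal update with known observation variance. It is not the model's code: the text does not say how the per-game log-rate observations are constructed or which prior variance the scenario table uses, so the observations below are hypothetical placeholders and the output is not meant to reproduce the scenario shifts. The prior sd is backed out of the published 80% interval by treating it as symmetric, which is itself an approximation.

```python
from statistics import NormalDist


def sd_from_interval(lo, hi, mass=0.80):
    """Normal sd implied by a central credible interval of the given mass."""
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    return (hi - lo) / (2.0 * z)


def normal_update(prior_mean, prior_sd, observations, obs_var=0.025):
    """Conjugate update of a normal mean with known observation variance."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(observations) / obs_var
    post_prec = prior_prec + data_prec
    post_mean = (prior_mean * prior_prec + sum(observations) / obs_var) / post_prec
    return post_mean, post_prec ** -0.5


# White Sox prior from the text: mean -0.050, 80% interval -0.291 to +0.266.
prior_sd = sd_from_interval(-0.291, 0.266)
# Hypothetical per-game log-rate observations for a three-game series.
post_mean, post_sd = normal_update(-0.050, prior_sd, [0.10, -0.05, 0.05])
```

The posterior mean is a precision-weighted average of the prior mean and the observations, so it always lands between them, and the posterior sd can only shrink as games are added.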

This is the slower feedback loop alluded to in the form section earlier — single games barely register against 43 prior data points, but a three-game stretch is enough to nudge the long-run estimate by a measurable amount. Every weekend of the season is a small Bayesian update; the Crosstown one is just the first that comes with a story.

What to watch, in order

1. Davis Martin's command on Saturday. The model has him as the second-best starter in baseball on posterior quality. That is a top-tier rating from a pitcher who didn't crack a top-30 list anywhere coming into the season. Posterior credible intervals are wide because the sample is still short: the same eight-start cohort we wrote about in the Misiorowski piece sits between Martin and the unknown true-talent below him. If he goes seven and the Sox win Saturday, the posterior tightens and he stays elite. If he gets knocked around early, the interval widens and the rating regresses fast. Saturday's outing is doing more for the season-long projection than for the weekend game.

2. The bullpen state on Sunday afternoon. Both teams will likely have used their best arms by Saturday night. Whoever has more high-leverage availability left for Sunday wins a one-run game. The top-stack quality numbers above will move materially between Friday and Sunday as relievers appear in the early games and shift from fresh → taxed → unavailable. Refresh the Today tab Sunday morning before deciding how much faith to put in the model's 43% Sox WP — the bullpen-edge strip on the matchup card is the cleanest read.

3. Murakami against right-handers. All three games feature RHP starters for the Cubs. The Sox's middle of the order (Murakami, Vargas, the rest of the left-side power) eats RHP at a rate well above their season line against lefties. The earlier piece on the Murakami split documented the underlying breaking-ball weakness; the platoon line is the opposite picture. If the Sox hit RHP this weekend the way they have hit RHP all year, the per-game WPs above are slightly conservative.

4. Whether the +6 form signal is real or just regression coming. The Sox are riding a +6 run differential over their last ten — meaningfully above the season pace. The model already incorporates this through the β_form coefficient on team-form, but the prior on that coefficient is intentionally tight (a 1σ bump is worth roughly two percentage points of WP, not twenty). If the Sox keep this form running, the team-strength posterior itself will start to move; that is the slower feedback loop where a hot stretch becomes a re-rated team. A weekend split would keep that machinery turning.

The bigger model question this series tests

Last night's recalibration of the model (tie redistribution, recency τ tightened to 0.5, the new pitcher_strength table that this whole page reads from) flipped CWS-related picks in particular: the Sox jumped from being systematically under-rated (because the old, longer-recency model was holding their 2024-25 records against them) to being correctly framed as a competent ~.500 outfit. The Saturday Martin pick is the cleanest example: before the recalibration, his posterior was middle-of-the-pack; after, it sits near the top of the league. The same Saturday game would have come out as a clear Cubs edge two days ago. It does not, now.

Whether the model is right to have moved that much, that fast, is what the weekend answers.

Methodology: per-game win probabilities are read from predictions.game_predictions, generated nightly by the PyMC fit described on the Methodology tab. Pitcher quality intervals come from predictions.pitcher_strength, a new table added on 2026-05-14 that persists the per-pitcher posterior mean and 80% credible interval from each training run. Pythagorean expected win percentage uses the Bill James 1.83 exponent (RS^1.83 / (RS^1.83 + RA^1.83)). The series outcome distribution treats the three games as independent draws on their per-game posterior WPs. The bullpen-leverage section reads from predictions.pitcher_strength joined to mlb_gold.feat_pitcher_workload and surfaces the same projection that drives the bullpen-edge strip on the Today tab. Game-to-game outcome correlation via shared bullpen state — i.e. a reliever used Friday is taxed for Saturday — is not yet reflected in the series MC; that is the next iteration.
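The Pythagorean formula in the methodology note is small enough to sketch directly. The function names are hypothetical and the inputs in the last two lines are illustrative round numbers, not values from the warehouse.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected win percentage: RS^e / (RS^e + RA^e)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def luck_gap_pp(wins, losses, runs_scored, runs_allowed):
    """Actual minus Pythagorean win percentage, in percentage points."""
    actual = wins / (wins + losses)
    return 100.0 * (actual - pythag_win_pct(runs_scored, runs_allowed))


# Illustrative inputs only: a team that scores exactly what it allows
# has an expected win percentage of .500, so a 5-5 record shows no gap.
even_team = pythag_win_pct(100, 100)
gap = luck_gap_pp(5, 5, 100, 100)
```

A positive gap means a team has won more than its run flow implies; a negative gap means it has won less.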