I ran 1 trillion Kentucky Derby simulations on a 1,000-vCPU cluster. Here’s what the model likes

Built a Kentucky Derby model on a 1,000-vCPU cloud cluster.

https://burla-cloud.github.io/examples/kentucky-derby-demo/

Pipeline: Dirichlet weight search across 16 historical Derbies (2010 to 2025) + sklearn ensemble for ML probs + 1,000,000,000,000 Monte Carlo race sims. 48.9 minutes wall time. Yes, one trillion sims. No, my electric bill did not enjoy this.
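For readers who want to see the shape of the first and last stages, here is a minimal sketch, not the actual pipeline. It assumes a weight vector is drawn from a Dirichlet, the weights are combined with per-horse features into a score, and each simulated race adds Gaussian noise to those scores. The horses, feature values, noise level, and function names are all made up for illustration, and the sklearn stage is left out.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one weight vector from Dirichlet(alphas) by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def composite_scores(features, weights):
    """Weighted sum of each horse's (pre-scaled) feature values."""
    return {horse: sum(w * x for w, x in zip(weights, values))
            for horse, values in features.items()}

def simulate_win_share(scores, n_sims, noise_sd, rng):
    """Toy race: performance = score + Gaussian noise; the best performance wins."""
    wins = dict.fromkeys(scores, 0)
    for _ in range(n_sims):
        winner = max(scores, key=lambda h: scores[h] + rng.gauss(0.0, noise_sd))
        wins[winner] += 1
    return {horse: count / n_sims for horse, count in wins.items()}

# Hypothetical three-horse field with made-up features scaled to 0..1.
features = {
    "Horse A": [0.9, 0.8, 0.9],
    "Horse B": [0.6, 0.5, 0.6],
    "Horse C": [0.3, 0.4, 0.2],
}

rng = random.Random(0)
weights = sample_dirichlet([1.0, 1.0, 1.0], rng)  # one candidate weight vector
scores = composite_scores(features, weights)
win_share = simulate_win_share(scores, n_sims=20_000, noise_sd=0.2, rng=rng)

for horse, share in sorted(win_share.items(), key=lambda kv: -kv[1]):
    print(f"{horse}: {share:.1%}")
```

A weight search in this style would draw many such vectors, score each one against the historical races, and keep the best. Scaling the simulation loop from 20,000 races to a trillion is the part that needs a cluster.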

Backtest scored 126 out of a possible 160 on a 10-5-2-1-0 ranking metric. A 2,000-permutation null test (the search re-run after scrambling winner labels) puts p < 1/2000. Real signal, not search noise.
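A rough sketch of how a score and null test like this can be computed, under two assumptions that are not confirmed by the post. First, "10-5-2-1-0" is read as points for where the actual winner sat in the model's ranking (1st = 10, 2nd = 5, 3rd = 2, 4th = 1, lower = 0), which gives 16 x 10 = 160 as the maximum. Second, the null test here only rescores scrambled winners and does not re-run the weight search. The rankings and winners are hypothetical.

```python
import random

POINTS = {1: 10, 2: 5, 3: 2, 4: 1}  # assumed reading of "10-5-2-1-0"

def backtest_score(rankings, winners):
    """Sum points over races, based on the model rank of each actual winner."""
    total = 0
    for ranked, winner in zip(rankings, winners):
        rank = ranked.index(winner) + 1
        total += POINTS.get(rank, 0)
    return total

def permutation_p_value(rankings, winners, n_perm, rng):
    """Share of scrambled-label runs that score at least as well as the real one."""
    observed = backtest_score(rankings, winners)
    hits = 0
    for _ in range(n_perm):
        # Null world: the winner of each race is a random runner.
        scrambled = [rng.choice(ranked) for ranked in rankings]
        if backtest_score(rankings, scrambled) >= observed:
            hits += 1
    # The +1 correction keeps the estimate from ever being exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical backtest: 16 races, 20 runners each, horses named by model rank.
rankings = [list(range(1, 21)) for _ in range(16)]
winners = [1] * 10 + [2] * 3 + [3, 4, 9]

observed, p_value = permutation_p_value(
    rankings, winners, n_perm=2000, rng=random.Random(0)
)
print(observed, p_value)
```

When no scrambled run matches the real score, the corrected estimate bottoms out at 1/2001, which is the same order as the p < 1/2000 quoted above.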

This is not financial advice. The model is a math toy, not a guarantee, and a trillion sims doesn't change the fact that a horse race is still a horse race.

Four scratches (Silent Tactic, Fulleffort, Right To Party, The Puma) cut the field to 19. All comparisons below are model win % vs morning-line implied %. Program posts (1, 2, 3, 4, 6, 7, 8, 10, 11, 12, 14, 15, 16, 17, 18, 19, 21, 22, 23) leave gaps where horses scratched and put the three also-eligibles (Great White, Ocelli, Robusta) on the deep outside.

Top win pick (BET)

Further Ado (post 18, 6-1). 27.9% vs 14.3% = 1.95x. Field-leading 106 Beyer. Cox / Velazquez. Drew the gate with the highest historical win rate in the 2010-2025 sample (Authentic won from post 18 in 2020). The model's top pick is also the value play.

Four longshots tagged BET (model at least 1.5x morning-line implied)

Litmus Test (post 4, 30-1). 6.12% vs 3.20% = 1.91x. Baffert / Garcia. Beyer 96.
Intrepido (post 3, 50-1). 3.75% vs 2.00% = 1.88x. Mullins / Berrios. Beyer 89, pace run style.
Robusta (post 23, 50-1). 3.73% vs 2.00% = 1.86x. O'Neill (2-for-Derby). Calumet homebred. Drew in from the AE list when Right To Party scratched.
Pavlovian (post 16, 30-1). 5.58% vs 3.20% = 1.74x. O'Neill again / Maldonado. Beyer 90 sits one above field median. Post 16 is where Sovereignty won in 2025.
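The BET tag boils down to one ratio, sketched below with the percentages quoted in this post. The usual odds conversion, 1 / (N + 1) for N-1 odds, reproduces the 14.3% and 20.0% figures. The longshot implied numbers quoted here (3.20%, 2.00%) differ slightly from what that formula returns (about 3.23% and 1.96%), so the sketch takes the quoted percentages as given rather than recomputing them. Variable and function names are illustrative.

```python
def implied_prob(odds_to_one):
    """Morning-line odds of N-1 imply a win probability of 1 / (N + 1)."""
    return 1.0 / (odds_to_one + 1.0)

BET_THRESHOLD = 1.5  # the post's cutoff: model at least 1.5x morning-line implied

# (model win %, morning-line implied %) as quoted in the post.
quoted = {
    "Further Ado": (27.9, 14.3),
    "Litmus Test": (6.12, 3.20),
    "Intrepido": (3.75, 2.00),
    "Robusta": (3.73, 2.00),
    "Pavlovian": (5.58, 3.20),
    "Renegade": (4.2, 20.0),  # the headline fade, for contrast
}

multipliers = {name: model / implied for name, (model, implied) in quoted.items()}
bets = [name for name, m in multipliers.items() if m >= BET_THRESHOLD]

for name, m in multipliers.items():
    tag = "BET" if name in bets else "pass"
    print(f"{name}: {m:.2f}x {tag}")
```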

Top 5 by model win %

Further Ado, 27.90%
Chief Wallabee, 6.75%
Litmus Test, 6.12%
So Happy, 5.73%
Pavlovian, 5.58%

Headline fade

Renegade (post 1, 4-1). 4.2% vs 20.0% = 4.7x market over model, the biggest gap on the board. Post 1 has not produced a Derby winner in our 2010-2025 sample (none since Ferdinand 1986). Toss off the top of every ticket.

Honest caveats

Morning line, not closing tote. Renegade likely tightens, longshots drift.
Churchill's takeout runs ~17-22%. The five BETs (multipliers 1.74x to 1.95x) clear takeout. Further Ado is the only one worth a full stake; the four longshots stay as small saver tickets.
Two of the top-five model weights (dosage, career win-rate) are placeholders for 2026 (same value for every horse). The 2026 ranking effectively leans on year-Beyer, stamina-test, post-position win-rate, trainer/jockey edges, and run style.
Model can't see Ragozin / Thoro-Graph / today's workouts / closing tote / weather. Or how good your bourbon is.

Tickets (light stakes, ~$30 total)

$10 win on Further Ado at 6-1 (full-stake)
$3 win each on Litmus Test, Pavlovian, Intrepido, Robusta ($12)
$1 exacta box: Further Ado / Chief Wallabee / Litmus Test ($6)
10-cent superfecta box: Further Ado / Litmus Test / Pavlovian / Robusta ($2.40)
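For anyone checking the ticket math: a box covers every ordering of the chosen horses across the finishing positions, so a 3-horse exacta box is 3 x 2 = 6 combinations and a 4-horse superfecta box is 4 x 3 x 2 x 1 = 24. A small sketch that adds up the four ticket lines as listed, working in cents to avoid float rounding (the helper name is made up):

```python
from math import perm

def box_cost_cents(n_horses, n_positions, base_cents):
    """Cost of boxing n_horses across n_positions at base_cents per combination."""
    return perm(n_horses, n_positions) * base_cents

tickets_cents = {
    "win: Further Ado": 1000,
    "win: four longshots at $3 each": 4 * 300,
    "exacta box (3 horses, $1)": box_cost_cents(3, 2, 100),
    "superfecta box (4 horses, 10 cents)": box_cost_cents(4, 4, 10),
}

total_cents = sum(tickets_cents.values())
for name, cents in tickets_cents.items():
    print(f"{name}: ${cents / 100:.2f}")
print(f"total: ${total_cents / 100:.2f}")
```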

Disclosure: I built the model and I work on Burla, the open-source Python library that ran the cluster.

Full pipeline, methodology audit, and all 19 horses ranked: burla-cloud.github.io/examples/kentucky-derby-demo/#rankings

GL today, may your closer hit the wire first.

submitted by /u/Ok_Post_149