In honor of today's Breeders' Cup races, here is my first attempt at creating a betting market ranking for thoroughbred horse racing.

My first real foray into sports analytics was a post to Brian Burke's Advanced NFL Stats Community page on how to derive an implied betting market ranking for the NFL from weekly point spreads. I have since refined that initial approach and extended it to additional sports: the NBA, Major League Baseball, College Football, College Basketball, and the WNBA.

The basic idea is to take the market odds and point spreads for each game and use them to reverse engineer an implied ranking. Horse racing uses a parimutuel system, which doesn't require bookies or sharps to set prices; the odds are instead a pure reflection of the money bet by the wagering public. So, a betting market ranking derived from these odds would be a true distillation of the "wisdom of crowds".

But in order to extend my method to horse racing, I had to overcome the following challenges:

1. Getting access to the data
2. Converting parimutuel odds to a parameter that "adds" like point spreads do
3. Creating a method that works for contests with more than two participants

I've since solved the data access issue. On the second issue, I had to solve a similar challenge to develop my rankings for Major League Baseball. Betting markets in baseball use odds (the "money line") rather than point/run spreads, so I had to create a reverse Pythagorean theorem of sorts for baseball that translated win expectancy into run differential.
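
As a rough illustration of the baseball case, the sketch below inverts the standard Pythagorean expectation, win% = RS^k / (RS^k + RA^k), to recover an implied runs-scored to runs-allowed ratio from a money line. The function names and the exponent of 2 are assumptions made for this example, and the exact transformation used for the rankings may differ. Raw money-line probabilities also include the bookmaker's margin, which is ignored here.

```python
def moneyline_to_prob(moneyline: int) -> float:
    """Convert an American money line to a raw implied win probability."""
    if moneyline < 0:
        # Favorite: risk |moneyline| to win 100
        return -moneyline / (-moneyline + 100)
    # Underdog: risk 100 to win moneyline
    return 100 / (moneyline + 100)


def implied_run_ratio(win_prob: float, exponent: float = 2.0) -> float:
    """Invert win% = RS^k / (RS^k + RA^k) to get the ratio RS / RA."""
    return (win_prob / (1 - win_prob)) ** (1 / exponent)


p = moneyline_to_prob(-150)    # a -150 favorite is about a 60% proposition
ratio = implied_run_ratio(p)   # roughly 1.22 runs scored per run allowed
print(round(p, 3), round(ratio, 3))
```

Turning the ratio into a run differential requires one more assumption, namely the total number of runs expected in the game.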

I was able to create a similar Pythagorean theorem for horse racing by converting parimutuel odds into the "natural units" for that sport: lengths. Given a set of odds, these can be converted into an implied finishing order, and, most importantly, the margin of victory in terms of lengths. I plan on sharing the details in a future, more technical, post. Converting odds into implied length margins also allowed me to incorporate actual results into the model, which led to better accuracy in predicting future market odds.

I solved the third obstacle by introducing an additional set of dummy variables into the model, one for each race. A horse's implied ranking was then modeled as the average quality of the horses in the race plus an adjustment up or down based on the odds for that horse (more details to come in a future post).
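
One way to picture a model with both horse and race effects is the minimal sketch below. It assumes each entry has already been converted to a hypothetical `length_adj` (how many lengths better or worse than the race average the odds imply) and fits `length_adj ≈ rating[horse] - level[race]` by alternating averages, which is a simple coordinate-descent route to the least-squares solution. The names, the toy data, and the fitting procedure are all assumptions for illustration, not the actual model behind the rankings.

```python
from collections import defaultdict


def fit_ratings(entries, n_iter=200):
    """Fit length_adj ~ rating[horse] - level[race].

    entries: list of (horse, race, length_adj) tuples (hypothetical format).
    Returns (rating, level) dictionaries.
    """
    rating = defaultdict(float)
    level = defaultdict(float)
    by_race = defaultdict(list)
    by_horse = defaultdict(list)
    for horse, race, adj in entries:
        by_race[race].append((horse, adj))
        by_horse[horse].append((race, adj))

    for _ in range(n_iter):
        # With ratings held fixed, the best race level is a simple average
        for race, rows in by_race.items():
            level[race] = sum(rating[h] - adj for h, adj in rows) / len(rows)
        # With race levels held fixed, the best horse rating is a simple average
        for horse, rows in by_horse.items():
            rating[horse] = sum(level[r] + adj for r, adj in rows) / len(rows)
    return dict(rating), dict(level)


# Toy data: B runs in both races, which links A and C even though they never meet
entries = [
    ("A", "r1", 1.0), ("B", "r1", -1.0),
    ("B", "r2", 0.5), ("C", "r2", -0.5),
]
rating, level = fit_ratings(entries)
print(round(rating["A"] - rating["C"], 3))
```

Only differences between ratings are pinned down by a model like this, so a real implementation would still need to anchor the scale, for example against an average horse.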

### The Rankings

Below is my first stab at rankings derived from betting odds, based on all North American Thoroughbred races run over the past six months. I used some simple rules to sort horses into three main categories: Dirt (higher quality horses that tend to run at mile distances or greater), Turf (horses that run primarily on the turf), and Sprint (horses that run distances less than a mile).

The "GLA" column stands for Generic Lengths Advantage, and it is the number of lengths you would expect the horse to win by against an average horse in a one-mile race. For example, Gun Runner, the #1 dirt horse, would be expected to beat Sharp Azteca, the #3, by 1.81 lengths in a one-mile race (44.95 - 43.14).
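
Because GLA is measured against a common baseline, the expected margin between any two horses is just the difference of their ratings. A small sketch using three values from the dirt table (the function name is made up for this example):

```python
# GLA values copied from the dirt rankings
gla = {"Gun Runner": 44.95, "Songbird": 44.82, "Sharp Azteca": 43.14}


def expected_margin(horse_a, horse_b, ratings=gla):
    """Expected winning margin, in lengths, of horse_a over horse_b at one mile."""
    return round(ratings[horse_a] - ratings[horse_b], 2)


print(expected_margin("Gun Runner", "Sharp Azteca"))  # 1.81
```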

I cross-checked these against the Daily Racing Form's Weekly Divisional Ratings, and for the most part they seem to pass the sniff test. Some might argue that Arrogate is ranked too low, but we will see what the market truly thinks today as the bets pour in for the Breeders' Cup Classic (note that Gun Runner, not Arrogate, was listed as the morning line favorite).

There are also horses on this list that are clearly misranked, indicating either a data quirk or a need to refine my methodology further. For example, Lisandra, a middling horse with lifetime earnings of \$18,806 over 20 starts, is clearly not the number two sprinter in North America.

Dirt Rankings

| Horse | Rank | GLA |
|---|---|---|
| Gun Runner | 1 | 44.95 |
| Songbird | 2 | 44.82 |
| Sharp Azteca | 3 | 43.14 |
| West Coast | 4 | 43.02 |
| Forever Unbridled | 5 | 42.77 |
| Arrogate | 6 | 42.52 |
| Shaman Ghost | 7 | 42.49 |
| Keen Ice | 8 | 41.46 |
| Abel Tasman | 9 | 40.73 |
| Collected | 10 | 40.71 |
| Elate | 11 | 40.45 |
| Tapwrit | 12 | 40.24 |
| Practical Joke | 13 | 39.87 |
| Diversify | 14 | 39.80 |
| Accelerate | 15 | 39.56 |
| Irap | 16 | 39.54 |
| Seymourdini | 17 | 39.26 |
| Cupid | 18 | 38.64 |
| Neolithic | 19 | 38.39 |
| Unchained Melody | 20 | 38.14 |
| Mubtaahij (IRE) | 21 | 38.11 |
| Cloud Computing | 22 | 37.93 |
| Always Dreaming | 23 | 37.86 |
| Good Samaritan | 24 | 37.76 |
| Battle of Midway | 25 | 37.71 |
Battle of Midway 25 37.71

Turf Rankings

| Horse | Rank | GLA |
|---|---|---|
| Antonoe | 3 | 39.00 |
| World Approval | 4 | 38.69 |
| Time Test (GB) | 5 | 38.37 |
| Sistercharlie (IRE) | 6 | 38.00 |
| Idaho (IRE) | 7 | 37.16 |
| Heart to Heart | 8 | 37.06 |
| Lancaster Bomber | 9 | 36.97 |
| Off Limits (IRE) | 10 | 36.93 |
| Beach Patrol | 11 | 36.63 |
| Fourstar Crook | 12 | 36.54 |
| Grand Jete (GB) | 13 | 36.47 |
| Delta Prince | 14 | 36.45 |
| Cambodia | 15 | 36.27 |
| Nezwaah (GB) | 16 | 36.27 |
| Disco Partner | 17 | 36.23 |
| Kitten's Roar | 18 | 36.20 |
| Quidura (GB) | 19 | 36.18 |
| Blond Me (IRE) | 20 | 36.09 |
| Pure Sensation | 21 | 36.09 |
| Avenge | 22 | 36.07 |
| New Money Honey | 23 | 36.02 |
| Uni (GB) | 24 | 35.89 |
| Bricks and Mortar | 25 | 35.87 |

Sprint Rankings

| Horse | Rank | GLA |
|---|---|---|
| Drefong | 1 | 43.04 |
| Lisandra | 2 | 42.71 |
| Montauk | 3 | 41.82 |
| Imperial Hint | 4 | 41.80 |
| Roy H | 5 | 41.16 |
| El Deal | 6 | 40.73 |
| Danzing Candy | 7 | 40.19 |
| Divining Rod | 9 | 38.86 |
| American Anthem | 10 | 38.70 |
| Takaful | 11 | 38.63 |
| Finley'sluckycharm | 12 | 38.19 |
| Awesome Slew | 13 | 38.13 |
| Lightstream | 14 | 37.83 |
| Coal Front | 15 | 37.62 |
| Carina Mia | 16 | 37.55 |
| Stallwalkin' Dude | 17 | 37.44 |
| American Gal | 18 | 37.40 |
| Limousine Liberal | 19 | 37.29 |
| Ransom the Moon | 20 | 37.28 |
| Ivan Fallunovalot | 21 | 37.03 |
| Whitmore | 22 | 36.96 |
| Highway Star | 23 | 36.71 |
| Unique Bella | 24 | 36.63 |
| Threefiveindia | 25 | 36.59 |

### The Breeders' Cup

Rankings are useless if they can't be used to make predictions. Fortunately, the rankings I have derived can be used in a very straightforward manner to create projected win odds for any desired slate of horses (once again, details in a future post).
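
For readers who want a feel for how ratings on a lengths scale can become win probabilities, here is one generic approach, not necessarily the one used for the tables in this post: simulate each horse's performance as its GLA plus random noise and count how often each horse finishes first. The normal noise model, the `sigma` of 5 lengths, and the function name are all assumptions chosen purely for illustration, so the output will not match the published numbers.

```python
import random


def simulate_win_probs(ratings, sigma=5.0, n_sims=20000, seed=0):
    """Estimate win probabilities by Monte Carlo.

    ratings: dict of horse -> rating in lengths (e.g. GLA).
    sigma:   assumed race-to-race spread of a horse's performance, in lengths.
    """
    rng = random.Random(seed)
    wins = {horse: 0 for horse in ratings}
    for _ in range(n_sims):
        # Draw one noisy performance per horse; the best draw wins the race
        winner = max(ratings, key=lambda h: rng.gauss(ratings[h], sigma))
        wins[winner] += 1
    return {horse: count / n_sims for horse, count in wins.items()}


# Four Classic entrants, with GLA values from the dirt rankings
field = {"Gun Runner": 44.95, "West Coast": 43.02, "Arrogate": 42.52, "Collected": 40.71}
probs = simulate_win_probs(field)
for horse, prob in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{horse:12s} {prob:.1%}")
```

A smaller `sigma` concentrates probability on the top-rated horse, while a larger one flattens the field, so that parameter would have to be calibrated against actual odds.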

I have chosen four high-profile races from today's Breeders' Cup at Del Mar to put my rankings to the test. The tables below compare the morning-line odds and win probabilities with what my rankings imply. In a future post, I will check whether the morning line or my rankings better predicted the closing odds for each race. So far, I have found that my simple model outperforms the morning line at the smaller tracks, but in general it can't compete with the morning line at premier tracks like Santa Anita, Del Mar, Belmont, Churchill, etc.
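
The morning-line win probabilities in the tables below are consistent with a standard conversion: turn each fractional odds quote into a raw probability, 1 / (1 + odds), then rescale so the field sums to 100%. The sketch below applies that to the Classic's morning line; the function name is an assumption, and the normalization step is inferred from the published numbers rather than stated in the post.

```python
from fractions import Fraction


def implied_probs(odds):
    """Convert fractional odds such as '9/5' to normalized win probabilities."""
    raw = {horse: 1 / (1 + Fraction(quote)) for horse, quote in odds.items()}
    total = sum(raw.values())  # exceeds 1 because of the built-in margin
    return {horse: float(p / total) for horse, p in raw.items()}


# Morning line for the Breeders' Cup Classic, copied from the table
classic = {
    "Gun Runner": "9/5", "Arrogate": "2/1", "West Coast": "6/1",
    "Collected": "6/1", "Mubtaahij (IRE)": "12/1", "Churchill (IRE)": "15/1",
    "Pavel": "20/1", "War Decree": "30/1", "Win the Space": "30/1",
    "War Story": "30/1", "Gunnevera": "30/1",
}
ml_probs = implied_probs(classic)
print(round(ml_probs["Gun Runner"] * 100, 1))  # 27.6
```

Before rescaling, the raw probabilities for this field add up to roughly 129%, which is why a 9/5 favorite shows up at 27.6% rather than 35.7%.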

Breeders' Cup - Race 12 - Breeders' Cup Classic

| Horse | Win prob. (morning line) | Win prob. (inpredictable) | Diff | Odds (morning line) | Odds (inpredictable) |
|---|---|---|---|---|---|
| Gun Runner | 27.6% | 31.1% | 3.5% | 9/5 | 9/5 |
| Arrogate | 25.8% | 17.5% | -8.3% | 2/1 | 4/1 |
| West Coast | 11.1% | 19.7% | 8.6% | 6/1 | 7/2 |
| Collected | 11.1% | 11.4% | 0.3% | 6/1 | 6/1 |
| Mubtaahij (IRE) | 6.0% | 6.2% | 0.2% | 12/1 | 13/1 |
| Churchill (IRE) | 4.8% | | | 15/1 | |
| Pavel | 3.7% | 5.4% | 1.7% | 20/1 | 15/1 |
| War Decree | 2.5% | | | 30/1 | |
| Win the Space | 2.5% | 1.4% | -1.1% | 30/1 | 60/1 |
| War Story | 2.5% | 2.9% | 0.4% | 30/1 | 30/1 |
| Gunnevera | 2.5% | 4.5% | 2.0% | 30/1 | 18/1 |

Breeders' Cup - Race 4 - 14 Hands Winery Breeders' Cup Juvenile Fillies

| Horse | Win prob. (morning line) | Win prob. (inpredictable) | Diff | Odds (morning line) | Odds (inpredictable) |
|---|---|---|---|---|---|
| Moonshine Memories | 17.1% | 25.3% | 8.2% | 7/2 | 5/2 |
| Separationofpowers | 15.4% | 10.8% | -4.6% | 4/1 | 7/1 |
| Heavenly Love | 14.0% | 1.8% | -12.2% | 9/2 | 45/1 |
| Alluring Star | 11.0% | 35.9% | 24.9% | 6/1 | 3/2 |
| Wonder Gadot | 8.6% | 1.6% | -6.9% | 8/1 | 50/1 |
| Princess Warrior | 5.9% | 2.4% | -3.5% | 12/1 | 35/1 |
| Gio Game | 4.8% | 1.3% | -3.5% | 15/1 | 60/1 |
| Piedi Bianchi | 4.8% | 10.6% | 5.7% | 15/1 | 7/1 |
| Caledonia Road | 4.8% | 4.9% | 0.1% | 15/1 | 16/1 |
| Blonde Bomber | 3.7% | 1.4% | -2.3% | 20/1 | 60/1 |
| Stainless | 3.7% | 0.8% | -2.9% | 20/1 | 100/1 |
| Maya Malibu | 3.7% | 2.9% | -0.8% | 20/1 | 30/1 |
| Tell Your Mama | 2.5% | 0.2% | -2.3% | 30/1 | 350/1 |

Breeders' Cup - Race 8 - TwinSpires Breeders' Cup Sprint

| Horse | Win prob. (morning line) | Win prob. (inpredictable) | Diff | Odds (morning line) | Odds (inpredictable) |
|---|---|---|---|---|---|
| Drefong | 22.1% | 24.8% | 2.8% | 5/2 | 5/2 |
| Roy H | 17.2% | 15.9% | -1.2% | 7/2 | 9/2 |
| Imperial Hint | 14.0% | 18.5% | 4.5% | 9/2 | 7/2 |
| Takaful | 12.9% | 8.7% | -4.1% | 5/1 | 9/1 |
| Mind Your Biscuits | 11.0% | 11.5% | 0.5% | 6/1 | 6/1 |
| American Pastime | 5.9% | 4.8% | -1.1% | 12/1 | 17/1 |
| Ransom the Moon | 5.9% | 6.3% | 0.4% | 12/1 | 12/1 |
| Whitmore | 4.8% | 5.9% | 1.0% | 15/1 | 13/1 |
| Calculator | 3.7% | 2.7% | -1.0% | 20/1 | 30/1 |
| B Squared | 2.5% | 0.8% | -1.7% | 30/1 | 100/1 |

Breeders' Cup - Race 10 - Sentient Jet Breeders' Cup Juvenile

| Horse | Win prob. (morning line) | Win prob. (inpredictable) | Diff | Odds (morning line) | Odds (inpredictable) |
|---|---|---|---|---|---|
| Bolt d'Oro | 27.4% | 34.2% | 6.7% | 9/5 | 3/2 |
| Free Drop Billy | 12.8% | 13.2% | 0.4% | 5/1 | 5/1 |
| Solomini | 11.0% | 12.9% | 2.0% | 6/1 | 6/1 |
| Firenze Fire | 11.0% | 5.3% | -5.7% | 6/1 | 15/1 |
| U S Navy Flag | 8.5% | | | 8/1 | |
| Good Magic | 8.5% | 8.2% | -0.3% | 8/1 | 9/1 |
| Hollywood Star | 4.8% | 10.0% | 5.2% | 15/1 | 7/1 |
| Givemeaminit | 3.7% | 5.2% | 1.5% | 20/1 | 15/1 |
| The Tabulator | 3.7% | 5.2% | 1.5% | 20/1 | 15/1 |
| Hazit | 3.7% | 4.0% | 0.3% | 20/1 | 20/1 |
| Bahamian | 2.5% | 1.7% | -0.7% | 30/1 | 50/1 |
| Golden Dragon | 2.5% | 0.2% | -2.3% | 30/1 | 500/1 |

### The Complete Rankings

For those who are interested, I have dumped the rankings for all horses (all 37,000+ of them) into a Google spreadsheet. Any comments or feedback are welcome.