Saturday, October 13, 2018

Betting Market Rankings for the NHL

In what is surely the second most exciting development of the NHL season, I have added hockey to my suite of betting market team rankings. For those unfamiliar, the basic idea of these rankings is to reverse engineer an implied team ranking from the game-by-game point spreads, moneylines, and over/unders. See my post at the Advanced NFL Stats Community site for a basic overview of the concept. With this latest addition, I now have daily market rankings for the NFL, College Football, NBA, WNBA, College Basketball, MLB, and the NHL.
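As a rough illustration of the reverse engineering idea (and not the actual model behind these rankings), the sketch below fits one rating per team plus a home advantage term to a handful of market-implied margins using least squares. The team names, the margins, and the `implied_ratings` function are all made up for the example, and it uses margins only, ignoring moneylines and over/unders.

```python
import numpy as np

def implied_ratings(games, teams):
    """Least-squares team ratings from market-implied margins.

    games: list of (home, away, margin), where margin is the market's
           expected home-team margin of victory (hypothetical input format).
    Returns (ratings dict, home advantage).
    """
    idx = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    # One column per team, plus a final column for home advantage.
    # The extra last row pins the ratings to sum to zero.
    A = np.zeros((len(games) + 1, n + 1))
    b = np.zeros(len(games) + 1)
    for row, (home, away, margin) in enumerate(games):
        A[row, idx[home]] = 1.0
        A[row, idx[away]] = -1.0
        A[row, n] = 1.0
        b[row] = margin
    A[-1, :n] = 1.0
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dict(zip(teams, solution[:n])), solution[n]

# Hypothetical teams and market margins, for illustration only.
teams = ["A", "B", "C"]
games = [
    ("A", "B", 1.2),
    ("B", "C", 1.2),
    ("C", "A", -1.8),
    ("B", "A", -0.8),
]
ratings, home_advantage = implied_ratings(games, teams)
ranking = sorted(teams, key=lambda t: ratings[t], reverse=True)
```

Because every game ties two teams together, a few days of lines are enough to connect the whole league, which is why a ranking like this can settle down quickly.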

The nice thing about market-derived rankings is that you can get a reasonable ranking with a relatively small sample size. We are just a week into the season, and the rankings already pass a sniff test. The top 5 of Tampa Bay, Nashville, Winnipeg, Toronto, and Pittsburgh in my rankings are also the top 5 favored teams to win the Stanley Cup.

Saturday, September 29, 2018

The NBA's new shot clock rule and its effect on pace

Earlier this month, the NBA formally approved a change to its shot clock rules. Now, following an offensive rebound, the shot clock will reset to just 14 seconds, instead of the usual 24.

Over at Nylon Calculus, Daniel Massop argues that the effect on pace will be minimal, given that only 6 percent of offensive rebound possessions lasted more than 14 seconds. For a deeper dive, check out Blake Murphy's piece at Uproxx, which uses, among other data points, stats from the NBA's G-League, which went to the 14-second rule two seasons back.

As it turns out, the WNBA was also an early adopter of this rule change, having switched to 14 seconds for the 2016 season. That rule's impact on pace can provide clues to what will happen in the NBA this season.

The chart below shows average seconds per possession in the WNBA for every season.


Saturday, May 5, 2018

Never Bet a Horse Named Joe: Update

Several years back, I tested a theory that horses with popular boys' or girls' names were overbet in parimutuel markets. My hunch was that bettors are more likely to bet on a horse if that horse's name contains their own name (or that of their wife, son, daughter, etc.). I arrived at that hunch by extrapolating from a sample size of one - me.

What I found in that original analysis, using a limited dataset of races run in California, was weak evidence for my theory that fell short of statistical significance. However, I now have a much more robust dataset, consisting of nearly all races run in North America over the past four years.

With this new dataset of some 200,000 races, I ran a logistic regression on the probability of a horse winning a race using the following variables:

Saturday, January 27, 2018

Odds Predictions for the Pegasus Cup

The second running of the $16 million Pegasus World Cup takes place later today at Gulfstream Park. As I did with the Breeders' Cup, I will use my newly developed betting market rankings for horse racing to make odds predictions for this race. First, here are updated rankings, broken down by the following categories:
  • Dirt - Horses that run primarily dirt races of a mile or more
  • Turf - Horses that run primarily on the turf
  • Sprint - Horses that run races less than a mile
These rankings are based on odds data and results for all North American thoroughbred races run over the past 120 days. Using techniques similar to those behind my other betting market rankings, I use multivariate regression to build connections between the races and derive what the market "thinks" are the best horses.

GLA stands for "Generic Lengths Advantage" and is the expected margin of victory (in lengths) over an average North American thoroughbred. So, Gun Runner would be expected to beat Arrogate by about 1.4 lengths in a mile race.

Sunday, January 14, 2018

Judging Win Probability Models

February 11, 2018 update: The Brier score chart at the bottom of this post had an incorrect value for the ESPN "Start of Game" score. The corrected numbers (with updates through 2/10/18) can be found in this tweet. With the update, my comments regarding the ESPN model being too reactive no longer apply.

Win probability models tend to get the most attention when they are "wrong". According to ESPN, the Atlanta Falcons famously had a 99.7% chance to win Super Bowl LI while holding a 28-3 lead in the third quarter, with the Patriots facing fourth down. Google search interest in "win probability" reached a five-year high in the week following the Patriots' improbable comeback.



Some point to the Falcons' 99.7% chances, and other improbable results, as evidence of the uselessness of win probability models. But a 99.7% prediction is not certainty; it should be incorrect 3 out of every 1,000 times. And it's not like we can replay last year's Super Bowl 1,000 times (unless you live inside the head of a Falcons fan).

So, in what sense can a probability model ever be wrong? As long as you don't predict complete certainty (0% or 100%), you can hand wave away any outcome, as I did above with the Falcons collapse. Or take another high-profile win probability "failure": the November 2016 Presidential Election. On the morning of the election, Nate Silver's FiveThirtyEight gave Hillary Clinton a 71% chance of winning the presidency and Donald Trump a 29% chance.
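A single outcome can't settle the question, but a model can be scored over many predictions. The Brier score mentioned in the update note above is one standard way to do that: it is the mean squared difference between the forecast probability and what actually happened (1 for a win, 0 for a loss), so lower is better. The snippet below is a minimal sketch of that calculation; the `brier_score` function name and the sample forecasts are made up for illustration and are not taken from any of the models discussed here.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical forecasts for four games and whether the favored side won.
sample_forecasts = [0.90, 0.75, 0.60, 0.55]
sample_outcomes = [1, 1, 0, 1]
sample_score = brier_score(sample_forecasts, sample_outcomes)

# Reference points: always saying 50% scores 0.25, and a perfect
# forecaster scores 0.
coin_flip_score = brier_score([0.5, 0.5], [1, 0])
perfect_score = brier_score([1.0, 0.0], [1, 0])
```

A confident miss like a 99.7% favorite losing costs nearly a full point on that one game, but it only sinks a model's average if such misses happen more often than the stated probabilities imply.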