Wednesday, February 18, 2015

New NBA Features at FiveThirtyEight

Today, FiveThirtyEight is running two new features based on my NBA win probability model. The first is a Datalab article on "exciting" NBA games this season and whether those are predictable in advance.

The second is an interactive that summarizes each team's average win probability by minute for the 2014-15 NBA season (pre-All Star break). In effect, it's a real time evolution of a team's win-loss record. For example, midway through the first quarter, the Sacramento Kings still looked like an above 0.500 team. One of the nice things about working with FiveThirtyEight, aside from the added exposure it brings here, is the opportunity to work with their talented data visualization experts - Allison McCann for this feature, and Reuben Fischer-Baum for the NFL Playoff Implications weekly interactive.

Monday, February 16, 2015

How NBA Games Are Won

What's more important in basketball: rebounding or getting to the foul line? Field goal percentage or forcing turnovers? These questions aren't new, but for this post I will use my win probability model to provide a new perspective on what matters most when it comes to winning basketball games.

Dean Oliver, pioneer of basketball statistical analysis, identified what he termed the "four factors" of basketball success in his influential book Basketball on Paper. Those four factors are:
  • Shooting
  • Free Throws
  • Rebounding
  • Turnovers
Nearly everything that is important to the game of basketball can be attributed to one of those four factors. But is each factor created equal? Or is one "more equal" than the others? Oliver himself tackled this question, using his futuristically-titled RoboScout program. Here is how Oliver assessed the relative importance of the four factors:

Saturday, February 7, 2015

Anthony Davis's Torrid MVP Pace

Anthony Davis is still considered a long shot to win NBA MVP this year (despite recent heroics). But by at least one measure (and others), it's not even close - Anthony Davis is the league's MVP. Using my win probability model, I can assign a value to each player's contributions, based on how those contributions affected their team's chances of winning. This approach automatically devalues garbage time stats, and assigns more credit for clutch performance. When ranked by Win Probability Added (WPA), Davis' total of 7.32 is far and away the league's best this season. Atlanta's Kyle Korver is a distant second with 4.86. Here is the top 10:


Friday, February 6, 2015

Updated NBA Win Probability Calculator

The odds of winning a game when down by 6
with 18 seconds left are approximately 250 to 1.
Last month, I rolled out a new version of my NBA Win Probability Graphs and Box Scores (new link | old link). In addition to adding some new features, such as the option of displaying real time along the horizontal axis, the underlying win probability model was rebuilt as well. The dataset was updated and expanded, model parameters were further optimized, and handling of late game situations was improved, particularly in the final seconds.

Until now, that new model was only used to generate the graphs. The interactive Win Probability Calculator was still using the old model. The calculator tool has now been updated with the new, improved model. I have also removed the "Beta" tag that had been there since its inception.

But how do I know the model is improved, and not just new? One way to assess a probability model's accuracy is by measuring log-likelihood. Likelihood, in this context, signifies the probability the model assigned to the outcome that actually occurred. For example, if the model says that the win probability for a team is 15%, and the team actually goes on to win the game, the likelihood is 0.15. If the team loses, the likelihood is 0.85. We can do this calculation for all game situations in which the model estimates a win probability. The total likelihood is just the product of all of those individual likelihoods. As a mathematical convenience, one often takes the natural logarithm of that product, which turns it into a sum of individual log-likelihoods.
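
As a rough illustration of the calculation described above, here is a minimal Python sketch. The function name and the sample numbers are made up for the example and are not taken from the actual model or dataset; it only shows how individual likelihoods combine into a total log-likelihood.

```python
import math

def log_likelihood(win_probs, team_won):
    """Sum of log-likelihoods over a set of game situations.

    win_probs: model win probabilities for a team (hypothetical inputs),
               each strictly between 0 and 1.
    team_won:  booleans, True if that team went on to win the game.
    """
    total = 0.0
    for p, won in zip(win_probs, team_won):
        # Likelihood is the probability the model gave to what actually happened.
        likelihood = p if won else 1.0 - p
        # The log of a product is the sum of the logs.
        total += math.log(likelihood)
    return total

# Illustrative values only: a 15% underdog that won, and one that lost.
example = log_likelihood([0.15, 0.15], [True, False])
```

A log-likelihood closer to zero means the model assigned higher probabilities to the outcomes that actually occurred, so when two models are scored on the same set of games, the higher (less negative) total is the better one by this measure.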

Sunday, February 1, 2015

Squares Probabilities at FiveThirtyEight

In a squares pool today? Want to know if your square is any good? Check my latest article at FiveThirtyEight: How Much Money You're Going to Win Playing Super Bowl Squares.

Expected payouts are shown for the standard version of Super Bowl squares, in which payouts occur based on the score at the end of each quarter. I also calculated expected payouts for the "Every Score Pays" rules, in which 5% of the pool is dished out for every score change (including extra points).

In doing some idle googling for the article, I came across the following on Leon Lett and the hate mail he received from squares owners after Super Bowl XXVII (didn't make the edit, so I thought I'd share it here):

Saturday, January 24, 2015

Australian Open Win Probability Graphs

Win probability graphs for the Australian Open are now available. Unlike my win probability models for the NBA and soccer, which required regression analyses and lots of smoothing, the development of the tennis model was more straightforward, if a bit tedious. Once you specify the probability of winning a point on serve, the rest is no worse than a college-level probability exercise.
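
To give a flavor of that probability exercise, here is a minimal Python sketch of just the first step: turning the probability of winning a point on serve into the probability of holding serve for a single game. It assumes points are independent and identically distributed, and the function name is made up for the example. It is not the actual model code, which would also need to handle sets, tiebreaks, and the match as a whole.

```python
def game_win_prob(p):
    """Probability the server wins a game, given probability p of winning each point.

    Assumes every point is independent with the same probability p
    (an assumption of this sketch).
    """
    q = 1.0 - p
    # Win to love, to 15, or to 30: the server takes the final point after
    # conceding 0, 1, or 2 points (1, 4, and 10 possible orderings).
    win_before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce at three points apiece: 20 possible orderings.
    reach_deuce = 20 * p**3 * q**3
    # From deuce, the server must win two points in a row before the returner does.
    win_from_deuce = p**2 / (p**2 + q**2)
    return win_before_deuce + reach_deuce * win_from_deuce

# Hold probability for a server who wins 67.5 percent of points on serve.
hold = game_win_prob(0.675)
```

The same approach extends upward: game probabilities feed into set probabilities, and set probabilities into the match, which is where the tedium comes in.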

The graphs come in two versions. The "50/50" version assumes the two competitors to be of equal strength, with a 67.5 percent probability of winning a point on serve, and 32.5 percent probability of winning a point when returning. The "Market" version of the graphs adjusts both the serve and return probabilities up or down so that the starting probability aligns with the pre-match betting odds. For example, take Andreas Seppi's third round upset of Roger Federer. The betting market gave Seppi just a 6 percent chance of beating Federer, which implied a 67.2 percent serve probability for Federer and a corresponding 57.6 percent serve probability for Seppi. Here is the graph:

Friday, January 16, 2015

NBA Win Probability - Take Two

On the eve of the 2014-15 NBA season, I hinted at new features I planned on adding to my NBA Win Probability tools. I believe the word I used was "soon", and I suppose this qualifies, but it took a bit longer than expected. I'm not sure what went wrong; I even took into account Hofstadter's Law.

The new graphs and their corresponding features use a different data source that goes back to the 1996-97 NBA season. I just have current season results up now, but I have plans to add prior seasons in the coming weeks, barring any data issues or website strain (I'm dying to see what Tracy McGrady's 13 points in 35 seconds looks like on a win probability graph).

With the new data source, I created a new win probability model, removing some of the kludge-iness of the old version. The new model appears to perform better out of sample, yielding a higher log-likelihood score for the 2014 season than the existing version.