Saturday, April 25, 2015

The Improbability of the Warriors' Comeback

Hindsight has a way of making the improbable seem inevitable. Of course the Warriors erased a 20-point deficit in the fourth quarter (despite being only the third playoff team to do so). Of course Steph Curry hit that game-tying three from the corner to force overtime (despite having missed from the wing just three seconds prior).

But a 20-point comeback is anything but inevitable, and we tend to forget the games in which a blowout stays a blowout because, well, those games are forgettable. So what do the numbers say?

My own win probability model put the Warriors' chances as low as 0.2%. That low point came after a miss by the Warriors' Shaun Livingston with 6:24 left in the game and Golden State down by seventeen. Livingston would rebound his own miss for the putback dunk, tripling his team's chances to 0.6%.

But that win probability estimate assumes teams that are evenly matched. The Warriors, however, are anything but an even match for the Pelicans. At home, they were 12.5-point favorites over the Pelicans in the first two games of their first round series. With game three being in New Orleans, the Warriors were still favorites, but not overwhelmingly so at just 5 points. If we use the win probability model calibrated to pre-game odds, the Warriors' comeback becomes slightly less improbable, with a low point of 0.4%.

How does this compare to other estimates of the Warriors' chances? I am aware of two others:
  • Gambletron 2000 - A site that aggregates in-game betting data. Think of it as a stock ticker for each NBA game (along with many, many other sports)
  • numberFire - The popular fantasy/analytics site which has recently begun publishing in-game probabilities for both football and basketball.
According to Gambletron, the Warriors' chances sank as low as 2.4% with around 6 or 7 minutes left in the fourth quarter. That is six times my low point estimate of 0.4%.

numberFire's estimate is even further from mine, with a Warriors win probability of 6% right around the seven minute mark of the fourth quarter. While neither model is going to be objectively "right", this is clearly a significant difference.

So let's look at the raw data. I built my win probability model from play-by-play data spanning the 2000-2012 seasons. With over a thousand games per season, this makes for a fairly robust dataset. Here is how often teams came back from large deficits midway through the fourth quarter:

minutes to go:
             seven minutes        six minutes          five minutes
trailing by  games  won    pct    games  won    pct    games  won    pct
20             594    0   0.0%      613    0   0.0%      587    0   0.0%
19             703    0   0.0%      724    0   0.0%      708    0   0.0%
18             764    3   0.4%      760    1   0.1%      754    1   0.1%
17             839    6   0.7%      887    1   0.1%      884    2   0.2%
16             914    2   0.2%      960    3   0.3%      921    1   0.1%
15            1059   16   1.5%     1004    7   0.7%     1036    2   0.2%
14            1138   19   1.7%     1129   14   1.2%     1194    4   0.3%

The Warriors were down 17 with six minutes to go in their game. Over the course of 13 NBA seasons, there were 887 games in which a team trailed by that many with that much time remaining. Only once did that team go on to win - a raw frequency of just 0.1%. Eyeballing these numbers, my model estimate of about 0.5% seems most in line with the actual data, compared to numberFire and Gambletron.
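For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the raw frequency falls out of the table. It also pools the neighboring deficits (16 to 18 points) as a crude way of smoothing a single noisy cell. The pooling is only an illustration, not how my model works, and the names `six_minutes`, `comeback_rate`, and `pooled_rate` are made up for this example.

```python
# (games, wins) for teams trailing by a given margin with six minutes left,
# copied from the all-games table above.
six_minutes = {
    18: (760, 1),
    17: (887, 1),
    16: (960, 3),
}


def comeback_rate(games, wins):
    """Raw comeback frequency as a fraction of games."""
    if games == 0:
        return float("nan")
    return wins / games


def pooled_rate(cells, deficits):
    """Comeback frequency after lumping several deficits together."""
    games = sum(cells[d][0] for d in deficits)
    wins = sum(cells[d][1] for d in deficits)
    return comeback_rate(games, wins)


raw = comeback_rate(*six_minutes[17])
pooled = pooled_rate(six_minutes, [16, 17, 18])
print(f"down 17, six minutes left: {raw:.2%}")
print(f"down 16 to 18, six minutes left: {pooled:.2%}")
```

Pooling trades precision about the exact deficit for a larger sample, which is the same tradeoff any smoothed model has to make.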

But this dataset includes all games, underdogs and favorites alike. What if we restrict the view to favorites? The Warriors were 5-point favorites at New Orleans. The table below only looks at outcomes for trailing teams that were favored by 2.5 to 7 points:

2.5 to 7 point favorites, minutes to go:
             seven minutes        six minutes          five minutes
trailing by  games  won    pct    games  won    pct    games  won    pct
20              55    0   0.0%       65    0   0.0%       68    0   0.0%
19              74    0   0.0%       78    0   0.0%       91    0   0.0%
18              83    0   0.0%       96    0   0.0%       83    0   0.0%
17             117    1   0.9%      121    0   0.0%       99    0   0.0%
16             116    0   0.0%      107    0   0.0%      102    0   0.0%
15             152    1   0.7%      132    1   0.8%      125    0   0.0%
14             161    4   2.5%      133    1   0.8%      142    0   0.0%

And here is the data for 7.5 to 12 point favorites:

minutes to go:
             seven minutes        six minutes          five minutes
trailing by  games  won    pct    games  won    pct    games  won    pct
20              13    0   0.0%       12    0   0.0%       15    0   0.0%
19              17    0   0.0%       16    0   0.0%       16    0   0.0%
18              16    0   0.0%       19    0   0.0%       26    0   0.0%
17              28    1   3.6%       24    1   4.2%       21    0   0.0%
16              28    0   0.0%       35    0   0.0%       30    0   0.0%
15              27    1   3.7%       34    1   2.9%       31    0   0.0%
14              43    2   4.7%       35    1   2.9%       40    1   2.5%
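One way to put a number on how little these small samples pin down is a confidence interval around each raw frequency. The sketch below uses the Wilson score interval, a standard choice for proportions near zero. It is not part of my model, and the function name `wilson_interval` and the 95% level (z = 1.96) are choices made for this example. The inputs are the "down 17, six minutes to go" cells from the three tables above.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a binomial proportion (default roughly 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    # Clamp to [0, 1] to absorb floating point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


# (label, wins, games) for teams down 17 with six minutes to go.
cells = [
    ("all games", 1, 887),
    ("2.5 to 7 point favorites", 0, 121),
    ("7.5 to 12 point favorites", 1, 24),
]

for label, wins, games in cells:
    low, high = wilson_interval(wins, games)
    print(f"{label}: {wins}/{games}, interval {low:.2%} to {high:.2%}")
```

The interval for the 7.5 to 12 point bucket comes out far wider than the one for all games, which is what a sample of 24 games should lead you to expect.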

As you can see, the data gets fairly sparse and noisy once we start slicing and dicing. The art of building a win probability model is drawing smooth, rational lines through a messy cloud of data points. Feel free to draw your own conclusions, but the data above gives me confidence in my model's estimates, as well as a proper appreciation of what Steph Curry and the Warriors pulled off Thursday night in New Orleans.