Tuesday, May 26, 2015

Introducing ShArc: Shot Arc Analysis

This post has been in the works for several months now, and is the result of a deep dive into the NBA's SportVU motion tracking data. It's a project I picked up when I had time, and put down when I got stuck (which was often). What follows is rough and unpolished, but for me, its creation was a lot of fun. It allowed me to take my first and most abiding academic obsession, physics, and combine it with my more recent obsessions of data mining and sports analytics. In short, I nerd-sniped myself.

The Mixed Blessing of Big Data

The NBA is in the midst of a data explosion that at times feels paralysis-inducing. We're spoiled for choice, with literally hundreds of thousands of data points, per game, begging to be dissected and analyzed. And yet, progress has been made. Kirk Goldsberry of Grantland, an early pioneer of shot location data, last year introduced a new SportVU-based stat called EPV, or Expected Point Value. EPV evaluates a team's expected points on a real-time basis, as the possession evolves, accounting for shot clock, ball location, and the position of all ten players on the court. A new "microeconomics" for the NBA, as Goldsberry and his coauthors described it in their Sloan Analytics paper.

The NBA itself, in addition to funding and supporting the addition of the cameras, has also developed a whole host of new stats from this "big data" they helped create: Catch and Shoot, Defense at the Rim, Pull Up Shooting, among many, many others. There is also the work of the smart people at Nylon Calculus, making sense of the SportVU data with simple metrics built from common sense understanding of the game of basketball (as opposed to inscrutable mathematical black boxes).

In this post, I begin my own foray into the NBA's big data world, but with a different focus. While it's common knowledge that SportVU data provides location in two dimensions for every player on the court, what may not be widely appreciated is that the ball itself is tracked in all three dimensions. When developing EPV with his students, Kirk Goldsberry code named the work "XY Hoops". Consider the work below an "XYZ Hoops" project of sorts.

Friday, May 15, 2015

Heavy Favorites Usually Don't Surrender Big Leads (usually)

The Houston Rockets pulled off the second most improbable comeback of this year's playoffs last night. Down by 19 with 2 minutes left in the third, the Rockets finished the game on a ridiculous 49-18 run to force a game seven in their conference semifinals series with the Clippers.

From 2000 to 2012, there were 624 games in which a team trailed by 19 with 2 minutes left in the third. In just 12 of those games (1.9%) did the trailing team go on to win. But that includes all games, including those in which a heavily favored team fights back from a steep deficit.

The Rockets were 8.5 point underdogs against the Clippers, and heavy underdogs rarely pull off what Houston did last night. Here is the raw data from the 2000-2012 NBA seasons (the raw data behind my win probability model).

Two minutes left in the third quarter:

               all games               7.5 to 12 pt underdogs
trailing by    games   won   pct       games   won   pct
21             500     5     1.0%      178     0     0.0%
20             599     10    1.7%      203     1     0.5%
19             624     12    1.9%      193     0     0.0%
18             749     17    2.3%      212     3     1.4%
17             843     26    3.1%      242     2     0.8%

Out of 193 games, not a single underdog of 7.5 to 12 points came back from a 19 point deficit. These raw numbers are fairly consistent with the two win probability graphs for this game (by design):

• The "50/50" version, which ignores team strength differences. The Rockets' low point was 2.3% in that version.
• The "pre game" version, which factors in the Vegas point spread, with a low point of 0.7% for the Rockets.
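
The raw comeback rates quoted above are simply wins divided by games for each deficit. A minimal Python sketch that recomputes them from the counts in the table (the variable and function names are illustrative, not taken from the win probability model itself):

```python
# Counts copied from the table above (2000-2012 seasons, two minutes
# left in the third quarter). Each row is:
# (deficit, all_games, all_wins, underdog_games, underdog_wins)
ROWS = [
    (21, 500, 5, 178, 0),
    (20, 599, 10, 203, 1),
    (19, 624, 12, 193, 0),
    (18, 749, 17, 212, 3),
    (17, 843, 26, 242, 2),
]


def comeback_pct(wins, games):
    """Share of games won by the trailing team, as a percentage."""
    return 100.0 * wins / games


# Map each deficit to (all-games pct, 7.5 to 12 pt underdog pct).
rates = {
    deficit: (comeback_pct(all_w, all_g), comeback_pct(dog_w, dog_g))
    for deficit, all_g, all_w, dog_g, dog_w in ROWS
}

for deficit, (all_pct, dog_pct) in rates.items():
    print(f"trailing by {deficit}: all games {all_pct:.1f}%, "
          f"underdogs {dog_pct:.1f}%")
```

With samples of roughly 200 underdog games per row, a rate of 0.0% should be read as "rare", not "impossible".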

Thursday, May 14, 2015

The Clutch Shooting of Paul Pierce

Paul Pierce currently leads all NBA players in win probability added for the 2015 playoffs. Win probability added, as I've defined it, measures the impact a player's shots, free throw attempts, and turnovers have on his team's chances of victory. It deliberately gives more weight to "clutch" shots that occur during crucial game situations - situations such as Pierce's go-ahead (temporarily so) three point shot last night against the Hawks.

That shot, which put the Wizards up 1 with 8 seconds left, increased their win probability by 50 percentage points, from 18% to 68%. Had Pierce missed, the Wizards' win probability would have dropped to 8%, a total potential swing of 60 percentage points riding on Pierce's shot. As I did earlier this week with Derrick Rose, I can measure Pierce's shooting performance during similar high pressure, clutch situations.
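
The swing arithmetic for Pierce's shot can be sketched in a few lines of Python. The function name and dictionary keys are hypothetical, chosen for illustration; the three probabilities are the ones quoted above.

```python
def shot_swing(wp_before, wp_if_made, wp_if_missed):
    """Break a shot's stakes into win probability deltas (as fractions)."""
    return {
        # Win probability added by making the shot.
        "gain_if_made": wp_if_made - wp_before,
        # Win probability lost by missing it.
        "loss_if_missed": wp_before - wp_if_missed,
        # Total win probability riding on the shot.
        "total_swing": wp_if_made - wp_if_missed,
    }


# Pierce's go-ahead three: 18% before, 68% after a make, 8% after a miss.
pierce = shot_swing(wp_before=0.18, wp_if_made=0.68, wp_if_missed=0.08)

for name, value in pierce.items():
    print(f"{name}: {value:.0%}")
```

The total swing is always the sum of the gain from a make and the loss from a miss, which is why a shot can carry more at stake than the win probability it actually added.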

Saturday, May 9, 2015

The Clutch Shooting of Derrick Rose

The bank was open late for Derrick Rose last night. The Bulls took a two games to one lead over the Cavaliers, thanks to his buzzer beating three pointer. "I don't mean to sound cocky", Rose said, "but that's a shot you want to take if you are a player in my position".

Clutch shooting isn't easy, and it gets progressively harder as the stakes increase. For Rose's bank shot last night, roughly 50% of the game's win probability hung in the balance (make: win, miss: overtime). Last night's shot fell (with a little help from the United Center glass), but how has Rose fared in similar clutch situations, and how does that compare to league averages? The table below summarizes Derrick Rose's three point field goal percentage as a function of win probability "swing" (i.e. the stakes).

Thursday, May 7, 2015

Reggie Miller Shoots, Steals, and Rebounds the Pacers out of a 250 to 1 Hole

Today marks the 20th anniversary of one of the most impressive NBA playoff performances of all time: Reggie Miller's sucker punch of the Knicks in Game 1 of the 1995 Eastern Conference Semifinals. In nine seconds of game time, Miller scored eight unanswered points, the first six of which came in a brutally efficient five second span. Miller's eight points turned a seemingly insurmountable six point deficit into a two point lead. The Knicks players, who appeared just as shocked as the remaining MSG crowd that day, weren't even able to get a shot off in their final possession.

If you want to relive the moment, you can't go wrong with ESPN's 30 for 30 feature Winning Time. And Shea Serrano has a feature today on Grantland about his experiences as a young Reggie Miller fan in the mid-nineties (I, too, missed the 8 points in 9 seconds miracle due to the mistaken belief that the game was over). I'm not the writer Shea is, and I'm certainly not a filmmaker, but I can put Reggie's achievement in a historical context of sorts.

Monday, May 4, 2015

The Spurs-Clippers Series in a Single Chart

Three of the top five most exciting games of the NBA playoffs' first round came from a single series: Spurs-Clippers. The matchup, while having no business being in the first round (the #2 and #4 teams, according to my market rankings), nevertheless delivered with one of the most exciting seven game playoff series in recent memory.

For this post, I have attempted to condense the drama of that series onto a single page. The chart below shows the Los Angeles Clippers' probability of winning the series on a play by play basis. Each game's win probability chart was adjusted to show its impact on the ultimate outcome of the series. On the chart, the lines are color coded based on location (red for Clippers home games, black for the Spurs).
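
One way to sketch that kind of adjustment in Python, and not necessarily the method used for the chart: weight the series win probability after a win and after a loss by the in-game win probability. The sketch assumes a best-of-seven series and, as a simplification, that every remaining game is won independently with a fixed probability `p_future` (home court is ignored). All function and parameter names are hypothetical.

```python
from functools import lru_cache


def series_win_prob(wins_a, wins_b, p_future=0.5, wins_needed=4):
    """Chance team A takes the series from a (wins_a, wins_b) state.

    Assumes A wins each remaining game independently with p_future.
    """
    @lru_cache(maxsize=None)
    def rec(a, b):
        if a == wins_needed:
            return 1.0
        if b == wins_needed:
            return 0.0
        return p_future * rec(a + 1, b) + (1.0 - p_future) * rec(a, b + 1)

    return rec(wins_a, wins_b)


def in_game_series_prob(p_game, wins_a, wins_b, p_future=0.5):
    """Series win probability for A at a moment inside the current game.

    p_game is A's in-game win probability at that moment; wins_a and
    wins_b are the series wins before the current game.
    """
    if_win = series_win_prob(wins_a + 1, wins_b, p_future)
    if_loss = series_win_prob(wins_a, wins_b + 1, p_future)
    return p_game * if_win + (1.0 - p_game) * if_loss


# Example: series tied 2-2, team A at 80% to win the game in progress.
print(round(in_game_series_prob(0.8, 2, 2), 3))
```

Under these assumptions, a game seven maps the in-game win probability directly onto the series probability, while earlier games are compressed toward the middle of the chart.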

Saturday, May 2, 2015

NBA Live Win Probabilities - Mark II

A few new bells and whistles have been added to the live version of the NBA Win Probability graphs. Users now have the option of selecting either the "50/50" win probabilities or the "Adjusted" version, the latter of which takes into account the pre-game point spread. The 50/50 version assumes both teams are evenly matched and playing on a neutral court. The adjusted version would be more appropriate for in-game betting, and should track fairly closely to the probabilities found at Gambletron 2000.

You can toggle between the two versions, or, if you can't make up your mind, just enable the "Show Both Graphs" checkbox.

Note that the Excitement Index, Comeback Factor, and Player Win Probability Added stats always use the 50/50 version, irrespective of user input.

That link again: Live NBA Win Probabilities (graph will update once Spurs-Clippers gets underway tonight).