## Tuesday, May 26, 2015

### Introducing ShArc: Shot Arc Analysis

This post has been in the works for several months now, and is the result of a deep dive into the NBA's SportVU motion tracking data. It's a project I picked up when I had time, and put down when I got stuck (which was often). What follows is rough and unpolished, but for me, its creation was a lot of fun. It allowed me to take my first and most abiding academic obsession, physics, and combine it with my more recent obsessions of data mining and sports analytics. In short, I nerd-sniped myself.

### The Mixed Blessing of Big Data

The NBA is in the midst of a data explosion that at times feels paralysis-inducing. We're spoiled for choice, with literally hundreds of thousands of data points, per game, begging to be dissected and analyzed. And yet, progress has been made. Kirk Goldsberry of Grantland, an early pioneer of shot location data, last year introduced a new SportVU-based stat called EPV, or Expected Point Value. EPV evaluates a team's expected points on a real-time basis, as the possession evolves, accounting for shot clock, ball location, and the position of all ten players on the court. It is a new "microeconomics" for the NBA, as Goldsberry and his coauthors described it in their Sloan Analytics paper.

The NBA itself, in addition to funding and supporting the addition of the cameras, has also developed a whole host of new stats from this "big data" they helped create: Catch and Shoot, Defense at the Rim, Pull Up Shooting, among many, many others. There is also the work of the smart people at Nylon Calculus, making sense of the SportVU data with simple metrics built from common sense understanding of the game of basketball (as opposed to inscrutable mathematical black boxes).

In this post, I begin my own foray into the NBA's big data world, but with a different focus. While it's common knowledge that SportVU data provides location in two dimensions for every player on the court, what may not be widely appreciated is that the ball itself is tracked in all three dimensions. When developing EPV with his students, Kirk Goldsberry code named the work "XY Hoops". Consider the work below an "XYZ Hoops" project of sorts.

SportVU cameras track the ball's location approximately 25 times per second. Here is what that data looks like for a successful free throw - one made by the Knicks' Lance Thomas on April 5:

Data can be messy, and the SportVU data is no exception. This particular example turned out to be pretty clean, but notice the odd dip in the ball's trajectory on its way down. Barring a random wind gust from the rafters of Madison Square Garden, this is most likely noise rather than the ball's actual path. So, to better make sense of the data, I can apply some simple physics concepts to derive what was likely the true trajectory of the ball.

There are four basic forces that affect a basketball's trajectory. Here they are, in order of importance:
• Gravity - In a vacuum, objects near the Earth's surface accelerate downwards at a rate of 32.2 feet per second squared.
• Buoyancy - This force goes in the opposite direction of gravity, and is proportional to the weight of the air displaced by the ball (the same force that causes a helium balloon to rise). This offsets the effect of gravity by about 1.5%, resulting in an expected net acceleration of 31.7 feet per second squared.
• Drag - As the ball moves through the air, it is essentially pushing that air out of the way, and the air pushes back in accordance with Newton's Third Law. This force acts in the opposite direction of the ball's velocity, and grows with the ball's speed (roughly with its square) and with the density of the air.
• Magnus effect - A phenomenon much more apparent in soccer and baseball, this is the effect the ball's spin has on its flight path. The effect is much smaller for a typical basketball shot. Most shooters add backspin to their shot, which, in effect, creates a slight upward lift on the ball's trajectory.

For a far more comprehensive take on the subject, I recommend John Fontanella's The Physics of Basketball. There is also this Wired feature from last year, as well as this nice visual representation of the forces outlined above.
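
As a rough check on the buoyancy figure in the list above, the sketch below estimates what fraction of gravity is offset by the air the ball displaces. The ball circumference (29.5 inches), ball mass (22 ounces), and air density (about 1.2 kg per cubic meter) are assumed values for a regulation ball at room temperature, not numbers taken from the tracking data.

```python
import math

G_FT_S2 = 32.2                          # gravitational acceleration in a vacuum, ft/s^2
AIR_DENSITY = 1.2                       # kg/m^3, assumed (sea level, room temperature)
BALL_CIRCUMFERENCE_M = 29.5 * 0.0254    # assumed regulation circumference, in meters
BALL_MASS_KG = 22 * 0.0283495           # assumed regulation mass (22 oz), in kilograms

def net_acceleration(g=G_FT_S2):
    """Return (net downward acceleration, fraction of gravity offset by buoyancy)."""
    radius = BALL_CIRCUMFERENCE_M / (2 * math.pi)
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    displaced_air_mass = AIR_DENSITY * volume
    buoyancy_fraction = displaced_air_mass / BALL_MASS_KG
    return g * (1 - buoyancy_fraction), buoyancy_fraction

net_g, fraction = net_acceleration()
print(f"buoyancy offsets {fraction:.1%} of gravity; net g = {net_g:.1f} ft/s^2")
```

With these assumed inputs the offset comes out a little under 1.5%, which is in line with the figure quoted above.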

Based on my own investigations, I found that the ball's trajectory could be modeled fairly accurately by ignoring the last two forces, and focusing solely on the combination of gravity and buoyancy. Armed with this basic physics knowledge, I fit the raw data to a simple regression model, and discarded any trajectory whose fit had an R squared below 0.985. Only about half of all free throws this season met this minimum criterion, but that is still over 33,000 shots.
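
For readers who want to try something similar, here is a minimal sketch of one way to fit a gravity-plus-buoyancy model to the ball's height over time. It is not necessarily the regression used for this post: it assumes a plain least-squares quadratic in time via `numpy.polyfit`, and the sample data is synthetic with made-up noise. The function name `fit_vertical_trajectory` is hypothetical.

```python
import numpy as np

def fit_vertical_trajectory(t, z):
    """Least-squares fit of z(t) = z0 + vz0*t - 0.5*a*t^2; returns parameters and R^2."""
    coeffs = np.polyfit(t, z, deg=2)        # coefficients, highest power first
    z_hat = np.polyval(coeffs, t)
    ss_res = np.sum((z - z_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((z - np.mean(z)) ** 2)  # total sum of squares
    r_squared = 1.0 - ss_res / ss_tot
    accel = -2.0 * coeffs[0]                # downward acceleration, ft/s^2
    return {"z0": coeffs[2], "vz0": coeffs[1], "accel": accel, "r2": r_squared}

# Synthetic example: 25 samples per second, with invented measurement noise
rng = np.random.default_rng(0)
t = np.arange(0, 1.0, 1 / 25)
z = 8.0 + 18.8 * t - 0.5 * 31.7 * t ** 2 + rng.normal(0, 0.05, t.size)

fit = fit_vertical_trajectory(t, z)
keep = fit["r2"] >= 0.985   # the quality cutoff described above
print(fit, keep)
```

The fitted `accel` is the quantity to compare against the theoretical 31.7 feet per second squared.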

The average vertical acceleration emerging from that simple model was 31.6 feet per second squared, which is very close to the theoretical expectation of 31.7 feet per second squared. An encouraging result. Basketballs do follow the laws of physics.

### Creating a Shot Fingerprint

So with a little freshman-level physics, I can now tease out the likely "true" trajectory of each shot, and from that, develop a whole host of new shot-based statistics. How high was the ball when it was released? And with what velocity? How high did the ball go (i.e. is this a shooter with a "high arc")? At what angle and speed did the ball approach the rim? Was it on target?

Returning to that Lance Thomas free throw, here is the modeled version:

The ball was released at a height of eight feet, which is a bit high for a player of Lance's height (6'8). Typical release height for a free throw is 1 foot above the top of the shooter's head. The release angle of 54.0 degrees and velocity of 23.3 ft/s are fairly typical for a 6'8 player (53.9 degrees, 23.5 ft/s). The modeled acceleration of 32.3 ft/s^2 is a bit high compared to theory, but within reason.

The ball approached the rim with a velocity of 20.3 ft/s and at an angle of 47 degrees. The free throw was successful, and appeared to be off target from center by about 5 inches.
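
The approach numbers follow directly from the release numbers once the acceleration is fixed. The sketch below is an illustration under the same drag-free assumption, with a hypothetical function name (`approach_at_rim`) and a 10 foot rim height. Angles are measured from the horizontal.

```python
import math

def approach_at_rim(release_height, release_angle_deg, release_speed,
                    accel, rim_height=10.0):
    """Speed and angle of the ball as it descends through rim height (no drag)."""
    theta = math.radians(release_angle_deg)
    vx = release_speed * math.cos(theta)     # horizontal speed stays constant
    vz0 = release_speed * math.sin(theta)    # initial vertical speed
    # Constant acceleration: vz^2 = vz0^2 - 2 * a * (height gained)
    vz_sq = vz0 ** 2 - 2.0 * accel * (rim_height - release_height)
    if vz_sq < 0:
        raise ValueError("ball never reaches rim height")
    vz = -math.sqrt(vz_sq)                   # negative root: ball is descending
    flight_time = (vz0 - vz) / accel
    return {
        "speed": math.hypot(vx, vz),
        "angle_deg": math.degrees(math.atan2(-vz, vx)),
        "apex": release_height + vz0 ** 2 / (2.0 * accel),
        "flight_time": flight_time,
        "distance": vx * flight_time,
    }

# Release values quoted above for the Lance Thomas free throw
shot = approach_at_rim(8.0, 54.0, 23.3, 32.3)
print(shot)
```

Plugging in the rounded release values for the Lance Thomas free throw gives an approach speed of about 20.3 ft/s and an approach angle between 47 and 48 degrees, consistent with the figures above.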

I should point out that every number, graph, and table in this post should be taken as preliminary. As best as I can tell, there has been very little, if any, public research on this topic, so I don't have much in the way of cross checking or validation.

### Building Shooter Profiles

The Warriors' Stephen Curry is the best shooter in the NBA (and perhaps of all time). He is known for a quick release and a high shot arc. Good shooters are often distinguished by their high, arching shots. For one, it makes one's shot harder to block, but that's not really a consideration for free throws. A high arc also allows the ball to approach the basket at a more direct angle, making the hoop appear larger and more forgiving. For shots with a low, line drive trajectory, the hoop is a narrow ellipse, or a small thermal exhaust port if you will, allowing for little margin of error on approach.

But a high arc comes with a tradeoff, and that is speed. In order to achieve that higher arc, yet still have the ball reach the basket, the shooter must release the ball with a higher velocity. Presumably the faster a ball is released, the more difficult it is to control. So how do NBA shooters balance this tradeoff? The chart below shows average release angle as a function of player height. Player height is a key variable, as it affects the optimal angle of release; the taller a player, the shallower the optimal release angle.

The solid black line represents the release angle required to minimize launch velocity, while the gray line is the actual average angle for free throws taken this season. As expected, the average release angle declines as height increases, but is consistently higher than the "minimum velocity" angle. This indicates NBA shooters are willing to increase launch velocity in order to optimize the ball's angle of approach.
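
For a drag-free projectile, the "minimum velocity" angle has a simple closed form: 45 degrees plus half the elevation angle from the release point up to the rim. The sketch below computes it for a few player heights. It assumes a horizontal distance of 13.75 feet from release to the center of the hoop (my assumption, based on releasing directly above the free throw line) and a release point 1 foot above the shooter's head, so treat the numbers as illustrative rather than as a reproduction of the chart.

```python
import math

G_NET = 31.7        # ft/s^2, gravity less buoyancy
RIM_HEIGHT = 10.0   # ft
DISTANCE = 13.75    # ft, assumed horizontal distance from release to hoop center

def required_speed(angle_deg, release_height, d=DISTANCE, g=G_NET):
    """Launch speed needed to pass through the hoop center at a given release angle."""
    theta = math.radians(angle_deg)
    h = RIM_HEIGHT - release_height
    denom = 2.0 * math.cos(theta) ** 2 * (d * math.tan(theta) - h)
    if denom <= 0:
        return math.inf   # angle too flat to ever reach the rim
    return math.sqrt(g * d ** 2 / denom)

def min_speed_angle(release_height, d=DISTANCE):
    """Closed form: 45 degrees plus half the elevation angle to the rim."""
    h = RIM_HEIGHT - release_height
    return 45.0 + 0.5 * math.degrees(math.atan2(h, d))

for player_height_ft in (6.0, 6.25, 6.67, 7.0):
    release = player_height_ft + 1.0   # assumed: release 1 ft above the head
    angle = min_speed_angle(release)
    print(f"{player_height_ft:.2f} ft: {angle:.1f} deg, "
          f"{required_speed(angle, release):.1f} ft/s")
```

The taller the player, the smaller the height gap to the rim, and the closer the minimum-velocity angle gets to 45 degrees.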

How does this play out at an individual level? As mentioned above, Steph Curry is known for having a high shot arc. Does this show up in the SportVU data?

The chart above depicts a typical successful free throw for Stephen Curry, and compares that to a typical free throw for all players of Steph's height. I defined "typical" by collecting all the variables pertinent to a shot (release angle, release velocity, release location, and modeled acceleration) and then picking a shot that falls closest to the middle of that multivariate distribution. 6'3 players typically release their free throws at an angle of 54.6 degrees and a speed of 23.5 feet per second. A typical Curry free throw, however, is released at a higher angle of 58.1 degrees, and a velocity of 23.8 feet per second. As a result, his free throw flies a half foot higher than what is typical for a 6'3 player.
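
One simple way to pick a "typical" shot is sketched below. It is an assumption on my part, not a description of the exact procedure behind the chart: each variable is scaled by its standard deviation, and the shot closest to the per-variable median is selected. The function name and the sample rows are invented for illustration.

```python
import numpy as np

def most_typical_shot(shots):
    """Index of the row closest to the center of the distribution.

    shots: array of shape (n_shots, n_features). Distance is measured from the
    per-feature median after scaling each feature by its standard deviation.
    """
    shots = np.asarray(shots, dtype=float)
    center = np.median(shots, axis=0)
    scale = shots.std(axis=0)
    scale[scale == 0] = 1.0   # guard against constant columns
    distances = np.linalg.norm((shots - center) / scale, axis=1)
    return int(np.argmin(distances))

# Made-up rows: release angle (deg), release speed (ft/s),
# release height (ft), modeled acceleration (ft/s^2)
shots = np.array([
    [58.3, 23.9, 7.20, 31.5],
    [57.9, 23.7, 7.30, 31.9],
    [61.0, 24.6, 7.00, 30.8],
    [55.2, 23.1, 7.50, 32.6],
    [58.1, 23.8, 7.25, 31.7],
])
typical = shots[most_typical_shot(shots)]
print(typical)
```

Scaling matters here because the variables are in different units; without it, whichever variable has the largest numeric spread would dominate the distance.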

Saying the hoop seems bigger for good shooters isn't just an announcing cliche. The SportVU data shows us this is actually the case for Curry and other high arc shooters. A typical Stephen Curry free throw approaches the hoop at 49 degrees from the horizontal (a 90 degree angle, straight down into the hoop, would be optimal but is impossible). This is four degrees more than the average 6'3 player shooting free throws. The geometry of Stephen Curry's shot really does make the basket appear bigger.
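
To put a rough number on "bigger": seen along the ball's line of travel, an 18 inch rim is foreshortened front to back by the sine of the approach angle (measured from the horizontal). The sketch below is a simplified geometric illustration that ignores rim thickness and backboard bounces; the 9.4 inch ball diameter is an assumed value derived from a 29.5 inch circumference.

```python
import math

RIM_DIAMETER_IN = 18.0    # regulation rim inner diameter, inches
BALL_DIAMETER_IN = 9.4    # assumed, from a 29.5 inch circumference

def apparent_opening(approach_angle_deg):
    """Front-to-back rim width seen along the ball's path (angle from horizontal)."""
    return RIM_DIAMETER_IN * math.sin(math.radians(approach_angle_deg))

def front_to_back_margin(approach_angle_deg):
    """Spare room beyond the ball's own diameter; negative means no clean entry."""
    return apparent_opening(approach_angle_deg) - BALL_DIAMETER_IN

for angle in (45, 49, 90):
    print(angle, round(apparent_opening(angle), 1), round(front_to_back_margin(angle), 1))
```

Under these assumptions, a 49 degree approach sees an opening of about 13.6 inches, compared with about 12.7 inches at 45 degrees, so a few degrees of extra arc buys close to an inch of front-to-back margin.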

Interestingly, James Harden, Curry's chief rival for MVP this season (and current conference finals rival), is at the opposite end of the spectrum. Harden's free throw shot has the lowest release angle among all 6'5 players. A typical player his height releases the ball at an angle of 53.4 degrees. Harden, on the other hand, is hurling virtual line drives at the hoop. The ball leaves his hands at a 49.4 degree angle.

### PITCHf/x for the NBA?

Using the same physics model of a shot's trajectory, I can also estimate where the ball was when it crossed the hoop's threshold. In other words, how "on target" was the shot in relation to dead center of the rim? Much in the same way baseball's PITCHf/x system can be used to track the precise location of balls and strikes, SportVU data can be used to track the precision of each basketball shot.
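
A minimal sketch of that calculation is shown below, assuming the fitted model supplies a release position, constant horizontal velocity components, an initial vertical velocity, and a vertical acceleration. The function names and the coordinate frame (feet, with the hoop center passed in explicitly) are hypothetical choices for illustration.

```python
import math

def rim_plane_crossing(x0, y0, z0, vx, vy, vz0, accel, rim_height=10.0):
    """(x, y) where the ball descends through rim height, or None if it never gets there."""
    # Solve z0 + vz0*t - 0.5*accel*t^2 = rim_height and take the later (descending) root
    disc = vz0 ** 2 - 2.0 * accel * (rim_height - z0)
    if disc < 0:
        return None
    t = (vz0 + math.sqrt(disc)) / accel
    return x0 + vx * t, y0 + vy * t

def offset_from_center(crossing, hoop_center):
    """Miss components and total miss distance relative to the hoop center."""
    dx = crossing[0] - hoop_center[0]
    dy = crossing[1] - hoop_center[1]
    return dx, dy, math.hypot(dx, dy)

# Invented example: release at rim height with clean numbers for easy checking
crossing = rim_plane_crossing(0.0, 0.0, 10.0, 13.75, 0.5, 16.0, 32.0)
print(offset_from_center(crossing, hoop_center=(13.75, 0.0)))
```

Collecting the `(dx, dy)` pairs for every free throw by one shooter gives the raw material for a top-down scatter chart.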

Here is Stephen Curry's "PITCHf/x"-style free throw chart (the perspective here is top down, looking down at the rim):

Curry is an 89% free throw shooter and it shows in this chart, with his shots clustered in a nice, tight distribution.

Contrast Curry's laser-guided precision to the indiscriminate saturation-bombing of Andre Drummond, a 40% free throw shooter:

Note that there are several "misses" on this chart that otherwise seem on target. While this could be due to a shallow approach angle, the more likely, and less interesting, explanation is that the SportVU data is simply messy and imprecise (to say nothing of my own imperfect methodologies for deciphering said data).

To paraphrase Tolstoy, good free throw shooters all look alike, but crappy free throw shooters are all crappy in their own way. Here is DeAndre Jordan's chart:

In contrast to Andre Drummond's more scattershot chart, DeAndre cuts a narrower swath from the hoop to the backboard. His misses are more of the "too strong" or "too weak" variety, as opposed to wide left and wide right.

### Next Steps

Where to go next could be a blog post in its own right, and I welcome any and all suggestions. An obvious next step would be to extend this analysis to field goal shooting. I started with free throws as it was simpler to mine the raw data for shots taken from a consistent spot on the floor. But field goal shooting is kind of important to the game of basketball, and thus worth powering through the data complexities.

I also plan on compiling and sharing these newly created shot metrics at a player level. For example, other members of the Stephen Curry "high arc" club include Chris Paul, Mike Conley, and Reggie Jackson, while Tony Allen, Chris Andersen, and Blake Griffin take after Harden's line drive style of shooting. Perhaps with the data compiled, we can zero in on what characteristics the great shooters have in common (if any). In addition, the data could be mined at a player level and monitored for subtle changes in shot mechanics.

With only a couple weeks left in the NBA season, I doubt I will have made much more progress before the champion is crowned (I do have a day job as well as a family that likes to spend time with me on occasion). But this feels like an ideal project for the offseason - something to fill that mid-June to October gap.