### The Mixed Blessing of Big Data

The NBA is in the midst of a data explosion that at times feels paralysis-inducing. We're spoiled for choice, with literally hundreds of thousands of data points, *per game*, begging to be dissected and analyzed. And yet, progress has been made. Kirk Goldsberry of Grantland, an early pioneer of shot location data, last year introduced a new SportVU-based stat called EPV, or Expected Point Value. EPV evaluates a team's expected points in real time, as the possession evolves, accounting for shot clock, ball location, and the position of all ten players on the court. A new "microeconomics" for the NBA, as Goldsberry and his coauthors described it in their Sloan Analytics paper.

The NBA itself, in addition to funding and supporting the addition of the cameras, has also developed a whole host of new stats from this "big data" they helped create: Catch and Shoot, Defense at the Rim, Pull Up Shooting, among many, many others. There is also the work of the smart people at Nylon Calculus, making sense of the SportVU data with simple metrics built from common sense understanding of the game of basketball (as opposed to inscrutable mathematical black boxes).

In this post, I begin my own foray into the NBA's big data world, but with a different focus. While it's common knowledge that SportVU data provides location in two dimensions for every player on the court, what may not be widely appreciated is that the ball itself is tracked in __all three__ dimensions. When developing EPV with his students, Kirk Goldsberry code named the work "XY Hoops". Consider the work below an "XYZ Hoops" project of sorts.

### The Physics of Basketball

SportVU cameras track the ball's location approximately 25 times per second. Here is what that data looks like for a successful free throw - one made by the Knicks' Lance Thomas on April 5:

Data can be messy, and the SportVU data is no exception. This particular example turned out to be pretty clean, but notice the odd dip in the ball's trajectory on its way down. Barring a random wind gust from the rafters of Madison Square Garden, this is most likely noise rather than the ball's actual path. So, to better make sense of the data, I can apply some simple physics concepts to derive what was likely the true trajectory of the ball.

There are four basic forces that affect a basketball's trajectory. Here they are, in order of importance:

- **Gravity**: In a vacuum, objects near the Earth's surface accelerate downwards at a rate of 32.2 feet per second squared.
- **Buoyancy**: This force goes in the opposite direction of gravity, and is proportional to the weight of the air displaced by the ball (the same force that causes a helium balloon to rise). This offsets the effect of gravity by about 1.5%, resulting in an expected net acceleration of 31.7 feet per second squared.
- **Drag**: As the ball moves through the air, it is essentially pushing that air out of the way, and the air pushes back in accordance with Newton's Third Law. This force is in the opposite direction of the ball's velocity, and proportional to the square of the ball's speed and the density of the air.
- **Magnus effect**: A phenomenon much more apparent in soccer and baseball, this is the effect the ball's spin has on its flight path. The effect is much smaller for a typical basketball shot. Most shooters add backspin to their shot, which, in effect, creates a slight upward lift on the ball's trajectory.
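For readers who want to check the buoyancy figure, here is a minimal sketch of the arithmetic. The ball circumference (29.5 inches), ball weight (22 ounces), and air density (1.2 kg per cubic meter) are my assumptions for a regulation ball in an indoor arena; they are not taken from the SportVU data. With these particular inputs the offset comes out a little under 1.5%, in the same ballpark as the figure quoted above.

```python
import math

# Assumed inputs (not from the post): a regulation size 7 ball in room-temperature air.
G_FT_S2 = 32.2                         # gravitational acceleration quoted above, ft/s^2
BALL_CIRCUMFERENCE_M = 29.5 * 0.0254   # 29.5 inches converted to meters
BALL_MASS_KG = 22 * 0.0283495          # 22 ounces converted to kilograms
AIR_DENSITY_KG_M3 = 1.2                # approximate indoor air density


def buoyancy_fraction(circumference_m, mass_kg, air_density_kg_m3):
    """Fraction of the ball's weight that is offset by the air it displaces."""
    radius = circumference_m / (2 * math.pi)
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    displaced_air_mass = air_density_kg_m3 * volume
    return displaced_air_mass / mass_kg


fraction = buoyancy_fraction(BALL_CIRCUMFERENCE_M, BALL_MASS_KG, AIR_DENSITY_KG_M3)
net_acceleration = G_FT_S2 * (1 - fraction)

print(f"buoyancy offset: {fraction:.2%}")
print(f"net downward acceleration: {net_acceleration:.2f} ft/s^2")
```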

For a far more comprehensive take on the subject, I recommend John Fontanella's The Physics of Basketball. There is also this Wired feature from last year, as well as this nice visual representation of the forces outlined above.

Based on my own investigations, I found that the ball's trajectory could be modeled fairly accurately by ignoring the last two forces and focusing solely on the combination of gravity and buoyancy. Armed with this basic physics knowledge, I fit the raw data to a simple regression model and removed trajectories with a modeled R squared below 0.985. Only about half of all free throws this season met this minimum criterion, but that is still over *33,000 shots*.

The average vertical acceleration emerging from that simple model was 31.6 feet per second squared, which is very close to the theoretical expectation of 31.7 feet per second squared. An encouraging result. Basketballs *do* follow the laws of physics.

### Creating a Shot Fingerprint

So with a little freshman-level physics, I can now tease out the likely "true" trajectory of each shot, and from that, develop a whole host of new shot-based statistics. How high was the ball when it was released? And with what velocity? How high did the ball go (i.e. is this a shooter with a "high arc")? At what angle and speed did the ball approach the rim? Was it on target?
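To make the fit-and-filter idea concrete, here is a minimal sketch under a few assumptions of my own: the ball's height is fit with a quadratic in time (constant acceleration), its horizontal distance from the release point with a straight line (no drag), and time zero is taken to be the moment of release. The function name `fit_trajectory` and the synthetic samples are hypothetical; this is an illustration, not the exact regression behind the numbers in this post.

```python
import numpy as np

MIN_R_SQUARED = 0.985  # quality threshold quoted in the post


def fit_trajectory(t, horizontal_dist, z):
    """Fit constant-acceleration flight: z(t) quadratic, horizontal distance linear."""
    t = np.asarray(t, dtype=float)
    horizontal_dist = np.asarray(horizontal_dist, dtype=float)
    z = np.asarray(z, dtype=float)

    a, b, c = np.polyfit(t, z, 2)                 # z ~ a*t^2 + b*t + c
    vx, _ = np.polyfit(t, horizontal_dist, 1)     # distance ~ vx*t + offset

    # R squared of the vertical fit, used to reject noisy trajectories.
    z_hat = np.polyval([a, b, c], t)
    ss_res = np.sum((z - z_hat) ** 2)
    ss_tot = np.sum((z - z.mean()) ** 2)

    return {
        "release_height": float(c),                       # height at t = 0
        "release_speed": float(np.hypot(vx, b)),          # ft/s
        "release_angle_deg": float(np.degrees(np.arctan2(b, vx))),
        "acceleration": float(-2.0 * a),                  # downward, ft/s^2
        "r_squared": float(1.0 - ss_res / ss_tot),
    }


# Synthetic example: 25 samples per second with a little measurement noise.
rng = np.random.default_rng(0)
t = np.arange(0.0, 1.04, 0.04)
z_noisy = 8.0 + 18.85 * t - 0.5 * 31.7 * t ** 2 + rng.normal(0.0, 0.05, size=t.size)
fit = fit_trajectory(t, 13.7 * t, z_noisy)
keep = fit["r_squared"] >= MIN_R_SQUARED
print(fit, "kept" if keep else "rejected")
```

Fitting the vertical and horizontal components separately keeps each regression linear in its coefficients, so ordinary least squares is enough.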

Returning to that Lance Thomas free throw, here is the modeled version:

The ball was released at a height of eight feet, which is a bit high for a player of Lance's height (6'8). Typical release height for a free throw is 1 foot above the top of the shooter's head. The release angle of 54.0 degrees and velocity of 23.3 ft/s is fairly typical for a 6'8 player (53.9 degrees, 23.5 ft/s). The modeled acceleration of 32.3 ft/s^2 is a bit high compared to theory, but within reason.

The ball approached the rim with a velocity of 20.3 ft/s and at an angle of 47 degrees. The free throw was successful, and appeared to be off target from center by about 5 inches.
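The approach numbers follow from the release numbers once constant acceleration is assumed. The sketch below (function name hypothetical, rim height of 10 feet assumed) plugs in the release height, speed, angle, and modeled acceleration quoted above. It gives an approach speed of about 20.3 ft/s and an approach angle of roughly 47 to 48 degrees above the horizontal.

```python
import math

RIM_HEIGHT_FT = 10.0  # regulation rim height (assumed, not read from the data)


def approach_at_rim(release_height, speed, angle_deg, accel, rim_height=RIM_HEIGHT_FT):
    """Speed and angle of the ball as it descends through rim height."""
    vx = speed * math.cos(math.radians(angle_deg))
    vz = speed * math.sin(math.radians(angle_deg))

    # Solve release_height + vz*t - 0.5*accel*t^2 = rim_height for the later
    # (descending) root.
    disc = vz ** 2 - 2.0 * accel * (rim_height - release_height)
    if disc < 0:
        raise ValueError("ball never reaches rim height")
    t = (vz + math.sqrt(disc)) / accel
    vz_at_rim = vz - accel * t  # negative: the ball is falling

    return {
        "flight_time": t,
        "horizontal_distance": vx * t,
        "approach_speed": math.hypot(vx, vz_at_rim),
        "approach_angle_deg": math.degrees(math.atan2(-vz_at_rim, vx)),
        "peak_height": release_height + vz ** 2 / (2.0 * accel),
    }


# Release values quoted above for the Lance Thomas free throw.
thomas = approach_at_rim(release_height=8.0, speed=23.3, angle_deg=54.0, accel=32.3)
print(thomas)
```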

I should point out that every number, graph, and table in this post should be taken as preliminary. As best as I can tell, there has been very little, if any, public research on this topic, so I don't have much in the way of cross checking or validation.

### Building Shooter Profiles

The Warriors' Stephen Curry is the best shooter in the NBA (and perhaps of all time). He is known for a quick release and a high shot arc. Good shooters are often distinguished by their high, arching shots. For one, it makes one's shot harder to block, but that's not really a consideration for free throws. A high arc also allows the ball to approach the basket at a more direct angle, making the hoop appear larger and more forgiving. For shots with a low, line drive trajectory, the hoop is a narrow ellipse, or a small thermal exhaust port if you will, allowing for little margin of error on approach.

But a high arc comes with a tradeoff, and that is speed. In order to achieve that higher arc, yet still have the ball reach the basket, the shooter must release the ball with a higher velocity. Presumably the faster a ball is released, the more difficult it is to control. So how do NBA shooters balance this tradeoff? The chart below shows average release angle as a function of player height. Player height is a key variable, as it affects the optimal angle of release; the taller a player, the shallower the optimal release angle.

The solid black line represents the release angle required to minimize launch velocity, while the gray line is the actual average angle for free throws taken this season. As expected, the average release angle declines as height increases, but is consistently higher than the "minimum velocity" angle. This indicates NBA shooters are willing to increase launch velocity in order to optimize the ball's angle of approach.
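For a drag-free shot, the angle that minimizes launch speed has a closed form: 45 degrees plus half the elevation angle from the release point up to the rim. The sketch below uses that textbook result with assumptions of my own: a horizontal distance of 13.75 feet from release to the center of the rim, a release one foot above the shooter's head (the rule of thumb mentioned earlier), and a net acceleration of 31.7 ft/s^2. It is meant to illustrate why the angle falls with height, not to reproduce the chart exactly.

```python
import math

G_NET = 31.7                 # gravity minus buoyancy, ft/s^2
RIM_HEIGHT_FT = 10.0
SHOT_DISTANCE_FT = 13.75     # assumed horizontal distance from release to rim center
RELEASE_ABOVE_HEAD_FT = 1.0  # rule of thumb quoted earlier in the post


def rise_to_rim(player_height_ft):
    """Vertical distance from the assumed release point up to the rim."""
    return RIM_HEIGHT_FT - (player_height_ft + RELEASE_ABOVE_HEAD_FT)


def min_speed_angle_deg(player_height_ft):
    """Release angle that minimizes launch speed for a drag-free shot."""
    elevation = math.atan2(rise_to_rim(player_height_ft), SHOT_DISTANCE_FT)
    return math.degrees(math.pi / 4 + elevation / 2)


def required_speed(angle_deg, player_height_ft):
    """Launch speed needed to pass through rim center at a given release angle."""
    theta = math.radians(angle_deg)
    d, h = SHOT_DISTANCE_FT, rise_to_rim(player_height_ft)
    denom = 2.0 * math.cos(theta) ** 2 * (d * math.tan(theta) - h)
    if denom <= 0:
        raise ValueError("angle too shallow to reach the rim")
    return math.sqrt(G_NET * d ** 2 / denom)


for height in (6.0, 6.25, 6.5, 6.75, 7.0):
    angle = min_speed_angle_deg(height)
    print(f"{height:.2f} ft: {angle:.1f} deg at {required_speed(angle, height):.1f} ft/s")
```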

How does this play out at an individual level? As mentioned above, Steph Curry is known for having a high shot arc. Does this show up in the SportVU data?

The chart above depicts a typical successful free throw for Stephen Curry, and compares that to a typical free throw for all players of Steph's height. I defined "typical" by collecting all the variables pertinent to a shot (release angle, release velocity, release location, and modeled acceleration) and then picking a shot that falls closest to the middle of that multivariate distribution. 6'3 players typically release their free throws at an angle of 54.6 degrees and a speed of 23.5 feet per second. A typical Curry free throw, however, is released at a higher angle of 58.1 degrees, and a velocity of 23.8 feet per second. As a result, his free throw flies a half foot higher than what is typical for a 6'3 player.
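One way to pick a shot "closest to the middle" is sketched below. The specifics are assumptions on my part: each variable is centered on its median, scaled by its standard deviation, and the shot with the smallest Euclidean distance to that center is chosen. The function name and the five example rows are hypothetical and purely illustrative; they are not real SportVU measurements.

```python
import numpy as np


def typical_shot_index(shots):
    """Index of the row closest to the per-variable median, in standardized units."""
    shots = np.asarray(shots, dtype=float)
    center = np.median(shots, axis=0)
    spread = shots.std(axis=0)
    spread[spread == 0] = 1.0          # avoid dividing by zero for constant columns
    standardized = (shots - center) / spread
    return int(np.argmin(np.linalg.norm(standardized, axis=1)))


# Hypothetical rows: release angle (deg), release speed (ft/s),
# release height (ft), modeled acceleration (ft/s^2).
example_shots = [
    [58.0, 23.8, 7.3, 31.5],
    [57.2, 23.6, 7.2, 31.9],
    [59.5, 24.1, 7.4, 31.2],
    [58.3, 23.9, 7.3, 31.6],
    [56.0, 23.2, 7.1, 32.4],
]
print(typical_shot_index(example_shots))
```

Choosing an actual shot, rather than averaging every variable, guarantees the "typical" trajectory is one that physically happened.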

Saying the hoop seems bigger for good shooters isn't just an announcing cliche. The SportVU data shows us this is actually the case for Curry and other high arc shooters. A typical Stephen Curry free throw approaches the hoop at 49 degrees above the horizontal (a 90 degree angle, straight down, would be optimal but is impossible). This is four degrees more than the average 6'3 player shooting free throws. The geometry of Stephen Curry's shot really does make the basket appear bigger.
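A rough way to put a number on "bigger": seen along the ball's line of flight, the rim is an ellipse whose front-to-back opening is the rim diameter times the sine of the approach angle, with the angle measured up from the horizontal. The sketch below assumes an 18 inch rim and a ball of 29.5 inch circumference, and it ignores rim and backboard bounces, so treat it as a geometric illustration only.

```python
import math

RIM_DIAMETER_IN = 18.0               # regulation rim (assumed)
BALL_DIAMETER_IN = 29.5 / math.pi    # from an assumed 29.5 inch circumference


def apparent_rim_depth(approach_angle_deg):
    """Front-to-back opening of the rim as seen along the ball's path, in inches."""
    return RIM_DIAMETER_IN * math.sin(math.radians(approach_angle_deg))


def clearance(approach_angle_deg):
    """Opening left over once the ball's own diameter is accounted for."""
    return apparent_rim_depth(approach_angle_deg) - BALL_DIAMETER_IN


curry, peers = 49.0, 45.0  # approach angles discussed in the post
gain = apparent_rim_depth(curry) / apparent_rim_depth(peers) - 1.0
print(f"apparent depth: {apparent_rim_depth(curry):.1f} in vs {apparent_rim_depth(peers):.1f} in")
print(f"relative gain: {gain:.1%}, clearance: {clearance(curry):.1f} in vs {clearance(peers):.1f} in")
```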

Interestingly, James Harden, Curry's chief rival for MVP this season (and current conference finals rival), is at the opposite end of the spectrum. Harden's free throw shot has the lowest release angle among all 6'5 players. A typical player his height releases the ball at an angle of 53.4 degrees. Harden, on the other hand, is hurling virtual line drives at the hoop. The ball leaves his hands at a 49.4 degree angle.

### PITCHf/x for the NBA?

Using the same physics model of a shot's trajectory, I can also estimate where the ball was when it crossed the hoop's threshold. In other words, how "on target" was the shot in relation to dead center of the rim? Much in the same way baseball's PITCHf/x system can be used to track the precise location of balls and strikes, SportVU data can be used to track the precision of each basketball shot.
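Here is a minimal sketch of how the crossing point could be computed from a fitted trajectory. It assumes the same constant-acceleration model, a rim height of 10 feet, and a hoop center at court coordinates (5.25, 25.0) in feet, which is a guess at the coordinate convention rather than something documented in this post. The function name and the example release values are hypothetical.

```python
import math

RIM_HEIGHT_FT = 10.0
HOOP_CENTER_XY = (5.25, 25.0)  # assumed court coordinates of the rim center, in feet


def rim_plane_crossing(x0, y0, z0, vx, vy, vz, accel, hoop_xy=HOOP_CENTER_XY):
    """Offset from rim center where the descending ball passes through rim height.

    Returns (dx, dy, miss_distance) in feet, or None if the ball never gets that high.
    """
    disc = vz ** 2 - 2.0 * accel * (RIM_HEIGHT_FT - z0)
    if disc < 0:
        return None
    t = (vz + math.sqrt(disc)) / accel   # later root: the ball is on its way down
    dx = x0 + vx * t - hoop_xy[0]
    dy = y0 + vy * t - hoop_xy[1]
    return dx, dy, math.hypot(dx, dy)


# Hypothetical release: ball at (19.0, 25.0, 8.0) moving toward the hoop.
print(rim_plane_crossing(19.0, 25.0, 8.0, vx=-13.1, vy=0.1, vz=18.85, accel=31.7))
```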

Here is Stephen Curry's "PITCHf/x"-style free throw chart (the perspective here is top down, looking down at the rim):

Curry is an 89% free throw shooter and it shows in this chart, with his shots clustered in a nice, tight distribution.

Contrast Curry's laser-guided precision to the indiscriminate saturation-bombing of Andre Drummond, a 40% free throw shooter:

Note that there are several "misses" on this chart that otherwise seem on target. While this could be due to a shallow approach angle, the more likely, and less interesting explanation is that the SportVU data is simply messy and imprecise (to say nothing of my own imperfect methodologies for deciphering said data).

To paraphrase Tolstoy, good free throw shooters all look alike, but crappy free throw shooters are all crappy in their own way. Here is DeAndre Jordan's chart:

In contrast to Andre Drummond's more scattershot chart, DeAndre cuts a more narrow swath from the hoop to the backboard. His misses are more of the "too strong" or "too weak" variety, as opposed to wide left and wide right.
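The "long or short" versus "left or right" distinction can be summarized by comparing the spread of crossing points along the shot direction with the spread across it. The sketch below assumes each shot has already been reduced to a (depth, lateral) offset from rim center in feet; the function name and the sample offsets are hypothetical and only illustrate the calculation.

```python
from statistics import pstdev


def miss_profile(offsets):
    """Spread of rim-plane crossings along (depth) and across (lateral) the shot line."""
    depth_sd = pstdev([depth for depth, _ in offsets])
    lateral_sd = pstdev([lateral for _, lateral in offsets])
    return {
        "depth_sd": depth_sd,
        "lateral_sd": lateral_sd,
        # A ratio well above 1 suggests distance control is the bigger problem.
        "depth_to_lateral": depth_sd / lateral_sd if lateral_sd else float("inf"),
    }


# Hypothetical offsets in feet: (depth, lateral) relative to rim center.
sample_offsets = [(-0.9, 0.1), (0.8, -0.2), (0.3, 0.0), (-0.5, 0.2), (1.1, -0.1)]
print(miss_profile(sample_offsets))
```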

### Next Steps

Where to go next could be a blog post in its own right, and I welcome any and all suggestions. An obvious next step would be to extend this analysis to field goal shooting. I started with free throws as it was simpler to mine the raw data for shots taken from a consistent spot on the floor. But field goal shooting is kind of important to the game of basketball, and thus worth powering through the data complexities.

I also plan on compiling and sharing these newly created shot metrics at a player level. For example, other members of the Stephen Curry "high arc" club include Chris Paul, Mike Conley, and Reggie Jackson, while Tony Allen, Chris Andersen, and Blake Griffin take after Harden's line drive style of shooting. Perhaps with the data compiled, we can zero in on what characteristics the great shooters have in common (if any). In addition, the data could be mined at a player level and monitored for subtle changes in shot mechanics.

With only a couple weeks left in the NBA season, I doubt I will have made much more progress before the champion is crowned (I do have a day job as well as a family that likes to spend time with me on occasion). But this feels like an ideal project for the offseason - something to fill that mid-June to October gap.

Good stuff. Please analyze Joakim Noah's free throws. He was roughly a 74-75% shooter the last four years. This year, he was all over the place, and was especially bad in the last month of the regular season and in the first round of the playoffs. It looked like he was trying to bank them in, at one point.

Is the data for this freely available somewhere?

An obvious thing to look at is how consistent each player's release is in terms of angle, position, and velocity.

DeAndre Jordan's free throws have a particularly high offensive rebound rate. Perhaps he's shooting for possession rather than points?

> ... It looked like he was trying to bank them in, at one point.

The backboard collision is inelastic so hitting it can 'make the rim bigger' and - especially for a tall person - the backboard is a bigger visual target than the rim. In this model, the free throw line favors direct shots, but not by much. (http://www.wired.com/2011/03/physics-basketball-shots/)

It would be interesting to know the difference in target area for high angle vs low angle shooters. How much of an advantage is Stephen Curry generating by shooting high? For example, does that additional 4º net him a 5% larger target over the average shooters in his height group?

I tried to work out the math on this. Thought it would be simple, but soon got lost in endless trigonometry. Either I'm overlooking something obvious, or I'll have to resort to numerical methods.

Apparently this is covered in John Fontanella's book "The Physics of Basketball".

When I do a quick and dirty approximation, I get that the effective size should be proportional to

v^-2 cosecant(2 theta-15)

where theta is the angle off the horizontal.

I looked at this again, and realized that dividing by the derivative isn't ideal where the derivative goes to zero, and that I might be considering the wrong physics.

Hey guys, if you are interested in a visualization of the tolerance area inside the basket with respect to the angle at which the ball approaches the rim, check out our app Dirkometrix https://www.dirkometrix.com/sites/App.php?lang=en, developed by Holger Geschwindner (coach of Dirk Nowitzki) and myself. If you have any questions, just contact me.

South Koreans do this good. Good read. The Science (Physics) of Basketball.

(-0_0-)

Fantastic post! Thanks. I'm, unfortunately, quite confused by the 3 "PITCHf/x" style free throw charts at the end. For all 3 players' data there are misses (blue x's) right in the middle of what should be the make zone. For Curry's data, in particular there doesn't seem (except for the 2 short misses to the right) to be a difference between the spatial distribution of the makes and the misses. This doesn't seem reasonable to me. What z-height are these xy coordinates taken at?
