Just a heads up that this post is pretty heavy on math and physics: differential equations, integrals, drag coefficients, air density at varying altitudes, etc. You know, in case you're into that kind of thing.

Most physics problems, especially the undergraduate variety, require simplifying assumptions in order to make them workable - frictionless surfaces, perfect vacuums, spherical cows. This past May, I employed some simple physics to analyze the shooting trajectories of NBA players. The raw location data came from the SportVU location tracking system, and is somewhat noisy. To tease signal from that noise, I assumed the ball's path followed a simple trajectory, the kind that physics majors cut their teeth on as freshmen. I then used linear regression to pick a path that best matched the raw data.

One key simplifying assumption I made was to ignore the impact of drag on the flight of the ball. As the ball moves through the air, it pushes that air out of the way, and the air pushes back, slowly degrading the ball's velocity. As it turns out, this effect, while small, is not negligible, and its omission was creating persistent bias in my modeling of free throws and field goals.

In this post, I'll outline my attempts to incorporate drag into my model of a basketball's trajectory, and then test that model's predictions against the raw SportVU data (science!). As a bonus project of sorts, I will also examine whether drag effects are noticeably different in thin-air arenas such as those of the Denver Nuggets and Utah Jazz.

In my original modeling, the ball was assumed to be under zero horizontal force. As a result, the ball's horizontal trajectory maintained a presumed constant velocity, in accordance with Newton's First Law of Motion. This is a good first approximation, but let's see if we can add the drag force to this simple model.

Here is the drag force equation:
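
Written out in LaTeX, the standard form of the drag equation is as follows. The symbol \hat{v}, used here for the unit vector along the ball's velocity, is a notational choice for this transcription; the terms appear in the same order as they are described in the next paragraph.

```latex
\vec{F}_D = -\tfrac{1}{2}\, C_d\, \rho\, A\, v^2\, \hat{v} \tag{1}
```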

The first term (Cd) is the drag coefficient, which depends on the object's shape. For a sphere, the coefficient is 0.47. The second term, ρ, is the air density: the denser the air, the larger the drag force. The third term is the cross-sectional area of the basketball (more area = more drag). The next term is the square of the object's speed, so the faster the object is moving, the larger the drag force. The final v term just tells you (along with the minus sign at the beginning) that the force always points in the direction opposite to the object's velocity.

Focusing just on the horizontal component of drag force (eq. 1), we get the following:
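
A reconstruction of this step, assuming v_x and v_y denote the horizontal and vertical velocity components (labels chosen here), so that the speed is v = \sqrt{v_x^2 + v_y^2} and the horizontal part of the unit vector is v_x / v:

```latex
F_x = -\tfrac{1}{2}\, C_d\, \rho\, A\, v_x \sqrt{v_x^2 + v_y^2} \tag{2}
```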

Rearranging terms a bit further:
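
One way to write the rearranged form, again assuming v_x and v_y are the horizontal and vertical velocity components, is to pull a factor of v_x out of the square root so that everything awkward sits in a single dimensionless term:

```latex
F_x = -\tfrac{1}{2}\, C_d\, \rho\, A\, v_x^2 \sqrt{1 + \frac{v_y^2}{v_x^2}} \tag{3}
```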

As it stands, this amounts to solving a nonlinear, multivariate differential equation, which is as difficult as it sounds. But if we could find a way of dealing with that pesky square root term on the right, what's left is something far more tractable. For a typical free throw, here is what the square root term looks like (the blue line):

In order to simplify our equation, I am going to replace the square root term with its average value: 1.20. However, to keep the equations clean, I will refer to this term as f in the development below.
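
As a rough sanity check on that average, here is a minimal sketch that time-averages sqrt(1 + (vy/vx)^2) over a drag-free free throw. The 14.38 ft/s horizontal speed and 1.044 s flight time come from later in this post; the 7 ft release height, the 10 ft rim height as the end point, and the function name are assumptions made for this illustration only.

```python
import math

G = 32.174  # standard gravity in ft/s^2


def average_speed_ratio(vx, flight_time, release_height, rim_height=10.0, steps=1000):
    """Time-average of sqrt(1 + (vy/vx)^2) over a drag-free flight."""
    # Initial vertical velocity that carries the ball from the release
    # height to the rim height in exactly flight_time seconds.
    vy0 = (rim_height - release_height + 0.5 * G * flight_time**2) / flight_time
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * flight_time / steps  # midpoint rule
        vy = vy0 - G * t
        total += math.sqrt(1.0 + (vy / vx) ** 2)
    return total / steps


# Release height of 7 ft is an assumed value, not a SportVU measurement.
f_est = average_speed_ratio(vx=14.38, flight_time=1.044, release_height=7.0)
print(round(f_est, 2))
```

Under these assumptions the estimate lands close to 1.2; a different release height shifts it somewhat, which is one reason to treat f as a fudge factor rather than a constant of nature.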

Newton's Second Law of Motion states that force equals mass times acceleration. And acceleration is just the derivative of velocity with respect to time. Putting that together, and using our factor f to approximate the square root term in our equation, we now have the following (where m is the mass of the basketball):
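
In LaTeX, a reconstruction of this step, with v_x as the horizontal velocity (a label chosen here) and the square root term replaced by the constant f:

```latex
m\, \frac{dv_x}{dt} = -\tfrac{1}{2}\, C_d\, \rho\, A\, f\, v_x^2 \tag{4}
```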

This is a far more workable equation. To solve it, we'll group terms and slap an integral sign on both sides:
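
A sketch of the grouped form, assuming v_x denotes horizontal velocity: everything involving v_x moves to the left, and the constants and dt go to the right.

```latex
\int \frac{dv_x}{v_x^2} = -\frac{C_d\, \rho\, A\, f}{2m} \int dt \tag{5}
```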

After integrating both sides, and remembering to add the constant of integration, we arrive at the following equation for the ball's horizontal velocity:
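
Carrying out that integration, with the constant fixed by requiring v_x(0) = v_0, gives the following (a reconstruction; v_x is a label chosen here for horizontal velocity):

```latex
v_x(t) = \frac{v_0}{1 + \dfrac{C_d\, \rho\, A\, f\, v_0}{2m}\, t} \tag{6}
```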

Where v0 is the ball's initial horizontal velocity. Integrating once more gives us the full equation of motion as a function of time:
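
A reconstruction of the resulting position equation, assuming x_0 denotes the horizontal position at release (a symbol introduced here). The argument of the logarithm in this form reproduces the 0.1159 figure worked out further down.

```latex
x(t) = x_0 + \frac{2m}{C_d\, \rho\, A\, f}\,
       \ln\!\left(1 + \frac{C_d\, \rho\, A\, f\, v_0}{2m}\, t\right) \tag{7}
```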

Closed form solutions such as these are nice, but I would like to simplify this even further. Under suitable conditions, we can approximate the logarithm by using the following infinite series:
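
The series in question is the standard Taylor expansion of the natural logarithm about x = 0, which converges for -1 < x ≤ 1:

```latex
\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \tag{8}
```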

When x is small (i.e. much less than 1), this function can be approximated by taking the first few terms in the series expansion. So, we need to estimate the typical value of the term inside the logarithm in (7).

As mentioned above, the drag coefficient for a sphere is 0.47. The density of air at typical altitudes is 0.0765 pounds per cubic foot. The cross-sectional area of a basketball is 0.492 square feet. Our "fudge factor", f, was 1.20. Typical horizontal release velocity for a free throw is 14.38 feet per second. The weight of a basketball is 1.375 pounds. A typical free throw spends about 1.044 seconds in the air, so we'll set t at 1.044 as an upper limit. Combining all those factors, we get a total of 0.1159 for x.
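
The arithmetic can be checked with a few lines of Python. This sketch assumes the term inside the logarithm has the form Cd·ρ·A·f·v0·t / (2m), which is the combination that reproduces the quoted figure; the function and constant names are made up for the example.

```python
# Values quoted in the post (feet, pounds-mass, seconds).
CD = 0.47      # drag coefficient of a sphere
RHO = 0.0765   # air density, lb/ft^3
AREA = 0.492   # cross-sectional area, ft^2
F = 1.20       # average of the square root term
V0 = 14.38     # horizontal release velocity, ft/s
MASS = 1.375   # basketball mass, lb


def log_argument(t):
    """Assumed form of the term inside the logarithm: Cd*rho*A*f*v0*t / (2m)."""
    return CD * RHO * AREA * F * V0 * t / (2.0 * MASS)


x = log_argument(1.044)
print(round(x, 4))
```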

With an x that small, we can safely approximate the logarithm as follows:
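
Keeping only the first two terms of the logarithm series gives:

```latex
\ln(1 + x) \approx x - \frac{x^2}{2} \tag{9}
```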

The relative error in this approximation is under 0.5% when x = 0.1159. Plugging (9) into our equation of motion (7), we have our final result for the horizontal equation of motion:
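
A reconstruction of that result, assuming x_0 denotes the horizontal position at release (a symbol introduced here). The linear term is the familiar constant-velocity motion, and the quadratic term is the drag correction:

```latex
x(t) \approx x_0 + v_0\, t - \frac{C_d\, \rho\, A\, f\, v_0^2}{4m}\, t^2 \tag{10}
```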

So it appears I can reasonably model the effect of drag by adding a quadratic time term to the regression model that I fit against the raw location data. And we also have a theoretical value that we can compare against the actual regression coefficients from each modeled shot.
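
To make the idea concrete, here is a minimal sketch of such a fit using NumPy's `polyfit`. It is not the code used for the SportVU analysis: the track is synthetic, generated from a logarithmic drag solution with the parameter values quoted in this post, and the 26 samples and 0.01 ft noise level are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Cd * rho * A * f / (2m), in 1/ft, from the values quoted in the post.
K = 0.47 * 0.0765 * 0.492 * 1.20 / (2 * 1.375)
V0 = 14.38  # horizontal release velocity, ft/s


def simulate_track(n=26, duration=1.044, noise_ft=0.01, seed=0):
    """Synthetic horizontal positions with drag plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, duration, n)
    x = np.log1p(K * V0 * t) / K  # position under the constant-f drag model
    return t, x + rng.normal(0.0, noise_ft, size=n)


def fit_quadratic(t, x):
    """Least-squares fit of x = c0 + c1*t + c2*t^2."""
    c2, c1, c0 = np.polyfit(t, x, deg=2)  # highest power first
    return c0, c1, c2


t, x = simulate_track()
c0, c1, c2 = fit_quadratic(t, x)
print(f"c1 = {c1:.2f} ft/s, c2 = {c2:.3f} ft/s^2")
```

The fitted c2 comes out negative, as a drag term should. It will not match the theoretical coefficient exactly, because the quadratic also absorbs the higher-order terms dropped from the logarithm series.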

Based on modeling 30,000+ free throws from the 2014-15 season, the average regression coefficient on the t² term came out to be -0.570 ft/s². How does this compare to the theoretical value of:
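
Reading the t² coefficient off the quadratic approximation of the drag solution, the theoretical value (magnitude only; the fitted coefficient carries the minus sign) can be written as:

```latex
\frac{C_d\, \rho\, A\, f\, v_0^2}{4m} \tag{11}
```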

Plugging values into (11), we get: (0.47) × (0.0765) × (0.492) × (1.20) × (14.38)² / 4 / (1.375) = 0.798 ft/s². This is ~40% higher in magnitude than the observed value. While this may seem like a large discrepancy, given the noisiness of the SportVU data and the shortcuts taken above, I'm happy to be within an order of magnitude. The drag force should only alter the path of a free throw by about 6 inches, which is probably at the limit of the SportVU system's precision.
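
The same arithmetic as a short Python check, using only the numbers quoted in this post. The function name and its keyword arguments are hypothetical, chosen for this example.

```python
def drag_t2_coefficient(cd=0.47, rho=0.0765, area=0.492, f=1.20, v0=14.38, mass=1.375):
    """Magnitude of the theoretical t^2 coefficient: Cd*rho*A*f*v0^2 / (4m)."""
    return cd * rho * area * f * v0**2 / (4.0 * mass)


theory = drag_t2_coefficient()
observed = 0.570  # magnitude of the average fitted coefficient
ratio = theory / observed
print(f"theory = {theory:.3f}, theory/observed = {ratio:.2f}")
```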

In the calculations above, I assumed a "typical altitude" air density of 0.0765 pounds per cubic foot. But air density varies due to altitude, with high elevation areas like Denver and Salt Lake City having approximately 20% lower air density than areas near sea level such as Boston or Los Angeles. According to the drag force equation (1), we should see lower drag force in a high elevation arena like the Denver Nuggets' Pepsi Center. Does this show up in the SportVU data?
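
For a feel of how density falls off with elevation, here is a sketch based on the International Standard Atmosphere troposphere model. This is a textbook outdoor model, not a measurement of conditions inside any arena, and the 5,280 ft input is a round "mile high" figure rather than a surveyed venue elevation.

```python
def isa_density_ratio(elevation_ft):
    """Air density relative to sea level under the ISA troposphere model."""
    g = 9.80665       # gravity, m/s^2
    M = 0.0289644     # molar mass of dry air, kg/mol
    R = 8.31446       # universal gas constant, J/(mol*K)
    T0 = 288.15       # sea-level temperature, K
    lapse = 0.0065    # temperature lapse rate, K/m
    h = elevation_ft * 0.3048  # feet to meters
    exponent = g * M / (R * lapse) - 1.0
    return (1.0 - lapse * h / T0) ** exponent


# Round illustrative elevation, not an official arena figure.
print(round(isa_density_ratio(5280), 3))
```

Real indoor density also depends on temperature and humidity on the day, so a model like this only gives the expected trend with elevation.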

I grouped my free throw data by venue, and calculated an average drag force for each arena. That data was cross referenced against the air density one would expect for each venue's elevation. The results are summarized in the chart below.

A few things jump out. First, the correlation between elevation and drag force is not very tight, so we may just be chasing noise. The trendline, which is just a linear fit of the raw data, slopes downward as one would expect. The "theory" line, while consistently above the trendline, has a similar slope to the fitted raw data, which is somewhat encouraging.

There are some interesting outliers in the data. Cleveland has the lowest average drag force, despite a fairly modest elevation. And Boston has the highest drag force, well above other arenas near sea level (maybe the air in Boston is just dirtier?).

As I said, I may just be chasing noise here, but there does seem to be a detectable high altitude effect when it comes to the trajectory of a free throw shot. All other things being equal, a free throw will travel about an inch or two farther in Denver than it would in Los Angeles due to differences in drag force. This could conceivably be enough to affect free throw accuracy, particularly for visiting teams. However, I could not find any pattern in the actual data to support this. Free throw accuracy for opponents visiting the Jazz or Nuggets does not appear to be noticeably worse.
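
The "inch or two" figure can be sketched from numbers already in this post, assuming the drag displacement is roughly the t² coefficient times t², that it scales linearly with air density, and that the release velocity is the same in both arenas. The function name is hypothetical.

```python
def extra_distance_inches(t2_coef, density_drop=0.20, flight_time=1.044):
    """Extra horizontal travel, in inches, when air density drops by density_drop."""
    drag_shortfall_ft = t2_coef * flight_time**2  # distance lost to drag
    return drag_shortfall_ft * density_drop * 12.0


from_observed = extra_distance_inches(0.570)  # fitted coefficient magnitude
from_theory = extra_distance_inches(0.798)    # theoretical coefficient
print(f"{from_observed:.1f} to {from_theory:.1f} inches")
```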

Part of the reason may be that the impact of buoyancy offsets the impact of drag. The upward buoyant force is smaller at high altitude, meaning that the ball accelerates downward at a higher rate. The ball may travel farther horizontally due to less drag, but it is getting pulled down to earth faster, so it may arrive at the hoop at about the same point, regardless of elevation.

You'll note that I did not consider drag effects when modeling the vertical trajectory of the ball. Because the force points downward during the ball's upward trajectory, and upward when it's on the way down, the effect may wash out somewhat. You also have the competing Magnus effect further complicating matters. For now, the net effect of these additional forces is buried in the t² term of the regression used to model the ball's vertical trajectory.

This concludes our summer physics lesson. Thank you to those that stuck around. I plan on sharing more with respect to Shot Arc analysis (aka "ShArc"), but want to make sure I have the right model under the hood before getting too far afield.