Sunday, January 8, 2017

A Deflategate Analysis for the NBA


Phil Jackson, 1986:
"We'd try to take some air out of the ball. You see, on the ball it says something like 'inflate to 7 to 9 pounds.' We'd all carry pins and take the air out to deaden the ball. 
It also helped our offense because we were a team that liked to pass the ball without dribbling it, so it didn't matter how much air was in the ball. It also kept other teams from running on us because when they'd dribble the ball, it wouldn't come up so fast."


At its news cycle peak, the NFL's Deflategate scandal was inescapable. It even spilled over into the NBA, where admissions and accusations of ball tampering had been hiding in plain sight:
  • Marv Albert, in his 1993 autobiography, claimed to have seen future senator and presidential candidate Bill Bradley use a pin to surreptitiously deflate the ball as a member of the 1970s New York Knicks.
  • Bradley's teammate, Phil Jackson (quoted above), admitted to deflating balls in a 1986 Chicago Tribune article on cheating in sports.
  • Later, as an NBA coach, Jackson says he caught other NBA teams changing the pressure of the ball to better suit their playing style (e.g. the Magic Johnson-era Lakers trying to inflate the ball to nearly twice the allowed pressure to facilitate long rebounds and fast breaks).
  • Shaquille O'Neal, in the summer of 2015, said he used a needle to let air out of the ball during the Lakers' championship runs, claiming it helped him better palm the ball (he didn't think it gave him an advantage on free throws).
The NFL's Deflategate scandal was long on rumor and insinuation, but short on hard data (i.e. the makings of a good scandal). It boiled down to just 30 data points: two separate pressure gauge readings of 11 Patriots footballs and 4 Colts footballs, taken at halftime of the 2015 AFC Championship Game. You can breathe a sigh of relief, because I'm certainly not going to rehash that analysis here.

Instead, I will use not 30, but more than 2 million data points to analyze whether the NBA has a ball pressure scandal of its own.

Follow the bouncing ball

Prior to each NBA game, the home team equipment crew presents three game balls to the officials. The balls are inspected for wear and tear, and the crew chief ensures that each ball is inflated to between 7.5 and 8.5 psi. As far as I know, those measurements are not recorded, and it is not standard practice to re-measure the pressure in-game.

But there are other ways to measure ball pressure. The opening quote from Phil Jackson provides a clue: they let air out of the ball so that when it was dribbled "it wouldn't come up so fast".

And it just so happens that we have a remarkably robust (but noisy) dataset on the real time position of the basketball for thousands of NBA games. Since 2013, the NBA has used the SportVU camera system to track the position of all ten players on the court in two dimensions. And that same system tracks the position of the basketball in all three spatial dimensions. This data was publicly available (sorta) for all games going back to the 2013-14 season, until the NBA took it down on January 23, 2016.

I have this data archived and in this post, we will analyze over 2 million bounces to see if there are venues where the ball doesn't "come up so fast".

Let's start with an example. The chart below is from Game 6 of the 2015 NBA Finals. With 0:29 left and the Warriors leading 101-97, Steph Curry stepped to the foul line with the opportunity to ice the game and the championship for Golden State. Curry dribbled the ball once (his standard pre-shot routine), and sank the first of two free throws.

The SportVU cameras, clicking away approximately 25 times per second, reported the trajectory of that dribble as follows:

Curry releases the ball about 3 feet above the ground. The ball's downward velocity right before striking the ground was 17.7 feet per second. Immediately after the bounce, the ball's upward velocity was 14.2 feet per second.

The ratio of the post bounce velocity to pre bounce velocity is known as the coefficient of restitution. Originally conceived by Isaac Newton, a higher coefficient of restitution implies a "bouncier" ball (or perhaps a bouncier floor). When Phil Jackson and his Knicks teammates were letting air out of the balls in the seventies, they were, in effect, hoping to lower the coefficient of restitution.
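As a quick check on the Curry dribble, the coefficient of restitution is just the ratio of the two speeds quoted above. The sketch below also works out how fast the ball would hit the floor if it were simply dropped from 3 feet, ignoring air resistance and assuming standard gravity of 32.174 ft/s². The function names are mine, for illustration only.

```python
import math

G_FT_S2 = 32.174  # standard gravity in ft/s^2 (assumed; air drag ignored)


def coefficient_of_restitution(v_before, v_after):
    """Ratio of post-bounce speed to pre-bounce speed."""
    return v_after / v_before


def free_fall_speed(height_ft):
    """Impact speed of a ball dropped from rest, ignoring drag."""
    return math.sqrt(2 * G_FT_S2 * height_ft)


cor = coefficient_of_restitution(17.7, 14.2)
dropped = free_fall_speed(3.0)
print(f"Curry dribble coefficient: {cor:.3f}")
print(f"Free fall from 3 ft: {dropped:.1f} ft/s")
```

The dribble works out to a coefficient of about 0.80. The free-fall speed comes out well below the measured 17.7 feet per second, which is what you would expect if the ball is pushed toward the floor rather than just dropped.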

The coefficient of restitution will be the main metric of focus for this post. For each NBA game, we can calculate the implied coefficient of restitution by analyzing the pre and post bounce velocities for every bounce of the ball.

The ball bounces thousands of times in each NBA game (think of how many times the ball is dribbled). However, the SportVU cameras can be easily confused, and the data is unusable for a significant portion of those bounces. But once we cull the bad data, we are left with about 850 relatively clean bounces per game, which should still be sufficient to get a reliable estimate of the ball's coefficient of restitution.

Another complicating factor in all of this is that the coefficient of restitution is not a fixed number, but varies with the speed of impact. It's more of an empirical rule of thumb than an ironclad physical law. There's a reason it is rarely included among Newton's greatest hits.

More specifically, the coefficient of restitution tends to decrease as impact velocity increases. Here is an illustration, using the 751 clean bounces from Game 6 of the 2015 NBA Finals.


The first thing you'll notice is that even for "clean" bounces, the data is messy, with several bounces implying a coefficient of restitution greater than 1.0 (a physical impossibility). So, our data is of low quality, but we've got it in mass quantities. And quantity, as they say, has a quality all its own.

If we have enough data points (and the noise in our data is truly random and unbiased), a linear regression model should be able to tease signal from that noise. The blue line on the chart shows that linear fit. As expected, the coefficient decreases as impact velocity increases.
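To illustrate how a straight-line fit can pull a trend out of noisy bounces, here is a minimal sketch using ordinary least squares on synthetic data. The data is made up, not real SportVU output: the underlying line is chosen to resemble the Game 6 fit reported in this post, and the noise level (0.08) and velocity range (10 to 30 ft/s) are my own assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic stand-in for one game's bounces (NOT real SportVU data).
# The "true" line and the noise level are illustration values only.
rng = random.Random(0)
TRUE_INTERCEPT, TRUE_SLOPE, NOISE_SD = 1.18, -0.0184, 0.08
velocities = [rng.uniform(10.0, 30.0) for _ in range(751)]
cors = [TRUE_INTERCEPT + TRUE_SLOPE * v + rng.gauss(0.0, NOISE_SD)
        for v in velocities]

intercept, slope = fit_line(velocities, cors)
print(f"fitted: coefficient = {intercept:.3f} + ({slope:.4f}) * velocity")
```

Even though individual synthetic bounces scatter widely (some above 1.0, just like the real data), the fitted slope and intercept should land close to the values used to generate them. That only holds if the noise is unbiased, which is the key assumption in the paragraph above.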

But this creates a problem if my goal is to calculate an average coefficient of restitution for each NBA game. For example, let's say there are certain point guards that dribble the ball harder than others. This will result in a lower average coefficient of restitution, but that doesn't have anything to do with ball pressure.

To avoid this potential bias in the results, we will run a linear regression for each game separately and then calculate the implied coefficient of restitution for the median pre-bounce impact velocity of 19.27 feet/second. To use the Game 6 example above, our linear regression parameters imply the following:

coefficient of restitution = 1.180 - 0.0184 x [impact velocity]

Plugging 19.27 feet per second into this formula, we get a coefficient of 0.825 for this game.
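The arithmetic is simple enough to sketch. The snippet below evaluates the Game 6 line at the median impact velocity, then at two hypothetical dribble speeds (15 and 24 ft/s, numbers I picked for illustration) to show why a raw per-bounce average would confuse dribbling style with ball pressure.

```python
def predicted_cor(intercept, slope, impact_velocity):
    """Evaluate a fitted line: coefficient = intercept + slope * velocity."""
    return intercept + slope * impact_velocity


# Game 6 regression parameters quoted in the post
INTERCEPT, SLOPE = 1.180, -0.0184
MEDIAN_VELOCITY = 19.27  # ft/s, median pre-bounce impact velocity from the post

game_cor = predicted_cor(INTERCEPT, SLOPE, MEDIAN_VELOCITY)
print(f"coefficient at median velocity: {game_cor:.3f}")

# Hypothetical soft and hard dribblers bouncing the very same ball
soft = predicted_cor(INTERCEPT, SLOPE, 15.0)
hard = predicted_cor(INTERCEPT, SLOPE, 24.0)
print(f"soft dribble (15 ft/s): {soft:.3f}")
print(f"hard dribble (24 ft/s): {hard:.3f}")
```

The same ball looks much "deader" under the hard dribbler than the soft one, which is why each game is compared at one common impact velocity rather than by its raw average.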

Aggregating results by venue

With a methodology in place to consistently measure coefficient of restitution for each NBA game, we can then aggregate those results by home team to see if patterns emerge, and if those patterns persist over time.

Here are the game by game coefficients charted chronologically for all Cleveland Cavaliers home games for which I have data:

Each red dot represents a Cavaliers home game (played at Quicken Loans Arena). The gray dots represent all NBA games, so we can see where the Cavs stack up relative to the league.

In general, the ball appears less bouncy for Cavaliers home games, especially for the 2013-14 season and for the truncated data I have on the 2015-16 season. Now, there are many possible reasons for this pattern:
  1. The Cleveland equipment crew could be inflating the basketballs to the lower end of the 7.5 to 8.5 psi requirement.
  2. Maybe Quicken Loans Arena has "dead floors", meaning their construction or composition may dampen the ball's post bounce velocity (more so than other arenas).
  3. This could all be an artifact of the SportVU system. Perhaps Quicken Loans Arena's SportVU system was calibrated in such a way that produces biased results on pre and post bounce velocities.
  4. This could reflect differences in how each equipment crew conditions the balls before use. Teams are given their allotted stock of game balls two months prior to the start of the season, to allow time for "breaking in". And one study has found that a brand new leather ball does not bounce as high as one that has been broken in (more from that study later in this post).
  5. Or this could all just be random noise, and the patterns we see in Cleveland's data above could be a data mirage.
To that last point, here are two charts showing how a home team's average coefficient of restitution correlates season to season.

There is significant correlation from one year to the next for each home team. Clearly, we are measuring something real here, and not just random variation game to game.
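For readers who want to check the season-to-season persistence themselves, here is a minimal sketch that computes the Pearson correlation between each home team's 2013-14 and 2014-15 averages, using the values listed in the Appendix of this post. The helper function is my own; nothing here is the exact code behind the charts.

```python
import math

# Home-team average coefficient from the Appendix: (2013-14, 2014-15)
TEAM_COR = {
    "ATL": (0.825, 0.825), "BKN": (0.836, 0.835), "BOS": (0.827, 0.829),
    "CHA": (0.811, 0.819), "CHI": (0.839, 0.832), "CLE": (0.814, 0.814),
    "DAL": (0.833, 0.836), "DEN": (0.830, 0.831), "DET": (0.828, 0.826),
    "GSW": (0.818, 0.808), "HOU": (0.813, 0.813), "IND": (0.820, 0.814),
    "LAC": (0.828, 0.825), "LAL": (0.830, 0.812), "MEM": (0.810, 0.813),
    "MIA": (0.821, 0.813), "MIL": (0.821, 0.815), "MIN": (0.843, 0.829),
    "NOP": (0.816, 0.816), "NYK": (0.820, 0.821), "OKC": (0.829, 0.825),
    "ORL": (0.817, 0.815), "PHI": (0.832, 0.842), "PHX": (0.837, 0.827),
    "POR": (0.822, 0.810), "SAC": (0.819, 0.819), "SAS": (0.820, 0.817),
    "TOR": (0.825, 0.832), "UTA": (0.839, 0.841), "WAS": (0.817, 0.822),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


first = [a for a, _ in TEAM_COR.values()]
second = [b for _, b in TEAM_COR.values()]
r = pearson(first, second)
print(f"2013-14 vs 2014-15 correlation: {r:.2f}")
```

If each team's average were just game-to-game noise, this correlation would hover near zero. A clearly positive value is what supports the claim that something about the home venue persists from year to year.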

You'll note from the charts above that both Utah and Philadelphia tend to be at the high end of the league's range. Here are their game to game charts:


That the Jazz may over-inflate their balls is not a new notion. Way back in 2006, Nets forward Scott Padgett accused the Jazz of overinflating their balls in an interview with the New Jersey Star-Ledger. Said Padgett:
"They have an offense geared toward the layup, so they want your jumpers bouncing out."
Padgett was a member of the Jazz from 1999-2003, but his comments in the article imply that he never witnessed overt tampering by the Jazz. Rather he just felt that the "balls were hard".

In a couple of now-deleted tweets from 2015, famed NBA gambler Haralabos Voulgaris levied similar allegations against the Jazz (quoted for posterity here in SB Nation):
"Some teams in late 90s early 2000s also inflated or deflated the ball. See Jerry Sloan Utah."
"Jazz rarely took 3pointers or long shots, preferred banging inside, often used over inflated ball vs jumpshooting teams."
If the chart above is to be believed, old habits die hard for the Jazz. Over the span of my dataset, Utah has the highest average coefficient of restitution of any team (high coefficient = "hard balls"). Perhaps Scott Padgett was onto something.

Note that all of the 76ers' numbers above are from the Sam Hinkie era. As I mentioned previously, there are many possible explanations for these patterns, but it wouldn't surprise me to learn that a Hinkie-led organization would exploit every possible edge - up to and including ball pressure for home games.

What is somewhat puzzling though is that an overly inflated ball doesn't necessarily lend itself to the 76ers' style of play. They ranked 5th in 3 point rate and 9th in free throw rate for the 2014-15 season. I would assume a team that shoots a lot from long distance or from the free throw line would prefer an under-inflated ball that would lend itself to more friendly caroms off the rim and backboard.

Or maybe, *places tinfoil hat atop head*, this was all part of The Process™. If the whole point of the 76ers' strategy was to tank for draft picks, playing Moreyball with a ball inflated to the league maximum pressure could just be a sneaky way to tank, without being obvious about it.

Speaking of Moreyball, Daryl Morey's Houston Rockets appear to be at the opposite end of the spectrum when it comes to "bounciness". Here are the game to game charts for Houston and Memphis, another team at the bottom end of our coefficient of restitution range.


The Houston data is noisy but definitely shows a bias towards the lower end of the NBA's range. With an offense built around three pointers and free throw attempts, it wouldn't be unreasonable to instruct your equipment crew to keep ball pressure near the league minimum of 7.5 psi.

There are some interesting patterns in the Memphis data. They began the 2013-14 season at the very bottom of the NBA's range. But in early March of 2014, their coefficient of restitution shifts upwards for the remainder of the season. Of course, we can't say for certain this was due to changes in ball pressure, or whether this is just a data artifact of the SportVU system. And whatever the cause, the Grizzlies' coefficients reverted back to their 2013 levels for both the 2014-15 and 2015-16 seasons.

As mentioned above, there could be biases hiding in each arena's SportVU system that could create misleading data on bounce velocity. However, the Staples Center in downtown Los Angeles plays host to both the Clippers and the Lakers and, presumably, uses the same SportVU system. This makes for a relatively well controlled experiment. Here are the charts for the Lakers and Clippers:


Despite playing in the same arena, there do appear to be clear differences in the coefficients of Clippers games versus Lakers games, most notably for the 2014-15 season, where the Lakers were near the bottom while the Clippers had a slightly above average coefficient of restitution.

This is the best evidence so far indicating a home team bias towards a particular inflation pressure for their basketballs. Are there other data sources we can look at to validate these initial results?

Correlations with Free Throw Percentage

If the variations we see by home team represent real differences in ball pressure, and we assume that lower ball pressure will lead to more made shots, one way to validate our results is to examine the correlation between the coefficient of restitution and free throw success.

I had explored venue bias and free throw percentage in a prior post, and I will take a similar approach here. For each game, we can calculate an expected free throw percentage for each player, and then compare it to their actual free throw percentage for that game. Note that I am only including road players in this analysis.

At the player level, this data will be noisy, but I can aggregate it across all players for the game to get a measure of how well or how poorly players were shooting free throws for that particular game. We would expect higher coefficients of restitution to be associated with poor free throw shooting.

And directionally, this appears to be the case. With a sample of 2,914 games, the correlation between the game's coefficient of restitution and actual vs expected FT% is -0.0126. So, at least it's pointing in the right direction. But if we build a linear regression for FT% as a function of the game coefficient, we fall far short of statistical significance. The p-value is roughly 0.50, meaning a correlation this weak would turn up about half the time even if the coefficient of restitution had no relationship at all with free throw success.
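The roughly 50% figure can be reproduced from the correlation and the sample size alone. The sketch below converts the correlation to a t statistic and uses a normal approximation for the two-sided p-value, which I am assuming is adequate here because the sample is so large. The function name is mine.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r over n samples.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2) and a normal approximation
    to the t distribution, which is reasonable only when n is large.
    """
    t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return math.erfc(abs(t_stat) / math.sqrt(2))


p = correlation_p_value(-0.0126, 2914)
print(f"p-value: {p:.2f}")
```

A p-value near 0.5 says that a correlation at least this far from zero would show up about half the time even if there were no real relationship, so the game-level result is best read as "no detectable effect" rather than evidence in either direction.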

What if we aggregate by home team and season? There is still a lot of noise in the game to game numbers and if the phenomenon we are trying to capture is truly a function of home team, then aggregating may strip away some noise from the signal.

The chart below shows how free throw success, when aggregated by home team and season, correlates with the average coefficient of restitution.


Even at this level, we still fall short of standard statistical significance, but the fit does improve, and continues to point in the right direction. The p-value drops to roughly 0.15, so a relationship this strong would still show up about 15% of the time by chance alone.

Comparison with public research

Another way to validate these numbers is to compare against prior research on the basketball's coefficient of restitution.

In 2006, the NBA introduced a new, synthetic basketball to replace the traditional leather basketball. Despite the NBA's assurances that the new ball was superior, complaints were widespread. In response, Dallas Mavericks owner Mark Cuban commissioned the University of Texas at Arlington's physics department to test the physical properties of both the synthetic ball and the leather ball.

The report covers a variety of topics, including the coefficient of restitution of both balls. The UT Arlington team found that the coefficient of restitution for the leather basketball was 0.81 when dropped from a height of 4 feet 3.7 inches. From that height, the impact velocity of the ball is 16.5 feet/second. If I build a simple regression model over my entire NBA bounce dataset (~2.5 million bounces), it predicts a coefficient of restitution of 0.86 for that impact velocity. This is somewhat higher than the UT Arlington team's value of 0.81.

While the NBA regulates the bounciness of its balls via pressure, the International Basketball Federation (FIBA) takes a more direct approach. According to official FIBA rules, the basketball must:
"Be inflated to an air pressure such that, when it is dropped onto the playing floor from a height of approximately 1,800 mm measured from the bottom of the ball, it will rebound to a height of between 1,200 mm and 1,400 mm, measured to the top of the ball."
From that height, the impact velocity is 19.3 feet per second. Applying some simple physics regarding gravity and air resistance, the requirement that the ball must rebound to a height between 1,200 and 1,400 mm is equivalent to a coefficient of restitution between 0.74 and 0.81. From my NBA data, the implied coefficient for a 19.3 ft/s impact velocity is 0.82, which is outside of the implied FIBA range of 0.74 to 0.81.

Note that some sites (such as this one) incorrectly calculate the FIBA range as 0.82 to 0.88. But that calculation misses the fact that the rebound height is measured from the top of the ball, while the starting height is measured from the bottom of the ball (there is also a small impact due to drag that must be accounted for).
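The top-of-ball versus bottom-of-ball issue is easy to see in a few lines. This sketch ignores air drag entirely and assumes a ball diameter of 240 mm (my approximation for a men's size 7 ball), so it is a rough illustration rather than the exact calculation behind the 0.74 to 0.81 range.

```python
import math


def cor_from_heights(drop_height, rebound_height):
    """Coefficient implied by drop and rebound heights, ignoring air drag.

    Both heights must be measured to the same point on the ball.
    """
    return math.sqrt(rebound_height / drop_height)


DROP_MM = 1800.0                   # FIBA drop height, bottom of ball
REBOUND_TOP_MM = (1200.0, 1400.0)  # FIBA rebound limits, top of ball
BALL_DIAMETER_MM = 240.0           # assumed diameter of a size 7 ball

# Naive: treats the top-of-ball rebound height as if it were bottom-of-ball
naive = [cor_from_heights(DROP_MM, h) for h in REBOUND_TOP_MM]
# Corrected: subtract the diameter so both heights refer to the ball's bottom
corrected = [cor_from_heights(DROP_MM, h - BALL_DIAMETER_MM)
             for h in REBOUND_TOP_MM]

print("naive range:     " + " to ".join(f"{c:.2f}" for c in naive))
print("corrected range: " + " to ".join(f"{c:.2f}" for c in corrected))
```

The naive version reproduces the 0.82 to 0.88 range. The corrected version lands a little below 0.74 to 0.81, which is consistent with drag being left out: drag costs the ball energy on the way down and on the way up, so a drag-free calculation understates the true coefficient.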

The NCAA has a rule similar to FIBA's for men's basketball:
"The ball shall be inflated to an air pressure such that when it is dropped to the playing surface from a height of 6 feet measured to the bottom of the ball, it will rebound to a height, measured to the top of the ball, of not less than 49 inches when it strikes its least resilient spot nor more than 54 inches when it strikes its most resilient spot."
From that height, the impact velocity is 19.4 feet per second and the bounce height requirement is equivalent to a coefficient of restitution between 0.76 and 0.80, lower than my value of 0.82.

I also found this study from Indiana University Purdue University Fort Wayne, which measured coefficients of restitution for a variety of drop heights and ball pressures. From a drop height of 4 feet (impact velocity of 15.9 ft/s), the coefficients were 0.84 for 5.9 psi, 0.87 for 7 psi, and 0.87 for 9.25 psi. My data implies a coefficient of 0.87 for this impact velocity, consistent with these values.

For a drop height of 8 feet (impact velocity of 22.4 ft/s), the coefficients were 0.64 for 6 psi, 0.71 for 8 psi, and 0.74 for 10 psi. My data implies a coefficient of 0.78 for this impact velocity, somewhat higher than the IUPU study's range.
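The impact velocities quoted for the various drop tests can be roughly cross-checked with the drag-free free-fall formula v = sqrt(2gh). The sketch below assumes standard gravity of 32.174 ft/s² and no air resistance, so its results should come out slightly higher than the drag-adjusted figures used in this post.

```python
import math

G_FT_S2 = 32.174   # standard gravity, ft/s^2 (assumed)
MM_PER_FT = 304.8  # exact conversion factor


def impact_velocity(drop_height_ft):
    """Impact speed in ft/s for a drop from rest, ignoring air drag."""
    return math.sqrt(2 * G_FT_S2 * drop_height_ft)


# (label, drop height in feet, impact velocity quoted in the post)
DROPS = [
    ("UT Arlington, 4 ft 3.7 in", 4 + 3.7 / 12, 16.5),
    ("FIBA, 1,800 mm", 1800 / MM_PER_FT, 19.3),
    ("NCAA, 6 ft", 6.0, 19.4),
    ("IUPU, 4 ft", 4.0, 15.9),
    ("IUPU, 8 ft", 8.0, 22.4),
]

results = [(label, impact_velocity(h), quoted) for label, h, quoted in DROPS]
for label, v, quoted in results:
    print(f"{label:27s} drag-free {v:5.2f} ft/s   quoted {quoted:.1f} ft/s")
```

In each case the drag-free speed sits a few tenths of a foot per second above the quoted value, with the gap widening for taller drops, which is the direction you would expect from air resistance.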

I even performed my own set of backyard experiments, with the help of my awesome lab assistant:


We tested pressure levels of 7.5 psi (league minimum) and 8.5 psi (league maximum). Video was analyzed using the Tracker Tool from Open Source Physics. For four trials, our average impact velocity for the 7.5 psi test was 18.7 feet per second, and the average coefficient of restitution was 0.87. For four trials of the 8.5 psi test, the average impact velocity was 18.8 feet per second and the average coefficient of restitution was 0.89. Note that these values may be biased due to the surface - we are bouncing the ball off of driveway concrete, rather than NBA hardwood.

The table below summarizes the data I was able to gather regarding the bounce of the basketball, and how it compares to my analysis of the NBA SportVU data.

study                 bounce velocity (ft/s)  pressure (psi)  coefficient of restitution (published value)  coefficient of restitution (my NBA data)
UT Arlington / Cuban  16.5                    8.5             0.81                                          0.86
IUPU Fort Wayne       15.9                    7.0             0.87                                          0.87
IUPU Fort Wayne       15.9                    9.3             0.87                                          0.87
IUPU Fort Wayne       22.4                    6.0             0.64                                          0.78
IUPU Fort Wayne       22.4                    8.0             0.71                                          0.78
IUPU Fort Wayne       22.4                    10.0            0.74                                          0.78
FIBA Rules            19.3                    -               0.74 - 0.81                                   0.82
NCAA Men's Rules      19.4                    -               0.76 - 0.80                                   0.82
Beuoy backyard        18.7                    7.5             0.87                                          0.83
Beuoy backyard        18.8                    8.5             0.89                                          0.83

In general, my values tend to be higher than the published values. There may be a valid reason for the discrepancy, but it is a reminder to treat my numbers with appropriate skepticism.

Deflategate 2? Or reasonable variation?

It does appear that there is meaningful variation in how the ball bounces in each venue. And it is not unreasonable to consider whether these variations are due to differences in ball pressure. We know the NBA allows ball inflation to vary between 7.5 psi and 8.5 psi. The question is, are the variations in coefficient of restitution we see from game to game (and team to team) within the range one would expect for balls inflated between 7.5 and 8.5 psi?

Or.....is there a smoking gun lurking in the data? One that would point to a consistent team practice of inflation levels outside of NBA allowable levels? Or perhaps a specific game?

The IUPU study above implies that for every 1 psi increase in ball pressure, the coefficient of restitution increases by ~0.02. My backyard experiments imply a similar value.
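The per-psi sensitivity can be worked out directly from the numbers quoted earlier. This sketch uses only the endpoints of the IUPU 8-foot drop series and the two backyard averages; the helper function is my own.

```python
def cor_per_psi(low, high):
    """Change in coefficient per psi between two (pressure, coefficient) points."""
    (p_low, cor_low), (p_high, cor_high) = low, high
    return (cor_high - cor_low) / (p_high - p_low)


# IUPU 8-foot drop: 6 psi -> 0.64, 10 psi -> 0.74 (figures quoted in the post)
iupu_slope = cor_per_psi((6.0, 0.64), (10.0, 0.74))
# Backyard test: 7.5 psi -> 0.87, 8.5 psi -> 0.89
backyard_slope = cor_per_psi((7.5, 0.87), (8.5, 0.89))

ALLOWED_RANGE_PSI = 8.5 - 7.5  # width of the NBA's allowed pressure range
expected_band = backyard_slope * ALLOWED_RANGE_PSI

print(f"IUPU 8 ft: {iupu_slope:.3f} per psi")
print(f"backyard:  {backyard_slope:.3f} per psi")
print(f"expected band across the allowed range: {expected_band:.2f}")
```

Both come out in the neighborhood of 0.02 per psi. The IUPU 4-foot drops are flatter (0.87 at both 7 and 9.25 psi), so the ~0.02 figure is best treated as a rough rule of thumb rather than a precise constant.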

Thus, if everything was on the level, one would expect each game's coefficient of restitution to fall within a 0.02 band. Do we see that type of variation in the actual NBA bounce data? Not exactly. For games with at least 350 clean bounces, the coefficient of restitution varies by 0.12, from a low of 0.74 to a high of 0.86. But that range is skewed by just a handful of outliers. 90% of the games fall within a much narrower band, from 0.80 to 0.84, and 60% of the games fall within the "expected" 0.02 band.

If we aggregate by team, we see a spread of about 0.03 from the lowest to highest team within each season. Given the inherent messiness in the underlying data, I'm inclined towards two conclusions:

  1. Accusations of cheating are serious, and should be supported with strong evidence. I don't believe I have any smoking guns in my data to support a charge of cheating. At best (or worst), my analysis could be used as a starting point for a more in depth investigation. More on that below.
  2. I think it is highly likely some NBA teams deliberately and consistently inflate balls to one end of the allowed psi range to better suit their playing style. While not the makings of a juicy scandal, I think this is an interesting result nonetheless, and would love to hear from those in the know whether my team by team analysis has any validity. 

As mentioned above, if one were inclined to hunt further for evidence of actual foul play, my analysis can help point the way. I have compiled a "top 10" list of games with the most suspect bounce data.

Here are the five most likely "dead ball" games. These games had the lowest implied coefficient of restitution. Note that I have applied some filters to the data to throw out games with odd looking results (e.g. games outside the norm when it comes to average pre-bounce velocity).

Top 5 "Dead Ball" Games
date        game             home team  bounces  coefficient  FT%    eFT%   diff
2014-11-21  CHI 87, POR 105  POR        1,041    0.776        67.9%  73.6%  -5.7%
2014-10-29  OKC 89, POR 106  POR        857      0.788        80.8%  76.5%  4.3%
2014-12-13  DEN 96, HOU 108  HOU        838      0.789        65.4%  67.6%  -2.2%
2013-12-04  DEN 88, CLE 98   CLE        1,184    0.792        76.9%  75.9%  1.0%
2013-11-01  CLE 84, CHA 90   CHA        933      0.792        72.7%  74.4%  -1.7%

The columns on the right compare actual road team free throw percentage to expected. For dead ball games, we would expect road teams to shoot better from the stripe.

And here are the top live ball games (note that three of the top five games in this list are Jazz home games).

Top 5 "Live Ball" Games
date        game             home team  bounces  coefficient  FT%    eFT%   diff
2015-02-02  MIN 94, DAL 100  DAL        783      0.861        88.5%  77.4%  11.1%
2015-01-16  NOP 81, PHI 96   PHI        876      0.860        65.5%  76.9%  -11.4%
2015-02-20  POR 76, UTA 92   UTA        1,136    0.858        90.9%  78.6%  12.3%
2015-04-10  MEM 89, UTA 88   UTA        1,067    0.857        63.2%  80.9%  -17.7%
2014-04-08  DAL 95, UTA 83   UTA        1,126    0.854        54.5%  83.1%  -28.6%

This analysis began well over a year ago as an offshoot of my analysis on shot arcs. With the NBA no longer sharing SportVU data publicly (as is their right), I realized that the data I was able to stockpile had a limited shelf life. There are a couple additional analyses I have had in the works on free throw shooting and shot mechanics that I hope to release soon, before the data gets too stale.

In addition, in the interests of transparency, I plan on sharing some additional data underlying this analysis on bounces and ball pressure.

Appendix

Here are the average coefficients of restitution by team and season. A reminder that a higher coefficient of restitution implies a "bouncier" ball, and, possibly, a higher internal ball pressure.

team 2013-14 rnk 2014-15 rnk 2015-16 rnk
ATL 0.825 15 0.825 12 0.821 16
BKN 0.836 5 0.835 4 0.832 4
BOS 0.827 13 0.829 9 0.824 12
CHA 0.811 29 0.819 18 0.809 28
CHI 0.839 3 0.832 6 0.832 5
CLE 0.814 27 0.814 23 0.812 25
DAL 0.833 6 0.836 3 0.829 8
DEN 0.830 8 0.831 7 0.836 2
DET 0.828 11 0.826 11 0.822 13
GSW 0.818 23 0.808 30 0.819 17
HOU 0.813 28 0.813 27 0.815 20
IND 0.820 21 0.814 24 0.813 23
LAC 0.828 12 0.825 14 0.833 3
LAL 0.830 9 0.812 28 0.821 14
MEM 0.810 30 0.813 26 0.808 30
MIA 0.821 18 0.813 25 0.814 22
MIL 0.821 17 0.815 22 0.811 27
MIN 0.843 1 0.829 8 0.825 11
NOP 0.816 26 0.816 20 0.812 24
NYK 0.820 20 0.821 16 0.828 10
OKC 0.829 10 0.825 13 0.821 15
ORL 0.817 24 0.815 21 0.819 18
PHI 0.832 7 0.842 1 0.829 7
PHX 0.837 4 0.827 10 0.831 6
POR 0.822 16 0.810 29 0.809 29
SAC 0.819 22 0.819 17 0.818 19
SAS 0.820 19 0.817 19 0.811 26
TOR 0.825 14 0.832 5 0.829 9
UTA 0.839 2 0.841 2 0.837 1
WAS 0.817 25 0.822 15 0.815 21
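A couple of the claims made earlier in the post can be checked straight from the table above. The sketch below computes the lowest-to-highest spread within each season and the three-season average for each team. The values are copied from the table, and the variable names are mine.

```python
# Home-team average coefficient from the table above:
# (2013-14, 2014-15, 2015-16)
TEAM_COR = {
    "ATL": (0.825, 0.825, 0.821), "BKN": (0.836, 0.835, 0.832),
    "BOS": (0.827, 0.829, 0.824), "CHA": (0.811, 0.819, 0.809),
    "CHI": (0.839, 0.832, 0.832), "CLE": (0.814, 0.814, 0.812),
    "DAL": (0.833, 0.836, 0.829), "DEN": (0.830, 0.831, 0.836),
    "DET": (0.828, 0.826, 0.822), "GSW": (0.818, 0.808, 0.819),
    "HOU": (0.813, 0.813, 0.815), "IND": (0.820, 0.814, 0.813),
    "LAC": (0.828, 0.825, 0.833), "LAL": (0.830, 0.812, 0.821),
    "MEM": (0.810, 0.813, 0.808), "MIA": (0.821, 0.813, 0.814),
    "MIL": (0.821, 0.815, 0.811), "MIN": (0.843, 0.829, 0.825),
    "NOP": (0.816, 0.816, 0.812), "NYK": (0.820, 0.821, 0.828),
    "OKC": (0.829, 0.825, 0.821), "ORL": (0.817, 0.815, 0.819),
    "PHI": (0.832, 0.842, 0.829), "PHX": (0.837, 0.827, 0.831),
    "POR": (0.822, 0.810, 0.809), "SAC": (0.819, 0.819, 0.818),
    "SAS": (0.820, 0.817, 0.811), "TOR": (0.825, 0.832, 0.829),
    "UTA": (0.839, 0.841, 0.837), "WAS": (0.817, 0.822, 0.815),
}
SEASONS = ("2013-14", "2014-15", "2015-16")

# Lowest-to-highest spread among the 30 home teams within each season
spreads = {}
for i, season in enumerate(SEASONS):
    values = [cors[i] for cors in TEAM_COR.values()]
    spreads[season] = round(max(values) - min(values), 3)

# Three-season average per team, ranked from bounciest to deadest
averages = {team: sum(cors) / len(cors) for team, cors in TEAM_COR.items()}
ranked = sorted(averages, key=averages.get, reverse=True)

print("spread by season:", spreads)
print("highest three-season average:", ranked[0])
print("lowest three-season average:", ranked[-1])
```

The per-season spreads all come out close to 0.03, and Utah has the highest three-season average with Memphis the lowest, matching the discussion in the body of the post.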

8 comments:

  1. This is an amazing analysis. You should get this published in an academic paper

  2. I wonder if the Jazz result has much to do with the altitude, looks like Denver's CoR is above average too. It's a truism that those two cities have thin air.

    Google tells me atmospheric pressure in Denver is about 80-85% of sea level, so I guess that would have some effect there even if it's small enough to be irrelevant.


    I'm also curious how the data would look with ONLY free throw bounces included. It decimates the sample size I'm sure, but I'd suspect the other bounces are affected by factors like spin etc that SportsVu doesn't capture (?).

    Very cool article anyway!

  3. You need to get out more.

  4. Great article. I happen to think variations in floor construction (incl. underlying ice) and atmospherics (humidity/pressure) are driving most of the anomalies observed.

    The interesting ones, to me, are the Lakers vs. Clippers, and when an CoR change occurred in coincidence with a coaching change. I.e., when the actual arena and floor are held constant.

  5. Fantastic analysis!

    I find the two bottom teams to be quite significant. Memphis has constantly been among the five slowest teams in the league during the last few years and Charlotte under Steve Clifford has made a concerted effort to not turn the ball over. They've led the league in TOV% the last three seasons in a row. So those two teams would have obvious reasons for playing with less bouncy balls, right? It's easier to control the pace or turnovers when the ball is more predictable and doesn't bounce off the rim or the floor as hard.

    Have you considered contacting the players or coaches who have been responsible for some of the quotes in the media? They might be interested in your work and could offer some additional evidence. For one, Scott Padgett is now the head coach at a relatively small college. You should be able to get through to him.

  7. One way to reduce the noise is to only analyze bounces during free throws since free throw shooters tend to bounce the ball in the exact way for every shot.

    Also, you could isolate the analysis even further to just a single free throw shooter - like Harden. Calculate Harden's home coefficient and then his coefficient for each arena. Then see if Harden's FT% is better than his home FT% in arenas where his coefficient is lower than his coefficient in Houston.
