Sunday, March 1, 2015

Explaining Team Win-Loss Records

A useful, sometimes ignored distinction in sports analytics is the difference between a "narrative" stat and a "predictive" stat. Narrative stats tell you what happened. Predictive stats tell you what will happen. Predictive stats are what your general managers, bookies, and gamblers care about, but it's usually the narrative stats that get the most attention. In addition, narrative stats are often mistaken for their predictive cousins (it happens with elections too).

Win probability added is, perhaps, the ultimate narrative stat. It can distill any play down to its impact on what we care about the most: winning. In last week's post on how NBA games are won, I used my win probability model to assess the relative importance of the "four factors" of basketball success, leading to the groundbreaking conclusion that shooting is very, very important to the game of basketball (take that, Chuck).

In this post, I introduce a new feature to the site: NBA Team Profiles. Just as I used my narrative win probability added stat to take stock of the four factors, I can use it to take stock of team win-loss records. The new feature breaks down a team's win-loss record according to the four factors, further split by offense and defense. Here is an example:

The Golden State Warriors currently have the league's best record, with an 80% win percentage. That record can be "explained" as follows:
  • Generic win-loss percentage: 50%
  • + Offense Win Probability Added: 13% (see Team Profile page)
  • + Defense Win Probability Added: 17% (see Team Profile page)
  • = Actual Win Loss Percentage: 80%
The Warriors lead the league in points scored per game, their points allowed per game is about league average, and they possess the best scoring duo in the league with Steph Curry and Klay Thompson. So, it might be surprising (to some) to find that the Warriors owe more of their success to defense than to offense. Of course, the Warriors' per game stats are misleading because of the high pace at which they play their games, leading to more scoring opportunities for opponent and Warrior alike. For that reason, the win probability added numbers above are adjusted to a per 100 possessions basis.
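The pace adjustment described above can be sketched in a few lines. This is only an illustration of the general per 100 possessions idea, not the site's actual calculation; the function name `per_100_possessions` and the sample numbers are hypothetical.

```python
def per_100_possessions(total: float, possessions: float) -> float:
    """Scale a raw per-game total to a per-100-possessions rate."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * total / possessions


# Hypothetical numbers, for illustration only: two teams that score the
# same points per game but play at different paces.
fast_team = per_100_possessions(total=110.0, possessions=100.0)
slow_team = per_100_possessions(total=110.0, possessions=92.0)

print(round(fast_team, 1), round(slow_team, 1))
```

The slower team scores more per possession even though the per game totals match, which is why raw per game numbers can mislead for a high-pace team.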

We can also run the same explanation using the four factors (combining the offensive and defensive components - select the "Collapse Off/Def" option):
  • Generic win-loss percentage: 50%
  • + Field goals: +26%
  • + Free throws: -2%
  • + Rebounds: -1%
  • + Turnovers: +7%
  • = Actual Win Loss Percentage: 80%
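Both breakdowns are additive: start from the generic 50% and add each component's win probability added. A quick check using the Warriors numbers quoted above (the variable and function names are just for illustration):

```python
BASELINE = 50  # generic win-loss percentage, in percentage points

# Win probability added, in percentage points, as quoted in the post.
offense_defense = {"offense": 13, "defense": 17}
four_factors = {"field_goals": 26, "free_throws": -2, "rebounds": -1, "turnovers": 7}


def explained_win_pct(baseline, contributions):
    """Baseline plus the sum of all win probability added components."""
    return baseline + sum(contributions.values())


by_off_def = explained_win_pct(BASELINE, offense_defense)
by_factor = explained_win_pct(BASELINE, four_factors)
print(by_off_def, by_factor)
```

Both views land on the same 80%, since they are two ways of slicing the same total.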
Nearly all of the Warriors' success is attributable to shooting (on offense and defense). And that seems to hold in general for both the good teams and the bad. In fact, 83% of the variance in this season's win-loss records can be explained solely by shooting. Free throws explain just 16% of the variance, rebounds 2%, and turnovers 28%. Here are those same numbers for the past four seasons:

Variance in Win-Loss Record Explained By:

Season    Field Goals   Free Throws   Rebounds   Turnovers
2011-12   81%           13%           23%        18%
2012-13   81%           15%           9%         21%
2013-14   83%           17%           3%         21%
2014-15   83%           16%           2%         28%
Average   82%           15%           9%         22%
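For readers wondering what "variance explained" means in practice, here is a minimal sketch. It assumes the figure is the R-squared of a one-variable linear fit of team win percentage on a single factor's win probability added, which may not be exactly how the numbers above were produced. The function name and the sample data are made up for illustration.

```python
def r_squared(x, y):
    """Share of the variance in y explained by a one-variable linear fit on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # For a single predictor, R-squared equals the squared Pearson correlation.
    return sxy * sxy / (sxx * syy)


# Made-up numbers for five imaginary teams, NOT real NBA data.
shooting_wpa = [0.20, 0.10, 0.00, -0.10, -0.20]
win_pct = [0.72, 0.58, 0.52, 0.41, 0.27]

print(round(r_squared(shooting_wpa, win_pct), 3))
```

If each factor is fit on its own like this, the four percentages in a row need not sum to 100%, which is consistent with the table.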

In my previous post on how NBA games are won, I found that, according to win probability, shooting had seven times the impact of the next most important factor, turnovers. In the comments to that post (and on APBR as well), it was pointed out that shooting has a higher variance than the other three factors, which could lead to shooting being overvalued in my analysis. As longtime commenter Nate put it, "non-linear scaling of variance with sample size".

But since the above analysis aggregates team performance over an 82 game season, presumably some of that game-to-game variance washes out. By this measure, shooting is not seven times as important as turnovers, but closer to four (82% versus 22% of variance explained, on average). This is lower than my original seven to one ratio, but still higher than the prior estimates of EvanZ and Dean Oliver.
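The "closer to four" figure comes from comparing the four-season averages in the table. A quick check, using the table values as given (the dictionary layout and names are just for illustration):

```python
# "Variance explained" percentages copied from the table in the post.
variance_explained = {
    "2011-12": {"field_goals": 81, "free_throws": 13, "rebounds": 23, "turnovers": 18},
    "2012-13": {"field_goals": 81, "free_throws": 15, "rebounds": 9, "turnovers": 21},
    "2013-14": {"field_goals": 83, "free_throws": 17, "rebounds": 3, "turnovers": 21},
    "2014-15": {"field_goals": 83, "free_throws": 16, "rebounds": 2, "turnovers": 28},
}
factors = ["field_goals", "free_throws", "rebounds", "turnovers"]

# Simple average of each factor across the four seasons.
averages = {
    factor: sum(season[factor] for season in variance_explained.values())
    / len(variance_explained)
    for factor in factors
}

shooting_to_turnovers = averages["field_goals"] / averages["turnovers"]
print(averages)
print(round(shooting_to_turnovers, 2))
```

Taking a ratio of two "variance explained" figures is a rough comparison rather than a formal measure of relative importance.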

The rebounding results are interesting. My first analysis showed that turnovers, rebounds, and free throws were all roughly equivalent in their ability to explain wins. But when aggregated across teams over a whole season, rebounding drops in significance compared to the other two. Perhaps rebounds, while important to secure, are more due to luck, and less of a repeatable team skill. The 2011-12 results are the outlier here, with rebounding more important than everything but shooting. It is tempting to draw a trendline through those four data points, but it is probably best to wait until I have a chance to add more seasons to my win probability database before drawing any conclusions.

1 comment:

  1. It's a little odd that the 'importance' of shooting is so stable at 82-83% when the other numbers jump around relatively wildly.

    I'm not entirely clear about the methodology you're using. If the approximate formula for 'importance' is variance x significance, then we can make an estimate of the expected relative importance of shooting and rebounding:

    So let's make some assumptions...
    A possession is worth roughly one point.
    About one quarter of shooting attempts are three point attempts.
    Three pointers go in about 1/3 of the time.
    Two pointers go in about 1/2 of the time.
    The offense rebounds about 1/4 of the time.
    Every missed shot is a rebounding opportunity.

    Then we expect the ratio of 'importance' to be around 58 to 12 - or just under 5 to 1, which is pretty far from what you found in the data set from 2012 on.

    However, another approach for approximating:

    Looking at the ESPN stats, the effective team field goal percentage ranges from 53.8 to 45.8 and the rebounding rate ranges from 52.5 to 47.9.

    If rebounds are worth roughly one point, then each percentage point of effective field goal percentage is worth four times as much as a point of rebounding rate. (Doubled once because there are 2 points per basket, and again because there are twice as many field goal attempts as rebounds.) Now, because the 'range of averages' really corresponds to the deviation rather than the variance, we have to square both ranges. That gives a rough ratio of 50:1, which is more in line with your numbers.

    Mostly, I guess this tells me I have no clue. That said, I can't help but suspect that the difference between 5:1 and 50:1 somehow corresponds to differences in team quality.
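The second estimate in the comment above can be reproduced with a little arithmetic. This sketch assumes the factor of four is applied to the shooting range before squaring, which is one reading of the comment; the variable names are illustrative.

```python
# League-wide ranges quoted in the comment (percentage points).
efg_range = 53.8 - 45.8   # effective field goal percentage, best minus worst
reb_range = 52.5 - 47.9   # rebounding rate, best minus worst

# Each point of eFG% is taken to be worth four points of rebounding rate.
weight = 4

# Ranges stand in for deviations, so square the weighted ratio to
# compare variances.
ratio_weight_squared = (weight * efg_range / reb_range) ** 2

# Alternative reading: square only the ranges, then apply the weight.
ratio_weight_linear = weight * (efg_range / reb_range) ** 2

print(round(ratio_weight_squared, 1), round(ratio_weight_linear, 1))
```

Under the first reading the ratio comes out a little under 50 to 1, matching the comment's rough figure; under the second it is closer to 12 to 1, so the result is quite sensitive to how the weight is applied.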