Sunday, January 26, 2014

Early Season Power Rankings - 2013 Update

Charles Spearman
Nothing drives pageviews like portraits of Victorian-era statisticians
In this post from last year, I took a look at the accuracy of various NFL power rankings, where I defined accuracy as the ability to predict future wins. In a follow-up post, I extended the study to additional years. Using the Spearman rank correlation coefficient as my measure of accuracy, I found that the betting market rankings that I publish here are the most accurate in terms of predicting future win-loss records. Surprisingly, the second most accurate ranking is what is known as the "Simple Ranking System", which outperformed both the statistical models and the "experts" at ESPN, despite being based solely on scoring margin.

The 2013 NFL Season

As promised, here is the 2013 season update. In order to shed more light on my methodology, I will break down the rankings and win-loss records at a team level below, so we can see exactly how right (and wrong) each ranking system was.

As a reminder, I am looking at the rankings after week 4 of the regular season, and seeing how those correlate to each team's win-loss record for weeks 5-16. The ranking systems I'm comparing are: the ESPN power rankings, Football Outsiders' DVOA Rankings, the Advanced NFL Stats team efficiency model, my betting market rankings, and the Simple Ranking System.

Weeks 5-16 results (Record, Rank) and week 4 rankings (ESPN through SRS)
Team Record Rank ESPN DVOA ANS MARKET SRS
CAR 10-2 1 21 6 12 18 3
SF 9-2 2 8 18 23 4 21
ARI 8-3 5 19 27 22 20 26
CIN 8-3 5 11 15 18 12 23
DEN 8-3 5 1 1 1 1 1
PHI 8-3 5 27 26 5 23 24
SEA 8-3 5 2 2 3 2 2
IND 7-4 9.5 6 5 6 9 5
KC 7-4 9.5 5 3 8 13 8
NE 7-4 9.5 4 7 15 6 6
PIT 7-4 9.5 29 25 25 19 30
BAL 6-5 14.5 14 23 29 15 7
DAL 6-5 14.5 18 16 21 16 14
NO 6-5 14.5 3 4 2 3 4
NYG 6-5 14.5 30 31 17 27 29
SD 6-5 14.5 17 12 10 26 13
STL 6-5 14.5 26 30 32 25 31
GB 6-5-1 18 12 9 19 5 18
CHI 5-6 20 9 11 28 8 20
MIA 5-6 20 7 19 14 14 9
NYJ 5-6 20 22 20 13 30 19
BUF 4-7 23.5 23 10 20 29 11
DET 4-7 23.5 10 13 4 7 16
JAC 4-7 23.5 32 32 31 32 32
TB 4-7 23.5 31 22 26 21 22
MIN 3-7-1 26 24 21 24 24 27
ATL 3-8 28 16 14 11 10 15
OAK 3-8 28 28 29 30 31 25
TEN 3-8 28 13 8 9 22 10
CLE 2-9 30.5 20 24 16 28 17
WAS 2-9 30.5 25 28 27 17 28
HOU 0-11 32 15 17 7 11 12

The team with the best win-loss record from weeks 5-16 was the Carolina Panthers. Who saw that coming? DVOA had the Panthers at a respectable #6, while the Simple Ranking System had them as the #3 team in the league (the Panthers had only played three games at that point: two losses by a combined 6 points, and one win by a margin of 38; SRS loves it when you do that). The rest of the rankings had the Panthers in the middle of the pack.

Each ranking seemed to have its share of hits and misses. The market had the 49ers pegged right, despite their slow start. The Advanced NFL Stats model was the only one to predict the Eagles' strong finish. And nobody anticipated just how awful the Texans were going to be this year.

Combined 2007-2013 Results

But our eyeballs can only get us so far in evaluating which ranking system was the best, which is why we have Spearman's coefficient. Here are the results, alongside the 2007-2012 results from my previous post. The higher the percentage, the more accurate the ranking.

Week 4 Ranking Correlation to Future Wins
ranking average 2007 2008 2009 2010 2011 2012 2013
espn 46% 55% 42% 51% 55% 43% 39% 34%
dvoa 47% 57% 45% 47% 46% 41% 65% 30%
ans 37% 50% 42% 51% 15% 49% 32% 23%
market 52% 68% 36% 67% 45% 52% 56% 37%
srs 48% 70% 46% 47% 38% 56% 54% 25%
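
For anyone who wants to check the 2013 column by hand, here is a minimal Python sketch of the calculation, assuming Spearman's coefficient is computed as the Pearson correlation of the two rank lists, with tied records sharing an averaged rank (as in the Rank column of the team-level table above). The names `average_ranks`, `spearman`, `TABLE`, and `SYSTEMS` are purely illustrative, and this is not necessarily the exact calculation behind the published figures.

```python
from math import sqrt

def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side has no variation at all
    return cov / sqrt(var_x * var_y)

# Copied from the team-level table: team, weeks 5-16 rank, ESPN, DVOA, ANS, MARKET, SRS.
TABLE = """
CAR 1 21 6 12 18 3
SF 2 8 18 23 4 21
ARI 5 19 27 22 20 26
CIN 5 11 15 18 12 23
DEN 5 1 1 1 1 1
PHI 5 27 26 5 23 24
SEA 5 2 2 3 2 2
IND 9.5 6 5 6 9 5
KC 9.5 5 3 8 13 8
NE 9.5 4 7 15 6 6
PIT 9.5 29 25 25 19 30
BAL 14.5 14 23 29 15 7
DAL 14.5 18 16 21 16 14
NO 14.5 3 4 2 3 4
NYG 14.5 30 31 17 27 29
SD 14.5 17 12 10 26 13
STL 14.5 26 30 32 25 31
GB 18 12 9 19 5 18
CHI 20 9 11 28 8 20
MIA 20 7 19 14 14 9
NYJ 20 22 20 13 30 19
BUF 23.5 23 10 20 29 11
DET 23.5 10 13 4 7 16
JAC 23.5 32 32 31 32 32
TB 23.5 31 22 26 21 22
MIN 26 24 21 24 24 27
ATL 28 16 14 11 10 15
OAK 28 28 29 30 31 25
TEN 28 13 8 9 22 10
CLE 30.5 20 24 16 28 17
WAS 30.5 25 28 27 17 28
HOU 32 15 17 7 11 12
"""
SYSTEMS = ["espn", "dvoa", "ans", "market", "srs"]

rows = [line.split() for line in TABLE.strip().splitlines()]
future_rank = [float(r[1]) for r in rows]
results = {
    name: spearman(future_rank, [float(r[2 + k]) for r in rows])
    for k, name in enumerate(SYSTEMS)
}
for name, rho in results.items():
    print(f"{name:>6}: {rho:.0%}")
```

Run against the team-level table, this should land within about a percentage point of the 2013 column for each system. Small differences are possible, since the published figures may have been computed from slightly different inputs than the rounded ranks shown here.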

What surprised me was how low the percentages were this year, in comparison to prior seasons. It has already been pointed out that this year's playoffs were not very random, with the consensus top 4 teams in the pre-season all making it to the conference championships. But the regular season appears to have been just the opposite, with all 5 rankings having their worst or second-worst year in 2013. I'm at a bit of a loss to explain why that is, as this season didn't appear to be particularly "shocking" to me. I think the particularly poor performance of the usually playoff-bound Falcons and Texans had something to do with it.

The market rankings continue to be the most accurate of the bunch, although a 37% correlation is nothing to brag about. SRS had a rather poor season this year, but it still ranks second, by a slim margin, in the 2007-2013 average.

Mid-season Ranking Accuracy

I can also apply this same methodology to the week 8 rankings to see if any models improved (or regressed) with an additional four weeks of data. The challenge here is that the more weeks I give the rankings to figure things out, the fewer weeks I have to test their accuracy. The later into the season I get, the noisier my measurements of future accuracy become, due both to small-sample volatility and to strength-of-schedule imbalances.
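
To put a rough number on the small-sample problem, here is a short Python sketch under a deliberately simple assumption: each game is an independent coin flip with a fixed win probability. That binomial model is my simplification for illustration and is not part of the analysis itself; the function name `win_pct_sd` and the game counts (about 11 games remaining after week 4, about 8 after week 8, ignoring how bye weeks fall) are also assumptions.

```python
from math import sqrt

def win_pct_sd(p, games):
    """Standard deviation of observed win percentage for a team with true
    win probability p over `games` independent games (binomial model)."""
    return sqrt(p * (1 - p) / games)

# Assumed game counts: ~11 games left after week 4, ~8 left after week 8.
for games in (11, 8):
    print(f"{games} games: sd of win pct = {win_pct_sd(0.5, games):.3f}")
```

For a true .500 team, the spread in observed win percentage grows from roughly 0.15 with 11 games to roughly 0.18 with 8, so the target the rankings are measured against gets noticeably fuzzier.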

Here are the 2013 results, alongside the previously published 2007-2012 data.

Week 8 Ranking Correlation to Future Wins
ranking average 2007 2008 2009 2010 2011 2012 2013
espn 46% 58% 42% 51% 43% 41% 41% 26%
dvoa 53% 77% 53% 59% 44% 35% 46% 41%
ans 49% 70% 47% 57% 38% 39% 42% 34%
market 55% 62% 50% 62% 56% 55% 42% 40%
srs 53% 75% 51% 55% 46% 42% 50% 35%

You'll notice that ESPN's correlation actually got worse with more data, while the rest of the rankings improved. This is roughly consistent with prior seasons, and a phenomenon I pointed out in the previous post on this topic. ESPN's mid-season regression illustrates the folly of ranking teams by win-loss record, which is pretty much all the ESPN power ranking is, with a deviation from win-loss here and there just to make it appear as if they're trying. I don't mean to single out ESPN, though: pretty much all the major sports sites that do power rankings employ the same approach. Bill Parcells may disagree, but a team is more than its win-loss record.

2 comments:

  1. Not sure this is a very useful measure, for a number of reasons, but to throw out just one: if I just give every team a rank of 16, my correlation % is 50, which beats everything but the market over 7 years at 4 weeks, and is in the running for the 7-year average at 8 weeks.

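    1. Actually, the Spearman coefficient scales from -100% to +100%. Ranking every team #16 tells you nothing about future wins: strictly speaking, the coefficient is undefined when every rank is tied, and the fair score for a ranking with no information is 0%, not 50%.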

      -100% would be if you ranked the #1 team as #32, #2 as #31, and so on.
