Thursday, January 1, 2015

2014 Early Season Power Rankings - The Results

Fashion icon Bill Belichick
This post is a follow-up to this post from October, in which I test the accuracy of various NFL power rankings based on their ability to predict future win/loss records.

Like God and his beetles, the internet has an inordinate fondness for ordered lists. Case in point: NFL power rankings. Nearly every major (and minor) sports site has its own subjective assessment of the relative strength of each NFL team - NFL.com, CBS, ESPN, and SBNation just to name a few. There are also a variety of objective stat-based rankings from which to choose, from Advanced Football Analytics' simple and open-source team efficiency rankings, to Football Outsiders' more complex and proprietary DVOA model.

Falling somewhat in between a subjective and objective approach are my betting market rankings. They are objective in that the recipe is fixed ahead of time, and requires no judgment on my part (I'm just turning the crank). But the inputs to the model are the Vegas point spreads, which are subject to the whims and prejudices of the market - and the bookies who do their best to keep their books balanced.

There are also very simple (but surprisingly accurate) models that rely solely on scoring margin. There is what's known as the Simple Ranking System (or SRS), which you can find a version of at Pro Football Reference. SRS is based on average scoring margin for each team, with an iterative adjustment for strength of schedule.
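As a rough illustration of the iterative strength-of-schedule adjustment described above, here is a minimal sketch in Python. It assumes each team's rating is its average scoring margin plus the average rating of its opponents, repeated until the numbers settle. The function name, the toy schedule, and the re-centering step are my own illustrative choices, not Pro Football Reference's actual implementation, and the plain iteration assumes a connected schedule in which teams do not split into two groups that only play each other.

```python
from collections import defaultdict

def simple_ranking_system(games, iterations=100):
    """Sketch of SRS. games: list of (team_a, team_b, points_a, points_b)."""
    margins = defaultdict(list)    # per-game scoring margins for each team
    opponents = defaultdict(list)  # opponents faced, one entry per game
    for team_a, team_b, pts_a, pts_b in games:
        margins[team_a].append(pts_a - pts_b)
        margins[team_b].append(pts_b - pts_a)
        opponents[team_a].append(team_b)
        opponents[team_b].append(team_a)

    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}
    ratings = dict(avg_margin)  # start from the raw average margin

    for _ in range(iterations):
        # rating = average margin + average rating of opponents faced
        updated = {
            t: avg_margin[t] + sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            for t in ratings
        }
        # re-center so the league average stays at zero
        mean = sum(updated.values()) / len(updated)
        ratings = {t: r - mean for t, r in updated.items()}
    return ratings

# Hypothetical four-team schedule, made up for illustration only.
toy_games = [
    ("A", "B", 27, 20),
    ("A", "C", 17, 24),
    ("B", "C", 30, 10),
    ("C", "D", 21, 14),
    ("B", "D", 13, 16),
]
ratings = simple_ranking_system(toy_games)
for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:+.2f}")
```

In this toy schedule, team A has an average margin of exactly zero but ends up rated about half a point above average, because its two opponents were, on balance, slightly better than the league.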

Even simpler (in some respects) is the Elo-based ranking system published this year at FiveThirtyEight. Elo rankings were first developed to rank chess players. Nate Silver has extended that methodology to rank NFL teams. Elo and SRS share some similarities. Both are based solely on scoring margin. And both attempt to adjust for each team's strength of schedule. But a key feature that distinguishes Elo from SRS is that Elo uses results from the prior season as a starting point to develop the rankings for the current season. SRS is based solely on current season results.
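To make the "prior season as a starting point" idea concrete, here is a minimal sketch of a generic Elo update in Python. The logistic expected-score formula is the standard chess one. The K-factor, the league mean, and the carryover fraction are placeholder values I picked for illustration, not FiveThirtyEight's published parameters, and the sketch leaves out any margin-of-victory or home-field adjustment.

```python
def expected_score(rating_a, rating_b):
    """Win probability for team A under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, result_a, k=20.0):
    """Return both ratings after one game.

    result_a is 1.0 for an A win, 0.5 for a tie, 0.0 for an A loss.
    k is an assumed K-factor, chosen only for illustration.
    """
    delta = k * (result_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

def preseason_rating(last_season_rating, mean=1500.0, carryover=2 / 3):
    """Regress last season's final rating toward the league mean.

    mean and carryover are assumed values, not taken from the post.
    """
    return mean + carryover * (last_season_rating - mean)

# Hypothetical teams: one strong and one weak at the end of last season.
strong = preseason_rating(1680.0)
weak = preseason_rating(1410.0)

# The strong team loses an early game; it drops, but keeps most of its edge.
strong_after, weak_after = update(strong, weak, result_a=0.0)
print(f"strong: {strong:.0f} -> {strong_after:.1f}")
print(f"weak:   {weak:.0f} -> {weak_after:.1f}")
```

The point of the sketch is that a single upset moves the ratings only modestly, so a team that was very good last year still rates well after a shaky start.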

Despite their ubiquity, it has never been clear to me what power rankings are really for. A survey of sports forums and article comment sections indicates that the most popular use is to start internet slap fights. But for this post, as I did last year, I will judge each ranking system on its ability to predict future wins. Following week four of the 2014 NFL season, I archived the power rankings for six different systems. We will now see how well each system predicted wins from weeks 5 through 16 (I am excluding week 17 results - too many garbage games). The table below ranks teams by week 5-16 win percentage:


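Week 4 rankings compared with weeks 5-16 results (Record and Rank cover weeks 5-16, with tied teams sharing an averaged rank; the remaining columns are each system's week 4 ranking, and 538 is the FiveThirtyEight Elo ranking)
Team Record Rank SRS DVOA 538 ESPN MARKET AFA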
NE 10-1 1 26 23 7 16 14 27
GB 9-2 2 15 6 16 12 4 5
DEN 9-3 3.5 6 3 2 3 2 4
SEA 9-3 3.5 3 2 1 1 1 2
DAL 8-3 6.5 14 13 9 10 11 18
DET 8-3 6.5 11 7 17 8 7 3
IND 8-3 6.5 7 17 10 11 5 17
PIT 8-3 6.5 23 15 13 18 18 23
ARZ 8-4 9 2 8 6 4 10 10
CIN 7-4-1 10 1 1 5 2 3 1
BAL 6-5 13.5 5 5 8 7 9 9
BUF 6-5 13.5 13 10 24 23 22 24
KC 6-5 13.5 9 18 12 14 17 14
MIA 6-5 13.5 24 20 21 22 23 6
PHI 6-5 13.5 16 9 11 6 13 28
SD 6-5 13.5 4 12 4 5 12 8
CLE 6-6 17 19 16 29 25 28 16
HOU 5-6 19 17 24 25 13 20 21
NO 5-6 19 20 21 15 21 8 7
SF 5-6 19 10 19 3 9 6 19
STL 5-7 21 29 29 23 30 27 30
CAR 4-6-1 22 27 27 14 20 21 25
ATL 4-7 24 12 4 22 17 16 22
MIN 4-7 24 22 26 20 24 26 26
NYG 4-7 24 8 11 18 19 15 11
CHI 3-8 27.5 18 14 19 15 19 20
JAC 3-8 27.5 32 32 32 31 32 31
OAK 3-8 27.5 30 30 31 32 31 29
WAS 3-8 27.5 25 22 30 27 24 13
NYJ 2-9 30 28 25 27 26 25 12
TB 1-10 31.5 31 31 28 29 30 32
TEN 1-10 31.5 21 28 26 28 29 15

The Patriots' 2-2 start to this season proved to be a head-fake that broke the ankles of many NFL pundits and commentators. From an October 4, 2014 USA Today article:
The New England Patriots are not a good football team. For fans expecting Bill Belichick and Tom Brady to pull a redux of 2012, when the team also started 2-2, leading to whispers that the dynasty might be crumbling, then rallied for an improbable 12-4 season that ended a tipped pass from the Super Bowl — don’t kid yourselves.
Following an embarrassing Monday Night loss that dropped the Patriots to 2-2, Steve Young and Trent Dilfer accused the Patriots of rebuilding - thus wasting a year of Tom Brady in his prime. Dilfer went further, insinuating that the Patriots were no longer trying to win a Super Bowl. Quoth the former Raven:
Let's face it: They're not good anymore. They're weak.
To be fair to everybody piling on at the time, the New England Patriots were terrible when judged on their 2014 play alone. DVOA, SRS, and Advanced Football Analytics all had the Pats in the bottom third of the league. ESPN, FiveThirtyEight, and my market rankings fared somewhat better, ranking New England at or above the league midpoint (16th, 7th, and 14th, respectively). This is because these rankings don't restrict themselves to just current season data. The FiveThirtyEight Elo method does this explicitly, using the prior season's (regressed) Elo ranking as the starting point for the current season rank. ESPN and Vegas clearly viewed the Patriots' slow start in context. New England still had Tom Brady and Bill Belichick, making for a heck of a Bayesian prior.

In what can only be considered a sign of impending doom for the universe, Skip Bayless may have had it right when he refused to join the pile-on:
The New England Patriots will rise like the phoenix from the ashes of Monday night's 41-14 loss in Kansas City and land at the University of Phoenix Stadium in Glendale, Arizona, playing in Super Bowl XLIX.
But enough about New England and who got it wrong. This was actually a good year for most ranking systems, when judged by the Spearman rank correlation coefficient. This number measures how well two ordered lists agree with each other: 100% means the two lists are in identical order, and 0% means there is no relationship between them. It is what I used in my prior posts on this topic to judge the accuracy of each ranking system. Here are the results (alongside the 2007-2013 results):

Week 4 Ranking Correlation to Future Wins
ranking average 2007 2008 2009 2010 2011 2012 2013 2014
espn 49% 55% 42% 51% 55% 43% 39% 34% 72%
dvoa 49% 57% 45% 47% 46% 41% 65% 30% 64%
afa 38% 50% 42% 51% 15% 49% 32% 23% 44%
market 55% 68% 36% 67% 45% 52% 56% 37% 75%
srs 49% 70% 46% 47% 38% 56% 54% 25% 56%
elo - - - - - - - - 73%

For the second year in a row, the market-based rankings published here correlated best with future wins - evidence of market efficiency. And when averaged over the past eight years, the market rankings are the clear winner, with an average correlation of 55%. The FiveThirtyEight Elo rankings had a good first year, coming in second with a 73% correlation to future wins. And in general, the models that didn't rely solely on current season results fared better this year.
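For anyone who wants to check the numbers, here is a minimal sketch of the Spearman calculation in Python, applied to two columns from the first table: the weeks 5-16 rank and the week 4 MARKET rank. It assumes the usual textbook definition (the Pearson correlation of the two rank lists, with tied teams sharing an averaged rank). The post does not spell out how ties were handled in the published figures, so treat that part as my assumption.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Pearson correlation computed on the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Weeks 5-16 rank and week 4 MARKET rank, in the table's row order (NE ... TEN).
actual = [1, 2, 3.5, 3.5, 6.5, 6.5, 6.5, 6.5, 9, 10, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5,
          17, 19, 19, 19, 21, 22, 24, 24, 24, 27.5, 27.5, 27.5, 27.5, 30, 31.5, 31.5]
market = [14, 4, 2, 1, 11, 7, 5, 18, 10, 3, 9, 22, 17, 23, 13, 12,
          28, 20, 8, 6, 27, 21, 16, 26, 15, 19, 32, 31, 24, 25, 30, 29]

rho = spearman(actual, market)
print(f"market vs. weeks 5-16 results: {rho:.2f}")
```

On these two columns the sketch comes out at roughly 0.75, in line with the 75% shown for the market rankings in 2014. Swapping in another column from the first table gives the corresponding figure for that system.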

3 comments:

  1. I'm not affiliated with the site, but please consider adding massey-peabody.com to your analysis:

    http://massey-peabody.com/nfl-2014-rankings-week/

    Also, maybe an analysis of weeks 1-8 vs. weeks 9-17? Thanks.

    1. I took a quick look at the 2014 rankings. Massey-Peabody would have scored 66% this year, putting them in the middle of the pack.

    2. this is good stuff.

      i compete in pickem pools, and sometimes use this type of stuff to help me with my picks. i've made a spreadsheet where I average several standardized power ratings, toss out any outliers, and use that as a tool to compare potential match ups.

      I would like to see a weekly analysis using ratings as opposed to rankings, which would be more helpful to someone like me.

      I know elo has applied a value to home field, so that could be corrected for when predicting a game winner, but I'm not sure about a lot of the other metrics

      in addition to the ones listed here, I also incorporate nERD
      https://www.numberfire.com/nfl/teams/power-rankings/
      Predictive Rankings
      https://www.teamrankings.com/nfl/ranking/predictive-by-other
      and new this year, ESPN's FPI
      http://espn.go.com/nfl/story/_/id/13539793/espn-nfl-football-power-index-debuts
