Advances in computing power and data collection, along with smarter algorithms to make sense of it all, have led to improvements in prediction accuracy across an array of fields, from weather forecasting to what kind of movies you might like. The purpose of this post is to see whether similar gains have been made in predicting NFL game outcomes, using the Vegas point spread as a proxy for overall advances in predictive accuracy.

Are the Bookies Getting Smarter?

To be honest, I know very little about how the sports books set their point spreads in the "old days", but my mental image of how it worked involves some cigar-chomping guy named "Lefty" or "Ace", poring over the sports pages and taking in tips from his network of informants (in other words, I have watched Casino several times). Surely the army of stat-crunching nerds that has amassed in that time and the wealth of data they now have at their disposal would lead to a more efficient market? One that is demonstrably better at picking winners and predicting margin of victory? Even the analytical muscle of Wall Street is now in the game, with Cantor Gaming (a division of Cantor Fitzgerald) providing point spreads for various major sports books.

Thanks to sportsdatabase.com, we can see if this is actually the case, as they have archived game results and point spreads for the NFL, going as far back as the 1989 season. Here are the results.

66.8% of the time, we're right all the time

The graph below charts the predictive accuracy of the point spread season by season. In other words, how often the Vegas favorite actually won.

If you can spot a trend in that scatter of data points, you may have a future in reading palms. What the data seems to show is that Vegas picks winners correctly in about 2 out of 3 games, and while there are some up years and down years, they don't seem to be getting any better or worse. If you split the data in half, the point spread accuracy was 66.7% from 1989-2000 and 66.9% from 2001-2012.
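To make the "picking winners" calculation concrete, here is a minimal sketch. The games list is made up for illustration, and I'm assuming a simplified record format where each game is a (spread, actual_margin) pair expressed from the favorite's perspective; the real archived data at sportsdatabase.com is structured differently.

```python
# Hypothetical sample games: (spread, actual_margin), both from the
# favorite's perspective. spread = points the favorite was expected to
# win by; actual_margin = how much they actually won (or lost) by.
games = [
    (7.0, 10),   # favorite won and covered
    (3.5, -3),   # favorite lost outright
    (6.0, 1),    # favorite won, didn't cover
    (2.5, -7),   # favorite lost outright
    (9.0, 14),   # favorite won
    (4.0, 3),    # favorite won
]

# "Picking the winner" means the favorite won outright (margin > 0).
wins = sum(1 for spread, margin in games if margin > 0)
accuracy = wins / len(games)
print(f"Favorite won {accuracy:.1%} of games")  # prints "Favorite won 66.7% of games"
```

Pushes (a margin of exactly zero against a pick'em line) and how to handle them would be a judgment call on the real data; here any non-positive margin counts as a miss.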

Margin of Victory

Binary predictions like win/loss can be noisy, so maybe a clearer signal will emerge by focusing on how closely the point spread predicts margin of victory? The graph below charts, by season, the Mean Absolute Error (MAE) of the point spread (i.e. the absolute value of the difference between the point spread and the actual margin of victory).
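The MAE calculation itself is a one-liner. Using the same hypothetical (spread, actual_margin) record format as before (both values from the favorite's perspective, data made up for illustration):

```python
# Hypothetical sample games: (spread, actual_margin) from the favorite's
# perspective.
games = [(7.0, 10), (3.5, -3), (6.0, 1), (2.5, -7), (9.0, 14), (4.0, 3)]

# MAE = average of |predicted margin - actual margin| over all games.
errors = [abs(spread - margin) for spread, margin in games]
mae = sum(errors) / len(errors)
print(f"MAE: {mae:.1f} points")  # prints "MAE: 5.0 points"
```

On the six toy games above the MAE works out to 5.0; the ~10.3-point figure below comes from the full 1989-2012 dataset.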

Once again, there doesn't appear to be any improvement, with the average miss at around 10.3 points. The average miss was 10.2 points from 1989-2000 and 10.5 points from 2001-2012.

Uncertainty is a feature, not a bug

Maybe it just isn't possible to consistently predict the outcome of NFL games more accurately than the long-term average of 66.8%. The NFL (like all professional sports) owes its popularity to the right combination of drama and athleticism. Unpredictability is there by design. The rules of the game, both on the field (the two-minute warning, clock stoppages late in games) and off the field (draft position, salary caps), are crafted with that need for drama in mind. A league in which 80% of games could be predicted accurately ahead of time would simply not be as successful.

Bonus graph: accuracy by week of the season

From the same dataset, I was also able to summarize the accuracy of the Vegas point spread by week of the season. Once again, the results surprised me: accuracy doesn't seem to improve as the season goes on. From my data-centered view of things, I would have expected better results in the second half of the season. There's only so much one can learn (one would assume) from analyzing roster moves, practices, and preseason games, so I expected early-season games to be more of a crapshoot. Not the case:
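The week-by-week summary is just a bucket-and-average over the same per-game records. A minimal sketch, assuming each game is a hypothetical (week, favorite_won) pair (the data here is made up for illustration):

```python
from collections import defaultdict

# Hypothetical per-game records: (week_number, favorite_won_outright).
games = [
    (1, True), (1, False), (1, True),
    (2, True), (2, True),
    (3, False), (3, True), (3, False),
]

# Bucket results by week, then average each bucket.
by_week = defaultdict(list)
for week, won in games:
    by_week[week].append(won)

weekly_accuracy = {week: sum(results) / len(results)
                   for week, results in sorted(by_week.items())}
for week, acc in weekly_accuracy.items():
    print(f"Week {week}: {acc:.1%}")
```

On the real dataset you'd have 24 seasons of games per week bucket, which keeps the per-week estimates from being too noisy.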