Thursday, December 4, 2014

My Team's Proxy Can Beat Up Your Team's Proxy

In a few days, the College Football Playoff Selection Committee will release their final team rankings. The committee ranks the top 25 teams in the nation, but it's the top four that everyone will pay attention to.

As a public service/argument starter, I have created a comparison tool for any two teams currently ranked in the top 25 by the Selection Committee. The purpose of this tool is to find "connections" between the two teams, with the goal of determining which of the two is superior.

The basic idea is to view the combined output of all college football games as a map. But the terrain we are mapping has some peculiar and inconsistent topology. Let's take the week 12 matchup between Alabama and Mississippi State. Alabama won that game by five points. One way to interpret that is to say that Alabama is five points "better" than Mississippi State. Or, to use our map analogy, "Mount Alabama" is 5 points higher in elevation than "Mount Mississippi State".

What if we wanted to do the same comparison between Alabama and TCU? Alabama and TCU have not played each other this year (yet), but we can use our "score map" to connect them. Alabama beat West Virginia by 10 points in week 1. And West Virginia lost to TCU by one point in week 10. So, based on that particular path, Alabama is 9 points better than TCU (plus 10 points minus 1 point).

As it turns out, that is the only path of length two connecting Alabama and TCU. So how about paths of length three? Here we go: Alabama beat Texas A&M by 59 points. Texas A&M beat Southern Methodist by 52 points. Southern Methodist lost to TCU by 56 points (in the house that Jack built). 59 + 52 - 56 = 55. So, Alabama is 55 points better than TCU, right? Maybe not. That's not the only path between those teams. Here's another:

Alabama beat West Virginia by 10 points. West Virginia lost to Texas by 17 points. And Texas lost to TCU by 38 points. 10 - 17 - 38 = -45. According to that path TCU is the superior team, to the tune of 45 points.

There are actually twelve distinct paths of length three connecting Alabama and TCU. The two examples above are the minimum and maximum values. The average value of those twelve paths is 4.3 points, in favor of TCU. And we can keep going.
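
For anyone who wants to check the arithmetic, here is a minimal Python sketch that rebuilds the chains above using only the games quoted in this post. The names `GAMES`, `build_edges` and `path_differentials` are made up for illustration, and the rule that a team cannot appear twice in one chain is an assumption, not necessarily how the interactive tool counts paths. Because this toy schedule contains only eight games, it finds just the two length-three chains spelled out above rather than all twelve.

```python
from collections import defaultdict

# Games quoted in this post, as (winner, loser, margin of victory).
GAMES = [
    ("Alabama", "Mississippi State", 5),
    ("Alabama", "West Virginia", 10),
    ("TCU", "West Virginia", 1),
    ("Alabama", "Texas A&M", 59),
    ("Texas A&M", "Southern Methodist", 52),
    ("TCU", "Southern Methodist", 56),
    ("Texas", "West Virginia", 17),
    ("TCU", "Texas", 38),
]


def build_edges(games):
    """Map each team to a list of (opponent, margin from that team's side)."""
    edges = defaultdict(list)
    for winner, loser, margin in games:
        edges[winner].append((loser, margin))
        edges[loser].append((winner, -margin))
    return edges


def path_differentials(edges, start, end, length):
    """Return (path, differential) for each chain of `length` games from start to end.

    Assumption: a team may not appear twice in the same chain.
    """
    results = []

    def walk(team, path, total):
        if len(path) - 1 == length:
            if team == end:
                results.append((path, total))
            return
        for opponent, margin in edges[team]:
            if opponent not in path:
                walk(opponent, path + [opponent], total + margin)

    walk(start, [start], 0)
    return results


edges = build_edges(GAMES)
for length in (2, 3):
    for path, diff in path_differentials(edges, "Alabama", "TCU", length):
        print(length, " > ".join(path), diff)
```

A positive differential favors the first team (Alabama here), a negative one favors the second.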

There are 133 distinct paths of length four. I am not going to list them out, but the interactive tool will do that for you. The average point differential for those 133 paths is 3.1 points, once again in favor of TCU. As we increase the path length, the number of paths increases exponentially, and the paths soon become far too numerous to list. There are more distinct paths of length 20 between Alabama and TCU than there are seconds elapsed since the big bang.

So, while I can't specifically spell out each path, I was able to come up with a simple set of matrix operations that allows me to calculate the number of paths and average point differentials for an arbitrarily large path length. I have summarized those results in the "Chart" tab of the interactive tool. Here is the Alabama-TCU chart:

A few things to notice as we increase the path length:
  • There is often a zigzag pattern to the average point differential as we progress along the horizontal axis. It is not readily apparent in our Alabama-TCU example, but shows up more clearly for Alabama-Auburn.
  • The point differential eventually converges upon a single number as we increase path length. For Alabama-TCU, TCU looks better for shorter path lengths, but Alabama eventually overtakes them around path length 13, and the final value appears to converge to 1.9 points in favor of the Crimson Tide.
  • As the path length increases, the differential approaches what one would consider a more standard additive ranking. For small path lengths, these comparisons are only valid pairwise: just because Team A is 5 points better than Team B and Team B is 5 points better than Team C at a path length of two, there is no guarantee that Team A is 10 points better than Team C. But for paths of length 50, this is the case. For example, Alabama is 6.7 points better than Oregon, and Oregon is 9.2 points better than Florida State. So, one would expect the Alabama-Florida State differential to converge to 15.9 points (= 6.7 + 9.2). Which it does.
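
One way matrix operations can do this kind of bookkeeping is sketched below. This is an illustration under stated assumptions, not necessarily the exact operations behind the tool. Let A count the games played between each pair of teams and M hold the summed margins from the row team's point of view. If N_k counts the k-game chains between each pair and S_k sums their differentials, then N_k = N_(k-1) A and S_k = S_(k-1) A + N_(k-1) M, and the average differential is S_k divided by N_k entry by entry. Note that this recurrence counts chains in which teams may repeat, which is an assumption of the sketch. The names `matmul`, `chain_stats` and `average_differential` are hypothetical, and the schedule is just the handful of games quoted in this post.

```python
# Games quoted in this post, as (winner, loser, margin of victory).
GAMES = [
    ("Alabama", "Mississippi State", 5),
    ("Alabama", "West Virginia", 10),
    ("TCU", "West Virginia", 1),
    ("Alabama", "Texas A&M", 59),
    ("Texas A&M", "Southern Methodist", 52),
    ("TCU", "Southern Methodist", 56),
    ("Texas", "West Virginia", 17),
    ("TCU", "Texas", 38),
]


def matmul(X, Y):
    """Multiply two square matrices stored as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def chain_stats(games, max_length):
    """Return (index, stats) where stats[k] = (N_k, S_k).

    N_k[i][j] is the number of k-game chains from team i to team j and
    S_k[i][j] is the sum of their point differentials. Chains may revisit
    teams (an assumption of this sketch).
    """
    teams = sorted({team for winner, loser, _ in games for team in (winner, loser)})
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    A = [[0] * n for _ in range(n)]  # games played between each pair
    M = [[0] * n for _ in range(n)]  # summed margins, row team's perspective
    for winner, loser, margin in games:
        i, j = index[winner], index[loser]
        A[i][j] += 1
        A[j][i] += 1
        M[i][j] += margin
        M[j][i] -= margin

    N = [[int(i == j) for j in range(n)] for i in range(n)]  # N_0 = identity
    S = [[0] * n for _ in range(n)]                          # S_0 = zeros
    stats = {}
    for k in range(1, max_length + 1):
        # Extend every (k-1)-game chain by one more game, using the old N.
        SA, NM = matmul(S, A), matmul(N, M)
        S = [[SA[i][j] + NM[i][j] for j in range(n)] for i in range(n)]
        N = matmul(N, A)
        stats[k] = (N, S)
    return index, stats


def average_differential(index, stats, team_a, team_b, length):
    """Return (chain count, average differential or None if there are no chains)."""
    N, S = stats[length]
    i, j = index[team_a], index[team_b]
    count = N[i][j]
    return count, (S[i][j] / count if count else None)


index, stats = chain_stats(GAMES, 3)
for length in (1, 2, 3):
    print(length, average_differential(index, stats, "Alabama", "TCU", length))
```

Plain Python lists are used here so the chain counts stay exact integers however large they get. A real schedule would simply supply a longer list of games.
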
So, is there a point to this? To be honest, I'm not sure. I'm a big believer in building what is interesting first, and worrying about practical use later. But let's talk practicalities:

Do these numbers have predictive value? I intend to test that out in a future post. It will be interesting to see whether the low path length / low volume differentials correlate better with future game outcomes than the high path length / high volume differentials. You can argue it both ways. The small path length differentials are made up of more immediate connections between the teams, and therefore should be more relevant. But the sample size tends to be small (head to head is often 0 or 1, or in very rare cases, 2). For large path lengths, you can traverse the full college football landscape and average over very large numbers of games, which would presumably help you filter out the noise in the actual results.

The current tool is for entertainment purposes, and ignores things like home field advantage. But that correction can be easily made by subtracting 3.5 points from the home team's margin. You could also smooth the margin results to make them less volatile, and presumably more predictive (like Chase Stuart does for his SRS College Football Rankings). Or you could ignore margin entirely and score wins as +1 and losses as -1 to get a victory-based ranking only.
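
As a rough illustration of the first and last of those adjustments, a per-game helper might look like the sketch below. The function name `adjusted_margin` and its parameters are hypothetical, the neutral-site switch is an added assumption, and no smoothing is attempted here.

```python
def adjusted_margin(home_margin, home_edge=3.5, neutral_site=False, wins_only=False):
    """Return the home team's margin after the optional adjustments.

    home_margin: home score minus away score.
    home_edge: points credited to home field (3.5 is the figure quoted above).
    neutral_site: skip the home-field correction (an assumption of this sketch).
    wins_only: ignore margin and return +1 for a win, -1 for a loss, 0 for a tie.
    """
    if wins_only:
        return (home_margin > 0) - (home_margin < 0)
    if neutral_site:
        return home_margin
    return home_margin - home_edge


print(adjusted_margin(10))                   # home team won by 10
print(adjusted_margin(-1))                   # home team lost by 1
print(adjusted_margin(-17, wins_only=True))  # result only, margin ignored
```

The adjusted values would replace the raw margins before any paths are averaged.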

I could also use Vegas point spreads instead of scoring margins, much in the same way I use them for my betting market rankings. It turns out that this approach (and its matrix implementation) is very amenable to weighting systems. You can assign a weight to each game (e.g. weight recent games more heavily), and the weight for any particular path is just the product of its games' weights, with the average differential being just a weighted average across all paths.
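
To make the weighting concrete, here is a small sketch that applies it to the two length-three Alabama-TCU paths described earlier. The per-game weights are invented purely for illustration, and `weighted_average_differential` is a hypothetical helper name.

```python
from math import prod


def weighted_average_differential(paths):
    """Weighted average differential over a list of paths.

    Each path is a list of (margin, weight) pairs, one per game. A path's
    weight is the product of its game weights; its differential is the sum
    of its margins.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for path in paths:
        path_weight = prod(weight for _, weight in path)
        differential = sum(margin for margin, _ in path)
        total_weight += path_weight
        weighted_sum += path_weight * differential
    return weighted_sum / total_weight


# Margins are from the post; the weights are made up for this example.
paths = [
    [(59, 1.0), (52, 0.5), (-56, 1.0)],  # via Texas A&M and Southern Methodist
    [(10, 0.5), (-17, 0.5), (-38, 1.0)],  # via West Virginia and Texas
]
print(weighted_average_differential(paths))
```

With every weight set to 1, this reduces to the plain average of the path differentials.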

I found the Alabama-Oregon matchup page interesting. There is a fairly broad consensus that these are the two best teams in college football: Alabama with the best defense (or maybe second best to Stanford) and Oregon with the best offense (or maybe second best to Baylor). If you look at the matchup tables, there are very few connections between these two teams. Not only are there no paths of length two connecting them, there are no paths of length three either. And there are only eight paths of length four, requiring you to wind your way from the SEC to the Big 12 to the Big Ten to the Pac-12 to get there. With the Ducks and the Crimson Tide largely operating in separate social circles this season, it seems only fair that we see them collide in the National Championship game.

I'll update this tool with Saturday's results (and the new top 25 from the committee). If there is interest, I could also do matchup pages for all the Bowl games once they are announced.


  1. This is great, Michael! I have seen some of your posts through pick monitor on FB and elsewhere and am always impressed at what you put together.

    I was wondering if you have considered doing something like this with a similar teams analysis like numberfire and some other sites use. You may have already done this somewhere else; if so, let me know! My thought would be you could use a weighting for how similar a team is to a previous team and find the match-ups between those 2 similar teams. For example, if Oregon 2014 matches Oregon 2013 95% and Alabama 14 matches Bama 13 94%, then run the same path lengths analysis for those two teams. I think this would allow you to expand to more paths of 1, 2, and 3 for two teams like these that don't play in the same circles. The best case would be to find that Oregon and Bama have at least somewhat similar teams that played each other head to head. Right now, Numberfire shows Oregon having a 92% match with 2007 Kentucky, who would have a number of at minimum length 2 matchups with Bama that year, so comparing the strength of similarity between this Bama team and that Bama team would give you a large sample to look at.

    Again, you may have already done this, but I just thought it was interesting, great work!

    1. Thanks! It's an interesting idea. I hadn't given much thought to "similarity" measures. I think I will probably start simple (focusing on margin), but I can see a lot of different ways to apply this general idea.
