Monday, January 30, 2012

College Basketball Team Rankings - January 30, 2012



Here is the inaugural version of the Betting Market Power Rankings for Men's College Basketball. I first created these for the NFL at the Advanced NFL Stats Community site.  I added the NBA earlier this month when I launched this blog. After some tinkering, I've come up with an approach that works for college basketball.  


I initially attempted to apply the same method I use for the NFL and NBA rankings to College Basketball, but after testing out the results, I realized that something different was required.  The issue with college basketball is that from early January to early March, most teams are only playing within their own conference.  My NFL/NBA method only looks back a few weeks to determine the relative strength of each team.  But if the Big Ten teams have only been playing against other Big Ten teams and the Colonial Athletic Association (CAA) teams are only playing other CAA teams, my method has no idea where to rank the relative strength of these two conferences.  



I needed a method that had "memory": one that would retain the information it obtained early in the season (Nov-Dec) about inter-conference strength, while still updating intra-conference strength appropriately in January through March.  The method I arrived at works very similarly to an Elo rating system (originally developed for chess, but since applied across a broad spectrum of competitive contests).  


From the Wikipedia link on Elo rating:
"Each player has a rating, which is a number. A higher number indicates a better player, based on their results against other rated players. The winner of a contest between two players gains a certain number of points in his rating and the losing player loses the same amount. The number of points won or lost in a contest depends on the difference in the ratings of the players, so a player will gain more points by beating a higher-rated player than by beating a lower-rated player."
The rankings update themselves in "real time" after the outcome of each match. Contrast that with my NBA method, which effectively bootstraps its way into a set of rankings by looking at all recent games simultaneously and doesn't require the specification of a prior set of rankings.
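For reference, here is a minimal Python sketch of the standard Elo update described in that quote. The logistic curve with a 400-point scale is the usual chess convention; the K-factor of 32 and the function names are just illustrative assumptions, not anything specific to my rankings.

```python
def elo_expected(rating_a, rating_b):
    """Expected score for player A (between 0 and 1) under the usual Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is an assumed K-factor; real systems tune this value.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    # Zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two evenly rated players: the winner gains exactly half of K.
print(elo_update(1500.0, 1500.0, 1.0))
```

Note that the update is zero-sum, and that an upset win moves the ratings more than an expected win does.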


In practice, the method I came up with for college basketball is even simpler than Elo.  Elo requires a logistic transformation to update the rankings, as you're essentially trying to map a discrete outcome (win, lose, draw) to rankings that are continuous.  However, with my ranking system, both the outcome (the point spread) and the ranking itself (GPF) are continuous (to within a half point) and on the same scale, which makes things much simpler.  Here is how the rankings are updated (I'll use a real example from this season):


Last week, Indiana played at Wisconsin.  Going into that game, Wisconsin's Generic Points Favored (GPF) was 12.5 and Indiana's was 10.5.  So, prior to knowing that game's point spread, Wisconsin was considered 2 points stronger than Indiana.  The point spread for that game had Wisconsin favored by 8 points.  Assuming that home court advantage is worth four points, that implied Wisconsin was actually considered 4 points stronger than Indiana, not 2 points.  So, we adjust the GPF of the two teams on a zero-sum basis, similar to Elo.  The combined GPF of Wisconsin and Indiana cannot change, but the gap between them needs to change to reflect a four point differential, not two.  Splitting the extra two points evenly between the teams means adjusting Wisconsin's GPF up by one point from 12.5 to 13.5 and Indiana's down by the same amount from 10.5 to 9.5.
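The Wisconsin/Indiana arithmetic can be sketched in a few lines of Python. This is only an illustration of the update as described above: the function name `update_gpf` is made up for the example, and closing the entire gap in one step is inferred from the worked numbers.

```python
HOME_COURT_ADVANTAGE = 4.0  # points, as assumed in the example above


def update_gpf(home_gpf, away_gpf, home_spread, hca=HOME_COURT_ADVANTAGE):
    """Return updated (home_gpf, away_gpf) given the market point spread.

    home_spread is the number of points the home team is favored by
    (negative if the home team is the underdog).
    """
    implied_diff = home_spread - hca    # neutral-site gap implied by the spread
    current_diff = home_gpf - away_gpf  # gap implied by the current ratings
    # Zero-sum: split the discrepancy evenly so the combined GPF is unchanged.
    shift = (implied_diff - current_diff) / 2.0
    return home_gpf + shift, away_gpf - shift


# Wisconsin (12.5) hosting Indiana (10.5) as an 8 point favorite
print(update_gpf(12.5, 10.5, 8.0))  # (13.5, 9.5)
```

Because both the input (the point spread) and the rating (GPF) are in points, no logistic transformation is needed.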


At the beginning of the season, every team starts out with a GPF of 0, and then the model just churns through each game in chronological order to update the ratings.  It takes about a month or so from the beginning of the season for the ratings to settle down into something reasonable.
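Here is a rough Python sketch of that season-long loop. The game tuples, the team names, and the `run_season` helper are hypothetical, every game is assumed to have a home team, and the small adjustment for actual game outcomes is left out.

```python
from collections import defaultdict

HOME_COURT_ADVANTAGE = 4.0  # assumed value of home court, in points


def run_season(games, hca=HOME_COURT_ADVANTAGE):
    """Process games in chronological order and return a dict of team -> GPF.

    games is an iterable of (home_team, away_team, home_spread) tuples,
    already sorted by date. Every team starts the season at a GPF of 0.
    """
    gpf = defaultdict(float)  # unseen teams default to 0.0
    for home, away, home_spread in games:
        implied_diff = home_spread - hca
        current_diff = gpf[home] - gpf[away]
        shift = (implied_diff - current_diff) / 2.0
        gpf[home] += shift  # zero-sum adjustment between the two teams
        gpf[away] -= shift
    return dict(gpf)


# Hypothetical three-team schedule, purely for illustration
schedule = [
    ("Team A", "Team B", 6.0),  # A favored by 6 at home
    ("Team B", "Team C", 4.0),  # B favored by 4 at home
]
print(run_season(schedule))
```

Since every update is zero-sum and everyone starts at 0, the ratings always sum to zero, so a GPF of 0 always corresponds to an average team.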


On a somewhat unrelated note, last year I participated in a contest at Kaggle.com in which the goal was to develop a ranking system for chess players that had better predictive power than the standard Elo system.  I was just able to crack the top ten before the deadline (I'm 'boooeee'), but was nowhere near the top 3 or 4.  The reason I mention it is Tim Salimans' winning solution.  I'm still grappling with the math, but I wonder if the ideas he came up with could be employed by the sports analytics community to better predict game outcomes, particularly in sports like college basketball where accounting for opponent strength is crucial.


Anyway, back to the rankings.  Just like the NBA/NFL rankings, I also adjust the model a bit in response to actual game outcomes.  So if a 5 point favorite actually loses by 15 points, I "nudge" the point spread down to account for this, under the assumption that the betting market does the same (remember, I'm trying to predict point spreads here, not actual wins and losses).  The size of the nudge was optimized to maximize predictive accuracy over past seasons.  For this method, the nudge is only 5% of the miss, meaning that a 5 point favorite that loses by 15 would only be adjusted down to a 4 point favorite (5% of the 20 point miss).  In contrast, the optimal "nudge factor" for the NBA/NFL rankings was 15% (but given the differences in methodology, I don't think much can be concluded from the numerical difference between the two adjustment factors).
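The nudge arithmetic looks like this in Python. It is a sketch of the spread adjustment only: the function name is invented for the example, and how the nudged spread feeds back into each team's GPF is not shown here.

```python
NUDGE_FACTOR = 0.05  # 5% of the miss, per the text above


def nudged_spread(spread, actual_margin, nudge=NUDGE_FACTOR):
    """Move the point spread a small fraction of the way toward the result.

    spread is the points the team was favored by; actual_margin is the
    points it actually won by (negative for a loss).
    """
    miss = spread - actual_margin  # how far the spread overshot the result
    return spread - nudge * miss


# A 5 point favorite that loses by 15: a 20 point miss, nudged down 1 point
print(nudged_spread(5.0, -15.0))  # 4.0
```

Passing `nudge=0.15` would reproduce the larger adjustment factor used for the NBA/NFL rankings.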


The Rankings


According to the betting market (or at least my attempt to read its mind), the best team in the nation is the Ohio State Buckeyes, with a point and a half cushion over number 2, Kentucky.  


Rather than show all 300+ Division I teams, I have cut it off at 64 teams, plus any team outside the top 64 that shows up in either the AP Top 25 or the ESPN Coaches Poll Top 25.  The only team satisfying that latter condition is undefeated Murray State (who has a good chance to remain undefeated until the tournament).  They're a top ten team according to the poll rankings, number 46 according to kenpom.com, and number 74 according to these rankings.  In general, these betting market rankings line up more closely with the KenPom rankings than with the polls.  That isn't too surprising, given that subjective poll rankings tend to be explanatory and largely track a team's record.  Obviously, with college basketball, it's not all about record, or Murray State would be #1, but the polls seem to focus more on rewarding recent wins and punishing recent losses.  In contrast, the KenPom rankings, like point spreads, are designed to be predictive, not explanatory, so it's not surprising that they are in general agreement with the betting market.


Speaking of agreement, Marquette gets the Will Smith award this week: everybody agrees on Marquette.  KenPom, the AP, and the Coaches all have them at #15 and these rankings have them at #14.


R-E-S-P-E-C-T.  Wichita State isn't getting it from the polls, despite being number 12 according to both the Betting Market and KenPom.  Saturday's 3OT loss to Drake certainly didn't help.  Prior to that loss, Wichita State's three losses were to quality opponents (Alabama, Creighton, and Temple).


Indiana has had the most difficult schedule to date, with five of their games coming against the top 10 (including two against the number one Buckeyes).


Here are the rankings (full glossary below):

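Rank Team GPF Conf (Rank) SOS (Rank) GWP KenPom AP Coaches
(KenPom, AP, and Coaches are other rankings, shown for comparison; a team unranked in a poll has no entry in that column.)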
1 Ohio St. 18.5 B10 1 5.5 10 0.96 1 3 3
2 Kentucky 17.0 SEC 1 1.5 88 0.94 3 1 1
3 North Carolina 16.5 ACC 1 3.5 43 0.94 7 5 6
4 Syracuse 16.0 BE 1 3.0 50 0.94 6 2 2
5 Kansas 16.0 B12 1 5.5 9 0.94 4 8 8
6 Michigan St. 14.5 B10 2 5.0 14 0.92 5 9 10
7 Missouri 14.5 B12 2 3.5 48 0.92 8 4 4
8 Duke 13.5 ACC 2 4.5 26 0.91 13 7 5
9 Baylor 13.5 B12 3 7.0 4 0.90 9 6 6
10 Wisconsin 13.5 B10 3 6.5 5 0.90 2 19 20
11 UNLV 12.5 MWC 1 2.0 72 0.89 21 11 13
12 Wichita St. 12.0 MVC 1 2.0 80 0.88 12
13 Florida 12.0 SEC 2 4.0 38 0.88 11 12 11
14 Marquette 11.0 BE 2 4.5 24 0.87 15 15 15
15 Kansas St. 10.5 B12 4 4.5 31 0.86 24
16 Indiana 10.5 B10 4 7.0 1 0.86 10 20 20
17 Alabama 10.5 SEC 3 5.5 12 0.85 25
18 New Mexico 10.5 MWC 2 0.0 132 0.85 23
19 California 10.5 P12 1 2.0 71 0.85 19
20 BYU 10.0 WCC 1 0.5 118 0.84 36
21 Gonzaga 10.0 WCC 2 0.5 119 0.84 31 24
22 Saint Louis 10.0 A10 1 1.0 109 0.84 14
23 Georgetown 10.0 BE 3 5.0 16 0.84 18 14 14
24 Florida St. 9.5 ACC 3 4.0 36 0.83 16 21 24
25 West Virginia 9.5 BE 4 5.5 13 0.83 28
26 Texas 9.5 B12 5 7.0 3 0.82 20
27 Louisville 9.0 BE 5 3.5 41 0.82 32 25
28 Vanderbilt 9.0 SEC 4 4.5 23 0.82 29 25
29 Creighton 9.0 MVC 2 1.5 86 0.82 27 13 12
30 Connecticut 9.0 BE 6 4.5 30 0.82 40
31 Harvard 9.0 Ivy 1 -3.0 217 0.81 30 23
32 Purdue 9.0 B10 5 3.5 44 0.81 35
33 Virginia 9.0 ACC 4 1.5 96 0.81 17 16 18
34 Long Beach St. 9.0 BW 1 1.5 87 0.81 41
35 UCLA 8.5 P12 2 1.5 100 0.81 51
36 Michigan 8.5 B10 6 6.0 7 0.81 33 23 22
37 Cincinnati 8.5 BE 7 4.0 39 0.81 50
38 Minnesota 8.0 B10 7 2.5 59 0.80 47
39 St. Mary's 8.0 WCC 3 1.0 110 0.79 22 18 16
40 Pittsburgh 8.0 BE 8 5.0 19 0.79 80
41 Illinois 7.5 B10 8 5.0 17 0.78 42
42 Memphis 7.5 CUSA 1 3.0 53 0.78 26
43 Arizona 7.5 P12 3 2.5 68 0.78 52
44 Temple 7.0 A10 2 2.0 70 0.76 37
45 Xavier 7.0 A10 3 3.5 49 0.75 58
46 Mississippi St. 6.5 SEC 5 1.0 101 0.75 60 22 19
47 Miami Florida 6.5 ACC 5 1.5 93 0.75 56
48 North Carolina St. 6.5 ACC 6 3.0 58 0.75 59
49 Washington 6.5 P12 4 3.0 55 0.74 75
50 San Diego St. 6.5 MWC 3 2.0 78 0.74 53 17 17
51 Iona 6.5 MAAC 1 -3.5 225 0.74 48
52 Virginia Tech 6.5 ACC 7 2.0 81 0.74 55
53 Seton Hall 6.0 BE 9 3.5 47 0.74 57
54 Northwestern 6.0 B10 9 7.0 2 0.74 71
55 Northern Iowa 6.0 MVC 3 1.0 107 0.73 69
56 Oregon St. 6.0 P12 5 1.0 103 0.73 66
57 VA Commonwealth 6.0 CAA 1 -2.0 177 0.73 44
58 Iowa St. 6.0 B12 6 3.5 46 0.73 34
59 Stanford 5.5 P12 6 1.5 97 0.72 62
60 Middle Tenn St. 5.0 SB 1 -3.5 224 0.71 39
61 Southern Miss 5.0 CUSA 2 0.0 124 0.70 54
62 Notre Dame 5.0 BE 10 6.0 6 0.70 63
63 Clemson 5.0 ACC 8 0.0 123 0.69 104
64 Oral Roberts 4.5 Sum 1 -2.5 206 0.69 68
74 Murray St. 4.0 OVC 1 -5.0 247 0.65 46 10 9

Glossary

  • GPF - Stands for "Generic Points Favored".  It's what you'd expect the team to be favored by against a Div-I average opponent at a neutral site.
  • Conf - The conference the team plays in and their GPF rank within that conference
  • SOS - Stands for "Strength of Schedule".  It's the average GPF of the teams faced thus far this season (along with the corresponding rank).
  • GWP - Stands for "Generic Win Probability".  I converted the GPF into a win probability using the following formula: GWP = 1/(1+exp(-GPF/6))
  • KenPom - The team's rank according to kenpom.com.  The best college basketball stats site you're gonna find.
  • AP - The team's rank (if it exists) according to the AP Top 25 poll.
  • Coaches - The team's rank (if it exists) according to the ESPN/USA Today Coaches poll.
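The GWP formula from the glossary translates directly into Python. This is just the formula as written above; the function name `gwp` is made up for the example.

```python
import math


def gwp(gpf):
    """Generic Win Probability: GWP = 1 / (1 + exp(-GPF / 6))."""
    return 1.0 / (1.0 + math.exp(-gpf / 6.0))


# Spot checks against the table: Ohio St. (GPF 18.5) and BYU (GPF 10.0)
print(round(gwp(18.5), 2))  # 0.96
print(round(gwp(10.0), 2))  # 0.84
print(gwp(0.0))             # 0.5, an average team is a coin flip
```

Keep in mind that the GPF values in the table are rounded to the nearest half point, so recomputing GWP from the displayed GPF can be off by a hundredth or so for some teams.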

2 comments:

  1. The validity of this set of rankings will be most obvious come the first round (not the zeroth round!) of the Tournament, when we move out of primarily conference play again. I hope you revisit how strongly these ELO-esque rankings predict the opening (and closing, because movement in your favor speaks to effectiveness) spreads of those 32 games.

  2. I've done back testing on past seasons, and the average error does spike a bit come tournament time. I'm testing out an approach that may or may not shrink that error.

    But I definitely plan on comparing the rankings to the tournament lines in March. I'm fairly confident in how these rankings work within conference at this point. What the tournament lines will tell me is how off these rankings are in assessing relative strengths of the various conferences.
