Here are the rankings as of July 17:
GPF stands for "Generic Points Favored". It is what you would expect a team to be favored by against a league average opponent on a neutral court. By combining the betting over/under with the point spread, I can decompose GPF into its offensive and defensive components, oGPF and dGPF (note: offense and defense are expressed in points per game rather than points per possession, since there is no way to derive implied per-possession metrics from the betting data). GOU stands for "Generic Over/Under" and it is what you would expect the betting over/under to be for that team when playing an average opponent.
These rankings largely track win-loss records, although there are some differences. The defending champion Minnesota Lynx are at the top of the rankings, despite being several games behind the Los Angeles Sparks in the overall standings. The Atlanta Dream hold a better record than the Phoenix Mercury, but would be 3-point underdogs against them on a neutral court.
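To make the spread plus over/under idea concrete, here is a minimal sketch of the basic arithmetic for a single game: the over/under is the implied sum of the two scores and the spread is the implied difference, so each side's implied points follow directly. The function name and the 8-point spread with a 160-point total are hypothetical, and this is only the per-game step, not the full oGPF/dGPF calculation.

```python
def implied_team_points(favorite_margin, over_under):
    """Split one game's over/under into each side's implied points.

    favorite_margin: points the favorite is expected to win by (spread).
    over_under:      expected combined points for both teams.
    Both inputs here are illustrative, not real lines.
    """
    favorite_points = (over_under + favorite_margin) / 2
    underdog_points = (over_under - favorite_margin) / 2
    return favorite_points, underdog_points


# Hypothetical line: favorite by 8 with a total of 160.
fav, dog = implied_team_points(8.0, 160.0)
print(fav, dog)  # 84.0 76.0
```

Feeding implied points like these into team-level offense and defense terms is what allows a split into offensive and defensive components.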
Mathematically, these rankings work very similarly to the NBA rankings. Home court advantage is worth 3.25 points to the point spread, consistent with the NBA. Unlike my NBA rankings, I do not make any adjustments for teams playing on no days of rest (not enough time to do it properly).
The model itself is a fairly straightforward weighted linear regression of the form:
- Team A - Team B = point spread
Each team is assigned a dummy variable and the point spread is adjusted for home court advantage. For example, the Los Angeles Sparks are 8-point favorites on the road today against the Atlanta Dream. If home court advantage is worth 3.25 points, then the point spread would imply that the market thinks the Sparks are 11.25 points "better" than the Dream.
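The home court adjustment in that example can be sketched in a few lines. The function and argument names are my own for illustration; the 3.25 and the Sparks/Dream numbers come from the text above.

```python
HOME_COURT_ADVANTAGE = 3.25  # points, as stated above


def neutral_court_margin(favorite_margin, favorite_is_home):
    """Convert a posted spread into a neutral-court margin for the favorite.

    A home favorite's spread already includes the home court edge, so it is
    subtracted; a road favorite is overcoming that edge, so it is added back.
    """
    if favorite_is_home:
        return favorite_margin - HOME_COURT_ADVANTAGE
    return favorite_margin + HOME_COURT_ADVANTAGE


# Sparks favored by 8 on the road against the Dream.
print(neutral_court_margin(8.0, favorite_is_home=False))  # 11.25
```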
Because I want to know what the market is thinking now, I weight recent games more heavily. In my original formulation of these market rankings for the NFL, I used a simple 3-2-1 weighting for the most recent three weeks of games. Eventually though, I hit upon a more theoretically sound approach that also resulted in better accuracy in predicting future lines. The weights used in these rankings have the following form:
- weight = 1 / (elapsed games + 0.25)
For example, for the most recent game for a particular team, the elapsed games is zero, and the weight would be 4 (= 1/(0 + 0.25)). For the game immediately prior to that, the elapsed games is 1, so the weight would be 0.8 (= 1/(1 + 0.25)). And so on.
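As a quick illustration of how fast these weights fall off, here is a small sketch that tabulates the first few values. The function name is hypothetical; the formula and the 0.25 constant are the ones given above.

```python
def spread_weight(elapsed_games, constant=0.25):
    """Weight on a point spread from a game `elapsed_games` games ago."""
    return 1.0 / (elapsed_games + constant)


for k in range(4):
    print(k, round(spread_weight(k), 3))
# 0 4.0
# 1 0.8
# 2 0.444
# 3 0.308
```

The most recent spread carries five times the weight of the one before it, so the rankings react quickly to the latest lines.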
The form of the weight function, 1 / (elapsed games + constant), wasn't arrived at arbitrarily. It turns out you can derive the weight function by assuming the market's evaluation of team strength follows a random walk process. Under that assumption, the error term for a more distant game is larger than that for a more recent one. Using that modified error term and then deriving the linear regression equations via maximum likelihood results in a weighted regression with a weight term of the form above.
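One way to sketch that derivation, under assumptions of my own choosing for illustration: errors are Gaussian, the market's rating difference between two teams, θ, takes independent random-walk steps of variance σ_w² per game, and each posted spread y_k (from k games ago) observes θ at that time with noise variance σ_ε². The sketch also treats the errors for different games as independent, which is an approximation.

```latex
\begin{aligned}
y_k &= \theta_{t-k} + \epsilon_k, \qquad \epsilon_k \sim \mathcal{N}(0,\ \sigma_\epsilon^2) \\
\theta_t - \theta_{t-k} &\sim \mathcal{N}(0,\ k\,\sigma_w^2) \\
\Rightarrow\quad y_k - \theta_t &\sim \mathcal{N}(0,\ \sigma_\epsilon^2 + k\,\sigma_w^2) \\
-\log L(\theta_t) &= \sum_k \frac{(y_k - \theta_t)^2}{2\,(\sigma_\epsilon^2 + k\,\sigma_w^2)} + \text{const} \\
w_k &= \frac{1}{\sigma_\epsilon^2 + k\,\sigma_w^2} \;\propto\; \frac{1}{k + \sigma_\epsilon^2/\sigma_w^2}
\end{aligned}
```

Minimizing the negative log-likelihood is a weighted least squares problem with weights w_k. Read this way, the constant in the denominator plays the role of the ratio σ_ε²/σ_w²: a small constant means the noise in a single observation is small relative to how much team strength drifts per game.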
Random walk processes are sometimes referred to as a "drunkard's walk", and we can extend that analogy to this model. Imagine you are trying to figure out where your bar-hopping, inebriated friend is from a series of drunken texts they have sent you. The five-minute-old text is a more reliable indicator of their location than the text you got from them an hour ago. But neither is completely reliable because, well, they're drunk. Today's point spreads are like that most recent text, and are given the most weight in the model.
I also include actual game results in the rankings because I found that it helps improve accuracy in predicting the Vegas point spread. However, the error in a single game's result is far greater than that of the market point spread, so game results are weighted significantly less. The weight function is of the same form, but with a different denominator:
- results weight = 1 / (elapsed games + 3.5)
The 3.5 term in the denominator means that the most recent game's result is given less than 10% of the weight that we give its point spread. Both parameters in the weight function were chosen because they optimized the accuracy of predicting future point spreads.
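Putting the pieces together, here is a minimal sketch of a weighted regression that mixes point spreads and game results. It is not my production code: the three teams, the margins, and the use of a single "elapsed games" value per observation are all made up for illustration, and NumPy's least-squares solver stands in for whatever regression routine you prefer.

```python
import numpy as np


def spread_weight(elapsed_games):
    return 1.0 / (elapsed_games + 0.25)


def results_weight(elapsed_games):
    return 1.0 / (elapsed_games + 3.5)


# Hypothetical data: (team_a, team_b, neutral-court margin for team_a,
# elapsed games, kind). Margins are already adjusted for home court.
observations = [
    ("A", "B", 5.0, 0, "spread"),
    ("B", "C", 2.0, 1, "spread"),
    ("A", "C", 7.0, 2, "spread"),
    ("A", "B", 12.0, 1, "result"),
    ("B", "C", -3.0, 2, "result"),
]


def fit_ratings(observations):
    teams = sorted({team for obs in observations for team in obs[:2]})
    col = {team: i for i, team in enumerate(teams)}
    X = np.zeros((len(observations), len(teams)))
    y = np.zeros(len(observations))
    w = np.zeros(len(observations))
    for i, (a, b, margin, elapsed, kind) in enumerate(observations):
        X[i, col[a]] = 1.0   # dummy variable for team A
        X[i, col[b]] = -1.0  # dummy variable for team B
        y[i] = margin
        w[i] = spread_weight(elapsed) if kind == "spread" else results_weight(elapsed)
    # Weighted least squares: scale each row by the square root of its weight.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    beta -= beta.mean()  # center so a league average team rates 0
    return {team: float(b) for team, b in zip(teams, beta)}


ratings = fit_ratings(observations)
share = results_weight(0) / spread_weight(0)
print(ratings)
print(round(share, 3))  # 0.071, i.e. about 7% of the spread's weight
```

Only rating differences are identified by a model of this form, so the sketch centers the ratings at zero, which matches reading GPF as points favored against an average opponent.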
I'm glossing over some of the technical details in the modeling here, but if there is interest, I can lay this out a bit more formally in a future post. As with my other market rankings, these update daily with the latest game results and point spreads.