Thursday, December 27, 2012

Expected Points and Relative Team Strength

This is a continuation of a series of posts on in-game cover probability.  The purpose of this post is to show how relative team strength (as reflected in the point spread) can be factored into the concept of Expected Points.

In my kickoff post on the topic, I laid out two enhancements to the Advanced NFL Stats Win Probability Model that would be needed in order to develop a proper cover probability model:

1. Probability estimates that factor in the a priori relative team strength implied by the point spread.
2. Estimates of scoring margin distribution, not just a binary win/loss probability.

I have not made much progress on either front at this point, largely due to the difficulties in just recreating Brian Burke's original work on Win Probability.  The problem is the sparseness of the data.  There just aren't enough actual game situations that feature, for example, a team down by 4, with 3rd and 2 at their 36 yard line and 2:30 to go in the 3rd quarter.  So, some careful interpolation is needed. Adding in the point spread as an additional independent variable only makes the data that much sparser and requires even more care when interpolating.  Adding margin distributions on top of that gives me LOST-style nosebleeds if I think about it too much.

Expected Points

So, I decided to scale things back and try to work through a problem that is a bit simpler and suffers far less from sparseness of data: Expected Points.

Expected Points quantifies the expected net point differential for the team with possession, given down, yards to go, and field position.  The way it is calculated is simple, yet brilliant: for each possible state (down, to go, yardline), track the ultimate score outcome.  That could be a touchdown on that same drive, a field goal for the opposition after punting, a touchdown after trading punts, and so on.  After calculating the average net point differential for each state, you apply some LOESS smoothing to the data and you're done.
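To make the averaging step concrete, here is a minimal Python sketch.  It is not Brian Burke's code or mine from this analysis; the function name, the tuple layout, and the convention that yardline means yards from the opponent's end zone are all assumptions for illustration, and the sample rows are made up.  It covers only the raw averages, before any smoothing.

```python
from collections import defaultdict

def raw_expected_points(plays):
    """Average net point outcome for each (down, to_go, yardline) state.

    `plays` is an iterable of (down, to_go, yardline, net_points) tuples.
    net_points is the ultimate score outcome from the offense's point of
    view, e.g. +7 for its own touchdown, -3 for an opponent field goal.
    (Hypothetical layout, assumed for this sketch.)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for down, to_go, yardline, net_points in plays:
        state = (down, to_go, yardline)
        totals[state] += net_points
        counts[state] += 1
    # Unsmoothed average per state; sparse states will be very noisy.
    return {state: totals[state] / counts[state] for state in totals}

# Made-up illustration rows, not real play-by-play data.
plays = [
    (1, 10, 80, 7),
    (1, 10, 80, -3),
    (1, 10, 80, 0),
    (3, 2, 36, 3),
]
ep = raw_expected_points(plays)
```

With real play-by-play data, many states will have only a handful of observations, which is why the smoothing step matters.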

For additional background on the concept see these posts at Advanced NFL Stats.

Factoring in the Point Spread

The Advanced NFL Stats Expected Points model is based on league averages, but one would expect a 7 point favorite to have higher expected points at any spot on the field than, say, a 7 point underdog.  I decided to see if I could derive that from actual NFL data and point spreads.

My first task was to recreate the ANS Expected Points Model as best I could.  I compiled the play by play data (generously provided here by Brian Burke) and ran separate LOESS models for each down, using two independent variables: yardline and distance to go.  My results seemed fairly consistent with the ANS Expected Points (give or take a few tenths of a point) so it appeared I was on the right track.

The next step was to add a third independent variable, the point spread, and rerun the LOESS analysis.  I plan on sharing the full results in a separate, related post, but for now, here are the results for 7 point favorites and 7 point underdogs.  The chart below shows expected points for 1st and 10 at the given yardline.
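For readers who want to see what a LOESS-style fit with point spread as a predictor might look like, here is a rough sketch using plain NumPy.  It is a simplified locally weighted linear regression with tricube weights, not the exact routine used for this post.  The function name, the `frac` parameter, the rescaling of predictors, and the convention that a positive spread means the offense is favored are all assumptions.  The data is a synthetic surface invented purely to check the code, not an NFL result.

```python
import numpy as np

def loess_predict(X, y, x0, frac=0.5):
    """Locally weighted linear fit evaluated at a single query point x0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = np.asarray(x0, dtype=float)

    # Rescale each predictor so yards and spread points are comparable.
    scale = X.std(axis=0)
    scale[scale == 0] = 1.0
    dist = np.sqrt((((X - x0) / scale) ** 2).sum(axis=1))

    # Keep the nearest `frac` share of observations (at least p + 1 of them).
    k = min(len(y), max(int(np.ceil(frac * len(y))), X.shape[1] + 1))
    idx = np.argsort(dist)[:k]
    h = dist[idx].max()
    if h == 0:
        h = 1.0
    # Tricube kernel: weight falls to zero at the edge of the neighborhood.
    w = (1.0 - (dist[idx] / h) ** 3) ** 3

    # Weighted least squares on [1, x - x0]; the intercept is the fit at x0.
    A = np.hstack([np.ones((k, 1)), X[idx] - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y[idx] * sw, rcond=None)
    return float(beta[0])

# Synthetic check data on an invented linear surface (NOT real NFL results).
yardlines = np.arange(5, 100, 5)          # yards from the opponent's end zone
spreads = np.array([-7, -3, 0, 3, 7])     # positive = offense is favored
grid = np.array([(yd, sp) for yd in yardlines for sp in spreads], dtype=float)
points = 6.0 - 0.05 * grid[:, 0] + 0.1 * grid[:, 1]

favorite_ep = loess_predict(grid, points, (80, 7))
underdog_ep = loess_predict(grid, points, (80, -7))
```

Because the synthetic surface is exactly linear, a local linear fit should reproduce it, which makes a handy sanity check before pointing the function at noisy play-by-play data.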

I've charted both the raw data as well as the LOESS fit. The data is noisy, but there is a clear difference in expected points.  Note how the gap widens the further away from the endzone you get.  Even a bad team is pretty likely to score a touchdown when given the ball at their opponent's one yard line with first down.  But start that team 80 yards away from the endzone, and the cumulative impact of multiple plays results in a wider divergence in expected point differential.

The Kickoff Penalty

A key callout on the graph above: when calculating expected points, I assumed that touchdowns were worth 7 points and field goals were worth 3 points.  In his work on expected points, Brian Burke rightly points out that touchdowns are worth closer to 6.4 points and field goals 2.4 points because you have to kick the ball off to the opposing team, with the average field position resulting in roughly 0.6 expected points for the receiving team (this number may be outdated as I think it is based on kickoffs from the 30 yard line).

The interesting thing about point spread-dependent expected points is that the kickoff penalty varies depending on the point spread.  For a seven point underdog, the penalty to kick off to its 7 point superior is 1.5 points, meaning that a touchdown is actually worth 5.5 points and a field goal 1.5 points.  On the flip side, when a seven point favorite kicks off to a seven point underdog, the underdog's expected points are negative (-0.8), meaning that a touchdown is actually worth 7.8 points and a field goal 3.8 points.
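The arithmetic behind those numbers is just the nominal score minus the receiving team's expected points on the ensuing kickoff.  A small sketch, with a hypothetical function name and the 7 and 3 point nominal values assumed as in the chart above:

```python
def net_score_values(receiver_ep, touchdown=7.0, field_goal=3.0):
    """Net value of a touchdown and a field goal after the kickoff penalty.

    receiver_ep is the expected points of the team receiving the kickoff,
    from that team's point of view.
    """
    return touchdown - receiver_ep, field_goal - receiver_ep

# Kickoff penalties quoted in the text.
underdog_td, underdog_fg = net_score_values(1.5)    # underdog kicks to favorite
favorite_td, favorite_fg = net_score_values(-0.8)   # favorite kicks to underdog
league_td, league_fg = net_score_values(0.6)        # league-average penalty
```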

Implications for Game Strategy

One potential use for these results could be to optimize game strategy and play calling when a team knows it is a heavy favorite or heavy underdog.  Using the results above on kickoff penalties, to a seven point underdog, a touchdown is worth nearly four times as much as a field goal (5.5 points vs. 1.5 points).  To a seven point favorite, a touchdown is worth about twice as much as a field goal (7.8 points vs. 3.8 points).
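To see how the touchdown-to-field-goal ratio moves with the kickoff penalty, here is a short sketch.  The function name is hypothetical, and the penalties are the ones quoted in this post (1.5, 0.6, and -0.8).

```python
def td_to_fg_ratio(kickoff_penalty, touchdown=7.0, field_goal=3.0):
    """Ratio of net touchdown value to net field goal value."""
    return (touchdown - kickoff_penalty) / (field_goal - kickoff_penalty)

ratios = {
    "7 point underdog": td_to_fg_ratio(1.5),
    "league average": td_to_fg_ratio(0.6),
    "7 point favorite": td_to_fg_ratio(-0.8),
}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}")
```

The ratio climbs quickly as the penalty approaches 3 points, since a field goal followed by a kickoff is then worth almost nothing on net.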

Next Steps

In a companion post, I plan on publishing a tool that will output expected points for any chosen point spread.  I will then see if these results can be used in a proper cover-probability model.  It may turn out that this post is just a "proof of concept" to show how point spread can be factored into core statistical concepts like Expected Points.

I may also go in the other direction and look at First Down Probability as a function of down, distance, and point spread.  We know the league average conversion rate for a 3rd and 5 (47%).  How does that vary for a seven point underdog vs. a seven point favorite?