Thursday, December 27, 2012

Expected Points and Relative Team Strength

This is a continuation of a series of posts on in-game cover probability.  The purpose of this post is to show how relative team strength (as reflected in the point spread) can be factored into the concept of Expected Points.

In my kickoff post on the topic, I laid out two enhancements to the Advanced NFL Stats Win Probability Model that would be needed in order to develop a proper cover probability model:

1. Probability estimates that factor in the a priori relative team strength implied by the point spread.
2. Estimates of scoring margin distribution, not just a binary win/loss probability.

I have not made much progress on either front at this point, largely due to the difficulties in just recreating Brian Burke's original work on Win Probability.  The problem is the sparseness of the data.  There just aren't enough actual game situations that feature, for example, a team down by 4, with 3rd and 2 at their 36 yard line and 2:30 to go in the 3rd quarter.  So, some careful interpolation is needed. Adding in the point spread as an additional independent variable only makes the data that much sparser and requires even more care when interpolating.  Adding margin distributions on top of that gives me LOST-style nosebleeds if I think about it too much.


Expected Points

So, I decided to scale things back and try to work through a problem that is a bit simpler and suffers far less from sparseness of data: Expected Points.

Expected Points quantifies expected net point differential for the team with possession, given down, yards to go, and field position.  The way it is calculated is simple, yet brilliant: For each possible state (down, to go, yardline), track the ultimate score outcome.  That could be a touchdown on that same drive, a field goal for the opposition after punting, a touchdown after trading punts, and so on.  After calculating the average net point differential for each state, you apply some LOESS smoothing to the data and you're done.
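
To make the averaging step concrete, here is a minimal sketch in Python. The function name, the tuple layout, and the sample rows are my own assumptions for illustration; they are not taken from the Advanced NFL Stats data or code, and the smoothing step is left out.

```python
from collections import defaultdict

def raw_expected_points(plays):
    """Average net next-score value for each (down, to_go, yardline) state.

    `plays` is an iterable of (down, to_go, yardline, net_points) tuples.
    net_points is the eventual score outcome from the offense's point of
    view (for example +7, +3, -3, -7, or 0 if nobody scores).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for down, to_go, yardline, net_points in plays:
        state = (down, to_go, yardline)
        totals[state] += net_points
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}

# Made-up rows for illustration only (not real NFL data).
sample_plays = [
    (1, 10, 80, 7),
    (1, 10, 80, -3),
    (1, 10, 80, 0),
    (1, 10, 80, 3),
    (3, 2, 36, 7),
    (3, 2, 36, 3),
]
raw_ep = raw_expected_points(sample_plays)
print(raw_ep[(1, 10, 80)])  # 1.75
print(raw_ep[(3, 2, 36)])   # 5.0
```

With real play by play data most states would have only a handful of rows, which is why the smoothing step matters.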

For additional background on the concept see these posts at Advanced NFL Stats.

Factoring in the Point Spread

The Advanced NFL Stats Expected Points model is based on league averages, but one would expect a 7 point favorite to have higher expected points at any spot on the field than, say, a 7 point underdog.  I decided to see if I could derive that from actual NFL data and point spreads.

My first task was to recreate the ANS Expected Points Model as best I could.  I compiled the play by play data (generously provided here by Brian Burke) and ran separate LOESS models for each down, using two independent variables: yardline and distance to go.  My results seemed fairly consistent with the ANS Expected Points (give or take a few tenths of a point) so it appeared I was on the right track.
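
For readers unfamiliar with LOESS, here is a rough sketch of the idea in Python. It is a simplification written from scratch: it uses a single independent variable (yardline) rather than the two used above, fits a local straight line with tricube weights, and is not the routine that produced the results in this post. The function names and the sample numbers are assumptions for illustration; the span of 0.75 matches the default setting mentioned in the comments.

```python
import math

def tricube(u):
    """Tricube kernel weight for a scaled distance u (zero at or beyond 1)."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess_1d(xs, ys, span=0.75):
    """Locally weighted linear fit, evaluated at every x in xs."""
    n = len(xs)
    k = min(n, max(2, math.ceil(span * n)))  # points in each local window
    fitted = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        h = sorted(dists)[k - 1]  # bandwidth: distance to the k-th nearest point
        weights = [tricube(d / h) if h > 0 else float(d == 0) for d in dists]
        sw = sum(weights)
        swx = sum(w * x for w, x in zip(weights, xs))
        swy = sum(w * y for w, y in zip(weights, ys))
        swxx = sum(w * x * x for w, x in zip(weights, xs))
        swxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Degenerate window: fall back to the weighted mean.
            fitted.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted

# Made-up noisy values for illustration only (not real NFL data).
yards_from_goal = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
noisy_ep = [4.9, 4.1, 3.6, 2.7, 2.3, 1.4, 1.1, 0.2, -0.1]
smoothed = loess_1d(yards_from_goal, noisy_ep)
for yards, value in zip(yards_from_goal, smoothed):
    print(yards, round(value, 2))
```

A smaller span uses fewer neighbors in each local fit, so the curve follows the raw data more closely at the cost of more noise.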

The next step was to add a third independent variable, the point spread, and rerun the LOESS analysis.  I plan on sharing the full results in a separate, related post, but for now, here are the results for 7 point favorites and 7 point underdogs.  The chart below shows expected points for 1st and 10 at the given yardline.

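[Chart: expected points for 1st and 10 by yardline, for 7 point favorites and 7 point underdogs, showing the raw data and the LOESS fit]
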
I've charted both the raw data as well as the LOESS fit. The data is noisy, but there is a clear difference in expected points.  Note how the gap widens the further away from the endzone you get.  Even a bad team is pretty likely to score a touchdown when given the ball at their opponent's one yard line with first down.  But start that team 80 yards away from the endzone, and the cumulative impact of multiple plays results in a wider divergence in expected point differential.

The Kickoff Penalty

A key callout on the graph above: when calculating expected points, I assumed that touchdowns were worth 7 points and field goals were worth 3 points.  In his work on expected points, Brian Burke rightly points out that touchdowns are worth closer to 6.4 points and field goals 2.4 points because you have to kick the ball off to the opposing team, with the average field position resulting in roughly 0.6 expected points for the receiving team (this number may be outdated as I think it is based on kickoffs from the 30 yard line).

The interesting thing about point spread-dependent expected points is that the kickoff penalty varies depending on the point spread.  For a seven point underdog, the penalty for kicking off to a team that is seven points better is 1.5 points, meaning that a touchdown is actually worth 5.5 points and a field goal 1.5 points.  On the flip side, when a seven point favorite kicks off to a seven point underdog, the underdog's expected points is negative 0.8, meaning that a touchdown is actually worth 7.8 points and a field goal 3.8 points.
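
The arithmetic is just the nominal score minus the receiving team's expected points on the ensuing kickoff. A small Python sketch, using the three penalty figures quoted in this section (the function and dictionary names are only for illustration):

```python
def net_score_values(kickoff_penalty, touchdown=7.0, field_goal=3.0):
    """Net value of a touchdown and a field goal after the kickoff penalty."""
    return touchdown - kickoff_penalty, field_goal - kickoff_penalty

# Kickoff penalties quoted in this section: expected points for the
# receiving team on the ensuing kickoff.
kickoff_penalties = {
    "league average": 0.6,
    "7 point underdog": 1.5,
    "7 point favorite": -0.8,
}

net_values = {
    label: net_score_values(penalty)
    for label, penalty in kickoff_penalties.items()
}

for label, (td_value, fg_value) in net_values.items():
    print(label, round(td_value, 1), round(fg_value, 1))
```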

Implications for Game Strategy

One potential use for these results could be to optimize game strategy and play calling when a team knows it is a heavy favorite or heavy underdog.  Using the results above on kickoff penalties, to a seven point underdog, a touchdown is worth nearly four times as much as a field goal (5.5 points vs. 1.5 points).  To a seven point favorite, a touchdown is worth about twice as much as a field goal (7.8 points vs. 3.8 points).
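
As a rough illustration of how those ratios could feed a decision, the sketch below computes the touchdown-to-field-goal ratio and a break-even success rate for going for the touchdown instead of kicking. The break-even calculation rests on simplifying assumptions of mine that are not part of the analysis above: the field goal is treated as automatic, and a failed touchdown attempt is treated as worth zero (field position after the failure is ignored).

```python
def touchdown_to_field_goal_ratio(td_value, fg_value):
    """How many field goals one touchdown is worth."""
    return td_value / fg_value

def break_even_touchdown_probability(td_value, fg_value):
    """Success rate at which trying for the touchdown matches a sure field goal.

    Assumes (for illustration only) an automatic field goal and a failed
    attempt worth zero points.
    """
    return fg_value / td_value

# Net scoring values quoted above.
scenarios = {
    "7 point underdog": (5.5, 1.5),
    "7 point favorite": (7.8, 3.8),
}

strategy_summary = {
    label: (
        touchdown_to_field_goal_ratio(td, fg),
        break_even_touchdown_probability(td, fg),
    )
    for label, (td, fg) in scenarios.items()
}

for label, (ratio, break_even) in strategy_summary.items():
    print(label, round(ratio, 2), round(break_even, 2))
```

Under those assumptions the underdog needs a much lower success rate than the favorite to justify passing up the field goal.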

Next Steps

In a companion post, I plan on publishing a tool that will output expected points for any chosen point spread.  I will then see if these results can be used in a proper cover-probability model.  It may turn out that this post is just a "proof of concept" to show how point spread can be factored into core statistical concepts like Expected Points.

I may also go in the other direction and look at First Down Probability as a function of down, distance, and point spread.  We know the league average conversion rate for a 3rd and 5 (47%).  How does that vary for a seven point underdog vs. a seven point favorite?

9 comments:

  1. It's interesting. I would have expected to see a bigger kink in the line in the 30-40 yards to go area to account for field goal range.

    I would naively expect the 90-50 yards to go to be geometric (due to a fixed chance to get a new set of downs) and then another geometric curve down to 1st and goal at the 10, and then something more linear.

    Replies
    1. The data I'm using may be too sparse to get "kinky", so to speak. If you look at the Advanced NFL Stats EP curve, there does seem to be a slight kink around the 30 yard line, in line with your expectations.

      I am also using the default settings for the LOESS smoothing (span = 0.75). I haven't had a chance to optimize the setting. A lower span should result in a less linear graph.

    2. I'm running some queries on my data, and doing a bunch of learning about how to think about data...

      One interesting initial result is that it looks like the probability of extra yardage when getting a new set of downs is roughly an inverse square relationship.

  2. So, it looks like from a first down & 10
    There's a turnover about 5% of the time.
    A new first down at -10 yards happens about 1% of the time.
    A new first down at -5 yards happens about 2% of the time.
    A new first down at +5 yards happens about 1% of the time.
    A new first down at another yardage less than +10 yards happens about 3% of the rest of the time.
    A new first down at +10 yards happens about 4% of the time.
    A new first down at +11 yards happens about 8% of the time.
    Beyond that the extra yardage rate is roughly 10/(net yards)^2.

    Field goal, punt, and touchdown rates are a bit more position dependent, and get interesting around the opponent's 45. Before that it's around 28% punt, and 1% touchdown.

  3. I've been trying to build an EPA model using first down conversions to new downs, and after watching the Bengals-Texans game today, I've got to wonder if it may make more sense to use the over-under and spread to try to regress based on implied points. (That is to say, half the OU plus/minus the spread.)

    The Texans converted 14 times (13 first downs and 1 td) in 24 opportunities. On average, a team that converts at that rate is a heavy underdog. However, the Bengals converted even less - 4 times in 12 opportunities which would correspond to an even lower expected differential.

    I was thinking about whether the EPA curve would be unaffected by scoring rate due to canceling offsets, but I would expect a high expectation of stops to correspond to a very flat section where both teams struggle to get into field goal range.
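
A small Python sketch of the implied points idea mentioned in this comment. It reads "half the OU plus/minus the spread" as the usual implied team totals, which add up to the over-under and differ by the spread; that reading, the function name, and the sample numbers are assumptions for illustration.

```python
def implied_points(over_under, spread):
    """Implied point totals for the favorite and the underdog.

    `spread` is the number of points by which the favorite is favored,
    given as a positive number.
    """
    favorite_points = (over_under + spread) / 2.0
    underdog_points = (over_under - spread) / 2.0
    return favorite_points, underdog_points

# Hypothetical game: over-under of 44, favorite favored by 7.
favorite_total, underdog_total = implied_points(44.0, 7.0)
print(favorite_total, underdog_total)  # 25.5 18.5
```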

    Replies
    1. Implied points doesn't seem to look much better. Guess I need to turn up the sophistication.

    2. For first down conversions, I think it makes a lot of sense to use the OU combined with the spread, as first down is purely a contest of offense vs. defense. It gets hairier for EP since EP factors in possession changes, so you get a blending of each team's offense and defense.

  4. Hmm... it looks like I may have misread the ESPN box scores - they only show first downs on offense. Accounting for the numbers gets me closer to the values that I expected. The Texans converted at .69, the Bengals at .57 predicting a difference of around 12.5 points - if we discount for the pick six, this is right on.

    That said, even as a backward looking indicator, first down conversion rate doesn't seem so great -- it suggests the Colts +0.5 points over the Ravens, instead of losing by 15. That said, there's not that much distinction between the teams in most of the other box score stats either. Maybe red zone defense is a thing after all, and the Ravens have it.

  5. Maybe this is crazy talk, but thinking on this from a somewhat different perspective:

    If we think of teams marching the ball down the field, as a repeated stochastic process, then the EPA from a first and 10 should roughly have the form:
    e^(k_1+k_2y)-e^(k_3-k_4y)

    Where k_1,k_2,k_3, and k_4 are constants, and y some variable describing field position.

    Something like this:
    e^(1.7-0.28y)-e^(-2.1+0.28y)
    where y is the distance from the defense's goal line in yards.
    Seems to produce relatively reasonable numbers.
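
A short Python sketch for evaluating this form. The default constants are the ones in the example, mapped onto the general form as k_2 = -0.28 and k_4 = -0.28; the function names are assumptions for illustration. The example crosses zero at y = 3.8/0.56, or about 6.8, which may mean y is intended in units of ten yards rather than single yards, though the comment does not say so.

```python
import math

def epa_form(y, k1=1.7, k2=-0.28, k3=-2.1, k4=-0.28):
    """Evaluate e^(k1 + k2*y) - e^(k3 - k4*y) at field position y."""
    return math.exp(k1 + k2 * y) - math.exp(k3 - k4 * y)

def zero_crossing(k1=1.7, k2=-0.28, k3=-2.1, k4=-0.28):
    """Field position where the two exponential terms cancel."""
    return (k3 - k1) / (k2 + k4)

break_even_y = zero_crossing()
print(round(break_even_y, 2))  # 6.79
for y in range(0, 11, 2):
    print(y, round(epa_form(y), 2))
```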
