Saturday, January 26, 2013

In Game Squares Probabilities

The purpose of this post is to lay out my initial attempt at building an in-game probability model for Superbowl Squares.  This post focuses on probabilities for the 1st quarter square, with the hope that I can extend the methodology to all four quarters.

In Game Cover Probability

In November of last year, I sketched out a rough approach for building an in-game cover probability model for the NFL.  The basic idea was to extend the Advanced NFL Stats win probability model to calculate the in-game probability of a given team covering the point spread, as opposed to just winning outright.

I've made some progress, but I'm still far from the finish line.  Along the way, I've reported on some of the baby steps I've made toward that final goal: examining how scoring margin distributions, expected points, and 4th and 1 conversion rates all vary with the point spread of the game.  My hope when beginning this project was to have a final product ready for the Superbowl, but with the game just a week away, that doesn't look too likely.

It's Hip to be Square

So, I decided to scale things back for now and see if I could build something a little less ambitious (but much sillier): a Superbowl Squares probability model.  There are plenty of sites out there that will tell you the probability that your particular square will pay off.  Pro Football Reference even has a mobile app that will update the probabilities based on the ending score for each quarter.

But as far as I can tell, there aren't any tools out there that will update the probabilities in-game, based on the situation (down, distance, yardline, time remaining, and current score).  Most likely because most people have better things to do with their time.

Having wasted time on far sillier projects, I pushed ahead.  Using the play-by-play data provided at Advanced NFL Stats, I compiled scoring outcomes as a function of game situation and interpolated like crazy.  Later in the post I will go into more detail on the methodology, but first I want to show the results for the first quarter of last year's Giants-Patriots Superbowl, which ended in the relatively rare 1st quarter score of 0-9 in favor of the Giants.

Tom Brady Can't Find a Receiver

From a squares perspective, the 1st quarter was full of drama.  Tom Brady was flagged for intentional grounding while throwing from his endzone, which, by rule, is a safety.  All of a sudden, those holding the dreaded '2' squares were in a much better position.  After the safety, the Giants went on to score a touchdown and the Patriots failed to score at all, making 0-9 the winning 1st quarter square.  Here is how the win probability for that 0-9 square evolved play by play (mouse over the graph for details):



The probabilities update as a function of time remaining, yardline, down, and score (adding yards to go for each down was a bit much for my MacBook Air to handle).  The 0-9 square is not a great square, especially for the first quarter.  Until the safety, the probability of success was virtually zero.  That jumped to about 20% after the safety, and continued to climb as the Giants marched down the field.  After the Giants scored, the probability dipped for a few plays as the Patriots made their way into Giants territory, but the quarter ended before the Patriots could score.

Overall, I was pleased with these initial results as they seemed to be relatively reasonable and updated appropriately as a function of the game situation.

Next Steps

Up next is to extend the probabilities to the second quarter, third quarter, and the final score.  Second and third quarter should be straightforward, but final score will be more difficult, as the incremental scoring probability depends heavily on the current score differential (e.g. a team trailing by six will not kick a field goal in the last minute of a game - well, maybe Brian Billick would).

If I can put all of this together, I hope to have a tool on this site that will update the probabilities live as Superbowl XLVII plays out (probably a squares grid with probabilities that refresh every minute or so).

Methodology

I used actual game results from the 2002-2011 seasons.  To calculate a square's probability, I first calculated the probability distribution of the remaining scoring in the quarter.  For example, what is the probability that the team with possession scores 14 additional points and the opposing team scores 3 additional points?  I then added the incremental scores to the current score to get the projected score at the end of the quarter.  From there, it's just a matter of taking the last digit of each team's projected score and summing the probabilities that land on each square.
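As a rough illustration of that last step, here is a minimal Python sketch of how a distribution over incremental scores could be rolled up into square probabilities.  The function name, the dictionary layout, and all of the numbers are made up for illustration; they are not the actual model outputs.

```python
from collections import defaultdict

def square_probabilities(current_score, incremental_probs):
    """Roll a distribution over additional points up into last-digit squares.

    current_score: (offense_points, defense_points) at this moment.
    incremental_probs: {(offense_added, defense_added): probability}
        for the remainder of the quarter (hypothetical layout).
    """
    off_now, def_now = current_score
    squares = defaultdict(float)
    for (off_add, def_add), p in incremental_probs.items():
        # Projected end-of-quarter score, reduced to its last digits
        key = ((off_now + off_add) % 10, (def_now + def_add) % 10)
        squares[key] += p
    return dict(squares)

# Invented probabilities: the team with the ball leads 2-0 after a safety
incremental = {
    (0, 0): 0.55,  # no more scoring this quarter
    (3, 0): 0.15,  # offense kicks a field goal
    (7, 0): 0.20,  # offense scores a touchdown
    (0, 3): 0.04,  # opponent field goal
    (0, 7): 0.06,  # opponent touchdown
}
probs = square_probabilities((2, 0), incremental)
```

Several different incremental outcomes can land on the same square (for example, 7 and 17 additional points share a last digit), which is why the probabilities are accumulated rather than assigned.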

To estimate the incremental score probabilities, I combined LOESS smoothing with a multinomial logit model, effectively running a multinomial logit model multiple times in all the various neighborhoods of the independent variables (time, yardline, and down).  I didn't spend much time optimizing the parameters, but as I indicated above, the initial results don't look too bad.
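For readers curious what "a multinomial logit in each neighborhood" might look like, here is a minimal pure-Python sketch.  It is not the code behind the graph above: the tricube kernel, the nearest-neighbor bandwidth, the gradient-ascent fit, the feature scaling to roughly [0, 1], and the toy data are all assumptions chosen to keep the example short.

```python
import math

def tricube(u):
    """LOESS-style tricube weight for a scaled distance u >= 0."""
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def local_multinomial_logit(X, y, x0, n_classes, span=0.3, steps=500, lr=0.5):
    """Fit a kernel-weighted multinomial logit around x0.

    Returns the fitted class probabilities at x0.  Assumes the features
    in X are already scaled to comparable ranges (roughly 0 to 1).
    """
    n, d = len(X), len(X[0])
    dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(row, x0))) for row in X]
    k = max(2, math.ceil(span * n))          # neighborhood size
    bandwidth = sorted(dists)[k - 1] or 1.0  # distance to the k-th neighbor
    w = [tricube(dist / bandwidth) for dist in dists]
    total_w = sum(w)
    if total_w == 0.0:
        raise ValueError("no observations received positive weight")
    # Center features on x0 so the intercepts alone give the fit at x0
    Z = [[1.0] + [a - b for a, b in zip(row, x0)] for row in X]
    coef = [[0.0] * (d + 1) for _ in range(n_classes)]
    for _ in range(steps):
        grad = [[0.0] * (d + 1) for _ in range(n_classes)]
        for zi, yi, wi in zip(Z, y, w):
            if wi == 0.0:
                continue
            p = softmax([sum(c * z for c, z in zip(row, zi)) for row in coef])
            for c in range(n_classes):
                err = wi * ((1.0 if yi == c else 0.0) - p[c])
                for j in range(d + 1):
                    grad[c][j] += err * zi[j]
        # Gradient ascent on the weighted log-likelihood
        for c in range(n_classes):
            for j in range(d + 1):
                coef[c][j] += lr * grad[c][j] / total_w
    return softmax([row[0] for row in coef])

# Toy data: features are (time fraction, field position fraction);
# outcomes are 0 = no score, 1 = field goal, 2 = touchdown (invented rule)
X = [[i / 9, j / 9] for i in range(10) for j in range(10)]
y = [2 if j >= 7 else 1 if j >= 4 else 0 for i in range(10) for j in range(10)]
p_near = local_multinomial_logit(X, y, [0.5, 0.9], n_classes=3)
p_far = local_multinomial_logit(X, y, [0.5, 0.1], n_classes=3)
```

Repeating the fit at every point of interest is what makes the approach slow, which is one reason adding more variables (such as yards to go) gets expensive quickly.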

Incremental scoring probabilities are based on league averages and are not a function of current scoring margin.  I think that's a relatively safe assumption for quarters 1-3, but as indicated above, probably won't work for the final score.
