Saturday, June 21, 2014

In-Match Soccer Probability

In what is becoming something of an obsession, I've added a new sport to my suite of win probability tools: In-Match Soccer Probability.

The tool is fairly bare-bones right now, but I hope to add some additional features (and de-uglify it) as the World Cup progresses. As it stands, the model provides win, loss, and draw probabilities as a function of the following: game time (in minutes), goal differential, and pre-match odds. I'm not the first to build an in-match model for soccer. The soccer analytics site Soccer Statistically has an interactive model that displays probabilities as a function of game time, goal differential, and home/away.

Home field advantage doesn't really apply to World Cup games (except for maybe the host country). So, the model I have is a bit more flexible, allowing you to input pre-match probabilities in a variety of formats. You can use the betting odds from your favorite bookie (Odds Portal is a handy reference), or you can input the probabilities directly from sites like FiveThirtyEight and numberFire.
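
For readers who want to turn bookmaker odds into pre-match probabilities by hand, here is a minimal sketch. It assumes decimal odds and removes the bookmaker's margin by simple proportional normalization, which is one common convention and not necessarily what the tool does internally. The function name and arguments are hypothetical.

```python
def implied_probabilities(win_odds, draw_odds, loss_odds):
    """Convert decimal odds to (win, draw, loss) probabilities.

    Assumption: the bookmaker's margin (overround) is spread
    proportionally across the three outcomes.
    """
    # The raw implied probability of each outcome is 1 / decimal odds.
    raw = [1.0 / odds for odds in (win_odds, draw_odds, loss_odds)]
    # Raw values usually sum to more than 1; rescale so they sum to 1.
    total = sum(raw)
    return tuple(p / total for p in raw)


if __name__ == "__main__":
    # Made-up odds, for illustration only.
    win, draw, loss = implied_probabilities(1.80, 3.60, 4.50)
    print(f"win={win:.3f} draw={draw:.3f} loss={loss:.3f}")
```

The three returned values always sum to 1, so they can be entered directly as pre-match probabilities.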

The Data

I built the model from play-by-play data from the past two seasons of five of the major European leagues (Premier League, Bundesliga, Serie A, Eredivisie, and La Liga). This worked out to about 3,000 matches. The model itself is a modified version of LOESS, where instead of fitting local linear regressions, I'm fitting local ordered logistic regressions (ordered logistic regression was necessary because soccer has three ordered outcomes, loss, draw, and win, rather than two).
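
To make the idea of a "local ordered logistic regression" more concrete, here is a rough sketch of its building blocks, not the actual model code. It assumes a LOESS-style tricube kernel over game time, a single predictor (goal differential), and two cutpoints separating loss, draw, and win. All names, the bandwidth, and the parameter layout are hypothetical, and the optimizer that would minimize the weighted likelihood at each game minute is left out.

```python
import math


def tricube_weight(distance, bandwidth):
    """LOESS-style tricube kernel: 1 at distance 0, 0 at or beyond the bandwidth."""
    u = abs(distance) / bandwidth
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(eta, cut_low, cut_high):
    """Return (loss, draw, win) probabilities; requires cut_low < cut_high."""
    p_loss = sigmoid(cut_low - eta)           # P(outcome is loss)
    p_loss_or_draw = sigmoid(cut_high - eta)  # P(outcome is loss or draw)
    return p_loss, p_loss_or_draw - p_loss, 1.0 - p_loss_or_draw


def local_neg_log_likelihood(params, matches, target_minute, bandwidth):
    """Kernel-weighted negative log-likelihood around target_minute.

    params  : (slope, cut_low, cut_high), hypothetical parameterization
    matches : iterable of (minute, goal_diff, outcome), outcome 0=loss, 1=draw, 2=win
    """
    slope, cut_low, cut_high = params
    total = 0.0
    for minute, goal_diff, outcome in matches:
        weight = tricube_weight(minute - target_minute, bandwidth)
        if weight == 0.0:
            continue  # observations outside the window do not contribute
        probs = ordered_logit_probs(slope * goal_diff, cut_low, cut_high)
        total -= weight * math.log(probs[outcome])
    return total
```

Minimizing `local_neg_log_likelihood` separately for each target minute would give a set of coefficients that varies smoothly over game time, which is the sense in which the regression is "local".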
