Saturday, January 24, 2015

Australian Open Win Probability Graphs

Win probability graphs for the Australian Open are now available. Unlike my win probability models for the NBA and soccer, which required regression analyses and lots of smoothing, the development of the tennis model was more straightforward, if a bit tedious. Once you specify the probability of winning a point on serve, the rest is no worse than a college-level probability exercise.
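To make the "probability exercise" concrete, here is a minimal sketch of the first step: turning a point-on-serve probability into the probability of holding serve for a full game. It assumes every point is independent with the same probability, and the function name `hold_prob` is made up for illustration. It is written in Python and is not the code behind the graphs.

```python
def hold_prob(p):
    """Probability the server wins a game, given probability p of winning
    each point on serve. Assumes independent, identically distributed points."""
    q = 1.0 - p
    # Games won to love, to 15, or to 30: the server takes the last point,
    # and the returner's 0, 1, or 2 points can fall anywhere before it.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reaching deuce means splitting the first six points 3-3 (20 orderings).
    reach_deuce = 20 * p**3 * q**3
    # From deuce the server must win two points in a row before the returner does.
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


if __name__ == "__main__":
    # 0.675 is the serve probability the post uses for its 50/50 graphs.
    print(round(hold_prob(0.675), 3))
```

The same pattern (count the ways to win outright, then handle the tied state separately) carries over to tiebreaks, sets, and matches.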

The graphs come in two versions. The "50/50" version assumes the two competitors are of equal strength, each with a 67.5 percent probability of winning a point on serve and a 32.5 percent probability of winning a point when returning. The "Market" version adjusts both the serve and return probabilities up or down so that the starting win probability matches the pre-match betting odds. For example, take Andreas Seppi's third-round upset of Roger Federer. The betting market gave Seppi just a 6 percent chance of beating Federer, which implied a 67.2 percent serve probability for Federer and a corresponding 57.6 percent serve probability for Seppi. Here is the graph:

Although Seppi led two sets to one going into the fourth set, his win probability rarely crossed the 50 percent threshold until the final three points of the match. Seppi trailed Federer 4-5 in the fourth-set tiebreaker, and his win probability was 34 percent. Seppi won the next point on serve, and his chances increased to 52 percent (+18%). Federer served next, and Seppi won that point as well, raising his win probability to 67 percent (+16%). Seppi now had match point and the serve. He won the point and the match, his win probability increasing by 33% on a single point.

Check back daily for the latest matches.

1. Do you have a quick equation for estimating per point probability from match win probability?

1. No equation, unfortunately. I brute-force it in R to derive point probabilities from match probabilities. But here are a few samples for a best-of-five-sets match (no tiebreaker in the final set):

Point probability: 51% -> Match probability: 63%
Point probability: 52% -> Match probability: 75%
Point probability: 53% -> Match probability: 84%
Point probability: 54% -> Match probability: 91%
Point probability: 55% -> Match probability: 95%
Point probability: 57% -> Match probability: 99%
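The brute-force approach can be sketched as follows: compute the match probability for a given point probability, then search numerically for the point probability that hits a target. This is a Python sketch, not the R code mentioned above. It assumes independent points and a single point probability `p` that applies whether the player is serving or returning, which may not be how the table was built. The names `win_by_two`, `race_to`, `match_prob`, and `point_prob_for` are made up for illustration.

```python
from math import comb


def win_by_two(p):
    """Probability of going two ahead first from a level score (deuce, 6-6, ...)."""
    q = 1.0 - p
    return p * p / (p * p + q * q)


def race_to(p, n, tie_prob):
    """Probability of winning a first-to-n race whose outcome at (n-1, n-1)
    is decided with probability tie_prob."""
    q = 1.0 - p
    # Win n units while the opponent has at most n-2.
    outright = sum(comb(n - 1 + k, k) * p**n * q**k for k in range(n - 1))
    # Reach (n-1, n-1), then resolve the tie.
    level = comb(2 * n - 2, n - 1) * p ** (n - 1) * q ** (n - 1)
    return outright + level * tie_prob


def match_prob(p):
    """Best-of-five match win probability, no tiebreak in the final set.
    Assumption: the same point probability p on serve and on return."""
    game = race_to(p, 4, win_by_two(p))
    tiebreak = race_to(p, 7, win_by_two(p))
    g, h = game, 1.0 - game
    # From 5-5: win two games straight, or split them and win the tiebreak.
    tb_set = race_to(g, 6, g * g + 2 * g * h * tiebreak)
    # Final set: from 5-5, play on until someone leads by two games.
    adv_set = race_to(g, 6, win_by_two(g))
    s, t = tb_set, 1.0 - tb_set
    # Win 3-0, win 3-1, or reach 2-2 and take the final set.
    return s**3 + 3 * s**3 * t + 6 * s**2 * t**2 * adv_set


def point_prob_for(target, tol=1e-9):
    """Invert match_prob by bisection; match_prob is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if match_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for pct in (51, 52, 53, 54, 55, 57):
        print(pct, round(match_prob(pct / 100), 2))
```

Bisection works here because the match probability only goes up as the point probability goes up, so no closed-form inverse is needed. Handling separate serve and return probabilities, as the Market graphs do, would require tracking who is serving and is not covered by this sketch.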

2. please tell me you aren't assuming independence of points

3. I am. What do you suggest as an alternative?