Saturday, August 30, 2014

Daily NFL Rankings - With Playoff Seed Projections

My betting market rankings for the NFL are now up and running (see here for the first look). To improve access to the table, which had been spotty, I had to move the rankings to a new location. Here is the new URL: NFL Betting Market Rankings. The old URL should redirect automatically to the new one, but it's probably a good idea to update your bookmarks.

All of the old features are back: sparklines, playoff seed projections, and strength of schedule.

Playoff Seeds

There is a series of odd-looking bar graphs under the "Projected Seed" column. Each bar in the graph represents the probability of a team achieving a particular playoff seed. The bars run left to right from seed 16 to seed 1. The top six seeds make the playoffs and are colored blue on the graph. The probabilities are based on a 5,000-round Monte Carlo simulation of the regular season. These will be updated daily with the latest game results and point spreads.
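For readers curious how such a simulation might work, here is a minimal sketch. It is not the code behind the table: the team names, divisions, and per-game win probabilities are made up, the function name is hypothetical, and ties are broken by coin flip rather than by the real NFL tiebreakers. It only illustrates the idea of replaying the season many times and counting how often each team lands on each seed, with division winners seeded ahead of everyone else.

```python
import random
from collections import Counter

def simulate_seeds(divisions, games, n_rounds=5000, seed=0):
    """Estimate seed probabilities for one conference by simulation.

    divisions: dict mapping team -> division name.
    games: list of (home, away, p_home_win). The win probabilities are
           assumed inputs (e.g. derived elsewhere from point spreads).
    """
    rng = random.Random(seed)
    counts = {team: Counter() for team in divisions}
    for _ in range(n_rounds):
        wins = dict.fromkeys(divisions, 0)
        for home, away, p_home_win in games:
            winner = home if rng.random() < p_home_win else away
            wins[winner] += 1
        # Rank by wins, then a coin flip (a stand-in for real tiebreakers).
        key = {team: (wins[team], rng.random()) for team in divisions}
        # Find the best team in each division.
        champs = {}
        for team, div in divisions.items():
            if div not in champs or key[team] > key[champs[div]]:
                champs[div] = team
        # Division winners take the top seeds; everyone else follows.
        winners = sorted(champs.values(), key=key.get, reverse=True)
        rest = sorted((t for t in divisions if t not in winners),
                      key=key.get, reverse=True)
        for seed_no, team in enumerate(winners + rest, start=1):
            counts[team][seed_no] += 1
    return {team: {s: c / n_rounds for s, c in sorted(counts[team].items())}
            for team in divisions}

# Toy conference: two divisions of two teams, hypothetical probabilities.
DIVISIONS = {"A": "West", "B": "West", "C": "East", "D": "East"}
GAMES = [("A", "B", 0.60), ("B", "A", 0.50), ("A", "C", 0.70),
         ("A", "D", 0.80), ("B", "C", 0.65), ("B", "D", 0.75),
         ("C", "D", 0.60), ("D", "C", 0.45)]
PROBS = simulate_seeds(DIVISIONS, GAMES, n_rounds=2000, seed=1)
```

A team's playoff probability is then just the sum of its probabilities over the qualifying seeds (the top six in the real table).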

Here is a look at Seattle's seed probabilities:



The most likely outcome for Seattle is a #1 seed (26% probability). The second most likely outcome is a #5 seed (18% probability). This is due to Seattle being in the same division as the #3 ranked team, the 49ers (the top 4 seeds are reserved for the division winners). Here are San Francisco's corresponding seed probabilities:



To the left of each bar graph is a percentage, which represents a team's probability of making the playoffs. To open the season, the Broncos have the highest playoff probability at ~90%. They are ranked below the Seahawks, but have better playoff odds due to a weaker schedule.

Strength of Schedule

There are two columns to the far right of the table: pSOS and fSOS. These columns are the average GPF (Generic Points Favored) of a team's past and future opponents, respectively (home field advantage is factored into the averages).
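As a rough illustration of the calculation, here is a small sketch. The post does not spell out how home field advantage enters the average, so the adjustment below (add the home edge to an opponent's GPF when you visit them, subtract it when you host them), the 2.5-point value, and the function name are all assumptions.

```python
def strength_of_schedule(opponents, gpf, hfa=2.5):
    """Average opponent GPF, adjusted for venue.

    opponents: list of (opponent, team_is_home) pairs.
    gpf: dict mapping team -> Generic Points Favored.
    hfa: assumed home field advantage in points (illustrative value).
    Returns None when there are no games (e.g. pSOS before Week 1).
    """
    if not opponents:
        return None
    adjusted = []
    for opponent, team_is_home in opponents:
        # An opponent is effectively tougher when you play at their place.
        venue = -hfa if team_is_home else hfa
        adjusted.append(gpf[opponent] + venue)
    return sum(adjusted) / len(adjusted)

# Hypothetical ratings and a two-game schedule.
EXAMPLE_GPF = {"X": 3.0, "Y": -1.0}
EXAMPLE_SOS = strength_of_schedule([("X", True), ("Y", False)], EXAMPLE_GPF)
```

Under this assumed scheme the venue adjustments cancel over a balanced home/away slate, so they mostly matter for partial-season splits like pSOS and fSOS.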

The team with the toughest schedule this season is the Arizona Cardinals, who get to face the Seahawks and 49ers twice, as well as matchups against top tier teams like the Broncos and Eagles.

The team with the easiest schedule is the Houston Texans, due to the extremely soft AFC South and their last-place finish the prior season.

The Rankings

Eyeballing the latest version of the table, we can break down teams into a few broad categories:
  • The Elite: Seahawks, Broncos, 49ers, Packers, Patriots, and Saints
  • The Above Average: Panthers, Eagles, Bengals, Bears, Lions, Colts, Steelers, Chiefs, Falcons, Cardinals
  • The Mediocre: Chargers, Giants, Dolphins, Texans, Rams, Redskins, Buccaneers, Jets, Bills, Browns, Titans, Vikings
  • The Raiders and the Jaguars

Tuesday, August 26, 2014

NFL Home Underdogs - A Reminder

An update on last year's post on NFL home underdogs. From 1989 to 2003, NFL home underdogs went 53.5% against the spread. But over the next nine years (2004-2012), home dogs went just 47.7% against the spread.

2013 bucked this recent trend somewhat, with home underdogs going 52.3% against the spread over 88 bets (just a hair shy of break-even against the standard vig). But as it stands, this still looks like a blip in what has been poor performance for some time. So be wary of anyone who claims that NFL home underdogs are a good bet. It was true when Steven Levitt published his seminal paper on betting markets in 2004, but it has been just the opposite in the ten years since. See below for year-by-year performance:
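As an aside on the break-even figure above: assuming "standard vig" means the usual -110 pricing (risk 110 to win 100), the break-even win rate works out to about 52.4%. The sketch below also computes the sampling error of a win rate over 88 bets, treating each bet as an independent coin flip, which is a simplification.

```python
import math

def break_even_rate(risk=110.0, win=100.0):
    """Win rate at which a bettor risking `risk` to win `win` breaks even."""
    return risk / (risk + win)

def standard_error(p, n):
    """Standard error of an observed win rate over n independent bets."""
    return math.sqrt(p * (1.0 - p) / n)

BREAK_EVEN = break_even_rate()          # about 0.5238 at -110
SE_88_BETS = standard_error(0.5, 88)    # about 0.053
```

With a standard error of roughly five percentage points, a single 88-bet season at 52.3% cannot be told apart from the 47.7% long-run rate, which is consistent with reading 2013 as a blip.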


Saturday, August 23, 2014

How to improve your chances of scoring a goal in soccer? Concede one first.

"If you want to be a millionaire, start with a billion dollars and launch a new airline."

Apologies for the facetious (and somewhat clickbait-y) post title. In the same way that the opening quote from Richard Branson is not intended to be serious advice on how to become a millionaire, I am not suggesting that allowing your opponent to score is a viable strategy for winning soccer matches.

What I will suggest in this post is that teams appear to play more optimally when trailing their opponent (or, equivalently, less optimally when holding a lead). I found this result interesting for two reasons:
  1. Arriving at this conclusion provides a good example of the pitfalls of conflating correlation with causation.
  2. The scourge of modern sports strategy is loss aversion (and its cousin, risk aversion). This result appears to show that soccer is not immune.
I had already touched on this topic in a prior post (see On the Probability of Scoring a Goal). In this post, however, I have expanded my dataset, and in addition will do my best to illustrate my point with the raw data. The results from the previous post were the end result of a regression analysis, and somewhat of a black box from the point of view of the reader. I will try to be more transparent here.

Thursday, August 14, 2014

2014 NFL Rankings - First Look

As I did last year, and the year before, here is a pre-season NFL power ranking. My rankings are not based on stats, scouting, off-season moves, or draft grades. Well, they are, but not as explicit inputs. Instead, I use Vegas point spreads as a means to reverse engineer an implied power ranking. See my post at Advanced NFL Stats Community where I first laid out the basic concept (that post is also what ultimately led to the time-sink that is this blog). You can also refer to my methodology page for more details.
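To make the "reverse engineer" idea concrete, here is a toy sketch, not the actual model (the methodology page has the real details). It assumes each point spread equals the home team's rating minus the away team's rating plus a fixed home field advantage, and finds the ratings that best fit the spreads in a least-squares sense by repeated averaging. The 2.5-point home edge, the function name, and the three-team example are all hypothetical.

```python
def implied_ratings(games, hfa=2.5, sweeps=200):
    """Fit ratings so that spread ~ rating[home] - rating[away] + hfa.

    games: list of (home, away, spread), with spread > 0 meaning the
           home team is favored by that many points.
    hfa: assumed home field advantage in points (illustrative value).
    Ratings are centered so the average team is rated 0.
    """
    teams = sorted({t for home, away, _ in games for t in (home, away)})
    rating = dict.fromkeys(teams, 0.0)
    for _ in range(sweeps):
        for team in teams:
            # Each game implies a rating for `team` given its opponent's.
            implied = []
            for home, away, spread in games:
                if team == home:
                    implied.append(spread - hfa + rating[away])
                elif team == away:
                    implied.append(rating[home] + hfa - spread)
            rating[team] = sum(implied) / len(implied)
        # Only rating differences matter, so pin the mean at zero.
        mean = sum(rating.values()) / len(teams)
        for team in teams:
            rating[team] -= mean
    return rating

# Hypothetical spreads consistent with ratings A=+3, B=0, C=-3.
TOY_GAMES = [("A", "B", 5.5), ("B", "C", 5.5), ("C", "A", -3.5)]
TOY_RATINGS = implied_ratings(TOY_GAMES)
```

With only a handful of games per team, many different rating sets fit the spreads almost equally well, which is why a couple of weeks of lines is too thin a sample.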

If the market is efficient, then the Vegas point spread is a distillation of any and all information relevant to the outcome of NFL games, whether it be touted draft picks, roster moves, or key players returning from injury. However, with three (interminable) weeks of pre-season to go, there is only an established market for the first two weeks of the regular season. Two weeks of games is not a large enough sample to derive a 32 team ranking. My model needs at least three weeks of games before it really gets going. So, while I can't use the market just yet, I can use what a significant portion of the market uses for its opening lines: Cantor Gaming.

Saturday, July 12, 2014

On the Probability of Scoring a Goal

In this post, I will describe my attempts to model the probability of a goal being scored in soccer. After correcting for team imbalances, I find that a trailing team has a higher probability of scoring in most situations. This result has potential implications for strategy and whether teams should be adopting a more aggressive style of play.

The Model

Using the same dataset I used for my win probability model (~3,000 matches from five of the top European leagues), I employed LOESS smoothing to build a model that predicts the probability of a goal being scored within the next minute of game time. The model is a function of the following:
  • game time
  • goal difference
  • team strength
I derive the team strength from the pre-match betting odds, and convert it into an expected goals scored per game. Including team strength as a parameter is crucial for this type of analysis, because the model is also a function of goal differential. There is going to be heavy selection bias in the raw historical results. Favorites are going to be over-represented in game situations in which a team has a positive goal differential. As a result, the raw goal probability is higher for teams that have a lead (favorites tend to score more). But having a lead in and of itself does not lead to a higher probability of scoring more goals. This is correlation, not causation.
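The selection bias described above is easy to reproduce in a toy simulation. In the sketch below, every match pits a favorite against an underdog whose per-minute scoring chances never change with the score, so by construction holding a lead has no effect on scoring. The goal rates (2.0 and 1.0 per 90 minutes), the function name, and the match count are made up for illustration; this is not the LOESS model itself.

```python
import random

def raw_scoring_rates(n_matches=3000, seed=0):
    """Raw per-minute scoring rate when leading vs. trailing.

    Scoring chances depend only on team strength, never on the score,
    so any gap between the two rates is pure selection bias.
    """
    rng = random.Random(seed)
    per_minute = (2.0 / 90, 1.0 / 90)   # favorite, underdog (assumed)
    # state -> [minutes observed, goals scored]
    tallies = {"leading": [0, 0], "trailing": [0, 0]}
    for _ in range(n_matches):
        score = [0, 0]
        for _minute in range(90):
            diff = score[0] - score[1]
            scored = [rng.random() < p for p in per_minute]
            # Record each team's minute under its current game state.
            for i, d in ((0, diff), (1, -diff)):
                if d != 0:
                    state = "leading" if d > 0 else "trailing"
                    tallies[state][0] += 1
                    tallies[state][1] += scored[i]
            score[0] += scored[0]
            score[1] += scored[1]
    return {s: goals / minutes for s, (minutes, goals) in tallies.items()}

RATES = raw_scoring_rates()
```

Even though the score never influences anyone's chances here, the pooled "leading" rate comes out higher than the "trailing" rate, simply because favorites spend more minutes in front.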

In fact, once we control for the bias in the results, the exact opposite conclusion emerges: For most of the game, a team trailing by one goal is more likely to score than when leading by a goal or tied. See below for the (smoothed) goal probabilities as a function of game time. The probabilities reflect a team that would be expected to score 1.4 goals per game, on average.

Friday, July 4, 2014

Tennis matches and luck

Tennis, like most sports, is largely a matter of scoring more points than your opponent. But the game-set-match scoring system used in tennis differentiates it from other contests. In basketball, scoring more points than your opponent defines victory. In tennis, scoring more points tends to lead to victory, but it's not a guarantee. It also matters when you score your points, and whether those points help you win sets.

In a recent post for FiveThirtyEight, Carl Bialik covered this topic, calling a match in which a player wins despite winning fewer points a "lottery match". Using data from Tennis Abstract, he found that 7.5 percent of men's matches ended in this way.
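Here is a small sketch of how such a match can be flagged from per-set tallies. The set scores and point counts are invented for illustration, and the function name is hypothetical; the match winner is decided by sets, while the lottery flag compares total points.

```python
def summarize_match(sets):
    """Summarize a match between players A and B.

    sets: list of (games_a, games_b, points_a, points_b), one per set.
    Returns the winner (by sets), total points, and whether the winner
    won fewer total points than the loser (a "lottery match").
    """
    sets_a = sum(1 for games_a, games_b, _, _ in sets if games_a > games_b)
    sets_b = len(sets) - sets_a
    points_a = sum(s[2] for s in sets)
    points_b = sum(s[3] for s in sets)
    winner = "A" if sets_a > sets_b else "B"
    if winner == "A":
        winner_points, loser_points = points_a, points_b
    else:
        winner_points, loser_points = points_b, points_a
    return {
        "winner": winner,
        "points": (points_a, points_b),
        "lottery": winner_points < loser_points,
    }

# Hypothetical match: A wins two tight tiebreak sets and is blown out in one.
EXAMPLE_MATCH = summarize_match([
    (7, 6, 40, 42),
    (0, 6, 8, 24),
    (7, 6, 45, 44),
])
```

In this made-up example, A takes the match two sets to one while winning only 93 of the 203 points played.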

For this post, I will take a closer look at these lottery matches and use them to define a "luck" measure for tennis, which will be added to my tennis win probability graphs.

Sunday, June 29, 2014

Top Match Finder for Tennis

Win probability graphs are up and running for Wimbledon matches. With 127 matches played in each major tournament, it can be difficult to track down particularly noteworthy matches. So, I've added a top match finder, which returns the top matches according to either Excitement Index or Comeback Factor (with filters for date and tournament). This is similar to the Top Games Finder I added for the NBA earlier this year. Here is the link for Tennis: Top Match Finder.