Sunday, July 15, 2012

Ballpark Adjustments and Home Field Advantage

Shibe Park 1943
Philadelphia's Shibe Park
The left field fence had the following message to fans:
"WARNING: Persons throwing bottles or other
missiles will be arrested and prosecuted"
The purpose of this post is to investigate to what extent the betting market factors in the ballpark when setting moneylines and run totals. What I found is that there are clear ballpark adjustments on the run total side. However, when it comes to moneylines and win probability, the home field advantage implied by the betting lines does not vary significantly from team to team, with most teams getting around a 0.4 run advantage and an expected win probability of 54% when playing a team of comparable strength.

Why Ballpark Adjustments?

Baseball seems unique among the major sports in that teams are allowed considerable leeway when designing their parks. An NBA team isn't permitted to draw the three point line at 22 feet on the left corner, instead of the standard 23 feet 9 inches, or shrink the size of its backboard (maybe they should?). Not so with baseball, with outfields, home run fences, and power alleys that vary from ballpark to ballpark. Add to that the impact that weather and climate can have on how far a baseball travels and you end up with some significant variations in runs scored, depending on the venue.

It's Not the Rockies Pitchers' Fault (maybe)

You can see the impact of ballpark in the team rankings that I publish daily on this site (see here). My ranking methodology currently does not factor in ballpark adjustments (as Ben Moore pointed out in the comments to my original post on pitcher rankings). As a result, teams in "hitter friendly" parks like the Rockies' Coors Field are most likely having their offensive ranking inflated and their defensive ranking deflated by my methodology. Similarly, teams in "pitcher friendly" venues like the Padres' Petco Park are getting penalized in the offensive rankings and boosted in the defensive rankings.

Does Home Field Advantage Depend on the Team?

So, the purpose of this post is to see if I can derive these "park adjustments" from the betting information.  In addition, this post will also address home field advantage and whether the betting market varies that advantage from team to team.  See here for Nate Silver's take on home field advantage variation by team, and here for Brian Burke's take on home field advantage in general (with an interesting sidetrack into evolutionary biology).

What I found is that while the betting market is clearly making ballpark adjustments for total runs scored, there was little evidence of team-specific home field advantage being factored into the moneylines.  With a few exceptions, home field advantage appears to be worth 0.4 runs in the betting lines, regardless of the team.

Methodology (nerds only)

To derive the betting market's adjustments for ballpark, it was simply a matter of adding a couple extra terms to my regression.  As detailed in my Methodology post on my approach to baseball team rankings, I first use the moneylines and run totals from the betting market to derive an implied expected runs scored for each team (the post I linked to has the calculations).  So, let's say that the betting information implied the following:

HomeTeam Expected Runs: 5.0 runs
AwayTeam Expected Runs: 3.5 runs

My original approach was to subtract 0.2 runs from the home team score and add 0.2 runs to the away team score, and build a linear regression model as follows:

HomeTeamoff + AwayTeamdef = 4.8 runs
HomeTeamdef + AwayTeamoff = 3.7 runs
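
As a quick check of the arithmetic above, here is a minimal Python sketch of the original fixed adjustment. The function name neutralize and the constant HOME_EDGE are my own labels for illustration, not anything from the ranking code itself.

```python
HOME_EDGE = 0.2  # runs shifted per side under the original approach


def neutralize(home_expected, away_expected, edge=HOME_EDGE):
    """Strip a fixed home edge out of the implied expected runs.

    Returns the (home, away) regression targets on a neutral-field basis.
    """
    return home_expected - edge, away_expected + edge


# The example game from the text: 5.0 implied runs at home, 3.5 away.
home_target, away_target = neutralize(5.0, 3.5)
print(round(home_target, 1), round(away_target, 1))
```

Shifting 0.2 runs on each side amounts to assuming the same 0.4 run home field advantage for every team, which is exactly the assumption the new approach relaxes.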

My new approach is to do the following:

HomeTeamoff + AwayTeamdef + BallParkhome = 5.0 runs
HomeTeamdef + AwayTeamoff + BallParkaway = 3.5 runs

Where BallParkhome and BallParkaway are two additional terms in the regression, specific to the venue where the game is being played.  These two terms tell you how many runs the betting market adds to (or subtracts from) the home team's expected total and how many runs the betting market subtracts from (or adds to) the visiting team's expected total.  After deriving these adjustments from my regression model, I can use them to estimate both home field advantage and total runs adjustment.  The two differ only by a plus/minus sign:

Home Field Advantage = BallParkhome - BallParkaway
Total Runs Adjustment = BallParkhome + BallParkaway
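
To make the setup concrete, here is a rough Python sketch of how one game could be encoded as two regression rows, and how the two summary numbers fall out of the fitted ballpark terms. Everything named here (design_rows, park_effects, the column labels, the team codes, and the sample coefficients) is hypothetical and only for illustration. It also assumes each venue is identified by its home team, and it leaves the actual fit to whatever least squares routine you prefer.

```python
def design_rows(home, away, home_runs, away_runs, teams):
    """Encode one game as two (features, target) rows.

    Columns are one-hot indicators: an offense and a defense term per team,
    plus two ballpark terms per venue (venue = home team, an assumption).
    """
    cols = ([f"{t}_off" for t in teams]
            + [f"{t}_def" for t in teams]
            + [f"{t}_park_home" for t in teams]
            + [f"{t}_park_away" for t in teams])

    def row(active):
        return [1.0 if c in active else 0.0 for c in cols]

    # HomeTeamoff + AwayTeamdef + BallParkhome = home expected runs
    home_row = row({f"{home}_off", f"{away}_def", f"{home}_park_home"})
    # HomeTeamdef + AwayTeamoff + BallParkaway = away expected runs
    away_row = row({f"{home}_def", f"{away}_off", f"{home}_park_away"})
    return cols, [(home_row, home_runs), (away_row, away_runs)]


def park_effects(park_home, park_away):
    """Combine the two fitted ballpark terms into the summary numbers."""
    return {
        "home_field_advantage": park_home - park_away,
        "total_runs_adjustment": park_home + park_away,
    }


# The example game from the text, with made-up team codes.
cols, rows = design_rows("COL", "SD", 5.0, 3.5, ["COL", "SD"])

# Made-up coefficients (not fitted results): +0.2 for the home side and
# -0.2 for the visitors reproduces the old fixed adjustment.
effects = park_effects(0.2, -0.2)
print(effects)
```

With those made-up coefficients the park has a 0.4 run home field advantage and no effect on total runs, so the original approach is just the special case where every park gets the same two terms.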

Results for the 2007-2011 Seasons

The table below summarizes the implied adjustments from the moneylines and totals, averaged over the past five seasons (2007-2011). All the numbers below are in terms of runs (or wins, for the WinProb column). It's an important lesson I picked up from reading Brian Burke's work at Advanced NFL Stats: avoid unit-less stats (e.g. "factors" instead of points/runs, the NFL QB Rating, DVOA, etc.).

As you can see from the Home Field Advantage section, there is not a lot of variation, with most teams right around a 0.4 run advantage and an expected win probability of 54% (I used Pythagorean expectation to convert the run adjustments into an expected win percentage).
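
For anyone who wants to check the runs-to-wins conversion, here is a small Python sketch of Pythagorean expectation. The exponent of 2 is the classic version of the formula and the 4.5 runs per team baseline is my own round number, so both are assumptions; the exact inputs behind the WinProb column may differ.

```python
def pythagorean_win_prob(runs_for, runs_against, exponent=2.0):
    """Pythagorean expectation: RF^x / (RF^x + RA^x)."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


BASELINE_RUNS = 4.5    # assumed runs per team in an evenly matched game
HOME_FIELD_RUNS = 0.4  # home field advantage in runs, from the text

# Split the advantage evenly: +0.2 for the home side, -0.2 for the visitors.
home_runs = BASELINE_RUNS + HOME_FIELD_RUNS / 2
away_runs = BASELINE_RUNS - HOME_FIELD_RUNS / 2

home_win_prob = pythagorean_win_prob(home_runs, away_runs)
print(f"{home_win_prob:.3f}")
```

Under those assumptions the home team comes out at roughly 54%, in line with the figure quoted above. A lower exponent such as 1.83 gives a number in the same neighborhood.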

Total runs, on the other hand, show clear variation from park to park. The results seem pretty consistent with the park factors you can find at Fangraphs or ESPN, with well-known "hitter's parks" like the Rockies' Coors Field, the Rangers' ballpark in Arlington, and Boston's Fenway Park at the top, and notoriously stingy Petco Park in San Diego at the bottom.

Comparing the Home Field and Total columns, there appears to be a correlation between home field advantage and parks that favor hitters.  This could imply something about home field's relative impact on hitting versus pitching, or it could just be an artifact of my methodology.

Next Steps

The next steps are to factor these adjustments into my team rankings, so as to get a more accurate estimate of true offensive and defensive strength (look for the Mariners' Felix Hernandez to drop in the pitcher rankings). I'll probably use a rolling average of the ballpark adjustments for the past two or three years. I also intend to publish the adjustments by year (including 2012 season-to-date) on the Ticker page.

Squaring away these adjustments also gets me closer to re-launching the Today's Games feature I had published previously for the NBA.  I had been testing it out, but found that I wasn't able to predict the moneylines and totals very well.  I'm thinking my lack of ballpark adjustments was the missing factor.
