Tuesday, July 8, 2008

An Interesting Null

My interest in this blog is, one, to rate and rank teams, but, ultimately, to be able to accurately quantify college football teams so that I can more accurately forecast game outcomes. While I'm revving up for another season, I thought it might be interesting to take a closer look at the industry standard in college football forecasting: the Vegas line. And since I'm writing about it now, you can guess I found something that I, at least, consider interesting.

I'll start with a quick note on the Vegas line. The line is not created to forecast results--its sole existential purpose is to split bets 50/50 above and below. If too many bets are made above or below the line, then the line is adjusted. Therefore, the line is a product of the interaction of two forecasting methods. The first method uses a single model (part statistical, part qualitative) that attempts to predict the public attitude. The second method employs market forces, allowing the public to aggregate information and, thus, move the line up or down according to public sentiment. The public responds to the line and the line responds to the public. The Efficient Market Hypothesis (EMH) tells us that if the Vegas casinos provide an open market, all available information should be aggregated in adjusting the line, and it should be impossible to consistently outperform the line without special insider information (which can be purchased from your neighborhood crooked NBA ref). If someone can find a model that consistently outperforms the Vegas line (after it has been adjusted to bettor response), they can establish that the line does not satisfy the EMH, and they can make themselves millionaires. I will not, here, provide any evidence that the line does not satisfy the EMH.

Now, to the numbers. In 2007, the Vegas line and the actual game outcome (both in terms of point differentials) had a correlation of r=.4368. This is relatively high; as I mentioned before, this is the industry standard, but it is not overwhelming. For a little interpretation, if we were to guess the point differential using the line, we would, on average, be about 18% closer than if we just guessed that every game would end in a tie. And that's the industry standard.
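For anyone who wants to run the same kind of check on their own data, here is a minimal sketch. It assumes each game is recorded as a line and an actual margin, both expressed as point differentials from the same team's perspective, and it reads "closer" as a comparison of average absolute misses. That reading is an assumption, not necessarily the calculation behind the 18% figure. The function names and the sample numbers are made up for illustration; they are not the 2007 data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def improvement_over_tie(lines, margins):
    """Fraction by which the line's average miss beats guessing a tie (0) every game."""
    line_miss = sum(abs(m - l) for l, m in zip(lines, margins)) / len(lines)
    tie_miss = sum(abs(m) for m in margins) / len(margins)
    return 1 - line_miss / tie_miss

# Hypothetical games: expected and actual point differentials (same sign convention).
lines = [7.0, -3.5, 14.0, -10.0, 3.0]
margins = [10, -7, 3, -24, -4]

print("r =", round(pearson_r(lines, margins), 4))
print("improvement over guessing a tie =", round(improvement_over_tie(lines, margins), 3))
```

A value of 0 from `improvement_over_tie` means the line does no better than guessing a tie, and a value of 1 means the line hit every margin exactly.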

The line is 12.25 points off from the actual point differential on average. But as you can see in the graph, the distribution is skewed--the average is pulled up by a few cases where the Vegas gamblers really missed the boat.
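A quick way to see this kind of skew in numbers is to compare the mean miss with the median miss: a handful of big misses drags the mean up but barely moves the median. The sketch below uses made-up absolute errors purely for illustration; they are not the 2007 data.

```python
from statistics import mean, median

# Hypothetical absolute misses in points: four ordinary games and one blowout miss.
abs_errors = [2, 5, 7, 9, 40]

avg_miss = mean(abs_errors)        # pulled up by the single 40-point miss
typical_miss = median(abs_errors)  # the middle game, unaffected by the outlier

print("mean miss:", avg_miss)
print("median miss:", typical_miss)
```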

My first theory was that the Vegas line would have a tougher job accurately predicting the point differential in higher scoring games or games with a larger expected point differential. But the correlation between the total combined scoring (total) and the absolute difference between the line and the actual outcome (difference) is only .0635. There is a slight increase in the difference as the total increases, but when we consider that the total has to be large in many cases for the difference to be large, we have to rule this out as a viable theory. So, is the line less accurate when one team is definitely better than the other (which leads to quirky 4th quarters with backups and such)? The answer is, again, a resounding no. In fact, if anything, the trend runs in the opposite direction.

Does the line give preference to favorites or underdogs? If you were to put one dollar on the underdog in each of the 688 games in 2007, you would have gone home a winner 346 times (50.3%), racking up a $4 profit. More impressive than a split that is almost exactly 50/50 is the fact that the mean and standard deviation of outcomes on both sides are almost exactly the same--in other words, the line is right in the middle of its own error distribution.
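The arithmetic behind that record is easy to check. The sketch below assumes even-money $1 bets, with no sportsbook commission and no pushes, which is the only way 346 wins in 688 games comes out to exactly $4. The function name is hypothetical.

```python
def flat_bet_profit(wins, games, stake=1.0):
    """Net profit from equal even-money bets: win the stake or lose the stake."""
    losses = games - wins
    return (wins - losses) * stake

wins, games = 346, 688
win_pct = round(100 * wins / games, 1)
profit = flat_bet_profit(wins, games)

print(win_pct)  # 50.3
print(profit)   # 4.0
```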

These null results were to be expected, and they fit nicely with the efficient market hypothesis--the actual outcomes are normally distributed around the line. But one other null result was not expected. The line does not become a more accurate predictor of outcomes as the season progresses. One would think that, as the season progresses, we get a larger data set that we can use to make more accurate predictions, but instead the predictions don't get more accurate. My only explanation is that injuries through the season cause enough fluctuations to offset the increased sample size, but I still find it surprising that the average error doesn't have more of a downward trend as the season progresses.
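To look for a week-by-week trend like this, one simple approach is to average the line's absolute miss within each week and see whether the averages drift downward. Here is a minimal sketch, assuming each game is stored as a (week, line, margin) tuple with the line and margin on the same sign convention. The function name and the sample games are hypothetical.

```python
from collections import defaultdict

def mean_abs_error_by_week(games):
    """Average absolute miss of the line for each week.

    games: iterable of (week, line, margin) tuples.
    Returns a dict mapping week -> mean absolute error, ordered by week.
    """
    errors = defaultdict(list)
    for week, line, margin in games:
        errors[week].append(abs(margin - line))
    return {week: sum(e) / len(e) for week, e in sorted(errors.items())}

# Made-up games for illustration only.
sample = [(1, -7, -10), (1, 3, 7), (2, 0, 5), (2, 14, 3)]
for week, err in mean_abs_error_by_week(sample).items():
    print(f"week {week}: average miss {err:.2f}")
```

A flat run of weekly averages would match the null result described above; a steady decline would suggest the line sharpens as more games are played.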
