
Tuesday, September 4, 2012

Reverse engineering Vegas and the NFL

I don't gamble, but I love Vegas. Sports gambling creates a semi-liquid information aggregator. As in the stock market, people make gambling decisions based on the information they have, and the more confident they are in that information, the more they will bet. The price goes up or down to reflect the general direction of all this accumulated knowledge. As long as people are rational (which, of course, is not always the case in sports*), odds in Vegas, like prices in the stock market, should reflect all available information (see the efficient market hypothesis). So, even though people are not forced to disclose their information, the consequence of that information is made public.

And I can use that.

I started with NFL win total over/unders. Using those numbers and team schedules, I estimated team power rankings (T), strength of schedule (O) and game-by-game probabilities. With those game-by-game probabilities, I calculated playoff (P = playoff, WC = wild card) and division title probabilities. In short, I used 32 numbers generated by a sports gambling market to map out the NFL season.
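
The fitting step described above can be sketched roughly as follows. This is a minimal illustration, not the model actually used for this post: the four-team league, its win totals, the normal model for point margins, the 13.5-point standard deviation, the 3-point home edge and the simple iterative update are all assumptions made for the example. Note that the toy win totals sum to the number of games played, which the real 262-win market totals do not.

```python
import math

HOME_EDGE = 3.0   # assumed home-field advantage, in points
MARGIN_SD = 13.5  # assumed standard deviation of point margins

def win_prob(home_rating, away_rating):
    """Probability the home team wins, using a normal model of the margin."""
    margin = home_rating - away_rating + HOME_EDGE
    return 0.5 * (1.0 + math.erf(margin / (MARGIN_SD * math.sqrt(2.0))))

def expected_wins(ratings, schedule):
    """Sum game-by-game win probabilities into expected season wins."""
    wins = {team: 0.0 for team in ratings}
    for home, away in schedule:
        p = win_prob(ratings[home], ratings[away])
        wins[home] += p
        wins[away] += 1.0 - p
    return wins

def fit_ratings(targets, schedule, steps=2000, rate=0.5):
    """Nudge each rating until expected wins match the target win totals."""
    ratings = {team: 0.0 for team in targets}
    for _ in range(steps):
        wins = expected_wins(ratings, schedule)
        for team in ratings:
            ratings[team] += rate * (targets[team] - wins[team])
        # Only rating differences matter, so keep the league average at zero.
        mean = sum(ratings.values()) / len(ratings)
        for team in ratings:
            ratings[team] -= mean
    return ratings

# Hypothetical league: every team hosts every other team once (12 games).
teams = ["A", "B", "C", "D"]
schedule = [(h, a) for h in teams for a in teams if h != a]
targets = {"A": 4.0, "B": 3.5, "C": 2.5, "D": 2.0}  # made-up win totals

ratings = fit_ratings(targets, schedule)
fitted_wins = expected_wins(ratings, schedule)
```

Once the ratings are fitted, `win_prob` gives the game-by-game probabilities, which could then be fed into a season simulation to estimate playoff and division odds.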

[The power rankings are scaled to point margins, so to estimate an average margin of victory subtract O from T and add or subtract 3 or 3.5 points for home field advantage.]
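
As a small worked instance of that rule, here is a sketch with made-up numbers. The function name and the example ratings are hypothetical, and the 3-point home edge is just one of the two values mentioned above.

```python
def expected_margin(team_rating, opponent_rating, at_home, home_edge=3.0):
    """Estimated average point margin: T minus O, adjusted for home field."""
    edge = home_edge if at_home else -home_edge
    return team_rating - opponent_rating + edge

# Hypothetical team rated T = 4.0 against opposition rated O = 1.0.
home_margin = expected_margin(4.0, 1.0, at_home=True)   # 4 - 1 + 3 = 6.0
road_margin = expected_margin(4.0, 1.0, at_home=False)  # 4 - 1 - 3 = 0.0
```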

* On that note, these specific win totals add up to 262, which is remarkable since there are only 256 games. I assume that, because most fans are irrationally optimistic about their own team and are also more likely to place a bet on their team, win totals are impossibly optimistic as well.


  1. Interesting, but you really need to factor in the money line odds to get at the true implied win totals. E.g., Ari is under 7 -210, which is closer to a 6.5 implied win total than 7.

    1. That's a fair point. Had this been more than a demonstration exercise I definitely would have needed to account for money lines - for example, if you were to try and use this method to arbitrage win total-based playoff odds against quoted playoff odds.

      I would have needed to set the simulation to match probabilities above and below the win totals instead of the win totals themselves. I quickly abandoned that idea because 1) I didn't want this to become anything more than a quick demonstration exercise and 2) I found that the money lines emerged somewhat spontaneously from the results (corr=.3).

    2. Not sure what you mean by number 2? The money lines simply adjust the true implied o/u numbers. For there to be a correlation between your projections and the direction of the money lines would be fully expected. Or am I not understanding what you are saying?

    3. One way of looking at it is that the o/u money lines compensate for the rounding of the expected win total to the bet-able win total. The expected win total might be 10.9, but 11 is reported with higher odds on the under, therefore implying a value closer to 10.9.

      My model did not use money lines, so it assumed the bet-able win totals were the expected win totals. If, for example, the reported win totals were all set at 8 and o/u money lines were used to imply expected win totals, my model would have rated all 32 teams evenly across the board. This would give them a 50% chance in each game (a little bit better for home games and worse for road games) and an expected win total of 8.

      Now, when I actually ran the model, the final expected win totals from the model were different than the bet-able win totals (one reason for this is that it would not predict 262 total wins). What I am saying is that the error between the model's expected win totals and the bet-able win totals is positively correlated with the difference between the implied win totals from the money lines (the actual expected win totals) and the bet-able win totals. Given the hypothetical example above, there is no reason to inherently expect this (but there are reasons this correlation could be non-random).

  2. I think the win totals add up to 262 because of the vig -- Vegas makes its money off the fact that their win odds add up to over 100%. That's why normal games are priced at -110 vs -110 (or 52.4% to 52.4%, which adds up to over 100%).

    1. That would make sense if they were giving odds for each game. Then a 10 percent overround would produce 25 extra wins. But here the vig comes from the o/u money lines. You don't bet on win totals, you bet on the probability of it being over or under an arbitrary line.
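
The odds arithmetic discussed in the comments above can be checked with a short sketch. The function names are hypothetical, and normalizing the two sides to sum to 1 is only one common way to strip out the vig. The price on the other side of the Arizona line is not quoted above, so only the raw implied probability of the under is computed.

```python
def implied_prob(american_odds):
    """Convert American money line odds to a raw implied probability."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)

def overround(odds_list):
    """Amount by which the implied probabilities of all sides exceed 1."""
    return sum(implied_prob(o) for o in odds_list) - 1.0

def remove_vig(odds_list):
    """Rescale implied probabilities so they sum to 1 (assumed method)."""
    probs = [implied_prob(o) for o in odds_list]
    total = sum(probs)
    return [p / total for p in probs]

standard_side = implied_prob(-110)        # 110 / 210, about 0.524
standard_book = overround([-110, -110])   # about 0.048
fair_probs = remove_vig([-110, -110])     # 0.5 on each side
ari_under = implied_prob(-210)            # 210 / 310, about 0.677

# A 10 percent overround applied to each of the 256 games, as in the reply.
extra_wins = 0.10 * 256                   # 25.6
```

A raw implied probability of roughly 0.68 on the under is why the first commenter reads the Arizona line of 7 as meaning an expected total noticeably below 7.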