Wednesday, September 26, 2012

Who's College Football's Kevin Bacon?

Last week we looked at how few games connect some conferences (Six Degrees of College Football). This makes it virtually impossible to evaluate teams across those conferences, particularly when we are picking two teams to play for the MNC.

Because there are so few games stitching the nation together, those games and the teams playing in those games critically shape our perception of the national landscape. For example, there are only two 1st and 2nd order connections between the SEC and Pac-12. This amounts to six total games, only two of which actually involve both SEC and Pac-12 teams. If Oregon and LSU each have one loss at the end of the season and we are trying to decide which one gets to lose to Alabama in the MNC game, those six games, and especially the two 1st order linkages, will most strongly influence our opinion of the two conferences and the teams from those conferences. (Looking back to last season, Alabama ultimately got a second shot at LSU because LSU whooped up on Oregon in the first week.) If Washington plays below its ability against LSU and then over its head in conference games, Washington will make it significantly harder for Oregon to play for a national championship.

So, who's our Kevin Bacon?

We can connect every FBS team with every other team using one (A-B), two (A-C-B), three (A-C-D-B), or four (A-C-D-E-B) games. Almost every team has between 11 and 12 1st order connections (the teams they play), teams average 43 unique 2nd order connections (meaning there is no 1st order connection between the two teams), 65 unique 3rd order connections, and four unique 4th order connections.
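As a rough illustration of how these connection orders can be counted, here is a minimal sketch that runs a breadth-first search over a schedule graph. The `games` list is a made-up toy schedule and the function name `connection_orders` is hypothetical; a real run would load the full FBS schedule instead.

```python
from collections import deque

def connection_orders(games, start):
    """Return {team: order}, the fewest games linking `start` to each other team."""
    # Build an undirected graph: an edge means the two teams play each other.
    opponents = {}
    for a, b in games:
        opponents.setdefault(a, set()).add(b)
        opponents.setdefault(b, set()).add(a)

    # Breadth-first search visits teams in increasing order of connection.
    orders = {start: 0}
    queue = deque([start])
    while queue:
        team = queue.popleft()
        for opp in opponents.get(team, ()):
            if opp not in orders:
                orders[opp] = orders[team] + 1
                queue.append(opp)
    del orders[start]  # a team is not a connection to itself
    return orders

# Toy schedule (illustrative only, not real 2012 data).
games = [("A", "C"), ("C", "D"), ("D", "E"), ("E", "B"), ("A", "F")]
orders = connection_orders(games, "A")
```

In this toy schedule A reaches C and F in one game, D in two, E in three, and B in four, which mirrors the A-C-D-E-B chain described above.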

Before we get into team-by-team results, we need a quick discussion of the transitive property. In college football, and sports generally, the transitive property is used to evaluate teams that do not play each other. For example, A does not play B, but A beat C by 10 and C lost to B by 10, so if A and B did play we could expect a close game. Now, I've heard smart college football guys dismiss the transitive property as a useless way to evaluate teams. That's absurd. We have to use the transitive property. If you don't believe that yet, you haven't been paying attention.

The key is to understand how the transitive property actually works. In every college football game there is randomness or, as we say in the business, error. Houston should have beaten Texas St, and if we replayed that game hundreds of times they would win most of the time. But Texas St won that day by 17. Now, error is not predictable (or it wouldn't be error), but the distribution of error (the amount of error over a series of games) is very predictable, or standard. We call the standard amount of error the standard deviation. (Standard error is something different that I'll get into later.) If two teams played 1,000 times (and conditions like weather and player health varied normally from game to game), 45% of the time the final point margin would be within a range of two touchdowns and 75% of the time it would be within four touchdowns.
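Those percentages can be roughly checked with a normal distribution. The sketch below makes two assumptions that are mine rather than the post's: that game error is normally distributed with the 12.2-point standard deviation quoted later in the post, and that "a range of two touchdowns" means a 14-point window centered on the expected margin (7 points either way). The helper name `prob_within` is hypothetical.

```python
from statistics import NormalDist

SIGMA = 12.2  # points; single-game standard deviation (taken from later in the post)

def prob_within(half_width, sigma=SIGMA):
    """Probability the margin lands within +/- half_width of its expected value."""
    dist = NormalDist(mu=0.0, sigma=sigma)
    return dist.cdf(half_width) - dist.cdf(-half_width)

two_td = prob_within(7)    # 14-point-wide window (assumed reading of "two touchdowns")
four_td = prob_within(14)  # 28-point-wide window (assumed reading of "four touchdowns")
```

Under those assumptions the two probabilities come out at about 0.43 and 0.75, close to the figures above.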

When we compare two teams based on a head-to-head matchup we should keep that error in mind. If all we know is that Texas St beat Houston by 17 we can use the standard deviation to determine that there is an 88% chance that Texas St is the better team, a 10% chance that Houston is the better team, and a 50% chance Texas St is 17 or more points better than Houston. But we cannot say that Texas St is 17 points better than Houston!

And we definitely cannot say that Texas Tech is 65 points better than Houston because they beat Texas St by 48. The problem with the transitive property is that each new game introduces more error. Because independent errors add in variance, the standard deviation grows with the square root of the number of games in the chain. If the standard deviation when A plays B is 12.2 points, then the standard deviation for comparing A and B based on their scores against C is 12.2*1.4142136, or about 17.3 points. If A played C who played D who played B, the standard deviation of the transitive sum between A and B is 12.2*1.7320508, or about 21.1 points, and with a fourth game in the chain it reaches 12.2*2=24.4. With each order, the standard deviation balloons. The transitive property is only useful because we can add up expectations across dozens of linkages, not just the one, and increasing the sample size (the number of linkages we are using to compare teams) reduces the standard error (the standard deviation of the expectations across the dozens of linkages).
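To make the arithmetic concrete, here is a small sketch of how the error grows along a chain and shrinks with more linkages. It assumes every game's error is independent with the same 12.2-point standard deviation, and that the linkages being averaged do not share games; real linkages often do, so treat the standard error here as optimistic. The function names `chain_sd` and `standard_error` are hypothetical.

```python
import math

SIGMA = 12.2  # points; single-game standard deviation from the text

def chain_sd(n_games, sigma=SIGMA):
    """Standard deviation of a transitive comparison built from n_games results.

    Independent errors add in variance, so the SD scales with sqrt(n_games).
    """
    return sigma * math.sqrt(n_games)

def standard_error(n_games, n_linkages, sigma=SIGMA):
    """Standard error when averaging n_linkages independent chains of equal length."""
    return chain_sd(n_games, sigma) / math.sqrt(n_linkages)

one_game = chain_sd(1)                # head-to-head
two_games = chain_sd(2)               # common opponent (A-C-B)
four_games = chain_sd(4)              # A-C-D-E-B
two_links = standard_error(2, 2)      # only two 2nd order linkages
many_links = standard_error(2, 25)    # twenty-five 2nd order linkages
```

With only two common-opponent linkages the standard error is no better than a single head-to-head game (12.2 points), while twenty-five such linkages would cut it to roughly 3.5 points.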

But if there are only two 1st order connections and two 2nd order connections between the SEC and Pac-12, we don't have dozens of linkages to draw on. Our sample size will be small, our standard error large, and individual games and individual teams will shape national rankings more than the hundreds of head-to-head matchups in conference games. In network analysis we say that the people who bridge networks hold key positions of power. The same is true of college football. Those teams are our Kevin Bacons.

The table below lists all 124 teams and their Kevin Bacon score (KBS). The Kevin Bacon score is based on the average amount of error introduced when connecting any one team to any other team. If that doesn't work for you, just think of it as how good each team is at connecting other teams.

In general, Kevin Baconess increases as a team plays fewer conference games, so independents have an advantage. Navy plays teams in six different conferences and two independents. Notre Dame has a similarly diverse schedule, and given the number of top-level programs on its schedule, Notre Dame is clearly college football's kingmaker in 2012. Southern Miss plays non-conference games against teams all around the country.

The bottom end of the table is dominated by the Pac-12.
