College Football by the Numbers: The Network Rankings

Tuesday, November 20, 2012

The Network Rankings

by J. Patrick Rhamey Jr., PhD

There are myriad college football rankings, and, as Scott has noted, the statistical choices that go into each are inherently subjective. I am going to add one more ranking to the many that already exist, but as the comparison below demonstrates, most existing rankings are now converging toward what my ranking already listed last week (Notre Dame 1, Alabama 2), which makes a compelling case for the method. The subjective choice about the relevant data to include is simple: none of it. To create my rankings I use only wins and losses, ignoring statistics such as margin of victory, total yards, and turnovers, for two reasons. First, we are asking an enormous amount of our data if we generate predictions of probable team performance from a limited number of observations (at this point, 10 or 11 games). Second, the statistics used in existing rankings are not comparable across games, given the dramatically different contexts. If two teams both have 300 total yards of offense, but one team played in the snow in Ann Arbor while the other played on a sunny day in the Rose Bowl, we are treating incomparable numbers as equivalent when we generate relative rank. Margin of victory is perhaps one of the more egregious variables, with large margins frequently a better indicator of poor sportsmanship, personal grudges, or heated historical rivalries than of excellence.

Criticism of existing rankings, however, accomplishes little without a proposed replacement that solves these problems. I’ll begin with the same conceptual principle as Scott Albrecht, Wesley Colley, and others and say that any ranking begins with wins and losses. However, unlike the alternatives, that is also where I’ll end. The only information we need to rank FBS teams is their win-loss records; all the other proposed metrics (margin of victory, total yards, turnovers) amount to little more than noise. Conceptually, if every team in the FBS played every other team in the FBS, there would be no question as to who was number one. That’s not the case, so other rankings resort to collecting additional statistical information to fill in the gaps. This is essentially what both Albrecht’s BPR ranking and the Colley Matrix do in their “second steps”: they generate a relative weighting of wins and losses by the predicted probability of victory, or strength of schedule. This second step asks too much of limited data and, more importantly, is unnecessary. Using a technique called Network Analysis, we can limit ourselves to analyzing the pattern of wins and losses without having to predict the outcomes of probable matchups or include various other statistics.

By mid-season (week 7 this year), if we draw lines between teams that have played one another, every team is connected by some number of degrees of separation to every other team. The figure below shows the network of games played between FBS teams following week 12. An arrow pointing at a team signifies that the team won the game.

As we expect, teams in the same conference cluster together tightly, given that most games are intra-conference. You can interpret teams’ proximity to one another as the degree to which they share a similar schedule. Because every team is now linked in some way to every other team, we can order the quality of each team’s wins by how central the team is in the web of wins in the network. To determine how bad its losses are, we can do the same thing for the web of losses. Think of it like this: we’re playing a big game of six degrees of separation and trying to figure out which team is Kevin Bacon, the common denominator to which all other teams in the FBS are connected. The team that reaches the most teams through its wins in the fewest degrees of separation (and likewise the fewest teams through its losses) is the highest-ranked team. As an example, if Alabama beats 10 teams, but those 10 teams lose all their other games, Alabama is connected to only 10 teams through its network of wins. That’s not very good. If Alabama beats 10 teams, and those 10 teams defeat all their other opponents, Alabama is now connected within two degrees of separation to as many as 100 teams. That’s a lot better.
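The degrees-of-separation idea can be sketched in a few lines of Python. This is only an illustration, not the Ucinet procedure used for the rankings: the function name reach_by_degree, the dictionary layout (winner mapped to the teams it beat), and the single-letter team names are all assumptions made up for the example.

```python
from collections import deque

def reach_by_degree(wins, team):
    """Count how many teams are first reached at each degree of separation,
    following only chains of wins (team beat X, X beat Y, ...)."""
    dist = {team: 0}
    queue = deque([team])
    while queue:
        current = queue.popleft()
        for beaten in wins.get(current, ()):
            if beaten not in dist:          # first (shortest) path wins
                dist[beaten] = dist[current] + 1
                queue.append(beaten)
    counts = {}
    for other, degree in dist.items():
        if degree > 0:                      # skip the starting team itself
            counts[degree] = counts.get(degree, 0) + 1
    return counts

# Hypothetical toy schedule: winner -> teams it defeated.
wins = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["E", "F"],
}

print(reach_by_degree(wins, "A"))   # {1: 2, 2: 3}
print(reach_by_degree(wins, "B"))   # {1: 2}
```

Team A beat only two opponents, but because those opponents also won games, A reaches five teams within two degrees. Team B beat two opponents that won nothing, so its reach stops at two.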

We can add up all these links between teams using a measure called “average reciprocal distance” (ARD), a centrality measure in network analysis (the program Ucinet 6 by Borgatti, Everett, and Freeman was used to generate the above illustration and the following rankings). ARD summarizes how close each team is, on average, to every other team in the network: each path contributes the reciprocal of its length, so nearer teams count for more. I calculate it separately for the paths of wins and the paths of losses. The higher the value, the more central or connected a team is in the network, meaning it is connected to more teams by fewer links. In the network of wins, a higher ARD translates to a higher rank. In the network of losses, greater centrality results in a lower rank. Because a loss is the direct inverse of a win, we simply subtract a team’s centrality in the FBS network of losses from its centrality in the FBS network of wins. For a full discussion of the underlying concept, see Steve Borgatti’s research on the key player problem (https://sites.google.com/site/steveborgatti/research/publications).
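As a rough sketch of the calculation described above, the following Python computes a reciprocal-distance centrality on the wins network and on the losses network, then subtracts one from the other. It is not Ucinet’s implementation: the normalization (dividing by the number of other teams), the function names, and the toy results are assumptions, so the scale will not match the values in the table below. Only the logic is illustrated.

```python
from collections import deque

def shortest_distances(graph, source):
    """Breadth-first search: shortest number of links from source to each reachable team."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, ()):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist

def ard(graph, teams):
    """Sum of 1/distance to every reachable team, averaged over the other teams.
    Unreachable teams contribute zero. The averaging is an assumed normalization."""
    scores = {}
    for team in teams:
        dist = shortest_distances(graph, team)
        total = sum(1.0 / d for t, d in dist.items() if t != team)
        scores[team] = total / (len(teams) - 1)
    return scores

def network_scores(games):
    """games is a list of (winner, loser) pairs. Returns ARD(wins) - ARD(losses) per team."""
    teams = sorted({t for game in games for t in game})
    wins, losses = {}, {}
    for winner, loser in games:
        wins.setdefault(winner, []).append(loser)    # follow chains of wins
        losses.setdefault(loser, []).append(winner)  # follow chains of losses
    ard_wins, ard_losses = ard(wins, teams), ard(losses, teams)
    return {t: ard_wins[t] - ard_losses[t] for t in teams}

# Hypothetical results, not real games.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
scores = network_scores(games)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy schedule the undefeated team A comes out on top and the winless team D at the bottom, with B ahead of C because B’s only loss is to A while C has lost to both A and B.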

Since we now know the outcomes of the games played this weekend, we’ll use last week’s rankings as a comparison. The table below lists (1) the ARD in the network of FBS wins, (2) the ARD in the network of FBS losses, and (3) the win ARD minus the loss ARD, which generates (4) the Network Ranking. The remaining columns compare the Network Ranking with the BCS, the AP Poll, and Scott Albrecht’s (Hybrid) ranking for all teams in the top 10 of at least one of the rankings.

Rankings preceding the Week of November 11-17.

Team	ARD Wins	ARD Losses	Wins – Losses	Network Rank	BCS	AP	Albrecht
Notre Dame	48.27	0	48.27	1	3	3	4
Alabama	45.56	2.58	42.98	2	4	4	3
Florida	44.81	2.28	42.53	3	6	7	5
Ohio State	41.96	0	41.96	4	-	6	7
Oregon	41.53	0	41.53	5	2	1	1
LSU	43.29	3.33	39.96	6	7	8	13
Georgia	42.26	2.58	39.67	7	5	5	6
Kansas State	39.36	0	39.36	8	1	2	2
Texas A&M	41.36	3.33	38.02	9	8	9	8
South Carolina	40.42	3.33	37.09	10	9	12	11
Oklahoma	34.19	2	32.19	12	12	13	9
Florida State	34.75	23.74	11.01	32	10	10	10

First, we see how the Network Ranking operates. LSU, for example, is more central to the network of FBS wins than Ohio State or Oregon going into the week, but its two losses, while not strongly central to the network of FBS losses, drag it down to number 6 (by comparison, the ARD loss score for New Mexico State, the bottom-ranked team, is 46.23).
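The LSU example can be checked directly from the table. The short Python sketch below copies the two ARD columns from the table, recomputes the difference, and sorts. The variable names are illustrative, and because only the twelve listed teams are included, positions beyond the top 10 are positions within this subset rather than the full Network Rank.

```python
# (ARD wins, ARD losses) copied from the table above.
table = {
    "Notre Dame":     (48.27, 0.0),
    "Alabama":        (45.56, 2.58),
    "Florida":        (44.81, 2.28),
    "Ohio State":     (41.96, 0.0),
    "Oregon":         (41.53, 0.0),
    "LSU":            (43.29, 3.33),
    "Georgia":        (42.26, 2.58),
    "Kansas State":   (39.36, 0.0),
    "Texas A&M":      (41.36, 3.33),
    "South Carolina": (40.42, 3.33),
    "Oklahoma":       (34.19, 2.0),
    "Florida State":  (34.75, 23.74),
}

# Column (3): win ARD minus loss ARD, rounded to two decimals.
diff = {team: round(w - l, 2) for team, (w, l) in table.items()}

# Column (4): order teams by the difference, highest first.
order = sorted(diff, key=diff.get, reverse=True)

for position, team in enumerate(order, start=1):
    print(position, team, diff[team])
```

LSU’s win ARD (43.29) is higher than Ohio State’s (41.96) and Oregon’s (41.53), but after subtracting its loss ARD of 3.33 it falls behind both. A couple of recomputed differences land 0.01 away from the printed column (Georgia, Texas A&M), presumably because the published columns were rounded from unrounded values.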

Second, while no one could have predicted the outcome of the Kansas State and Oregon games, the Network Ranking is alone in suggesting that both teams were over-ranked: they had not yet provided sufficient evidence of their centrality to the network of FBS wins to merit being ranked over Notre Dame, Alabama, and others. We see a similar dissonance with Florida State further down the list. Prediction (and the predictive modeling inherent in most computer rankings) is exactly the problem exposed by the Oregon and Kansas State collapses this weekend. Predicting from such a limited number of observations (or in practice, as Kirk Herbstreit refers to it, the “look test”) will result in correspondingly limited success.

Third, as we would expect, the rankings are now converging. Every ranking this week has Notre Dame #1 and Alabama #2, and the Network Ranking is no different. But that is what the Network Ranking had last week! In other words, convergence is happening, but all the other rankings are converging toward the Network Rankings.

The Network Rankings conceptually aren’t doing anything new – they are built on the same goal of ranking teams based on wins and losses. However, unlike alternatives, the method underlying the Network Rankings best corresponds with that goal, with the result being a ranking that actually ranks based on wins and losses rather than predicted probabilities from incomparable metrics. If we’re going to rank on wins and losses, all we need is wins, losses, and some careful counting of links between teams.

J. Patrick Rhamey Jr., PhD

Assistant Professor

International Studies and Political Science

Virginia Military Institute

rhameyjp@vmi.edu

BPR	A system for ranking teams based only on wins, losses, and strength of schedule. See BPR for an explanation.
EPA (Expected Points Added)	Expected points are the points a team can "expect" to score based on the distance to the end zone and the down and distance needed for a first down, with an adjustment for the amount of time remaining in some situations. Expected points for every situation are estimated using seven years of historical data. The estimate considers both the average points the offense scores in each scenario and the average number of points the other team scores on its ensuing possession. The Expected Points Added is the change in expected points before and after a play.
EP3 (Effective Points Per Possession)	Effective Points Per Possession is based on the same logic as the EPA, except it focuses on the expected points added at the beginning and end of an offensive drive. In other words, the EP3 for a single drive is equal to the sum of the expected points added for every offensive play in a drive (EP3 does not include punts and field goal attempts). We can also think of the EP3 as points scored + expected points from a field goal + the value of field position change on the opponent's next possession.
Adjusted for Competition	We attempt to adjust some statistics to compensate for differences in strength of schedule. While the exact approach varies somewhat from stat to stat, the basic concept is the same. We use an algorithm to estimate scores for all teams on both sides of the ball (i.e., offense and defense) that best predict real results. For example, we give every team an offensive and defensive yards per carry score. Subtracting the offensive score from the defensive score for two opposing teams will estimate the yards per carry if the two teams were to play. Generally, the defensive scores average to zero while offensive scores average to the national average (e.g., of yards per carry), so we call the offensive score "adjusted for competition"; it roughly reflects what the team would do against average competition.
Impact	See Adjusted for Competition. Impact scores are generally used to evaluate defenses. The value roughly reflects how much better or worse a team can expect to do against this opponent than against the average opponent.

Total <=0	Percent of plays that are negative or no gain
Total >=10	Percent of plays that gain 10 or more yards
Total >=25	Percent of plays that gain 25 or more yards
10 to 0	Ratio of Total >=10 to Total <=0

3rdLComp%	Completion % on 3rd and long (7+ yards)
SitComp%	Standardized completion % for down and distance. Completion % by down and distance are weighted by the national average of pass plays by down and distance.
Pass <=0	Percent of pass plays that are negative or no gain
Pass >=10	Percent of pass plays that gain 10 or more yards
Pass >=25	Percent of pass plays that gain 25 or more yards
10 to 0	Ratio of Pass >=10 to Pass <=0
%Sacks	Ratio of sacks to pass plays
Bad INTs	Interceptions on 1st or 2nd down early before the last minute of the half

YPC1stD	Yards per carry on 1st down
CPCs	Conversions (1st down/TD) per carry in short yardage situations, where the team needs 3 or fewer yards for a 1st down or touchdown
%Team Run	Player's carries as a percent of team's carries
%Team RunS	Player's carries as a percent of team's carries in short yardage situations
Run <=0	Percent of running plays that are negative or no gain
Run >=10	Percent of running plays that gain 10 or more yards
Run >=25	Percent of running plays that gain 25 or more yards
10 to 0	Ratio of Run >=10 to Run <=0

Conv/T 3rd	Conversions per target on 3rd Downs
Conv/T PZ	Touchdowns per target inside the 10 yardline
%Team PZ	Percent of team's targets inside the 10 yardline
Rec <=0	Percent of targets that go for negative yards or no net gain
Rec >=10	Percent of targets that go for 10+ yards
Rec >=25	Percent of targets that go for 25+ yards
10 to 0	Ratio of Rec >=10 to Rec <=0

NEPA	"Net Expected Points Added": (expected points after play - expected points before play)-(opponent's expected points after play - opponent's expected points before play). Uses the expected points for the current possession and the opponent's next possession based on down, distance and spot
NEPA/PP	Average NEPA per play
Max/Min	Single game high and low

Points/Poss	Offensive points per possession
EP3	Effective Points per Possession
EP3+	Effective Points per Possession impact
Plays/Poss	Plays per possession
Yards/Poss	Yards per possession
Start Spot	Average starting field position
Time of Poss	Average time of possession (in seconds)
TD/Poss	Touchdowns per possession
TO/Poss	Turnovers per possession
FGA/Poss	Attempted field goals per possession
%RZ	Red zone trips per possession
Points/RZ	Average points per red zone trip. Field Goals are included using expected points, not actual points.
TD/RZ	Touchdowns per red zone trip
FGA/RZ	Field goal attempt per red zone trip
Downs/RZ	Turnover on downs per red zone trip

EPA/Pass	Expected Points Added per pass attempt
EPA/Rush	Expected Points Added per rush attempt
EPA/Pass+	Expected Points Added per pass attempt impact
EPA/Rush+	Expected Points Added per rush attempt impact
Yards/Pass	Yards per pass
Yards/Rush	Yards per rush
Yards/Pass+	Yards per pass impact
Yards/Rush+	Yards per rush impact
Exp/Pass	Explosive plays (25+ yards) per pass
Exp/Rush	Explosive plays (25+ yards) per rush
Exp/Pass+	Explosive plays (25+ yards) per pass impact
Exp/Rush+	Explosive plays (25+ yards) per rush impact
Comp%	Completion percentage
Comp%+	Completion percentage impact
Yards/Comp	Yards per completion
Sack/Pass	Sacks per pass
Sack/Pass+	Sacks per pass impact
Sack/Pass*	Sacks per pass on passing downs
INT/Pass	Interceptions per pass
Neg/Rush	Negative plays (<=0) per rush
Neg/Run+	Negative plays (<=0) per rush impact
Run Short	% Runs in short yardage situations
Convert%	3rd/4th down conversions
Conv%*	3rd/4th down conversions versus average by distance
Conv%+	3rd/4th down conversions versus average by distance impact

Plays	Number of offensive plays
%Pass	Percent pass plays
EPA/Pass	Expected Points Added per pass attempt
EPA/Rush	Expected Points Added per rush attempt
EPA/Pass+	Expected Points Added per pass attempt adjusted for competition
EPA/Rush+	Expected Points Added per rush attempt adjusted for competition
Yards/Pass	Yards per pass
Yards/Rush	Yards per rush
Yards/Pass+	Yards per pass adjusted for competition
Yards/Rush+	Yards per rush adjusted for competition
Exp Pass	Explosive plays (25+ yards) per pass
Exp Run	Explosive plays (25+ yards) per rush
Exp Pass+	Explosive plays (25+ yards) per pass adjusted for competition
Exp Run+	Explosive plays (25+ yards) per rush adjusted for competition
Comp%	Completion percentage
Comp%+	Completion percentage adjusted for competition
Sack/Pass	Sacks per pass
Sack/Pass+	Sacks per pass adjusted for competition
Sack/Pass*	Sacks per pass on passing downs
Int/Pass	Interceptions per pass
Neg/Run	Negative plays (<=0) per rush
Neg/Run+	Negative plays (<=0) per rush adjusted for competition
Run Short	% Runs in short yardage situations
Convert%	3rd/4th down conversions
Conv%*	3rd/4th down conversions versus average by distance
Conv%+	3rd/4th down conversions versus average by distance adjusted for competition

PPP	Points per Possession
aPPP	Points per Possession allowed
PPE	Points per Exchange (PPP-aPPP)
EP3+	Effective Points per Possession
aEP3+	Effective Points per Possession allowed
EP2E+	Expected Points per Exchange
EPA/Pass+	Expected Points Added per Pass
EPA/Rush+	Expected Points Added per Rush
aEPA/Pass+	Expected Points Allowed per Pass
aEPA/Rush+	Expected Points Allowed per Rush
Exp/Pass	Explosive Plays per Pass
Exp/Rush	Explosive Plays per Rush
aExp/Pass	Explosive Plays per Pass allowed
aExp/Rush	Explosive Plays per Rush allowed

BPR	A method for ranking conferences based only on their wins and losses and the strength of schedule. See BPR for an explanation.
Power	A composite measure that is the best predictor of future game outcomes, averaged across all teams in the conference
P-Top	The power ranking of the top teams in the conference
P-Mid	The power ranking of the middling teams in the conference
P-Bot	The power ranking of the worst teams in the conference
SOS-Und	Strength of Schedule - Undefeated. Focuses on the difficulty of going undefeated, averaged across teams in the conference
SOS-BE	Strength of Schedule - Bowl Eligible. Focuses on the difficulty of becoming bowl eligible, averaged across teams in the conference
Hybrid	A composite measure that quantifies human polls, applied to conferences

EPA	Expected points added (see glossary)
oEPA	Defense-independent offensive performance

EP3	Effective points per possession (see glossary)
oEP3	Defense-independent offensive performance
dEP3	Offense-independent defensive performance
EPA	Expected points added (see glossary)
oEPA	Defense-independent offensive performance
dEPA	Offense-independent defensive performance
EPAp	Expected points added per play