College Football by the Numbers: Rectifying a Stupid Conclusion


Monday, August 18, 2008

Rectifying a Stupid Conclusion - Preseason Polls

Georgia is #1, but should we care?

This time of year, we hear a lot about preseason polls and, in response, we hear that pollsters don't know much this early, so preseason polls are just entertainment. One might point out, for example, that since the AP started preseason polling in 1950, the final #1 held the top preseason spot only 10 times (17%) and, more damning, that 6 times the eventual national champion was not ranked in the AP preseason poll at all.

But to use these numbers to suggest that the preseason poll doesn't mean much is premature and, well, wrong. I used a simple logistic model and data from the AP Poll Archive and found that preseason rankings are more important than you might think.

First, a team in the top spot in the preseason is 29 times more likely to win the national championship than a team that isn't. To clarify, that doesn't mean that Georgia is 29 times more likely to take it all than USC, but that Georgia is 29 times more likely to win it all than the average college football team. But that shouldn't surprise anyone--of course the Dawgs have a better shot than, say, Wyoming.
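To see where a figure like this can come from, here is a minimal sketch that computes an odds ratio from raw counts. It is not the logistic model used in this post. The 10 titles come from the numbers above; the 58 seasons assume polls from 1950 through 2007 with one champion per season; and the 120 teams per season is a made-up round number.

```python
def odds(successes: int, total: int) -> float:
    """Odds of an event: successes divided by failures."""
    return successes / (total - successes)

SEASONS = 58             # assumption: 1950 through 2007, one champion each
TITLES_BY_NO1 = 10       # from the post: preseason #1 finished #1
TEAMS_PER_SEASON = 120   # hypothetical, not from the post

# Every team-season that did not start at #1, and the titles they won.
other_team_seasons = SEASONS * (TEAMS_PER_SEASON - 1)
titles_by_others = SEASONS - TITLES_BY_NO1

# Odds ratio: odds of a title for the preseason #1 vs. everyone else.
odds_ratio = odds(TITLES_BY_NO1, SEASONS) / odds(titles_by_others, other_team_seasons)

# Relative risk: ratio of the two probabilities rather than the two odds.
relative_risk = (TITLES_BY_NO1 / SEASONS) / (titles_by_others / other_team_seasons)

print(f"odds ratio:    {odds_ratio:.1f}")
print(f"relative risk: {relative_risk:.1f}")
```

Under these assumed counts the two measures come out different. An odds ratio compares odds, while "times more likely" in the everyday sense describes the relative risk, and the gap between them widens as the probabilities involved grow.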

But ranking matters even for those at the top. The top dog, no pun intended, is almost 5 times more likely to be #1 at the end of the season than the average ranked team, 2.6 times more likely to achieve that result than other top 5 teams, and 1.5 times more likely than the #2 team to be on top at season's end. And for the statistically minded, those results are statistically significant.

Finally, I present the results for the most comprehensive model I have tried:

The important numbers for our purposes are the odds ratios, in red, which compare the odds of a team with a particular preseason rank winning the national championship to the odds for the average unranked team. Teams that start off on top are 200 times more likely to win the national championship than teams that start off unranked, teams that are #2 at the beginning are 133 times more likely to win it all than the unranked teams, and so on.
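For readers who want the algebra, one common way to set up a model like this is sketched below. The exact specification is not given in this post, so treat the form as an assumption: each preseason rank gets its own indicator variable, and unranked teams serve as the reference group.

```latex
\[
\log\frac{p_i}{1 - p_i}
  = \beta_0 + \sum_{k=1}^{K} \beta_k \,\mathbf{1}[\mathrm{rank}_i = k],
\qquad
\mathrm{OR}_k = e^{\beta_k}
\]
```

Here $p_i$ is the probability that team-season $i$ ends as national champion, $K$ is the number of ranked spots in the poll, and $\mathrm{OR}_k$ is the odds ratio for preseason rank $k$ relative to an unranked team. When both probabilities are small, the odds ratio is close to the ratio of the probabilities themselves.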

In other words, preseason polls matter, and they matter a lot--the numbers presented here are large and significant. It's good to be #1.

9 comments:

Matt Crawford, August 20, 2008
But we would expect the #1 team to be better than the #2 team, right? We would assume the pollsters know something, however small. So the real question is, removing differences in team quality, does being ranked #1 in the preseason poll help you win the championship? I don't know how to answer that.

Matt Crawford, August 20, 2008
I wish I could edit comments.

Perhaps you could include another variable or two that tried to proxy for team skill -- points for, points against, or pythagorean winning percentage. Then see how important the preseason ranking is.

Unknown, August 21, 2008
That's a really good idea - slightly different than what I was going for here - but it would require much more complete data than I now have. Here I looked exclusively at the predictive power of preseason rankings because I only had the preseason rankings for future national champions (and I could then infer the rankings for the non-champions, because those teams would fill every preseason poll spot the future champs didn't occupy, if that makes any sense). That's why I looked only at the probability of winning a national championship.

To do what you're talking about I would need complete rankings with names attached (and then I could match them to historical performance rankings that I have on hand). I would either need to enter it into Excel as a data set by hand (which I'm not willing to do) or find a .csv file online somewhere. But I would be interested in seeing what comes out (my guess is that preseason rankings would have no effect for major conference teams and a moderate effect for mid-major teams).

Matt Crawford, August 26, 2008
Good point, and I see what you're saying that it would be slightly different than your question.

I think what bothers people (and me) about the pre-season ranking is that the voters tend to make the top-ranked teams "sticky." So if the top-ranked teams in the pre-season end up the year with the same number of losses, the #1 team will still be #1.

I guess I would try to model that by including a term for # of losses. Or # of losses - (least # of losses by any team).

Not suggesting you do that, since like you said you don't have the data and it would be a pain to get it. Just interesting to think about.
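The second term suggested in this comment, losses minus the fewest losses by any team, could be computed along these lines. This is only a sketch: the function name and the loss counts are hypothetical and do not come from any real season.

```python
def losses_behind_best(losses_by_team: dict[str, int]) -> dict[str, int]:
    """Each team's losses minus the fewest losses recorded by any team."""
    fewest = min(losses_by_team.values())
    return {team: losses - fewest for team, losses in losses_by_team.items()}

# Hypothetical end-of-season loss counts, for illustration only.
example = {"Team A": 1, "Team B": 1, "Team C": 3}

print(losses_behind_best(example))
```

Two teams tied for the fewest losses both get a value of zero, which is the situation where poll "stickiness" would have to break the tie.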

Unknown, August 26, 2008
As soon as I can get my hands on the data I'm going to look into it. I agree that polls can be "sticky" sometimes, and I also hate it when Notre Dame and Michigan (as good current examples) start out way too high or jump 27 spots because they win a big game. I can already envision some interesting analyses of poll stickiness and poll love.

BPR	A system for ranking teams based only on wins and losses and strength of schedule. See BPR for an explanation.
EPA (Expected Points Added)	Expected points are the points a team can "expect" to score based on the distance to the end zone and the down and distance needed for a first down, with an adjustment for the amount of time remaining in some situations. Expected points for every situation are estimated using seven years of historical data. The estimate considers both the average points the offense scores in each scenario and the average number of points the other team scores on its ensuing possession. The Expected Points Added is the change in expected points before and after a play.
EP3 (Effective Points Per Possession)	Effective Points Per Possession is based on the same logic as the EPA, except it focuses on the expected points added at the beginning and end of an offensive drive. In other words, the EP3 for a single drive is equal to the sum of the expected points added for every offensive play in a drive (EP3 does not include punts and field goal attempts). We can also think of the EP3 as points scored+expected points from a field goal+the value of field position change on the opponent's next possession.
Adjusted for Competition	We attempt to adjust some statistics to compensate for differences in strength of schedule. While the exact approach varies somewhat from stat to stat, the basic concept is the same. We use an algorithm to estimate scores for all teams on both sides of the ball (e.g., offense and defense) that best predict real results. For example, we give every team an offensive and defensive yards per carry score. Subtracting the offensive score from the defensive score for two opposing teams will estimate the yards per carry if the two teams were to play. Generally, the defensive scores average to zero while offensive scores average to the national average, e.g., yards per carry, so we call the offensive score "adjusted for competition"; it roughly reflects what the team would do against average competition.
Impact	See Adjusted for Competition. Impact scores are generally used to evaluate defenses. The value roughly reflects how much better or worse a team can expect to do against this opponent than against the average opponent.
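The EPA and EP3 definitions above can be sketched in a few lines. This is an illustration only: the `Play` record and the expected-points values are hypothetical, and a real implementation would look expected points up from historical data by down, distance, and field position.

```python
from dataclasses import dataclass

@dataclass
class Play:
    kind: str         # "run", "pass", "punt", or "field_goal"
    ep_before: float  # expected points before the snap (hypothetical values)
    ep_after: float   # expected points after the play

def epa(play: Play) -> float:
    """Expected Points Added: change in expected points across one play."""
    return play.ep_after - play.ep_before

def ep3(drive: list[Play]) -> float:
    """Sum of EPA over a drive, skipping punts and field goal attempts."""
    return sum(epa(p) for p in drive if p.kind not in ("punt", "field_goal"))

# A made-up three-play drive that ends in a punt.
drive = [
    Play("run", 1.0, 1.5),
    Play("pass", 1.5, 0.5),
    Play("punt", 0.5, -0.25),
]

print([epa(p) for p in drive])
print(ep3(drive))
```

The punt still has an EPA of its own; it is simply left out of the drive total, following the glossary's note that EP3 does not include punts and field goal attempts.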

Total <=0	Percent of plays that are negative or no gain
Total >=10	Percent of plays that gain 10 or more yards
Total >=25	Percent of plays that gain 25 or more yards
10 to 0	Ratio of Total >=10 to Total <=0
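As a rough illustration of how these four columns relate, the sketch below computes them from a list of per-play yardage. The function name and the sample gains are hypothetical.

```python
def play_distribution(gains: list[int]) -> dict[str, float]:
    """Percent of plays in each bucket, plus the 10-to-0 ratio."""
    n = len(gains)
    no_gain = sum(1 for g in gains if g <= 0)
    ten_plus = sum(1 for g in gains if g >= 10)
    explosive = sum(1 for g in gains if g >= 25)
    return {
        "Total <=0": 100 * no_gain / n,
        "Total >=10": 100 * ten_plus / n,
        "Total >=25": 100 * explosive / n,
        # Undefined when there are no negative or no-gain plays.
        "10 to 0": ten_plus / no_gain if no_gain else float("inf"),
    }

# Ten made-up plays, yards gained on each.
gains = [-2, 0, 3, 4, 5, 7, 12, 15, 30, 1]

for name, value in play_distribution(gains).items():
    print(f"{name}: {value}")
```

Note that a 25-yard play counts in both the 10+ and 25+ buckets, so the percentages are not meant to sum to 100.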

3rdLComp%	Completion % on 3rd and long (7+ yards)
SitComp%	Standardized completion % for down and distance. Completion % by down and distance are weighted by the national average of pass plays by down and distance.
Pass <=0	Percent of pass plays that are negative or no gain
Pass >=10	Percent of pass plays that gain 10 or more yards
Pass >=25	Percent of pass plays that gain 25 or more yards
10 to 0	Ratio of Pass >=10 to Pass<=0
%Sacks	Ratio of sacks to pass plays
Bad INTs	Interceptions thrown on 1st or 2nd down before the last minute of the half

YPC1stD	Yards per carry on 1st down
CPCs	Conversions (1st down/TD) per carry in short yardage situations, where the team needs 3 or fewer yards for a 1st down or touchdown
%Team Run	Player's carries as a percent of team's carries
%Team RunS	Player's carries as a percent of team's carries in short yardage situations
Run <=0	Percent of running plays that are negative or no gain
Run >=10	Percent of running plays that gain 10 or more yards
Run >=25	Percent of running plays that gain 25 or more yards
10 to 0	Ratio of Run >=10 to Run <=0

Conv/T 3rd	Conversions per target on 3rd Downs
Conv/T PZ	Touchdowns per target inside the 10 yardline
%Team PZ	Percent of team's targets inside the 10 yardline
Rec <=0	Percent of targets that go for negative yards or no net gain
Rec >=10	Percent of targets that go for 10+ yards
Rec >=25	Percent of targets that go for 25+ yards
10 to 0	Ratio of Rec >=10 to Rec <=0

NEPA	"Net Expected Points Added": (expected points after play - expected points before play)-(opponent's expected points after play - opponent's expected points before play). Uses the expected points for the current possession and the opponent's next possession based on down, distance and spot
NEPA/PP	Average NEPA per play
Max/Min	Single game high and low
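The NEPA formula above translates directly into code. In this sketch the function names and the expected-points inputs are hypothetical; the arithmetic follows the definition as written.

```python
def nepa(ep_before: float, ep_after: float,
         opp_ep_before: float, opp_ep_after: float) -> float:
    """Net Expected Points Added for a single play."""
    own_change = ep_after - ep_before
    opponent_change = opp_ep_after - opp_ep_before
    return own_change - opponent_change

def nepa_per_play(plays: list[tuple[float, float, float, float]]) -> float:
    """Average NEPA across plays (the NEPA/PP column)."""
    return sum(nepa(*p) for p in plays) / len(plays)

# Two made-up plays: (ep_before, ep_after, opp_ep_before, opp_ep_after).
plays = [
    (1.0, 2.0, 0.5, 0.25),
    (2.0, 1.5, 0.25, 0.75),
]

print([nepa(*p) for p in plays])
print(nepa_per_play(plays))
```

A play helps twice when it raises the offense's expected points and lowers what the opponent can expect on its next possession, which is why the opponent's change is subtracted.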

Points/Poss	Offensive points per possession
EP3	Effective Points per Possession
EP3+	Effective Points per Possession impact
Plays/Poss	Plays per possession
Yards/Poss	Yards per possession
Start Spot	Average starting field position
Time of Poss	Average time of possession (in seconds)
TD/Poss	Touchdowns per possession
TO/Poss	Turnovers per possession
FGA/Poss	Attempted field goals per possession
%RZ	Red zone trips per possession
Points/RZ	Average points per red zone trip. Field Goals are included using expected points, not actual points.
TD/RZ	Touchdowns per red zone trip
FGA/RZ	Field goal attempts per red zone trip
Downs/RZ	Turnovers on downs per red zone trip

EPA/Pass	Expected Points Added per pass attempt
EPA/Rush	Expected Points Added per rush attempt
EPA/Pass+	Expected Points Added per pass attempt impact
EPA/Rush+	Expected Points Added per rush attempt impact
Yards/Pass	Yards per pass
Yards/Rush	Yards per rush
Yards/Pass+	Yards per pass impact
Yards/Rush+	Yards per rush impact
Exp/Pass	Explosive plays (25+ yards) per pass
Exp/Rush	Explosive plays (25+ yards) per rush
Exp/Pass+	Explosive plays (25+ yards) per pass impact
Exp/Rush+	Explosive plays (25+ yards) per rush impact
Comp%	Completion percentage
Comp%+	Completion percentage impact
Yards/Comp	Yards per completion
Sack/Pass	Sacks per pass
Sack/Pass+	Sacks per pass impact
Sack/Pass*	Sacks per pass on passing downs
INT/Pass	Interceptions per pass
Neg/Rush	Negative plays (<=0) per rush
Neg/Run+	Negative plays (<=0) per rush impact
Run Short	% Runs in short yardage situations
Convert%	3rd/4th down conversions
Conv%*	3rd/4th down conversions versus average by distance
Conv%+	3rd/4th down conversions versus average by distance impact

Plays	Number of offensive plays
%Pass	Percent pass plays
EPA/Pass	Expected Points Added per pass attempt
EPA/Rush	Expected Points Added per rush attempt
EPA/Pass+	Expected Points Added per pass attempt adjusted for competition
EPA/Rush+	Expected Points Added per rush attempt adjusted for competition
Yards/Pass	Yards per pass
Yards/Rush	Yards per rush
Yards/Pass+	Yards per pass adjusted for competition
Yards/Rush+	Yards per rush adjusted for competition
Exp Pass	Explosive plays (25+ yards) per pass
Exp Run	Explosive plays (25+ yards) per rush
Exp Pass+	Explosive plays (25+ yards) per pass adjusted for competition
Exp Run+	Explosive plays (25+ yards) per rush adjusted for competition
Comp%	Completion percentage
Comp%+	Completion percentage adjusted for competition
Sack/Pass	Sacks per pass
Sack/Pass+	Sacks per pass adjusted for competition
Sack/Pass*	Sacks per pass on passing downs
Int/Pass	Interceptions per pass
Neg/Run	Negative plays (<=0) per rush
Neg/Run+	Negative plays (<=0) per rush adjusted for competition
Run Short	% Runs in short yardage situations
Convert%	3rd/4th down conversions
Conv%*	3rd/4th down conversions versus average by distance
Conv%+	3rd/4th down conversions versus average by distance adjusted for competition

PPP	Points per Possession
aPPP	Points per Possession allowed
PPE	Points per Exchange (PPP-aPPP)
EP3+	Effective Points per Possession
aEP3+	Effective Points per Possession allowed
EP2E+	Expected Points per Exchange
EPA/Pass+	Expected Points Added per Pass
EPA/Rush+	Expected Points Added per Rush
aEPA/Pass+	Expected Points Allowed per Pass
aEPA/Rush+	Expected Points Allowed per Rush
Exp/Pass	Explosive Plays per Pass
Exp/Rush	Explosive Plays per Rush
aExp/Pass	Explosive Plays per Pass allowed
aExp/Rush	Explosive Plays per Rush allowed

BPR	A method for ranking conferences based only on their wins and losses and the strength of schedule. See BPR for an explanation.
Power	A composite measure that is the best predictor of future game outcomes, averaged across all teams in the conference
P-Top	The power ranking of the top teams in the conference
P-Mid	The power ranking of the middling teams in the conference
P-Bot	The power ranking of the worst teams in the conference
SOS-Und	Strength of Schedule - Undefeated. Focuses on the difficulty of going undefeated, averaged across teams in the conference
SOS-BE	Strength of Schedule - Bowl Eligible. Focuses on the difficulty of becoming bowl eligible, averaged across teams in the conference
Hybrid	A composite measure that quantifies human polls, applied to conferences

EPA	Expected points added (see glossary)
oEPA	Defense-independent performance

EP3	Effective points per possession (see glossary)
oEP3	Defense-independent offensive performance
dEP3	Offense-independent defensive performance
EPA	Expected points added (see glossary)
oEPA	Defense-independent offensive performance
dEPA	Offense-independent defensive performance
EPAp	Expected points added per play