Bill wrote a piece identifying 5 key factors for the outcome of football, and several of the commenters there had the same reaction I did - indeed, the reaction any social science type would have: How do these 5 fit together? Are they all just things that happen if you have better athletes / coaches than the other guy? If so then they aren't really five factors, they're just five ways you can recognize what good teams look like. Or are they separate things that any team can develop more or less of?
From a technical point of view, the question is what the correlation between the five factors is - do the same teams score high or low in all of them, or are there teams that are better than everyone else at some of them, and worse at others? I don't have the numbers, but I suspect the answer is that those correlations will mostly be moderately high (except for turnovers recovered, which is random), but not perfect. That's what happens most of the time for things like this.
Consider intelligence as an analogy. One argument says that intelligence is a unified thing: some people have more, others have less. Another view is that there are different kinds of intelligence (say, mathematical vs. verbal), and people can be high in one, but low in others. Empirically the truth seems somewhere in the middle - if you score high in one, then you TEND to score higher in the others, but won't necessarily. Being good at football is probably a bit like that.
Maximally different factors
An insight that may not be obvious to non-quantoid people is that we actually want to pick out factors that are as unrelated to each other as possible. After all, if they are too similar, then what you really have is just better or worse measures of "team goodness". In that case you don't need to bother measuring them all separately, you can just pick the best one and safely ignore the rest.
In fact, what you REALLY want are the smallest number of factors you can get, that each cover the most different things, because then if you put them all together you will have the most concise explanation possible for as much of football as you can.
But what factors?
Bill's list of five is this:
Explosiveness, efficiency, field position, finishing drives, and turnovers.
From a game-mechanics point of view you could see these as actually just three things:
Keep the ball (efficiency + turnovers), move the ball (field position + explosiveness), and get it into the end zone (finishing drives).
From a coaching point of view they are all pretty separate sets of mechanics, though, so it does make sense to consider them separately.
Preamble A - Need a good control variable
When we look at the correlation between these variables, the first thing you need to do is control for team talent level. Consider a league which only had in it: The New England Patriots, The Steelers, Directional State U, and Kansas. If you looked at data from this league, you'd find that all four of the non-random factors belonged together. The same teams who were higher on one would also be higher on all of them.
It's only when you are comparing teams that have roughly even amounts of athletic ability that you might start noticing relative differences in the particular skill sets. Given the very wide spread of talent in college football, really you want to compare teams with roughly similar levels of ability.
You could do that in a crude way by picking sub-groups of teams to analyse that were all at a roughly similar power level (according to some criteria), or you can do it the fancy way by picking a measure of talent (say, the star rankings of their recruits over the past few years), and including it as a "covariate" in your analyses. That will make sense to you if you speak stats, but if not then you'll just have to take my word for it :)
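For those who do speak stats, here is a rough sketch of what "controlling for talent" can look like in its simplest form: a first-order partial correlation between two factors with talent held constant. Everything in it is an assumption for illustration: the team numbers are invented (not real data), the function names are my own, and a real analysis would more likely use a regression with talent as the covariate.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def partial_corr(xs, ys, zs):
    """Correlation between xs and ys with zs held constant
    (the standard first-order partial correlation formula)."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Invented numbers for six hypothetical teams, built so that both factors
# rise with talent but move in opposite directions around that trend.
talent        = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
efficiency    = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]
explosiveness = [0.5, 2.5, 2.5, 4.5, 4.5, 6.5]

raw = pearson(efficiency, explosiveness)
controlled = partial_corr(efficiency, explosiveness, talent)
print(f"raw correlation:        {raw:.2f}")
print(f"controlling for talent: {controlled:.2f}")
```

With these made-up numbers the raw correlation comes out strongly positive, while the talent-adjusted one is negative, which is the whole point: the talent gap can hide the tradeoffs between skill sets.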
Preamble B - Interactions
The other thing a social scientist would automatically be doing once they had 4 sufficiently separate factors, would be looking at interactions between them. Is there some combination that is particularly deadly at producing wins? Maybe explosiveness plus red zone punch? But for now, that's getting ahead of ourselves.
How should we define and measure the factors?
Bill in his original post has a little side bar near the start which shows common ways. I have nothing to add to this on 3 of them, but I do have some thoughts on trying to make sure efficiency and explosiveness are measured in ways that might have less overlap (which, as we've discussed above, would be A Good Thing).
Note, the thinking behind these started in a comment I made on Bill's original, so if you already read that, some of this will be familiar.
- If efficiency is related to keeping the ball, then it should be about how consistent you are at making it to the first down marker, vs. having to punt or turn it over on downs.
- If explosiveness is about field position, then it should be measured by your ability to semi-regularly have plays that move you 10-40 yards past the first down marker (everything up to that marker is efficiency - we're trying not to double count here).
For those who care, here is the long version:
The way Bill measures efficiency now is with what he calls success rate. Basically a given play is defined as either successful or not-successful depending on whether it meets the following criteria:
50% of needed yards on first down, 70% of needed yards on second down, or 100% of needed yards on third or fourth down.
That's a perfectly sensible and productive way to go about doing things. But if the concept of efficiency is about not losing the ball (to punts or turnovers on downs), then that pushes you to think about it a little differently. In that case it makes sense to think about efficiency not as something that happens per snap, but rather per down-set (I'm struggling for the right word here. A "snap" is a single play, and a "possession" is the series of plays that ends with points or a turnover, but what is the term for the series of snaps that ends with a first down or a turnover? I legitimately don't know what to call it).
If you think about it like this, then for each game or season you could consider a ratio of the down-sets that end in a first down, vs. those that end in a punt or turnover on downs. Down-sets that end with an interception or fumble wouldn't count here, because they would go in the turnover column instead. Nor would down-sets that end in a score - they would either come from within a few feet of the end zone, and so fall into the "punch it in" category, or they would happen from further out, and would fall into the "explosiveness" bucket - although you could make a case for counting them as efficient movement anyway if they covered more than 10 yards. That is a legitimate debate.
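As a rough sketch of that bookkeeping, assuming each down-set has already been tagged with how it ended (the outcome labels and the function name are my own inventions, and I've written it as a share of counted down-sets rather than a strict ratio):

```python
from collections import Counter

# Hypothetical outcome labels. Only these three count towards efficiency;
# interceptions, fumbles and scores are left for the other factors.
COUNTED = {"first_down", "punt", "turnover_on_downs"}

def down_set_efficiency(outcomes):
    """Share of counted down-sets that ended in a first down.

    Returns None when no down-set qualifies.
    """
    counts = Counter(o for o in outcomes if o in COUNTED)
    total = sum(counts.values())
    if total == 0:
        return None
    return counts["first_down"] / total

# One made-up game's worth of down-sets.
game = ["first_down", "first_down", "punt", "interception",
        "first_down", "touchdown", "turnover_on_downs", "first_down"]
print(down_set_efficiency(game))  # 4 first downs out of 6 counted down-sets
```

If you decide that long scoring plays should count as efficient movement after all, that is a one-line change to the set of counted outcomes.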
Defining efficiency this way does not relegate success rate to the scrap heap though! Instead it becomes one of the ingredients to efficiency. Success rate is all about "staying on schedule" - making steady incremental advances on each down. Teams who have a good success rate are going to tend to be efficient because they make steady progress on most downs, and don't get stuck in many situations where they have a lot of yards to go, and not many downs to do it in - these are "passing situations", and they tend to fail at a higher rate, because the other team knows what you have to do.
But that's not the only way to be efficient. If your team operates by making a lot of medium length throws, then you might have a lot of "unsuccessful" plays in which you have incompletions that net zero yards. Yet you still might keep moving the chains just as regularly, as it only takes one 8-10 yard completion to get you a new set of downs. Success rate rewards steady incremental movement, but that doesn't fit if even your small-yardage plays are more all-or-nothing.
To illustrate, take two teams; let’s call one Flabama, and the other Leach Tech. Each team picks up two first downs like this:
- Flabama runs for 5 yards, then 4 yards, then 1 yard (first down).
- Then, they do the same again: 5 yards, 4 yards, 1 yard (first down)
- Leach Tech throws an incompletion, then an incompletion, then a grab for 10 yards (first down).
- Next play they throw it once for 10 yards (first down).
The current success rate system says that Flabama was 100% efficient (6 plays, all successful), whereas Leach Tech was only 50% efficient (2 plays successful, 2 plays unsuccessful). But really they were both equally effective at not turning the ball over on downs. Flabama just did it with steady downhill running, and Leach Tech did it by airing the ball out, with a more stochastic result that averages out to the same rate of moving the chains.
Keep in mind that there are tradeoffs with either system. With the Leach Tech system you probably face more third and longs, which are harder to make. But you might compensate by gaining your first downs in fewer snaps, on average. Every extra snap that is required to move the chains is another chance to be inefficient by fumbling, getting picked, dropping it, getting flagged, blowing a coverage and getting stuffed, etc.
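Here is the Flabama / Leach Tech comparison as a small sketch. The thresholds are the success rate criteria quoted earlier; the function names and the (down, yards to go, yards gained) tuple layout are just my own choices for illustration.

```python
# Share of the needed yards a play must gain to count as successful, by down.
THRESHOLDS = {1: 0.5, 2: 0.7, 3: 1.0, 4: 1.0}

def is_success(down, to_go, gained):
    """True if the play meets the success rate criterion for its down."""
    return gained >= THRESHOLDS[down] * to_go

def success_rate(plays):
    """Fraction of plays that count as successful."""
    return sum(is_success(*play) for play in plays) / len(plays)

def first_downs(plays):
    """Number of plays that reached the first-down marker."""
    return sum(gained >= to_go for _, to_go, gained in plays)

# Each play is (down, yards to go, yards gained).
flabama = [(1, 10, 5), (2, 5, 4), (3, 1, 1)] * 2
leach_tech = [(1, 10, 0), (2, 10, 0), (3, 10, 10), (1, 10, 10)]

for name, plays in [("Flabama", flabama), ("Leach Tech", leach_tech)]:
    print(name, f"success rate {success_rate(plays):.0%},",
          f"first downs {first_downs(plays)}")
```

Both teams come out with two first downs and no failed down-sets, even though the per-snap success rates are 100% and 50%.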
So high success rates and the ability to move the ball in 8-10 yard chunks are both precursors that enable you to be efficient - i.e., to rarely ever have to turn it over with a punt.
Parenthetical note on improving the measure
It may be possible to slightly improve the current success rate measure, because of the way it needlessly splits results from first and second downs into binary outcomes (i.e., success/failure), when they really come in many degrees of better or worse. It's only your third and fourth downs that definitively succeed or fail.
So what if we treat outcomes from the first two downs as continuous numbers? What if we subtly recast success rate as the degree to which a snap helps you towards moving the chains? For the first two downs this could be a percentage of the required distance that you covered - if you don't move it at all it would be 0% success, if you get half way it would be 50%, and if you get all the way it would be 100%. I don't know if you would allow negative numbers for losing ground, you’d have to look at game data to see if that works out. For downs 3 and 4 it would always be a binary outcome though: You either made it (100%) or you didn't (0%). These are the downs where close does not count.
So in the series above, Flabama ran for 5 yards (50% of the distance), then 4 yards (80% of the remaining distance), then 1 yard (100% of the remaining distance), then did the same thing again. That would give them an efficiency of (50 + 80 + 100 + 50 + 80 + 100) / 6 = roughly 77%.
Leach Tech, in contrast, would have an efficiency of (0 + 0 + 100 + 100) /4 = 50%
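The same arithmetic as a short sketch, in case anyone with real play-by-play data wants to try it. The function names are mine, and I have assumed that lost yardage scores 0% and that gains past the marker cap at 100%; as noted above, whether negatives should be allowed is an open question.

```python
def continuous_success(down, to_go, gained):
    """Percent of the needed distance covered on this snap.

    Downs 1-2: continuous, clamped to the 0-100 range (an assumption).
    Downs 3-4: all or nothing.
    """
    if down >= 3:
        return 100.0 if gained >= to_go else 0.0
    share = gained / to_go
    return 100.0 * min(max(share, 0.0), 1.0)

def continuous_success_rate(plays):
    """Average continuous success across all snaps."""
    return sum(continuous_success(*play) for play in plays) / len(plays)

# Each play is (down, yards to go, yards gained).
flabama = [(1, 10, 5), (2, 5, 4), (3, 1, 1)] * 2
leach_tech = [(1, 10, 0), (2, 10, 0), (3, 10, 10), (1, 10, 10)]

print(round(continuous_success_rate(flabama)))     # about 77
print(round(continuous_success_rate(leach_tech)))  # 50
```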
This continuous version should, in theory, do a slightly better job at capturing the data because it uses more of it, and throws less of it out. Whether it makes any difference in practice I don't know - quite possibly not. Someone who has the numbers set up and in place is going to have to check that out, and given my current work commitments plus a standing start, that's sadly not going to be me. Call me lazy, but it's the truth.
Right now explosiveness is measured with yards per play, or a variant on that called PPP. I'm going to stick to discussing yards per play here because it's more intuitively accessible (plus I understand it a lot better), but PPP works in mostly the same way (I think!).
Yards per play works because in most football games you are going to have lots of small and medium gains of 2 or 4 or 6 yards, and then a handful of big ones of 15 or 20 yards, and the odd monster 50 yarder. When you take the average of a highly skewed data set like that, it will be a function of whether those many small gains are more like 6 yards than 4 yards, and whether you have a bunch of big huge outliers that spike the average high. That latter part is a feature here, not a bug - it's those long plays spiking it up that we are INTERESTED in. In fact, the part where most plays are getting 6 yards rather than 4 is irrelevant noise to the measure of explosiveness - they get in the way by adding signal that we don't really care about. Essentially it's contaminated by the ability to efficiently gain small chunks of yards. Probably not a lot, but a little bit.
An incremental improvement to this could be to use a measure of statistical skewness rather than the mean, although that would suffer from a slight problem. Because it is judging the shape of your distribution of gains, rather than the amount of those gains, you could get a pretty high explosiveness score simply by having exceptionally poor efficiency at small gains - if all of your snaps went for 2 yards, and then one went for 15, that would LOOK explosive, but only because the 15 yarder stuck out against the background of total futility.
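To make the skewness problem concrete, here is a small sketch using the ordinary moment-based skewness formula (third central moment divided by the variance to the power 1.5). Both lists of gains are invented, and the "better" offense is just my guess at what a healthier distribution might look like.

```python
from statistics import mean

def skewness(values):
    """Moment-based skewness: third central moment / variance ** 1.5.

    Not defined when every value is identical (zero variance).
    """
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Invented gains per snap, in yards.
futile = [2, 2, 2, 2, 2, 2, 2, 2, 2, 15]   # nothing but 2s and one 15
better = [4, 6, 5, 3, 7, 15, 4, 6, 5, 25]  # solid gains plus two big plays

for name, gains in [("futile", futile), ("better", better)]:
    print(name, f"yards/play {mean(gains):.1f},",
          f"skewness {skewness(gains):.2f}")
```

The futile offense averages far fewer yards per play but comes out with the higher skewness, which is exactly the wrong way round for a measure of explosiveness.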
Yet another problem with it is identified by Bill himself in the comments of the original post:
I do think there needs to be an efficiency aspect to go with yards/play, though. One big play can skew the number significantly, but if you gained most of your yards in one big play, you probably didn’t win
This makes sense if you stop to think it through: Consider that you can be scored as making 25 yards per play out of having one gain of 97 yards, followed by 3 plays of 1 yard each, but also by having 4 individual plays that each net 25 yards. In reality the second version probably means you were performing better as a team. The first one represents one really good play that was "lucky" to start near its own end zone, and then a bunch of worthless plays, whereas the latter represents the ability to consistently gain big chunks in moving the ball.
This last idea - the ability to consistently gain big chunks of yards, relates much better to the concept of explosiveness I laid out at the beginning of this piece (which hopefully you still remember despite being the approximate word count of War and Peace ago). There, I conceptualized explosiveness, along with field position, as being about moving the ball (vs. keeping it (efficiency + turnovers), or punching it into the end zone). Conceptually, grouping explosiveness in with punting makes sense - they are both ways of gaining huge chunks of field position, the difference being that you get to keep the ball at the end of an explosive play.
So, to keep explosiveness as separate from efficiency, we need a measure that:
- Doesn't count the yards that just get you to the first-down marker (that part is just being efficient)
- Rewards you for going longer distances...
- ...Up until a certain point.
To explain that last bullet, if you break a gain for 60 yards, that tells you 3 things. 1) You did something good, 2) You can't have started on the opponent's side of the field, because otherwise you would have run out of grass long before getting 60 yards, and 3) If you had even more grass to run into you would probably have gone even further, because you don't get 60 yards without getting past pretty much everybody, so their only hope is to catch you from behind.
In a nutshell, what we are talking about is something that rewards you for regularly getting into, and past, the other team's secondary.
The sophisticated way to model this would be to create a mathematical function that starts counting yards once the ball gets to the first-down marker, and that increases the weight it assigns to each subsequent yard as you go further and further, until a certain point (say, 30-40 yards, once you are past the secondary), and that then decreases the weight assigned to each additional yard. That would be a smooth curve with two inflection points, but I’m not personally nearly good enough at math to pull that off (though maybe someone else can). You can get most of the same result, though, with a simpler rule like this one:
- You count the first 10 yards gained past the first down marker as 1 yard per actual yard gained
- You count the next 30 yards gained as 2 yards each
- Any yards after that are counted at 0.2 each – good, but they mostly just show you got past the secondary, and started from a bad field position.
Obviously you could improve those weights and the inflection points – I pulled these ones out of my posterior. But something like this would presumably provide a clear measure of how often you were moving the chains in big huge chunks at a time.
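The simple rule above can be sketched like this. The weights and cut-offs are the placeholder ones from the list (so treat them as guesses), the function name is my own, and the two example series are hypothetical versions of the "one 97-yarder vs. four 25-yarders" comparison with made-up down-and-distance.

```python
def explosive_value(gained, to_go):
    """Weighted yards gained beyond the first-down marker.

    First 10 yards past the marker count 1.0 each, the next 30 count 2.0
    each, and anything beyond that counts 0.2 each.
    """
    past = max(gained - to_go, 0)
    first = min(past, 10)
    middle = min(max(past - 10, 0), 30)
    rest = max(past - 40, 0)
    return first * 1.0 + middle * 2.0 + rest * 0.2

# Each play is (yards gained, yards to go). Both series average 25 yards/play.
one_big_play = [(97, 10), (1, 2), (1, 1), (1, 1)]
steady_chunks = [(25, 10)] * 4

for name, plays in [("one big play", one_big_play),
                    ("steady chunks", steady_chunks)]:
    total = sum(explosive_value(g, t) for g, t in plays)
    print(name, f"total {total:.1f}, per play {total / len(plays):.2f}")
```

Under these weights the 97-yarder is worth about 79 weighted yards and each 25-yarder is worth 20, so the two series end up nearly level instead of the long run dominating. Whether that is the right balance is exactly what better-chosen weights would have to settle.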
This notion of keep the ball / move the ball / punch it in, fits pretty well with a lot of ways that we naturally talk about games. As Bill notes in his main post, the bend-don't-break defensive philosophy is about trading off allowing the other team to keep the ball (i.e., be efficient about getting first downs) in return for preventing them from moving it (i.e., limiting big plays). The extreme version of this is the prevent defense, where you sell out on preventing explosiveness, while letting them march efficiently down the field pretty much unopposed (whether or not this is wise being a whole other debate). It's kind of fun to think about what other tradeoffs get made too. Hail Mary passes trade off a higher likelihood of a turnover for the chance of punching it into the end zone. There's probably a lot more of these out there.
Thank you for reading this novella. If any of you are worried about your personal security, you might want to print this off, and then carry it around with you as a blunt instrument for self defence purposes.