
Comparing Marvin McNutt and Michael Floyd


Mel Kiper currently has Michael Floyd ranked #18 on his Big Board, while he has mentioned Marvin McNutt as a potential 2nd or 3rd round player.  I thought it would be interesting to compare McNutt and Floyd against common opponents.  McNutt’s overall season was easily better than Floyd’s, but Iowa did play a slightly easier schedule than Notre Dame.  Here are the stat lines from the common opponents they faced:

McNutt had the better game against three of the four opponents, averaged more touchdowns per game than Floyd, and averaged a little better than 20 yards per game.  The interesting thing to note is that he did all of that on fewer catches, so this is probably not a difference in utilization rates.

I’ve also listed each player’s Market Share in terms of college team production.  Again, McNutt was responsible for a larger share of the Iowa passing offense than Floyd was for the Notre Dame passing offense.
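
As a rough sketch of the Market Share idea, the calculation is just a receiver's production divided by his team's total passing production. The function name and the season totals below are made up for illustration; they are not McNutt's or Floyd's actual numbers.

```python
def market_share(player_total, team_total):
    """Fraction of the team's passing production credited to one receiver."""
    if team_total <= 0:
        raise ValueError("team_total must be positive")
    return player_total / team_total


# Hypothetical season totals, for illustration only.
yards_share = market_share(player_total=1200, team_total=3000)
td_share = market_share(player_total=12, team_total=24)

print(f"Share of team receiving yards: {yards_share:.1%}")
print(f"Share of team receiving touchdowns: {td_share:.1%}")
```

The same function works for yards or touchdowns, which is why both shares can sit side by side in one comparison.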

It is worth noting that Floyd is maybe 10 pounds heavier than McNutt, so it will be interesting to see how each of them runs before the draft.  I am pretty confident that I can show that a weight/speed combination correlates with success for receivers, in a way that would be similar to, although slightly different from, the Speed Score that people are now familiar with for running backs.
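
For reference, the running back Speed Score as it is commonly published is weight times 200, divided by the 40 time raised to the fourth power. The sketch below shows only that running back version; any receiver variant would need its own testing, and the weights and times here are hypothetical.

```python
def speed_score(weight_lbs, forty_time_sec):
    """Commonly cited running back Speed Score: (weight * 200) / forty^4."""
    if forty_time_sec <= 0:
        raise ValueError("forty_time_sec must be positive")
    return (weight_lbs * 200) / (forty_time_sec ** 4)


# Hypothetical prospects: same 40 time, ten pounds apart.
heavier = speed_score(weight_lbs=220, forty_time_sec=4.50)
lighter = speed_score(weight_lbs=210, forty_time_sec=4.50)

print(f"220 lbs at 4.50: {heavier:.1f}")
print(f"210 lbs at 4.50: {lighter:.1f}")
```

At the same 40 time the heavier player grades out higher, which is the whole point of adjusting speed for weight.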

In any case, using my “If he can do it, he can do it” rule for college wide receivers, McNutt does seem like a relative bargain compared to Floyd.


It’s All Ball Bearings Nowadays, or, It’s All Sample Sizes Nowadays

 

I’m starting to feel about sample sizes the way that Fletch felt about ball bearings.  If you haven’t seen the movie I don’t think I’m spoiling the plot at all by saying that he thought they were pretty important.

The cacophony (yeah, that just happened) of opinions surrounding the NFL’s spring draft is a target-rich environment for the application of small sample sizes.  Let’s think about all of the ways small samples end up with outsized importance assigned to them.

The Combine.

The combine is a few days of drills that aren’t even football related except that they also involve speed and strength.  But the draftniks assign all manner of importance to the combine.

This guy needs to have a good combine, they might say.  This guy needs to run at least a X.XX.

But the combine is a small sample.  It’s a player’s performance on a single day.  Yet outsized importance is attached to that day and it can counteract a collegiate record that might be four years long.  Isn’t that screwed up?  Every Saturday in the fall the player goes out after a week of coaching and plays in a football game, and after 4 years you might have 50 or so games as a record, and that record can be affected by a single day of workouts?

Speed is pretty important to football, but the best wide receivers are rarely the fastest, and as far as running backs go, if you show me a running back who averaged over six yards per carry for his college career, I’ll show you a running back who will have a good Speed Score (and I’ll even bet you can pick one up in the fourth round).

Moving on.

Player comparisons.

Player X is like Larry Fitzgerald.  Wait, no he’s not.  Maybe he’s like Anquan Boldin.  Wait, no, he’s really like a bigger Santonio Holmes.

Saying someone could be like Larry Fitzgerald is applying a comparison that is supposed to have meaning, but is essentially just a small sample size.  Larry Fitzgerald is just one guy with a particular skillset who happened to be a good pro.  There’s no guarantee that another guy who came along with a similar skillset would have Larry Fitzgerald’s success.

But the other problem with player comparisons is that when you try to shoehorn them into a type, you ignore the possibility that they might not fit any type, yet might be good in spite of that.  They might be a new kind of good.  To ignore this possibility is to say basically that only a few types of successful player exist and every player coming out of college needs to fit into one of the pre-made molds.  This is preposterous.

Here’s another problem with player comparisons.  Even the EXACT SAME PLAYER can have different results.  Forget about trying to predict the future by coming up with a reasonable approximation (Player A will be successful because he has the same skillset as Larry Fitzgerald).  It isn’t just similar players who end up with different results; the very same player does too.  Randy Moss had about as bad a season as you can have in Oakland in 2006, then he had a record setter the next year.  If the exact same player can have a range of results based on situation, single player comparisons between pros and college players become ridiculous.  The college player is almost assured of walking into a different situation than the one that allowed the pro to acquire whatever reputation we assign to him.  Similarity is not destiny.

(Side note: This might seem like I’m making an argument that would counter my use of similarity scores.  However, my similarity scores compare 20 player seasons typically for the very reason that similarity does not equal destiny and because single player comparisons create a ridiculously small sample.)
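
One way to see why 20 comparable seasons beat a single comp: if each comp's outcome is treated as an independent draw with the same spread, the uncertainty around the average shrinks with the square root of the number of comps. The independence assumption and the 300-yard spread below are illustrative only, not figures from my similarity scores.

```python
import math


def standard_error_of_mean(outcome_sd, n_comps):
    """Standard error of an average built from n independent comps."""
    if n_comps < 1:
        raise ValueError("n_comps must be at least 1")
    return outcome_sd / math.sqrt(n_comps)


# Assumed spread in receiving yards among comparable seasons (hypothetical).
ASSUMED_SD_YARDS = 300.0

single_comp = standard_error_of_mean(ASSUMED_SD_YARDS, 1)
twenty_comps = standard_error_of_mean(ASSUMED_SD_YARDS, 20)

print(f"One comp:  +/- {single_comp:.0f} yards")
print(f"20 comps:  +/- {twenty_comps:.0f} yards")
```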

If he can do it, he can do it.

This is a phrase I’ve been using more lately.  It basically means that a player’s accomplishments on the field can generally speak for themselves.  If a guy racks up 1,500 receiving yards in a season, he figured out a way to get open.  Don’t rob him of his skills because he doesn’t seem good in a way we’ve ever seen before.  He figured out a way to do it.  He went out every Saturday and figured out a way to beat double teams, but we’re going to downgrade him because he didn’t run a sub 4.4 forty and we don’t understand how he did it?


Competence in Meetings and an Explanation for the Loud Mouth Ryan Brothers


This is from an article that appeared on the Freakonomics blog yesterday.  It’s a guest post by basketball economist David Berri and he discusses an academic study that basically says that our perception of competence is pretty much related to how much people talk.  From the post:

A couple of years ago Cameron Anderson and Gavin J. Kilduff published a study examining how people in meetings evaluate each other.  Obviously we would like people in meetings to think we are competent.  And one might think, the best way to get people to think you are competent is to just be competent.  But that is not what Anderson and Kilduff found.  In a study of how people in a meeting – a meeting designed to answer math questions — were evaluated by their peers, these authors found (as Time reported) that actual competence wasn’t driving evaluations:

Repeatedly, the ones who emerged as leaders and were rated the highest in competence were not the ones who offered the greatest number of correct answers. Nor were they the ones whose SAT scores suggested they’d even be able to. What they did do was offer the most answers — period. 

“Dominant individuals behaved in ways that made them appear competent,” the researchers write, “above and beyond their actual competence.” Troublingly, group members seemed only too willing to follow these underqualified bosses. An overwhelming 94% of the time, the teams used the first answer anyone shouted out — often giving only perfunctory consideration to others that were offered.

Think about what this study says about meetings. If I want you to think I am competent, I need to talk.  But if all of us have this same incentive… well, maybe we better be standing.  A sit-down meeting can be endless (or at least seem that way).

I think we finally have an explanation for the Ryan Brothers.  They have incentives to be loud mouths!  They’ve probably been rewarded for it their entire careers.

Jeremy Lin and the Limits of Scouting


The first thing we can probably get out of the way in this post is to say that we still have a relatively small sample size on Jeremy Lin’s pro career.  Maybe his first few games in the league have been indicative of his talent, or maybe when we have more observations, those games will look more like outliers. 

But what we do know is that Jeremy Lin is probably better than you would expect based on the fact that he has been previously waived by not one – but two – NBA teams.  This is something of a black mark on the reputation of scouting.  After a four year college career, time spent on two NBA rosters, and a number of games in the D-League, Lin’s breakout was a shock to all involved. 

Lin played in the Ivy League, which has weaker competition no doubt, but looking at a player, and the way he plays, is supposed to be the domain of scouting.  Scouting is about eyeballs, and training those eyeballs on players to assess their skills.  It’s about measuring players against everything that the scout has seen to date, and then making an assessment.

In football there are a number of examples of similar things happening.  Tom Brady, Arian Foster and Victor Cruz are just a few names that come to mind.  They aren’t just good.  They’re basically at, or close to, the top of their position.  They all went undrafted (or 6th round in Brady’s case which is pretty much the same), which is to say that scouts didn’t consider them to be in the top 250 of available players in the year they came out.  Brady, Foster and Cruz are also black marks on the scouting profession.

It’s something of a cheap shot for me to pick three names that are examples of the failures of scouting, even though my intent here is not to take cheap shots, but rather to illustrate why scouting has limits.

Scouting has limits because it is broken and it can’t be fixed on its own terms.

Scouting reports are a series of anecdotes, strung together.  Player X shows ability to read the defense and move to his second and third receiver.  Player Y shows good burst in the open field.  Player Z has the vision to find cut back lanes.  Player A shows good top end speed.  Player K does not have a plus arm.  Player S finishes runs, while Player B does not show the ability to win collisions.  If Player C can shake off a reputation of being lazy, his upside might be in the range of Justin Tuck.

The problem with a list of anecdotes and comparisons strung together is this: there is no way to measure the effectiveness of the evaluation.  Measuring players through scouting is one thing.  Measuring the measuring is another.  The reason that scouting can’t be improved on its own terms is because it has no way to measure the measuring.

How do you know how good a quarterback who “shows ability to move through his progressions” will be?  How do you know how good a running back who “shows good burst for his size” will be?  But before you get there, how do you even know what any of that means?  How much better is great burst than good burst?

Scouts will often hang their hat on a player that everybody else passed on, but that the scout in question graded accurately.  But that’s easy.  If you grade enough players, and as long as you sometimes disagree with the consensus, you’ll get some players right that others got wrong.  That’s no more of an accomplishment than my cherry picking of three names (Brady, Foster, Cruz) that scouting got wrong.  And not just one scout got those guys wrong.  Every scout got them wrong.  Before you rush to pat the Patriots on the back for actually picking Brady, just keep in mind that they five times passed on probably the greatest quarterback ever, unwilling to use a 1st-5th round pick to insure themselves against missing out on him.

So how can scouts measure whether their measuring is actually working?  That’s where we get into the domain of stats.  The logical progression of any evaluation system ends up in the domain of stats.  Why?  Because statistics offers ways to measure the measuring.  Statistics will tell you whether a player’s burst, or probably more accurately, their measured speed, correlates at all with success in the NFL.  Statistics will tell you if a college quarterback’s completion percentage has any bearing on their pro success.  More importantly, statistics will tell you how much you don’t know.

This might be a good place to relate a failure of mine that still demonstrates why statistics have a natural advantage over scouting methods as we know them today.  Before the fantasy football season started last year, I did a lot of work on college receivers.  One of the models that I came up with while doing this work stressed the importance of wide receivers catching a disproportionate amount of their college team’s touchdowns and yards.  I was basically looking at each receiver’s market share of their college team’s passing game.  The model that I was using graded Leonard Hankerson and AJ Green higher than Julio Jones and I wrote an article saying to stay away from Jones.  It wasn’t just that Jones was relatively light on touchdowns at Alabama, it’s that the Tide threw a lot of touchdowns that Jones wasn’t involved in.  For a supposedly elite talent, that is odd. 

However, I am now expecting to be wrong on this prediction.  I expect that Jones will be a very good pro and saying to stay away from him was premature.  But here’s the thing.  My error was within what you could expect from the model.  The model only explains about 20% of the variance in a receiver’s pro production (if you don’t think that’s a lot, go ahead and test draft position as a variable and see how much of a receiver’s success it explains).  That leaves a lot of production to be explained by other factors like the system that a receiver ends up in, usage rates, health, and randomness.  And here’s another thing.  I can work to improve my model with the ultimate goal that it explains more of the variance in production.  I already realize that my model ignored expected usage which can be inferred from draft position.  I can add that variable to the model and retest it to see if it explains more than my first version.  I can keep doing this until the model can’t be improved any more.
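
The "add a variable and retest" loop can be sketched as fitting ordinary least squares twice and comparing R-squared. Everything below is hypothetical: the eight receivers, their market shares, draft picks and pro yards are invented, and the helper names are mine. It shows the mechanics only, not my actual model or its results.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations touch both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def r_squared(predictor_columns, y):
    """Fit least squares with an intercept; return share of variance explained."""
    rows = [[1.0] + list(values) for values in zip(*predictor_columns)]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data for eight receivers.
market_share = [0.18, 0.22, 0.25, 0.30, 0.33, 0.36, 0.40, 0.45]
draft_pick = [120, 35, 200, 60, 15, 150, 90, 10]
pro_yards = [350, 700, 300, 650, 1000, 500, 800, 1100]

r2_share = r_squared([market_share], pro_yards)
r2_both = r_squared([market_share, draft_pick], pro_yards)

print(f"Market share only:        R^2 = {r2_share:.2f}")
print(f"Market share + draft pick: R^2 = {r2_both:.2f}")
```

Adding a variable can never lower in-sample R-squared, so the honest version of this test is checking whether the improvement holds up on players the model has not seen.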

Statistics account for what is unknown, and also contemplate improvement by adjusting the model until it either offers a full explanation, or just gets close, in which case the likely variance can be accounted for.

Scouting methods have no way of accounting for the unknown and also have no room for improvement on their own terms.  They start with an expert opinion based on years of experience, and have no room really to go upward – and any room to improve likely ends up moving more into the domain of statistics by quantifying rather than broadly describing.  In fact that is already happening.  A scouting report might contain a number of descriptions of the player’s play, followed by “I currently have him graded as a 1st rounder”.  Some reports might broadly describe a number of “intangibles” and then give the player a number based rating (presumably also rating the intangibles).  The scouts can keep moving slowly in this direction, but if they actually want to really improve their evaluations they’ll eventually have to engage in testing, or measuring the measuring.

If it sounds like I’m just taking cheap shots, I’ll offer this in defense of what I am saying.  The NFL draft is the culmination of the scouting profession’s year.  Each team presumably employs the best football scouts available to them, along with team management who are also generally ex-scouts.  If what I am saying about the limitations of scouting were bullshit, then the NFL draft would be an efficient market.  The first round picks would outperform the second round picks, who would outperform the third round picks, and so on.  If current scouting methods were working, it wouldn’t be possible to come up with a statistical model that could better explain the success of wide receivers.  Yet my model for college wide receivers does explain more of their production than does simply using draft spot.  And it doesn’t mildly improve a model based only on draft spot.  Including the variables that I use like Market Share of Team Yards and Market Share of Team Touchdowns basically doubles the explanatory value of a model that includes only draft spot.  That’s basically saying that while teams could easily review college wide receiver statistics, they opt for the much more difficult and costly task of using scouting methods.

Think maybe what I’m saying only applies to wide receivers?  Well, it probably applies to running backs as well.  A model that tries to explain running back rushing efficiency (yards per carry) using draft position doesn’t explain any of the variance.  We know that the scouts don’t know the limits of what they’re doing because despite the NFL’s move to passing offenses, and despite the fact that a running back’s draft spot doesn’t mean that they’ll be any good, scouts continue to give 1st round grades to running backs each year.

I say that scouting has limits not because it can’t be improved.  It’s just that it can’t be improved on its own terms.  Any improvement is likely to be a move in the direction of quantification, which will require testing of the quantification, and then guess what?  You’re in the domain of the spreadsheet jocks.

Stats Are for Losers, the Weird Anti-Science NFL, and Football as Poker


NFL people are fond of saying “stats are for losers,” (or SAFL because I’m too lazy to type it out 50 times) a phrase that they claim has to do with their observation that at the end of the game, the side you’re most likely to hear quoting statistics will be the losers.  The problem I have with the phrase SAFL is that it almost makes the people who say it the football equivalent of evolution deniers.  SAFL is a phrase that is the creation of people who don’t understand the use of statistics, making an ad hominem attack against anybody who might try (if I had cool footnotes like Grantland, I would explain here that the ad hominem attack that evolution deniers make is that they will say something like “Evolution scientists want us to believe that we all descended from monkeys… that idea is so preposterous, whoever came up with it probably is descended from monkeys!!”)

It’s very difficult to believe that a league fond of saying SAFL in public then goes behind the scenes and begins the serious work of crunching numbers to see where weaknesses might lie, or areas for improvement might exist.

Raheem Morris used the phrase SAFL after his team started the 2010 season 4-2, but ranked near the bottom of the league in yardage categories.  He said “Stats are for losers.  You keep looking at stats, we’ll keep looking at wins.”  The problem was that over Morris’ next 26 games, he only had about 10 wins to look at.  Bill Belichick once used the phrase SAFL when asked about Randy Moss’ performance against the Carolina Panthers’ secondary.  Only in hindsight is it possible to see that while Belichick was gloating over his team’s win, the warning signs were there for Moss – Belichick would ditch Moss the next season.  Jerry Jones did an interview with 60 Minutes following the 2010 season in which he refused to get into a discussion of stats because he said “Stats are for losers.  They relish in them.  The stat is the score.”  Jones was dismissing stats, insulting losers (who he was claiming moral superiority over) and also ignoring the fact that his team had been a loser that year!!

In some cases, coaches who say SAFL are just referring to the idea that yards gained does not equal wins.  That’s fine and everything, but it throws the baby out with the bathwater.  Yards actually do matter.  About 50% of the variance in scoring margin can be predicted by yardage differential.  If you run a regression on full-game numbers you’ll find that’s slightly off: end-of-game yardage differentials only explain about 40% of margin of victory.  But first half yardage differentials do explain about 50% of the variance in scoring margins.  The first half is the thing to look at because it takes away end of game strategy moves like running out the clock (which is going to produce less yardage) and defensive strategies like going into the prevent defense that will trade short completions for reduced chance of quick scoring plays.

The other problem with the NFL’s love of SAFL as a mantra is that it puts the NFL in a weird anti-science place.  I really have no doubt that NFL people see their game as being outside the realm of science.  They bristle at any attempt to quantify what they do, implying instead that their sport is beyond quantification.

Lions coach Jim Schwartz is a chess player, but once said that football can’t be reduced to a chess match because in chess there’s no chance that your rook might fall down while making its move.  This is a common refrain among the SAFL crowd.  In their mind their game is apart from comprehension or understanding by those who might seek to quantify it.

Schwartz is right that football isn’t a chess match.  But what Schwartz is really saying is that because there is always some chance that your moves don’t work out, football is a game of variance.  So he’s right that football can’t be reduced to a chess match; he just doesn’t take the thought to its logical conclusion.  There is another game of moves from set positions that is in some ways like chess, but is like football in the sense that it is made up of probabilities, not certainties.  That game is poker.

In football there is always about a 35% chance that a pass isn’t completed.  In poker if you have a flush draw on the flop, you have about a 35% chance of making that flush by the river.  In football you have to plan for the possibility that you might get blitzed on 3rd down and you have to select your play call accordingly.  When you raise a pot in poker, you have to plan for the possibility that one of your opponents might raise “all in” after your raise, and you have to know what you would do ahead of time – before you make your first raise.
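
The flush draw figure can be checked by counting outs. With four cards to a flush after the flop, 9 of the 47 unseen cards complete it, and there are two cards still to come. The function name is mine; the arithmetic is standard.

```python
from fractions import Fraction


def hit_by_river(outs, unseen_after_flop=47):
    """Chance of hitting at least one out on the turn or the river."""
    miss_turn = Fraction(unseen_after_flop - outs, unseen_after_flop)
    miss_river = Fraction(unseen_after_flop - 1 - outs, unseen_after_flop - 1)
    return 1 - miss_turn * miss_river


# 13 cards in the suit minus the 4 already showing leaves 9 outs.
flush_by_river = hit_by_river(outs=9)

print(f"Flush draw completes by the river: {float(flush_by_river):.1%}")
```

That works out to roughly 35%, about the same as the chance that a given NFL pass falls incomplete.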

Poker players, unlike the NFL, are not anti-science.  They embrace quantification.  Unlike the NFL, which attempts to merge decisions with outcomes, poker players work extremely hard to understand outcomes separate from decisions.

I’ve actually been thinking a lot about this stuff lately and I’m about halfway through writing a book about it.  It turns out that if we look at football in terms of its similarities to poker, we can learn a lot about football as a decision making game.  If you read this site regularly, then you might have seen my post on coaches’ ages and their performance.  That’s part of the work I’ve been doing as well.  I think there’s a pretty compelling case to be made that NFL teams do not optimize their selections of decision makers like coordinators, head coaches, and General Managers.

Because of Scarcity, Efficiency Matters in the NFL (Cough, Thomas Jones)

Yesterday I posted a graph that showed the effect that getting older has on running back yards per carry averages.  The graph, which I’ve pasted below again for your ease of reference, is going to seem odd to some people because it shows running backs being affected by age much earlier than we typically like to think they do.

The reason it seems odd is that NFL teams employ running backs much later into their careers, and in some cases even give them significant workloads.  But broadly speaking, they are doing this in the face of decreasing efficiency.  That’s not always the case: Fred Jackson had a very efficient season this year in terms of yards per carry and he was over 30.  But in general running backs tend to be more efficient around 24 or 25 years old.  Even in the case of FJax, we might be able to award a significant amount of the credit to the Buffalo scheme because CJ Spiller came in and basically reproduced a similar yards/carry average.

I would offer that I think NFL teams should pay more attention to efficiency.  Why?  Because every play they run is part of a series of downs in which their goal is to get a first down.  Downs are only unlimited if you are successful in earning a new set of downs.  If you don’t pick up the 10 yards, your offense goes to the sideline.

When you understand play calls with that notion of scarcity in mind, efficiency starts to make sense.  The graph below shows the odds of converting a 1st down on 2nd down, based on the yardage to gain.  It obviously makes sense that a first down is more likely with fewer yards to go, but NFL teams seem to ignore that obvious idea when they give the ball to old running backs who are not efficient (Thomas Jones has a sneaking suspicion that I’m talking about him).

image

Jones is an extreme example because he averaged a horrendous 3.4 yards per carry or so.  But even the difference between 2nd and 5 and 2nd and 6 is significant.  You’re about 36% to convert a 1st down on 2nd down when you only have 5 yards to gain, and that drops to just 29% when you have 2nd and 6.  That’s a good example because a younger running back can very easily average 5 yards per carry while an older running back just averages 4 yards per carry.
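
Here is that comparison as a tiny sketch. The two conversion rates are the ones quoted above; treating a back's average as his exact gain on 1st and 10 is a simplification, and the names are made up for illustration.

```python
# 2nd down conversion rates by yards to go, as quoted in the text.
SECOND_DOWN_CONVERSION = {5: 0.36, 6: 0.29}


def second_down_to_go(first_down_gain, first_down_distance=10):
    """Yards still needed on 2nd down after a given 1st down gain."""
    return first_down_distance - first_down_gain


def conversion_rate_after(first_down_gain):
    """Look up the 2nd down conversion rate that follows a 1st down gain."""
    return SECOND_DOWN_CONVERSION[second_down_to_go(first_down_gain)]


younger_back = conversion_rate_after(5)  # 5 yards per carry leaves 2nd and 5
older_back = conversion_rate_after(4)    # 4 yards per carry leaves 2nd and 6
relative_edge = younger_back / older_back - 1

print(f"Younger back: {younger_back:.0%}, older back: {older_back:.0%}")
print(f"Relative edge on 2nd down: {relative_edge:.0%}")
```

One extra yard on first down is worth roughly a quarter more 2nd down conversions in relative terms.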

The notion of scarcity might not seem like it applies to running backs because they often will get 200-300 carries in a season.  When there are that many carries to think about, they seem abundant and like we shouldn’t care much about efficiency.

But efficiency does matter.  A first down run that leaves you 8 yards to gain both hurts your chances of converting a 1st down and limits your play calls going forward.

Thomas Jones might be the most extreme example of suboptimal decision making on this issue, but a lot of NFL teams are guilty of ignoring the fact of life that efficiency matters in the NFL.

Running Back is a Young Man’s Job Title

I threw together the graph below because I’m always interested in the implications that age has on the NFL game.  Last week I wrote that careers for coaches probably peak around 51 or 52, and that a Super Bowl coach is likely to be younger than the average NFL coach.

I figured I would use the same methodology as I did in the post on coaches and apply it to running backs.  I looked at about 3,000 running back seasons going back to 1990, calculated a career yards per carry average for each back, and then looked at each year of age in terms of its relation to the back’s career average.  I do it this way because if I just look at the average yards per carry for say 28 year old running backs, the results will be fraught with survivor bias.  Basically, only good running backs stay in the league long enough to be a 28 year old running back.  But since I don’t want to compare good 28 year old running backs against all 23 year old running backs, I have to do it this way.
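
A minimal sketch of that methodology follows. It assumes the career average is carry-weighted and that each back counts equally within an age group; those details, the function name and the two made-up backs are my assumptions for illustration, not the real data set.

```python
from collections import defaultdict


def age_curve(seasons):
    """Average (season YPC minus career YPC) at each age.

    seasons: iterable of (player, age, carries, yards) tuples.
    """
    career_carries = defaultdict(int)
    career_yards = defaultdict(int)
    for player, _age, carries, yards in seasons:
        career_carries[player] += carries
        career_yards[player] += yards

    diffs_by_age = defaultdict(list)
    for player, age, carries, yards in seasons:
        career_ypc = career_yards[player] / career_carries[player]
        # Comparing each back to himself is what sidesteps survivor bias.
        diffs_by_age[age].append(yards / carries - career_ypc)

    return {age: sum(d) / len(d) for age, d in sorted(diffs_by_age.items())}


# Hypothetical seasons: (player, age, carries, yards).
SEASONS = [
    ("Back A", 24, 200, 900),
    ("Back A", 25, 200, 1000),
    ("Back A", 28, 200, 800),
    ("Back B", 24, 100, 400),
    ("Back B", 25, 100, 450),
]

curve = age_curve(SEASONS)
for age, diff in curve.items():
    print(f"Age {age}: {diff:+.3f} yards per carry vs. career average")
```

Back B never reaches 28 in this toy data, but that doesn't distort the age 28 number because Back A is only ever measured against his own career average.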

In short, when a running back is 25 years old, he’s probably about as good as he’s going to be in terms of efficiency, or yards per carry.  After 25, things tend to go downhill.  First they go downhill rapidly, then they only get worse slowly as the back approaches 30.  The red line is career average.

image

Even though it looks like the line is going back up around 30, you have to keep in mind that’s relative to the running back’s career average.  They’ve already been in decline for a few years.  By the time they reach 30, they’re getting worse, but at a slower rate.

What’s actionable from this graph?  Don’t give out huge contracts to running backs who are passing 26.  From 27-29, running backs are pretty much just getting worse.

The difficult thing for NFL teams is that they often have to pay players for what the player did in the previous season, even though aging most of the time ensures that the team isn’t getting the same player in the next year.  I know this sucks for NFL running backs who take a beating and then are almost over the hill when it’s time to get paid.  The only thing I can think of is to pay running backs by the yard gained.  Then you won’t have situations like Arian Foster playing for $500k while Chris Johnson is playing for $8 million.

Loss Aversion in the NFL – Case in Point, Field Goal Kickers

I stumbled across this oddity today that fascinated me.  Field goal kickers are more likely to make a kick if the distance to a first down is greater, even if the kick is from the same distance.  For example, let’s say a field goal kicker has a 48 yard field goal.  They’re more likely to make that kick if it’s 4th and long, than they are if it’s 4th and short.

When a kicker attempts a 48 yard kick from 4th and 1, they only make it about 60% of the time.  But the same distance on 4th and 12 is made about 70% of the time.  They’re the same kick.  The line of scrimmage is the 30, and the distance is 48 yards in both cases.  It really shouldn’t matter whether it’s 1 yard for a first down, or 12.

I’ve cherry picked those distances to go, but this trend holds up over sample sizes of better than 200 kicks.

Here’s a graph which breaks kicks from similar distances into whether they were attempted with 7 yards or less to gain, or more than 7 yards to gain for a first down.  The kicks attempted with more than 7 yards to gain were made at about 70%, whereas the kicks with fewer yards to gain (but from the same yard line on the field) were only made about 64% of the time.  I split them based on that division so that there would be about 300 attempts in each group.

image
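
Given how much I harp on sample sizes, here is one way to gauge whether a gap like this could be noise: a two-proportion z-test. The counts are stand-ins built from the rounded figures above (about 70% and 64% on about 300 attempts each), not the exact tallies.

```python
import math


def two_proportion_z(made_a, att_a, made_b, att_b):
    """z statistic for the difference between two make rates (pooled SE)."""
    rate_a = made_a / att_a
    rate_b = made_b / att_b
    pooled = (made_a + made_b) / (att_a + att_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / att_a + 1 / att_b))
    return (rate_a - rate_b) / std_err


# Assumed counts: 210 of 300 with more than 7 to go, 192 of 300 with 7 or less.
z = two_proportion_z(made_a=210, att_a=300, made_b=192, att_b=300)

print(f"z = {z:.2f}")
```

With these stand-in counts z comes out a bit above 1.5. A z around 2 is the usual rough bar, so the same percentages over a larger sample would make a stronger case.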

So why would kickers make more field goals when the yards to gain are greater?  Probably because they are trying to salvage points that they had already assumed the team would get in their minds.  When a team attempts a kick from the 30 yard line and it’s 4th and 20, that means that at one point they had a 1st and 10 from the 20 yard line.  The kicker, and probably the rest of the team, already assumed they were getting points out of that drive.  So when the kicker lines up the kick, he’s extremely focused on making sure that the team leaves with points on the board.  But when it’s 4th and 1 from the 30, that series of downs started at the 39 yard line.  From the 39 yard line, they probably didn’t assume that they would be scoring for sure.  Maybe in that instance the kicker just isn’t concentrating as much.

Behavioral economists call this Loss Aversion.  Studies have shown that the motivation to avoid a loss might be twice as great as the motivation to make a gain.

The difference in kicker performance based on yards to gain is similar to studies that have been done on golfers’ putting performance.  Golfers looking at say an 8 foot par putt make that putt more often than golfers putting from 8 feet for birdie.  When the putt is for par, their desire to avoid a loss (leaving the hole with a bogey) is strong enough to get them to concentrate and make sure that they sink the putt.  It looks like kickers exhibit the same aversion to losses.

Hiring a Head Coach? Experience Might Not Be As Important As Age


Earlier in the week I offered a little evidence that coaches tend not to get better with age.  Today I thought I would look a little more specifically at what happens when teams hire older coaches who have prior head coaching experience.  This is the choice that a lot of NFL teams are confronted with.  Do they hire the older coach with the longer resume?  Or do they opt for the younger coach who is less of a known commodity?

For the purposes of today’s exercise I’ll only look at the coach’s first three years with a team.  That’s an amount of time that would probably allow for public opinion to be set.  It’s also an amount of time that should in theory at least favor the older coaches.  The first three years of a younger coach’s tenure will include, at least in part, just learning how to be a head coach, whereas older coaches should already know.

The table below shows the winning percentage of coaches during their first three years with a team and split on my general dividing line of 52/53 years old.  My theory has been that a coach crests in ability at about 51 or 52.  What we see is that despite having significantly more experience, the older age group performs worse than the younger age group of coaches.

Even though the younger coaching group had on average 4 years less head coaching experience than the older group, they showed a higher winning percentage.  You’ll note that the winning percentage for each group is relatively low, but that’s because we’re just looking at the first three years for each coach’s tenure with a team.  For a coach to even get hired, it’s likely that the previous coach got fired, which means it’s likely the coach is inheriting a loser.

First Three Years w/Team

                        Winning Percentage    Prior Years HC Experience (Average)
Coaches 53 & Older      43.4%                 5.33
Coaches 52 & Younger    46.2%                 1.01

Teams that struggle, but might think they’re close to being a winning team, might be tempted to go out and hire a coach with a lot of prior experience.  They might be inclined to go get the equivalent of the Redskins’ hire of Mike Shanahan, or the Saints’ hire of Mike Ditka, or the Cardinals’ hire of Dennis Green, or the Carolina Panthers’ hire of George Seifert, or the 49ers’ hire of Dennis Erickson… you get the point.

But the chances for turnaround might actually be even lower if teams opt for the older coach.

NFL Coaches Aren’t Like a Fine Wine… They Tend Not to Get Better With Age

With the regular season over and several coaches fired already, I figured I would post the results of some of the stuff I’ve been working on recently.  I’m very interested in the age effects that NFL coaches might experience.  Last night I tweeted that NFL coaches typically have coached their best football by the age of about 51 or 52.  It’s not going to be the same in every case, but that’s the age that coaches tend to crest if you look at about the last 30 years of results.

In order to measure whether coaches decline in their 50’s, you can’t just break the NFL up into age cohorts and average the winning percentage for each year.  This doesn’t work because you run into what we call survivor bias.  Basically only the good coaches survive to coach into their 50s, so you end up comparing good coaches who are coaching at the age of 57 with bad coaches who might get fired before they even reach 50.

What I’ve done instead is measure each coach’s winning percentage at each age and then compare it with his career average winning percentage.  So let’s say that Bill Belichick posts a winning percentage of 1.000 in 2007 at the age of 55.  That’s 0.360 above his career winning percentage of 0.640, so that year counts as +0.360 for Belichick.  Another example is that in 2010, Wade Phillips posted a 0.125 winning percentage in his final year with the Cowboys when he was 63 years old.  So that counts as –0.415 for Wade because his career winning percentage is 0.540.

If we do that with every NFL coach who was a head coach since 1980, we can average the +/- relative winning percentage for each age.  Here’s a graph of the results:

image

The X axis sits at 0, which represents a coach’s career winning percentage.  The red line is a smoothed trendline, and the blue dots are the average relative winning percentages for all coaches at each age, so you can see the data points in addition to the trendline.  Basically a coach is going to post winning percentages higher than his career average in his 40s, then decline slightly in his 50s, and that decline becomes rapid as he approaches 60.

If you want a simple way to interpret the graph, you could say that a 0.600 coach is likely to be closer to .500 in the year that they are 60 years old.
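For anyone who wants to reproduce the calculation behind the graph, here’s a minimal sketch.  It uses only the two seasons quoted in the text (Belichick in 2007 and Phillips in 2010) and takes the career winning percentages as given rather than recomputing them.  The function name and data layout are my own assumptions, and a real run would need every coach-season since 1980.

```python
from collections import defaultdict

# (coach, age that season, winning percentage that season).
# Only these two seasons come from the text; a full data set is assumed.
seasons = [
    ("Bill Belichick", 55, 1.000),
    ("Wade Phillips", 63, 0.125),
]

# Career winning percentages as quoted in the text.
career_win_pct = {"Bill Belichick": 0.640, "Wade Phillips": 0.540}


def relative_win_pct_by_age(seasons, career_win_pct):
    """Average (season win% minus career win%) across coaches at each age."""
    deltas = defaultdict(list)
    for coach, age, win_pct in seasons:
        # Comparing each coach with himself sidesteps survivor bias.
        deltas[age].append(win_pct - career_win_pct[coach])
    return {age: sum(vals) / len(vals) for age, vals in sorted(deltas.items())}


by_age = relative_win_pct_by_age(seasons, career_win_pct)
for age, delta in by_age.items():
    print(f"age {age}: {delta:+.3f}")
```

Each age ends up with one number, which is what a blue dot on the graph represents.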

Before I move on to what implications this information might have for hiring coaches, let’s also look at the distribution of ages of NFL coaches, as well as the age distribution of Super Bowl winners.  The graph below shows the normal distribution that you get when you use the mean age and standard deviation for each of these subsets of coaches.  The basic takeaway: Super Bowl winners are younger than the average coach.  And it’s not just that they’re younger on average.  The distribution curve shows us that Super Bowl winners are a lot more tightly clustered around the age of 49.  Coaches with winning records were half a year younger on average than coaches with losing records.

image

It might be easier to look at the same information in table form:

                                Average Age
Super Bowl Winners              48.9
Conference Champions            49.7
Coaches With Winning Records    50.4
Coaches With Losing Records     50.9

How then should this information be interpreted?

First, this information is generally consistent with what science can tell us about the way the brain ages.  As we get older we continue to increase or maintain our knowledge and experience (crystallized intelligence) while our problem solving abilities (fluid intelligence) deteriorate.  For an NFL coach it’s a matter of threading the needle: compiling enough experience to be a good coach while your problem solving abilities are still generally intact.  Coaching football isn’t a contemplative profession.  Decisions have to be made quickly and in the fog of war, so to speak, so problem solving is extremely important.  The other thing to keep in mind is that often these older coaches are coaching against competitors who also have a lot of experience, but are younger and might be a little sharper from a cognitive standpoint.

For instance, when Jim Caldwell faces off against Sean Payton, both have a lot of football experience.  Payton has been coaching since he was 23, so he has over 20 years of coaching experience.  But Payton is also about 9 years younger than Caldwell, which means that it’s likely (not guaranteed) that he can process information faster.

For the purposes of making coaching personnel decisions, it seems to me that this information should be one of a number of factors that are looked at.  If you’re a team and you have a 0.500 coach at the helm, and that coach is also getting older (say, past 53 years old), then you know that not only has he probably coached his best football, but his best football wasn’t that great anyway.  Unless there’s a compelling reason to keep him around, it might be time to get younger.  But alternatively, if you have a great coach who is entering his late 50s, your replacement might not be as good as your older coach.  Maybe it’s just time to get him a little more help that might address some of the slowing in cognitive processes that occurs as people get older.

Lastly, if you’re an NFL team and you’re on a coaching search, there’s probably not a lot of upside in hiring the known coaching names who are also older.  Jeff Fisher is 53 years old and is a lifetime 0.530 coach.  He’s an example of a guy who has a ton of experience and had the chance to use that experience throughout his 40s, but was really just above average.  He would be a rare case if he suddenly became a really great coach later in his career.

Maybe one cautionary tale is Mike Shanahan.  The Redskins are paying him something like $7 million per season because he’s a coaching legend with multiple Super Bowl titles.  But he won those Super Bowls when he was 45 and 46 years old.  He’s 59 now and will be 60 years old next year.  He was a 0.580 coach until he turned 53 and has only been a .500 coach since.  What are the chances he turns things around at the age of 60?

Checking in on the Tim Tebow Experiment

When the Tebow train started rolling I said that I was rooting for Tebow, but also that he had to improve if the Broncos were going to keep winning at the rate they had been.

I want Tebow to succeed because I think the highest level offense you can run in football is one where defenses are kept guessing by whether the quarterback might run.  NFL teams are rarely going to think outside the box, so I fear that if Tebow doesn’t succeed, we might have to wait longer for more teams to try what Denver is doing.

So where are we now?

The good news is that Tebow is 7-3 as a starter.  The bad news is that over those 10 games, Denver has actually been outscored by 42 points in total!!!  The Broncos somehow manage to win the close ones and lose the blowouts.

Going into the game with Chicago, Tebow had thrown 9 touchdowns and just 1 interception.  As I wrote in my original post on Tebow, variance usually catches up with people.  Since the start of the Chicago game Tebow has thrown 2 touchdowns and 5 interceptions.

To go back to my original point, the notion that someone “just wins” football games tends to have a short life span when it rests on small samples and when, in some cases, that person isn’t doing anything on the field to actually win the game.  Over time that player either has to do the things necessary to actually win the game, or he will eventually run into more uneven results, which Tebow has.

Here’s an excerpt from my original post:

The “just wins games” claim is problematic for quarterbacks because unless they are actually doing things that correlate with winning, they will eventually lose games even if they are putting forth the same types of efforts.  They’ll lose games because they’ll run into tougher opponents.  They’ll lose games because they won’t get a key defensive touchdown.  They’ll lose games because a tipped pass will get intercepted.  The same randomness that allowed an average or below average quarterback to compile the “just wins” resume will eventually go the other way and the media will have to create a new narrative to explain why the QB doesn’t just win anymore.

The “just wins” label is only attached to quarterbacks whose shitty performance on the field is somehow at odds with their quarterback record.  Nobody says that Aaron Rodgers “just wins” games.  They say he’s an awesome quarterback and the reason the Packers win is because of the things Aaron Rodgers does on the field.

Again, a portion of the Broncos’ results can be explained by Tebow’s performance.  Not throwing interceptions is worth something.  Extending drives with runs on third down is worth something.  But Tebow’s record as a starter is now 4-1 this year and Tebow’s performances are not the kind of quarterback performances that will lead to an 80% win rate over any extended observation period.

Top Links 12-28-2011

The Link Leaderboard

  1. The Snap Report – Week 16 Offense | ProFootballFocus.com (Source)
  2. TwitLonger — When you talk too much for Twitter (Source)
  3. Prelude to the Pro Bowl vote – Extra Points – Boston.com (Source)
  4. The impact of playoff scenarios on Week 17 | ProFootballFocus.com (Source)
  5. Changes coming for Chargers brass | SignOnSanDiego.com (Source)
  6. All Day’s Nightmare | National Football Post (Source)
  7. Free Online Radio – Internet Talk Radio – 404 Page | BlogTalkRadio (Source)
  8. NFL Communications – Video: Legal & Illegal Hits on Defenseless Players « (Source)
  9. Cleveland Browns quarterback Colt McCoy has a chance to face Steelers if he can practice this week | cleveland.com (Source)
  10. Roundup: Videos of a Brawl at the Mall of America, the Top 25 Pop Songs of 2011 Rolled into One & is it Tebow Time in Florida Politics? | The Big (Source)

Putting Drew Brees’ Record in Context

Drew Brees is now the record holder for passing yards in a season, passing Dan Marino’s 1984 mark.  However, it’s worth a quick mention that Brees’ season isn’t nearly as impressive as Marino’s season in 1984.  To compare different eras, maybe the easiest thing to do is to calculate each player’s percentage of total yards in the NFL that year.  However, the NFL has expanded since the ’84 season, so more teams now split the league total, and that comparison won’t work.

Another thing you could do is to standardize each statistic, which just means measuring how many standard deviations each player was above the mean for the season.  If we do that we see that Drew Brees’ passing yards this year are 2.2 standard deviations above the mean, while Marino’s ’84 season was 2.7 standard deviations above the mean.
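Here’s a small sketch of that standardizing step.  The yardage totals below are made-up placeholders, not real 2011 or 1984 numbers, and the choice of the population standard deviation (rather than the sample version) is my assumption since the text doesn’t say which was used.

```python
from statistics import mean, pstdev


def z_score(value, population):
    """How many standard deviations `value` sits above the population mean."""
    mu = mean(population)
    sigma = pstdev(population)  # population std dev; an assumption
    return (value - mu) / sigma


# Hypothetical season passing-yard totals for a league's starters.
league_yards = [2800, 3100, 3300, 3500, 3700, 3900, 4200, 5000]

leader = max(league_yards)
print(f"leader is {z_score(leader, league_yards):.2f} standard deviations above the mean")
```

Because each season is measured against its own mean and spread, the result doesn’t depend on how many teams were in the league that year.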

If you wanted to think about Brees’ or Marino’s accomplishments in terms of probability, Brees’ accomplishment is about 3 times as likely to happen in a season like the 2011 season as Marino’s accomplishment is to happen in a season like the ’84 season (although both players’ seasons would be considered long shots).

But the simple translation is that Marino was further ahead of his competition than Brees is ahead of his.
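One way to turn those standard deviations into probabilities is to read them off a normal curve.  The sketch below does that with the rounded figures of 2.2 and 2.7.  It assumes passing yards are normally distributed, which is a simplification, and the exact ratio is sensitive to both that assumption and the rounding, so it won’t necessarily reproduce the “about 3 times” figure above.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


brees = upper_tail(2.2)   # rounded z-score quoted for Brees' 2011 season
marino = upper_tail(2.7)  # rounded z-score quoted for Marino's 1984 season

print(f"P(z > 2.2) = {brees:.4f}")
print(f"P(z > 2.7) = {marino:.4f}")
print(f"ratio = {brees / marino:.1f}")
```

Either way the direction is the same: a 2.2 season is several times more likely than a 2.7 season, and both are rare.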

Top Links 12-27-2011

The Link Leaderboard

  1. Changes coming for Chargers brass | SignOnSanDiego.com (Source)
  2. Cleveland Browns quarterback Colt McCoy has a chance to face Steelers if he can practice this week | cleveland.com (Source)
  3. Week 16 NFL Rewind | National Football Post (Source)
  4. Aaron Rodgers the clear-cut MVP after dismantling Chicago Bears – Peter King – SI.com (Source)
  5. FOOTBALL OUTSIDERS: Innovative Statistics, Intelligent Analysis | An Early Look at 2011 CB Charting (Source)
  6. Falcons-Saints: What to watch for tonight | National Football Post (Source)
  7. NFL.com news: Bears must focus on building around Cutler; Week 16 notes (Source)
  8. NFL.com news: Giants’ Cruz transforms into record-setting receiver (Source)
  9. SPORTSbyBROOKS » 1915: ‘merry Christmas. We will not fire to-morrow’ (Source)
  10. Fantasy Daddy NBA Group | FanDuel (Source)

Top Links 12-26-2011

The Link Leaderboard

  1. Fantasy Football Live – Yahoo! Fantasy Sports (Source)
  2. RotoRadio Xperts Edge 12/24 by RotoRadio | Blog Talk Radio (Source)
  3. New York Jets host New York Giants in key NFL Week 16 matchup | NJ.com (Source)
  4. UPDATE: A Green Bay win puts Falcons in the playoffs | Atlanta Falcons (Source)
  5. NFP Sunday Blitz | National Football Post (Source)
  6. Ndamukong Suh named America’s most charitable athlete | ProFootballTalk (Source)
  7. Bears-Packers: What to watch for tonight | National Football Post (Source)
  8. Bad Request (Source)
  9. NFL Week 16: Five things to watch | National Football Post (Source)
  10. Twitvid (Source)