The validity of the so-called “10,000 hour rule” (hereafter “the rule”), as it applies to the development of expertise and excellence in performance, has been debated extensively over the past several years. As first asserted by Ericsson, the rule holds that reaching an “expert” or “master” level of accomplishment requires a minimum of about 10,000 hours of “deliberate practice” and that improvement follows a linear growth rate. “Deliberate practice” is focused (perhaps structured) training in which one consciously addresses weaknesses whilst maintaining (and possibly improving) strengths. The 10,000 hours works out to about 10 years of focused training before one can attain an “expert” or “master” level in the endeavor. The underlying supposition is that “nurture” overwhelmingly dominates “nature”, i.e., as some would say, “talent is over-rated”. The egalitarian basis of “the rule” has resonated with a society that values a hard-work ethos leading to success, something that is perhaps fundamental to any civil society. But reality is, in this case, something very different.
As applied to sport, many have noted that there are numerous examples of athletes who have invested much less than 10,000 hours of focused training yet exhibit “excellent” performance at the international and Olympic level. Similarly, many have noted numerous examples of athletes who, after investing substantially more than 10,000 hours of focused training, have still not reached (or even come near to) excellence in their respective sports. All of this is, of course, contrary to “the rule”, and a number of excellent analyses refute “the rule” as a controlling, single factor in the development of expertise and excellence in performance. The best of these analyses that I have seen are well represented by those of Ross Tucker here, here, and here. Tucker concisely and thoroughly shows that, in addition to the glaring lack of attention to the statistical variance in the data as first presented by Ericsson (and subsequently by others), a myriad of arguments and data detail the many other factors that clearly play significant roles in performance excellence. Not the least of these factors is individual gene expression, the subject of the recent book “The Sports Gene” by David Epstein. Epstein’s thesis is that unique combinations of “hardware” (genes) and “software” (training and opportunity) are what lead to performance excellence, not just training time as Ericsson and his acolytes assert.
In this post I present another aspect of the debate that has generally been overlooked: the statistical rarity of excellence and approaches to defining such excellence.
How to define an “expert” or “performance excellence”?
One of the deficient parts of the debate has been in defining exactly what “expert” or “excellence” is. For chess, Ericsson uses the “master” level achievement as the definition of “expert”, and such an earned title is based on the performance of chess players in tournaments with other “ranked” players. To first order this is a reasonable approach for something like chess. For sport, other systems can be used, but in the case of running, particularly track and road events, the finishing time is an almost absolute reckoning of the level of excellence of a particular performance. Analytical comparisons of an athlete’s best time with the world record provide a sound basis for establishing a scale upon which “levels” of achievement can be placed.
One approach to deriving an analytical basis for the determination of excellence (or “expert” (elite) level) in standard distance, timed events is via statistics. A large collection of finishing times for a particular event (e.g. marathon, mile, 800 m, etc.) can be analyzed for distribution type (normal, log-normal, etc.), and metrics can then be applied to define “levels” of accomplishment. In the case of a normally distributed population of marathon times, for instance, one could use the number of standard deviations faster than the mean as an analytical metric defining expertise/excellence: for example, “good” = 1-2 standard deviations from the mean (84.1-97.7 percentile), “very good” = 2-3 standard deviations from the mean (97.7-99.9 percentile), and “expert” (elite) = >3 standard deviations from the mean (>99.9 percentile). Similar metrics can be used for other types of distributions, should the data turn out to be non-normal.
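The standard-deviation bands can be checked with a few lines of Python. This is a minimal sketch, assuming a normal distribution of finishing times; the function names are illustrative, and any mean and standard deviation passed to `z_score` would have to come from real race data.

```python
from statistics import NormalDist


def percentile_at(z: float) -> float:
    """Percentile (0-100) of a result z standard deviations better than the mean."""
    return 100.0 * NormalDist().cdf(z)


def z_score(time_s: float, mean_s: float, stdev_s: float) -> float:
    """Standard deviations faster than the mean (positive means faster)."""
    return (mean_s - time_s) / stdev_s


def level_for(z: float) -> str:
    """Map a z-score onto the 'good' / 'very good' / 'expert' bands."""
    if z > 3:
        return "expert (elite)"
    if z > 2:
        return "very good"
    if z > 1:
        return "good"
    return "unclassified"


if __name__ == "__main__":
    # Percentile boundaries of the three bands.
    for z in (1, 2, 3):
        print(f"{z} SD faster than the mean: {percentile_at(z):.2f} percentile")
```

The band edges come out at roughly the 84.1, 97.7, and 99.9 percentiles.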
A problem with this approach is deciding exactly what population of finishing times to analyze. Using all available times from a particular event will likely skew the data toward longer finishing times, as many who participate in a given event are not “athletes”; this is particularly true of middle and long distance events (5 km to ultramarathons). Truncating the population at a certain cutoff finishing time will clearly help (e.g. using only times less than 4 hours for analysis of men’s marathon finishing times), but such a protocol involves a somewhat arbitrary determination, and without a sensitivity analysis the results could still be skewed.
The “percentage back” approach
Another, more robust, approach involves a simple process of rank ordering of the best ever finishing times for a particular event and then calculating the percentage time back from the best ever finishing time. The best ever finishing time provides an absolute reference against which any other time can be compared. A plot of cumulative probability (percentile rank) versus percentage back from the best ever finishing time will yield at least two useful things:
- “levels” of expertise/excellence can be applied to the data (e.g. “expert” (elite) could be defined as a best result that is less than 5% back from the best ever finishing time, “very good” (sub-elite) could be defined by times less than 10% back, etc.)
- the analytic functionality of the “excellence curve” of that particular event can be determined and allow for scaling of a given effort
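The “percentage back” calculation and the example levels can be sketched as follows. This is an illustration, not a definitive implementation: the function names are made up for this sketch, and the two marathon times are the ones quoted later in this post.

```python
def parse_time(hms: str) -> int:
    """Convert an 'H:MM:SS' finishing time to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def percent_back(time_s: float, best_s: float) -> float:
    """Percentage by which a finishing time trails the best-ever time."""
    return 100.0 * (time_s - best_s) / best_s


def level(pct_back: float) -> str:
    """Apply the example thresholds: under 5% is elite, under 10% is sub-elite."""
    if pct_back < 5.0:
        return "expert (elite)"
    if pct_back < 10.0:
        return "very good (sub-elite)"
    return "unclassified"


if __name__ == "__main__":
    best = parse_time("2:03:02")  # fastest-ever time used in this post
    ito = parse_time("2:07:57")
    back = percent_back(ito, best)
    print(f"{back:.3f}% back -> {level(back)}")
```

The 5% and 10% thresholds are parameters of the proposal, so they are easy to change if other cutoffs are preferred.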
Such “percentage back” analysis approaches are utilized regularly in cross country skiing to calculate World Cup points and thereby rank all competitors. One reason is that finishing times in cross country skiing are highly variable for the same distance, because snow and weather conditions play a dominating role in skiing speed (skiing speed for a given race distance (say, 30 km) shows about a 30-40% variability across events depending on course conditions and weather). So for an individual event on a given day, under whatever conditions are prevailing, the percentage back from the winning time is the most relevant metric for evaluation of a performance. Corrections are made for the “quality” of the field at each event to ensure that races where a strong field is present are more heavily weighted than those with a much lower level of competitiveness.
In the case of running, finishing times are much less affected by weather and prevailing surface conditions, particularly those finishing times that are among the fastest ever recorded. So the “percentage back” approach can be used to make comparisons between events, and one can therefore include all finishing times for an event, independent of when and where it took place. Data sets that include roughly the 500 fastest finishing times ever, or more, will accurately establish the “expert” or “elite” tail of the distribution of all recorded times; it is this tail of the distribution that matters for the purposes of this post.
I will suggest here that “excellence” (elite) could reasonably be defined by those finishing times that are less than 5% back from the fastest ever time. Similarly, “very good” (sub-elite) could be finishing times less than 10% back, etc. This is just a proposal, not a proclamation; other defensible choices are possible, but the 5% and 10% thresholds are commonly used in evaluations of talent in cross country skiing.
The “excellence curve” – Competition in running is an “exponential world”
As an example, presented below is a plot of percentile rank (cumulative probability) of the 499 fastest men’s marathon times ever recorded against percentage back from the fastest ever finishing time (2:03:02, G. Mutai, 4/18/11 (Boston)). Note that this type of analytic normalization of rank order is used to calculate percentile rank on the SAT for each cohort taking the test. Only a truncated population of the fastest men’s marathon finishing times (i.e. the equivalent of test scores) is shown here because we are interested in the “excellence” end of the population, so the “S” curve expected for a full population does not appear.
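The points for such a plot can be computed along these lines. This is a sketch under stated assumptions: the percentile-rank convention used here (rank i of N gives 100·i/N) is one common choice and may not be the one behind the original graphic, and the function name is illustrative.

```python
def excellence_points(times_s):
    """Return (percent_back, percentile_rank) pairs, fastest time first.

    Assumption: the i-th fastest of N times is given percentile rank
    100 * i / N, so the slowest time in the set sits at 100.
    """
    ordered = sorted(times_s)
    best = ordered[0]
    count = len(ordered)
    return [
        (100.0 * (t - best) / best, 100.0 * (i + 1) / count)
        for i, t in enumerate(ordered)
    ]
```

Given a list of the fastest recorded times in seconds, the function returns one point per time, with the fastest at 0% back.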
Clearly the relationship is non-linear; in fact, it is exponential. This empirical curve is the current “excellence curve” for the men’s marathon, in that it defines the form and magnitude of the time improvement required to progress in the marathon event.
Presented below are the same data as in the first graphic, with a fitted exponential function. The equation for the curve is shown on the graph; the exponent (base e) is about 1.26.
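One common way to fit such a curve is ordinary least squares on the logarithm of the percentile rank. The sketch below assumes a model of the form rank = a·exp(k·x), with x the percentage back; whether the original fit was done this way is an assumption, and the function name is illustrative.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(k * x) by least squares on ln(y); return (a, k).

    Requires at least two points, all y > 0, and x values that are not
    all identical.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(xs, logs))
    k = sxy / sxx                       # slope of ln(y) against x
    a = math.exp(mean_log - k * mean_x)  # intercept, mapped back from log space
    return a, k
```

With x as percentage back and y as percentile rank, the returned `k` plays the role of the exponent reported on the graph.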
K. Ito’s time of 2:07:57 (Beijing, 1/19/86) is 3.996% back from G. Mutai’s 2:03:02 (Boston, 4/18/11), and because percentile rank grows exponentially with percentage back, that gap separates the two performances by an exponential, not a linear, margin. Ito would have to improve exponentially to claw his way down the marathon “excellence curve”. There is no linearity in performance excellence for the marathon*. One will find similar exponential results for other distances. This analysis also shows exactly how rare and ethereal the top performers are.
Using the suggested protocol for defining “excellence” (elite) and “very good” (sub-elite) mentioned above, “elite” marathoners would be those with results faster than about 2:09:11 (less than 5% back from the fastest ever time of 2:03:02), and “sub-elite” marathoners would be those with results faster than about 2:15:20 but slower than 2:09:11 (less than 10% back but more than 5% back from the fastest time ever).
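The cutoff times follow directly from the reference time. This is a small sketch with illustrative helper names, rounding to the nearest second.

```python
def parse_time(hms: str) -> int:
    """Convert an 'H:MM:SS' finishing time to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def format_time(total_s: float) -> str:
    """Format seconds as 'H:MM:SS', rounded to the nearest second."""
    hours, remainder = divmod(round(total_s), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def cutoff_time(best_s: float, pct_back: float) -> float:
    """Finishing time that sits exactly pct_back percent behind the best time."""
    return best_s * (1.0 + pct_back / 100.0)


if __name__ == "__main__":
    best = parse_time("2:03:02")
    print("elite cutoff:    ", format_time(cutoff_time(best, 5)))
    print("sub-elite cutoff:", format_time(cutoff_time(best, 10)))
```

Taking 2:03:02 as the reference, the thresholds work out to about 2:09:11 (5% back) and 2:15:20 (10% back).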
The “10,000 hour rule” in an exponential world
A fundamental premise underlying the work of Ericsson (and others who subscribe to the “10,000 hour rule”) is that increasing the total volume of deliberate practice by itself leads to greater accomplishment, until one reaches the “master” or “expert” (elite) level at total accumulated training times greater than about 10,000 hours. This is the reason that numerous books have been written describing various ways to go about becoming “expert” (elite) using deliberate practice. All of these books center around a basic tenet: more deliberate practice (and only more) is better, necessary, and sufficient to achieve excellence. Examples of books that espouse the “10,000 hour rule” tenet are “The Talent Code – Greatness Isn’t Born. It’s Grown. Here’s How”, “Talent is Overrated: What Really Separates World-Class Performers from Everybody Else”, and “Outliers: The Story of Success”, in which the authors gleefully proclaim that anyone can be “expert” or attain “performance excellence” just by grinding away at deliberate practice for long enough.
Applying this principle to the marathon, it follows that anyone who accumulates 10,000 hours of deliberate practice would be “expert”, with results that count as “performance excellence” (elite). We all know that this is not true: there are thousands of dedicated, smart-training, 10,000 hour+ marathon runners who will never see the likes of a 2:07 finishing time. Just ask your local 2:15, 10,000 hour+ marathoner exactly what they think about ever finishing a race in 2:07 (note: this result would still only get one to within about 4% of the best time). Additionally, it is not possible to increase training time non-linearly for any meaningful period, as any such increase rapidly runs out of hours in the day and exceeds the training stress that one’s body can take without breaking down physiologically**. What this and the data above show is that, for the men’s marathon (and I note that this holds for other distances as well), linear increases in deliberate practice (training) cannot by themselves move a runner along the exponential “excellence curve”. One must introduce some individual non-linearity into the improvement process in order to ever be able to compete at the highest levels. I will suggest that one origin of such non-linear improvement is what is colloquially called “talent”, i.e. an innate, likely genetic, predisposition to non-linear improvement with deliberate practice in a chosen sport. We have likely all had a training partner or fellow competitor who, with a very similar training program and volume, accelerates in performance and leaves “the rest” behind in another category entirely. I’ve seen this not only in sport (tennis, road cycling, mountain biking, and cross country skiing) but also in academics (physics, chemistry, mathematics).
To use the sub-title of one of the books noted above, what really separates world-class performers from everybody else is not deliberate practice alone but rather the combination of deliberate practice and innate abilities, together with other factors such as environment, access and, importantly, motivation (the subject of a future post). All of these elements combine to produce the exponential improvement that populates the high performance tail of the finishing time distribution.
The “10,000 hour rule” is a linear concept which has no singular place in the exponential world of athletic performance in endurance sport; the data are clear.
* A similar analysis including many more marathon finishing times (say, 100,000) may eventually show a linear dependence at some point far out on the finishing time/percent back scale; however, that part of the curve does not define “excellence”. The “excellence” part of the curve is exponential, as shown here.
**Daniel Coyle, the author of the book “The Talent Code”, argues that under certain situations (he uses the example of a music camp in upstate New York) one can experience non-linear increases in training effectiveness through something that he calls “deep practice”. However, Coyle also notes that this happens only over a limited period of time (7 weeks in the example of the music camp).