A couple of recent posts have explored whether there is an “optimal” age for elite performance in the road marathon.
Graydon uses top finishing times (times less than 2:11:00 for men) and “age on race day” to evaluate whether there is empirical support for an “optimal” age for excellence in this event and whether the age distribution has changed between the pre- and post-2000 periods. He finds that there is essentially no change in average age between the groups and that the average age in the sub-2:11:00 population is about 28.3 years with a standard deviation of about 4 years. This result is essentially in agreement with the widely held opinion that late 20’s to early 30’s is the peak performance period for elite marathon performance. No surprise here.
Alex finds a statistically significant drop in the average age of the top 100 marathon times in each year since 2008, when 21-year-old Samuel Wanjiru won the Olympic Marathon gold medal in an Olympic-record 2:06:32. It is suggested that Wanjiru inspired other younger runners to move to the marathon at an earlier stage of their careers. In another very nicely presented graphico-statistical piece by Alex and collaborators on what it will take to run a 2:00:00 marathon, it is argued that the person will be 5’6″, 120 lbs, very efficient, and in his early 20’s. The age part of this prediction is based on the same data as presented in Alex’s other post.
Presented here is an analysis of the same data that Graydon uses (the top 2500 men’s marathon times ever recorded, including individual repeat performances). The analysis is more granular in that, rather than picking a “threshold” marathon time below which times are included, the same age-correlated data are analyzed based on percentage back from the fastest ever time. It is argued elsewhere why “percentage back” is a more insightful way of looking at athletic performance in timed events, running in particular. A “percentage back” perspective on performance is used by many coaches to gauge progression for a developing athlete, and in some sports (e.g. cross country skiing) “percentage back” is regularly used by national sports organizations to select competitive teams (e.g. Olympic, World Cup, and World Championship teams).
Graydon’s analysis uses a histogram count method to assess the statistics of the age distribution for marathon performances below 2:11:00. Although insightful for such “elite” and “sub-elite” marathon performance, the analysis does not reveal any information on the shape of the distribution of performances within this “elite”-to-“sub-elite” category. Similarly, the analysis provided by Alex looks at the top 100 marathon performances for a given year with no upper time cutoff and no granularity as to the shape of the distribution of finishing times (or “percentage back from the fastest ever time”) that are included in the analysis.
Here we will look at the fastest 2500 marathon times ever recorded as a function of the “age on race day” for each performance. The period for these data is 1967-2014; in other words, the oldest marathon finishing time within the top 2500 was recorded in 1967 (2:09:37 by Derek Clayton (AUS) at Fukuoka on 12 December 1967). Similar to Graydon, we will also look at differences between two sub-periods contained in the dataset, 1967-2000 and 2001-2014, to attempt to detect any change in the age-correlated distribution of performances.
Presented below is a plot of all of the data (1967-2014) where the abscissa is the “age on race day” and the ordinate is percentage back from the fastest ever time (Dennis Kimetto, 2:02:57, Berlin 2014). Also provided are the finishing times associated with the percentage back values.
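For readers who want to reproduce the ordinate, here is a minimal Python sketch of the “percentage back” calculation. It assumes that “percentage back” means the time difference from the reference mark divided by the reference time; the function names are illustrative and do not come from any of the posts discussed. The only figures taken from this post are Kimetto’s 2:02:57 and Clayton’s 2:09:37.

```python
def parse_hms(text: str) -> int:
    """Convert an 'H:MM:SS' finishing time to whole seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds: float) -> str:
    """Convert seconds back to 'H:MM:SS', rounded to the nearest second."""
    total = int(round(total_seconds))
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Reference mark: fastest time in the dataset (Kimetto, Berlin 2014).
REFERENCE_SECONDS = parse_hms("2:02:57")


def percent_back(finish: str, reference_seconds: int = REFERENCE_SECONDS) -> float:
    """Percentage by which a finishing time trails the reference time.

    Assumed definition: 100 * (t - t_ref) / t_ref.
    """
    return 100.0 * (parse_hms(finish) - reference_seconds) / reference_seconds


def time_at_percent_back(pct: float, reference_seconds: int = REFERENCE_SECONDS) -> str:
    """Finishing time that sits exactly `pct` percent behind the reference."""
    return format_hms(reference_seconds * (1.0 + pct / 100.0))


if __name__ == "__main__":
    print(f"Clayton 1967: {percent_back('2:09:37'):.2f}% back")
    for pct in (1.0, 2.0, 3.0):
        print(f"{pct:.0f}% back = {time_at_percent_back(pct)}")
```

Under this assumed definition, 3%, 2%, and 1% back correspond to roughly 2:06:38, 2:05:25, and 2:04:11 respectively.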
The data form a diffuse “whorl” terminating with an apex at about an age of 30 years for the fastest performances. Symmetric lines have been placed upon the data to guide the eye in looking at patterns in the data. This “whorl” shape and apparent deviations lead to numerous observations including:
- The obvious “bulge” at younger ages (i.e. to the left of the left hand guideline on the core “whorl”) indicating that there are many more sub-3% back performances at ages less than 25 years than there are at ages greater than 30 years. This observation supports Alex’s prediction that the future sub-2:00:00 marathoner might be significantly younger than the current world record holder.
- The higher proportion of younger performances becomes even more pronounced at the sub-2% back performance level. My personal experience has been that athletes who show results in the sub-2% back category are the most likely to become consistent race winners and potential record holders; I will argue that it is more likely that a record will be set from this group, and since this group is “younger” we may very well see a sub-25-year-old record holder.
- The non-linear decay in the number of performances as the world record is approached. In fact, this decay is approximated by a simple exponential function, as is expected in athletic performance of timed events.
- The significant number of outstanding “old guy” performances to the right of the right hand guideline; this gives old guys (35 years+) some level of hope….
- The “center of mass” of the “whorl” is, of course, right about where Graydon calculates the average age in the dataset (28.3 years).
Now let’s take a look at the same dataset split into two temporal populations, the same two that Graydon analyzed: one comprising the performances from 1967-2000 and the other the performances from 2001-2014. Presented below is the same plot as above with these two populations shown in red (1967-2000) and blue (2001-2014).
Here we have the following observations:
- None of the performances in the 1967-2000 population are within 2% of the current world record, i.e. the world record time has decreased by about 2% over the last 14 years.
- Nearly all of the sub-3% back performances are from the 2001-2014 population, and this sub-3% back group is significantly younger than either the dataset taken in its entirety or either of the two sub-populations. The average age of the sub-3% back population is 26.95 years.
- The average age of the sub-2% back population is 27.1 years, about the same as the sub-3% back population, but the average age of the sub-1% back population is significantly higher at 29.7 years.
The second and third points above indicate that Alex’s prediction of a sub-2-hour marathoner younger than 30 may not be entirely supported by the data. As the performance progression proceeds from sub-3% back to sub-1% back, the average age increases significantly, which may be indicative of the maturing of those 21-25 year olds who show exceptional promise (sub-3% back) toward their peak somewhere around 30 years of age. Of course, the size of the sub-1% back population is small (11) compared to the sub-2% back (62) and sub-3% back (192) populations. Statistically, the standard error of a mean scales inversely with the square root of the sample size n, so the error associated with the sub-1% back population is much larger than that of the other two comparison populations.
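The square-root-of-n argument can be made concrete with a short sketch. The group sizes (11, 62, 192) are the ones quoted above; the 4-year standard deviation is an assumption, borrowed from the value Graydon reports for the whole sub-2:11:00 population, because the spread within each percentage-back group is not given here. Treat the output as an order-of-magnitude illustration only.

```python
import math

# ASSUMPTION: each group has the ~4-year standard deviation reported for
# the full sub-2:11:00 population; the true per-group spread is not given.
ASSUMED_SD_YEARS = 4.0

# Group sizes quoted in the text.
group_sizes = {
    "sub-1% back": 11,
    "sub-2% back": 62,
    "sub-3% back": 192,
}


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)


standard_errors = {
    label: standard_error(ASSUMED_SD_YEARS, n) for label, n in group_sizes.items()
}

if __name__ == "__main__":
    for label, se in standard_errors.items():
        print(f"{label}: n={group_sizes[label]}, standard error ~ {se:.2f} years")
```

Under that assumption, the uncertainty on the sub-1% back average age is about 1.2 years, versus roughly 0.5 and 0.3 years for the sub-2% and sub-3% back groups.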
Now let’s see if the “Wanjiru effect” mentioned above is supported by a more granular look at the data. Presented below is, once again, the same plot as previously, only now the populations are divided into a 1967-2008 group and a 2009-2014 group, following the pre- and post-2008 split that Alex uses for the top 100 marathon times in each year. Here we aggregate all years within each group, which provides higher statistical power.
We have the following observations:
- An overwhelming majority of the “young and fast” times have occurred since 2008. This is what Alex calls the “Wanjiru effect”, perhaps due to younger runners moving into the marathon event earlier than was typical prior to 2008.
- The average age of the sub-3% back group from the 2009-2014 population is 26.6 years whereas the average age of the sub-3% back group from the 1967-2008 population is 28.2 years. This is a significant (1.6 year) difference and supports the assertion that elite marathoners are getting both faster and younger; this is perhaps the strongest support for Alex’s “early twenties” future sub-2 hour marathoner.
Aggregated data for elite-level marathon times indicate that a late 20’s to early 30’s age is currently “optimal”; however, since 2008, elite-level (sub-3% back from the world record) performances have shown a significant shift to younger ages when compared to the pre-2008 elite-level population. This trend, if it should continue, will clearly yield an increasing population of “young” marathoners, some of whom, on the right day, could take down the world record…. or go sub-2 hours?