World record marathon times have been falling roughly 30 seconds at a time. If someone set a new record by taking 20 seconds off the previous one, that would be exciting but not suspicious. If someone took 5 minutes off the previous record, that would be suspicious.
One way to quantify how surprising a new record is would be to divide its margin of improvement by the previous margin of improvement. That is, given a new record y, the previous record y1, and the record before that y2, we can calculate an index of surprise by
r = (y − y1) / (y1 − y2)
In [1] the authors analyze this statistic and extensions that take into account more than just the latest two records under technical assumptions I won’t get into here.
Writing R for the statistic as a random variable and r for its observed value, a p-value is given by
Prob(R > r) = 2/(r + 2).
You could think of this as a scale of surprise, with 0 being impossibly surprising and 1 being completely unremarkable.
There are multiple reasons to take this statistic with a grain of salt. It is an idealization based on assumptions that may not even approximately hold in a particular setting. And yet it is at least a useful rule of thumb.
The current marathon record beat the previous record by 30 seconds. The previous margin of improvement was 78 seconds. This gives a value of r equal to 30/78 = 0.385 and a corresponding p-value of 0.84. This says the current record is impressive but statistically unremarkable. An improvement of 5 minutes, i.e. 300 seconds, over the current record would be measured against the current 30-second margin, giving r = 300/30 = 10 and a p-value of 0.17, which is notable but not hard evidence of cheating. [2]
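The arithmetic above is easy to reproduce. Here is a minimal Python sketch that works directly from margins of improvement in seconds. The function names are made up for illustration and do not come from [1]. The last function simply inverts the p-value formula to ask how large r would have to be to reach a given p-value.

```python
def surprise_index(new_margin, previous_margin):
    """Index of surprise r: the new margin of improvement divided by the previous one."""
    return new_margin / previous_margin


def p_value(r):
    """Prob(R > r) = 2/(r + 2), the rule of thumb quoted in this post."""
    return 2 / (r + 2)


def index_for_p_value(p):
    """Solve p = 2/(r + 2) for r, giving r = 2(1 - p)/p."""
    return 2 * (1 - p) / p


# Current record: 30 seconds better, following a 78-second improvement.
r_current = surprise_index(30, 78)
print(round(r_current, 3), round(p_value(r_current), 2))

# Hypothetical next record: 300 seconds better, following the 30-second improvement.
r_hypothetical = surprise_index(300, 30)
print(round(r_hypothetical, 3), round(p_value(r_hypothetical), 2))

# How large would r need to be for a p-value of 0.05?
print(index_for_p_value(0.05))
```

Under this formula a p-value of 0.05 requires r of about 38, which is one way to see how forgiving the two-record version of the statistic is.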
The assumptions in [1] do not apply to marathon times, and may not apply to many situations where the statistic above nevertheless is a useful rule of thumb. The ideas in the paper could form the basis of a more appropriate analysis customized for a particular application.
Reports of a new record in any context are usually over-hyped. The rule of thumb above gives a way to gauge for yourself whether you should share the report’s excitement. As with any rule of thumb, you shouldn’t read too much into it, but it at least gives a basis for deciding whether something deserves closer attention.
More rule of thumb posts
- How many errors are left to find?
- Probability of long runs
- Probability of secure hash collisions
- English to Spanish translation
[1] Andrew R. Solow and Woollcott Smith. How Surprising Is a New Record? The American Statistician, May, 2005, Vol. 59, No. 2, pp. 153-155
[2] I haven’t done the calculations, but I suspect that if you used the version of the index of surprise in [1] that takes into account more previous records, say the last 10, then you’d get a much smaller p-value.