Judging how surprising a new record is

Eliud Kipchoge, marathon world record holder

World record marathon times have been falling in increments of roughly 30 seconds, each new record shaving roughly 30 seconds off the previous record. If someone were to set a new record, taking 20 seconds off the previous record, this would be exciting, but not suspicious. If someone were to take 5 minutes off the previous record, that would be suspicious.

One way to quantify how surprising a new record is would be to divide its margin of improvement over the previous margin of improvement. That is, given a new record y, and previous records y₁ and y₂, we can calculate an index of surprise by

r = (y − y₁) / (y₁ − y₂)

In [1] the authors analyze this statistic and extensions that take into account more than just the latest two records under technical assumptions I won’t get into here.

A p-value for the statistic R is given by

Prob(R > r) = 2/(r + 2).

You could think of this as a scale of surprise, with 0 being impossibly surprising and 1 being completely unremarkable.

There are multiple reasons to take this statistic with a grain of salt. It is an idealization based on assumptions that may not even approximately hold in a particular setting. And yet it is at least a useful rule of thumb.

The current marathon record beat the previous record by 30 seconds. The previous margin of improvement was 78 seconds. This gives a value of r equal to 0.385 and a corresponding p-value of 0.84. This says the current record is impressive but statistically unremarkable. An improvement of 5 minutes, i.e. 300 seconds, would result in a p-value of 0.17, which is notable but not hard evidence cheating. [2]

The assumptions in [1] do not apply to marathon times, and may not apply to many situations where the statistic above nevertheless is a useful rule of thumb. The ideas in the paper could form the basis of a more appropriate analysis customized for a particular application.

Reports of a new record in any context are usually over-hyped. The rule of thumb above gives a way to gauge for yourself whether you should share the report’s excitement. You shouldn’t read too much into it, like any rule of thumb, but it at least gives a basis for deciding whether something deserves closer attention.

Surprisingly not that surprising

More rule of thumb posts