The Cauchy distribution’s counter-intuitive behavior

Someone with no exposure to probability or statistics likely has an intuitive sense that averaging random variables reduces variance, though they wouldn’t state it in those terms. They might, for example, agree that the average of several test grades gives a better assessment of a student than a single test grade. But data from a Cauchy distribution doesn’t behave this way.

Averages and scaling

If you have four independent random variables, each normally distributed with the same scale parameter σ, then their average is also normally distributed but with scale parameter σ/2.

If you have four independent random variables, each Cauchy distributed with the same scale parameter σ, then their average is also Cauchy distributed, but with exactly the same scale parameter σ.

So the normal distribution matches common intuition, but the Cauchy distribution does not.
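A quick simulation can illustrate the contrast. The sketch below is not from the original post: it assumes standard (location 0, scale 1) distributions, uses the interquartile range as a stand-in for the scale parameter since the Cauchy has no standard deviation, and the helper names are made up for illustration.

```python
import math
import random
import statistics

def standard_cauchy(rng):
    # Inverse-CDF sampling: tan(pi * (U - 1/2)) is standard Cauchy for U ~ Uniform(0, 1).
    return math.tan(math.pi * (rng.random() - 0.5))

def standard_normal(rng):
    return rng.gauss(0.0, 1.0)

def iqr_of_averages(draw, n_avg, n_trials=100_000, seed=0):
    """Interquartile range of the average of n_avg independent draws."""
    rng = random.Random(seed)
    means = [sum(draw(rng) for _ in range(n_avg)) / n_avg for _ in range(n_trials)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

# Ratio of the spread of an average of four to the spread of a single draw.
normal_ratio = iqr_of_averages(standard_normal, 4) / iqr_of_averages(standard_normal, 1)
cauchy_ratio = iqr_of_averages(standard_cauchy, 4) / iqr_of_averages(standard_cauchy, 1)

print(f"normal: IQR(average of 4) / IQR(single draw) = {normal_ratio:.3f}")
print(f"Cauchy: IQR(average of 4) / IQR(single draw) = {cauchy_ratio:.3f}")
```

Up to sampling noise, the normal ratio should land near 1/2 and the Cauchy ratio near 1.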

In the case of random variables with a normal distribution, the scale parameter σ is also the standard deviation. In the case of random variables with a Cauchy distribution, the scale parameter σ is not the standard deviation because Cauchy random variables don’t have a variance, so they don’t have a standard deviation.

Modeling

Some people object that nothing really follows a Cauchy distribution because the Cauchy distribution has no mean or variance. But nothing really follows a normal distribution either. All probability distributions are idealizations. The question for any probability distribution is whether it adequately captures the aspect of reality it is being used to model.

Mean

Suppose some phenomenon appears to behave like it has a Cauchy distribution, with no mean. Alternatively, suppose the phenomenon has a mean, but sample averages are so variable that the mean is impossible to estimate. There’s no practical difference between the two.

Variance

And in the alternative case, suppose there is a finite variance, but the variance is so large that it is impossible to estimate. If you take the average of four observations, the result is still so variable that the variance is impossible to estimate. You’ve cut the theoretical variance by a factor of four, and the standard deviation in half, but that makes no difference. Again this is practically indistinguishable from a Cauchy distribution.

Truncating

Now suppose you want to tame the Cauchy distribution by throwing out samples with absolute value greater than M. Now you have a truncated Cauchy distribution, and it has finite mean and variance.

But how do you choose M? If you don’t have an objective reason to choose a particular value of M, you would hope that your choice doesn’t matter too much. And that would be the case for a thin-tailed probability distribution like the normal, but it’s not true of the Cauchy distribution.

The variance of the truncated distribution will be approximately proportional to M, so by choosing M you choose the variance. If you double your cutoff for outliers that are to be discarded, you approximately double the variance of what’s left. Your choice of M matters a great deal.
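To make the dependence on M concrete: for a standard Cauchy (location 0, scale 1) restricted to [−M, M], integrating x² against the renormalized density gives a variance of M/arctan(M) − 1, which is roughly 2M/π for large M. The sketch below checks that closed form against a simulation. The standard-Cauchy assumption and the function names are illustrative choices, not something from the original post.

```python
import math
import random

def truncated_cauchy_variance(M):
    """Exact variance of a standard Cauchy conditioned on |X| <= M."""
    return M / math.atan(M) - 1.0

def simulated_truncated_variance(M, n=100_000, seed=0):
    """Sample variance of n standard Cauchy draws after discarding |x| > M."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        x = math.tan(math.pi * (rng.random() - 0.5))  # inverse-CDF Cauchy draw
        if abs(x) <= M:
            kept.append(x)
    mean = sum(kept) / n
    return sum((x - mean) ** 2 for x in kept) / (n - 1)

for M in (10, 100, 200):
    print(f"M = {M:>3}: exact variance = {truncated_cauchy_variance(M):.2f}")

# Doubling the cutoff roughly doubles the variance.
doubling_ratio = truncated_cauchy_variance(200) / truncated_cauchy_variance(100)
print(f"variance(M=200) / variance(M=100) = {doubling_ratio:.3f}")

# Simulation check at a moderate cutoff.
sim = simulated_truncated_variance(10)
print(f"M = 10: simulated variance = {sim:.2f}")
```

For a normal distribution, by contrast, moving a cutoff from, say, 5σ to 10σ changes the variance of the retained data by a negligible amount.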

3 thoughts on “The Cauchy distribution’s counter-intuitive behavior”

  1. 1. Is the word “behave” missing in the last sentence of the 1st paragraph?
    2. Could you write up a follow-up post on some models using the Cauchy distribution?

  2. Thanks Denny. I corrected the sentence you pointed out.

    One way the Cauchy distribution comes up is the ratio of normally distributed values. But in many ways the Cauchy distribution serves as a canonical example of a heavy-tailed distribution. It’s not that you look at some data and say “Aha! This looks like it’s Cauchy distributed.”, though you might. More likely you’d say “This looks heavy-tailed, like it could have a Cauchy distribution or some other heavy-tailed distribution.” You might do something like “Let’s look at what we should expect if this thing has a Cauchy distribution. I’m not saying it’s Cauchy per se, but that’s a good example of a fat-tailed distribution that’s easy to work with analytically.”

  3. In spectroscopy, a common theoretical lineshape is the Lorentzian lineshape, which seems to be the same as a Cauchy distribution. Despite having fit data with this lineshape many, many times, the odd mathematical properties had never occurred to me. Thanks for pointing it out!
