Here’s a way to find a 95% confidence interval for any parameter θ.
- With probability 0.95, return the real line.
- With probability 0.05, return the empty set.
Clearly 95% of the time this procedure will return an interval that contains θ.
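The coverage claim can be checked with a quick simulation. The sketch below is only an illustration: the helper names (`trivial_interval`, `contains`, `coverage`), the use of `None` to stand in for the empty set, and the value 3.7 for θ are all assumptions made for the example, not part of the procedure itself.

```python
import random

def trivial_interval(rng):
    """Return the whole real line with probability 0.95, else the empty set."""
    if rng.random() < 0.95:
        return (float("-inf"), float("inf"))
    return None  # None stands in for the empty set (an assumption of this sketch)

def contains(interval, theta):
    """True if theta lies in the interval; the empty set contains nothing."""
    return interval is not None and interval[0] < theta < interval[1]

def coverage(theta, trials=100_000, seed=0):
    """Fraction of trials in which the returned interval contains theta."""
    rng = random.Random(seed)
    hits = sum(contains(trivial_interval(rng), theta) for _ in range(trials))
    return hits / trials

# The procedure never looks at theta or at any data, so any value works here.
print(coverage(theta=3.7))
```

The printed fraction should land close to 0.95 whatever value of θ is passed in, which is exactly why the procedure qualifies as a 95% confidence interval while saying nothing about θ.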
This example shows the difference between a confidence interval and a credible interval.
A 95% credible interval is an interval such that the probability is 95% that the parameter is in the interval. That’s what people think a 95% confidence interval is, but it’s not.
Suppose I give you a confidence interval using the procedure above. The probability that θ is in the interval is 1 if I return the real line and 0 if I return the empty set. In either case, the interval that I give you tells you absolutely nothing about θ.
But if I give you a 95% credible interval (a, b), then given the model and the data that went into it, the probability is 95% that θ is in the interval (a, b).
Confidence intervals are more useful in practice than in theory because they often approximately correspond to a credible interval under a reasonable model.
Credible intervals depend on your modeling assumptions. So do confidence intervals.
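As one illustration of a confidence interval lining up with a credible interval, consider a textbook case: normally distributed data with a known standard deviation and a flat prior on the mean. Under those assumptions the posterior for the mean is normal, centered at the sample mean with standard deviation σ/√n, so the two intervals coincide. The sketch below assumes that model; the function names and the sample data are made up for the example.

```python
from statistics import NormalDist, fmean

def z_confidence_interval(data, sigma, level=0.95):
    """Frequentist z-interval for a normal mean with known sigma."""
    xbar = fmean(data)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / len(data) ** 0.5
    return (xbar - half_width, xbar + half_width)

def flat_prior_credible_interval(data, sigma, level=0.95):
    """Equal-tailed credible interval for the mean under a flat prior.

    With a flat prior and known sigma, the posterior for the mean is
    Normal(sample mean, sigma / sqrt(n)).
    """
    posterior = NormalDist(mu=fmean(data), sigma=sigma / len(data) ** 0.5)
    tail = (1 - level) / 2
    return (posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail))

# Hypothetical measurements, assumed to have known sigma = 2.0.
data = [9.1, 10.4, 11.2, 8.7, 10.9, 9.8, 10.1, 11.5]
print(z_confidence_interval(data, sigma=2.0))
print(flat_prior_credible_interval(data, sigma=2.0))
```

The two printed intervals agree up to floating-point error. Change the prior or the likelihood and the agreement can become approximate or disappear, which is the sense in which both kinds of interval depend on modeling assumptions.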
“A 95% credible interval is an interval such that the probability is 95% that the parameter is in the interval. That’s what people think a 95% confidence interval is, but it’s not.”
That’s gotta be the best concise statement about intervals I’ve ever read. I’m stealing it (with attribution, of course).
One way of looking at the Bayesian vs Frequentist debate is that the two camps disagree on what should be called ‘random’. Bayesians think that anything unknown is random, whereas Frequentists think that random things are things generated by random processes.
So a Bayesian thinks that θ is random, but that the data x and the interval I generated from x aren’t random (even though x was ‘randomly generated’ from θ) because we know the data. The Frequentist, by contrast, thinks that θ is nonrandom while the data x and the interval I are random.
So it’s true for the Frequentist to say ‘the confidence interval has a 95% chance of containing θ’. They mean that P(θ∈I) = 0.95, where θ is fixed and I varies randomly with the data x generated from it.
Whereas to a Bayesian ‘P(θ∈I) = 0.95’ means that the random variable θ has a 95% chance of lying in the known interval I, a credible interval.
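The Frequentist reading can be made concrete with a small simulation in which θ is held fixed and only the data, and hence the interval, change from run to run. This is a rough sketch under assumed conditions: normal data with known σ, a standard z-interval, and made-up values for θ, the sample size, and the number of trials. The function names are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def z_interval(data, sigma, level=0.95):
    """z-interval for a normal mean with known sigma."""
    xbar = fmean(data)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / len(data) ** 0.5
    return (xbar - half_width, xbar + half_width)

def frequentist_coverage(theta, sigma=1.0, n=20, trials=20_000, seed=1):
    """Hold theta fixed, redraw the data each trial, count how often I covers theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        data = [rng.gauss(theta, sigma) for _ in range(n)]  # x is random
        low, high = z_interval(data, sigma)                 # so I is random
        hits += low <= theta <= high                        # theta never changes
    return hits / trials

print(frequentist_coverage(theta=2.5))
```

The fraction printed should be near 0.95. That number describes the long-run behavior of the procedure across many data sets; it is not a statement about any single interval once the data are in hand.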