A discussion over lunch today brought up the fact that additional data does not always decrease the size of a confidence interval. This post will look at this from a Bayesian perspective.
In general, new information reduces your uncertainty regarding whatever you’re estimating. The posterior distribution becomes more concentrated as more data are collected.
That’s what happens “in general” but does it necessarily happen every time you get new data? Conceivably if you get surprising data, data that is very unlikely given your current prior, posterior uncertainty might increase.
Binomial-beta model
To show that this is the case, suppose a binary trial has probability of success θ and that θ has a beta prior. You could imagine this prior to be the posterior after having made some number of previous observations. Can a new observation increase the posterior variance of θ? If so, under what conditions?
The variance of a beta(a, b) random variable is
ab / ((a + b)²(a + b + 1)).
After observing a successful trial, the posterior distribution on θ is beta(a + 1, b). We can calculate the ratio of the posterior variance to the prior variance and ask under what circumstances, if any, the ratio is greater than 1.
If 2a ≥ b the posterior variance will be strictly less than the prior variance. This says that if the prior mean odds against a success are no more than 2 : 1, observing a success will reduce the variance. (By symmetry, observing a failure will reduce the variance if 2b ≥ a.) But for any value of b, you can find a small enough value of a that observing a success will increase the variance.
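The calculation behind the 2a ≥ b condition is short. Here is a sketch, writing V(a, b) for the variance of a beta(a, b) random variable; the name V is introduced here only for illustration.

```latex
\[
\frac{V(a+1,\, b)}{V(a,\, b)}
  = \frac{(a+1)\,b}{(a+b+1)^2 (a+b+2)} \cdot \frac{(a+b)^2 (a+b+1)}{a\,b}
  = \frac{(a+1)(a+b)^2}{a\,(a+b+1)(a+b+2)}.
\]
% The ratio is below 1 exactly when the numerator is below the denominator:
\[
(a+1)(a+b)^2 < a\,(a+b+1)(a+b+2)
\quad\Longleftrightarrow\quad
(a+b)(b - 2a) < 2a.
\]
% Example with a = 1, b = 4 (so 2a < b):
\[
\frac{V(2,\, 4)}{V(1,\, 4)} = \frac{2 \cdot 25}{1 \cdot 6 \cdot 7} = \frac{25}{21} > 1.
\]
```

When 2a ≥ b the left side of the last inequality is at most zero while the right side is positive, so the ratio is less than 1. With b fixed and a shrinking toward zero, the left side approaches b² while the right side approaches 0, so the inequality eventually fails and the variance increases. The condition 2a ≥ b is sufficient but not necessary: a = 1, b = 2.5 gives 3.5 × 0.5 = 1.75 < 2, so the variance still decreases.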
Normal-normal model
Whether an observation can increase the posterior variance depends on the data model. If your data have a normal likelihood function with known variance and a normal prior on the mean θ, the posterior variance is always less than the prior variance, and it decreases by the same amount regardless of the observation x. If x is very unlikely a priori then it will pull the posterior mean toward itself more than an observation that is more concordant with the prior would have, but the change in the posterior variance is the same.
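To make this concrete, here is the standard conjugate update for a single observation. The symbols μ₀ and τ₀² (prior mean and variance) and σ² (known data variance) are notation assumed here for illustration; they do not appear elsewhere in the post.

```latex
% Model: \theta \sim N(\mu_0, \tau_0^2), \quad x \mid \theta \sim N(\theta, \sigma^2)
\[
\frac{1}{\tau_1^2} = \frac{1}{\tau_0^2} + \frac{1}{\sigma^2}
\quad\Longrightarrow\quad
\tau_1^2 = \frac{\tau_0^2\,\sigma^2}{\tau_0^2 + \sigma^2} < \tau_0^2,
\]
\[
\mu_1 = \frac{\sigma^2 \mu_0 + \tau_0^2\, x}{\sigma^2 + \tau_0^2},
\qquad
\mu_1 - \mu_0 = \frac{\tau_0^2}{\tau_0^2 + \sigma^2}\,(x - \mu_0).
\]
```

The posterior variance τ₁² does not involve x at all, while the shift in the mean is proportional to x − μ₀. A surprising observation moves the mean further but leaves the variance reduction unchanged.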
Proof of beta theorem
Here is a proof in Lean 4 of the statement above that if 2a ≥ b the posterior variance will be strictly less than the prior variance.
import Mathlib

set_option linter.style.header false

noncomputable def f (a b : ℝ) : ℝ := a * b / ((a + b) ^ 2 * (a + b + 1))

theorem f_ratio_lt_one' (a b : ℝ) (ha : 0 < a) (hb : 0 < b) (hab : b ≤ 2 * a) :
    f (a + 1) b / f a b < 1 := by
  have hs : 0 < a + b := by linarith
  have h2ab : 0 ≤ 2 * a - b := by linarith
  have hprod : 0 ≤ (a + b) * (2 * a - b) := mul_nonneg hs.le h2ab
  -- key polynomial inequality (∗)
  have key : (a + 1) * (a + b) ^ 2 < a * ((a + b + 1) * (a + b + 2)) := by
    nlinarith [hprod, ha]
  -- nonzero facts needed to clear denominators
  have ha' : a ≠ 0 := ne_of_gt ha
  have hb' : b ≠ 0 := ne_of_gt hb
  have hs' : a + b ≠ 0 := ne_of_gt hs
  have hs1' : a + b + 1 ≠ 0 := by positivity
  have hs2' : a + b + 2 ≠ 0 := by positivity
  have ha1' : a + 1 ≠ 0 := by positivity
  -- express the ratio as a single closed-form fraction
  have hratio : f (a + 1) b / f a b
      = ((a + 1) * (a + b) ^ 2) / (a * ((a + b + 1) * (a + b + 2))) := by
    unfold f
    have e : a + 1 + b = a + b + 1 := by ring
    rw [e]
    field_simp
    ring
  rw [hratio, div_lt_one (by positivity)]
  exact key
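
The theorem covers the case 2a ≥ b. As a complement, here is a small sketch of a concrete case outside that range, a = 1 and b = 4, where observing a success increases the variance. It is written as a separate standalone file, the name betaVar is hypothetical and simply repeats the definition of f above, and it has not been checked against a particular Mathlib version.

```lean
import Mathlib

-- Hypothetical name; same formula as `f` in the proof above.
noncomputable def betaVar (a b : ℝ) : ℝ := a * b / ((a + b) ^ 2 * (a + b + 1))

-- With a = 1, b = 4 we have 2a < b, and a success raises the variance:
-- betaVar 1 4 = 4/150 and betaVar 2 4 = 8/252.
example : betaVar 1 4 < betaVar (1 + 1) 4 := by
  unfold betaVar
  norm_num
```

Both sides reduce to explicit rational numbers, so the intent is that norm_num settles the inequality by arithmetic alone.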







