I was catching up on Engines of our Ingenuity episodes this evening when the following line jumped out at me:
“If I flip a coin a million times, I’m virtually certain to get 50 percent heads and 50 percent tails.”
Depending on how you understand that line, it’s either imprecise or false. The more times you flip the coin, the more likely you are to get nearly half heads and half tails, but the less likely you are to get exactly half of each. I assume Dr. Lienhard knows this and that by “50 percent” he meant “nearly half.”
Let’s make the fuzzy statements above more quantitative. Suppose we flip a coin 2n times for some large number n. Then a calculation using Stirling’s approximation shows that the probability of n heads and n tails is approximately
1/√(πn)
which goes to zero as n goes to infinity. If you flip a coin a million times, there’s less than one chance in a thousand that you’d get exactly half heads.
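As a quick sanity check, here is a small Python sketch that compares the exact probability of n heads in 2n tosses, C(2n, n)/4^n, with the estimate 1/√(πn). The function names are my own, and the exact value is computed through log-gamma so that the huge binomial coefficient never has to be formed directly.

```python
import math

def prob_exactly_half(n):
    """Exact probability of n heads in 2n fair tosses: C(2n, n) / 4**n."""
    # Work in logs: log C(2n, n) = lgamma(2n + 1) - 2 * lgamma(n + 1).
    log_p = math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1) - 2 * n * math.log(2)
    return math.exp(log_p)

def stirling_estimate(n):
    """Approximation 1 / sqrt(pi * n) obtained from Stirling's formula."""
    return 1 / math.sqrt(math.pi * n)

# n = 500_000 corresponds to a million tosses.
for n in (10, 1_000, 500_000):
    print(n, prob_exactly_half(n), stirling_estimate(n))
```

For a million tosses both values come out a bit under 0.0008, consistent with the "less than one chance in a thousand" figure.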
Next, let’s quantify the statement that nearly half the tosses are likely to be heads. The normal approximation to the binomial tells us that for large n, the number of heads out of 2n tosses is approximately normal with the same mean and variance as the binomial, i.e. mean n and variance n/2. The proportion of heads is the number of heads divided by 2n, so it is approximately normal with mean 1/2 and variance 1/(8n). This means the standard deviation is 1/√(8n). So, for example, about 95% of the time the proportion of heads will be 1/2 plus or minus 2/√(8n). As n goes to infinity, the width of this interval goes to 0. Alternatively, we could pick some fixed interval around 1/2 and show that the probability of the proportion of heads being outside that interval goes to 0.
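To see what this looks like for a million tosses, the sketch below computes the interval 1/2 ± 2/√(8n) and then estimates by simulation how often the proportion of heads lands inside it. The function names, the seed, and the number of trials are arbitrary choices of mine, and the simulated coverage will wobble around 95% from run to run.

```python
import math
import random

def two_sigma_interval(n):
    """Approximate 95% range for the proportion of heads in 2n fair tosses."""
    half_width = 2 / math.sqrt(8 * n)
    return 0.5 - half_width, 0.5 + half_width

def simulated_coverage(n, trials, seed=0):
    """Fraction of simulated runs of 2n tosses whose proportion of heads falls in the interval."""
    rng = random.Random(seed)
    low, high = two_sigma_interval(n)
    hits = 0
    for _ in range(trials):
        # Each of the 2n random bits is one fair toss; count the 1 bits as heads.
        heads = bin(rng.getrandbits(2 * n)).count("1")
        if low <= heads / (2 * n) <= high:
            hits += 1
    return hits / trials

# n = 500_000 corresponds to a million tosses.
print(two_sigma_interval(500_000))
print(simulated_coverage(500_000, trials=200))
```

With n = 500,000 the standard deviation is 1/2000, so the interval is 0.5 ± 0.001: out of a million tosses, the number of heads is usually within about a thousand of 500,000.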
“As n goes to zero, the width of this interval goes to 0.”
The first “zero” should be “infinity,” no?
Hey John –
How fortuitous that I was led to this page from Twitter when I’ve been thinking about this very thing over the last few weeks. Here goes:
The strong law of large numbers (SLLN) states that, in an infinite Bernoulli process (aka the infinite coin toss model), the set of outcomes where S_N/N converges to 1/2 (note exact equality in the limit) has probability measure 1. Or alternatively, the measure of the set of outcomes where the limiting proportion of successes is not equal to the expectation is 0. If I look at 2N trials like you do in the description above, the probability of exact equality for this condition (aka the truth set for SLLN) is proportional to 1/sqrt(N). As N grows, the complement of that finite-N set is gaining measure, roughly 1 – 1/sqrt(πN). This makes SLLN very counterintuitive to folks like me who have a finite brain and cannot easily take the limit to infinity.