Destination ImagiNation is a non-profit organization that encourages student creativity. This is my family’s first year participating in DI, and it has been a lot of fun. One of the things that impresses me most about DI is that they have strict rules limiting adult input.
This weekend I was an appraiser at a DI competition for an improvisation challenge. Teams could prepare for the overall format of the challenge, but some elements of the challenge were randomly selected on the day of the competition. This year the improvisations centered around endangered things. Teams were given a list of 10 endangered things ahead of time, but they wouldn’t know which thing would be theirs until just before they had to perform. Some of the things on the list were endangered animals, such as the giant panda. There were also other things in danger of disappearing, such as the VHS tape. The students also had to use a randomly chosen stock character and had to include a character with a randomly chosen “unimpressive superpower.”
There were 13 teams in the elementary division. What would you expect from 13 teams randomly selecting 10 endangered things? Obviously some endangered thing has to be chosen at least twice. Would you expect every item on the list to be chosen at least once? How often do you expect the most common item would be chosen?
In our case, three teams were assigned “glaciers” and five were assigned “the landline telephone.” The other items were assigned once or not at all. (No one was assigned “the Yiddish language.” Too bad. I really wanted to see what the students would do with that one.)
Is there reason to suspect that the assignments were not random? How likely is it that, in a competition of 13 teams, five or more teams would be given the same subject? How likely is it that every subject would be used at least once? See an explanation here. Make a guess before looking at my answer.
Here’s some Python code you could use to simulate the selection of endangered things.
from random import random

num_reps = 100000   # number of simulation repetitions
num_subjects = 10   # number of endangered things
num_teams = 13      # number of teams competing

def maxperday():
    # tally how many teams draw each subject and return the largest count
    tally = [0] * num_subjects
    for i in range(num_teams):
        subject = int(random() * num_subjects)
        tally[subject] += 1
    return max(tally)

total = 0
for rep in range(num_reps):
    if maxperday() >= 5:
        total += 1

# fraction of repetitions in which some subject was drawn 5 or more times
print(float(total) / num_reps)
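The same kind of simulation can also address the second question, whether every subject gets used at least once. The following is only a sketch under the same assumption of uniform, independent draws; the names `simulate_once` and `estimate` and the fixed seed are my own choices for illustration.

```python
import random

NUM_SUBJECTS = 10   # endangered things on the list
NUM_TEAMS = 13      # teams in the elementary division

def simulate_once(rng):
    """Give each team a uniformly random subject; return the per-subject tally."""
    tally = [0] * NUM_SUBJECTS
    for _ in range(NUM_TEAMS):
        tally[rng.randrange(NUM_SUBJECTS)] += 1
    return tally

def estimate(num_reps=100000, seed=0):
    """Estimate P(some subject drawn 5+ times) and P(every subject drawn at least once)."""
    rng = random.Random(seed)   # seeded only so that runs are repeatable
    five_or_more = 0
    all_used = 0
    for _ in range(num_reps):
        tally = simulate_once(rng)
        if max(tally) >= 5:
            five_or_more += 1
        if min(tally) >= 1:
            all_used += 1
    return five_or_more / num_reps, all_used / num_reps

p_five, p_all = estimate()
print("some subject drawn 5 or more times:", p_five)
print("every subject drawn at least once:", p_all)
```

Both estimates come from the same simulated competitions, so a single run answers both questions.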
Spoiler: Don’t read my comment before you guess. (Unless you want to…)
xxx
xxx
Interesting questions. And pretty hard to do formally. I wonder if I can set up a simulation in Excel.
I guessed 10% on the 5 or more, and forgot to guess on every item being picked. It seems like that second question ought to be doable without simulation, but I’m not seeing a way yet.
Why can’t we model the first question as a binomial r.v.? If you have 13 experiments, each with a 0.1 (10 percent) chance of success, you expect to see 5 or more successes with probability 0.006460156, which matches your simulation.
rrs: If you looked at one particular item, you could say it has 1/10 chance of being selected each time, and you could use a binomial distribution to compute the probability of it being picked 5 or more times. That’s a good idea.
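For one particular item, that binomial tail is easy to compute directly. A minimal sketch, assuming 13 independent draws with probability 1/10 each; the helper name `binomial_tail` is made up for this example.

```python
from math import comb

def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

# chance that one specific item (say, the giant panda) is drawn 5 or more times
p_one_item = binomial_tail(13, 0.1, 5)
print(p_one_item)   # about 0.00646
```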
But we want to consider the probability that any item is selected five or more times. Will that be 10 times the probability of one particular item being selected five or more times? No, because the possibilities overlap. If the panda were selected 6 times and the VHS tape were selected 7 times, that possibility would be counted twice: once for the panda being selected 5 or more times and once for the VHS tape being selected 5 or more times. On the other hand, multiplying by 10 might give a reasonable approximation, though I don’t know because I haven’t calculated it.
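The size of that overlap can be checked with inclusion-exclusion. With only 13 teams, at most two items can each be drawn 5 or more times, so subtracting the pairwise overlap once gives the exact answer. This is a sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

N_TEAMS, N_SUBJECTS, THRESHOLD = 13, 10, 5
p = Fraction(1, N_SUBJECTS)

# P(one particular item is drawn 5 or more times): binomial tail
p_one = sum(comb(N_TEAMS, k) * p**k * (1 - p)**(N_TEAMS - k)
            for k in range(THRESHOLD, N_TEAMS + 1))

# P(two particular items are BOTH drawn 5 or more times): trinomial sum
p_two = Fraction(0)
for a in range(THRESHOLD, N_TEAMS + 1):
    for b in range(THRESHOLD, N_TEAMS - a + 1):
        rest = N_TEAMS - a - b
        ways = factorial(N_TEAMS) // (factorial(a) * factorial(b) * factorial(rest))
        p_two += ways * p**a * p**b * (1 - 2 * p)**rest

upper_bound = N_SUBJECTS * p_one                        # the "multiply by 10" estimate
p_any = upper_bound - comb(N_SUBJECTS, 2) * p_two       # subtract each pair's overlap once

print(float(upper_bound))   # about 0.0646
print(float(p_any))         # about 0.0644
```

So multiplying by 10 overshoots only slightly here, because two items both reaching five draws is rare.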
It’s not too hard to calculate the explicit probabilities; just consider the partitions of 13 into 10 non-negative numbers. We’re looking for the partitions where at least one number is >= 5 in the first case, and the partitions where all numbers are >= 1 in the second case. Also, remember to multiply by the appropriate multinomial factor when counting.
The exact probabilities work out to 322067741/5000000000 and 891891/62500000 respectively, so your estimates are pretty good.
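One way to carry out that partition count is sketched below. It is my own reading of the approach described above, not necessarily how the commenter did it, and the helper names `partitions` and `assignments` are invented for the example. Each partition is weighted by the number of ways to hand its counts to the 10 subjects times the multinomial number of ways to split the 13 teams.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

NUM_TEAMS, NUM_SUBJECTS = 13, 10

def partitions(total, max_parts, largest=None):
    """Yield partitions of total into at most max_parts positive parts, largest first."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, max_parts - 1, first):
            yield (first,) + rest

def assignments(parts):
    """Count team-to-subject assignments whose sorted tally is parts, padded with zeros."""
    counts = list(parts) + [0] * (NUM_SUBJECTS - len(parts))
    # ways to decide which subject receives which count
    arrangements = factorial(NUM_SUBJECTS)
    for multiplicity in Counter(counts).values():
        arrangements //= factorial(multiplicity)
    # ways to split the teams into groups of those sizes (multinomial coefficient)
    splits = factorial(NUM_TEAMS)
    for c in counts:
        splits //= factorial(c)
    return arrangements * splits

total_outcomes = NUM_SUBJECTS ** NUM_TEAMS
five_or_more = sum(assignments(q) for q in partitions(NUM_TEAMS, NUM_SUBJECTS)
                   if q[0] >= 5)
all_used = sum(assignments(q) for q in partitions(NUM_TEAMS, NUM_SUBJECTS)
               if len(q) == NUM_SUBJECTS)

print(Fraction(five_or_more, total_outcomes))   # 322067741/5000000000
print(Fraction(all_used, total_outcomes))       # 891891/62500000
```

As a sanity check, summing `assignments` over every partition should give 10**13, the total number of equally likely outcomes.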