Coupon Collector's Problem
Problem statement: There are 5 possible coupons. Everyday, you are given a random coupon. How many days, on average, would you need to buy to collect all 5 coupons?
Solving the problem
Let be the number of days to collect all 5 coupons.
Then, where is the number of days to collect the first coupon, is the number of days to collect the second coupon, etc. Less formally:
First coupon is trivial. is always one day because it takes one day to get our first unique coupon (can be any).
Second coupon is less obvious. How do we find the number of days for the second coupon? Note, that this coupon can be any coupon as long as it is not the card we got as our first coupon.
So the chance that we get a coupon that isn't the first coupon is since there are 4 possible uncollected coupons out of a choice of 5.
However, there is that chance we keep getting the coupon we've already collected, again and again....then we'd never get all 5 coupons?
The days for the second coupon () follows a geometric distribution. This takes into account the probability of things happening. It is random, but random in the sense it follows these probabilities that our problem establishes. This would consider the small likelihood that we keep getting the coupon we've already collected.
Formally, . The days for our second coupon are distributed among probabilities.
What does geometric distribution mean in our example?
The chance that it takes one day to collect our next coupon is ; two days is ; three days is ; four days is and so on...
This makes sense because you have to get the coupon you've already collected for the first few days before getting a different coupon.
Geometric distributions have the special property of . For our example . This means that on average, it'll take days for us to collect the second coupon. You can look at the proof for this expected value for geometric distributions in the references. More formally:
Third coupon is exactly the same reasoning as the second coupon. It is also a geometric distribution. Except, there is a chance that it is a new coupon and a chance that it is a coupon we've already collected. Hence,
Exactly like the second coupon above.
We can extrapolate:
To generalize for coupons:
References
- Fifty Challenging Problems in Probability with Solutions by Frederick Mostreller
- CS70 Notes (https://www.eecs70.org/assets/pdf/notes/n19.pdf)