Hypergeometric distribution
Have you ever heard about the hypergeometric distribution? I hadn't until a few weeks ago. It is related to the binomial distribution in the sense that both distributions describe the probability of obtaining a certain number of successes in a given number of experiments. The difference is that the binomial distribution assumes the experiments are independent (the balls are drawn from the box with replacement), while the hypergeometric distribution assumes dependence (the balls are drawn without replacement).
Let us construct a simple model for the hypergeometric distribution, and run simulations!
Drawing the balls without replacement
Let us assume that the box contains
These questions can be easily answered by first figuring out the probability
mass function. Let us start by considering all possible ways to split
This gives us the total number of all possible draws. Now we need to determine
how many draws will have exactly
The first term in the product counts all possible ways to allocate
exactly
As every allocation (or arrangement of the balls) is equally probable, we have that:
Note that, if
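The counting argument above can be sketched in a few lines of Python. The notation is an assumption of this sketch and is not taken from the text: `N` is the number of balls in the box, `K` the number of balls counted as successes, `n` the number of balls drawn, and `k` the number of successes among the drawn balls. The function name `hypergeom_pmf` is likewise hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k successes when n balls are drawn without
    replacement from a box of N balls, K of which count as successes.

    The symbols N, K, n, k are this sketch's own notation (an assumption).
    """
    # Outcomes that cannot happen have zero probability.
    if k < 0 or k > n or k > K or n - k > N - K:
        return 0.0
    # Ways to pick k successes times ways to pick n - k failures,
    # divided by the total number of ways to pick n balls out of N.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Example: 10 balls, 4 of them successes, 3 drawn.
probabilities = [hypergeom_pmf(k, N=10, K=4, n=3) for k in range(4)]
print(probabilities)
```

Because every draw is equally probable, the probabilities over all feasible values of `k` should add up to one, which is a quick sanity check on the counting.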
Now figuring out the mean and the variance is simply an algebraic exercise:
The mean has exactly the same form as the mean of the binomial distribution.
The variance also has a similar form, but there is an additional term (the
last term) which accounts for drawing without replacement. Note that, due to this
term as
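As a numerical check of this comparison, the sketch below computes the mean and the variance directly from the probability mass function and compares them with the standard closed forms. The notation is again an assumption of the sketch (`N` balls in the box, `K` successes among them, `n` balls drawn), and the helper names `pmf` and `moments` are hypothetical.

```python
from math import comb


def pmf(k, N, K, n):
    # Hypergeometric probability mass function in this sketch's notation.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def moments(N, K, n):
    # Mean and variance computed directly from the probability mass function,
    # summing only over the feasible numbers of successes.
    support = range(max(0, n - (N - K)), min(n, K) + 1)
    mean = sum(k * pmf(k, N, K, n) for k in support)
    var = sum((k - mean) ** 2 * pmf(k, N, K, n) for k in support)
    return mean, var


N, K, n = 50, 20, 10
p = K / N  # fraction of successes in the box

mean, var = moments(N, K, n)
binomial_var = n * p * (1 - p)   # variance when drawing with replacement
correction = (N - n) / (N - 1)   # extra term due to drawing without replacement

print(mean, n * p)                     # the two means agree
print(var, binomial_var * correction)  # the two variances agree
```

In this notation the correction equals one when a single ball is drawn and zero when the whole box is drawn, so the variance without replacement never exceeds the binomial variance.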
Interactive app
The interactive app below plots an empirical histogram from
The red curve in the plot shows the empirical histogram of a draw without
replacement (follows the hypergeometric distribution), while the blue curve
shows the empirical histogram of a draw with replacement (follows the
binomial distribution). Note that for small
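For readers without access to the app, a comparison of this kind can be approximated offline. The following is a minimal sketch using only the Python standard library. It is not the app's implementation, and the function name `simulate`, the parameter values, and the notation (`N` balls in the box, `K` successes among them, `n` balls drawn per run) are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate(N, K, n, runs, replace, rng):
    """Empirical distribution of the number of successes in n draws."""
    box = [1] * K + [0] * (N - K)  # 1 marks a success, 0 a failure
    counts = Counter()
    for _ in range(runs):
        if replace:
            drawn = rng.choices(box, k=n)  # with replacement
        else:
            drawn = rng.sample(box, n)     # without replacement
        counts[sum(drawn)] += 1
    # Relative frequency of every possible number of successes.
    return {k: counts[k] / runs for k in range(n + 1)}


rng = random.Random(42)  # fixed seed so that the run is repeatable
N, K, n, runs = 20, 8, 10, 20000
without = simulate(N, K, n, runs, replace=False, rng=rng)
with_repl = simulate(N, K, n, runs, replace=True, rng=rng)

# Columns: successes, frequency without replacement, frequency with replacement.
for k in range(n + 1):
    print(f"{k:2d}  {without[k]:.3f}  {with_repl[k]:.3f}")
```

With these assumed values half of the box is drawn in every run, so the histogram without replacement should come out visibly narrower than the one with replacement, and it cannot contain more successes than there are in the box.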