Problem Setup
Consider an urn with $N$ balls, of which $K$ are red and $N - K$ are blue. Let $X$ be the number of balls we must draw without replacement in order to get $r$ red ones. What is the probability that this number equals $n$?
Our goal is to find $P(X = n)$.
PMF
We have an urn with $N$ balls, $K$ of them red.
$X = n$ means that on the $n$-th draw we took exactly the $r$-th red ball.
Number of ways to get $r$ red balls among $n$ draws
Among the first $n - 1$ draws we took exactly $r - 1$ red balls. The number of ways to place $r - 1$ red balls among $n - 1$ draws is $\binom{n-1}{r-1}$.
The $n$-th draw is our $r$-th red ball.
So it remains to count the number of ways to place the remaining $K - r$ red balls among the remaining $N - n$ balls in the urn, which is $\binom{N-n}{K-r}$. Why? [1]
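The counting argument above can be checked by brute force on a small urn. This is only a sketch: the names `N` (balls in the urn), `K` (red balls), `r` (red balls wanted) and `n` (the draw on which the `r`-th red ball appears) are labels chosen for illustration, and the helper `count_arrangements` is hypothetical. It lists every way to place the red balls in the drawing order and counts those whose `r`-th red ball sits at position `n`.

```python
from itertools import combinations
from math import comb


def count_arrangements(N, K, r, n):
    """Count placements of K red balls in N positions whose r-th red is at position n."""
    count = 0
    # combinations() yields each set of red positions as a sorted tuple
    for reds in combinations(range(1, N + 1), K):
        if reds[r - 1] == n:
            count += 1
    return count


N, K, r = 8, 3, 2  # small illustrative urn (assumed values)
for n in range(r, N - K + r + 1):
    brute = count_arrangements(N, K, r, n)
    formula = comb(n - 1, r - 1) * comb(N - n, K - r)
    print(n, brute, formula)  # the two counts should agree for every n
```

The enumeration runs over whole arrangements of the urn, not just the first `n` draws, which is the point the footnote makes.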
All possible outcomes
The number of possible outcomes is the number of ways to place $K$ red balls among all $N$ balls:
$$\binom{N}{K}$$
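Final probability
Combining everything, we get:
$$P(X = n) = \frac{\binom{n-1}{r-1}\binom{N-n}{K-r}}{\binom{N}{K}}, \qquad n = r, r+1, \dots, N - K + r.$$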
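A minimal Python sketch of this probability, assuming the illustrative names `N` (balls in the urn), `K` (red balls), `r` (red balls wanted) and `n` (the draw on which the `r`-th red ball appears), with `1 <= r <= K <= N`. The function `neg_hypergeom_pmf` is a hypothetical helper, not a library call. It multiplies the two counts from the steps above, divides by the number of all outcomes, and then checks that the probabilities sum to 1.

```python
from fractions import Fraction
from math import comb


def neg_hypergeom_pmf(n, N, K, r):
    """Probability that the r-th red ball appears exactly on draw n (exact fraction)."""
    # Outside this range the event is impossible
    if n < r or n > N - K + r:
        return Fraction(0)
    ways_before = comb(n - 1, r - 1)  # r-1 reds among the first n-1 draws
    ways_after = comb(N - n, K - r)   # remaining K-r reds among the remaining N-n balls
    return Fraction(ways_before * ways_after, comb(N, K))


N, K, r = 10, 4, 2  # assumed example values
pmf = {n: neg_hypergeom_pmf(n, N, K, r) for n in range(r, N - K + r + 1)}
for n, p in pmf.items():
    print(n, p)
print("total:", sum(pmf.values()))
```

Exact fractions are used instead of floats so that the sum-to-one check is not affected by rounding.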
Expectation
$$E[X] = r \cdot \frac{N+1}{K+1}$$
Intuition
Imagine you line up all $N$ draw positions in a row:
$$1, 2, \dots, N$$
Now you drop the $K$ successes randomly into this line. Think of it like splitting the line into $K + 1$ chunks:
before the 1st success,
between 1st and 2nd success,
…,
after the last success.
On average, each chunk has length about $\frac{N+1}{K+1}$. So the $r$-th success will typically sit at the end of the $r$-th chunk.
That’s all the formula says: “the $r$-th success is expected at about $r$ times the average spacing between successes.”
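The chunk picture can be tested by listing every way to drop the successes into the line and averaging where the `r`-th one lands. This is a sketch under assumed names: `N` is the number of positions, `K` the number of successes, and `mean_position_of_rth_red` is a hypothetical helper. For a small line, the average position is compared with `r` times the spacing `(N + 1) / (K + 1)`.

```python
from fractions import Fraction
from itertools import combinations


def mean_position_of_rth_red(N, K, r):
    """Average position of the r-th success over all equally likely placements."""
    placements = list(combinations(range(1, N + 1), K))  # sorted success positions
    total = sum(reds[r - 1] for reds in placements)
    return Fraction(total, len(placements))


N, K = 9, 3  # assumed example values
spacing = Fraction(N + 1, K + 1)  # average spacing between successes
for r in range(1, K + 1):
    print(r, mean_position_of_rth_red(N, K, r), r * spacing)
```

Each printed row shows the enumerated mean next to `r * spacing`, so any mismatch would be visible immediately.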
Variance
Just the formula for the variance, with no intuition here.
$$\operatorname{Var}(X) = \frac{r\,(N+1)\,(N-K)\,(K+1-r)}{(K+1)^2\,(K+2)}$$
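Since there is no intuition to lean on, a numerical check is a reasonable substitute. The sketch below computes the variance exactly from the PMF and compares it with the standard closed form for this distribution, which is written out in `variance_closed_form`. The names `N`, `K`, `r`, `n` and all three helpers are illustrative assumptions, with `1 <= r <= K <= N`.

```python
from fractions import Fraction
from math import comb


def pmf(n, N, K, r):
    """P(r-th red ball appears on draw n), valid for r <= n <= N - K + r."""
    return Fraction(comb(n - 1, r - 1) * comb(N - n, K - r), comb(N, K))


def variance_from_pmf(N, K, r):
    """Variance computed directly as E[X^2] - E[X]^2 over the support."""
    support = range(r, N - K + r + 1)
    mean = sum(n * pmf(n, N, K, r) for n in support)
    second_moment = sum(n * n * pmf(n, N, K, r) for n in support)
    return second_moment - mean ** 2


def variance_closed_form(N, K, r):
    """r (N+1) (N-K) (K+1-r) / ((K+1)^2 (K+2))."""
    return Fraction(r * (N + 1) * (N - K) * (K + 1 - r), (K + 1) ** 2 * (K + 2))


N, K, r = 10, 4, 2  # assumed example values
print(variance_from_pmf(N, K, r), variance_closed_form(N, K, r))
```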
[1] We have to count all the ways of placing all the balls, not only the ones we draw. As an analogy, imagine that we shuffle a deck of cards and take the first few cards. Even though we don’t care about the rest of the deck, we still have to count the ways the remaining cards can be arranged in order to get the true probability.