
Hypergeometric Definition

In probability theory and statistics, the hypergeometric distribution is a discrete
probability distribution that describes the probability of x successes in n draws, without
replacement, from a finite population of size N that contains exactly k successes, wherein
each draw is either a success or a failure.

Hypergeometric Distribution
The probability distribution of a hypergeometric random variable is called
a hypergeometric distribution. This topic describes how hypergeometric random
variables, hypergeometric experiments, hypergeometric probability, and the
hypergeometric distribution are all related.

Notation
The following notation is helpful when we talk about hypergeometric distributions and
hypergeometric probability.

N: The number of items in the population.

k: The number of items in the population that are classified as successes.

n: The number of items in the sample.

x: The number of items in the sample that are classified as successes.

h(x; N, n, k): Hypergeometric probability, that is, the probability that an n-trial
hypergeometric experiment results in exactly x successes, when the population
consists of N items, k of which are classified as successes.

kCx: The number of combinations of k things, taken x at a time.

Hypergeometric Distribution
A hypergeometric random variable is the number of successes that result from a
hypergeometric experiment. The probability distribution of a hypergeometric random
variable is called a hypergeometric distribution.
Given x, N, n, and k, we can compute the hypergeometric probability based on the following
formula:

Hypergeometric Formula. Suppose a population consists of N items, k of which are
successes, and a random sample drawn from that population consists of n items, x of which
are successes. Then the hypergeometric probability is:
h(x; N, n, k) = [ kCx ] [ N-kCn-x ] / [ NCn ]
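
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name hypergeom_pmf is made up for this example, and it assumes Python 3.8 or later, where math.comb is available.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """Probability of exactly x successes in a sample of n items drawn
    without replacement from N items, k of which are successes.

    Hypothetical helper; argument order follows h(x; N, n, k).
    """
    # x must be feasible: no more successes than the sample or the
    # population holds, and no more failures than N - k allows.
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Sanity check: the probabilities over all possible x add up to 1.
# The values N = 20, n = 5, k = 8 are arbitrary illustration values.
total = sum(hypergeom_pmf(x, 20, 5, 8) for x in range(0, 6))
```

Returning 0.0 for infeasible x is a design choice for this sketch; raising an error would be equally reasonable.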

The hypergeometric distribution has the following properties:

The mean of the distribution is equal to n * k / N .

The variance is n * k * ( N - k ) * ( N - n ) / [ N^2 * ( N - 1 ) ] .
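
One way to check these two properties is to compute the mean and variance directly from the probabilities and compare them with the closed-form expressions. The sketch below does this for one arbitrary choice of N, n and k (a 52-item population with 26 successes and a sample of 5); the helper name pmf is an assumption made for this example.

```python
from math import comb

def pmf(x, N, n, k):
    # Hypergeometric probability h(x; N, n, k) for a feasible x.
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

N, n, k = 52, 5, 26            # illustration values only
xs = range(0, min(n, k) + 1)   # every feasible number of successes here

# Mean and variance computed from the definition of expectation.
mean_from_pmf = sum(x * pmf(x, N, n, k) for x in xs)
var_from_pmf = sum((x - mean_from_pmf) ** 2 * pmf(x, N, n, k) for x in xs)

# Closed-form expressions stated above.
mean_closed = n * k / N
var_closed = n * k * (N - k) * (N - n) / (N ** 2 * (N - 1))

print(mean_from_pmf, mean_closed)
print(var_from_pmf, var_closed)
```

Each printed pair should agree up to floating-point rounding.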

Example 1
Suppose we randomly select 5 cards without replacement from an ordinary deck of playing
cards. What is the probability of getting exactly 2 red cards (i.e., hearts or diamonds)?
Solution: This is a hypergeometric experiment in which we know the following:

N = 52; since there are 52 cards in a deck.

k = 26; since there are 26 red cards in a deck.

n = 5; since we randomly select 5 cards from the deck.

x = 2; since 2 of the cards we select are red.

We plug these values into the hypergeometric formula as follows:


h(x; N, n, k) = [ kCx ] [ N-kCn-x ] / [ NCn ]
h(2; 52, 5, 26) = [ 26C2 ] [ 26C3 ] / [ 52C5 ]
h(2; 52, 5, 26) = [ 325 ] [ 2600 ] / [ 2,598,960 ] = 0.32513
Thus, the probability of randomly selecting exactly 2 red cards is 0.32513.
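
As a rough cross-check of Example 1, the card draw can also be simulated. The sketch below is only illustrative: the function name, the number of trials and the seed are arbitrary choices made for this example, and a simulation gives an estimate that fluctuates around the exact value rather than reproducing it.

```python
import random

def simulate_two_red(trials=100_000, seed=0):
    """Estimate P(exactly 2 red cards in 5 draws) by repeated sampling.

    Hypothetical helper; trials and seed are illustration values.
    """
    rng = random.Random(seed)
    deck = ["red"] * 26 + ["black"] * 26   # 52 cards, 26 of them red
    hits = 0
    for _ in range(trials):
        hand = rng.sample(deck, 5)         # 5 cards without replacement
        if hand.count("red") == 2:
            hits += 1
    return hits / trials

estimate = simulate_two_red()
print(estimate)   # should land close to 0.32513
```

Because random.sample draws without replacement, each simulated hand is one run of the hypergeometric experiment described in the example.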

Application and example:


The classical application of the hypergeometric distribution is sampling without replacement. Think of
an urn with two types of marbles, red ones and green ones. Define drawing a green marble as a success
and drawing a red marble as a failure (analogous to the binomial distribution). If the variable N describes
the number of all marbles in the urn (see contingency table below) and K describes the number
of green marbles, then N - K corresponds to the number of red marbles. In this example, X is
the random variable whose outcome is k, the number of green marbles actually drawn in the experiment.
(Note that this example writes K and k for the quantities written k and x in the notation above.)
This situation is illustrated by the following contingency table:

                 drawn      not drawn          total
green marbles    k          K - k              K
red marbles      n - k      N + k - n - K      N - K
total            n          N - n              N

Now, assume (for example) that there are 5 green and 45 red marbles in the urn. Standing next to the
urn, you close your eyes and draw 10 marbles without replacement. What is the probability that exactly 4
of the 10 are green? Note that although we are looking at success/failure, the data are not accurately
modeled by the binomial distribution, because the probability of success on each trial is not the same, as
the size of the remaining population changes as we remove each marble.
This problem is summarized by the following contingency table:

                 drawn        not drawn               total
green marbles    k = 4        K - k = 1               K = 5
red marbles      n - k = 6    N + k - n - K = 39      N - K = 45
total            n = 10       N - n = 40              N = 50

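The probability of drawing exactly k green marbles can be calculated by the formula

P(X = k) = [ KCk ] [ N-KCn-k ] / [ NCn ]

Hence, in this example calculate

P(X = 4) = [ 5C4 ] [ 45C6 ] / [ 50C10 ] = [ 5 ] [ 8,145,060 ] / [ 10,272,278,170 ] = 0.003965

Intuitively we would expect it to be even more unlikely for all 5 marbles to be green.

P(X = 5) = [ 5C5 ] [ 45C5 ] / [ 50C10 ] = [ 1 ] [ 1,221,759 ] / [ 10,272,278,170 ] = 0.000119

As expected, drawing 5 green marbles is roughly 33 times less likely than drawing 4.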
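
The marble example can be reproduced with exact arithmetic, which also makes the contrast with the binomial model concrete. This is a sketch under stated assumptions: the helper name p_green is invented for this example, and the binomial figure assumes a fixed success probability of K / N on every draw, which is exactly the assumption that fails when sampling without replacement.

```python
from fractions import Fraction
from math import comb

N, K, n = 50, 5, 10   # marbles in the urn, green marbles, marbles drawn

def p_green(k):
    # Exact hypergeometric probability of drawing k green marbles.
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

p4 = p_green(4)
p5 = p_green(5)
ratio = p4 / p5       # how many times likelier 4 green is than 5 green

# Binomial model with the same n and a constant p = K / N, for contrast.
p = Fraction(K, N)
binom4 = comb(n, 4) * p ** 4 * (1 - p) ** (n - 4)

print(float(p4), float(p5), float(ratio))
print(float(binom4))
```

The exact ratio works out to 100/3, about 33, and the binomial value for 4 successes (about 0.011) is noticeably larger than the hypergeometric one (about 0.004), which illustrates why the binomial distribution is a poor fit here.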
