
Mathematical Biostatistics Bootcamp: Lecture 7, Some Common Distributions

Brian Caffo
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
Johns Hopkins University

August 21, 2012

Table of contents

1 Table of contents
2 The Bernoulli distribution
3 Binomial trials
4 The normal distribution
   Properties
   ML estimate of $\mu$
The Bernoulli distribution

The Bernoulli distribution arises as the result of a binary outcome

Bernoulli random variables take (only) the values 1 and 0 with probabilities of (say) $p$ and $1 - p$ respectively

The PMF for a Bernoulli random variable $X$ is

$$P(X = x) = p^x (1 - p)^{1 - x}$$

The mean of a Bernoulli random variable is $p$ and the variance is $p(1 - p)$

If we let $X$ be a Bernoulli random variable, it is typical to call $X = 1$ a "success" and $X = 0$ a "failure"
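
The mean and variance quoted for the Bernoulli distribution can be checked directly from the PMF. The following is a minimal sketch, not part of the lecture; the function names and the value p = 0.3 are assumptions chosen for illustration.

```python
def bernoulli_pmf(x, p):
    """P(X = x) = p^x (1 - p)^(1 - x) for x in {0, 1}; zero elsewhere."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p) ** (1 - x)


def bernoulli_mean_var(p):
    """Mean and variance computed by summing over the two support points."""
    mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
    var = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))
    return mean, var


p = 0.3  # illustrative value, not from the lecture
mean, var = bernoulli_mean_var(p)
print(mean, var)  # should agree with p and p * (1 - p) up to rounding
```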

iid Bernoulli trials

If several iid Bernoulli observations, say $x_1, \ldots, x_n$, are observed, the likelihood is

$$\prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}$$

Notice that the likelihood depends only on the sum of the $x_i$

Because $n$ is fixed and assumed known, this implies that the sample proportion $\sum_i x_i / n$ contains all of the relevant information about $p$

We can maximize the Bernoulli likelihood over $p$ to obtain that $\hat{p} = \sum_i x_i / n$ is the maximum likelihood estimator for $p$
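
As a rough numerical illustration of the maximization, the likelihood can be evaluated on a grid of values of p and compared with the sample proportion. This is a sketch under assumptions: the data vector is hypothetical, and a grid search stands in for the calculus argument.

```python
def bernoulli_likelihood(p, xs):
    """Likelihood p^(sum x_i) * (1 - p)^(n - sum x_i) for iid Bernoulli data."""
    s = sum(xs)
    n = len(xs)
    return p**s * (1 - p) ** (n - s)


xs = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical data: 7 successes in 10 trials

# Evaluate the likelihood on an evenly spaced grid over [0, 1].
grid = [k / 1000 for k in range(1001)]
p_hat_grid = max(grid, key=lambda p: bernoulli_likelihood(p, xs))

# Closed-form maximizer: the sample proportion.
p_hat = sum(xs) / len(xs)
print(p_hat_grid, p_hat)
```

The grid maximizer can only match the sample proportion to within the grid spacing, which is why the comparison is made with a tolerance rather than exact equality.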

[Figure: likelihood plotted against p; both axes run from 0.0 to 1.0]

Binomial trials


The binomial random variables are obtained as the sum of iid Bernoulli trials

Specifically, let $X_1, \ldots, X_n$ be iid Bernoulli($p$); then $X = \sum_{i=1}^n X_i$ is a binomial random variable

The binomial mass function is

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$$

for $x = 0, \ldots, n$
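
The mass function is easy to evaluate with the standard library. The sketch below (function name and the values n = 10, p = 0.3 are illustrative assumptions) checks that the probabilities sum to one and that the mean equals n times p, as expected for a sum of n Bernoulli(p) variables.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) p^x (1 - p)^(n - x) for x = 0, ..., n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.3  # illustrative values, not from the lecture
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                               # should be 1 up to rounding
mean = sum(x * q for x, q in enumerate(pmf))   # should be n * p up to rounding
print(total, mean)
```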


Recall that the notation

$$\binom{n}{x} = \frac{n!}{x!(n - x)!}$$

(read "$n$ choose $x$") counts the number of ways of selecting $x$ items out of $n$ without replacement disregarding the order of the items

$$\binom{n}{0} = \binom{n}{n} = 1$$
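
To make the counting interpretation concrete, the factorial formula can be compared with a brute-force enumeration of unordered selections. This is an illustrative sketch; the helper name `choose` and the values n = 5, x = 2 are assumptions.

```python
from itertools import combinations
from math import factorial


def choose(n, x):
    """n! / (x! (n - x)!), computed with exact integer arithmetic."""
    return factorial(n) // (factorial(x) * factorial(n - x))


n, x = 5, 2  # illustrative values
# Every way of picking x of the n items, ignoring order and without replacement.
subsets = list(combinations(range(n), x))
print(len(subsets), choose(n, x))
```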

Example justification of the binomial likelihood


Consider the probability of getting 6 heads out of 10 coin flips from a coin with success probability $p$

The probability of getting 6 heads and 4 tails in any specific order is

$$p^6 (1 - p)^4$$

There are

$$\binom{10}{6}$$

possible orders of 6 heads and 4 tails
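
Multiplying the number of orders by the probability of each order gives the binomial probability for this example. The evaluation at p = .5 below assumes a fair coin, which the slide does not specify.

```latex
\[
P(X = 6) = \binom{10}{6} p^6 (1 - p)^4 = 210\, p^6 (1 - p)^4
\]
% Assuming a fair coin (p = .5), an illustrative choice:
\[
P(X = 6) = 210 \times .5^{10} = \frac{210}{1024} \approx 0.205
\]
```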

Example


Suppose a friend has 8 children, 7 of whom are girls and none are twins

If each gender has an independent 50% probability for each birth, what's the probability of getting 7 or more girls out of 8 births?

$$\binom{8}{7} .5^7 (1 - .5)^1 + \binom{8}{8} .5^8 (1 - .5)^0 \approx 0.04$$

This calculation is an example of a P-value: the probability under a null hypothesis of getting a result as extreme or more extreme than the one actually obtained
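
The same tail probability can be reproduced in a few lines. This is a minimal sketch with an assumed helper name; it sums the binomial mass function over the outcomes at least as extreme as the one observed.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


# 7 or more girls out of 8 births, each with probability .5
p_value = binomial_upper_tail(7, 8, 0.5)
print(p_value)
```

The exact value is 9/256, roughly 0.035, which the slide rounds to 0.04.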

[Figure: likelihood plotted against p; both axes run from 0.0 to 1.0]


The normal distribution


A random variable is said to follow a normal or Gaussian distribution with mean $\mu$ and variance $\sigma^2$ if the associated density is

$$(2\pi\sigma^2)^{-1/2} e^{-(x - \mu)^2 / 2\sigma^2}$$

If $X$ is a RV with this density then $E[X] = \mu$ and $\mathrm{Var}(X) = \sigma^2$

We write $X \sim N(\mu, \sigma^2)$

When $\mu = 0$ and $\sigma = 1$ the resulting distribution is called the standard normal distribution

The standard normal density function is labeled $\phi$

Standard normal RVs are often labeled $Z$
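
The density can be coded directly from its formula. The sketch below uses assumed function names and a crude Riemann sum (step size and integration range chosen for illustration) to check that the standard normal density integrates to about one.

```python
from math import exp, pi


def normal_pdf(x, mu=0.0, sigma=1.0):
    """(2 pi sigma^2)^(-1/2) * exp(-(x - mu)^2 / (2 sigma^2))."""
    return (2 * pi * sigma**2) ** -0.5 * exp(-((x - mu) ** 2) / (2 * sigma**2))


def phi(z):
    """Standard normal density: mean 0, standard deviation 1."""
    return normal_pdf(z)


# Riemann sum over [-8, 8]; the tails beyond this range are negligible.
step = 0.001
area = sum(phi(-8 + k * step) * step for k in range(16001))
print(phi(0), area)
```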

[Figure: standard normal density plotted against z; vertical axis (density) runs from 0.0 to 0.4]

Facts about the normal density


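If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X - \mu}{\sigma}$ is standard normal

If $Z$ is standard normal then

$$X = \mu + \sigma Z \sim N(\mu, \sigma^2)$$

The non-standard normal density is

$$\phi\{(x - \mu)/\sigma\}/\sigma$$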
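
These relationships can be checked numerically with `statistics.NormalDist` from the Python standard library. The values of the mean, standard deviation and evaluation point are hypothetical, picked only to illustrate the identities.

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.0  # hypothetical parameters
x = 13.0               # hypothetical evaluation point

std = NormalDist()     # standard normal: mean 0, standard deviation 1

# Standardize x, then map it back.
z = (x - mu) / sigma
x_back = mu + sigma * z

# Non-standard density written in terms of the standard density.
density_via_phi = std.pdf(z) / sigma
density_direct = NormalDist(mu, sigma).pdf(x)
print(z, density_via_phi, density_direct)
```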

More facts about the normal density


Approximately 68%, 95% and 99.7% of the normal density lies within 1, 2 and 3 standard deviations from the mean, respectively

$-1.28$, $-1.645$, $-1.96$ and $-2.33$ are the 10th, 5th, 2.5th and 1st percentiles of the standard normal distribution respectively

By symmetry, $1.28$, $1.645$, $1.96$ and $2.33$ are the 90th, 95th, 97.5th and 99th percentiles of the standard normal distribution respectively
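
These reference values can be reproduced rather than memorized. A short sketch using `statistics.NormalDist` from the standard library (the dictionary names are illustrative):

```python
from statistics import NormalDist

std = NormalDist()  # standard normal

# Probability mass within k standard deviations of the mean.
within = {k: std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)}

# Lower-tail percentiles; the upper-tail ones follow by symmetry.
quantiles = {q: std.inv_cdf(q) for q in (0.10, 0.05, 0.025, 0.01)}

print(within)     # roughly 0.683, 0.954 and 0.997
print(quantiles)  # roughly -1.28, -1.645, -1.96 and -2.33
```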


Question
What is the 95th percentile of a $N(\mu, \sigma^2)$ distribution?

We want the point $x_0$ so that $P(X \leq x_0) = .95$

$$P(X \leq x_0) = P\left(\frac{X - \mu}{\sigma} \leq \frac{x_0 - \mu}{\sigma}\right) = P\left(Z \leq \frac{x_0 - \mu}{\sigma}\right) = .95$$

Therefore

$$\frac{x_0 - \mu}{\sigma} = 1.645$$

or $x_0 = \mu + \sigma 1.645$

In general $x_0 = \mu + \sigma z_0$ where $z_0$ is the appropriate standard normal quantile
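
A numerical version of this argument, with hypothetical values for the mean and standard deviation: compute the standard normal quantile, rescale it, and compare with the quantile taken directly from the non-standard distribution.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # hypothetical parameters, not from the lecture

z0 = NormalDist().inv_cdf(0.95)                  # standard normal 95th percentile
x0 = mu + sigma * z0                             # rescaled to N(mu, sigma^2)
x0_direct = NormalDist(mu, sigma).inv_cdf(0.95)  # same quantile computed directly
print(z0, x0, x0_direct)
```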


Question
What is the probability that a $N(\mu, \sigma^2)$ RV is more than 2 standard deviations above the mean?

We want to know

$$P(X > \mu + 2\sigma) = P\left(\frac{X - \mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right) = P(Z \geq 2) \approx 2.5\%$$
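
The 2.5% figure is a rule-of-thumb value based on the 95% rule; a direct computation gives a slightly smaller number. In this sketch the non-standard parameters are hypothetical and serve only to show that the answer does not depend on them.

```python
from statistics import NormalDist

# Upper tail of the standard normal beyond 2.
upper_tail = 1 - NormalDist().cdf(2)

# Same probability for a hypothetical non-standard normal.
mu, sigma = 50.0, 10.0
upper_tail_general = 1 - NormalDist(mu, sigma).cdf(mu + 2 * sigma)

print(upper_tail, upper_tail_general)  # both close to 0.0228
```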


Other properties
The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)

A constant times a normally distributed random variable is also normally distributed (what is the mean and variance?)

Sums of jointly normally distributed random variables are again normally distributed even if the variables are dependent (what is the mean and variance?)

Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)

The square of a standard normal random variable follows what is called the chi-squared distribution

The exponential of a normally distributed random variable follows what is called the log-normal distribution

As we will see later, many random variables, properly normalized, limit to a normal distribution
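
The parenthetical questions above have standard answers, sketched here for reference. The constant $a$ and the indexed parameters are notation introduced for illustration. The sum is stated for independent variables; with dependence the variance also picks up covariance terms.

```latex
% a is an arbitrary constant (notation assumed for illustration)
\[
X \sim N(\mu, \sigma^2) \;\Rightarrow\; aX \sim N(a\mu,\; a^2\sigma^2)
\]
% Independent X_i ~ N(mu_i, sigma_i^2)
\[
\sum_{i=1}^n X_i \sim N\Big(\sum_{i=1}^n \mu_i,\; \sum_{i=1}^n \sigma_i^2\Big)
\]
% iid X_i ~ N(mu, sigma^2)
\[
\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i \sim N\big(\mu,\; \sigma^2 / n\big)
\]
```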


Question
If $X_i$ are iid $N(\mu, \sigma^2)$ with a known variance, what is the likelihood for $\mu$?

$$
\begin{aligned}
L(\mu) &= \prod_{i=1}^n (2\pi\sigma^2)^{-1/2} \exp\left\{-(x_i - \mu)^2 / 2\sigma^2\right\} \\
&\propto \exp\left\{-\sum_{i=1}^n (x_i - \mu)^2 / 2\sigma^2\right\} \\
&= \exp\left\{-\sum_{i=1}^n x_i^2 / 2\sigma^2 + \mu \sum_{i=1}^n x_i / \sigma^2 - n\mu^2 / 2\sigma^2\right\} \\
&\propto \exp\left\{\mu n \bar{x} / \sigma^2 - n\mu^2 / 2\sigma^2\right\}
\end{aligned}
$$

Later we will discuss methods for handling the unknown variance
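
The proportionality claim can be checked numerically: the ratio of the full likelihood to the reduced expression should not change with the mean parameter. This is a sketch with hypothetical data and a hypothetical known standard deviation; the function names are assumptions.

```python
from math import exp, pi


def full_likelihood(mu, xs, sigma):
    """Product of N(mu, sigma^2) densities over the observations."""
    out = 1.0
    for x in xs:
        out *= (2 * pi * sigma**2) ** -0.5 * exp(-((x - mu) ** 2) / (2 * sigma**2))
    return out


def kernel(mu, xs, sigma):
    """exp{mu * n * xbar / sigma^2 - n * mu^2 / (2 sigma^2)}: the part involving mu."""
    n = len(xs)
    xbar = sum(xs) / n
    return exp(mu * n * xbar / sigma**2 - n * mu**2 / (2 * sigma**2))


xs = [4.1, 5.3, 4.8, 5.9, 5.0]  # hypothetical observations
sigma = 1.0                     # hypothetical known standard deviation

# If the two expressions are proportional, these ratios are all equal.
ratios = [full_likelihood(m, xs, sigma) / kernel(m, xs, sigma) for m in (4.0, 5.0, 6.0)]
print(ratios)
```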


Question
If $X_i$ are iid $N(\mu, \sigma^2)$, with known variance, what's the ML estimate of $\mu$?

We calculated the likelihood for $\mu$ on the previous page; the log likelihood is

$$\mu n \bar{x} / \sigma^2 - n\mu^2 / 2\sigma^2$$

Setting the derivative with respect to $\mu$ to zero gives

$$n \bar{x} / \sigma^2 - n\mu / \sigma^2 = 0$$

This yields that $\bar{x}$ is the ML estimate of $\mu$

Since this doesn't depend on $\sigma$ it is also the ML estimate with $\sigma$ unknown
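
One step left implicit here is that the stationary point is a maximum rather than a minimum. A quick check, using the log likelihood up to an additive constant:

```latex
\[
\frac{d^2}{d\mu^2}\left( \mu n \bar{x} / \sigma^2 - n\mu^2 / 2\sigma^2 \right) = -\,n / \sigma^2 < 0
\]
```

The second derivative is negative for every value of the mean parameter, so the log likelihood is concave and the sample mean is its unique maximizer.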


Final thoughts on normal likelihoods


The maximum likelihood estimate for $\sigma^2$ is

$$\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n}$$

which is the biased version of the sample variance

The ML estimate of $\sigma$ is simply the square root of this estimate

To do likelihood inference, the bivariate likelihood of $(\mu, \sigma)$ is difficult to visualize

Later, we will discuss methods for constructing likelihoods for one parameter at a time
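
The difference between the ML estimate of the variance and the usual sample variance is only the divisor. The sketch below uses hypothetical data and compares a hand computation with `statistics.pvariance` (divides by n) and `statistics.variance` (divides by n - 1) from the standard library.

```python
from math import sqrt
from statistics import fmean, pvariance, variance

xs = [4.1, 5.3, 4.8, 5.9, 5.0]  # hypothetical observations
n = len(xs)
xbar = fmean(xs)

# ML estimate of the variance: divide the sum of squares by n.
sigma2_mle = sum((x - xbar) ** 2 for x in xs) / n
sigma_mle = sqrt(sigma2_mle)  # ML estimate of the standard deviation

# Unbiased sample variance: divide by n - 1 instead.
s2 = variance(xs)
print(sigma2_mle, pvariance(xs), s2)
```

The ML estimate equals the unbiased sample variance multiplied by (n - 1)/n, so it is always slightly smaller.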
