
Probability Distributions

A random variable is a mechanism that generates data. The probability distribution (also called the marginal distribution if univariate) of the random variable describes the probabilities by which the data are generated. From the probability distribution we can then infer the expectation (weighted mean) and/or the variance of the distribution (also described as the expectation and/or the variance of the random variable, respectively). Thus, understanding the most common random variables and their probability distributions can save us significant time in solving the most common probabilistic problems we encounter in our studies. In this lesson, we will discuss discrete random variables, continuous random variables, and then joint and conditional probability distributions.

Discrete Distributions

Discrete #1: The Bernoulli Distribution

The simplest discrete random variable is a Bernoulli random variable, which is used to model experiments that can only succeed or fail. Examples include flipping a coin (and hoping for heads), choosing a female from the population, observing a daily stock return of less than -3%, observing a white car on El Camino Real, and so on. The Bernoulli random variable takes on values 1 and 0 with probabilities $p$ and $1-p$, respectively. In general, in order to fully describe a discrete random variable, we need to list the outcomes and the corresponding probabilities of those outcomes, as we have just done with the Bernoulli. More formally, if X is a Bernoulli random variable, then we write the following:

$$P(X = x) = \begin{cases} p & \text{if } x = 1 \\ 1-p & \text{if } x = 0 \end{cases} \qquad \text{or} \qquad X = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1-p \end{cases}$$

X denotes the random variable and x denotes the arbitrary value that X can take on. Collectively, this description is called the probability distribution of the random variable X. A shorthand notation is $X \sim \text{Bernoulli}(p)$. The $\sim$ notation means "is distributed as" and is used generally for other types of random variables as well. What does the probability distribution look like? A histogram! Imagine the example of a coin flip: in this case the histogram will have two bins (heads and tails) that will each have a relative frequency of 50% (or, for a sample, closer and closer to that distribution as the sample size increases). This plot of probabilities against outcomes for discrete random variables is also called a probability mass function.
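To make this concrete, here is a minimal simulation sketch (assuming numpy is available; the variable names are illustrative) that draws repeated Bernoulli trials and compares the empirical frequencies to the pmf:

    import numpy as np

    # Simulate Bernoulli(p) draws and compare empirical frequencies to the
    # theoretical pmf P(X=1) = p and P(X=0) = 1 - p.
    rng = np.random.default_rng(0)
    p = 0.5                                   # the fair-coin example
    x = rng.binomial(n=1, p=p, size=100_000)  # a Bernoulli is a Binomial with n=1

    print("empirical P(X=1):", x.mean())      # approaches p as the sample grows
    print("empirical P(X=0):", 1 - x.mean())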

The Bernoulli distribution is for a single trial. The probabilities expressed by this distribution derive, of course, from many repetitions of this trial, but the end result is a description of the probability of doing a success/fail experiment one time: flipping a coin once, etc. If we are interested in the outcome of repeated Bernoulli experiments (for example, the probability of flipping three heads in five tosses), then we can turn to the Binomial distribution. Another example could be describing how many people out of 100 vote for Obama over Romney (or vice versa, to be politically correct) when the (known) probability of voting for Obama is 0.40. This latter probability of any individual vote would be described by a Bernoulli distribution, while the former probability of how many out of all 100 vote for Obama would be described by a Binomial distribution.

Discrete #2: The Binomial Distribution

More formally and generally, the Binomial distribution describes the probabilities of x successes out of n independent Bernoulli success/fail trials. If the random variable X describes the number of successes, then the probability distribution of X is

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

where $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ and is spoken "n choose k". The ! operator is the factorial operator, and represents the product $n! = n(n-1)(n-2)\cdots(2)(1)$. For example, $5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120$. The $\binom{n}{k}$ term takes into account the different orderings in which the successes and failures can happen; if we were interested in the probability of a specific ordering (e.g. five successes followed by five failures in a trial of ten repetitions), we would have a different formula (since there are more ways to permute the number of successes than there are to just combine them; the permutation formula is just $\frac{n!}{(n-k)!}$).

We say $X \sim \text{Binomial}(n, p)$, with $E(X) = np$ and $V(X) = np(1-p)$. Hopefully, these results make intuitive sense: we expect to get the same proportion of successes as the probability of success in one trial would be, and we expect the variance to depend on both the probability of success and the probability of failure, again scaled up to the number of trials we run.

Example 6.1 Say Dirk Nowitzki, the 2011 Most Valuable Player of the National Basketball Association, has a 90% chance of making any given free throw he takes. Say he takes ten free throws in a row. What is the probability that he makes exactly five of those shots, and misses the other five? What number of shots is he most likely to make?

Solution 6.1 $P(X = 5) = \frac{10!}{5!(10-5)!} 0.9^5 (1-0.9)^{10-5} = 252 \times 0.59049 \times 10^{-5} \approx 0.15\%$. It makes sense that this shouldn't be very large, since this would represent making 50% of his shots when he has a 90% chance of making any given shot. Intuitively, we can expect him to be most likely to make 9 shots out of the 10; to actually prove this, we'd have to find the probability of all 11 possible events (making 0 shots through making all 10). However, since the Binomial probability distribution only has one peak, we'll check our intuition just by making sure that the probability of making 10 shots and the probability of making 8 shots are both lower than the probability of making 9:


$P(X = 9) = \frac{10!}{9!(10-9)!} 0.9^9 (1-0.9)^{10-9} = 38.7\%$, while also $P(X = 10) = \frac{10!}{10!(10-10)!} 0.9^{10} (1-0.9)^{10-10} = 34.9\%$ and $P(X = 8) = \frac{10!}{8!(10-8)!} 0.9^8 (1-0.9)^{10-8} = 19.4\%$, so we can indeed say that he is most likely to make 9 of his 10 shots. It is not always the case that the most likely outcome is related to the underlying parameter p in the same manner it is in this case.
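As a quick check, here is a minimal sketch (names are illustrative) that computes the whole pmf for this example with Python's standard library:

    from math import comb

    # P(X = k) = C(n, k) p^k (1-p)^(n-k) for Dirk's ten free throws.
    n, p = 10, 0.9
    pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

    print(f"P(X=5) = {pmf[5]:.4%}")                 # ~0.15%
    print(f"P(X=9) = {pmf[9]:.1%}")                 # ~38.7%
    print("most likely k:", max(pmf, key=pmf.get))  # 9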

Discrete #3: The Poisson Distribution

The Binomial distribution describes probabilities for successes in a certain number of trials. In contrast, the Poisson distribution applies to successes over a time period or an area. For example, we could model the probability that a biotechnology firm files for a vaccination patent in some time interval with a Poisson distribution. Poisson distributions can be useful in regression analysis when the dependent variable (the modeled variable) is a count variable.

Aside from the probabilities and the distribution, we are interested in certain characteristics of the random variable. For example, what is the central tendency of the random variable? What about the spread and variation? These are the same questions that we asked in the previous lesson about descriptive statistics, only now we are going to compute them for a particular random variable. No data will be involved because these are population characteristics. Earlier we talked about the difference between population and sample characteristics. In the case of a random variable, the expectation $E(X)$ (a.k.a. mean or average) and variance $\text{Var}(X)$ or $V(X)$ are considered population or "true" characteristics, rather than estimates such as $\bar{x}$ and $s^2$. Additionally, when computing the expectation or variance, it is usually a good idea notationally to write what distribution the expectation or variance is taken with respect to, i.e. $E_X(X)$ rather than just $E(X)$. In these notes it should always be clear what distribution the variance or expectation is taken with respect to, and so we omit the subscript for now. However, it may be good to keep this in mind if you start dealing with complex expressions and many random variables.

Let's assume that our random variable X has the general discrete probability distribution

$$P(X = x) = \begin{cases} p_1 & \text{if } x = x_1 \\ p_2 & \text{if } x = x_2 \\ \vdots & \\ p_n & \text{if } x = x_n \end{cases} \qquad \text{where } \sum_{i=1}^{n} p_i = 1.$$

The expectation of X is then defined as $E(X) = \sum_{i=1}^{n} p_i x_i$, a probability-weighted average of the outcomes.

Example 6.2 What is the expectation of a Bernoulli random variable with p = 0.8?

Solution 6.2 In a Bernoulli distribution, the random variable can only take on the values 0 and 1. So we have $E(X) = 0.2(0) + 0.8(1) = 0.8$. We can see that, in general for a Bernoulli random variable, the expectation is p.

Example 6.3 Let X be the random variable describing how many students in the boot camp are sleeping (or, at least, their eyes are getting droopy) at any one time after lunch:

$$P(X = x) = \begin{cases} 0.8 & \text{if } x = 0 \\ 0.1 & \text{if } x = 1 \\ 0.1 & \text{if } x = 2 \end{cases}$$

What is E(X)?

Solution 6.3 $E(X) = 0(0.8) + 1(0.1) + 2(0.1) = 0.3$. Note that even though the random variable is discrete and can only take on the values 0, 1, and 2, the expected value is continuous and can take on any value in that range.

The expectation operator $E(\cdot)$ is a linear operator: $E(aX + bY) = aE(X) + bE(Y)$ for any constants a and b. Note, however, that the expectation of a product does not factor in general; $E(XY) = E(X)E(Y)$ would only be true if the two variables had a correlation of exactly zero.

We call $E(X)$ the first moment of the distribution (or of the random variable). More generally, $E(X^k) = \sum_{i=1}^{n} p_i x_i^k$ is called the kth moment. Note that we have taken the outcomes to the kth power inside the sum, but not the probabilities. Even more generally, for any continuous function $g(x)$, $E(g(X)) = \sum_{i=1}^{n} p_i g(x_i)$.

Just as we can define the expected value of a discrete random variable, we can define the variance as well. Remember that in the case of descriptive statistics, the variance was the average squared deviation from the mean. The current context is no different. Thus, the variance of a discrete random variable X is defined as $V(X) = E\left[(X - E(X))^2\right]$ (and NOT $\left\{E[X - E(X)]\right\}^2$). Expanding the square gives the equivalent and often handier form $V(X) = E(X^2) - [E(X)]^2$.

Example 6.4 What is the variance of our sleepy-time distribution from Example 6.3?

Solution 6.4 $V(X) = E(X^2) - [E(X)]^2 = \left[0^2(0.8) + 1^2(0.1) + 2^2(0.1)\right] - 0.3^2 = 0.5 - 0.09 = 0.41$. Remember, of course, that the more meaningful measure of spread is the standard deviation. That's not difficult to find, however, since it is just $SD(X) = \sqrt{V(X)} = \sqrt{0.41} = 0.64$.
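The same arithmetic in code, as a small sketch (numpy assumed, names illustrative):

    import numpy as np

    # Expectation, variance, and SD of the sleepy-time distribution
    # in Example 6.3, straight from the definitions.
    x = np.array([0, 1, 2])
    px = np.array([0.8, 0.1, 0.1])

    ex = np.sum(px * x)           # E(X)   = sum of p_i * x_i    -> 0.3
    ex2 = np.sum(px * x**2)       # E(X^2) = sum of p_i * x_i^2  -> 0.5
    var = ex2 - ex**2             # V(X)   = E(X^2) - [E(X)]^2   -> 0.41
    print(ex, var, np.sqrt(var))  # 0.3 0.41 ~0.64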

Just as we defined properties of the expectation, we can do so for the variance operator $V(\cdot)$:

$$V(aX \pm bY) = a^2 V(X) + b^2 V(Y) \pm 2ab\,\text{cov}(X, Y)$$

This last property can easily be generalized for sums of more than two random variables. Essentially what you get is that the variance of a sum of random variables is the sum of the variances, plus a bunch of covariance terms. Showing that any of these properties is true is not difficult; all you need is the definition of variance and a little patience with algebra.

By now you may be able to see that $E(X)$ tells us something about the mean of a distribution and $E(X^2)$ tells us something about the variance. As it turns out, for particular probability distributions, $E(X^3)$ tells us about the skewness of the distribution and $E(X^4)$ tells us about the kurtosis (how fat the tails of the distribution are).

If, instead of $E(X)$, you wish to use the median as your measure of central tendency for a discrete random variable, find the value that the random variable takes such that there is equal 0.5 probability of being greater than or less than that point. Depending on the number of values that the random variable can take, this may not be a unique number. Formally, the median is defined as the number M such that $P(X \le M) \ge 0.5$ and $P(X \ge M) \ge 0.5$. For example, M = 0 for our sleepy-time distribution from Example 6.3.


Continuous Distributions

The primary difference between the properties of a continuous probability distribution and those of a discrete distribution is that the continuous case requires a little calculus. Recall that the continuous analogue of a histogram is a probability density function, $f(x)$, where $f(x) \ge 0$ for all x (that is, every value of x has at least zero probability of occurring), and $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (the total probability is 1). These should seem intuitive because, with the histogram (or, more formally, the probability mass function (pmf)) in the discrete case, the y-axis was relative frequency of occurrence. In the continuous world, it is actually the area under the probability density function over an interval on the x-axis that corresponds to a probability. For example, $P(a < X < b) = \int_a^b f(x)\,dx$.

Note that this means that the probability of being at exactly one infinitesimal point, a, is zero: $P(X = a) = \int_a^a f(x)\,dx = 0$.

Example 6.5 Verify that $f(x) = \begin{cases} x & 0 \le x \le \sqrt{2} \\ 0 & \text{otherwise} \end{cases}$ is a probability density function.

Solution 6.5 We can see by inspection that this function is always above the x-axis. The more interesting question is the area under the function:

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_0^{\sqrt{2}} x\,dx = \frac{1}{2}x^2 \Big|_0^{\sqrt{2}} = 1 - 0 = 1$$

We could have also checked this geometrically since, by inspection, the density function is a triangle. As a result, notice that the distribution is skewed left: the density rises toward $\sqrt{2}$, so the long tail points toward 0 and the mean falls below the median.

An alternative to the probability density function (often abbreviated pdf) is the cumulative distribution function (often abbreviated cdf). The cdf, denoted by $F(x)$ instead of the pdf's $f(x)$, gives cumulative probabilities: the cdf at any point $x_0$ is the probability that the random variable X takes on a value of less than or equal to $x_0$, i.e. $F(x_0) = P(X \le x_0)$. Additionally, it is related to the pdf by $F'(x) = f(x)$. Though less common, cdfs can be quite useful in certain cases. For example, the median M of a random variable is easily found by solving $F(M) = 1 - F(M) = 0.5$.
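For instance, here is a small numerical sketch (scipy assumed; names illustrative) that checks the Example 6.5 density and finds its median from the cdf:

    from scipy.integrate import quad
    from scipy.optimize import brentq

    # f(x) = x on [0, sqrt(2)]: the total area should be 1, and the median
    # solves F(M) = 0.5, where F(x) = x^2 / 2 on this interval.
    b = 2 ** 0.5
    f = lambda x: x
    total, _ = quad(f, 0, b)                # -> 1.0
    F = lambda x: quad(f, 0, x)[0]          # cdf via numerical integration
    M = brentq(lambda x: F(x) - 0.5, 0, b)  # -> 1.0
    print(total, M)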

With this background on probability and cumulative distribution functions, we can now proceed to describe the most common continuous random variables, just as we did above with discrete ones.


The Uniform Distribution

The height of the uniform distribution is constant; therefore, the probability that a random variable X that is uniformly distributed takes on a value in some interval depends only on the length of the interval itself:

$$f(x) = \begin{cases} \frac{1}{b-a} & \text{if } a < x < b \\ 0 & \text{otherwise} \end{cases}$$

We can say $a < X < b$ or $a \le X \le b$ because $P(X = a) = P(X = b) = 0$, so $P(a < X < b) = P(a \le X \le b)$. While it may be simple, the uniform distribution is often applicable to real-world situations. For example, it seems reasonable to assume that a random person in the world is equally likely to be born at any particular time during the day in Pacific Standard Time, so the uniform distribution may be a good choice for modeling that random variable.

Example 6.6 Suppose $X \sim U[5, 10]$. What is $P(6 < X < 7)$?

Solution 6.6 $P(6 < X < 7) = \int_6^7 \frac{1}{10-5}\,dx = (7-6)\frac{1}{10-5} = 0.2$. Note that we could have also just calculated the area of a rectangle in this case. But integration is so much more fun!
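The same calculation with scipy's built-in uniform distribution, as a sketch (note that scipy parameterizes $U[a, b]$ by loc = a and scale = b - a):

    from scipy.stats import uniform

    # X ~ U[5, 10]: loc is the left endpoint, scale is the width.
    X = uniform(loc=5, scale=5)
    print(X.cdf(7) - X.cdf(6))   # P(6 < X < 7) = 0.2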

The Normal Distribution

The normal distribution is a famous continuous distribution that underlies much of statistical inference and which is thought to be a good model for many phenomena in the real world. Among other properties, it is bell-shaped and symmetric about its mean (which means that the mean is the median and is located at the center of the distribution). If X is a normal random variable with mean $\mu$ and variance $\sigma^2$, then we denote this by writing $X \sim N(\mu, \sigma^2)$. The pdf is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / 2\sigma^2}$$

Thus, finding probabilities with the normal distribution involves some nasty integration. In fact, there is no closed form expression for the cdf of the normal distribution. Thankfully, previous statisticians have gone ahead and performed various calculations for us and so we can now look up values of the normal cdf in a standard normal table rather than having to integrate. There are just two problems: the normal table only lists values for $N(0, 1)$ (called the standard normal distribution), so we need to get our distribution into that form to make use of this shortcut, and the normal table only lists cumulative probabilities $P(X < x) = \Phi(x)$.

The second problem is not so hard to solve, noting that $P(x_1 < X < x_2) = P(X < x_2) - P(X < x_1)$ and, by symmetry, $\Phi(-x) = 1 - \Phi(x)$, so we only need the table for $0 < x < \infty$ (though we will provide you with a table for $-\infty < x < \infty$). Solving the first problem is slightly more involved, and requires the use of a process known as standardization.

Standardization is the process by which we can convert our $X \sim N(\mu, \sigma^2)$ to a $Z \sim N(0, 1)$. For this, we subtract off the mean and divide by the standard deviation, to find the z-score of a point of interest x. Thus, the z-score of x is $\frac{x - \mu}{\sigma}$, and

$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$$

This works because the process of standardization preserves probabilities, allowing us to find a probability on a $N(5, 9)$ distribution by finding it on $N(0, 1)$ instead, and because $X \sim N(\mu, \sigma^2)$ implies $\frac{X - \mu}{\sigma} \sim N(0, 1)$.

Example 6.7 Suppose $X \sim N(5, 9)$. What is $P(X > 6)$?

Solution 6.7 Rather than integrate to find this probability, we will standardize and use the tables.

$$P(X > 6) = P\left(\frac{X - \mu}{\sigma} > \frac{6 - 5}{3}\right) = P(Z > 0.33) = 0.3707$$
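A sketch of the same lookup with scipy (which takes the standard deviation, not the variance):

    from scipy.stats import norm

    # X ~ N(5, 9), i.e. mu = 5 and sigma = 3.
    mu, sigma = 5, 3
    z = (6 - mu) / sigma                    # z-score of 6, = 1/3
    print(1 - norm.cdf(z))                  # P(Z > 1/3) ~ 0.369
    print(norm.sf(6, loc=mu, scale=sigma))  # same, without standardizing
    # The table's 0.3707 comes from rounding z to 0.33 before the lookup.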


Normal variables have several other interesting properties. For example, when you scale a normal variable (that is, you multiply it by a constant c), the result is a normal variable whose mean and variance are scaled accordingly: $X \sim N(\mu, \sigma^2) \Rightarrow cX \sim N(c\mu, c^2\sigma^2)$. Also, the sum of two (or more) independent normal variables is normally distributed, with its mean the sum of the means and its variance the sum of the variances (if the variables are not independent then the variance also involves covariance terms, but the mean of the sum is still just the sum of the individual means).

Example 6.8 Consider a series of random variables $\{X_i\}_{i=1}^{n}$. Assume they are independent and distributed according to $X_i \sim N(\mu, \sigma^2)$. What is the probability distribution of $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$?

Solution 6.8

$$\sum_{i=1}^{n} X_i \sim N\left(\sum_{i=1}^{n} \mu, \sum_{i=1}^{n} \sigma^2\right) = N(n\mu, n\sigma^2) \quad \Rightarrow \quad \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

This result, that the variance of a mean decreases with sample size, is a key result in statistical inference.
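A quick simulation sketch of this result (numpy assumed; the parameter values are arbitrary illustrations):

    import numpy as np

    # With X_i ~ N(mu, sigma^2) iid, the sample mean of n draws should be
    # N(mu, sigma^2 / n); we check the variance by simulation.
    rng = np.random.default_rng(1)
    mu, sigma, n = 5.0, 3.0, 25
    xbars = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)

    print(xbars.mean())   # ~ mu = 5
    print(xbars.var())    # ~ sigma^2 / n = 9 / 25 = 0.36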

The Student t Distribution

Interestingly, the inventor of the Student t distribution invented it while working at the Guinness Brewery in Dublin in 1908. He wasn't allowed to publish under his own name, so he published under the name "Student." The main parameter for the t-distribution is the degrees of freedom. This sole number parameterizes the distribution (that is, it determines the distribution's shape and location). Think of the Student t distribution as the close sibling of the standard normal distribution. Both are symmetric about the (zero) mean and both are used extensively in hypothesis testing. In fact, as the degrees of freedom of the t distribution go to infinity, the t distribution converges to the standard normal.

The F Distribution

The F distribution was invented by another anonymous brewmaster named F. Well, okay, that's not actually true, but it would make these notes a lot more interesting! The F distribution is NOT centered at zero like the t and the standard normal. Instead, the F distribution is defined for positive x only and is skewed to the right. In similar fashion to the t distribution, the F is parameterized by its degrees of freedom. The only difference is that the F has two sets of degrees of freedom, called the numerator degrees of freedom and the denominator degrees of freedom. This is because the ratio of two chi-squared distributions (defined next) is (roughly) F distributed.

The χ² (Chi-squared) Distribution

The chi-squared distribution is similar to the F in that it is defined for positive x only and is skewed to the right. It also is used frequently for particular types of statistical inference. The most interesting aspect of the chi-squared distribution is that it is defined as the sum of independent squared standard normal random variables, where the degrees of freedom are equal to the number of standard normal variables in the sum. That is, if $Z \sim N(0, 1)$ and we observe v draws from Z, then $Z_1^2 + Z_2^2 + \cdots + Z_v^2 \sim \chi_v^2$.
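This defining property is easy to see by simulation; a sketch (numpy and scipy assumed, with v chosen arbitrarily):

    import numpy as np
    from scipy.stats import chi2

    # Sums of v squared N(0,1) draws should follow a chi-squared with
    # v degrees of freedom; compare an empirical quantile to the theory.
    rng = np.random.default_rng(2)
    v = 5
    sums = (rng.standard_normal((200_000, v)) ** 2).sum(axis=1)

    print(np.quantile(sums, 0.95))  # empirical 95th percentile
    print(chi2.ppf(0.95, df=v))     # theoretical value, ~11.07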

Just like we did for discrete random variables, we want to calculate the expectation and variance of continuous random variables. So, we will. The expectation of a continuous random variable is $E(X) = \int_{\text{all } x} x f(x)\,dx$. Remember, integration is a kind of sum (of areas of really narrow rectangles), so this is a logical progression from the discrete definition.

Example 6.9 What is the expectation of the distribution from Example 6.5?

Solution 6.9 $E(X) = \int_0^{\sqrt{2}} x \cdot x\,dx = \frac{x^3}{3} \Big|_0^{\sqrt{2}} = \frac{2\sqrt{2}}{3} = 0.94$.

The variance of a continuous random variable is the exact same formula as the variance for a discrete random variable, $V(X) = E\left[(X - E(X))^2\right] = E(X^2) - [E(X)]^2$; the only difference is that the expectation operators make use of the continuous definition rather than the discrete one.

Example 6.10 What are the expectation and variance of the uniform distribution on $[a, b]$?

Solution 6.10

$$E(X) = \int_a^b \frac{x}{b-a}\,dx = \frac{x^2}{2(b-a)} \Big|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{(b+a)(b-a)}{2(b-a)} = \frac{b+a}{2}$$

$$E(X^2) = \int_a^b \frac{x^2}{b-a}\,dx = \frac{x^3}{3(b-a)} \Big|_a^b = \frac{b^3 - a^3}{3(b-a)} = \frac{(b^2 + ab + a^2)(b-a)}{3(b-a)} = \frac{b^2 + ab + a^2}{3}$$

$$V(X) = E(X^2) - [E(X)]^2 = \frac{b^2 + ab + a^2}{3} - \frac{(b+a)^2}{4} = \frac{4(b^2 + ab + a^2) - 3(b+a)^2}{12} = \frac{b^2 - 2ab + a^2}{12} = \frac{(b-a)^2}{12}$$


Example 6.11 What is the variance of the distribution from Example 6.5?

Solution 6.11

$$E(X^2) = \int_{\text{all } x} x^2 f(x)\,dx = \int_0^{\sqrt{2}} x^2 \cdot x\,dx = \frac{x^4}{4} \Big|_0^{\sqrt{2}} = 1$$

$$V(X) = E(X^2) - [E(X)]^2 = 1 - (0.94)^2 = 0.116$$
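These moment integrals are easy to verify numerically; a sketch with scipy (the exact variance is 1/9 ≈ 0.111; the 0.116 above reflects rounding E(X) to 0.94):

    from scipy.integrate import quad

    # Moments of f(x) = x on [0, sqrt(2)], from Examples 6.9 and 6.11.
    b = 2 ** 0.5
    ex, _ = quad(lambda x: x * x, 0, b)      # E(X)   = int x f(x) dx   ~ 0.943
    ex2, _ = quad(lambda x: x**2 * x, 0, b)  # E(X^2) = int x^2 f(x) dx = 1.0
    print(ex, ex2 - ex**2)                   # ~0.943 and V(X) ~ 0.111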

Finally, since the median is the middle number, that must mean that half of the probability in the distribution is on one side of the median and half of the probability in the distribution is on the other side. So, for a continuous probability distribution, the median is defined as the point M such that $\int_{-\infty}^{M} f(x)\,dx = 0.5$. Equivalently, we can solve $F(M) = 0.5$, which is often easier in the continuous case (assuming the cdf has a closed form expression).

Joint and Conditional Distributions

So far we have discussed marginal density functions, or probability density functions for one variable. In the previous lesson, we discussed correlation and how multiple variables may be related. We can capture the interdependence (and independence) of multiple random variables by looking at the multivariate probability distribution, specifically joint and conditional distributions. For our purposes, we will stick to the context of just two variables.

We will mostly focus on the continuous case here, noting only for the discrete case that, for two random variables X and Y, the joint probability distribution (a mass function, in the discrete case) is denoted by $P(X = x, Y = y)$. But we shall let our discussion of how to find such a probability be couched in the continuous case. If X is the foreign aid that a country receives from the U.S. and Y is the growth rate of that country's economy the month prior to aid being disseminated, we would expect the two variables to be related, since we would expect that aid would go to low-growth countries, so prior growth may decrease aid.

The probability that X is in some interval and Y is in some other interval can be calculated by knowing the joint density function (or bivariate density function) $f(x, y)$. Probabilities are found by integrating this function, as before: $P(x_1 < X < x_2, y_1 < Y < y_2) = \int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y)\,dx\,dy$. Since $f(x, y)$ is a density function, we still have $f(x, y) \ge 0$ and $\int \int f(x, y)\,dx\,dy = 1$. We can derive either marginal distribution from the joint distribution by integrating with respect to the variable we don't want: $f(x) = \int f(x, y)\,dy$ and $f(y) = \int f(x, y)\,dx$.

Example 6.12 Consider the joint density function $f(x, y) = xy$ for $0 \le x \le 1$ and $0 \le y \le 2$ (and 0 otherwise). Verify that both this and the two marginal density functions are all proper density functions.


Solution 6.12

$$\int_0^1 \int_0^2 xy\,dy\,dx = \int_0^1 x \frac{y^2}{2} \Big|_0^2\,dx = \int_0^1 2x\,dx = x^2 \Big|_0^1 = 1$$

The marginal density functions are $f(x) = \int_0^2 xy\,dy = 2x$ and $f(y) = \int_0^1 xy\,dx = \frac{1}{2}y$. So then

$$\int_0^1 f(x)\,dx = \int_0^1 2x\,dx = x^2 \Big|_0^1 = 1 \qquad \text{and} \qquad \int_0^2 \frac{1}{2}y\,dy = \frac{y^2}{4} \Big|_0^2 = 1$$
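A numerical double-integration sketch of the same check (scipy assumed):

    from scipy.integrate import dblquad

    # f(x, y) = xy on [0,1] x [0,2] should integrate to 1.
    # dblquad(func, a, b, g, h) integrates func(y, x) for x in [a, b]
    # and y in [g, h].
    total, _ = dblquad(lambda y, x: x * y, 0, 1, 0, 2)
    print(total)   # 1.0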

Recall the definition of conditional probability, $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$. This definition relates the marginal, joint, and conditional probabilities to one another. The same relationship exists in the context of probability distributions: $f(y \mid x) = \frac{f(y, x)}{f(x)}$, where $f(y \mid x)$ is the conditional density function. The discrete case is an even more direct application of the definition: $P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}$ and $P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)}$.

Example 6.13 Consider the following joint distribution of two random variables X and Y. What is the conditional distribution of X given Y?

             X = 0    X = 1
    Y = 2     0.1      0.4
    Y = 4     0.3      0.2

Solution 6.13 All we need to do is find the conditional probabilities for all possible combinations of values that X and Y can take. Thus,

$$P(X = 1 \mid Y = 2) = \frac{P(X = 1, Y = 2)}{P(Y = 2)} = \frac{0.4}{0.4 + 0.1} = 0.8$$

$$P(X = 1 \mid Y = 4) = \frac{P(X = 1, Y = 4)}{P(Y = 4)} = \frac{0.2}{0.2 + 0.3} = 0.4$$

$$P(X = 0 \mid Y = 2) = \frac{P(X = 0, Y = 2)}{P(Y = 2)} = \frac{0.1}{0.4 + 0.1} = 0.2$$

$$P(X = 0 \mid Y = 4) = \frac{P(X = 0, Y = 4)}{P(Y = 4)} = \frac{0.3}{0.2 + 0.3} = 0.6$$

If asked, we could find the conditional distribution of Y given X similarly.
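The same table manipulation in code, as a sketch (numpy assumed; rows index Y, columns index X):

    import numpy as np

    # Joint table from Example 6.13: rows are Y in {2, 4}, cols are X in {0, 1}.
    joint = np.array([[0.1, 0.4],
                      [0.3, 0.2]])

    p_y = joint.sum(axis=1, keepdims=True)  # marginals P(Y=2), P(Y=4) = 0.5, 0.5
    print(joint / p_y)                      # P(X=x | Y=y): [[0.2, 0.8],
                                            #                [0.6, 0.4]]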


Now that we have defined conditional probability distributions, we can formally define what it means for two random variables to be independent, and, therefore, more generally, how random variables may be related or unrelated. Recall that two events A and B are independent if and only if $P(A \cap B) = P(A)P(B)$, or, equivalently, iff $P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$. Well, two discrete random variables are independent iff $P(X = x, Y = y) = P(X = x)P(Y = y)$ or $P(X = x \mid Y = y) = P(X = x)$, and two continuous random variables are independent iff $f(x, y) = f(x)f(y)$ or $f(x \mid y) = f(x)$, $f(y \mid x) = f(y)$. Note that the variables in Example 6.12 were independent, since $f(x)f(y) = 2x(0.5)y = xy = f(x, y)$.

Just like for marginal distributions, we can also calculate the expectation of jointly or conditionally distributed random variables. Depending on the joint distribution, this may involve finding $E(X + Y) = E(X) + E(Y)$, $E(XY)$, or $E(X^2 Y)$. Furthermore, the covariance of two random variables is $\text{cov}(X, Y) = E\left[(X - E(X))(Y - E(Y))\right] = E(XY) - E(X)E(Y)$, so being able to find $E(XY)$ can be especially important. In other words, we can determine how two variables are correlated from the joint distribution of those variables.

Example 6.14 Find $E(XY)$ for the joint distribution from Example 6.12.

Solution 6.14

$$E(XY) = \int_0^1 \int_0^2 xy f(x, y)\,dy\,dx = \int_0^1 \int_0^2 x^2 y^2\,dy\,dx = \int_0^1 x^2 \frac{y^3}{3} \Big|_0^2\,dx = \int_0^1 \frac{8}{3} x^2\,dx = \frac{8}{9}$$

Calculating expectations for conditional distributions is pretty much the same as calculating them for marginal distributions. The only difference is that we need to use the conditional probabilities or density functions for the discrete and continuous cases, respectively. That is, $E(X \mid Y) = E(X \mid Y = y) = \sum_{\text{all } x} x P(X = x \mid Y = y)$ (and vice versa) for discrete random variables, while $E(X \mid Y) = E(X \mid Y = y) = \int_{\text{all } x} x f(x \mid y)\,dx$ for continuous random variables.

Example 6.15 Using the discrete probability distribution in Example 6.13, find $E(Y \mid X = 1)$.

Solution 6.15

$$E(Y \mid X = 1) = \sum_{\text{all } y} y P(Y = y \mid X = 1) = 4\left(\frac{0.2}{0.2 + 0.4}\right) + 2\left(\frac{0.4}{0.2 + 0.4}\right) = \frac{4}{3} + \frac{4}{3} = \frac{8}{3} = 2.67$$

Note that conditional expectations are functions of the variable that is given. Thus, for an arbitrary value of the random variable X (i.e. X = x) that is given, the conditional expectation is a function of that arbitrary value of X; in other words, the conditional expectation is itself a random variable! As a result, it is possible to take the expectation of it again. The Law of Iterated Expectations tells us that $E(E(X \mid Y)) = E(X)$; in fact, this holds for any function of X as well. This means we can find unconditional expectations from conditional expectations, if the latter is easier to find initially. One use of this law is to write random variables' covariance in terms of conditional expectations: $\text{cov}(X, Y) = E(E(XY \mid X)) - E(X)E(Y) = E(X \cdot E(Y \mid X)) - E(X)E(Y)$.
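To close, a small sketch (numpy assumed) that computes $E(Y \mid X)$ from the Example 6.13 table and checks the Law of Iterated Expectations:

    import numpy as np

    # Rows are Y in {2, 4}, columns are X in {0, 1}.
    ys = np.array([2.0, 4.0])
    joint = np.array([[0.1, 0.4],
                      [0.3, 0.2]])

    p_x = joint.sum(axis=0)                          # P(X=0) = 0.4, P(X=1) = 0.6
    e_y_given_x = (ys[:, None] * joint).sum(axis=0) / p_x
    print(e_y_given_x)                               # [3.5, 2.67] for x = 0 and 1
    print((e_y_given_x * p_x).sum())                 # E(E(Y|X)) = 3.0 = E(Y)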

