
ST3905 - Applied Probability and Statistics

ST5005 - Introduction to Probability and Statistics

ST6030 - Foundations of Statistical Data Analytics

Eric Wolsztynski

eric.w@ucc.ie

Department of Statistics

School of Mathematical Sciences

University College Cork, Ireland

2015-2016

Version 1.0

ST1051-ST3905-ST5005-ST6030

Acknowledgment

These lecture notes make use of former material written by Dr Kingshuk Roy Choudhury and Dr Supratik Roy for previous course syllabi. That material largely followed [Dekking et al 2005].

The notes were revised in 2014-15 and updated again for 2015-16; updates are based mainly on [Rice 1995].

Eric Wolsztynski

eric.w@ucc.ie

IPS 2

ST1051-ST3905-ST5005-ST6030

Course information

References

[1] J. A. Rice, Mathematical Statistics and Data Analysis, 2nd Edition, ITP Duxbury Press, 1995

[2] J. L. Devore, Probability and Statistics for Engineering and the Sciences, 3rd Edition, Brooks-Cole, 1991

[3] F. M. Dekking, C. Kraaikamp, H. P. Lopuhaä and L. E. Meester, A Modern Introduction to Probability and Statistics, Springer, 2005

[4] B. W. Lindgren, Statistical Theory, 4th Edition, Chapman & Hall, 1993

[5] D. A. Berry and B. W. Lindgren, Statistics: Theory and Methods, 2nd Edition, 1995

[6] MIT OpenCourseWare, 18.05 Introduction to Probability and Statistics, Spring 2014:
http://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-statistics-spring-2014/

[7] J. D. Gibbons and S. Chakraborti, Nonparametric Statistical Inference, 4th Edition, Dekker, 2014

[8] B. S. Everitt and T. Hothorn, A Handbook of Statistical Analyses Using R, 2nd Edition, Chapman & Hall, 2010

[10] R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.

IPS 3

ST1051-ST3905-ST5005-ST6030

Course information

Timetable

Lectures: Mondays 3-4pm in BHSC G01

Fridays 3-4pm in WGB G05

Fridays 4-5pm in WGB G05

Practicals: ST1051

Monday 4-5pm in lab WGB G34 (TBC)

Tuesday 3-4pm in lab WGB G33 (TBC)

IPS 4

ST1051-ST3905-ST5005-ST6030

Course information

Assessment

ST1051/ST3905:

+ 90-minute exam (80 marks)

ST5005/ST6030:

+ 90-minute exam (50 marks)

IPS 5

ST1051-ST3905-ST5005-ST6030

Course information

Module objective

and Statistics, and explore basic probability and statistical notions

underlying hypothesis-driven data analytic methods.

IPS 6

ST1051-ST3905-ST5005-ST6030

Outline

1 Motivation

2 Elements of Probability Theory

3 Discrete Random Variables

4 Continuous Random Variables

5 Limit theorems

6 Statistical Inference

7 Estimation

8 Hypothesis Testing

IPS 7

ST1051-ST3905-ST5005-ST6030

Motivation

Section I

Motivation

IPS 8

ST1051-ST3905-ST5005-ST6030

Motivation

General concepts

Probability? Statistics?

framework for representing real-life phenomena

observations: data-driven approach

IPS 9

ST1051-ST3905-ST5005-ST6030

Motivation

General concepts

statistics.

IPS 10

ST1051-ST3905-ST5005-ST6030

Motivation

Examples

Typical examples

Business, financial mathematics and actuarial science:

decision making, investment strategies

Engineering:

tracking mobile terminals in wireless networks

clinical trials

genomics

IPS 11

ST1051-ST3905-ST5005-ST6030

Motivation

Examples

[Dekking et al 2005]

A space shuttle exploded about one minute after it had taken off from the launch pad at Kennedy Space Center in Florida

The cause was a failure of O-rings (rubber seals that link rocket boosters)

The launch had gone ahead against the engineers' recommendation not to launch

IPS 12

ST1051-ST3905-ST5005-ST6030

Motivation

Examples

We can look at the data on the number of failed O-rings available from previous launches of the shuttle program

Each rocket has three O-rings, and two rocket boosters are used per launch

Besides the number of failed O-rings, we also record the corresponding launch temperature

IPS 13

ST1051-ST3905-ST5005-ST6030

Motivation

Examples

There are 23 dots: one time the boosters could not be recovered

from the ocean; temperatures are rounded to the nearest degree

Fahrenheit; in case of two or more equal data points these are

shifted slightly

IPS 14

ST1051-ST3905-ST5005-ST6030

Motivation

Examples

Modelling...

The probability p(t) that an individual O-ring fails should depend

on the launch temperature t. Use the data to calibrate this model

(a Binomial distribution) and estimate the expected number of

failures, 6p(t).

IPS 15

ST1051-ST3905-ST5005-ST6030

Motivation

Examples

Aftermath...

At the actual launch temperature, and since the failure of a single O-ring is enough for a complete failure of the joint, the estimated probability of failure of an individual O-ring is 0.023...

The probability of at least one complete O-ring failure is then 1 − (1 − 0.023)^6 ≈ 0.13

IPS 16

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Section II

IPS 17

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Outline

Introduction

Computing probabilities

IPS 18

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Introduction

Probability

mathematically.

complexity.

IPS 19

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Introduction

The probability of an outcome or event expresses its likelihood

Ex: a coin toss has two possible outcomes, heads and tails. Sample space: Ω = {H, T}

IPS 20

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Introduction

A commuter drives through a sequence of 3 intersections with traffic lights. Each time, she either stops (s) or continues (c).

The sample space is the set of all possible outcomes:

Ω = {sss, ssc, scs, scc, css, csc, ccs, ccc}

Ask a person in which month her birthday falls. Sample space:

Ω = {Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec}

Counting the earthquakes in Nice (France) that are greater than a given magnitude may also be considered an experiment. What is the sample space for this experiment?

IPS 21

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Introduction

Common scenario: same experiment performed several times

sample space for the combined experiment is

Ω = Ω₁ × Ω₂ = {(ω₁, ω₂) : ω₁ ∈ Ω₁, ω₂ ∈ Ω₂}

If |Ω₁| = r and |Ω₂| = s, then |Ω₁ × Ω₂| = rs

IPS 22

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Introduction

Events

Recall: subsets of the sample space are called events

Ex: L = the event that the birthday falls in a long month (31 days):

L = {Jan, Mar, May, Jul, Aug, Oct, Dec}

Events may be combined according to the usual set operations

Ex: R = the event that the month contains the letter r:

R = {Jan, Feb, Mar, Apr, Sep, Oct, Nov, Dec}

Then the long months containing the letter r are

L ∩ R = {Jan, Mar, Oct, Dec}

IPS 23

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Events and set operations

Events

L ∩ R, the intersection, occurs if both L and R occur

A ∪ B, the union, occurs if at least one of the events A and B occurs

A^c is the complement of A; it occurs if and only if A does not occur

∅, the empty set, is the impossible event

IPS 24

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Events and set operations

Two events A and B are disjoint (mutually exclusive) if they have no outcomes in common, i.e. A ∩ B = ∅

Ex: {the birthday falls in a long month} ∩ {Feb} = ∅

De Morgan's laws: for any two events A and B,

(A ∪ B)^c = A^c ∩ B^c

(A ∩ B)^c = A^c ∪ B^c

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Events and set operations

Let:

J be the event "John is to blame"

M be the event "Mary is to blame"

Express in terms of J, J^c, M, M^c:

"It is certainly not true that neither John nor Mary is to blame"

Hint: use De Morgan's laws

IPS 26

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Events and set operations

IPS 27

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Probability

For equally likely outcomes,

P(A) = (number of ways event A can occur) / (total number of possible outcomes)

This assignment is an example of a probability function

IPS 28

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Probability

A probability function P on a sample space Ω assigns to each event A in Ω a number P(A) in [0, 1] such that

(i) P(Ω) = 1

(ii) P(A ∪ B) = P(A) + P(B) if A and B are disjoint

IPS 29

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Probability

Recall:

(i) P(Ω) = 1

(ii) P(A ∪ B) = P(A) + P(B) for disjoint A and B

Additivity applies to each element of the sample space and extends to more than two sets: for pairwise disjoint A, B, C,

P(A ∪ B ∪ C) = P(A ∪ B) + P(C)

= P(A) + P(B) + P(C)

IPS 30

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Probability

dishes, we may toss a coin

likely to occur

because a probability function is defined on events, not on

outcomes

IPS 31

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Probability

If, because of the way we throw the coin, the coin is not completely fair, we model the two outcomes as failure and success, with probabilities 1 − p and p to occur, where p ∈ [0, 1]

Ex: if we buy one of 10,000 tickets in a lottery with only one prize, where success stands for winning the prize, then p = 10⁻⁴

IPS 32

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Probability

How should we assign probabilities in the experiment where

we ask for the birthday month?

would you assign a probability to each month?

earthquake), it is impossible to assign a positive probability to

each outcome (there are just too many outcomes!)

IPS 33

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

The probability of an event is obtained by summing the probabilities of the outcomes belonging to the event

Exercise: compute P(L) and P(R) in the birthday experiment, where

L = {Jan, Mar, May, Jul, Aug, Oct, Dec}

R = {Jan, Feb, Mar, Apr, Sep, Oct, Nov, Dec}

IPS 34

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

How can we split an event A into disjoint pieces? For any event B,

A = (A ∩ B) ∪ (A ∩ B^c)

Hence

P(A) = P(A ∩ B) + P(A ∩ B^c)

IPS 35

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

We can write A ∪ B = B ∪ (A ∩ B^c), a union of two disjoint events

Thus

P(A ∪ B) = P(B) + P(A ∩ B^c)

IPS 36

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Combining P(A ∪ B) = P(B) + P(A ∩ B^c) with P(A) = P(A ∩ B) + P(A ∩ B^c) gives the addition rule:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Similarly we can compute probabilities of complements of events: P(A^c) = 1 − P(A)

IPS 37

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Permutation: ordered arrangement of objects

Sampling k objects out of n:

with replacement: n^k different ordered samples

without replacement:

A_{n,k} = n!/(n − k)! = n(n − 1) ... (n − k + 1)

different ordered samples

where n! = n(n − 1)(n − 2) ... 1

IPS 38

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Combinations:

C_k^n = (n choose k) = n! / (k!(n − k)!)

= the number of ways of choosing k objects out of n items

Using C_k^n = n!/(k!(n − k)!) implies that order does not matter

Binomial theorem:

(a + b)^n = Σ_{k=0}^{n} C_k^n a^k b^{n−k}

(try with a = b = 1)
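As a quick illustration, these counts can be checked in R (the software used later in these notes); the numbers below are just an example sketch:

prod(10:8)                          # ordered samples of 3 out of 10 objects: 10*9*8 = 720
factorial(10) / factorial(10 - 3)   # same count via the factorial formula
choose(10, 3)                       # unordered samples (combinations): 720 / 3! = 120
n <- 10
sum(choose(n, 0:n))                 # binomial theorem with a = b = 1: equals 2^n = 1024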

IPS 39

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Ex: At a particular police checkpoint, 20% of females fail a breath

test for drunken driving. The corresponding percentage for males is

40%. Of the individuals tested, 70% are male.

What is the probability that a randomly tested individual fails the breath test?

What is the probability that a randomly tested individual is female and fails the breath test?

Given that an individual has failed the breath test, what is the probability this individual is female?

IPS 40

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Computing probabilities

Ex (continued): 20% of females fail the breath test for drunken driving; the corresponding percentage for males is 40%, and 70% of the individuals tested are male...

For (say) 1,000 individuals tested, we can represent these proportions in a contingency table:

                   Gender
Breath test     Male   Female   Total
Pass             420      240     660
Fail             280       60     340
Total            700      300   1,000
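A direct R transcription of the corresponding probability calculations (illustrative only):

fail_male <- 280; fail_female <- 60; total <- 1000
p_fail <- (fail_male + fail_female) / total          # P(fail) = 0.34
p_female_and_fail <- fail_female / total              # P(female and fail) = 0.06
p_female_given_fail <- p_female_and_fail / p_fail     # P(female | fail) ≈ 0.176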

IPS 41

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Conditional probability

Ex: digitalis therapy is used for a particular heart condition, but it has a risk of intoxication (a serious side-effect that is difficult to diagnose).

For diagnosis purposes, the concentration of digitalis in the blood is measured in 135 patients and results are arranged as follows:

IPS 42

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Conditional probability

T+ : high blood concentration (positive test), T− : low concentration (negative test)

D+ : toxicity, D− : no toxicity

          Toxicity
          D+    D−   Total
T+        25    14      39
T−        18    78      96
Total     43    92     135

IPS 43

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Conditional probability

Converting the frequencies to proportions (out of 135):

         Toxicity (counts)           Toxicity (proportions)
          D+    D−   Total            D+      D−     Total
T+        25    14      39    T+    .185    .104     .289
T−        18    78      96    T−    .135    .578     .711
Total     43    92     135    Total .318    .682    1.000

If one knows that the test for high blood concentration was positive, what is the probability of disease (toxicity)?

P(D+ | T+) = P(D+ ∩ T+) / P(T+) = 25/39 = .185/.289 = .640 = 64%

IPS 44

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Definition of conditional probability:

P(A | B) = P(A ∩ B)/P(B) if P(B) > 0, and 0 otherwise

Multiplication rule: P(A ∩ B) = P(A | B) P(B)

For a fixed event B, let Q(A) = P(A | B) for events A; then Q is a probability function and hence satisfies all the rules

IPS 45

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

2001: the EC introduced massive testing of cattle to determine infection with the transmissible form of Bovine Spongiform Encephalopathy (BSE, mad cow disease) [Dekking et al 2005]

The test is not perfect: it produces false positives and false negatives, i.e. it may declare a cow infected when it isn't, and vice versa

The test is applied to cattle known to be infected or known to be healthy, so as to determine the effectiveness of the test.

IPS 46

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Suppose an infected cow has a 70% chance of testing positive, and a healthy cow just 10%;

i.e. P(T | B) = 0.70 and P(T | B^c) = 0.10

The event T occurs either in combination with B or with B^c (no other possibilities), so

P(T) = P(T ∩ B) + P(T ∩ B^c)

IPS 47

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Recall:

disjoint events that make up the whole sample space

IPS 48

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Total probability

(e.g. with P(T | B^c) = 0.05)

The law of total probability: suppose B₁, B₂, ..., B_m are disjoint events such that B₁ ∪ B₂ ∪ ... ∪ B_m = Ω; then

P(A) = Σ_{i=1}^m P(A | B_i) P(B_i)

IPS 49

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Bayes Theorem

Suppose a cow tests positive; what is the probability it really has BSE?

P(B | T) = P(T ∩ B) / P(T)

         = P(T | B) P(B) / [ P(T | B) P(B) + P(T | B^c) P(B^c) ]

So with P(B) = 0.02 we find

P(B | T) = (0.70 × 0.02) / (0.70 × 0.02 + 0.10 × (1 − 0.02)) = 0.125

For a perfect test we would have P(B | T) = 1 and P(B | T^c) = 0. This is Bayes' rule.
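The same calculation written out in R (a small illustrative sketch of the arithmetic above):

p_B <- 0.02              # prior probability that a cow is infected
p_T_given_B <- 0.70      # chance an infected cow tests positive
p_T_given_notB <- 0.10   # chance a healthy cow tests positive
p_T <- p_T_given_B * p_B + p_T_given_notB * (1 - p_B)   # law of total probability
p_T_given_B * p_B / p_T                                  # Bayes' rule: 0.125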

IPS 50

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Bayes Theorem

Bayes rule:

Suppose the events B₁, B₂, ..., B_m are disjoint and B₁ ∪ B₂ ∪ ... ∪ B_m = Ω. Then

P(B_i | A) = P(A | B_i) P(B_i) / Σ_{j=1}^m P(A | B_j) P(B_j)

with the law of total probability applied to P(A) in the denominator

Exercise: calculate P(B | T) and P(B | T^c) if P(T | B) = 0.99 and P(T | B^c) = 0.05

IPS 51

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Independence

Consider the three probabilities

2% chance it is infected (B)

12.5% chance the cow is infected

likelihood of B

IPS 52

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Independence

In that case B is independent of the test, and knowing the outcome of the test does not change our probability of B: P(B | T) = P(B)

IPS 53

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Independence

Definition: two events A and B are independent if P(A ∩ B) = P(A)P(B)

If A is independent of B, then

P(A | B) = P(A) follows from the definition of independence

IPS 54

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Independence

Finally, by definition of conditional probability, if A is independent of B, then

P(B | A) = P(A ∩ B) / P(A) = P(A)P(B) / P(A) = P(B)

that is, B is independent of A

To show that A and B are independent, it suffices to prove one of the following:

P(A | B) = P(A)

P(B | A) = P(B)

P(A ∩ B) = P(A)P(B)

where A may be replaced by A^c and B replaced by B^c, or both

IPS 55

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Ex: draw one card from a full deck of 52 cards; let A = the card is an ace and D = the card is a diamond.

Then P(A) = 1/13 and P(D) = 1/4

Since P(A ∩ D) = 1/52 = P(A)P(D), the events are in fact independent

IPS 56

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Full independence of m events A₁, ..., A_m requires

P(∩_{i=1}^m A_i) = Π_{i=1}^m P(A_i)

(and the same factorisation for every sub-collection of the events)

Exercise: suppose A and B are independent, and B and C are independent. Then are A and C independent?

IPS 57

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Ex: two fair coin tosses; A = {first toss is heads}, B = {second toss is heads}, and C = {the two tosses are equal}

IPS 58

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Conditional probability and independence

Given that the first toss is heads (A occurs), C occurs if and only if the second toss is heads as well (B occurs), so

P(C | A) = P(B | A) = P(B) = 1/2 = P(C)

By symmetry, P(C | B) = P(C), so A, B and C are called pairwise independent

However, P(A ∩ B ∩ C) = 1/4 ≠ P(A)P(B)P(C) = 1/8

and P(A ∩ B ∩ C^c) = 0 ≠ P(A)P(B)P(C^c) = 1/8

so the three events are not mutually independent

IPS 59

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

Random variables

known in advance and are subject to chance (variability).

probability (or mass, depending on the context).

i.e. it is a function. The development of random variables is

associated with measure theory.

IPS 60

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

Let Ω be a sample space. A discrete random variable is a function X : Ω → R that takes on a finite number of values a₁, a₂, ..., a_n or a countably infinite number of values a₁, a₂, ...

A random variable transforms the original sample space into a more tangible sample space Ω̃, whose events are more relevant

Ex: throw two dice and consider their sum and maximum

Ω = {(1, 1), (1, 2), ..., (1, 6), (2, 1), ..., (6, 5), (6, 6)}

S = sum transforms Ω to Ω̃ = {2, ..., 12}

M = maximum transforms Ω to Ω̃ = {1, ..., 6}

IPS 61

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

The probability distribution describes how the probability mass is distributed over the possible values of X

It lists the values of X and their corresponding probabilities; the original sample space is no longer important

This list defines the probability mass function (pmf) of X

Ex: for the maximum M of two dice throws:

M = a     1      2      3      4      5      6
p(a)    1/36   3/36   5/36   7/36   9/36  11/36
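This pmf can be reproduced in R by enumerating the 36 equally likely outcomes (a small illustrative sketch):

throws <- expand.grid(die1 = 1:6, die2 = 1:6)   # all 36 outcomes of two dice
M <- pmax(throws$die1, throws$die2)             # maximum of the two dice
table(M) / 36                                   # 1/36, 3/36, 5/36, 7/36, 9/36, 11/36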

IPS 62

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

The probability mass function p of a discrete random variable X is the function

p : R → [0, 1]

defined by

p(a) = P(X = a)

for −∞ < a < ∞. If X is a discrete random variable that takes on the values a₁, a₂, ..., then

p(a_i) > 0 and Σ_i p(a_i) = 1

IPS 63

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

The (cumulative) distribution function of a random variable X is the function

F : R → [0, 1]

defined by

F(a) = P(X ≤ a)

for −∞ < a < ∞.

Both the probability mass function and the distribution function of a discrete random variable X contain all the probabilistic information of X; each can be recovered from the other

IPS 64

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

Example plots for M

IPS 65

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

Properties of a distribution function F:

lim_{a→+∞} F(a) = 1

lim_{a→−∞} F(a) = 0

F is right-continuous: lim_{ε↓0} F(a + ε) = F(a)

F is non-decreasing: for a ≤ b, the event {X ≤ a} is contained in {X ≤ b}

Any function with these properties is the distribution function of some random variable

IPS 66

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

that p(a) > 0. Show that

IPS 67

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

A continuous random variable is a function X : Ω → R that can take on any value a ∈ R

Ex: a measurement that may take any value between 0 and 14; we would then evaluate e.g. the probability that 5.5 ≤ X ≤ 6.5.

IPS 68

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

The probability density function (pdf) f(x) of X is an integrable function such that

P(a ≤ X ≤ b) = ∫_a^b f(x) dx

Conditions on f:

f(x) ≥ 0 for all x

∫ f(x) dx = 1

The distribution function is

F(x) = ∫_{−∞}^x f(u) du = P(X ≤ x)

IPS 69

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

Definition: The expected value of a discrete random variable X is defined as

E(X) = Σ_{x_i} x_i p(x_i)

The expected value of a continuous random variable X is defined as

E(X) = ∫ x f(x) dx

provided the sum/integral exists.

IPS 70

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

Var(X) = E[(X − E(X))²]

       = E(X²) − E(X)²

For a discrete rv:

Var(X) = Σ_{x_i} x_i² p(x_i) − ( Σ_{x_i} x_i p(x_i) )²

For a continuous rv:

Var(X) = ∫ x² f(x) dx − ( ∫ x f(x) dx )²

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

The standard deviation of X is σ(X) = √Var(X)

It is expressed in the same units as X: if X is expressed in metres, then so is σ(X)

IPS 72

ST1051-ST3905-ST5005-ST6030

Elements of Probability Theory

Random variables and distributions

Expectation:

E(aX) = a E(X), a constant

E(XY) = E(X)E(Y) if X and Y are independent

E(a + bX) = a + b E(X) (linearity)

E(X + Y) = E(X) + E(Y) (linearity)

E[ Σ_{i=1}^n X_i ] = Σ_{i=1}^n E[X_i]

Variance:

Var(a + X) = Var(X), a constant

IPS 73

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

Section III

IPS 74

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

Outline

IPS 75

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

Binomial experiments

Consider an experiment with outcomes 1 (success, with probability p) and 0 (failure), repeated independently five times

The event "exactly one of the five trials is a success" is given by the set

A = {(0, 0, 0, 0, 1), (0, 0, 0, 1, 0), (0, 0, 1, 0, 0), (0, 1, 0, 0, 0), (1, 0, 0, 0, 0)}

Then P(A) = 5(1 − p)⁴ p, since there are five outcomes in the event A, each having probability (1 − p)⁴ p

IPS 76

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

Binomial experiments

In general: what is the probability that exactly k of the experiments were successful?

IPS 77

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

The Bernoulli distribution is used to model an experiment with

only two possible outcomes, often referred to as success and

failure, usually encoded as 1 and 0.

A random variable X has a Bernoulli distribution with parameter p, where 0 ≤ p ≤ 1, if its probability mass function is given by

p_X(1) = P(X = 1) = p

and

p_X(0) = P(X = 0) = 1 − p

IPS 78

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

Ex: a multiple-choice exam

There are 10 questions; each question has 4 alternatives (of which only one is correct)

You will pass the exam if you answer six or more questions correctly

Suppose you answer by pure guesswork, in such a way that the answer of one question is not affected by the answers of the others

IPS 79

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

Bernoulli / Binomial

R_i = 1 if the i-th answer is correct, 0 if the i-th answer is wrong

The number of correct answers X is given by X = Σ_{i=1}^{10} R_i

Exercise:

Calculate the probability that you answered the first question

correctly and the second one incorrectly

IPS 80

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

Since the questions are answered independently of each other, we conclude that the events {R₁ = a₁}, ..., {R₁₀ = a₁₀} are independent for every choice of the a_i, where each a_i is 0 or 1

We have

P(X = 0) = P(R₁ = 0, R₂ = 0, ..., R₁₀ = 0)

         = P(R₁ = 0) P(R₂ = 0) ... P(R₁₀ = 0)

         = (3/4)¹⁰

The probability that we have answered exactly one question correctly equals

P(X = 1) = 10 × (1/4) × (3/4)⁹

IPS 81

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

More generally,

P(X = k) = C_k^10 (1/4)^k (3/4)^{10−k}

Counting argument: there are

n possibilities for the first object

n − 1 for the second object

...

and so on, when arranging k out of n objects

IPS 82

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

Different ordered arrangements may be composed of the same objects

Each set of k objects corresponds to k! ordered arrangements

so to count unordered sets, divide the number for ordered arrangements by k!

IPS 83

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Binomial distribution

A random variable X has a Binomial distribution with parameters n and p, where n = 1, 2, ... and 0 ≤ p ≤ 1, if its probability mass function is given by

p_X(k) = P(X = k) = C_k^n p^k (1 − p)^{n−k}

for k = 0, 1, ..., n

Its expectation is

E(X) = np

Its variance is

Var(X) = np(1 − p)
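In R, Binomial probabilities are available via dbinom; e.g. for the multiple-choice exam above (n = 10 questions, p = 1/4 per question), a quick sketch:

dbinom(0, size = 10, prob = 0.25)            # P(X = 0) = (3/4)^10 ≈ 0.056
dbinom(1, size = 10, prob = 0.25)            # P(X = 1) = 10*(1/4)*(3/4)^9 ≈ 0.188
sum(dbinom(6:10, size = 10, prob = 0.25))    # P(pass) = P(X >= 6) ≈ 0.020
# mean np = 2.5 and variance np(1-p) = 1.875 for this distribution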

IPS 84

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Geometric distribution

Example of infinite experiments: the geometric experiment

Repeat independent trials, each resulting in success or failure, until the first success is observed

(by independence, we multiply probabilities)

X = the number of trials needed to obtain the first success

IPS 85

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Geometric distribution

A random variable X has a Geometric distribution with parameter p, where 0 < p ≤ 1, if its probability mass function is given by

p_X(k) = P(X = k) = (1 − p)^{k−1} p

for k = 1, 2, ...

Its expectation is

E(X) = Σ_{k=1}^∞ k p (1 − p)^{k−1} = 1/p

Its variance is

Var(X) = (1 − p)/p²

IPS 86

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Geometric distribution

Geometric distribution

Exercise:

Let X have a Geo(p) distribution. For n 0, show that

P(X > n) = (1 p)n .

IPS 87

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Geometric distribution

Memoryless property

Memoryless property: for n, k = 0, 1, 2, ... one has

P(X > n + k | X > k) = P(X > n)

We have:

P(X > n + k | X > k) = P({X > k + n} ∩ {X > k}) / P(X > k)

                     = P(X > k + n) / P(X > k)

                     = (1 − p)^{n+k} / (1 − p)^k

                     = (1 − p)^n

                     = P(X > n)
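A quick numerical check in R: pgeom counts failures before the first success, so P(X > n) for the trial-count version of X used here corresponds to pgeom(n - 1, p, lower.tail = FALSE). The values below (p = 0.3, n = 4, k = 2) are arbitrary choices for illustration:

p <- 0.3; n <- 4; k <- 2
P_gt <- function(m) (1 - p)^m                 # P(X > m) = (1-p)^m
P_gt(n + k) / P_gt(k)                         # P(X > n+k | X > k) = 0.2401
P_gt(n)                                       # P(X > n): same value
pgeom(n - 1, p, lower.tail = FALSE)           # same value from the built-in geometric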

IPS 88

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Poisson distribution

One may be interested in counts per unit time/space interval; the situation may then be modelled by a Poisson distribution

Ex: the number of calls arriving at a call centre per hour

Two assumptions:

Homogeneity: the rate λ at which events occur is constant over time/space

Independence: events in disjoint intervals occur independently of each other

The expected number of events per interval is then E(X) = λ

IPS 89

ST1051-ST3905-ST5005-ST6030

Discrete Random Variables

The Poisson distribution

A random variable X has a Poisson distribution with parameter λ > 0 if its probability mass function p is given by

p(k) = P(X = k) = e^{−λ} λ^k / k!   for k = 0, 1, 2, ...

Its expectation is

E(X) = Σ_{k=0}^∞ k e^{−λ} λ^k / k! = λ e^{−λ} Σ_{k=1}^∞ λ^{k−1}/(k − 1)! = λ e^{−λ} Σ_{j=0}^∞ λ^j / j! = λ

The variance can be derived in a similar way:

Var(X) = λ
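These identities can be checked numerically in R by brute-force summation of the pmf (λ = 3.5 is an arbitrary illustrative value):

lambda <- 3.5
k <- 0:200                          # truncation is fine: the tail is negligible
p <- dpois(k, lambda)               # Poisson pmf e^(-lambda) lambda^k / k!
sum(p)                              # ≈ 1
sum(k * p)                          # E(X) ≈ 3.5
sum(k^2 * p) - sum(k * p)^2         # Var(X) ≈ 3.5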

IPS 90

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Section IV

IPS 91

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Outline

Moments

IPS 92

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

Continuous random variables arise when quantities are measured on a continuous scale (e.g. weight, length, duration, etc.)

They can be thought of as arising from a process of refinement from discrete random variables

Ex: suppose an observed value 6.283 carries probability p. This value may be refined (updated at a smaller scale), and then the probability p is spread over the outcomes

6.2830, 6.2831, ..., 6.2839

Each refined outcome has probability smaller than p, and the sum of the ten probabilities is p

IPS 93

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

Refining further, with more and more decimals, the probabilities of the possible values of the outcomes become smaller and smaller, approaching zero

However, the probability that the outcome falls in a fixed interval [a, b] will settle down

IPS 94

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

A random variable X is continuous if for some function f : R → R and for any numbers a, b with a ≤ b,

P(a ≤ X ≤ b) = ∫_a^b f(x) dx

and f satisfies f(x) ≥ 0 for all x and ∫ f(x) dx = 1. We call f the probability density function (or probability density) of X.

IPS 95

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

P(a X b) = area under a probability density function f

on the interval [a, b]

IPS 96

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

For a single point we have:

P(X = a) = lim_{ε→0} P(a ≤ X ≤ a + ε) = lim_{ε→0} ∫_a^{a+ε} f(x) dx = 0

Hence, for a, b constant,

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b)

IPS 97

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

For small ε > 0,

P(a − ε ≤ X ≤ a + ε) = ∫_{a−ε}^{a+ε} f(x) dx ≈ 2ε f(a)

so f(a) indicates how likely it is that X will be near a

Ex: verify that

f(x) = 0 if x ≤ 0,   f(x) = 1/(2√x) if 0 < x < 1,   f(x) = 0 if x ≥ 1

is a probability density function

IPS 98

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

Discrete rvs do not have a probability density function f

The distribution function F covers both cases: we can express probabilities of events such as {X ≤ a} and {a < X ≤ b} directly in terms of F for both cases:

P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)

For continuous rvs,

F(b) = ∫_{−∞}^b f(x) dx

IPS 99

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

Ex (darts): consider an experiment that can be described as "an object hits a disc of radius r in a completely arbitrary way". We are interested in the distance X between the hitting point and the centre of the disc.

F(b) = P(X ≤ b) = 0 when b < 0

Since the object hits the disc, we have F(b) = 1 when b > r

For 0 ≤ b ≤ r, P(X ≤ b) is proportional to the area of the corresponding region

IPS 100

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

The inner disc defined by the hitting point has radius b and area πb²

We should put F(b) = P(X ≤ b) = πb²/(πr²) = b²/r² for 0 ≤ b ≤ r

Differentiating, for 0 ≤ x ≤ r,

f(x) = dF(x)/dx = (1/r²) d(x²)/dx = 2x/r²

IPS 101

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

Exercise:

Compute for the darts example the probability that

0 < X r /2, and the probability that r /2 < X r .

IPS 102

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Continuous random variables

For a function g of a discrete random variable X,

E[g(X)] = Σ_x g(x) P(X = x)

and for a continuous random variable,

E[g(X)] = ∫_{−∞}^{+∞} g(x) f(x) dx

provided the sum/integral exists

IPS 103

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Uniform distribution

the outcome is completely arbitrary, except that we know that

it lies between certain bounds

particles of some material

IPS 104

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Uniform distribution

what times the particles are emitted

way (in our physical world anyway)

the same length should have the same probability

IPS 105

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Uniform distribution

A random variable X has a Uniform distribution U(α, β) if its probability density function f is given by

f(x) = 0 if x ∉ [α, β]

f(x) = 1/(β − α) for α ≤ x ≤ β

Exercise:

Argue that the distribution function F of a rv that has a U(α, β) distribution is given by F(x) = 0 if x < α, F(x) = 1 if x > β, and F(x) = (x − α)/(β − α) for α ≤ x ≤ β.

IPS 106

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Uniform distribution

pdf and the distribution function of a U(0, 1/3) distribution:

IPS 107

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Exponential distribution

Describes how long until something happens (e.g. time between emissions of particles from a radioactive source)

Memoryless property: P(X > s + t | X > s) = P(X > t)

Notation: Exp(λ), with rate λ > 0

Distribution function: F(a) = 1 − e^{−λa} for a ≥ 0

Probability density function: f(x) = λ e^{−λx} for x ≥ 0
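A quick numerical illustration of the memoryless property in R (the rate λ = 2 and the values of s, t are arbitrary choices for this sketch):

lambda <- 2; s <- 0.5; t <- 1.2
# P(X > s + t | X > s)
pexp(s + t, rate = lambda, lower.tail = FALSE) / pexp(s, rate = lambda, lower.tail = FALSE)
# P(X > t): the same value, exp(-lambda * t) ≈ 0.0907
pexp(t, rate = lambda, lower.tail = FALSE)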

IPS 108

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Exponential distribution

Exponential distribution for various rates (en.wikipedia.org):

IPS 109

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

Illustration

Example: relative frequency histogram of lifetimes of a

computer component

sample)?

IPS 110

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

IPS 111

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

A Normal random variable has mean μ and standard deviation σ

Notation: N(μ, σ²)

Probability density function:

f(x) = (1/(σ√(2π))) e^{−(x − μ)²/(2σ²)}

IPS 112

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

The shape of the normal distribution varies according to the values of μ and σ

The distribution is symmetric about the mean

IPS 113

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

To find probabilities, we would need to integrate the pdf; instead we use tables of the standard Normal distribution, which give upper-tail probabilities for z-values, i.e. right hand tails P(Z ≥ z)

IPS 114

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

The table gives P(Z ≥ z), i.e. the probability that a normal variable will be at least μ + zσ, where μ is the mean and σ is the standard deviation of the normal variable.

.00 .01 .02 .03 .04 .05 .06 .07 .08 .09

0.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641

0.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247

0.2 .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859

0.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483

0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121

0.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776

0.6 .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451

0.7 .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148

0.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867

0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611

1.0 .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379

IPS 115

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

IPS 116

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

We can also use the standard normal table to find percentiles of the standard normal distribution

Ex: find P₀.₈₀, the 80th percentile of the standard normal distribution, i.e. P(Z < P₀.₈₀) = 0.80

Since P(Z > P₀.₈₀) = 0.20, we rather look for 0.2000 within the table

We read that

P(Z > 0.84) = 0.2005

P(Z > 0.85) = 0.1977

Therefore 0.84 < P₀.₈₀ < 0.85
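In R, such percentiles are obtained directly from the quantile function of the standard normal:

qnorm(0.80)                              # 0.8416: the 80th percentile P_0.80
pnorm(0.8416, lower.tail = FALSE)        # ≈ 0.20, consistent with the table lookup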

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

distribution...

IPS 118

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

Standardization

In general μ ≠ 0 and/or σ ≠ 1

We standardize X before looking up probabilities:

Z = (X − μ)/σ ~ N(0, 1)

This principle is also (implicitly) fundamental in many statistical inference methods

IPS 119

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

Standardization: example

A life assurance company has established that the lifetimes of a certain subgroup of policy-holders are normally distributed with a mean of 72 years and a standard deviation of 4 years, i.e. the continuous lifetime H ~ N(72, 4²)

IPS 120

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

Standardization: example

Standardize:

P(H > 78) = P( (H − 72)/4 > (78 − 72)/4 ) = P(Z > 1.50)
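The same probability computed directly in R:

pnorm(78, mean = 72, sd = 4, lower.tail = FALSE)   # ≈ 0.0668
pnorm(1.5, lower.tail = FALSE)                     # same value after standardizing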

IPS 121

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

Linear combinations of independent normal random variables are also normally distributed

For independent X₁, X₂, ..., X_n with X_i Normal with mean μ_i and standard deviation σ_i, i = 1, ..., n,

Y = Σ_{i=1}^n a_i X_i is Normal with mean Σ_{i=1}^n a_i μ_i and variance Σ_{i=1}^n a_i² σ_i²

IPS 122

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

B.

The volumes of A are normally distributed with a mean of

50ml and a standard deviation of 1.5ml.

75ml and a standard deviation of 2.5ml.

B.

330ml?

IPS 123

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

The Normal distribution

with a mean of 1700 and a standard deviation of 100.

A random sample of n accounts is taken.

1700 and standard deviation 100.

IPS 124

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Moments

Moments

The k-th moment of X is E[X^k], provided it exists, where k is any positive integer.

IPS 125

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Moments

Special expectation: the Moment Generating Function (MGF)

φ_X(t) = E[e^{tX}]

The MGF characterises the distribution of a random variable.

For a continuous rv,

φ_X(t) = ∫ e^{tx} f(x) dx

       = ∫ (1 + tx + t²x²/2! + ...) f(x) dx

       = 1 + t E[X] + t² E[X²]/2! + ...

IPS 126

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Moments

Useful properties:

1 Moments from derivatives:

lim_{t→0} d^k φ_X(t)/dt^k = E[X^k]

2 If X, Y are independent,

φ_{X+Y}(t) = E[e^{t(X+Y)}]

           = E[e^{tX} e^{tY}]

           = E[e^{tX}] E[e^{tY}]

           = φ_X(t) φ_Y(t)

IPS 127

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Moments

Examples of MGFs

X ~ Exp with mean θ, i.e. f(x) = (1/θ) e^{−x/θ} for x > 0:

φ_X(t) = ∫₀^∞ e^{tx} (1/θ) e^{−x/θ} dx

       = ∫₀^∞ (1/θ) e^{x(t − 1/θ)} dx

       = [ (1/θ) e^{x(t − 1/θ)} / (t − 1/θ) ]₀^∞

       = (1/θ) lim_{x→∞} e^{x(t − 1/θ)}/(t − 1/θ) − (1/θ)/(t − 1/θ)

       = 1/(1 − θt)

as long as t < 1/θ, since the upper limit then vanishes

IPS 128

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Moments

Examples of MGFs

X ~ N(μ, σ²):

φ_X(t) = ∫ e^{tx} (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)} dx

       = ∫ (1/(σ√(2π))) e^{tx} e^{−(x² − 2μx + μ²)/(2σ²)} dx

       = ∫ (1/(σ√(2π))) e^{−(x² − 2x(μ + tσ²) + μ²)/(2σ²)} dx

       = e^{((μ + tσ²)² − μ²)/(2σ²)} ∫ (1/(σ√(2π))) e^{−(x − μ′)²/(2σ²)} dx

where μ′ = μ + tσ²

The last integrand is a N(μ′, σ²) density, so it integrates to 1; therefore

φ_X(t) = e^{μt + σ²t²/2}

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Moments

Characteristic function

The MGF of a random variable does not always exist

The characteristic function always exists:

ϕ_X(t) = E[e^{itX}] = ∫ e^{itx} f(x) dx

It relates to the MGF via φ_X(it) = ϕ_X(t), when the MGF exists

Additive property: for independent X and Y, ϕ_{X+Y}(t) = ϕ_X(t) ϕ_Y(t)

ST1051-ST3905-ST5005-ST6030

Continuous Random Variables

Moments

Cumulants

The log of the characteristic function is used to generate cumulants κ_n:

log(ϕ_X(t)) = Σ_{n=1}^∞ κ_n (it)^n / n!

κ₁ = E[X]

κ₂ = E[X²] − E[X]²

κ₃ = 2E[X]³ − 3E[X]E[X²] + E[X³]

...

IPS 131

ST1051-ST3905-ST5005-ST6030

Limit theorems

Section V

Limit theorems

IPS 132

ST1051-ST3905-ST5005-ST6030

Limit theorems

Outline

Motivation

Limit theorems

IPS 133

ST1051-ST3905-ST5005-ST6030

Limit theorems

Motivation

The Central Limit Theorem states that the distribution of the sum of a large number of mutually independent rvs may be approximated by a Normal distribution

This explains why the Normal distribution arises as an approximate distribution in a very large variety of situations

Estimation, hypothesis testing and regression modelling (describing relationships between variables) are some of the key elements of the theory of Statistics that largely rely on the normal distribution

IPS 134

ST1051-ST3905-ST5005-ST6030

Limit theorems

Limit theorems

Chebyshev's inequality

Let X₁, ..., X_n be independent rvs with E(X_i) = μ and Var(X_i) = σ². Let

X̄_n = (1/n) Σ_{i=1}^n X_i

Then, for any ε > 0,

P( |X̄_n − μ| > ε ) ≤ Var(X̄_n)/ε²

IPS 135

ST1051-ST3905-ST5005-ST6030

Limit theorems

Limit theorems

(Weak) law of large numbers: let X₁, ..., X_n be independent rvs with E(X_i) = μ and Var(X_i) = σ². Let

X̄_n = (1/n) Σ_{i=1}^n X_i

Then, for any ε > 0,

P( |X̄_n − μ| > ε ) → 0

as n → ∞

IPS 136

ST1051-ST3905-ST5005-ST6030

Limit theorems

Limit theorems

Let X₁, X₂, ... be a sequence of rvs with cdfs F₁, F₂, ..., and let X be a rv with cdf F. We say that X_n converges in distribution to X if

lim_{n→∞} F_n(x) = F(x)

at every point x where F is continuous

IPS 137

ST1051-ST3905-ST5005-ST6030

Limit theorems

Limit theorems

Central Limit Theorem: let X₁, X₂, ... be i.i.d. rvs with mean μ, variance σ², and common distribution function F and MGF M defined in a neighbourhood of 0. Let

S_n = Σ_{i=1}^n X_i

Then

lim_{n→∞} P( (S_n − nμ)/(σ√n) ≤ x ) = Φ(x)

where Φ(x) denotes the cdf of the Standard Normal distribution

IPS 138

ST1051-ST3905-ST5005-ST6030

Limit theorems

Limit theorems

Examples:

Z_n = (S_n − nμ)/(σ√n) = ( Σ_{i=1}^n X_i − nμ )/(σ√n) → N(0, 1) in distribution

so, approximately for large n,

S_n = Σ_{i=1}^n X_i ≈ N(nμ, nσ²)
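A small simulation sketch of the CLT in R (Exp(1) summands and n = 36 are arbitrary choices for illustration):

set.seed(1)
n <- 36; mu <- 1; sigma <- 1
# 10,000 standardized sums of n Exp(1) variables
z <- replicate(10000, (sum(rexp(n, rate = 1)) - n * mu) / (sigma * sqrt(n)))
mean(z); sd(z)                          # close to 0 and 1
hist(z, breaks = 40, freq = FALSE)      # bell-shaped, close to the N(0,1) density
curve(dnorm(x), add = TRUE)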

IPS 139

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Section VI

Statistical Inference

IPS 140

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Outline

Sampling

IPS 141

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory Analysis and Descriptive statistics

Probability? Statistics?

IPS 142

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory Analysis and Descriptive statistics

Statistics!

Moneyball (2011)

IPS 143

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Sampling

Population parameters

Statistical inference consists in estimating population features from a sample

Ex: consider a population of N hospitals, for which the mean number of discharges is

μ = (1/N) Σ_{i=1}^N x_i = 814.6

The population total (total number of discharges) is

τ = Σ_{i=1}^N x_i = Nμ = 320,138

Population variance on number of discharges per hospital:

σ² = (1/N) Σ_{i=1}^N (x_i − μ)² = (1/N) Σ_{i=1}^N x_i² − μ²

IPS 144

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Sampling

Simple random sampling: draw n units at random from the N population units

There are C(N, n) such samples (without replacement)

In practice: random number generators, drawing labels from an urn, etc.

IPS 145

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Sampling

The population mean is estimated by the sample mean

X̄ = (1/n) Σ_{i=1}^n X_i

then the population total is estimated by

τ̂ = N X̄

The variance of the number of discharges per hospital is estimated with

s² = (1/(n − 1)) Σ_{i=1}^n (X_i − X̄)²

IPS 146

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Sampling

Stratified sampling: the population is divided into non-overlapping groups (strata)

Ex:

human populations organised in geographical areas

businesses by size (large, medium and small)

A simple random sample is then taken within each of the strata

IPS 147

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Sampling

Within stratum l, the sample mean is

X̄_l = (1/n_l) Σ_{i=1}^{n_l} X_{il},   l = 1, ..., L

The overall estimate of the population mean is the weighted average

X̄ = Σ_{l=1}^L (N_l/N) X̄_l = Σ_{l=1}^L W_l X̄_l

The estimate of σ_l² is

s_l² = (1/(n_l − 1)) Σ_{i=1}^{n_l} (X_{il} − X̄_l)²

IPS 148

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Sampling

Cluster Sampling

clusters

stratified sampling

data collection

IPS 149

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Sampling

Systematic Sampling

sorted) so as to avoid bias

IPS 150

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

The durations of 272 eruptions of the Old Faithful geyser at

Yellowstone National Park, Wyoming, USA, were recorded from

1st to 15th Aug 1985 (in seconds)

IPS 151

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

The durations of 272 eruptions of the Old Faithful geyser at

Yellowstone National Park, Wyoming, USA, were recorded

from 1st to 15th Aug 1985 (in seconds)

randomness is involved

By exploring the dataset we might learn about this

randomness:

which durations are more likely to occur?

dataset?

IPS 152

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

216 108 200 137 272 173 282 216 117 261 110 235 252 105 282

130 105 288 96 255 108 105 207 184 272 216 118 245 231 266

258 268 202 242 230 121 112 290 110 287 261 113 274 105 272

199 230 126 278 120 288 283 110 290 104 293 223 100 274 259

134 270 105 288 109 264 250 282 124 282 242 118 270 240 119

304 121 274 233 216 248 260 246 158 244 296 237 271 130 240

132 260 112 289 110 258 280 225 [......] 200 250 260 270 145 240

250 113 275 255 226 122 266 245 110 265 131 288 110 288 246

238 254 210 262 135 280 126 261 248 112 276 107 262 231 116

270 143 282 112 230 205 254 144 288 120 249 112 256 105 269

240 247 245 256 235 273 245 145 251 133 267 113 111 257 237

140 249 141 296 174 275 230 125 262 128 261 132 267 214 270

249 229 235 267 120 257 286 272 111 255 119 135 285 247 129

265 109 268 (Source: W. Hardle. Smoothing techniques with

implementation in S. 1991; Table 3, page 201. Springer New York)

IPS 153

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Just looking at the list of observed durations does not help much

The sample mean is 209.3 for the Old Faithful data, but there is a lot more information in the observed durations

IPS 154

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Ordered durations

96 100 102 104 105 105 105 105 105 105 107 107 108 108 108 108 109

109 109 110 110 110 110 110 110 110 111 111 112 112 112 112 112 112

112 112 113 113 113 113 115 115 116 116 117 118 118 118 119 119 119

120 120 120 120 121 121 121 122 122 124 125 125 126 126 126 128 129

130 130 131 132 132 132 133 134 134 135 135 136 137 138 139 140 141

142 143 144 144 145 145 149 157 158 168 173 174 184 199 200 200 202

205 207 210 210 214 214 216 216 216 216 221 223 224 225 226 226 229

230 230 230 230 230 231 231 233 235 235 235 237 237 238 238 240 240

240 240 240 240 242 242 243 244 244 245 245 245 245 [.....] 274 274

275 275 275 275 276 276 276 276 277 278 278 278 279 280 280 282 282

282 282 282 282 283 284 285 286 287 288 288 288 288 288 288 289 289

290 290 291 293 294 294 296 296 296 300 302 304 306

Middle elements (136th and 137th) = 240, much closer to max

(306) than to min (96) - implies asymmetry

IPS 155

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Numerical summaries

A statistic is a quantity computed from the sample; it provides a numerical summary of a sample

Depending on the nature and distribution of the data, some statistics are more adequate than others

The order statistics are the sorted sample values X_[1], ..., X_[n]

IPS 156

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

The sample mean highlights probabilistic notions:

X̄ = (1/n) Σ_{i=1}^n X_i = Σ_{i=1}^n X_i (1/n)

which parallels the expectation

E(X) = Σ_x x p(x) (discrete)

E(X) = ∫ x f(x) dx (continuous)

IPS 157

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Median: the middle value X_[(n+1)/2] of the ordered sample; when n is even, the average of the two middle values is used (interpolation)

IPS 158

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Typical summary statistics are:

For centrality: the sample mean X̄ and/or median X_[(n+1)/2]

For spread: the quartiles

q_n(0.25) = Q1(X) = X_[(n+1)/4]

q_n(0.75) = Q3(X) = X_[3(n+1)/4]

and the sample variance

s_n² = (1/(n − 1)) Σ_{i=1}^n (X_i − X̄_n)²
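These summaries are one-liners in R; e.g. on the built-in Old Faithful dataset (whose eruption durations are recorded in minutes rather than the seconds quoted above):

data(faithful)
x <- faithful$eruptions
mean(x); median(x)             # centrality
quantile(x, c(0.25, 0.75))     # quartiles Q1 and Q3
var(x); sd(x)                  # sample variance and standard deviation
summary(x)                     # five-number summary plus the mean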

IPS 159

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Numerical summaries

The five-number summary statistics include:

min(X) (or an alternative lower value)

Q1(X)

median(X)

Q3(X)

max(X) (or an alternative upper value)

IPS 160

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Often with real datasets one has to deal with outlying values; this motivates the use of robust statistics:

e.g. use the median rather than the sample mean

IPS 161

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Ex: sepal width on the Iris data (50 flowers from each of 3 species of iris)

[Boxplots of "Iris data (2nd component)" (sepal width) by species; vertical axis from 2.0 to 4.0]

Source: Becker, Chambers and Wilks (1988), The New S Language. Wadsworth & Brooks/Cole

IPS 162

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Histogram reveals the asymmetry of the dataset and the fact that

the elements accumulate somewhere near 120 and 270, which was

not clear from the list of values

IPS 163

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Drawing a histogram

Whenever feasible let the software do it! Then inspect the default output and modify its calibration if needed

Essential features:

1 Total area under the graph is taken to represent 1

2 Bin width chosen in relation to the spread (e.g. the standard deviation) of the data
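A minimal R sketch (the bin calibration is set via the breaks argument, and the density scaling makes the total area 1):

data(faithful)
x <- faithful$eruptions
hist(x, breaks = 20, freq = FALSE,
     main = "Old Faithful eruption durations", xlab = "duration (min)")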

IPS 164

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

IPS 165

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

325 55 242 68 422 180 10 1146 600 15 36 4 0 8 227 65 176 58

457 300 97 263 452 255 197 193 6 79 816 1351 148 21 233 134

357 193 236 31 369 748 0 232 330 365 1222 543 10 16 529 379 44

129 810 290 300 529 281 160 828 1011 445 296 1755 1064 1783

860 983 707 33 868 724 2323 2930 1461 843 12 261 1800 865

1435 30 143 108 0 3110 1247 943 700 875 245 729 1897 447 386

446 122 990 948 1082 22 75 482 5509 100 10 1071 371 790 6150

3321 1045 648 5485 1160 1864 4116

Source: J.D. Musa, A. Iannino, and K. Okumoto. Software

reliability: measurement, prediction, application. McGraw-Hill,

New York, 1987; Table on page 305

IPS 166

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Cumulative representation of the data:

F_n(x) = (1/n) × (number of elements in the dataset ≤ x)

IPS 167

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Boxplot

Another way of summarising the underlying data distribution

[Boxplot of the Faithful data; vertical axis from 50 to 90]

IPS 168

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Scatterplot

(x1 , y1 ), (x2 , y2 ), . . ., (xn , yn ) (bivariate dataset)

IPS 169

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

Scatterplot

Example: daily readings of air quality values in NYC, 1st May - 30th Sept 1973 (R dataset airquality)

Ozone: mean ozone (ppb) from 1300 to 1500 hours at Roosevelt Island

Solar.R: solar radiation (Langleys) in the band 4000-7700 Angstroms from 0800 to 1200 hours at Central Park

Wind: average wind speed (mph) at 0700 and 1000 hours at LaGuardia Airport

Temp: maximum daily temperature (degrees F) at La Guardia Airport.

IPS 170

ST1051-ST3905-ST5005-ST6030

Statistical Inference

Exploratory data analysis

[Two scatterplots titled "Air quality, NYC, May-Sep 1973"; y-axes: Temperature (degrees F), ticks 60-90, and a second variable, ticks 5-20]

IPS 171

ST1051-ST3905-ST5005-ST6030

Estimation

Section VII

Estimation

IPS 172

ST1051-ST3905-ST5005-ST6030

Estimation

Outline

Statistical Inference

Estimation

Confidence intervals

Linear regression

IPS 173

ST1051-ST3905-ST5005-ST6030

Estimation

Statistical Inference

Statistical inference

Detection:

Discrete probabilities (most of the time)

Estimation:

Discrete or continuous probabilities

(e.g. the mass of an electron)

IPS 174

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

Why estimation?

estimators is closer to the parameters of interest

from the parameter?

IPS 175

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

Estimators

Let t = h(x₁, x₂, ..., x_n) be an estimate based on the dataset x₁, x₂, ..., x_n

The corresponding random variable T = h(X₁, X₂, ..., X_n) is the estimator used for the estimation

An estimator is a random variable; an estimate is a fixed value computed from a dataset

IPS 176

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

An estimator is a function of the sample observations that gives us an estimated value for the unknown parameter

Ex: the sample mean and sample variance estimate the mean μ and variance σ² of the Normal distribution N(μ, σ²)

The two most common criteria used to assess estimators are (a) Unbiasedness and (b) Minimum Variance

IPS 177

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

Unbiasedness: If T is an estimator for θ, then T is called unbiased if

E[T] = θ

If, among unbiased estimators, T also has minimum variance, then under regular conditions T achieves the best possible estimation accuracy

Note that any statistic is an unbiased estimator for its own expected value; the question is whether that expected value is the parameter θ we aim for

IPS 178

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

Let

L(θ) = f(x₁, x₂, ..., x_n; θ)

be the joint pdf of X₁, X₂, ..., X_n, regarded as a function of θ (the likelihood)

The value θ̂ at which L(θ) is maximum is called a maximum likelihood estimate (MLE) of θ:

f(x₁, x₂, ..., x_n; θ̂) = max_θ f(x₁, x₂, ..., x_n; θ)

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

MLE:

f(x₁, x₂, ..., x_n; θ̂) = max_θ f(x₁, x₂, ..., x_n; θ)

Under regularity conditions on the density or pmf:

Consistency: θ̂_ML converges to the true θ in probability as the sample size increases

Asymptotic normality: as the sample size increases, √n(θ̂ − θ) converges to a Normal distribution with mean 0

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

Ex: the sample is a realization of random variables X₁, X₂, ..., X_n, with n = 135, and X_i ~ Exp(θ)

The pdf is

f(x, θ) = (1/θ) e^{−x/θ},   x, θ > 0

The joint pdf (the likelihood) is

L(θ) = Π_i f(x_i, θ)

     = Π_{i=1}^n (1/θ) e^{−x_i/θ}

     = θ^{−n} e^{−Σ_{i=1}^n x_i / θ}

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

The log-likelihood is

ln L(θ) = −n ln θ − ( Σ_{i=1}^n x_i ) / θ

Note that ln L(θ) is a differentiable function of θ, and

d ln L(θ)/dθ = −n/θ + ( Σ_{i=1}^n x_i ) / θ² = 0   ⟹   θ̂ = x̄

Further,

d² ln L(θ)/dθ² = n/θ² − 2( Σ_{i=1}^n x_i ) / θ³ < 0 at θ = θ̂

so θ̂ = x̄ is indeed a maximum
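A numerical sketch in R, maximising this log-likelihood directly and comparing with the closed form θ̂ = x̄ (the data are simulated here, purely for illustration):

set.seed(42)
x <- rexp(135, rate = 1/5)          # simulated sample with true theta = 5
loglik <- function(theta) -length(x) * log(theta) - sum(x) / theta
optimize(loglik, interval = c(0.01, 50), maximum = TRUE)$maximum   # ≈ mean(x)
mean(x)                             # the analytical MLE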

IPS 182

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

It can also be shown that this estimator has minimum variance among all possible unbiased estimators [but this is beyond our scope for now]

IPS 183

ST1051-ST3905-ST5005-ST6030

Estimation

Estimation

So we have a supposedly good estimator, X̄

But its value will vary in a random manner from one sample to another if we repeat the experiment

We would like a range of values in which the true value lies with a high probability

For that we need the sampling distribution of the estimator statistic

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Confidence Intervals

A confidence interval is an interval in which we are very confident the population parameter of interest lies

When estimating a population parameter (e.g. the population mean), we calculate a single estimate, known as a point estimate

An interval around the point estimate is then needed to give a measure of the sampling error

The sampling distribution of the estimator is used to calculate a confidence interval

IPS 185

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

The confidence interval for the mean X̄ is obtained based on the Normal distribution N(μ, σ²):

X̄ ± Z σ/√n

In practice the standard deviation σ is not known, and one usually uses its sample estimate s instead

For a 95% confidence level, the significance level is 100% − 95% = 5%

Z is the critical value yielding a right-hand-side tail area of half the significance level

IPS 186

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Ex: for a 95% CI, one needs to remove the most extreme 2.5% from each tail of the distribution

(−Z, Z) = (−1.96, 1.96)

[Standard normal density with the central 95% region between −1.96 and 1.96]

Z = (X̄ − μ)/(σ/√n)

IPS 187

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Ex: a sample of 50 transaction values is taken at a travel agency

The mean value was 732.16 and the standard deviation was 83.14

A 95% confidence interval for the mean transaction value is

X̄ ± Z s/√n = 732.16 ± 1.96 × 83.14/√50 = 732.16 ± 23.05 = (709.11, 755.21)

We are 95% confident that the true mean transaction value lies between 709.11 and 755.21
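The same interval computed in R:

xbar <- 732.16; s <- 83.14; n <- 50
z <- qnorm(0.975)                       # 1.96
xbar + c(-1, 1) * z * s / sqrt(n)       # ≈ (709.11, 755.21)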

IPS 188

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Confidence Intervals

For a higher confidence level, more values must be included, so the interval is wider

The width of the interval thus depends on the sample size and the level of confidence

IPS 189

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

A proportion describes the fraction of a population belonging to a given sub-population

The natural estimate of a proportion given a sample of n observations is p̂ = n⁺/n

The confidence interval for a proportion is

p̂ ± Z √( p̂(1 − p̂) / n )

IPS 190

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Ex: out of a sample of 250 voters in a ward, 65 were in favour of a proposed amendment to the constitution

We have p̂ = 65/250 = 0.26, Z = 2.575 (for 99% confidence) and

p̂ ± Z √( p̂(1 − p̂)/n ) = 0.26 ± 2.575 √(0.26 × 0.74 / 250) = 0.26 ± 0.0714

This means that we can be 99% confident that the true (population) proportion of people in favour of a proposed amendment to the constitution is between 18.66% and 33.14%

IPS 191

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

When we estimate a population feature using a statistic, we do not know in advance how wide the CI will be

If the sample is too small, the interval may be too wide to be meaningful

If it is larger than needed, the interval may be unnecessarily narrow, meaning valuable resources were wasted in the process

Given an indication of the variability, we can calculate the sample size required to estimate the population feature to within a stated range (precision) with a stated level of confidence

IPS 192

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

The sample size required to estimate a mean to within an allowable error ε is

n = Z² σ² / ε²

Usually, the population standard deviation σ will not be known, but an estimate, say s, may be available e.g. from a pilot study or from similar previous studies

The achieved precision then depends on the accuracy of the previous estimate of the standard deviation

IPS 193

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Ex: a pilot sample of charges to bank customers was found to have a standard deviation of 10

What sample size is required to estimate the population mean bank charge to within 1.50 with 95% confidence?

n = (1.96 × 10 / 1.50)² = 170.74

i.e. we must take n = 171

IPS 194

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Similarly for the proportion of a sub-population:

the sample size required to estimate a proportion to within an allowable error ε with (100 − α)% confidence is determined by

n = Z² p̂(1 − p̂) / ε²

The allowable error ε is expressed as a decimal, i.e. ε ∈ (0, 1)

IPS 195

ST1051-ST3905-ST5005-ST6030

Estimation

Confidence intervals

Ex: we consider a 95% confidence interval about an estimated proportion p̂ = 0.20 to be too wide

We wish to estimate the proportion to within 4 percentage points with 95% confidence...

n = 1.96² × 0.20 × (1 − 0.20) / 0.04² = 384.16

i.e. we must take n = 385

IPS 196

ST1051-ST3905-ST5005-ST6030

Estimation

Linear regression

Regression

Let Y be a random variable and x a deterministic variable (that is, non-random)

We seek a mathematical relationship that expresses Y in terms of x

x is called the explanatory (or independent) variable; Y is called the dependent or response variable

The simplest relationship we may propose is of the form

Y = β₀ + β₁ x + ε

IPS 197

ST1051-ST3905-ST5005-ST6030

Estimation

Linear regression

For observations i = 1, ..., n:

Y_i = β₀ + β₁ x_i + ε_i

We assume the error variables ε_i are independent, with mean 0 and a variance σ² that is the same for all values of i

Note also that the x_i are fixed and the Y_i are observed

IPS 198

ST1051-ST3905-ST5005-ST6030

Estimation

Linear regression

The best estimators of β₀ and β₁, that is, the minimum variance unbiased estimators of β₀ and β₁, are obtained using the method of least squares

SS = Σ_{i=1}^n ε_i² = Σ_{i=1}^n (Y_i − β₀ − β₁ x_i)²

The estimates by least squares are the values of β₀ and β₁ that minimize the sum SS

IPS 199

ST1051-ST3905-ST5005-ST6030

Estimation

Linear regression

Setting the partial derivatives to zero:

∂SS/∂β₀ = −2 Σ_{i=1}^n (Y_i − β₀ − β₁ x_i) = 0

∂SS/∂β₁ = −2 Σ_{i=1}^n x_i (Y_i − β₀ − β₁ x_i) = 0

Solving gives

β̂₁ = Σ_{i=1}^n (x_i − x̄)(Y_i − Ȳ) / Σ_{i=1}^n (x_i − x̄)² = ( Σ_{i=1}^n x_i Y_i − n x̄ Ȳ ) / ( Σ_{i=1}^n x_i² − n x̄² )

β̂₀ = Ȳ − β̂₁ x̄

IPS 200

ST1051-ST3905-ST5005-ST6030

Estimation

Linear regression

Ex: the tensile strength of an alloy depends on the percentage of zinc it contains. We have the following data:

Zinc content x      4.7  4.8  4.9  5.0  5.1
Tensile strength Y  1.2  1.4  1.5  1.5  1.7

where x is the percentage of zinc and Y is the tensile strength

IPS 201

ST1051-ST3905-ST5005-ST6030

Estimation

Linear regression

We have x̄ = 4.9, ȳ = 1.46, Σ_{i=1}^5 x_i² = 120.15 and Σ_{i=1}^5 x_i y_i = 35.88

Then,

β̂₁ = ( Σ_{i=1}^5 x_i y_i − 5 x̄ ȳ ) / ( Σ_{i=1}^5 x_i² − 5 x̄² ) = (35.88 − 5(4.9)(1.46)) / (120.15 − 5(4.9)²) = 1.1

and

β̂₀ = ȳ − β̂₁ x̄ = 1.46 − (1.1)(4.9) = −3.93
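The same fit in R, using the zinc percentages assumed in the table above for this example:

x <- c(4.7, 4.8, 4.9, 5.0, 5.1)
y <- c(1.2, 1.4, 1.5, 1.5, 1.7)
lm(y ~ x)        # intercept ≈ -3.93, slope ≈ 1.1
# slope by hand, for comparison with the formula:
(sum(x * y) - 5 * mean(x) * mean(y)) / (sum(x^2) - 5 * mean(x)^2)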

IPS 202

ST1051-ST3905-ST5005-ST6030

Estimation

Linear regression

[Two scatterplots titled "Air quality, NYC, May-Sep 1973"; y-axes: Temperature (degrees F), ticks 60-90, and a second variable, ticks 5-20]

IPS 203

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Section VIII

Hypothesis Testing

IPS 204

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Outline

Two-sample tests

Goodness-of-fit tests

Summary

IPS 205

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

Hypothesis Testing

We know that X is an unbiased estimator

for the unknown parameter

values that are not dependent on the sample

is dependent on the sample]

from the sample, can we start out by making assumptions

about the range of possible values and then test the

assumption based on the sample?

IPS 206

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

is equal to 700?

sample of observations

NB: we cannot assume that, say, < 500, since is not the

rate of an Exponential distribution

IPS 207

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

Forming Hypotheses

values for the true unknown parameter

IPS 208

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

set represented by H0

IPS 209

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

Forms of hypothesis                                Reject H0 if

H0: θ = θ0 vs Ha: θ = θ1, θ0 < θ1                  X̄ > k

H0: θ = θ0 vs Ha: θ = θ1, θ0 > θ1                  X̄ < k

H0: θ = θ0 vs Ha: θ ≠ θ0                           X̄ < k1 or X̄ > k2

H0: θ ≤ θ0 vs Ha: θ > θ0                           X̄ > k

H0: θ = θ0 vs Ha: θ > θ0                           X̄ > k

H0: θ = θ0 vs Ha: θ < θ0                           X̄ < k

Note that

P[N(0, 1) < −1.645] = P[N(0, 1) > 1.645] = 0.05

P[N(0, 1) < −1.96] = P[N(0, 1) > 1.96] = 0.025

IPS 210

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

Errors in detection

Recall: one seeks to retain or reject a null hypothesis H0 on the

basis of evidence. Let us denote H1 the alternative hypothesis.

H0 is true H1 is true

H0 is accepted Correct decision Type II error

H1 is accepted Type I error Correct decision

IPS 211

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

Test statistic

We test the hypotheses based on the sample

not involve the unknown

Idea: if

the statistic in question, say T, is an unbiased estimator for

that weak Law of Large Numbers applies

then for large samples T will be quite close in probability to

true value

IPS 212

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

p-value

The test procedure becomes: Reject H0 if T > tc for some

unknown but computable tc

reject H0 . Now how to decide on tc ?

The p-value is the probability of obtaining a value of the test statistic at least as extreme as the one computed from the sample data, assuming H0 holds

The smaller the p-value, the less plausible H0 is, and therefore the more evidence there is against it

IPS 213

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Concepts in hypothesis testing

In the example, to set up the test:

use an unbiased estimator, e.g.T = X

= P(T > tc | = 700)

= 0.05

i.e.

P(|X| > tc | = 700) = 0.05

IPS 214

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

Let us consider the case where the hypothesized value μ0 is an upper bound on the true population mean:

H0: μ ≤ μ0

Where the population variance is known, one defines the z-test statistic for a sample of size n as

z = (x̄ − μ0) / (σ/√n)

(for a Normal population or n > 30)

Let z_α denote the 100(1 − α) percentile of the Normal distribution (e.g. P95 = 1.645)

For this upper-tail test, H0: μ ≤ μ0 is rejected when z ≥ z_α.

IPS 215

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

[Standard normal density; upper-tail rejection region: P(Z > 1.645 | H0) = 0.05]

z = (x̄ − μ0)/(σ/√n)

IPS 216

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

z-test in R

The upper-tail z-test is carried out by comparing the test statistic

z = (xbar-mu0)/(sigma/sqrt(n))

with the critical value, set e.g. for alpha=.05:

z.alpha = qnorm(1-alpha)

or, equivalently, by computing the p-value and comparing it with alpha:

pval = pnorm(z, lower.tail=FALSE)

pval > alpha   # TRUE => do not reject H0
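As a minimal end-to-end sketch (the sample x, the bound mu0 and the known sigma below are made up for illustration):

x <- c(2.15, 2.30, 2.28, 2.19, 2.25, 2.32, 2.20, 2.24)  # hypothetical data
mu0 <- 2.1                               # hypothesized upper bound on mu (illustrative)
sigma <- 0.25                            # assumed known population sd (illustrative)
n <- length(x)
z <- (mean(x) - mu0) / (sigma / sqrt(n))
alpha <- 0.05
z.alpha <- qnorm(1 - alpha)              # critical value, 1.645
pval <- pnorm(z, lower.tail = FALSE)     # upper-tail p-value
z >= z.alpha                             # TRUE would mean: reject H0 at the 5% level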

IPS 217

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

When σ is unknown, one uses the t-test instead of the z-test. The t-test statistic for a sample of size n is defined using the sample standard deviation s as

t = (x̄ − µ0) / (s / √n),   df = n − 1

Let tα denote the 100(1 − α)-th percentile of the Student t-distribution with n − 1 degrees of freedom.

For this upper-tail test, H0: µ ≤ µ0 is rejected when t ≥ tα.

IPS 218

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

For an i.i.d. Normal sample X1, ..., Xn:

X̄ ~ N(µ, σ²/n)

(X1 − X̄)² + ... + (Xn − X̄)² ~ σ² χ²(n − 1)

(X̄ − µ) / (s/√n) ~ t(n − 1)

We can then use known probabilities of the t(n − 1) distribution to derive the p-value.

IPS 219

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

A random variable is said to have a t-distribution with parameter m, also called the degrees of freedom, where m ≥ 1 is an integer, if its probability density is given by

f(x) = km (1 + x²/m)^(−(m+1)/2)

for x ∈ R, where

km = Γ((m + 1)/2) / ( Γ(m/2) √(mπ) )

and

Γ(u) = ∫₀^∞ e^(−x) x^(u−1) dx

IPS 220

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

Choose zL and zU such that P(t(n − 1) < zL) = P(t(n − 1) > zU) = 0.025.

Everything else is the same as in the known-σ case:

P( X̄ − zU s/√n < µ < X̄ − zL s/√n ) = 0.95

For the coal data, X̄ = 23.78778, s = 0.07827513, n = 23.

Using R:

qt(0.025,22) = -2.073873

qt(0.975,22) = 2.073873

so the 95% confidence interval is

( 23.79 − 2.07 × 0.08/√23 , 23.79 + 2.07 × 0.08/√23 ) ≈ (23.75, 23.82) MJ/kg
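This interval can be reproduced in R from the summary statistics quoted above (a minimal sketch):

xbar <- 23.78778; s <- 0.07827513; n <- 23
tq <- qt(0.975, df = n - 1)             # 2.073873
xbar + c(-1, 1) * tq * s / sqrt(n)      # 95% CI for mu, about (23.754, 23.822)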

IPS 221

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population mean

t-test in R

t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 0.95, ...)

t.test(1:10, y = c(7:20))                   # p-value = .00001855

t.test(1:10, y = c(7:20, 200))              # p-value = .1245

t.test(1:10, y = c(7:20), alt = "less")     # comment?

t.test(1:10, y = c(7:20), alt = "greater")  # comment?
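A one-sample test against a hypothesized mean is obtained with the mu argument; e.g., for the earlier question of whether µ = 700 (the vector lifetimes below is made up for illustration):

lifetimes <- c(712, 685, 703, 698, 720, 691, 707, 694, 699, 710)  # hypothetical data
t.test(lifetimes, mu = 700)                     # two-sided test of H0: mu = 700
t.test(lifetimes, mu = 700, alt = "greater")    # upper-tail alternative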

IPS 222

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population proportion

The null hypothesis of an upper-tail test of the population proportion is formulated as

H0: p ≤ p0

where p0 is a hypothesized upper bound on the true population proportion p.

Provided both n p0 and n(1 − p0) are > 10, the one-proportion z-test statistic is defined as

z = (p̂ − p0) / √( p0 (1 − p0) / n )

The null hypothesis is to be rejected when z ≥ zα, where zα is the 100(1 − α)-th percentile of the standard Normal distribution.

IPS 223

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, one-sided tests of the population proportion

Testing proportions in R

The upper-tail test of a population proportion is carried out by comparing the test statistic

z = (pbar-p0)/sqrt(p0*(1-p0)/n)

with the critical value, set e.g. for alpha=.05:

z.alpha = qnorm(1-alpha)

or, equivalently, by computing the p-value and comparing it with alpha:

pval = pnorm(z, lower.tail=FALSE)

pval > alpha   # TRUE => do not reject H0
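A minimal sketch with made-up counts, and a cross-check against R's built-in prop.test() (which reports the equivalent chi-square statistic z²):

x <- 62; n <- 100; p0 <- 0.5                 # hypothetical: 62 successes out of 100
pbar <- x / n
z <- (pbar - p0) / sqrt(p0 * (1 - p0) / n)
pnorm(z, lower.tail = FALSE)                 # upper-tail p-value
prop.test(x, n, p = p0, alternative = "greater", correct = FALSE)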

IPS 224

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, two-sided tests

For a two-sided test, H0 is rejected when z ≤ −z(α/2) or z ≥ z(α/2), with p-value

pval = 2 * pnorm(abs(z), lower.tail=FALSE)

In R, the two-sided t-test is obtained with

t.test(x, alternative="two.sided")

There is actually no need to specify it, as it is the default value.

IPS 225

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, two-sided tests

The two-sided test of a population proportion is carried out as follows (e.g. at the 5% significance level):

z = (pbar-p0) / sqrt(p0*(1-p0)/n)

alpha = .05

z.half.alpha = qnorm(1-alpha/2)

pval = 2 * pnorm(abs(z), lower.tail=FALSE)

IPS 226

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, two-sided tests

Can we combine these values into a confidence statement about the true gross calorific content of Osterfeld 262DE27?

The variance, or equivalently the standard deviation, is known, so we can build a standardized quantity that is free of the unknown parameter µ.

Using standardization,

(X̄ − µ) / (σ / √n) ~ N(0, 1)

IPS 227

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, two-sided tests

When a shipment of coal is traded, a number of its properties

should be known accurately, because the value of the

shipment is determined by them

One of these properties is the gross calorific value (in megajoules per kilogram, MJ/kg).

Such measurements are modelled as normal, with a standard deviation of about 0.1 MJ/kg.

Laboratories performing these measurements receive ISO certificates.

Below are the measurements obtained for a shipment of Osterfeld coal coded 262DE27.

IPS 228

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, two-sided tests

23.940 23.830 23.877 23.700 23.796 23.727 23.778 23.740

23.890 23.780 23.678 23.771 23.860 23.690 23.800

[ Interlaboratory study programme "ILS coal characterization": reported data. Technical report, NMi Van Swinden Laboratorium B.V., The Netherlands, 1996 ]

IPS 229

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, two-sided tests

P( zL < (X̄ − µ)/(σ/√n) < zU ) = P( zL σ/√n < X̄ − µ < zU σ/√n )

                              = P( X̄ − zU σ/√n < µ < X̄ − zL σ/√n )

                              = 0.95

IPS 230

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

One-sample, two-sided tests

Using the given σ = 0.1 and α = 0.05, we find the 95% CI:

( 23.788 − 1.96 × 0.1/√23 , 23.788 + 1.96 × 0.1/√23 )

i.e.

(23.747, 23.829) MJ/kg
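In R, this known-σ interval can be reproduced from the summary statistics quoted above (a minimal sketch):

xbar <- 23.788; sigma <- 0.1; n <- 23
xbar + c(-1, 1) * qnorm(0.975) * sigma / sqrt(n)   # about (23.747, 23.829) MJ/kg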

IPS 231

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Two-sample tests

Two-sample z-test

In a two-sample z-test, one compares the difference between the means of two samples with a hypothesized difference in means d0, using a test statistic of the form

z = ( (x̄1 − x̄2) − d0 ) / √( σ1²/n1 + σ2²/n2 )
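A minimal sketch, with simulated samples and the population standard deviations treated as known (all values below are illustrative):

x1 <- rnorm(40, mean = 10.2, sd = 1.5); sigma1 <- 1.5   # hypothetical sample 1
x2 <- rnorm(35, mean =  9.6, sd = 2.0); sigma2 <- 2.0   # hypothetical sample 2
d0 <- 0
z <- (mean(x1) - mean(x2) - d0) / sqrt(sigma1^2/length(x1) + sigma2^2/length(x2))
2 * pnorm(abs(z), lower.tail = FALSE)                   # two-sided p-value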

IPS 232

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Two-sample tests

Paired t-test

In a paired t-test, one compares the mean d̄ of the differences between two samples with a hypothesized difference in means d0, using a test statistic of the form

t = (d̄ − d0) / (s_d / √n),   df = n − 1

where s_d is the sample standard deviation of the differences.

Recall the synopsis for the t-test:

t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 0.95, ...)

For a paired test, x and y must have the same length. Example:

t.test(x, y, alt = "less", paired = TRUE)
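A minimal paired-data sketch (the before/after measurements below are made up for illustration):

before <- c(7.1, 6.8, 7.4, 7.0, 6.9, 7.2, 7.3, 6.7)          # hypothetical data
after  <- c(6.8, 6.9, 7.0, 6.7, 6.6, 7.1, 6.9, 6.5)
t.test(after, before, alternative = "less", paired = TRUE)    # H0: mean difference >= 0
t.test(after - before, alternative = "less", mu = 0)          # equivalent one-sample test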

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Two-sample tests

Other tests in R

F-test to compare the variances of two samples from normal

populations: var.test()

x <- rnorm(50, mean = 0, sd = 2)

y <- rnorm(30, mean = 1, sd = 1)

var.test(x, y)

Test for (zero) correlation between two samples: cor.test()

z <- rnorm(30, mean = 0, sd = 2)

cor.test(y, z)

Other tests include wilcox.test() (nonparametric test of location), ks.test() (distributional fit, e.g. Normality), ...
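For instance, a rough Normality check for the sample x above could be run as follows (with parameters estimated from the data, the ks.test p-value is only approximate; shapiro.test is a dedicated alternative):

ks.test(as.numeric(scale(x)), "pnorm")   # compare standardized x against N(0,1)
shapiro.test(x)                          # Shapiro-Wilk test of Normality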

IPS 234

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Goodness-of-fit tests

Tests of whether the data follow a given distribution are often needed.

Suppose the true probability density (or mass) function fX(x) is unknown.

We test the null hypothesis H0: fX(x) = f0(x) against the alternative hypothesis HA: fX(x) ≠ f0(x), where f0 is a given distribution function.

The chi-square goodness-of-fit test checks whether the observed class frequencies are consistent with those expected under f0(x).

IPS 235

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Goodness-of-fit tests

1 Partition the range of X into k mutually exclusive and exhaustive classes (or intervals)

2 Calculate D² = Σ_{j=1..k} (nj − mj)² / mj, where nj is the number of observations in the j-th class and mj the expected frequency under H0

3 Under H0, D² ~ χ²(k − 1 − r) (approximately), where r is the number of unknown parameters of the function f0(x) that we must estimate; H0 is rejected when D² is too large relative to this distribution

IPS 236

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Goodness-of-fit tests

Suppose we have a random sample of size n = 100 from a discrete random variable, summarized in the following table:

value          0    1    2    3   ≥4
frequency     31   33   22   12    2

We want to test whether it comes from a Poisson distribution with λ = 1, at significance level α = 0.05.

Compute the probability P[X = 0] = e^(−1) × 1^0 / 0! = 0.3678 and multiply by the sample size n to get the expected number of 0s if the sample really came from Poi(1).

Do the same for P[X = 1], P[X = 2], P[X = 3] and P[X ≥ 4], and multiply by n = 100.

IPS 237

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Goodness-of-fit tests

value              0      1      2      3     ≥4
observed freq     31     33     22     12      2
estimated freq  36.78  36.78  18.39   6.13   1.899

D² = Σ_{j=1..k} (nj − mj)² / mj
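A short R sketch of this calculation (here df = k − 1 = 4, since λ = 1 is fully specified and no parameter is estimated):

obs <- c(31, 33, 22, 12, 2)
p <- c(dpois(0:3, lambda = 1), ppois(3, lambda = 1, lower.tail = FALSE))  # P(X=0..3), P(X>=4)
m <- 100 * p                               # expected frequencies under Poi(1)
D2 <- sum((obs - m)^2 / m)                 # about 7.6
qchisq(0.95, df = 4)                       # critical value, about 9.49
pchisq(D2, df = 4, lower.tail = FALSE)     # p-value, about 0.11 => do not reject Poi(1)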

IPS 238

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Testing for significance in linear regression

Tests are based on two quantities:

the sum of squares of the x's,

SSx = Σ_{i=1..n} (xi − x̄)² = n sx²

and the sum of squared errors (or residuals),

SSE = Σ_{i=1..n} (Yi − Ŷi)²,   with   MSE = SSE / (n − 2)

IPS 239

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Testing for significance in linear regression

Regression Tests

To test H0: β0 = β0,0, use

T0 := (β̂0 − β0,0) / √( MSE Σi xi² / (n SSx) )  ~  t(n − 2)

and reject H0 when

|T0| > t(α/2, n − 2)   if HA: β0 ≠ β0,0

T0 > t(α, n − 2)       if HA: β0 > β0,0

T0 < −t(α, n − 2)      if HA: β0 < β0,0

IPS 240

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Testing for significance in linear regression

Regression Tests

To test H0: β1 = β1,0, use

T0 := (β̂1 − β1,0) / √( MSE / SSx )  ~  t(n − 2)

and reject H0 when

|T0| > t(α/2, n − 2)   if HA: β1 ≠ β1,0

T0 > t(α, n − 2)       if HA: β1 > β1,0

T0 < −t(α, n − 2)      if HA: β1 < β1,0
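In practice these t-tests are reported directly by summary() of an lm() fit; a minimal sketch with made-up data:

x <- c(1.0, 1.5, 2.0, 2.5, 3.0)     # hypothetical predictor
y <- c(2.1, 2.6, 3.2, 3.3, 4.0)     # hypothetical response
fit <- lm(y ~ x)
summary(fit)$coefficients           # estimates, std. errors, t values (df = n - 2), p-values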

IPS 241

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Testing for significance in linear regression

The total sum of squares is

SST = Σ_{i=1..n} (Yi − Ȳ)² = Σ_{i=1..n} Yi² − n Ȳ²

and the regression sum of squares is

SSR = β̂1² SSx = SST − SSE

IPS 242

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Testing for significance in linear regression

Test H0: β1 = 0 against HA: β1 ≠ 0.

Note that in this case the test statistic reduces to √( SSR / MSE ).

With SSR = 0.121, SSE = 0.011 and n = 5:

T0 = √( 0.121 / (0.011/(5 − 2)) ) = √33 = 5.744563

Since |T0| > t(0.025, 3) = 3.182, we reject the null hypothesis.
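The corresponding critical value and p-value can be checked in R:

T0 <- sqrt(0.121 / (0.011 / (5 - 2)))     # 5.744563
qt(0.975, df = 3)                         # critical value, 3.182
2 * pt(T0, df = 3, lower.tail = FALSE)    # two-sided p-value, about 0.01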

IPS 243

ST1051-ST3905-ST5005-ST6030

Hypothesis Testing

Summary

Summary

1 What we need to do a test:

1 null and alternative hypotheses

2 a test statistic T

3 a significance level α

2 A decision rule (i.e. when to reject H0)

3 Two possible errors:

1 Type I error: α = P[Reject H0 | H0 True]

2 Type II error: β = P[Accept H0 | H0 False]

It is the Type I error which is fixed, by setting it equal to the significance level α.

IPS 244
