
Elements of Probability and Statistics

Probability Theory provides the mathematical models of phenomena governed by chance. Examples of such phenomena include weather, lifetime of batteries, traffic congestion, stock exchange indices, laboratory measurements, etc. Statistical Theory provides the mathematical methods to gauge the accuracy of the probability models based on observations or data. The remaining Lectures are about this topic. "Essentially, all models are wrong, but some are useful." — George E. P. Box.

Contents

1 Sets, Experiments and Probability ......................................... 3
   1.1 Rudiments of Set Theory .............................................. 3
   1.2 Experiments .......................................................... 5
   1.3 Probability .......................................................... 7
   1.4 Conditional Probability .............................................. 11

2 Random Variables ......................................................... 15
   2.1 Discrete Random Variables and their Distributions .................... 18
       2.1.1 Discrete uniform random variables with finitely many possibilities ..... 19
       2.1.2 Discrete non-uniform random variables with finitely many possibilities . 20
       2.1.3 Discrete non-uniform random variables with infinitely many possibilities 22
   2.2 Continuous Random Variables and Distributions ........................ 25

3 Expectations ............................................................. 33

4 Tutorial for Week 1 ...................................................... 38
   4.1 Preparation Problems (Homework) ...................................... 38
   4.2 In Tutorial Problems ................................................. 39

5 Tutorial for Week 2 ...................................................... 43
   5.1 Preparation Problems (Homework) ...................................... 43
   5.2 In Tutorial Problems ................................................. 43

List of Tables

1 f(x) and F(x) for the sum of two independent tosses of a fair die RV X ... 21
2 DF Table for the Standard Normal .......................................... 47
3 Quantile Table for the Standard Normal .................................... 48


List of Figures

1 f(x) = P(x) = 1/6 and F(x) of the fair die toss RV X of Example 2.4 ....... 19
2 f(x) and F(x) of an astragali toss RV X of Example 2.6 .................... 21
3 f(x) and F(x) of RV X for the sum of two independent tosses of a fair die . 21
4 Probability density function of the volume of rain in cubic inches over the lecture theatre tomorrow ... 26
5 PDF and DF of Normal(µ, σ²) RV for different values of µ and σ² ........... 30

1 Sets, Experiments and Probability

1.1 Rudiments of Set Theory

1. A set is a collection of distinct objects or elements, and we enclose the elements by curly braces. For example, the collection of the two letters H and T is a set and we denote it by {H, T}. But the collection {H, T, T} is not a set (do you see why? think distinct!). Also, recognise that there is no order to the elements in a set, i.e. {H, T} is the same as {T, H}.

2. We give convenient names to sets. For example, we can call the set {H, T} by A and write A = {H, T} to mean it.

3. If a is an element of A, we write a ∈ A. For example, if A = {1, 2, 3}, then 1 ∈ A.

4.

If a is not 13 / A.

an element of A, we write a /

A.

For example, if A = {1, 2, 3}, then

5.

We say that a set A is a subset of a set B if every element of A is also an element of

B

and write A B. For example, {1, 2} ⊆ {1, 2, 3, 4}.

 

6.

We say that a set A is not a subset of a set B if at least one element of A is not an element of B and write A B. For example, {1, 2} is not a subset of {1, 3, 4} since

2

{1, 2} but 2 / {1, 3, 4} and write {1, 2} {1, 2, 3, 4} to mean this.

 

7.

We say a set A is equal to a set B and write A = B if and only if A B and B A

8.

The union A B of A and B consists of elements that are in A or in B or in both A

and B. For

example, if A = {1, 2} and B = {3, 2} then A B = {1, 2, 3}.

9.

The intersection A B of A and B consists of elements that are in both A and B. For example, if A = {1, 2} and B = {3, 2} then A B = {2}.

10.

The empty set contains no elements and it is the collection of nothing. It is denoted by = {}.

11.

Given some universal set, say Ω, the Greek letter Omega, the Complement of a set

A

denoted by A c

is the set of all elements in Ω that are not in A.

For example, if

Ω = {H, T} and A = {H} then A c = {T}.

Note that for any set A Ω:

 

A c A = ,

A A c = Ω,

c = ,

c = Ω

.

12.

When we have more than two sets, we can define unions and intersections similarly. The union of m sets

m

j=1

A j = A 1 A 2 ∪ ··· ∪ A m

3

consists of elements that are in at least one of the m sets A 1 , A 2 , union of infinitely many sets

j=1

A j = A 1 A 2 ∪ ··· ∪ ···

, A m , and the

consists of elements that are in at least one of the sets A 1 , A 2 , Similarly, the intersection

m

j=1

A j = A 1 A 2 ∩ ··· ∩ A m

of m sets consists of elements that are in each of the m sets and the intersection of infinitely many sets

j=1

A j = A 1 A 2 ∩ ···

consists of elements that are in each of the infinitely many sets.
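As an aside, these set operations map directly onto Python's built-in set type. Here is a minimal sketch (not part of the original notes) using the sets A = {1, 2} and B = {3, 2} from items 8 and 9, with an assumed universal set Omega = {1, 2, 3, 4}:

```python
# Set operations of items 8-11 via Python's built-in set type.
A = {1, 2}
B = {3, 2}
Omega = {1, 2, 3, 4}             # assumed universal set containing A and B

print(A | B)                     # union A ∪ B -> {1, 2, 3}
print(A & B)                     # intersection A ∩ B -> {2}
print(Omega - A)                 # complement A^c relative to Omega -> {3, 4}
print(A <= Omega)                # subset test A ⊆ Omega -> True
print({'H', 'T'} == {'T', 'H'})  # sets are unordered -> True
```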

Exercise 1.1 Let Ω = {1, 2, 3, 4, 5, 6}, A = {1, 3, 5} and B = {2, 4, 6}. By using the definitions of sets and set operations find the following sets:

A^c =        B^c =        Ω^c =        ∅^c =        {1}^c = {       }
A ∪ B =      A ∩ B =      A ∪ Ω =      A ∩ Ω =
B ∪ Ω =      B ∩ Ω =      A ∩ A^c =    B ∪ B^c =    etc.

Venn diagrams are visual aids for set operations.

Example 1.1 For three sets A, B and C, the Venn diagrams for A ∪ B, A ∩ B and A ∩ B ∩ C are:

[Venn diagrams within the universal set Ω: (a) A ∪ B, (b) A ∩ B, (c) A ∩ B ∩ C]

Exercise 1.2 Let A = {1, 3, 5, 7, 9, 11}, B = {1, 2, 3, 5, 8, 13} and C = {1, 2, 4, 8, 16, 32} denote three sets. Let us use a Venn diagram to visualise these three sets and their intersections. Can you mark which sets correspond to A, B and C in the figure below?

[Venn diagram of three overlapping sets]

1.2 Experiments

Definition 1.1 An experiment is an activity or procedure that produces distinct, well-defined possibilities called outcomes. The set of all outcomes is called the sample space and is denoted by Ω, the upper-case Greek letter Omega. We denote a typical outcome in Ω by ω, the lower-case Greek letter omega, and a typical sequence of possibly distinct outcomes by ω_1, ω_2, ω_3, …

Example 1.2 Ω = {Defective, Non-defective} if our experiment is to inspect a light bulb.

Example 1.3 Ω = {Heads, Tails} if our experiment is to note the outcome of a coin toss.

In Examples 1.2 and 1.3, Ω only has two outcomes and we can refer to the sample space of such two-outcome experiments generically as Ω = {ω_1, ω_2}. For instance, the two outcomes of Example 1.2 are ω_1 = Defective and ω_2 = Non-defective, while those of Example 1.3 are ω_1 = Heads and ω_2 = Tails.

Example 1.4 If our experiment is to roll a die whose faces are marked with the six numerical symbols or numbers 1, 2, 3, 4, 5, 6 then there are six outcomes corresponding to the number that shows on the top. Thus, the sample space Ω for this experiment is {1, 2, 3, 4, 5, 6}.


Exercise 1.3 Suppose our experiment is to observe whether it will rain or shine tomorrow. What is the sample space for this experiment? Answer: Ω = {          }.

The subsets of Ω are called events. The outcomes ω_1, ω_2, …, when seen as subsets of Ω, such as {ω_1}, {ω_2}, …, are simple events.

Example 1.5 In our roll a die experiment of Example 1.4 with Ω = {1, 2, 3, 4, 5, 6}, the set of odd numbered outcomes A = {1, 3, 5} or the set of even numbered outcomes B = {2, 4, 6} are examples of events. The simple events are {1}, {2}, {3}, {4}, {5}, and {6}.

Example 1.6 Consider a generic die-tossing experiment by a human experimenter. Clearly, Ω = {          }, and A = {       }, B = {       } and C = {ω_3} are examples of events. This experiment could correspond to rolling a die whose faces are:

1. sprayed with six different scents (nose!), or

2. studded with six distinctly flavoured candies (tongue!), or

3. contoured with six distinct bumps and pits (touch!), or

4. acoustically discernible at six different frequencies (ears!), or

5. painted with six different colours (eyes!), or

6. marked with six different numbers 1, 2, 3, 4, 5, 6 (eyes!), or ⋯

This example is meant to concretely convince you that an experiment’s sample space is merely a collection of distinct elements called outcomes and these outcomes have to be discernible in some well-specified sense to the experimenter!

Definition 1.2 A trial is a single performance of an experiment and it results in an outcome.

Example 1.7 We call a single roll of a die a trial.

Example 1.8 We call a single toss of a coin a trial.

An experimenter often performs more than one trial. Repeated trials of an experiment form the basis of science and engineering, as the experimenter learns about the phenomenon by repeatedly performing the same mother experiment with possibly different outcomes. This repetition of trials in fact provides the very motivation for the definition of probability in § 1.3.

Definition 1.3 An n-product experiment is obtained by repeatedly performing n trials of a mother experiment.


Example 1.9 Suppose we toss a coin twice by performing two trials of the coin toss experiment of Example 1.3 and use the short-hand H and T to denote the outcomes of Heads and Tails, respectively. Then our sample space is Ω = {HH, HT, TH, TT}. Note that this is the 2-product experiment of the coin toss mother experiment.
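To make the structure of an n-product experiment concrete, its sample space can be enumerated as an n-fold Cartesian product of the mother experiment's sample space. A short illustrative sketch (not from the notes):

```python
from itertools import product

# Sample space of the coin-toss mother experiment of Example 1.3.
omega = ['H', 'T']

# The 2-product experiment's sample space is the 2-fold Cartesian
# product, exactly the four outcomes listed in Example 1.9.
omega_2 = [''.join(w) for w in product(omega, repeat=2)]
print(omega_2)   # ['HH', 'HT', 'TH', 'TT']

# Setting repeat=3 would enumerate the 3-product sample space
# asked for in Exercise 1.5.
```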

Exercise 1.4 What is the event that at least one Heads occurs in the 2-product experiment of Example 1.9, i.e., tossing a fair coin twice?

Exercise 1.5 What is the sample space of the 3-product experiment of the coin toss exper- iment, i.e., tossing a fair coin thrice?

Definition 1.4 An ∞-product experiment is defined as

lim_{n→∞} n-product experiment of some mother experiment.

Remark 1.5 Loosely speaking, a set that can be enumerated or tagged uniquely by the natural numbers N = {1, 2, 3, …} is said to be countably infinite or to contain countably many elements. Some examples of such sets include any finite set, the set of natural numbers N = {1, 2, 3, …}, the set of non-negative integers {0, 1, 2, 3, …}, the set of all integers Z = {…, −3, −2, −1, 0, 1, 2, 3, …} and the set of all rational numbers Q = {p/q : p, q ∈ Z, q ≠ 0}, but the set of real numbers R = (−∞, ∞) is uncountably infinite.

Example 1.10 The sample space Ω of the ∞-product experiment of tossing a coin infinitely many times has uncountably infinitely many elements and is in bijection with all binary numbers in the unit interval [0, 1]: just replace H with 1 and T with 0. We cannot enumerate all outcomes in Ω but can show some outcomes:

Ω = {HHHH⋯, HTHH⋯, THHH⋯, TTHH⋯, …, TTTT⋯, HTTT⋯, THTT⋯, HHTT⋯, …}.

1.3 Probability

Definition 1.6 Probability is a function P that assigns real numbers to events and satisfies the following four Axioms:

Axiom (1): for any event A, 0 ≤ P(A) ≤ 1.

Axiom (2): if Ω is the sample space then P(Ω) = 1.

Axiom (3): if A and B are disjoint, i.e., A ∩ B = ∅, then

P(A ∪ B) = P(A) + P(B).

Axiom (4): if A_1, A_2, … is an infinite sequence of pairwise disjoint or mutually exclusive events, i.e., A_i ∩ A_j = ∅ whenever i ≠ j, then

P(⋃_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} P(A_i).

These axioms are merely assumptions that are justified and motivated by the frequency interpretation of probability in n-product experiments as n tends to infinity, which states that if we repeat an experiment a large number of times then the fraction of times the event A occurs will be close to P(A). To be precise, if we let N(A, n) be the number of times A occurs in the first n trials, then

P(A) = lim_{n→∞} N(A, n)/n.

Given P(A) = lim_{n→∞} N(A, n)/n, Axiom (1) simply affirms that the fraction of times a given event A occurs must be between 0 and 1. If Ω has been defined properly to be the set of ALL possible outcomes, then Axiom (2) simply affirms that the fraction of times something in Ω happens is 1. To explain Axiom (3), note that if A and B are disjoint then

N(A ∪ B, n) = N(A, n) + N(B, n)

since A ∪ B occurs if either A or B occurs but it is impossible for both to occur. Dividing both sides of the previous equality by n and letting n → ∞, we arrive at Axiom (3). Axiom (3) implies that Axiom (4) holds for a finite number of sets. In many cases the sample space is finite, so Axiom (4) is not relevant or necessary. Axiom (4) is a new assumption for infinitely many sets, as it does not simply follow from Axiom (3). Axiom (4) is more difficult to motivate, but without it the theory of probability becomes more difficult and less useful, so we will impose this assumption on utilitarian grounds.
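This limiting relative frequency can be watched numerically. The sketch below (an illustration assuming a fair coin, not part of the notes) estimates P(A) for the event A = {Heads} by N(A, n)/n for growing n:

```python
import random

# Relative frequency N(A, n)/n of the event A = {Heads} in n simulated
# tosses of a coin assumed fair, i.e. P(Heads) = 1/2.
random.seed(1)
for n in [10, 100, 10_000, 1_000_000]:
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)
# The printed fractions approach P(A) = 0.5 as n grows, illustrating
# P(A) = lim_{n -> infinity} N(A, n)/n.
```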

The following three Theorems are merely properties of probability.

Theorem 1.7 Complementation Rule. The probability of an event A and its complement A^c in a sample space Ω satisfy

P(A^c) = 1 − P(A).    (1)

Proof: By the definition of complement, we have Ω = A ∪ A^c and A ∩ A^c = ∅. Hence by Axioms (2) and (3),

1 = P(Ω) = P(A) + P(A^c), thus P(A^c) = 1 − P(A).


Example 1.11 Recall the coin toss experiment of Example 1.3 with Ω = {Heads, Tails}. Suppose that our coin happens to be fair with P(Heads) = 1/2. Since {Tails}^c = {Heads}, we can apply the complementation rule to find the probability of observing a Tails from P(Heads) as follows:

P(Tails) = 1 − P(Heads) = 1 − 1/2 = 1/2.

Theorem 1.8 Addition Rule for Mutually Exclusive Events. For mutually exclusive or pairwise disjoint events A_1, …, A_m in a sample space Ω,

P(A_1 ∪ A_2 ∪ A_3 ∪ ⋯ ∪ A_m) = P(A_1) + P(A_2) + P(A_3) + ⋯ + P(A_m).    (2)

Proof: This is a consequence of applying Axiom (3) repeatedly:

P(A_1 ∪ A_2 ∪ A_3 ∪ ⋯ ∪ A_m) = P(A_1 ∪ (A_2 ∪ ⋯ ∪ A_m)) = P(A_1) + P(A_2 ∪ (A_3 ∪ ⋯ ∪ A_m))
  = P(A_1) + P(A_2) + P(A_3 ∪ ⋯ ∪ A_m) = ⋯ = P(A_1) + P(A_2) + P(A_3) + ⋯ + P(A_m).

Example 1.12 Let us observe the number on the first ball that pops out in a New Zealand Lotto trial. There are forty balls labelled 1 through 40 for this experiment and so the sample space is Ω = {1, 2, 3, …, 39, 40}. Because the balls are vigorously whirled around inside the Lotto machine before the first one pops out, we can model each ball to pop out first with the same probability. So, we assign each outcome ω ∈ Ω the same probability of 1/40, i.e., our probability model for this experiment is:

P(ω) = 1/40, for each ω ∈ Ω = {1, 2, 3, …, 39, 40}.

NOTE: we sometimes abuse notation and write P(ω) instead of the more accurate but cumbersome P({ω}) when writing down probabilities of simple events. Now, let's check if Axiom (1) is satisfied for simple events in our model for this Lotto experiment:

0 ≤ P(1) = P(2) = ⋯ = P(40) = 1/40 ≤ 1.

Is Axiom (3) satisfied? For example, for the disjoint simple events {1} and {2},

P({1, 2}) = P({1} ∪ {2}) = P({1}) + P({2}) = 1/40 + 1/40 = 2/40.

Is Axiom (2) satisfied? Yes, by Equation (2) of the addition rule for mutually exclusive events (Theorem 1.8):

P(Ω) = P({1, 2, …, 40}) = P(⋃_{i=1}^{40} {i}) = Σ_{i=1}^{40} P(i) = 1/40 + 1/40 + ⋯ + 1/40 = 1.

[Figure: (a) 1114 NZ Lotto draw frequency from 1987 to 2008; (b) 1114 NZ Lotto draw relative frequency from 1987 to 2008.]
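In the same spirit as the figure, the uniform model itself can be simulated; the sketch below uses freshly simulated draws (assumed uniform on {1, …, 40}), not the historical 1987 to 2008 data:

```python
import random
from collections import Counter

# Relative frequency of each ball popping out first, under the
# model P(omega) = 1/40, across n simulated trials.
random.seed(2)
n = 100_000
counts = Counter(random.randint(1, 40) for _ in range(n))
freqs = sorted(c / n for c in counts.values())
print(freqs[0], 1 / 40, freqs[-1])   # all 40 frequencies hover near 0.025
```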

Recommended Activity 1.1 Explore the following web sites to learn more about NZ and British Lotto. The second link has animations of the British equivalent of NZ Lotto. http://lotto.nzpages.co.nz/

Theorem 1.9 Addition Rule for Two Arbitrary Events. For events A and B in a sample space,

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).    (3)

Proof:

P(A ∪ B) = P(A ∪ (B ∩ A^c))
         = P(A) + P(B ∩ A^c)    by Axiom (3) and disjointness
         = P(A) + P(B) − P(A ∩ B).

The last equality P(B ∩ A^c) = P(B) − P(A ∩ B) is due to Axiom (3) and the disjoint union B = (B ∩ A^c) ∪ (A ∩ B), giving P(B) = P(B ∩ A^c) + P(A ∩ B). It is easy to see this with a Venn diagram.

Exercise 1.6 In English language text, the twenty six letters of the alphabet occur with the following frequencies:

E 13%     R 7.7%    A 7.3%    H 3.5%    F 2.8%    M 2.5%    W 1.6%    X 0.5%    J 0.2%
T 9.3%    O 7.4%    S 6.3%    L 3.5%    P 2.7%    Y 1.9%    V 1.3%    K 0.3%    Z 0.1%
N 7.8%    I 7.4%    D 4.4%    C 3%      U 2.7%    G 1.6%    B 0.9%    Q 0.3%

Suppose you pick one letter at random from a randomly chosen English book from our central library, with Ω = {A, B, C, …, Z} (ignoring upper/lower case). What is the probability of each of the following events?

(a) P({Z}) =

(b) What is the most likely outcome?

(c) P('picking any letter') = P(Ω) =

(d) P({E, Z}) =          by Axiom (3)

(e) P('picking a vowel') =          by Equation (2) of the addition rule for mutually exclusive events (Theorem 1.8).

(f) P('picking any letter in the word WAZZZUP') =          by Equation (2) of the addition rule for mutually exclusive events (Theorem 1.8).

(g) P('picking any letter in the word WAZZZUP or a vowel') =          = 42.2% by Equation (3) of the addition rule for two arbitrary events (Theorem 1.9).

1.4 Conditional Probability

Conditional probability allows us to make decisions from partial information about an experiment.

Definition 1.10 The probability of an event B under the condition that an event A occurs is called the conditional probability of B given A and is denoted by P(B|A). In this case A serves as a new (reduced) sample space, and that probability is the fraction of P(A) which corresponds to A ∩ B. Thus,

P(B|A) = P(A ∩ B) / P(A),   if P(A) ≠ 0.    (4)

Similarly, the conditional probability of A given B is

P(A|B) = P(A ∩ B) / P(B),   if P(B) ≠ 0.    (5)
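To see Equations (4) and (5) in action, the following sketch evaluates both conditional probabilities on the fair-die sample space; the events A (an even outcome) and B (an outcome of at least 4) are illustrative choices, not taken from the notes:

```python
from fractions import Fraction

# Fair-die sample space with equally likely outcomes.
Omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even outcome (illustrative event)
B = {4, 5, 6}   # outcome of at least 4 (illustrative event)

def P(E):
    """Probability of event E under equally likely outcomes."""
    return Fraction(len(E), len(Omega))

print(P(A & B) / P(A))   # P(B|A) = (2/6)/(3/6) = 2/3
print(P(A & B) / P(B))   # P(A|B) = (2/6)/(3/6) = 2/3
```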

Conditional probability is a probability, and therefore all four Axioms of probability also hold for the conditional probability of events, given that the conditioning event A has P(A) > 0.

Axiom (1): for any event B, 0 ≤ P(B|A) ≤ 1.

Axiom (2): P(Ω|A) = 1.

Axiom (3): for any two disjoint events B_1 and B_2, P(B_1 ∪ B_2|A) = P(B_1|A) + P(B_2|A).

Axiom (4): for mutually exclusive or pairwise disjoint events B_1, B_2, …,

P(B_1 ∪ B_2 ∪ ⋯ |A) = P(B_1|A) + P(B_2|A) + ⋯.

Note that the complementation and addition rules also follow for conditional probability.

1. complementation rule for conditional probability:

P(B|A) = 1 − P(B^c|A).    (6)

2. addition rule for two arbitrary events B_1 and B_2:

P(B_1 ∪ B_2|A) = P(B_1|A) + P(B_2|A) − P(B_1 ∩ B_2|A).    (7)

Theorem 1.11 Multiplication Rule. If A and B are events with P(A) ≠ 0 and P(B) ≠ 0, then

P(A ∩ B) = P(A)P(B|A) = P(B)P(A|B).    (8)

Proof: Solving for P(A ∩ B) in Definitions (4) and (5) of conditional probability, we obtain Equation (8) of the above theorem.

Example 1.13 Suppose the NZ All Blacks team is playing in a four-team Rugby tournament. In the first round they have a tough opponent that they will beat 40% of the time, but if they win that game they will play against an easy opponent where their probability of success is 0.8. What is the probability that they will win the tournament?

If A and B are the events of victory in the first and second games, respectively, then P(A) = 0.4 and P(B|A) = 0.8, so by the multiplication rule, the probability that they will win the tournament is:

P(A ∩ B) = P(A)P(B|A) = 0.4 × 0.8 = 0.32.
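A quick simulation of Example 1.13 (a minimal sketch assuming the stated probabilities 0.4 and 0.8) confirms the multiplication rule numerically:

```python
import random

# Simulate the tournament: the first game is won with probability 0.4
# and, given a first win, the second with conditional probability 0.8.
random.seed(3)
n = 100_000
wins = sum(
    1 for _ in range(n)
    if random.random() < 0.4 and random.random() < 0.8
)
print(wins / n)   # close to P(A)P(B|A) = 0.32
```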

Exercise 1.7 In Example 1.13, what is the probability that the All Blacks will win the first game but lose the second?

Definition 1.12 Independent events. If events A and B are such that

P(A ∩ B) = P(A)P(B),

they are called independent events. Assuming P(A) ≠ 0 and P(B) ≠ 0, we have P(A|B) = P(A) and P(B|A) = P(B). This means that the probability of A does not depend on the occurrence or nonoccurrence of B, and conversely. This justifies the term "independent".


Example 1.14 Suppose you toss a fair coin twice such that the first toss is independent of the second. Then,

P(HT) = P(Heads on the first toss ∩ Tails on the second toss) = P(H)P(T) = 1/2 × 1/2 = 1/4.

Similarly, P(HH) = P(TH) = P(TT) = 1/2 × 1/2 = 1/4. Thus, P(ω) = 1/4 for every ω in the sample space Ω = {HT, HH, TH, TT}.

Accordingly, three events A, B, C are independent if and only if

P(A ∩ B) = P(A)P(B),   P(B ∩ C) = P(B)P(C),   P(C ∩ A) = P(C)P(A),
P(A ∩ B ∩ C) = P(A)P(B)P(C).

Example 1.15 Suppose you independently toss a fair die thrice. What is the probability of getting an even outcome in all three trials? Let E_i be the event that the outcome is an even number on the i-th trial. Then, the probability of getting an even number in all three trials is:

P(E_1 ∩ E_2 ∩ E_3) = P(E_1)P(E_2)P(E_3) = (P({2, 4, 6}))^3 = (P({2} ∪ {4} ∪ {6}))^3
  = (P({2}) + P({4}) + P({6}))^3 = (1/6 + 1/6 + 1/6)^3 = (3/6)^3 = (1/2)^3 = 1/8.
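This answer can also be checked by simulation; a minimal sketch (not part of the notes):

```python
import random

# Simulate Example 1.15: roll a fair die three times, independently,
# and count how often all three outcomes are even.
random.seed(4)
n = 100_000
hits = sum(
    1 for _ in range(n)
    if all(random.randint(1, 6) % 2 == 0 for _ in range(3))
)
print(hits / n)   # close to (1/2)^3 = 1/8 = 0.125
```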

Definition 1.13 Independence of n Events. Similarly, n events A_1, …, A_n are called independent if

P(A_1 ∩ ⋯ ∩ A_n) = P(A_1)P(A_2) ⋯ P(A_n),

and, as in the three-event case above, the same product rule holds for every sub-collection of the n events.

Example 1.16 Suppose you toss a fair coin independently m times. Then each of the 2^m possible outcomes in the sample space Ω has equal probability of 1/2^m due to independence.

Theorem 1.14 Total probability theorem. Suppose B_1, B_2, …, B_n is a sequence of events with positive probability that partition the sample space, i.e., B_1 ∪ B_2 ∪ ⋯ ∪ B_n = Ω and B_i ∩ B_j = ∅ for i ≠ j. Then

P(A) = Σ_{i=1}^{n} P(A ∩ B_i) = Σ_{i=1}^{n} P(A|B_i)P(B_i).    (9)

Proof: The first equality is due to the addition rule for the mutually exclusive events A ∩ B_1, A ∩ B_2, …, A ∩ B_n, and the second equality is due to the multiplication rule.


Exercise 1.8 A well-mixed urn contains five red and ten black balls. We draw two balls from the urn without replacement. What is the probability that the second ball drawn is black?

Theorem 1.15 Bayes theorem.

P(A|B) = P(A)P(B|A) / P(B).    (10)

Proof: The proof is a consequence of the definition of conditional probability and the multiplication rule:

P(A|B) = P(A ∩ B)/P(B) = P(B ∩ A)/P(B) = P(B|A)P(A)/P(B) = P(A)P(B|A)/P(B).
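In applications, the denominator P(B) in Equation (10) is typically supplied by the total probability theorem (Theorem 1.14) with the partition {A, A^c}. A minimal sketch with hypothetical numbers (deliberately not those of Exercise 1.9 below, which is left to you):

```python
# Bayes theorem with P(B) computed by the total probability theorem:
# P(A|B) = P(A)P(B|A) / (P(B|A)P(A) + P(B|A^c)P(A^c)).
# The numbers below are hypothetical, chosen only for illustration.
P_A = 0.02            # P(A): prior probability of the condition A
P_B_given_A = 0.95    # P(B|A): probability of a positive signal given A
P_B_given_Ac = 0.05   # P(B|A^c): false positive probability

P_B = P_B_given_A * P_A + P_B_given_Ac * (1 - P_A)   # total probability
print(P_A * P_B_given_A / P_B)                       # P(A|B), about 0.28
```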

Exercise 1.9 Approximately 1% of women aged 40–50 have breast cancer. A woman with breast cancer has a 90% chance of a positive test from a mammogram, while a woman without breast cancer has a 10% chance of a false positive result from the test. What is the probability that a woman indeed has breast cancer given that she just had a positive test?


2 Random Variables

We are used to traditional variables, such as x as an "unknown" in the equation

x + 3 = 7,

where we can solve for x = 7 − 3 = 4. Another example is to use traditional variables to represent geometric objects such as a line:

y = 3x − 2,

where the variable y for the y-axis is determined by the value taken by the variable x, as x varies over the real line R = (−∞, ∞). The variables we have used to represent sequences such as:

{a_n}_{n=1}^∞ = a_1, a_2, a_3, …,

are also traditional. When we wrote functions of a variable, such as x, in:

f(x) = x/(x + 1), for x ≥ 0,

the argument x is also a traditional variable. In fact, all of Calculus you have been taught is by means of such traditional variables.

Question: What is common to all these variables above, such as x, y, a_1, a_2, a_3, …, f(x)?

Answer: They are instances of deterministic variables; that is, these traditional variables take a fixed or deterministic value when we can solve for them.

We need a new kind of variable to deal with real-world situations where the same variable may take different values in a non-deterministic manner. Random variables do this job for us. Random variables, unlike traditional deterministic variables, can take a bunch of different values! In fact, random variables are actually functions! They take you from the "world of random processes and phenomena" to the world of real numbers. In other words, a random variable is a numerical value determined by the outcome of the experiment.


Definition 2.1 A Random variable or RV is a function from the sample space Ω to the set of real numbers R:

X(ω) : Ω → R,

such that, for every real number x, the corresponding set {ω ∈ Ω : X(ω) ≤ x}, i.e. the set of outcomes whose numerical value is less than or equal to x, is an event. The probability of such events is given by the function F(x) : R → [0, 1], called the distribution function or DF of the random variable X:

F(x) = P(X ≤ x) = P({ω : X(ω) ≤ x}), for any x ∈ R.    (11)

NOTE: Distribution function or DF is sometimes called cumulative distribution function or CDF in pre-calculus treatments of the subject. We will avoid the CDF nomenclature in our treatment.

Example 2.1 Recall the rain or shine experiment of Exercise 1.3 with sample space Ω = {rain, shine}. We can associate a random variable X with this experiment as follows:

X(ω) = 1 if ω = rain, and X(ω) = 0 if ω = shine.

Thus, X is 1 if it will rain tomorrow and 0 otherwise. Note that another equally valid discrete random variable, say Y, for this experiment is:

Y(ω) = π if ω = rain, and Y(ω) = 2 if ω = shine.

A random variable can be chosen to assign each outcome ω ∈ Ω to any real number as the experimenter desires.
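Since a random variable is literally a function on Ω, it can be written down as one; the sketch below encodes the two RVs of Example 2.1 as mappings (an illustration, not from the notes):

```python
import math

# The two RVs of Example 2.1 for the rain/shine experiment, written as
# explicit mappings from the sample space Omega = {rain, shine} to R.
X = {'rain': 1, 'shine': 0}
Y = {'rain': math.pi, 'shine': 2}

omega = 'rain'               # one particular outcome of a trial
print(X[omega], Y[omega])    # 1 3.141592653589793
```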

Recall the experiments of Example 1.6 that involved smelling, tasting, touching, hearing, or seeing to discern between outcomes. It becomes very difficult to communicate, process and make decisions based on outcomes of experiments that are discerned in this manner, and even more difficult to record them unambiguously. This is where real numbers can give us a helping hand. Data are typically random variables that act as numerical placeholders for outcomes of an experiment about some real-world random process or phenomenon. We said that a random variable can take one of many values, but we cannot be certain of which value it will take. However, we can make probabilistic statements about the value x that the random variable X will take.

Theorem 2.2 The probability that the RV X takes a value x in the half-open interval (a, b], i.e., a < x ≤ b, is:

P(a < X ≤ b) = F(b) − F(a).    (12)


Proof: Since the events (X ≤ a) = {ω : X(ω) ≤ a} and (a < X ≤ b) = {ω : a < X(ω) ≤ b} are mutually exclusive or disjoint events whose union is the event (X ≤ b) = {ω : X(ω) ≤ b}, by Axiom (3) of Definition 1.6 of probability and by Equation (11) in Definition 2.1 of the DF,

F(b) = P(X ≤ b) = P(X ≤ a) + P(a < X ≤ b) = F(a) + P(a < X ≤ b),

and thus P(a < X ≤ b) = F(b) − F(a).
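As a concrete check of Theorem 2.2, the sketch below encodes the DF of a fair-die toss RV (assumed uniform on {1, …, 6}) and computes an interval probability via F(b) − F(a):

```python
from fractions import Fraction

# DF of a fair-die toss RV X, assumed uniform on {1, ..., 6}:
# F(x) = P(X <= x) equals int(x)/6 for 0 <= x <= 6, clipped to [0, 1].
def F(x):
    return Fraction(min(max(int(x), 0), 6), 6)

# Theorem 2.2: P(a < X <= b) = F(b) - F(a), e.g. P(2 < X <= 5):
print(F(5) - F(2))   # 3/6 = 1/2, the probability of {3, 4, 5}
```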