Epoka University
CEN CE ECE
Probability and Statistics for Engineers
Dr. Julian Hoxha
a.y. 2016/2017
Prerequisite/Textbook
Textbook is required:
Papoulis, A., & Pillai, S. U. (2002). Probability,
Random Variables, and Stochastic Processes. 4th
Edition. Tata McGraw-Hill Education.
Grading Policy
Assignments: 10% (2 assignments; each student must hand in one copy)
Lectures
Objective
The goal of the course is to introduce
probabilistic modeling and its role in solving
engineering problems.
It provides a foundation in the theory and
applications of probability and stochastic
processes and an understanding of the
mathematical techniques relating to random
processes.
It forms the basis for understanding random
processes in the areas of signal processing,
detection, estimation, and communication.
Probability and Statistics for Engineers
Lectures
Approach: how to do well in this course
Contents
History and overview (slides adapted from Prof. Hisashi Kobayashi's blog)
Meaning of Probability
The axioms of probability
Repeated trials
Concept of Random Variables (C.R.V)
Function of one random variable
Characterization of a random variable
Two random variables
Function of two random variables
Sequences of random variables
Introduction to statistics
Stochastic process
Markov chain
Bayesian statistical inference
Introduction
Why study probability, random processes, and statistical analysis?
Motivations/Applications
- Communication, information, and control systems
- Signal processing
Meaning of probability
The theory of probability deals with averages of mass phenomena occurring sequentially or simultaneously: electron emission, telephone calls, radar detection, quality control, etc.
In repeated experiments, the averages may exhibit statistical regularity and may converge as more trials are made.
Probability and statistics provide a mathematical model for studying such random phenomena.
Certain averages approach a constant value as the number of observations increases.
Using this approach, we define the probability of an event in terms of its frequency of occurrence, that is, the percentage of successes in a large number of observations.
Example: in the coin experiment, the percentage of heads approaches 0.5.
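This statistical regularity is easy to see in a quick simulation (a Python sketch, not part of the course material; the number of tosses is illustrative):

```python
import random

random.seed(0)

# Simulate n fair-coin tosses and track the relative frequency of heads.
# As n grows, n_heads / n settles near P(heads) = 0.5.
n = 100_000
n_heads = sum(random.random() < 0.5 for _ in range(n))
rel_freq = n_heads / n
print(f"relative frequency of heads after {n} tosses: {rel_freq:.4f}")
```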
The purpose of theory is to predict and describe such
averages in terms of probabilities of events.
The probability of an event (an event is a collection or a set
of outcomes) A is a number P(A) assigned to this event.
If the experiment is performed n times (with n sufficiently large) and the event A occurs n_A times, then with a high degree of certainty the relative frequency n_A/n of the occurrence of A is close to P(A):
P(A) ≈ n_A / n
In the applications of probability to real problems, we assume that
probabilities satisfy certain axioms, and by deductive reasoning we
determine from the probabilities P(Ai) of certain events Ai the probabilities
P(Bj) of other events Bj.
Example: if you roll a fair die, the probability of the event "even" equals 3/6 = 1/2.
Random: an experiment whose outcome cannot be predicted with certainty.
Probability: a numerical measure, between 0 and 1, of how likely an event is to occur.
Probability theory: the mathematical framework for analyzing random experiments.
Examples
1. Tossing a coin: outcomes S = {Head, Tail}
2. Rolling a die: outcomes S = {1, 2, 3, 4, 5, 6}
An Event E
An event, E, is any subset of the sample space S, i.e. any set of outcomes (not necessarily all outcomes) of the random phenomenon. (Events are commonly illustrated with a Venn diagram.)
Union or sum
Let A and B be two events; the union of A and B is the event denoted by A ∪ B or A + B. This is the set whose elements are all elements of A, of B, or of both.
Intersection or product
Let A and B be two events; the intersection of A and B is the event denoted by A ∩ B or AB. This is the set consisting of all elements that are common to A and B.
Regardless of their number, different elements of the sample space S should be distinct and mutually exclusive, so that when the experiment is carried out there is a unique outcome.
A probability measure is an assignment of real numbers to the events defined on S.
The set of properties that the assignment must satisfy are called the axioms of probability.
These two properties give a minimum set of conditions for F to be a field; all the other properties follow.
If, in addition, A_n ∈ F implies ⋃_{n=1}^∞ A_n ∈ F, the field F is called a σ-field.
If a set has n elements, then the total number of its subsets consisting of k elements each equals the binomial coefficient
C(n, k) = n! / (k! (n − k)!)
Consider the coin experiment, where the probability of heads {h} is p and of tails {t} is q = 1 − p.
Suppose we toss the coin n times; we obtain a new space S^n = S × ⋯ × S consisting of 2^n elements of the form s1 s2 ⋯ sn, where each s_i = h or t.
Assuming that the experiments are independent, we have
P(s1 s2 ⋯ sn) = P(s1) P(s2) ⋯ P(sn) = p^k q^(n−k),
where k is the number of heads in the sequence.
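Under the stated independence assumption, the probability of one particular sequence with k heads, and of the event "exactly k heads in any order", can be checked numerically (illustrative values of n, k, p):

```python
from math import comb

# One particular sequence with k heads in n independent tosses has
# probability p**k * q**(n - k); there are comb(n, k) such sequences.
p, n, k = 0.5, 10, 4
q = 1 - p
seq_prob = p**k * q**(n - k)          # one specific sequence
total_prob = comb(n, k) * seq_prob    # any sequence with exactly k heads
print(seq_prob, total_prob)
```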
The (weak) law of large numbers states that the relative frequency k/n of successes in n independent trials converges to p:
P(|k/n − p| ≤ ε) → 1 as n → ∞, since P(|k/n − p| > ε) ≤ pq / (nε^2).
This theorem states that the frequency definition of the probability of an event and its axiomatic definition can be made compatible to any degree of accuracy, with probability approaching 1.
In other words, given two positive numbers ε and δ, the probability of the inequality |k/n − p| < ε exceeds 1 − δ for n sufficiently large.
The theorem states that in a sufficiently long series of independent trials with constant probability, the relative frequency of an event will differ from that probability by less than any specified number ε (no matter how small), with a probability approaching 1, i.e. with practical certainty.
For a given ε > 0, pq/(nε^2) can be made arbitrarily small by letting n become large.
Thus, we can expect the relative frequency k/n to be close to p in a single sufficiently long run of trials.
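The bound pq/(nε^2) can be compared with a simulated exceedance rate (a sketch; the values of n, ε, and the number of repetitions are illustrative):

```python
import random

random.seed(1)

# Check the bound P(|k/n - p| > eps) <= p*q/(n*eps**2) by simulation:
# estimate how often the relative frequency strays more than eps from p.
p, eps, n, trials = 0.5, 0.05, 1000, 2000
bound = p * (1 - p) / (n * eps**2)  # = 0.1 for these values
exceed = 0
for _ in range(trials):
    k = sum(random.random() < p for _ in range(n))
    if abs(k / n - p) > eps:
        exceed += 1
print(f"empirical exceedance: {exceed / trials:.4f}, Chebyshev bound: {bound:.4f}")
```

The simulated exceedance rate is far below the bound, which is expected: Chebyshev-type bounds are loose but hold for every distribution.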
Example: uniform distribution on [0, T], T > 0. The CDF is
F(x) = 0 for x < 0,
F(x) = x/T for 0 ≤ x < T,
F(x) = 1 for x ≥ T,
so that
P(a < X ≤ b) = (b − a)/T for 0 < a ≤ b ≤ T.
1. We have that F(+∞) = 1 and F(−∞) = 0.
Proof. F(+∞) = P(X ≤ +∞) = P(S) = 1 and F(−∞) = P(X ≤ −∞) = P(∅) = 0.
2. F is a non-decreasing function: if x1 < x2, then F(x1) ≤ F(x2).
Proof. If x1 < x2 we have {X ≤ x1} ⊂ {X ≤ x2}, hence P(X ≤ x1) ≤ P(X ≤ x2).
3. If F(x0) = 0, then F(x) = 0 for every x ≤ x0.
Proof. Follows from property 2. Also, if P(X ≥ 0) = 1, we have F(x) = 0 for every x < 0; in this case we have a positive r.v.
4. P(X > x) = 1 − F(x).
Proof. We just need to observe that {X ≤ x} ∪ {X > x} = S, where the two events are mutually exclusive, so F(x) + P(X > x) = P(S) = 1.
5. The function F(x) is continuous from the right: F(x+) = F(x).
Proof. By property 2, F(x) is a monotonically increasing (and bounded) function, so at every point the finite right-hand and left-hand limits exist (theorem of existence of the limit for monotonic functions). Then, to calculate the limit from the right, it is not restrictive to consider ε = 1/n and let n → ∞ (that is, let ε tend to zero along a particular sequence of values). We then note that F(x + 1/n) = P(X ≤ x + 1/n) = P(B_n), where B_n = {X ≤ x + 1/n}; (B_n) is a decreasing sequence of events such that
⋂_{n=1}^∞ B_n = B = {X ≤ x}.
Hence
F(x+) = lim_{n→∞} F(x + 1/n) = lim_{n→∞} P(B_n) = P(B) = P(X ≤ x) = F(x).
6. P(x1 < X ≤ x2) = F(x2) − F(x1).
Proof. For the event {x1 < X ≤ x2} we have {X ≤ x1} ∪ {x1 < X ≤ x2} = {X ≤ x2}, where the two events are mutually exclusive, so F(x1) + P(x1 < X ≤ x2) = F(x2).
7. P(X = x) = F(x) − F(x−).
Proof. Let us take B_n = {x − 1/n < X ≤ x}: this sequence of events is decreasing and such that
⋂_{n=1}^∞ B_n = B = {X = x}.
From property 6, for x1 = x − 1/n and x2 = x, we have
P(B_n) = P(x − 1/n < X ≤ x) = F(x) − F(x − 1/n).
Passing to the limit and exploiting the continuity of the probability measure, and noting that F, being a monotonic and bounded function, admits a finite left limit at the point x, we have
P(X = x) = lim_{n→∞} [F(x) − F(x − 1/n)] = F(x) − F(x−).
8. P(x1 ≤ X ≤ x2) = F(x2) − F(x1−).
Proof. We have
{x1 ≤ X ≤ x2} = {x1 < X ≤ x2} ∪ {X = x1},
with the events on the right-hand side mutually exclusive. By properties 6 and 7,
P(x1 ≤ X ≤ x2) = P(x1 < X ≤ x2) + P(X = x1) = F(x2) − F(x1) + F(x1) − F(x1−) = F(x2) − F(x1−).
9. P(x1 ≤ X < x2) = F(x2−) − F(x1−).
10. P(x1 < X < x2) = F(x2−) − F(x1).
A discrete r.v. takes on the values x_i with probabilities p_i given by the discontinuity jumps of its CDF:
p_i = P(X = x_i) = F(x_i) − F(x_i−).
The r.v. is said to be of continuous type if its distribution function F(x) is continuous.
The continuity of F(x) implies that P(X = x) = F(x) − F(x−) = 0 for every x.
In other terms, a continuous r.v. takes on each value of its codomain with zero probability.
Finally, the r.v. is mixed if its CDF is discontinuous, but not piecewise constant.
The probability density function (pdf) f(x) is the derivative of the CDF:
f(x) = dF(x)/dx = lim_{Δx→0} [F(x + Δx) − F(x)] / Δx.
For a discrete r.v. the derivative consists of Dirac pulses:
f(x) = Σ_i p_i δ(x − x_i),
where the x_i represent the jump discontinuity points of F, δ(x) is the Dirac delta function (more on the Dirac function in the next slide), and p_i = P(X = x_i).
The amplitude of the discontinuity jump represents the probability that the r.v. takes the value x_i.
We thus derive, from the derivative of the CDF, a pdf which consists only of Dirac pulses, centered at the discrete values x_i.
Consider the family of rectangular pulses
δ_ε(x) = 1/ε for |x| ≤ ε/2, and δ_ε(x) = 0 for |x| > ε/2,
with ε ≪ 1. For any function φ continuous at the origin,
lim_{ε→0} ∫ δ_ε(x) φ(x) dx = φ(0).
This allows us to treat the Dirac pulse as the limit of a family of functions with the following properties:
For ε → 0, the functions become more and more narrow.
For ε → 0, the functions become taller and taller.
The area of such functions is 1 regardless of ε:
∫_{−∞}^{+∞} δ(x) dx = 1.
The unit step function u(x):
u(x) = 1 for x ≥ 0,
u(x) = 0 for x < 0.
Derivative: du(x)/dx = δ(x).
Integration: ∫_{−∞}^{x} δ(t) dt = u(x).
For the pdf, the normalization ∫_{−∞}^{+∞} f(x) dx = 1 holds, and
P(x1 < X ≤ x2) = F(x2) − F(x1) = ∫_{x1}^{x2} f(x) dx.
We note that the derivative (and therefore the pdf) is not defined at the points x = 0 and x = T (corner points of the CDF curve). This, however, is not a problem because, as we shall see, the pdf is always used within an integral, and values at isolated points play no role (provided there is no Dirac pulse at that point).
Properties of the pdf:
1. f(x) ≥ 0, since F(x) is a monotonically increasing function.
2. F(x) = ∫_{−∞}^{x} f(u) du: integrating the pdf, we find the CDF.
3. ∫_{−∞}^{+∞} f(x) dx = 1.
Proof: ∫_{−∞}^{+∞} f(u) du = [F(u)]_{−∞}^{+∞} = F(+∞) − F(−∞). But F(−∞) = 0 and F(+∞) = 1.
4. P(x1 < X ≤ x2) = F(x2) − F(x1) = ∫_{x1}^{x2} f(x) dx, obtained by integrating both sides of f(x) = dF/dx between x1 and x2.
By the continuity assumption on f(x), we can apply the mean value theorem for integrals:
P(x < X ≤ x + Δx) = ∫_{x}^{x+Δx} f(u) du = f(ξ) Δx, with ξ = x + θΔx, θ ∈ [0, 1].
So the value f(x) at the point x represents the probability that X takes values in an interval (x, x + Δx), divided by the interval width Δx; that is precisely a probability density.
We also observe that the probability P(X ∈ [x, x + Δx]) is proportional (if Δx ≪ 1) to f(x), and is maximum for intervals [x, x + Δx] where f(x) is locally maximum.
A pdf (or density) of a continuous r.v. is a function that describes the relative likelihood for this r.v. to take on a given value.
Defining a law of probability on a continuous probability space is equivalent to assigning the pdf of a r.v.
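The interpretation of f(x)Δx as an approximate probability can be checked numerically; the sketch below uses a standard Gaussian and illustrative values of x and Δx:

```python
import math
import random

random.seed(2)

# For a standard Gaussian r.v., f(x0) * dx should approximate
# P(x0 < X <= x0 + dx) when dx is small.
x0, dx = 0.5, 0.01
f_x0 = math.exp(-x0**2 / 2) / math.sqrt(2 * math.pi)

n = 500_000
hits = sum(x0 < random.gauss(0, 1) <= x0 + dx for _ in range(n))
print(f"f(x0)*dx = {f_x0 * dx:.5f}, empirical = {hits / n:.5f}")
```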
The analogous properties for a discrete r.v., with sums in place of integrals, are:
3) P(X ∈ B) = Σ_{x_i ∈ B} p_i, where p_i = P(X = x_i), and Σ_i p_i = 1.
Proof. F(+∞) = P(X ≤ +∞) = Σ_i p_i = 1.
4) P(x1 < X ≤ x2) = Σ_{x_i ∈ ]x1, x2]} p_i.
Note that the Bernoulli r.v. is a particular case (a single-trial experiment) of the binomial r.v.
Binomial distribution: X is said to be a binomial r.v. with parameters n (total number of trials), k (number of successes), p (the probability of success in each trial) and q = 1 − p, if X takes the values 0, 1, 2, …, n with probabilities (its distribution)
P(X = k) = C(n, k) p^k q^(n−k), k ∈ {0, 1, 2, …, n}.
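As a quick sanity check (with illustrative n and p), the binomial probabilities sum to 1, consistent with (p + q)^n = 1:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial r.v. with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The pmf must sum to 1 over k = 0..n.
n, p = 12, 0.3
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
print(total)
```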
2. The probability that the number of defective components is less than or equal to 80 is calculated by noting that
P(X ≤ 80) = Σ_{k=0}^{80} P(X = k).
As the elementary events are mutually exclusive, the probability of the union is equal to the sum of the probabilities.
3. The event that X lies between 80 and 120 can also be expressed as a union of mutually exclusive elementary events.
This probability is less than 2%, so it is extremely unlikely that the student passes the test by answering the questions at random.
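The slide's exact exam parameters are not shown here, so the numbers below are illustrative (30 questions, 4 choices each, pass threshold 18 correct); the tail probability of passing by pure guessing is computed directly from the binomial distribution:

```python
from math import comb

# Illustrative exam: n questions, probability p of guessing one right,
# pass if at least `threshold` answers are correct.
n, p, threshold = 30, 0.25, 18
q = 1 - p
p_pass = sum(comb(n, k) * p**k * q**(n - k) for k in range(threshold, n + 1))
print(f"P(pass by guessing) = {p_pass:.2e}")
```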
Poisson distribution: X is said to be a Poisson r.v. with parameter λ > 0 if
P(X = k) = e^(−λ) λ^k / k!, k = 0, 1, 2, …
Gaussian (normal) distribution: X is a Gaussian r.v. with mean η and variance σ^2 if its pdf is
f(x) = (1 / (σ√(2π))) e^(−(x − η)^2 / (2σ^2));
for η = 0 and σ = 1 we obtain the standard normal.
Suppose that x ≥ η; since the integral of f over all of ℝ is 1, and the Gaussian function is symmetric with respect to η, the integral from −∞ to η is 1/2. Hence
F(x) = 1/2 + (1 / (σ√(2π))) ∫_{η}^{x} e^(−(u − η)^2 / (2σ^2)) du.
With the substitution t = (u − η) / (σ√2), du = σ√2 dt, this becomes
F(x) = 1/2 + (1/√π) ∫_{0}^{(x−η)/(σ√2)} e^(−t^2) dt.
Recalling the error function and its complement,
erf(x) = (2/√π) ∫_{0}^{x} e^(−t^2) dt,  erfc(x) = 1 − erf(x) = (2/√π) ∫_{x}^{+∞} e^(−t^2) dt,
we obtain
F(x) = 1/2 + (1/2) erf((x − η) / (σ√2)).
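The erf form of the Gaussian CDF translates directly into code (math.erf is in the Python standard library):

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """F(x) = 1/2 + 1/2 * erf((x - mu) / (sigma * sqrt(2)))."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Sanity checks against well-known values of the standard normal CDF.
print(normal_cdf(0))      # 0.5 by symmetry
print(normal_cdf(1.96))   # ~0.975
```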
If events occur independently, such as telephone call arrivals or bus arrivals, then the waiting time between events can be shown to be exponential.
The CDF is
F(x) = 1 − e^(−λx), x ≥ 0.
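A simulation sketch (λ and the evaluation point are illustrative) compares the empirical CDF of simulated waiting times with 1 − e^(−λx):

```python
import math
import random

random.seed(3)

# Empirical check that simulated waiting times follow F(x) = 1 - exp(-lam*x).
lam, n, x = 2.0, 200_000, 0.7
samples = [random.expovariate(lam) for _ in range(n)]
empirical = sum(s <= x for s in samples) / n
theoretical = 1 - math.exp(-lam * x)
print(f"empirical F({x}) = {empirical:.4f}, theoretical = {theoretical:.4f}")
```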
We shall express the CDF F_Y(y) of the r.v. Y = g(X) in terms of the CDF F_X(x) of the r.v. X and the function g(x).
For this purpose we must determine the set of points on the x axis such that g(x) ≤ y, and the probability that X lies in this set.
It will be assumed that the function g(x) is continuous.
Example: the linear transformation Y = aX + b.
For a > 0:
F_Y(y) = P(Y ≤ y) = P(aX + b ≤ y) = P(X ≤ (y − b)/a) = F_X((y − b)/a).
For a < 0 (the inequality reverses when dividing by a):
F_Y(y) = P(aX + b ≤ y) = P(X ≥ (y − b)/a) = 1 − F_X(((y − b)/a)−).
In both cases, differentiating gives the pdf
f_Y(y) = (1/|a|) f_X((y − b)/a).
Fundamental theorem: to find f_Y(y), solve the equation g(x) = y; denoting its real roots by x_i,
f_Y(y) = Σ_i f_X(x_i) / |g′(x_i)|.
We consider the values of y such that g(x) = y for some x. In the case of 3 solutions x1, x2, x3, shown in the next figure: because g′(x1) > 0, g′(x2) < 0, g′(x3) > 0 and dy is infinitesimal, the three sets to which x may belong are mutually exclusive.
If g(x) = c is constant for every x, then P(Y = c) = 1.
If g(x) = c on an interval (a, b), then P(Y = c) = P(a < X ≤ b) = F_X(b) − F_X(a).
For a discrete r.v., the mean of a function of X is
E[g(X)] = Σ_i g(x_i) p_i, where p_i = P(X = x_i).
Mean of the binomial r.v.: E[X] = np.
Proof.
E[X] = Σ_{k=1}^{n} k · n! / (k! (n − k)!) · p^k q^(n−k)
     = np Σ_{k=1}^{n} (n − 1)! / ((k − 1)! (n − k)!) p^(k−1) q^(n−k)
     = np Σ_{m=0}^{n−1} (n − 1)! / (m! (n − 1 − m)!) p^m q^(n−1−m)   (substituting m = k − 1)
     = np (p + q)^(n−1) = np.
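The result E[X] = np can be verified numerically for illustrative n and p:

```python
from math import comb

# Direct check that sum(k * P(X = k)) equals n*p for a binomial r.v.
n, p = 15, 0.4
q = 1 - p
mean = sum(k * comb(n, k) * p**k * q**(n - k) for k in range(n + 1))
print(mean, n * p)
```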
Example: let X be a r.v. with uniform distribution on the interval [0, 2π], and calculate the mean value of Y = cos(X).
Solution:
E[Y] = ∫_{0}^{2π} cos(x) · (1/(2π)) dx = (1/(2π)) [sin(x)]_{0}^{2π} = 0.
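E[cos(X)] for X uniform on [0, 2π] is 0 by symmetry; a numeric midpoint-rule computation confirms it (N is an illustrative resolution):

```python
import math

# E[cos(X)] = (1/(2*pi)) * integral of cos(x) over [0, 2*pi] = 0.
# Midpoint rule over N subintervals.
N = 100_000
width = 2 * math.pi / N
mean = sum(math.cos((i + 0.5) * width) for i in range(N)) * width / (2 * math.pi)
print(f"E[cos(X)] ~ {mean:.6f}")
```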
The variance is a positive quantity, and its positive square root σ = √Var(X) is known as the standard deviation of X.
The standard deviation represents the root-mean-square spread of the r.v. around its mean η.
Because of the linearity property of the mean, we can show that
σ^2 = E[X^2] − η^2.
We observe that the variance increases with the amplitude of the interval where the r.v. takes its values.
The variance σ^2 measures the concentration (or, equivalently, the dispersion) of X around its mean η.
Equivalently, we can say that the variance is a measure of the uncertainty associated with the values of the r.v. X.
Because the last expression, E[(X − a)^2], is valid for any real a, we take the derivative with respect to a; setting it to zero gives a = η, so the minimum mean-square error is
σ^2 = E[(X − η)^2].
Variance property: the variance is not a linear operator, but a quadratic one. If X is a r.v. with finite variance, then whatever the real constants a and b are, we have
Var(aX + b) = a^2 Var(X).
Using the linearity of the mean operator, with simple steps we can write:
Var(aX + b) = E[(aX + b − aη − b)^2] = E[a^2 (X − η)^2] = a^2 Var(X).
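The property Var(aX + b) = a^2 Var(X) can also be checked by simulation (illustrative a, b, and sample size):

```python
import random

random.seed(4)

# Var(a*X + b) = a**2 * Var(X): the shift b drops out, the scale a squares.
n, a, b = 100_000, 3.0, 7.0
xs = [random.gauss(0, 2) for _ in range(n)]  # Var(X) = 4

def var(data):
    m = sum(data) / len(data)
    return sum((v - m) ** 2 for v in data) / len(data)

vx = var(xs)
vy = var([a * v + b for v in xs])
print(f"Var(X) = {vx:.3f}, Var(aX+b) = {vy:.3f}, a^2*Var(X) = {a**2 * vx:.3f}")
```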
Moments of the Gaussian r.v.: consider X with zero mean, X ~ N(0, σ^2).
Since η = 0 and f(x) is an even function, the moments E[X^n] for odd n are zero, because the integral over a symmetric interval of the product of an odd function (x^n) and an even function (f) is zero.
We are therefore interested only in even n. Because the calculations are still difficult to do directly, we will use the Gauss integral:
∫_{−∞}^{+∞} e^(−αx^2) dx = √(π/α), α > 0.
The latter relation can be rewritten, by simple algebraic manipulations, in the form:
E[X^n] = 0 for n odd,
E[X^n] = (n − 1)!! σ^n for n even,
where (n − 1)!! = 1 · 3 · 5 ⋯ (n − 1).
Proof. Differentiate the Gauss integral repeatedly with respect to α and set α = 1/(2σ^2).
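The even-moment formula can be checked against a direct numerical integration (illustrative σ; E[X^4] should equal 3σ^4):

```python
import math

def double_factorial(n):
    # n!! = n * (n-2) * (n-4) * ... ; by convention 0!! = (-1)!! = 1.
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def gauss_moment(n, sigma):
    """E[X**n] for X ~ N(0, sigma**2): 0 for odd n, (n-1)!! * sigma**n for even n."""
    return 0.0 if n % 2 else double_factorial(n - 1) * sigma**n

# Numerical check of E[X**4] by midpoint integration over a wide interval.
sigma = 1.5
N, L = 200_000, 12 * sigma
h = 2 * L / N
total = 0.0
for i in range(N):
    x = -L + (i + 0.5) * h
    total += x**4 * math.exp(-x**2 / (2 * sigma**2))
numeric = total * h / (sigma * math.sqrt(2 * math.pi))
print(numeric, gauss_moment(4, sigma))
```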
Theorem (Bienaymé inequality). Let X be a r.v. and a a real number; then for n ∈ ℕ and ε > 0 we have
P(|X − a| ≥ ε) ≤ E[|X − a|^n] / ε^n.
For n = 2 and a = η this reduces to the Chebyshev inequality, which yields a lower bound for the probability that the r.v. takes values in the interval (η − kσ, η + kσ), as shown in the table below for k = 1, 2, 3, 4, 5:
P(|X − η| < kσ) ≥ 1 − 1/k^2.
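The table values follow directly from the Chebyshev lower bound 1 − 1/k^2; a short loop reproduces them:

```python
# Chebyshev lower bound: P(|X - eta| < k*sigma) >= 1 - 1/k**2,
# tabulated for k = 1, 2, 3, 4, 5.
bounds = {k: 1 - 1 / k**2 for k in range(1, 6)}
for k, b in bounds.items():
    print(f"k = {k}: P(|X - eta| < {k}*sigma) >= {b:.3f}")
```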