INTRODUCTION TO ENGINEERING RELIABILITY
Lecture Notes for MEM 361
Albert S. D. Wang Albert & Harriet Soffa Professor
Mechanical Engineering and Mechanics Department
Drexel University Philadelphia, PA 19104
Chapter  I
Introduction
I  1
CHAPTER I. INTRODUCTION
Why Engineering Reliability?
In a 1985 Gallup poll, 1000 customers were asked: what attributes are most important to you in selecting and buying a product?
On a scale from 1 to 10, the following were the results from the poll:
* Performance quality: 9.5
* Lasts a long time (dependability): 9.0
* Good service: 8.9
* Easily repaired: 8.8
* Warranty: 8.4
* Operational safety: 8.3
* Good looks: 7.7
* Brand reputation: 6.3
* Latest model: 5.4
* Inexpensive price: 4.0
It is interesting to note that the top five attributes are all related to "Engineering Reliability", the study of which is the essence of this course.
I-1. Basic Notions in Engineering Reliability.
Engineering Product. An "engineering" product is designed, manufactured, tested and deployed in service. The product may be an individual part in a large operating system, or it may be the system itself. In either case, the product is expected to perform the designed functions and last beyond its designed service life.
Product Quality. It is a "quantitative" measure of the engineered product's ability to meet its designed functional requirements, including the designed service life.
Example 1-1: A bearing ball is supposed to have the designed diameter of 10mm. But an inspection of a large sample off the production line finds that the diameters of the balls vary from 9.91mm to 10.13mm, although the average diameter from all the balls in the sample is very close to 10mm.
Now, the bearing ball is an engineering product; its diameter is one "quality measure". The diameter of any given ball off the production line is uncertain, although it is "likely" to be between 9.91mm and 10.13mm.
Example 1-2: A color TV tube is designed to have an operational life of 10,000 hours. After-sale data shows that 3% of the tubes were burnt out within the first 1000 hours; 6.5% were burnt out within 2000 hours.
Here, the TV tube is an engineering product and the operating life in service is one "quality measure". For any one given TV tube off the production line, its operating life (the time of failure during
service) is uncertain; but there is a 3% chance that it may fail within 1000 hours of operation, and a 6.5% chance that it may fail within 2000 hours of operation.
Random Variable. A random variable (denoted by X) is one which can assume one or more possible values (denoted by x); but at any given instance, there is only a chance that X=x. In this context, the “quality measure” of an engineering product is best described by one or several random variables, say X, Y, Z, etc.
Discussion: The diameter of the bearing balls discussed in Example 1-1 can be described by the random variable X; according to the sample taken, X can assume any value between x=9.91mm and x=10.13mm. If a much larger sample is taken from the production line, some diameters may be smaller than 9.91mm and some may be larger than 10.13mm. In the extreme case, X can thus assume any value between 0 and ∞. Similarly, the operating life of the TV tube discussed in Example 1-2 can also be described by a random variable, say Y. From the statement therein, for a given tube, the chance that Y ≤ 1000 hours is 0.03 and that Y ≤ 2000 hours is 0.065.
Probability Function. Let X be a random variable representing the quality measure of a product; it can assume any value x in the range, say, 0 < x < ∞. Furthermore, for X=x, there is a certain associated "chance" or "probability"; this is denoted by f(X=x) or by f(x). Note that f(x) is the probability that the value of X is exactly x.
Discussion: In Example 1-1, the random variable X describes the diameter of a bearing ball. From the sample data, we see that the probability for X<9.91mm, or X>10.13mm, should be very small while the probability for X=10mm should be much higher; this is because the bearing ball is designed to have a target diameter of 10 mm. In this context, f(x) may be graphically displayed as follows:
[Figure: sketch of f(x) vs x; the curve peaks at the 10 mm design target]
If f(x) is known, a number of questions related to the quality of the bearing balls can be rationally answered. For instance, the percentage of the bearing balls having diameters X ≤ x* is obtained as:

F(X ≤ x*) = F(x*) = ∫_{9.91}^{x*} f(x) dx
Here, F(x*) represents (1) the probability that the diameter of a given bearing ball is less than or equal to x*; or equivalently (2) the percentage of the bearing balls with diameters less than or equal to x*. Clearly, for a given bearing ball, the probability that its diameter is larger than x* is: R(x*) = 1 - F(x*).
Note: Based on the sample given, F(x*) denotes graphically the area under the f(x) curve from 9.91mm to x*; while F(X≤10.13) denotes the total area under the entire f(x) curve. Since the diameter of any given bearing ball is at most 10.13mm, the probability that the diameter of a given bearing ball is less than or equal to 10.13mm is 100%; or F(X≤10.13) = 1.
Note the difference in notation between f(x) and F(x) and their meanings; f(x) is termed the “probability density function” while F(x) is termed the “cumulative distribution function”. We shall discuss these functions and their mathematical relationships in Chapter II.
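The F(x) and R(x) relations above can be sketched numerically. The snippet below is an illustration only: it assumes the ball-diameter density f(x) is a normal curve centered at the 10 mm design target, and the 0.04 mm standard deviation is a made-up value, not one derived from the sample.

```python
import math

# Hypothetical illustration only: model the ball-diameter density f(x) as a
# normal curve centered at the 10 mm design target. The standard deviation
# (0.04 mm) is an assumed value, not one derived from the sample.
MU, SIGMA = 10.0, 0.04

def F(x_star):
    """Cumulative probability F(X <= x*) for the assumed normal f(x)."""
    return 0.5 * (1.0 + math.erf((x_star - MU) / (SIGMA * math.sqrt(2.0))))

def R(x_star):
    """Probability that a ball's diameter exceeds x*: R = 1 - F."""
    return 1.0 - F(x_star)

# Fraction of balls with diameter at most 10.05 mm, and the complement:
print(f"F(10.05) = {F(10.05):.4f}, R(10.05) = {R(10.05):.4f}")
```

With any assumed f(x), F and R always sum to one at every x*, mirroring the relation R(x*) = 1 - F(x*) above.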
Probability of Failure. If the quality of a product X is measured by its time of failure during service, then the value of X is in terms of time, t. Let the range of t be 0 < t < ∞; the associated probability for X=t is f(t).
Discussion: In Example 1-2, the operating life of the TV tube may be described by the probability function f(t), such as shown below:
[Figure: sketch of f(t) vs t (hrs); the curve peaks at the 10,000-hour design target, and the shaded area from 0 to t* = 1000 hrs equals F(t*) = 0.03]
Here, the TV tube is designed for a life of 10,000 hours of operation; the chance for a given tube to last 10,000 hours is better than for any other t-value. Based on the sample data, there is a 3% chance of failure up to t = t* = 1000 hours; thus, we have
F(X ≤ t*) = F(t*) = ∫_{0}^{t*} f(t) dt = 0.03
The above is indicated graphically by the shaded area under the f(t) curve in the interval 0 ≤ t ≤ t*. Here, note the relation between f(t) and F(t).
Product Reliability. Let the quality of a product be measured by the time-to-failure probability density function, f(t); the probability of failure up to t = t* is given by the cumulative distribution function, F(t*). Then, the probability of "non-failure" before t = t* is:
F(X > t*) = 1 - F(t*) = ∫_{t*}^{∞} f(t) dt
The term F(X>t*), which is associated with the probability density function f(t), represents the probability of survival, also known as the reliability function. A precise definition of the latter will be fully discussed in Chapter IV.
Discussion: In Example 1-2, the data shows that 3% failed before t ≤ 1000 hrs and 6.5% before t ≤ 2000 hrs. Hence, in terms of the reliability function, we write:
R(1000) = ∫_{1000}^{∞} f(t) dt = 0.97

R(2000) = ∫_{2000}^{∞} f(t) dt = 0.935
If we want to know the service life t* for which there is no more than 5% failure (or 95% or better reliability), we can determine t* from the following relation:
R(t*) = ∫_{t*}^{∞} f(t) dt = 0.95
Clearly, one attempts to answer all such questions regarding product reliability; this can be done when the mathematical form of f(t) is known. In fact, the knowledge of f(t), or the pursuit of it, is one of the central elements in the study of engineering reliability.
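As a concrete (if oversimplified) sketch of this idea: assume the tube life follows an exponential density f(t) = λe^(-λt). This form is an assumption chosen only because its integrals are elementary; it does not reproduce the peaked curve sketched above. Calibrating λ to the 3%-failed-by-1000-hours figure of Example 1-2, one can then solve R(t*) = 0.95 for the 95%-reliability life:

```python
import math

# Sketch under an assumed model: take f(t) = lam * exp(-lam * t), an
# exponential life density chosen only because its integrals are elementary.
# Calibrate lam from Example 1-2: F(1000) = 1 - exp(-lam*1000) = 0.03.
lam = -math.log(1.0 - 0.03) / 1000.0

def R(t):
    """Reliability R(t) = integral of f from t to infinity = exp(-lam*t)."""
    return math.exp(-lam * t)

# Service life t* at which reliability is still 95% (no more than 5% failures):
t_star = -math.log(0.95) / lam
print(f"t* = {t_star:.0f} hours; check: R(t*) = {R(t_star):.2f}")
```

Under this assumed model R(2000) = 0.97^2 ≈ 0.941, close to but not exactly the 0.935 quoted in Example 1-2; a reminder that the choice of f(t) matters, and fitting f(t) to data is taken up in Chapter III.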
I-2. Probability, Statistics and Random Variables.
In the preceding section, we used terms such as "random variable", "probability" of occurrence (say, X=x), "sample data", etc., which form the basic notions of engineering reliability. How these notions, which are sometimes abstract in nature, all fit together in a real-world setting is a complicated matter, especially for beginners in the field.
As in nature, the occurrence of some engineered event is often imperfect; indeed, it may seem to occur at random; but when it is observed over a large sample or over a long period of time, there may appear a definitive "mechanism" which causes the event to occur. If the mechanism is exactly known, the probability for the event to occur can be inferred exactly; if the mechanism is not known at all, sampling of a set of relevant data (observations) can provide a statistical base from which at least the nature of the mechanism may become more evident. The latter is, of course, keenly dependent on the details of data sampling and on how the sample is analyzed.
Example 1-3. The number obtained by rolling a die is a random variable, X. In this case, we know all the possible values of X (the integers from 1 to 6) and the exact mechanism that causes a number to occur. Thus, the associated probability density function f(x) is determined exactly as:
f(x) = 1/6,  for x = 1, 2, ..., 6.
Note, for example, the probability that the number from a throw is less than 3 is given by
F(x<3) = f(1) + f(2) = 1/6+1/6 = 2/6 = 1/3.
Similarly, the probability that the number from a throw is greater than 3 is given by
F(x>3) = f(4) + f(5) + f(6) = 3/6 =1/2
The probability that the number from a throw is any number x ≤ 6 is given by

F(x ≤ 6) = f(1) + f(2) + ... + f(6) = 1.
Note: In this example, X is known as a discrete random variable, since all the possible values of X are distinct and the number of all the values is finite. The distribution of f(x) is said to be uniform since f(x) = 1/6 for all x = 1, 2, ..., 6.
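The die probabilities above can be tallied mechanically; a small sketch using exact fractions:

```python
from fractions import Fraction

# Exact die probabilities from the example above: f(x) = 1/6 for x = 1..6.
f = {x: Fraction(1, 6) for x in range(1, 7)}

F_less_3    = sum(f[x] for x in f if x < 3)   # f(1) + f(2)
F_greater_3 = sum(f[x] for x in f if x > 3)   # f(4) + f(5) + f(6)
F_total     = sum(f.values())                 # must equal 1

print(F_less_3, F_greater_3, F_total)
```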
Example 1-4. Now, let us pretend that we do not know anything about the die. By conducting a sampling test in which the die is rolled N=100 times and each time the integer "x" on the die is
recorded, the following data is obtained from an actual experiment of N=100 throws:
x-value, x:            1    2    3    4    5    6
# times x occurs, n:  17   14   16   20   15   18
The above is said to be a "sample" of size N=100. It is, comparatively speaking, a rather small sample; and although only the integers from 1 to 6 are actually observed, we cannot be certain that integers other than 1 to 6 could not appear (remember: we pretended not to know anything about the die). However, we can infer from the sample quite closely what is actually happening: namely, we observe that the number "1" appears 17 times out of 100 throws; the number "2" appears 14 times out
of 100 throws; and so on. Hence, we can estimate the probability density function f(x) as:
f(1)=17/100;  f(2)=14/100;  f(3)=16/100;  f(4)=20/100;  f(5)=15/100;  f(6)=18/100.
We see that the estimated f(x) is not uniform over the range of X; rather, the values vary slightly about the theoretical value of 1/6. A graphical display of the above results is more revealing:
It is generally contended that the estimated f(x) would approach the theoretical value of 1/6 if N is sufficiently large, say N=1000 (you may want to experiment on this). The relation between the sample size N and the theoretical probability function is another central element in "statistics".
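The suggested experiment can be run in a few lines. A sketch (the seed is fixed only so the run is repeatable; any seed would do):

```python
import random
from collections import Counter

# Roll a fair die N times and estimate f(x) from the sample, as in the
# example above. Larger N should bring the estimates closer to 1/6.
random.seed(1)

for N in (100, 1000, 100000):
    counts = Counter(random.randint(1, 6) for _ in range(N))
    f_hat = {x: counts[x] / N for x in range(1, 7)}
    spread = max(f_hat.values()) - min(f_hat.values())
    print(N, {x: round(p, 3) for x, p in f_hat.items()}, "spread:", round(spread, 3))
```

As N grows, the spread of the six estimates about 1/6 shrinks, illustrating the sample-size effect discussed above.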
The above example illustrates the statistical relationship between a test sample and the theoretical probability distribution function f(x) for X (X being generated by a specific mechanism: rolling a die N times). This example casts an implication in engineering: i.e., in most cases, the theoretical f(x) is unknown and the only recourse is to find (estimate) f(x) through a test sample, along with a proper statistical analysis of the sample.
Example 1-5. Suppose we roll two dice and take the sum of the two integers to be the random variable X. Here, we know the exact mechanism in generating the values for X. First, there are exactly 36 ways to generate a value for X; and X can have any one of the following 11 distinct integers: i = 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12. For instance, the number "2" is generated by the sum of "1" and "1" in one throw, while the number "3" is generated by the sum of "1" and "2"; but there is only one way to obtain X=2, while there are two ways to obtain X=3 (1+2 and 2+1). Hence, the probability f(2)=1/36; and f(3)=2/36. In fact, the probabilities associated with each of the 11 values are:
f(2)=f(12)=1/36;  f(3)=f(11)=2/36;  f(4)=f(10)=3/36;  f(5)=f(9)=4/36;  f(6)=f(8)=5/36;  f(7)=6/36.
A graphical display of f(i) is shown below:
[Figure: bar chart of f(i) for i = 2, 3, ..., 12; the bars rise from 1/36 at i=2 to 6/36 at i=7, then fall symmetrically to 1/36 at i=12]
Here, X is also a discrete random variable, but its probability density function f(i) is not uniform. It is, however, a symmetric function with respect to X=7. For X=7, the theoretical probability is 6/36, the largest among the 11 numbers.
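The 36-outcome enumeration described above is easy to verify by brute force; a sketch using exact fractions:

```python
from fractions import Fraction
from collections import Counter

# Tally the sums of all 36 equally likely two-dice outcomes.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
f = {i: Fraction(counts[i], 36) for i in sorted(counts)}

print(f)                                   # the 11 values, i = 2..12
assert all(f[i] == f[14 - i] for i in f)   # symmetric about i = 7
assert sum(f.values()) == 1                # total probability is unity
```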
Discussion. Again, if we do a statistical sampling by actually rolling 2 dice N times, we may obtain an estimated f(i) function. In that case, we need a very large sample in order to approach the theoretical f(i) as displayed above.
Central Tendency. In most engineering settings, the random variable X often represents some measured quantity from identical products; for example, the measured diameters of a lot of bearing balls off the production line constitute just such a case. In general, the measured values tend to cluster around the designed value (i.e. the diameter), thus the central tendency. To evaluate this tendency is clearly important for design or quality control purposes.
The following example illustrates the evaluation of such a tendency:
Example 1-6. A master plumber keeps repair-call records from customers in his service area for 72 consecutive weeks:
71 73 22 27 46 47 36 69 38 36 36 37
79 83 42 43 45 45 55 47 48 60 60 60
49 50 51 75 76 78 31 32 35 85 58 59
38 39 40 40 41 42 42 54 73 53 54 65
66 55 55 56 56 57 49 51 46 54 62 62
54 62 63 64 67 37 58 58 61 62 52 52
Here, let X be the number of repair-calls per week, which seems to vary at random. While one sees that the smallest X value is 22 and the largest is 85, the sample really does not provide a definitive value range for X. Furthermore, since no definitive mechanism(s) could be identified as to how and why the values of X are generated, the true probability distribution of X could never be determined. Hence, instead of looking for the mechanism(s), the sample data can be analyzed in some way to show its central tendency, which may in turn be used to estimate the probability distribution function f(x) for X. Here, we follow a simple procedure as described in the following:
First, we note that there are 72 data values in the sample, roughly within the range from 21 to 90; so we divide this range into 7 equal "intervals" of 10; namely, 21-30, 31-40, 41-50, etc. Second, for each
interval, we count from the sample the number of X values that fall within the interval. For instance, in the first interval (21-30), there are 2 values (22, 27); in the second interval (31-40), there are 13 values (36, 38, 36, 36, 37, 31, 32, 35, 38, 39, 40, 40 and 37); and so on. After all 7 intervals are counted, the following result is obtained:
interval:       21-30  31-40  41-50  51-60  61-70  71-80  81-90
# X values:       2     13     15     22     11      7      2

In this manner, we can already observe that fewer values fall into the lower interval (21-30) or into the upper interval (81-90), while more values fall into the middle intervals, especially the central interval (51-60). With the above "interval grouping", we may estimate the probability for X to fall inside the intervals. Instead of treating X, we introduce a new variable I representing the value intervals; the values of I are the integers 1 to 7, since there are 7 intervals. Thus, the probability density function of I, f(i), can be approximated as follows:
f(1)=2/72;  f(2)=13/72;  f(3)=15/72;  f(4)=22/72;  f(5)=11/72;  f(6)=7/72;  f(7)=2/72.
A bar chart for the above is constructed as shown below; it is termed a "histogram" for the sample.
The above bar chart displays some important features of the weekly repair-calls. Namely, it suggests that the most probable number of repair calls occurs at i=4, the 4th value-interval, or 51-60 calls per week. Secondly, the shape of the histogram provides another clue as to the form of the estimated probability distribution function, f(x).
Note that the "fitted" f(x) shown in the figure is just a qualitative illustration; details of sample fitting will be further discussed in Chapter III.
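The interval-grouping procedure above can be carried out in a few lines; the data is typed in from the table of Example 1-6:

```python
from collections import Counter

# The 72 weekly repair-call counts from Example 1-6, typed in row by row.
calls = [71,73,22,27,46,47,36,69,38,36,36,37,79,83,42,43,45,45,55,47,48,60,
         60,60,49,50,51,75,76,78,31,32,35,85,58,59,38,39,40,40,41,42,42,54,
         73,53,54,65,66,55,55,56,56,57,49,51,46,54,62,62,54,62,63,64,67,37,
         58,58,61,62,52,52]

# Map each value to its interval index i = 1..7 (21-30 -> 1, ..., 81-90 -> 7)
counts = Counter((x - 21) // 10 + 1 for x in calls)
for i in range(1, 8):
    lo = 21 + 10 * (i - 1)
    print(f"{lo}-{lo + 9}: n = {counts[i]:2d},  f({i}) = {counts[i]}/72")
```

Changing the interval width in the grouping line (e.g., `// 6` with a suitable offset) reproduces the "non-uniqueness" of the histogram discussed below.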
Discussions. We note that the "histogram" obtained above is not unique, for one may take more or fewer value intervals in the range from 21 to 90. In each case, one may obtain a somewhat different histogram for the same sample; often, one may even draw a quite different conclusion for the problem under consideration. This aspect of handling sample data will be examined further in later chapters.
Sampling Errors. As has been seen, most engineering-related events are unlike rolling dice. Rather, the mechanisms that generate the random variable X are not exactly known. Moreover,
the values of X may not be distinct and the range of their values may not be defined exactly either. In such cases, the probability density f(x) can be estimated from test samples. However, questions arise as to the proper size of the sample, possible bias that may have been introduced in taking the sample, and the manner in which the sample is analyzed. This necessitates an evaluation of the confidence level in estimating f(x). We shall defer the discussion of this subject to Chapter III.
I-3. Concluding Remarks.
This chapter provides an overview of the intrinsic elements that embody the subject of engineering reliability. At the heart is the interrelationship linking the random variable X, the mechanisms which generate the values of X, the statistics of the sample data related to X, and the determination and/or estimation of the probability distribution function f(x).
The fundamentals of probability, i.e. the mechanisms that generate the random variable X and the properties of various probability functions, are investigated more critically in Chapter II. Sample statistics and methods for fitting sample data, some known probability distribution functions, sampling error estimates and related subjects are discussed in Chapter III. The basics of reliability and failure rates are included in Chapter IV. Chapter V presents techniques for reliability testing, while Chapter VI discusses applications to some elementary product quality control issues. Throughout the text, simple but pertinent examples are used to illustrate the essence of the important points and/or concepts involved in the subject of concern.
A modest number of homework problems are included in each chapter; the students are urged to do the problems with clear logical reasoning, rather than seeking the "formula" and "plugging in" just for an answer.
Summary.
As a beginner, it is useful to be conceptually clear about the meaning of some of the key terminologies introduced in this chapter, and to distinguish their differences and interrelations:
* The random variable X is generated to occur as an event, by some mechanisms that may or may not be known. When it occurs, X may assume a certain real value, denoted by x; and x can be any one value inside a particular range; say, 0 ≤ x < ∞.
* In one occurrence, there is a chance (probability) that X=x; that chance is denoted by f(x); here X=x means X equals exactly x.
* Since x denotes any value in the range of X values, f(x) is treated mathematically as a distribution function over the value range. Thus, f(x) is the mathematical representation of X; f(x) has several properties and these will be discussed in Chapter II.
* In one occurrence, the probability that X ≤ x is denoted by P{X ≤ x}, or simply F(x); here, F(x) is the sum of all f(x) where X ≤ x. Similarly, P{X > x} denotes the sum of all f(x) where X > x; sometimes, it is also denoted by R(x) and/or by 1 - F(x).
* In this course, we sometimes mix the use of the symbols between P{X ≤ x} and F(x); between P{X > x}, R(x) and 1 - F(x), etc. This can be a source of confusion at times.
* If the exact mechanism that generates X is known, it is theoretically possible that the exact value range of x, along with the theoretical f(x), is also known; if the mechanism is not known exactly, one can only rely upon statistical samples, along with a proper statistical analysis methodology, in order to determine f(x).
* A certain quality of an engineered product can be treated as a random variable (X); x is then the measure of that quality in one such product picked at random. More often than not, the exact mechanisms which generate the product quality (X) are not completely known; hence, sampling of the quality and statistical analysis of the samples become essential in determining the associated f(x) function for X.
* Engineering reliability is a special case where the random variable X represents time-to-failure, such as the service lifetime of a light bulb; for obvious reasons, it is necessary to obtain the time-to-failure probability f(t) for, say, the light bulb.
Assigned Homework.
1.1 Let the random variable X be defined as the product of the two numbers when 2 dice are rolled.
* List all possible values of X by this mechanism;
* Determine the theoretical probability function, f(x);
* Sketch a bar-chart for f(x), similar to Example 1-5;
* Compute F(25); explain the meaning of F(25);
* Compute R(15); explain the meaning of R(15);
* Show that the sum of all possible values of f(x) equals one.
[Partial answer: there are 18 values for X; f(6)=1/9; f(25)=1/36]
1.2 A coin-bag contains 3 pennies, 2 nickels and 3 dimes. If 3 coins are to be taken from the bag each time, their sum is then a random variable: X.
* List all the possible values of X by this mechanism;
* Determine the associated probability distribution f(x);
* Plot the distribution in a graphical bar-chart;
* Show that the sum of all possible values of f(x) equals one.
[Partial answer: there are 56 combinations in drawing "three coins", but only 9 different values; $0.20 and $0.25 are among them]
1.3 (Optional; for extra effort) Let the random variable X be the sum of the three numbers when 3 dice are rolled.
* Compute the theoretical probability distribution f(x) for X;
* Show your results in a bar-chart;
* Comment on the forms of the bar-charts obtained by rolling 1, 2 and 3 dice, respectively.
[There are 216 possible outcomes in rolling 3 dice; they provide only 16 values, from 3 to 18; f(10)=27/216; f(13)=21/216; one die gives a uniform f(x); 2 dice yield a bilinear f(x); ...]
1.4 In Example 1-6, the exact mechanism that generates repair calls (X) is not known; but the sample provided can be used to gain some insight into the probability distribution function f(x).
* Now, by using a class-interval of 6 calls instead of 10 calls, redo the histogram for the sample;
* Discuss the difference between your histogram and the one obtained in Example 1-6.
1.5 (Optional; for extra effort) The Boeing 777 is designed for a mean service life of 20 years in normal use. Let the service life distribution be given by f(t), where t is in years; and the form of f(t) looks like the one shown in Example 1-2.
* Sketch f(t) as a function of t (years); and locate the design life (20 years) on the t-axis;
* If a B777 has been in service for 10 years already, what is the chance that the craft is still fit to fly for another 6 years?
[This is a case of "conditional" probability]
Chapter II
Fundamentals
II  1
CHAPTER II. FUNDAMENTALS IN PROBABILITY
II-1. Some Basic Notions in Probability.
Probability of an Event. Suppose that we perform an experiment in which we test a sample of N "identical" products, and that n of them fail the test. If N is "sufficiently" large (N → ∞), the following defines the probability that a randomly picked product would fail the test:
P{X} = p = n/N;  0 ≤ n ≤ N;  N → ∞     (2.1)
Here, X denotes the event that a product fails the test; it is a random variable because the picked product may or may not fail the test; thus, the value of P{X} represents the probability that the event X (fail the test) does occur. Clearly, the value of P{X} is bounded by:
0 ≤ P{X} ≤ 1.     (2.2)
The Non-Event. Let X be an event with the probability P{X}. We define X' as the non-event of X, meaning that X does not occur. Then, the probability that X' occurs is given by:

P{X'} = 1 - P{X}     (2.3)
The relationship between P{X} and P{X'} can be graphically illustrated by the so-called Venn diagram as shown below:
[Figure: Venn diagram; a unit square containing a shaded circle of area P{X}, with the remaining area representing P{X'}]
In the Venn diagram, the square has a unit area; the (shaded) circle is P{X}; and the area outside the circle is P{X’}. If P{X} is the probability of failure, P{X’} is then the probability of survival, or the reliability.
Event and Non-Event Combination. In a situation where there are only two possible outcomes (such as in tossing a coin, one outcome is a "head" and the other a "tail"), X is a random variable with two distinct values, say 1 and 0; the associated probabilities are then:

f(1) = P{X} = p  and  f(0) = P{X'} = q.

It follows from (2.3) that

f(0) + f(1) = q + p = 1.
Example 2-1: In tossing a coin, the head will or will not appear; we know that the probability for the head to appear is P{X} = p = 1/2, and that for the head not to appear is P{X'} = q = 1 - p = 1/2.
Similarly, in rolling a die, let X be the event that the number "1" occurs. Here, we also know that P{X} = p = 1/6, and the probability that "1" will not occur is P{X'} = q = 1 - p = 5/6.
In the above, we know the exact mechanisms that generate the occurrence of the respective random variable X. In most engineering situations, one can determine P{X} or p from test samples instead.
Example 2-2. In a QC (quality control) test of 500 computer chips, 14 chips fail the test. Here, we let X be the event that a chip fails the QC test; and from the QC result, we estimate using (2.1):

P{X} = p ≈ n/N = 14/500 = 0.028.
Within the condition of the QC test, we say that the computer chip has a probability of failure p = 0.028, or a survivability of q = 0.972.
Discussion. In theory, (2.1) is true only when N → ∞. The p = 0.028 value obtained above is based on a sample of size N=500 only. Hence, it is only an estimate and we do not know how good the estimate is. There is a way to evaluate the goodness of the estimate; this will be discussed in Chapter III.
Combination of Two Events. Suppose that two different events X and Y can possibly occur in one situation, with the respective probabilities P{X} and P{Y}. The following are defined:

P{X ∩ Y} = the probability that both X and Y occur; and P{X ∪ Y} = the probability that either X, or Y, or both occur.

X ∩ Y is termed the intersection of X and Y; X ∪ Y is termed the union of X and Y; a graphical representation of these two cases is shown by means of the Venn diagrams:
In each diagram, the outline square area is 1x1, representing the total probability; the circles X
and Y represent the probabilities of the respective events to occur. The shaded area on the left is
X ∩ Y, in which both X and Y occur; the shaded area on the right is X ∪ Y, in which either X or Y or both occur. The union is mathematically expressible as:

P{X ∪ Y} = P{X} + P{Y} - P{X ∩ Y}     (2.4)

which can be inferred from the Venn diagram.
Note that the blank area outside the circles in each case represents the probability of the "non-event", that is, neither X nor Y will occur: P{X ∪ Y}' = 1 - P{X ∪ Y}.
Independent Events. If the occurrence of X does not depend on the occurrence of Y, or vice versa, X and Y are said to be mutually independent. Then,

P{X ∩ Y} = P{X} * P{Y}     (2.5)
Expression (2.5) is an axiom of probability; it cannot be shown on a Venn diagram.
Example 2-3. In rolling two dice, let the occurrence of #1 in the first die be X and that in the second be Y. In this case, the occurrence of Y does not depend on that of X; and we know P{X} = P{Y} = 1/6. It follows from (2.5) and (2.4), respectively, that

P{X ∩ Y} = #1 appears in both dice = (1/6)(1/6) = 1/36.
P{X ∪ Y} = #1 appears in either or both dice = 1/6 + 1/6 - 1/36 = 11/36.

Discussion: The fact that P{X ∪ Y} = 11/36 can also be found as follows: there are in all 11 possible combinations in which #1 appears in either or both dice, namely (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (3,1), (4,1), (5,1), (6,1), out of a total of 36 possible outcomes.
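The two probabilities in Example 2-3 can be checked by enumerating all 36 outcomes directly:

```python
from fractions import Fraction

# Check Example 2-3 by enumerating all 36 outcomes of rolling two dice.
outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]

both   = sum(1 for a, b in outcomes if a == 1 and b == 1)  # X and Y occur
either = sum(1 for a, b in outcomes if a == 1 or b == 1)   # X or Y or both

print(Fraction(both, 36), Fraction(either, 36))
```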
Conditional Probability. If the occurrence of X depends on the occurrence of Y, or vice versa, then X and Y are said to be mutually dependent. We define

P{X/Y} = probability of X occurring, given the occurrence of Y;
P{Y/X} = probability of Y occurring, given the occurrence of X.

It follows from the axiom (2.5) that

P{X ∩ Y} = P{X/Y} * P{Y} = P{Y/X} * P{X}     (2.6)
Example 2-4. Inside a bag, there are 2 red and 3 black balls. The probability of drawing a red ball out is P{X} = 2/5, and that of then drawing another red ball from the rest of the balls in the bag is P{Y/X} = 1/4. Thus, to draw both red balls consecutively, the probability is:

P{X ∩ Y} = P{Y/X} * P{X} = (1/4)(2/5) = 1/10.
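A brute-force check of Example 2-4: enumerate every ordered pair of draws from the five balls (the R/B labels are illustrative):

```python
from itertools import permutations

# Enumerate every ordered draw of 2 balls from {R, R, B, B, B} and count the
# fraction in which both draws are red, as in Example 2-4.
balls = ['R', 'R', 'B', 'B', 'B']
draws = list(permutations(range(5), 2))     # 5 * 4 = 20 ordered pairs
both_red = sum(1 for i, j in draws if balls[i] == 'R' and balls[j] == 'R')

print(both_red, "of", len(draws), "=", both_red / len(draws))
```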
Example 2-5. An electrical system is protected by two circuit breakers that are arranged in series. When an electrical surge passes through, at least one breaker must break in order to protect the system; if both fail to break, the system would be damaged.
In a QC test of the breakers individually, the probability for a breaker not to break is P{X} = 0.02;
and if two are connected in series, the probability of failure to break of the second, given the failure to break of the first, is much higher: P{Y/X} = P{X/Y} = 0.1. The probability that the system fails by an electric surge is the probability that both breakers fail to break:

P{X ∩ Y} = P{Y/X} * P{X} = (0.1)(0.02) = 0.002.

The probability that at least one fails to break (when they are in series) is:

P{X ∪ Y} = P{X} + P{Y} - P{X ∩ Y} = 0.02 + 0.02 - 0.002 = 0.038.
Discussion. If failure-to-break of one breaker does not affect the other, the failure probability of the system is the probability that both fail to break:

P{X ∩ Y} = P{Y} * P{X} = (0.02)(0.02) = 0.0004.
Mutually Exclusive Events. In a situation where if X occurs then Y cannot, and vice versa, X and Y are said to be mutually exclusive. In the Venn diagram, X and Y do not intersect. So,

P{X ∩ Y} = 0.     (2.7)

It follows from (2.4) and (2.6), respectively, that

P{X ∪ Y} = P{X} + P{Y}     (2.8)
P{X/Y} = P{Y/X} = 0
Discussion: To illustrate mutually exclusive or non-exclusive events, consider the following example: for a deck of poker cards, the probability of drawing an "ace" of any suit is P{X} = 4/52, and that of drawing a "king" is P{Y} = 4/52. These two events are mutually exclusive in a single draw, since if an "ace" is drawn, it is impossible to also draw a "king". Thus,
P{X ∩ Y} = 0;  P{X ∪ Y} = P{X} + P{Y} = 8/52
Alternatively, given the occurrence of an "ace", the probability of drawing a "king" (in a single draw) is
P{Y/X} = 0.
Now, in a single draw, the probability of getting a "heart" is P{Z} = 13/52; the chance of getting the "ace of hearts" is then

P{X ∩ Z} = P{X} * P{Z} = (4/52)(13/52) = 1/52.
The chance of getting either an "ace" or a "heart" is the union of X and Z:

P{X ∪ Z} = P{X} + P{Z} - P{X ∩ Z} = 4/52 + 13/52 - 1/52 = 16/52
Note that getting an "ace" (X) and a "heart" (Z) are not mutually exclusive, while getting an "ace" (X) and a "king" (Y) are mutually exclusive.
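The card probabilities above can be verified by direct counting over a 52-card deck (the rank and suit labels below are chosen for illustration):

```python
from fractions import Fraction
from itertools import product

# Verify the card probabilities by direct counting over a 52-card deck.
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))          # 52 (rank, suit) pairs

ace          = sum(1 for r, s in deck if r == 'A')
ace_or_king  = sum(1 for r, s in deck if r in ('A', 'K'))   # exclusive union
ace_of_heart = sum(1 for r, s in deck if r == 'A' and s == 'hearts')
ace_or_heart = sum(1 for r, s in deck if r == 'A' or s == 'hearts')

for n in (ace, ace_or_king, ace_of_heart, ace_or_heart):
    print(Fraction(n, 52))
```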
Combination of N Events. In a situation where N possible events X1, X2, X3, ..., XN can occur, their intersection or union cannot in general be obtained. However, if they are independent events, then their intersection is:
P{X _{1} « X _{2} « X _{3} 
« X _{N} } = P{X _{1} }* P{X _{2} }* P{X _{3} }* 
* 
* P{X _{N} } 
(2.9) 
Note that (2.9) represents the probability that all N events occur. 

Similarly, X'_1, X'_2, X'_3, …, X'_N are the respective non-events; their intersection is:

P{X'_1 ∩ X'_2 ∩ X'_3 ∩ … ∩ X'_N} = P{X'_1}·P{X'_2}·P{X'_3}·…·P{X'_N}   (2.10)

And the above is the probability that none of the N events occurs.
As for the union of the N events, P{X_1 ∪ X_2 ∪ … ∪ X_N}, it represents the probability that one or more (or all) of the N events occur. Since P{one or more events occur} + P{none occurs} = 1, we can write:

P{X_1 ∪ X_2 ∪ … ∪ X_N} + [P{X'_1}·P{X'_2}·P{X'_3}·…·P{X'_N}] = 1   (2.11)

Note that (2.11) is the total probability of all possible outcomes; thus, it equals unity.

The terms P{X'_i}, i = 1, 2, …, N in (2.11) can be replaced by

P{X'_i} = 1 − P{X_i};   i = 1, 2, …, N   (2.12)

Hence, the union of all N events can be expressed in the following alternate form:

P{X_1 ∪ X_2 ∪ … ∪ X_N} = 1 − [1 − P{X_1}]·[1 − P{X_2}]·…·[1 − P{X_N}]   (2.13)
A special case: if P{X_i} = p for all i = 1, …, N, then the intersection in (2.9) becomes:

P{X_1 ∩ X_2 ∩ … ∩ X_N} = p^N

And the union in (2.13) becomes:

P{X_1 ∪ X_2 ∪ … ∪ X_N} = 1 − (1 − p)^N

The example below illustrates such a special case.
Example 2-6: A structural splice consists of two panels connected by 28 rivets. QC finds that 18 out of 100 splices have at least one defective rivet. If we assume defective rivets occur independently and the probability of a rivet being defective is p, what can we say about the quality of the rivets?

Here, let X_i, i = 1, …, 28 be the event that the i-th rivet is found defective in one randomly chosen splice, with P{X_i} = p. The probability for one or more (up to all) rivets to be found defective in one splice is the union of all X_i, i = 1, …, 28: P{X_1 ∪ X_2 ∪ … ∪ X_28}.

But QC finds the probability of a splice having at least one defective rivet to be 18/100; hence,

P{X_1 ∪ X_2 ∪ … ∪ X_28} = 1 − [1 − p]^28 = 0.18.
Solving, we obtain p = 0.0071; that is, about 7 out of 1000 rivets may be found defective.

Discussion. In this example, the QC rejection rate of the splice (0.18) is given, but the probability that a single rivet is defective is not. By using the definitions of the intersection and union of multiple events, we can estimate the probability p that a single rivet is defective.
Conversely, if p is given, we can use the same relations to estimate the rejection rate of the splice.
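The solution step above amounts to inverting (2.13). A short Python sketch (variable names are illustrative):

```python
# Per-rivet defect probability from the splice rejection rate, via (2.13):
# 1 - (1 - p)**28 = 0.18  =>  p = 1 - (1 - 0.18)**(1/28)
reject_rate = 0.18
n_rivets = 28

p = 1.0 - (1.0 - reject_rate) ** (1.0 / n_rivets)
print(round(p, 4))   # 0.0071

# Conversely, a given p predicts the splice rejection rate:
predicted = 1.0 - (1.0 - p) ** n_rivets
print(round(predicted, 2))   # 0.18
```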
II-2 The Binomial Distribution.

A special case of N events leads to the so-called "binomial distribution"; it is also referred to as the "Bernoulli trials". Specifically, if the events X_1, X_2, …, X_N are independent as well as statistically identical:

P{X_n} = p   and   P{X'_n} = 1 − p = q;   for n = 1, 2, …, N.

Then, we can always write:

P{X_n} + P{X'_n} = p + q = 1   for n = 1, 2, …, N   (2.14)

It follows that

(q + p)^N = 1   (2.15)

Note that (2.15) is a binomial of power N; upon expansion, we have

C^N_0 q^N + C^N_1 q^(N−1) p + C^N_2 q^(N−2) p^2 + … + C^N_i q^(N−i) p^i + … + C^N_N p^N = 1   (2.16)

where

C^N_i = N! / [(N−i)! i!],   i = 0, 1, 2, …, N.   (2.17)
It turns out that each term in the binomial expansion (2.16) has a distinct physical meaning:

C^N_0 q^N = q^N is the probability that none of the N events occurs: f(0);
C^N_1 q^(N−1) p is the probability that exactly one of the N events occurs: f(1);
C^N_N p^N = p^N is the probability that all of the N events occur: f(N).

In general, C^N_i q^(N−i) p^i is the probability that exactly i of the N events occur: f(i).

The expression

f(i) = C^N_i q^(N−i) p^i   (2.18)

is known as the binomial distribution, representing the probability that, of the N events, exactly i events (i = 0, 1, 2, …, N) will occur.

Note that (2.16) is the total probability for all possible outcomes:

f(0) + f(1) + f(2) + … + f(N) = 1   (2.19)

It can, in turn, also be rewritten as:

f(1) + f(2) + … + f(N) = 1 − f(0) = 1 − q^N   (2.20)

which is the probability for one or more events to occur (the union of all events).
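Equations (2.18)–(2.20) translate directly into code. A minimal Python sketch (the function name is illustrative):

```python
from math import comb

def binom_pmf(i, N, p):
    """f(i) of Eq. (2.18): probability that exactly i of N independent,
    statistically identical events occur."""
    q = 1.0 - p
    return comb(N, i) * q ** (N - i) * p ** i

N, p = 4, 1 / 6                  # e.g. four die rolls; event = "a #1 appears"
f = [binom_pmf(i, N, p) for i in range(N + 1)]

total = sum(f)                   # Eq. (2.19): must equal unity
union = 1 - f[0]                 # Eq. (2.20): P{one or more events occur}
print(round(total, 12), round(union, 3))
```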
The binomial distribution (2.18) is also associated with the name "Bernoulli trials"; the meaning of a trial is illustrated in the following examples:
Example 2-7: Suppose that a system is made of 2 identical units A and B. Under certain prescribed operational conditions, the failure probability of each unit is p.

For the system, one of the following 4 situations may occur during the prescribed operation:

A fails, B fails;   A fails, B does not fail;
A does not fail, B fails;   A does not fail, B does not fail.

The associated probabilities for the four situations are, respectively, pp, pq, qp and qq. Thus, we have:

f(0) = probability of no failure = qq
f(1) = probability of just one failure = pq + qp = 2pq
f(2) = probability of two failures = pp

Check that the total probability of all possible outcomes is 100%:

f(0) + f(1) + f(2) = q^2 + 2pq + p^2 = (q + p)^2 = 1

The above follows the binomial distribution for N=2: (q + p)^2 = 1.
Example 2-8: In rolling a die repeatedly, what is the probability that the #1 appears at least once in 4 trials?

Here, the same event (the appearance of #1) is observed in N=4 repeated "trials"; the random variable of interest is the number of times the observed event occurs, which can be any number i = 0, 1, 2, 3 or 4. This is the case known as the Bernoulli trial.

Now, for f(0), f(1), f(2), f(3) and f(4), we can enumerate the outcomes of the 4 trials ("yes" means #1 appears, "no" means it does not):

f(0): (no, no, no, no) = q^4

f(1): (yes, no, no, no), (no, yes, no, no), (no, no, yes, no), (no, no, no, yes); each has probability pq^3, giving 4pq^3

f(2): (yes, yes, no, no), (yes, no, yes, no), (yes, no, no, yes), (no, yes, yes, no), (no, yes, no, yes), (no, no, yes, yes); each has probability p^2q^2, giving 6p^2q^2

f(3): (yes, yes, yes, no), (yes, yes, no, yes), (yes, no, yes, yes), (no, yes, yes, yes); each has probability p^3q, giving 4p^3q

f(4): (yes, yes, yes, yes) = p^4

The total probability of all outcomes is thus:

f(0) + f(1) + f(2) + f(3) + f(4) = q^4 + 4q^3p + 6q^2p^2 + 4qp^3 + p^4 = (q + p)^4 = 1

Again, it follows the binomial distribution.
Discussion: For engineering products, a system or a single component may be under repeated and statistically identical demands. Say, in each demand, the failure probability is p = 1/6 while that for non-failure is q = 5/6. Then, the probability that failure of the system (or component) occurs at least once in 4 repeated demands is given by:

f(1) + f(2) + f(3) + f(4) = 1 − f(0) = 1 − q^4 = 1 − (5/6)^4 = 51.8%
The above result can be obtained by applying (2.18) through (2.20) directly.
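The 51.8% figure can also be checked by simulation. A short Monte Carlo sketch in Python (the trial count and seed are arbitrary choices):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def p_at_least_one(trials=200_000):
    """Monte Carlo estimate of P{#1 appears at least once in 4 rolls}."""
    hits = sum(
        1 for _ in range(trials)
        if any(random.randint(1, 6) == 1 for _ in range(4))
    )
    return hits / trials

exact = 1 - (5 / 6) ** 4      # 0.5177..., i.e. the 51.8% above
estimate = p_at_least_one()
print(round(exact, 3), round(estimate, 3))
```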
The Pascal Triangle. The coefficients in the binomial expansion of power N in (2.17) can be easily represented by a geometrical construction, known as the Pascal triangle, if N is not very large:

N=0:                          1
N=1:                        1   1
N=2:                      1   2   1
N=3:                    1   3   3   1
N=4:                  1   4   6   4   1
N=5:                1   5  10  10   5   1
N=6:              1   6  15  20  15   6   1
N=7:            1   7  21  35  35  21   7   1
N=8:          1   8  28  56  70  56  28   8   1
 . . .
Example 2-9. Suppose the probability of a light bulb being burnt out is p whenever the switch is turned on. In a hallway, 8 light bulbs are controlled by one switch. Compute f(0), f(1), …, f(8) when the switch is turned on.

Using the Pascal triangle, for N=8, we can quickly write:

f(0) = q^8;          f(5) = 56 q^3 p^5;
f(1) = 8 q^7 p;      f(6) = 28 q^2 p^6;
f(2) = 28 q^6 p^2;   f(7) = 8 q p^7;
f(3) = 56 q^5 p^3;   f(8) = p^8.
f(4) = 70 q^4 p^4;
The Poisson Distribution. While the Pascal triangle becomes cumbersome to use when N is large, say N > 20, there is a simple expression for (2.18) when N is large and p is small (say N > 20 and p << 1):

f(i) = (Np)^i exp[−Np] / i!,   i = 0, 1, 2, …, N   (2.21)

Expression (2.21) is known as the Poisson distribution. It is an approximation of the binomial distribution (2.18) when N is large and p small.
Example 2-10: Suppose that the probability for a compressor to pass a QC test is q = 0.9. If 10 compressors are put through the QC test, compute the various probabilities of failure f(i), i = 0, 1, …, 10.

In this case, N = 10 and p = 0.1; the binomial distribution (2.18) and the simplified Poisson distribution (2.21) give the following respective results:

            f(0)    f(1)    f(2)    f(3)    f(4)    f(5)      f(6) … f(10)
by (2.18)   0.349   0.387   0.194   0.057   0.011   0.0015    ≈ 0
by (2.21)   0.368   0.368   0.184   0.061   0.015   0.0031    ≈ 0
Discussion: The exact binomial distribution (2.18) has its maximum at f(1) = 0.387, with f(0) = 0.349; the Poisson approximation (2.21) gives f(0) = f(1) = 0.368. For the rest, the two distributions are rather close. The Poisson approximation would yield better results if N were larger or p smaller; in this example, the value of N (= 10) is not large enough and p (= 0.1) is not small enough, hence the difference.
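The two rows of the table can be reproduced directly from (2.18) and (2.21). A short Python sketch (function names are illustrative):

```python
from math import comb, exp, factorial

def binom_pmf(i, N, p):
    """Binomial pmf, Eq. (2.18)."""
    return comb(N, i) * (1 - p) ** (N - i) * p ** i

def poisson_pmf(i, m):
    """Poisson pmf, Eq. (2.21), with m = Np."""
    return m ** i * exp(-m) / factorial(i)

N, p = 10, 0.1
for i in range(6):
    print(i, round(binom_pmf(i, N, p), 4), round(poisson_pmf(i, N * p), 4))
```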
II-3. Properties of Discrete Random Variables.

The random variable X is discrete when its values x_i (i = 1, …, N) are distinctly defined and N is finite. If the probability that X takes the value x_i is f(x_i), then f(x_i) has the following important properties:

The axiom of total probability:

Σ f(x_i) = 1;   Σ sums over i = 1, …, N   (2.22)

Here, the function f(x_i) is formally termed the probability mass function of X, or pmf for short. It is called a "mass" function in the sense that the probability f(x_i) is exacted at X = x_i, much like the lumped-mass representation of particle dynamics in physics; the expression in (2.22), which equals unity, resembles the "total" of all the "lumped masses".
The partial sum (for n ≤ N):

F(x_n) = Σ f(x_i);   Σ sums over i = 1, …, n   (2.23)

is termed the cumulative mass function of X, or CMF for short. Note that

0 ≤ F(x_n) ≤ 1   for 1 ≤ n ≤ N.

The Mean of X: The mean of f(x_i) is defined as:

μ = Σ x_i f(x_i);   Σ sums over i = 1, …, N   (2.24)

The Variance of X: The variance of f(x_i) is defined as:

σ^2 = Σ (x_i − μ)^2 f(x_i);   Σ sums over i = 1, …, N   (2.25)

By utilizing (2.24), the variance defined in (2.25) can be alternately expressed as:

σ^2 = Σ x_i^2 f(x_i) − μ^2;   Σ sums over i = 1, …, N   (2.26)

The Standard Deviation: The standard deviation is σ, as defined in (2.26).
The above properties are illustrated by the following example:

Example 2-11: Suppose the values of X are given as (0, 1, 2, 3, 4, 5) and the associated pmf is: f(0) = 0, f(1) = 1/16, f(2) = 1/4, f(3) = 3/8, f(4) = 1/4 and f(5) = 1/16. Compute the mean, variance and standard deviation of X.

We compute:

* the total probability, by checking the sum: 0 + 1/16 + 1/4 + 3/8 + 1/4 + 1/16 = 1;
* the mean of X, by applying (2.24):

  μ = 0·0 + 1·(1/16) + 2·(1/4) + 3·(3/8) + 4·(1/4) + 5·(1/16) = 3

* the variance of X, by applying (2.26):

  σ^2 = [0^2·0 + 1^2·(1/16) + 2^2·(1/4) + 3^2·(3/8) + 4^2·(1/4) + 5^2·(1/16)] − 3^2 = 1

* the standard deviation, computed as σ = 1.
Discussion: It is geometrically revealing if the pmf is represented by a bar chart versus x and the CMF by a staircase chart, as shown in the figures on the next page. Note that the sum of the pmf bars (lumped masses) is unity and that the center of gravity of the total mass is at x = μ = 3. Note also that the mean is not the simple average of the values of X.
It is also interesting to note that the mass moment of inertia about the mean is σ^2, which equals 1; the radius of gyration of the total mass about the mean is σ, which is the standard deviation. This analogy with physics is sometimes helpful.

Note: In the upper figure, the length of the bar at each x_i represents the value of f(x_i); in the lower figure, the staircase rise at each x_i equals the value of f(x_i). The charts provide a geometric view of the pmf and the CMF, respectively.
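The computations of Example 2-11 can be verified directly from (2.22), (2.24) and (2.26). A minimal Python sketch:

```python
xs = [0, 1, 2, 3, 4, 5]
f = [0, 1/16, 1/4, 3/8, 1/4, 1/16]   # the pmf of Example 2-11

assert abs(sum(f) - 1.0) < 1e-12      # total probability axiom, Eq. (2.22)

mean = sum(x * fx for x, fx in zip(xs, f))                  # Eq. (2.24)
var = sum(x * x * fx for x, fx in zip(xs, f)) - mean ** 2   # Eq. (2.26)
std = var ** 0.5

print(mean, var, std)   # 3.0 1.0 1.0
```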
Example 2-12. For the binomial distribution f(i) given by (2.18), the mean can now be found by using (2.24):

μ = Σ i · C^N_i (1−p)^(N−i) p^i = Np;   Σ sums over i = 1, …, N

The variance is found by using (2.25):

σ^2 = Σ (i − μ)^2 C^N_i (1−p)^(N−i) p^i = Np(1−p);   Σ sums over i = 1, …, N

The proof of the above results for μ and σ^2 requires manipulations of the binomial terms in (2.18); the details will not be included here. Interested readers may consult the text by E. E. Lewis (2nd Edition, p. 23).

Recall that the Poisson distribution (2.21) is a simplified version of the binomial distribution when N → ∞ and p << 1. The mean μ and the variance σ^2 for the Poisson distribution are readily obtained from the above:

μ = σ^2 = Np

Thus, the Poisson distribution is a "one-parameter" exponential function (see 2.21):

f(i) = μ^i e^(−μ) / i!
The Expected Value: Suppose that a weighting function g(x_i) accompanies f(x_i) whenever the random variable X assumes the value x_i. Then, the expected value of X with respect to the weighting function g(x_i) is defined as:

E(g) = Σ g(x_i) f(x_i);   Σ sums over i = 1, …, N   (2.27)

Here, the weighting function g(x_i) may be understood as follows:

* Each time x_i is taken by X, a value of g(x_i) is realized. But for X to assume x_i there is the probability f(x_i); hence, the expected realization is g(x_i) f(x_i);
* E(g) is the cumulative realized value over all possible values of X.
Example 2-13. Suppose in Example 2-11 the accompanying weighting function is g(x_i) = a·x_i, where i = 0 to 5 and a is a real constant. Then, the expected value E(g) is given by (2.27):

E(g) = a[0·0 + 1·(1/16) + 2·(1/4) + 3·(3/8) + 4·(1/4) + 5·(1/16)] = aμ = 3a.

Applications of the expected-value formula (2.27) will be detailed in Chapter VI.
Summary of Sections II-2 and II-3. The following table summarizes the essential elements of the discrete random variable X discussed in Sections II-2 and II-3:

Function/Property         General Discrete                   Bernoulli Trials: Binomial            Bernoulli Trials: Poisson
values of X               x_i, i = 1, …, N                   i = 0, 1, 2, …, N                     i = 0, 1, 2, …, N
pmf                       f(x_i)                             f(i) = {N!/[(N−i)! i!]} p^i q^(N−i)   f(i) = (Np)^i e^(−Np) / i!
CMF                       F(x_n) = Σ f(x_i), over 1, …, n    F(n) = Σ f(i), over 0, …, n           F(n) = Σ f(i), over 0, …, n
mean of X, μ              μ = Σ x_i f(x_i), over 1, …, N     μ = Np                                μ = Np
variance of X, σ^2        σ^2 = Σ x_i^2 f(x_i) − μ^2         σ^2 = Np(1−p)                         σ^2 = Np
non-event (reliability)   R(x_n) = 1 − F(x_n)                f(0) = (1−p)^N                        f(0) = e^(−Np)

II-4. Properties of Continuous Random Variables.

A random variable X is continuous when its values x are continuously distributed over the range of X. For definiteness, let the range of X be −∞ < x < ∞; then the probability density of X at the value x is a continuous function, f(x). Here, f(x) is formally termed the probability density function, or pdf for short. The pdf must satisfy the axiom of total probability:

∫_{−∞}^{∞} f(x) dx = 1   (2.28)
The cumulative distribution function, or CDF for short, is defined as:

F(x) = ∫_{−∞}^{x} f(x) dx   (2.29)

F(x) represents the probability that the value of X ≤ x.
The Mean μ and the Variance σ^2 of f(x) are:

μ = ∫_{−∞}^{∞} x f(x) dx   (2.30)

σ^2 = ∫_{−∞}^{∞} (x − μ)^2 f(x) dx   (2.31)

The variance in (2.31) can alternatively be expressed as:

σ^2 = ∫_{−∞}^{∞} x^2 f(x) dx − μ^2   (2.32)

The proof of (2.32) is left as an exercise problem at the end of this chapter.
Discussion: A pdf of the kind commonly found in engineered-product quality variation is shown graphically on the next page. This f(x) resembles a bell-like curve; the "head" of the curve is where f(x) diminishes as x increases, while the "tail" is where f(x) diminishes as x decreases. The maximum of f(x) occurs at x = x_mode (which can be determined by setting df(x)/dx = 0).

The total area under f(x) is the integral in (2.28), which must equal 1; the centroid of the area under the f(x) curve, represented by the integral (2.30), is located at x = μ, the mean of X; the area moment of inertia with respect to the axis x = μ, represented by (2.31) or (2.32), is the variance σ^2. Finally, the radius of gyration of the area under f(x) about the axis x = μ is σ, the standard deviation of X.
The Expected Value: Given a weighting function g(x) accompanying f(x), the expected value of X with respect to g(x) is given by:

E(g) = ∫_{−∞}^{∞} g(x) f(x) dx   (2.33)

Note: when g(x) = x, E(g) = μ; when g(x) = x^2, E(g) = σ^2 + μ^2. Thus, the expected value of X is its mean, and the expected value of X^2 is (σ^2 + μ^2); the latter is readily seen from (2.32).
The Median of X: The median of X is the value x_m such that

F(x_m) = ∫_{−∞}^{x_m} f(x) dx = 1/2   (2.34)

From the geometric point of view, x = x_m separates the total area under the f(x) curve into two halves.

The Mode of X: The mode of X is the value x_mode corresponding to the maximum of f(x), as discussed above.
Note: The mean μ, the median x_m and the mode x_mode are distinctly defined quantities; each has its own physical meaning. But the three quantities become the same if f(x) is symmetric with respect to the mean: x_mode = x_m = μ.

Skewness of f(x): The pdf is a skewed distribution when it is not symmetric with respect to the mean; then, in general, x_mode ≠ x_m ≠ μ.
A measure of the skewness of f(x), also known as the skewness coefficient, is given by:

sk = (1/σ^3) ∫_{−∞}^{∞} (x − μ)^3 f(x) dx   (2.35)

If X is discrete, then (2.35) can be expressed as:

sk = (1/σ^3) Σ (x_i − μ)^3 f(x_i);   Σ sums over 1, …, N   (2.36)

It may be shown that when

sk > 0, f(x) is a left-skewed curve: x_mode < x_m < μ;
sk < 0, f(x) is a right-skewed curve: x_mode > x_m > μ;
sk = 0, f(x) is a symmetric curve: x_mode = x_m = μ.

A graphical display of the left-skewed and right-skewed curves is shown below:
Example 2-14. Let X be a continuous random variable with its values defined in the interval a ≤ x ≤ b. Suppose that the corresponding pdf is a constant: f(x) = k. This is the case of the uniform distribution.

Now, in order for f(x) to be a bona fide pdf, it must satisfy the total probability axiom:

∫_a^b k dx = k(b − a) = 1

This yields the value of k: k = 1/(b − a).

The cumulative function F(x) is obtained by integrating f(x) from a to x:

F(x) = (x − a)/(b − a)

The mean and variance are easily obtained:

μ = (a + b)/2;   σ^2 = (b − a)^2/12
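The uniform-distribution results can be checked by numerical integration. A short Python sketch using the midpoint rule (the interval [2, 10] and the step count are arbitrary choices):

```python
# Uniform pdf on [a, b]: f(x) = k = 1/(b - a).  Check the stated results
# mean = (a + b)/2 and variance = (b - a)^2 / 12 by midpoint-rule integration.
a, b = 2.0, 10.0
k = 1.0 / (b - a)

n = 100_000
dx = (b - a) / n
xs = [a + (j + 0.5) * dx for j in range(n)]

total = sum(k * dx for _ in xs)                    # should be 1
mean = sum(x * k * dx for x in xs)                 # should be (a + b)/2 = 6
var = sum((x - mean) ** 2 * k * dx for x in xs)    # should be 64/12

print(round(total, 6), round(mean, 6), round(var, 4))
```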
Example 2-15. Let the random variable T be defined in the value range 0 ≤ t < ∞ and the pdf be of the form:

f(t) = λ e^(−λt)

Here, the pdf is an exponential function. Its CDF is easily integrated as:

F(t) = 1 − e^(−λt)

The mean and the standard deviation are also easily obtained:

σ = μ = 1/λ.
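The result σ = μ = 1/λ can be checked by simulation, drawing samples of T through the inverse of the CDF just derived. A short Python sketch (λ = 0.5, the sample size and the seed are arbitrary choices):

```python
import random
from math import log

random.seed(7)
lam = 0.5   # the lambda of the exponential pdf f(t) = lam * exp(-lam * t)

# Inverse-transform sampling: F(t) = 1 - exp(-lam*t)  =>  t = -ln(1 - u)/lam
sample = [-log(1.0 - random.random()) / lam for _ in range(200_000)]

n = len(sample)
mean_s = sum(sample) / n
std_s = (sum((t - mean_s) ** 2 for t in sample) / (n - 1)) ** 0.5

# Both should be close to 1/lam = 2, per Example 2-15.
print(round(mean_s, 2), round(std_s, 2))
```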
Example 2-16. The lifetime (time to failure) of a washing machine is a random variable. Suppose that its pdf is described by the function:

f(t) = A t e^(−0.5t),   0 ≤ t < ∞

where A is a constant and t is in years.

We examine the following properties of this pdf:

(a) The CDF:

F(t) = ∫_0^t A t e^(−0.5t) dt = (4A)[1 − (1 + 0.5t) e^(−0.5t)]

where the integration is carried out by integration by parts.

(b) The pdf satisfies the axiom of total probability, F(∞) = 1:

∫_0^∞ A t e^(−0.5t) dt = 1

Upon carrying out the integration, we obtain A = 1/4.

(c) The mean of the pdf, which is also called the "mean time to failure", or MTTF for short:

μ = ∫_0^∞ t (A t e^(−0.5t)) dt = (1/4)[2!/(0.5)^3] = 4

where the integration is carried out using an integration table.

(d) The variance is given by:

σ^2 = ∫_0^∞ t^2 (A t e^(−0.5t)) dt − μ^2 = (1/4)[3!/(0.5)^4] − 16 = 8

(e) The standard deviation is hence σ = √8.
A graphical display of the pdf and the CDF in this example is shown below:
(f) The skewness coefficient, sk, of the pdf is given by the integral:

sk = (1/σ^3) ∫_0^∞ (t − μ)^3 (A t e^(−0.5t)) dt

With σ = √8 and μ = 4, it can be shown that sk > 0; so the pdf is a left-skewed curve, as can be seen in the plot.
Discussion: Plots of the pdf and/or CDF provide a visual appreciation of the life distribution of the washing machine. We see that the failure probability rises during the first two years in service; about 25% of the machines will fail by the end of the second year, i.e. F(2) ≈ 0.25. Similarly, the mean time-to-failure (MTTF) is 4 years, and we find F(4) = 0.594; so nearly 60% of the machines will fail within 4 years.

If the manufacturer offers a warranty for one full year (t = 1), then F(1) = 0.09; that is, 9% of the machines are expected to fail during the warranty period.
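The numbers quoted in this discussion follow from the CDF of part (a); the MTTF and variance of parts (c) and (d) can likewise be checked by numerical integration. A short Python sketch (the integration range and step count are arbitrary choices):

```python
from math import exp

A = 0.25   # the constant found in part (b)

def f(t):
    """pdf of Example 2-16: f(t) = (1/4) t exp(-0.5 t)."""
    return A * t * exp(-0.5 * t)

def F(t):
    """CDF from part (a), with 4A = 1."""
    return 1.0 - (1.0 + 0.5 * t) * exp(-0.5 * t)

print(round(F(1), 3), round(F(2), 3), round(F(4), 3))   # 0.09 0.264 0.594

# MTTF and variance by trapezoidal integration; the tail beyond t = 200
# years is negligible.
n, T = 200_000, 200.0
dt = T / n
ts = [j * dt for j in range(n + 1)]
wts = [dt / 2 if j in (0, n) else dt for j in range(n + 1)]

mttf = sum(w * t * f(t) for w, t in zip(wts, ts))
var = sum(w * t * t * f(t) for w, t in zip(wts, ts)) - mttf ** 2
print(round(mttf, 3), round(var, 3))   # 4.0 8.0
```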
Another Note: Often, one has to evaluate complicated integrals in order to obtain explicit results; an integration table at hand is always helpful. At times, an integration may become so tedious that an explicit result cannot be obtained. Such difficulty can be circumvented by a suitable numerical method of integration; one should not view it as an inherent feature of this course.
Summary.
This chapter introduces: (1) the basic notions in probability, and (2) the mathematical definitions of the properties of a probability distribution function. It is essential to be conceptually clear and mathematically proficient in dealing with these subjects.
In the former, we should be clear about the following:
The event X: the probability of X occurring is denoted by P{X}; the probability of X not occurring is P{X'} = 1 − P{X}.

If X is a random variable, X can have a range of possible values, denoted by x if the values are continuous or by x_i if the values are discrete. The probability of X = x is denoted by P{X = x} = f(x); similarly, the probability of X = x_i is denoted by f(x_i).

f(x) is called a pdf when x is continuous; f(x_i) is called a pmf when x_i is discrete; f(x) is the probability density at the point where X is exactly equal to x.

F(x) is called the CDF and F(x_i) the CMF. F(x) is the probability that X ≤ x; it is the area under the f(x) curve up to the specified value of x.
For multiple events X, Y, Z or X_1, X_2, X_3, etc., make clear the physical meanings of their "intersection" and "union", as well as the meanings of "dependent", "independent" and "mutually exclusive" events.

If the occurrence of X depends on the occurrence of Y, the probability of X occurring given Y is P{X/Y}; this is known as the conditional probability. But if X and Y are independent events, P{X/Y} = P{X} and P{Y/X} = P{Y} as well.

For identical and independent events, the binomial distribution applies. The simpler Poisson distribution is a degenerate binomial distribution, for N large and p small. For identical but mutually dependent events, the binomial distribution does not apply.
As for the properties of a probability function f(x), note the definitions of the total probability axiom and of the distribution mean, variance and skewness. The values of X may be discrete or continuous; handle them with care. Elementary integration skill will be helpful.
Assigned Homework.
2.1. Suppose that P{X} = 0.32, P{Y} = 0.44 and P{X ∪ Y} = 0.58. Answer the following questions with proof:
(a) Are X and Y mutually exclusive?
(b) Are X and Y independent events?
(c) What is the value of P{X/Y}?
(d) What is the value of P{Y/X}?
[Partial answer: (a): no; (b): no; P{X/Y} = 0.409; P{Y/X} = 0.5625]
2.2. Suppose that P{A} = 0.4, P{A ∪ B} = 0.8 and P{A ∩ B} = 0.2. Compute:
(a) P{B};
(b) P{A/B};
(c) P{B/A}.
[Partial answer: P{B} = 0.6; P{B/A} = 0.5]
2.3. In a QC test of 126 computer chips from two different suppliers, the following are the test results:

              Pass the QC test    Do not pass
Supplier 1          80                 4
Supplier 2          40                 2

Now, let A denote the event that a chip is from Supplier 1 and B the event that a chip passes the QC test. Then answer the following questions:
(a) Are A and B independent events?
(b) Are A' and B independent events?
(c) What is the meaning of P{A ∪ B}?
(d) What is the value of P{A ∪ B}?
[Hint: P{A} = 84/126; P{B} = 120/126 and P{A ∩ B} = 80/126]
2.4. Use the Venn diagram to show the following equalities:
* P{Y} = P{Y ∩ X} + P{Y ∩ X'}, where X and Y may be dependent events;
* P{Y} = P{Y/X} P{X} + P{Y/X'} P{X'}, where X and Y may be dependent events;
* P{Y} = P{Y} P{X} + P{Y} P{X'}, if X and Y are independent events.
2.5. An electric motor is used to power a cooling fan; an auxiliary battery is used in the event of a main power outage. Experiment indicates that the chance of a main power outage is 0.6%. When the main power is on, the chance that the motor itself fails is p_m = 0.25×10^−3; when the auxiliary battery is on, the chance of motor failure is p_b = 0.75×10^−3. Determine the probability that the cooling fan fails to function.
[Hint: let X be the event of a main power outage and Y the event of the fan failing to operate; note that Y depends on X and X', and P{Y} = P{Y/X} P{X} + P{Y/X'} P{X'} applies. Note also P{Y/X} = p_b and P{Y/X'} = p_m. Answer: P{Y} = 0.253×10^−3]
2.6. The values of the random variable X are (1, 2, 3); the associated pmf is f(x_i) = C·x_i^3, i = 1, …, 3.
(a) Find the value of C;
(b) Write the expression for F(x_i);
(c) Determine μ, σ and sk;
(d) Plot f(x_i) and F(x_i) graphically.
[Partial answer: C = 1/36; μ = 2.722; σ = 0.506; sk = −1.62]
2.7. A single die is rolled repeatedly 6 times; each time the #6 is desired.
(a) Use (2.18) to compute the pmf f(i), where i = 0, …, 6 is the number of times the #6 appears;
(b) Plot a bar chart for f(i), i = 0, …, 6; indicate the largest f(i) value;
(c) Determine the mean and variance of f(i);
(d) Redo the computation of f(i) by the Poisson equation (2.21); comment on the difference.
2.8. Show the details of how Equation (2.25) is reduced to (2.26); similarly, show how Equation (2.31) is reduced to Equation (2.32).
2.9. In a QC test of a lot of engines, 3% failed the test. Now, 8 such engines are put into service; what is the probability of each of the following situations?
(a) None will fail;
(b) All will fail;
(c) More than half will fail;
(d) Less than half will fail.
[Partial answer: (a): 0.784; (b): nearly 0]
2.10. The probability of a computer chip being defective is p = 0.002. A lot of 1000 such chips is inspected:
(a) What is the probability that 0.1% or more of the chips are defective?
(b) What is the probability that more than 0.5% of the chips are defective?
(c) What is the mean (expected) number of defective chips?
[Hint: Use the Poisson distribution for p << 1 and N >> 1. Partial answer: (a): P{n > 1} = 1 − f(0) − f(1) = 0.594]
2.11. Suppose that the probability of finding a flaw of size x in a beam is described by the pdf

f(x) = 4x e^(−2x),   0 ≤ x < ∞,

where x is in microns (10^−6 m).
(a) Verify that f(x) satisfies the "total probability" axiom;
(b) Determine the mean value of the flaw-size distribution;
(c) If a flaw is less than 1.5 microns, the beam passes inspection; what is the chance that a beam is accepted?
[Partial answer: (b) μ = 1 micron; (c) P{X < 1.5} = 0.8]
2.12. A computer board is made of 64 k-bit units; the board passes inspection only if each k-bit unit is perfect. On-line inspection of 1000 boards finds 60 of them unacceptable. What can you say about the quality of the k-bit units?
[Hint: let p be the probability that a k-bit unit is imperfect; then find the value of p.]
2.13. (Optional, for extra effort) A cell-phone vendor assures that the reliability of the phone is 99% or better; the buyer would consider an order of 1000 phones if the reliability is 98% or better. The two sides then agree to inspect 50 phones; if no more than 1 phone fails the inspection, the deal will be made.
(a) Estimate the vendor's risk that the deal is off;
(b) Estimate the buyer's risk that the deal is on.
[Hint: For the vendor, p = 0.01; estimate the chance that more than 1 of the 50 inspected phones fail. For the buyer, p = 0.02; estimate the chance that no more than 1 of the 50 inspected phones fails.]
CHAPTER III. DATA SAMPLING AND DISTRIBUTIONS
In the preceding chapters, some elementary concepts in probability and random variables were introduced and illustrated with examples wherever possible. We note that the central element in these examples is the pertinent probability distribution function (pmf or pdf) for the random variable X identified in the particular problem. In general, the pertinent pmf or pdf may be determined, or estimated, by one of the following two approaches:
(a) Probabilistic Approach. If the exact mechanisms by which the random variable X is generated are known, the underlying pmf or pdf for X can be determined on the basis of probability theory. In Chapter II, we illustrated this approach using simple examples such as rolling dice or flipping coins. In addition, a somewhat more complex mechanism involving the so-called Bernoulli trials was shown to lead to a class of pmf's known as the binomial distribution.
(b) Data Sampling Approach. Engineering issues, such as the lifetime of a product in service or the expected performance level of a machine, often involve intrinsic mechanisms that are not exactly known; the underlying pmf or pdf for the identified random variable then cannot be determined exactly. The alternative is to estimate the pmf or pdf using techniques involving data sampling. In Chapter I, we demonstrated briefly how a statistical sample could provide an estimate of the underlying pdf for the identified random variable. But that process involves certain techniques whose details need to be thoroughly discussed.
Thus, this Chapter discusses the basic elements in the data sampling approaches, along with their connection to some of the wellknown probability distribution functions.
III-1. Sample and Sampling.

At the outset, let us introduce the following terms:

Population. A "population" includes all of its kind. Namely, when a random variable X is defined, all of its possible values constitute the "population". The underlying pdf of X must be defined for each and every element in the population; it is referred to as the true pdf of X. Clearly, for a continuous random variable, the population size is infinite.
Sample. A "sample" is a subset of the population. Thus, the size of a sample is finite even if the population is infinite; elements in the sample represent only part of the possible values of X. In general, more than one sample may be taken from the same population. If a sample contains N elements, it is referred to as a "sample of size N".
Sampling. This refers to the "creation" of sample data in order to estimate the underlying pdf of the random variable X. Depending on how the sample is taken, the estimated pdf may not be close to the true pdf of the population; this is especially the case when the sample size is small compared to that of the population. Hence, proper techniques in sampling become important. Moreover, one would also want to have a degree of confidence in the sampling technique as well as in the estimated pdf.
Random Sampling. This refers to sampling techniques that guarantee each possible value in the population will have an equal chance of being sampled. Such techniques ensure a closer agreement between the estimated pdf and the true pdf.
Sampling Error. This refers to the difference between the estimated pdf (from a sample) and the true pdf of the population. Logically, the sampling error can be easily assessed if the true pdf of the population is known. But the true pdf may never be known in some cases; hence, the error in sampling can be estimated only based on some statistical reasoning.
III-2 Sample Statistics.

Random Sample. Consider a random sample of size N; denote the data in the sample as {x_i}, i = 1, …, N. Assume each x_i is selected at random; the sample pmf is then a uniform distribution:

f(x_i) = 1/N,   for i = 1, …, N   (3.1)

The sample mean, according to (2.24), is then:

μ_s = Σ x_i (1/N) = (1/N) Σ x_i;   Σ sums over 1, …, N   (3.2)

The sample variance and skewness are, respectively:

(σ_s)^2 = (1/N) Σ (x_i − μ_s)^2;   Σ sums over 1, …, N   (3.3)

(sk)_s = [1/(N σ_s^3)] Σ (x_i − μ_s)^3;   Σ sums over 1, …, N   (3.4)

Note that (3.2) is actually the averaged value of the sample {x_i}; it is based on the assumption that each x_i has the same chance of being sampled. However, if N is not large, this assumption can bias the variance; it is likely that at least one value outside {x_i} has been excluded by the sampling. To admit this possibility, the sample variance in (3.3) is often modified to the form:

(σ_s)^2 = [1/(N−1)] Σ (x_i − μ_s)^2;   Σ sums over 1, …, N   (3.5)

The expression (3.5) will be used in all subsequent discussions and in all homework problems.
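Equations (3.2), (3.4) and (3.5) can be collected into a small helper. A Python sketch (the data values are hypothetical, not from the text; the skewness here uses the standard deviation from (3.5), one possible reading of (3.4)):

```python
def sample_stats(xs):
    """Sample mean (3.2), unbiased variance (3.5) and skewness (3.4).
    The skewness uses the standard deviation derived from (3.5)."""
    N = len(xs)
    m = sum(xs) / N
    var = sum((x - m) ** 2 for x in xs) / (N - 1)
    s = var ** 0.5
    sk = sum((x - m) ** 3 for x in xs) / (N * s ** 3)
    return m, var, sk

# Hypothetical braking-distance data, in feet (not the Example 3-1 table):
data = [61.0, 58.5, 63.2, 60.1, 59.4, 62.7, 60.8, 61.9]
m, var, sk = sample_stats(data)
print(round(m, 2), round(var, 2), round(sk, 2))
```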
Note: In most engineering situations, one cannot assume a uniform pdf for a sample, e.g. the one shown in (3.1), without justification. The following example is a case in point:

Example 3-1. Seventy (70) cars are picked randomly from the assembly line for QC inspection. For each car, the "braking distance" from a running speed of 35 mph to a complete stop is recorded. The following table lists the "raw data" obtained in the QC tests, referred to as the sample {x_i}, i = 1, …, 70:
Braking distance, in feet