
06-Dec-18

ME-5101
Engineering Analysis &
Statistics
Lect. # 6
Standard Deviation & Variance

Dr. Nazeer Ahmad Anjum


Mechanical Engineering Program
Engineering University Taxila

Measures of Central Tendency

Mean Median Mode


Measures of Variability

• Range
• Interquartile Range
• Quartile Deviation
• Variance
• Standard Deviation


Symbols

Variable                            Population   Sample
Mean                                µ            X̄
Proportion                          π            p
Variance                            σ²           s²
Standard deviation                  σ            s
Size                                N            n
Standard error of the mean          σ_x̄          S_x̄
Standard error of the proportion    σ_p          S_p


Key Terms

• Central Tendency
• Descriptive Statistics
• Deviation Scores
• Frequency Distribution
• Interquartile Range (IQR)
• Kurtosis
• Median
• Mode
• Normal Distribution
• Quartile Deviation (Q)
• Skewness
• Standard Deviation
• Standard Normal Distribution
• Standard Score (Z score)
• Variability
• Variance


Kurtosis


Mean and Standard Deviation of Grouped Data
• Make a frequency table.
• Compute the midpoint (x) of each class.
• Count the number of entries in each class (f).
• Sum the f values to find n, the total number of entries in the distribution.
• Treat each entry of a class as if it falls at the class midpoint.

Introduction to Variance
A population includes every element from the set of observations that can be made. The population mean is usually denoted by "µ". Population attributes are called parameters.

A sample includes one or more observations drawn from the population. If the given list of values is a statistical sample, its mean is called the sample mean, denoted by $\bar{x}$. Sample attributes are called statistics.

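Sample Mean for a Frequency Distribution
If the n observations in a sample are denoted by $x_1, x_2, \ldots, x_n$, the sample mean is

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$

For a frequency distribution in which the value x occurs with frequency f,

$$\bar{x} = \frac{\sum x f}{n}, \qquad n = \sum f$$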

Calculation of Sample & Population Variance
• Deviations from the mean always sum to zero, so they cannot simply be averaged. We could drop the negative signs, which is mathematically the same as taking absolute values; the average of the absolute deviations is known as the mean deviation.
• Absolute values, however, do not lend themselves to the kind of advanced mathematical manipulation needed to develop inferential statistical formulas.
• Squaring the deviations removes the signs instead. The average of the squared deviations about the mean is called the variance.

Calculation of Sample & Population Variance
The sample variance $s^2$ is based on (n − 1) degrees of freedom.
The term degrees of freedom results from the fact that the n deviations $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$ always sum to zero, so specifying the values of any (n − 1) of these quantities automatically determines the remaining one.

The sample range is $r = \max(x_i) - \min(x_i)$.
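
The degrees-of-freedom argument can be checked numerically. The following is a minimal sketch, assuming the five scores 3, 5, 7, 10, 10 used in this lecture's worked example; the variable names are illustrative only.

```python
# Illustrative sample: the five scores used in the lecture's worked example.
data = [3, 5, 7, 10, 10]

n = len(data)
mean = sum(data) / n

# Deviation of each observation from the sample mean.
deviations = [x - mean for x in data]

# The deviations sum to zero (up to floating-point rounding), so any
# n - 1 of them determine the remaining one.
total = sum(deviations)
last_from_others = -sum(deviations[:-1])

# Sample range: r = max(x_i) - min(x_i).
sample_range = max(data) - min(data)

print("deviations:", deviations)
print("sum of deviations:", total)
print("last deviation recovered from the others:", last_from_others)
print("range:", sample_range)
```

Only n − 1 of the deviations are free to vary, which is why the sample variance divides by n − 1 rather than n.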

Introduction to Variance
Variance is a measure of how far data points differ from the mean.
Data Set 1: 3, 5, 7, 10, 10
Data Set 2: 7, 7, 7, 7, 7
What are the mean and median of each data set?

Sample mean: $\bar{x} = \frac{\sum x_i}{n}$    Population mean: $\mu = \frac{\sum x_i}{N}$

Data Set 1: mean = 7, median = 7
Data Set 2: mean = 7, median = 7
The two data sets are clearly not identical, yet the mean and median cannot tell them apart. The variance shows how they differ: it gives us a way to represent the spread of each data set numerically.
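
As a quick check of this comparison, the sketch below uses Python's standard `statistics` module on the two data sets. Note that `statistics.variance` applies the sample formula (division by n − 1); the variable names are assumptions made for illustration.

```python
import statistics

data_set_1 = [3, 5, 7, 10, 10]
data_set_2 = [7, 7, 7, 7, 7]

for name, data in [("Data Set 1", data_set_1), ("Data Set 2", data_set_2)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    # statistics.variance uses the sample formula (divides by n - 1).
    variance = statistics.variance(data)
    print(f"{name}: mean={mean}, median={median}, sample variance={variance}")

var_1 = statistics.variance(data_set_1)
var_2 = statistics.variance(data_set_2)
```

Both sets report the same mean and median, while only the variance separates them.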

Sample Standard Deviation for a Frequency Distribution
If $x_1, x_2, \ldots, x_n$ is a sample of n observations, the sample variance is

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The standard deviation tells how spread out the numbers are. The sample standard deviation can be calculated as

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

and, for a frequency distribution,

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$$

Sample Standard Deviation for a Frequency Distribution
The population mean, denoted μ, can be determined by:

$$\mu = \frac{\sum_{i=1}^{N} x_i f(x_i)}{\sum f} = \frac{\sum_{i=1}^{N} x_i}{N}$$

The population variance, denoted by $\sigma^2$, is

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

The population standard deviation can be calculated as

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

Calculation of the Mean of Grouped Data (Sample)

Ages       f     x     xf
30 – 34    4    32    128
35 – 39    5    37    185
40 – 44    2    42     84
45 – 49    9    47    423
         Σf = 20     Σxf = 820

$$\bar{x} = \frac{\sum x f}{n} = \frac{\sum x f}{\sum f} = \frac{820}{20} = 41.0$$
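
The grouped-mean calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming each class is stored as its lower and upper limits together with its frequency; the list layout and names are hypothetical.

```python
# Class limits and frequencies from the ages table.
classes = [((30, 34), 4), ((35, 39), 5), ((40, 44), 2), ((45, 49), 9)]

# Midpoint of each class: (lower limit + upper limit) / 2.
midpoints = [(lo + hi) / 2 for (lo, hi), _ in classes]
freqs = [f for _, f in classes]

n = sum(freqs)                                          # n = sum of f
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))   # sum of x * f
grouped_mean = sum_xf / n

print(f"n = {n}, sum(xf) = {sum_xf}, mean = {grouped_mean}")
```

Every entry in a class is treated as if it sat at the class midpoint, so the result is an approximation of the mean of the raw ages.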

Calculation of the Standard Deviation of Grouped Data (Sample)

Ages       f    x_i   x_i − X̄   (x_i − X̄)²   (x_i − X̄)²·f
30 – 34    4    32      −9         81           324
35 – 39    5    37      −4         16            80
40 – 44    2    42       1          1             2
45 – 49    9    47       6         36           324
         Σf = 20                  Σ(x_i − X̄)²·f = 730

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{730}{20 - 1}} = \sqrt{38.42} = 6.20$$
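
The same table can be reproduced programmatically. The sketch below assumes the class midpoints and frequencies from the ages example and applies the definitional formula with n − 1 in the denominator.

```python
import math

midpoints = [32, 37, 42, 47]   # class midpoints x
freqs = [4, 5, 2, 9]           # class frequencies f

n = sum(freqs)
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n

# Weighted sum of squared deviations: sum of (x - mean)^2 * f.
ss = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))

sample_variance = ss / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(f"sum of squares = {ss}")
print(f"s^2 = {sample_variance:.2f}, s = {sample_sd:.2f}")
```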

Computation Formula for Standard Deviation (Sample)

x     f     xf     x²f
32    4    128    4096
37    5    185    6845
42    2     84    3528
47    9    423   19881
   Σf = 20   Σxf = 820   Σx²f = 34350

$$s = \sqrt{\frac{SS_x}{n - 1}}, \quad \text{where } SS_x = \sum x^2 f - \frac{\left(\sum x f\right)^2}{n}$$

$$SS_x = 34350 - \frac{820^2}{20} = 730$$

$$s = \sqrt{\frac{SS_x}{n - 1}} = \sqrt{\frac{730}{20 - 1}} = 6.20$$
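
A short sketch can confirm that the computation formula and the definitional formula give the same sum of squares. It assumes the same midpoints and frequencies as the ages example; the variable names are illustrative.

```python
import math

midpoints = [32, 37, 42, 47]
freqs = [4, 5, 2, 9]

n = sum(freqs)
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))

# Computation formula: SS_x = sum(x^2 f) - (sum(x f))^2 / n
ss_x = sum_x2f - sum_xf ** 2 / n
s = math.sqrt(ss_x / (n - 1))

# Definitional form, for comparison: sum of (x - mean)^2 * f.
mean = sum_xf / n
ss_def = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs))

print(f"SS_x (computation formula) = {ss_x}")
print(f"SS_x (definition)          = {ss_def}")
print(f"s = {s:.2f}")
```

The computation formula needs only running totals of xf and x²f, which is why it suits hand calculation.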

Calculation of Sample Variance

Observation   Score X   X − X̄          (X − X̄)²
1              3        3 − 7 = −4      16
2              5        5 − 7 = −2       4
3              7        7 − 7 = 0        0
4             10        10 − 7 = 3       9
5             10        10 − 7 = 3       9
Totals        35                        38

The mean is 35/5 = 7.

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{38}{4} = 9.5$$
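
The column-by-column build-up of the table can be sketched as a loop. This is an illustrative example only; the row layout is an assumption chosen to mirror the table's columns.

```python
scores = [3, 5, 7, 10, 10]
n = len(scores)
mean = sum(scores) / n

# One row per observation: (index, score, deviation, squared deviation).
rows = []
for i, x in enumerate(scores, start=1):
    dev = x - mean
    rows.append((i, x, dev, dev ** 2))

sum_sq = sum(row[3] for row in rows)
sample_variance = sum_sq / (n - 1)

print(f"{'Obs':>3} {'X':>4} {'X - mean':>9} {'squared':>8}")
for i, x, dev, sq in rows:
    print(f"{i:>3} {x:>4} {dev:>9.1f} {sq:>8.1f}")
print(f"Sum of squares = {sum_sq}, s^2 = {sample_variance}")
```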

Calculation of Sample Variance
Example 2

Dive    Moon   Soon
1        28     27
2        22     27
3        21     28
4        26      6
5        18     27

Find the mean, median, mode, and range.

          Moon   Soon
Mean       23     23
Median     22     27
Mode      none    27
Range      10     22

The mode is the value that occurs with the greatest frequency. The range is the difference between the highest and lowest values.

What can be said about this data?
Because of the outlier in Soon's scores (the 6 on dive 4), the median is more typical of overall performance than the mean.
Which diver was more consistent?

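Calculation of Sample Variance

Dive     Moon's Score X   X − X̄   (X − X̄)²
1             28             5        25
2             22            −1         1
3             21            −2         4
4             26             3         9
5             18            −5        25
Totals       115             0        64

Moon's variance = 64 / 5 = 12.8
Soon's variance = 362 / 5 = 72.4
These figures divide by N = 5, which is the population formula. Dividing by n − 1 = 4 gives sample variances of 16.0 and 90.5; the comparison is the same either way.
Conclusion: Moon has the lower variance and is therefore the more consistent diver.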
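
The comparison between the two divers can be sketched with the standard `statistics` module, which offers both divisors: `pvariance` divides by N and `variance` divides by n − 1. The helper `summarize` is a hypothetical name introduced for this illustration.

```python
import statistics

moon = [28, 22, 21, 26, 18]
soon = [27, 27, 28, 6, 27]

def summarize(scores):
    """Return the mean and both variance estimates for a list of scores."""
    return {
        "mean": statistics.mean(scores),
        "population_variance": statistics.pvariance(scores),  # divides by N
        "sample_variance": statistics.variance(scores),       # divides by n - 1
    }

moon_stats = summarize(moon)
soon_stats = summarize(soon)

print("Moon:", moon_stats)
print("Soon:", soon_stats)
```

Whichever divisor is used, Moon's scores vary less than Soon's, so the conclusion about consistency does not depend on the choice.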

Calculation of Sample & Population Variance
Problem 1A: Calculate the terms for the sample variance and sample standard deviation of the pull-off force (lb) data given below.

i       x_i     x_i − x̄    (x_i − x̄)²
1       12.6
2       12.9
3       13.4
4       12.3
5       13.6
6       13.5
7       12.6
8       13.1
Total

Calculation of Sample & Population Variance
Solution:

i       x_i     x_i − x̄    (x_i − x̄)²
1       12.6     −0.4        0.16
2       12.9     −0.1        0.01
3       13.4      0.4        0.16
4       12.3     −0.7        0.49
5       13.6      0.6        0.36
6       13.5      0.5        0.25
7       12.6     −0.4        0.16
8       13.1      0.1        0.01
Total  104.0      0.0        1.60

$$\sum_{i=1}^{8} (x_i - \bar{x})^2 = 1.60$$

Calculation of Sample & Population Variance
Solution:

(Figure: how the sample variance measures variability)

Sample variance: $s^2 = \frac{1.6}{8 - 1} = 0.2286 \text{ (lb)}^2$
Sample standard deviation: $s = \sqrt{0.2286} = 0.48 \text{ lb}$
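
The pull-off force solution can be verified with a short script. This is a minimal sketch of the definitional formula applied to the eight measurements; the variable names are illustrative.

```python
import math

# Pull-off force measurements (lb) from Problem 1A.
force_lb = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(force_lb)
mean = sum(force_lb) / n

# Sum of squared deviations about the sample mean.
sum_sq_dev = sum((x - mean) ** 2 for x in force_lb)

s2 = sum_sq_dev / (n - 1)   # sample variance, in lb^2
s = math.sqrt(s2)           # sample standard deviation, in lb

print(f"mean = {mean:.1f} lb")
print(f"sum of squared deviations = {sum_sq_dev:.2f}")
print(f"s^2 = {s2:.4f} lb^2, s = {s:.2f} lb")
```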

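Calculation of Sample & Population Variance
Solution:
Shortcut method to calculate the sample variance:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{1353.60 - \frac{(104.0)^2}{8}}{8 - 1} = \frac{1.60}{7} = 0.2286 \text{ (lb)}^2$$

Sample standard deviation:

$$s = \sqrt{0.2286} = 0.48 \text{ lb}$$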
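
A sketch of the shortcut calculation follows. It assumes the shortcut refers to the standard computational form, s² = (Σx² − (Σx)²/n) / (n − 1), which is the same identity used for SS_x in the grouped-data slide, and it compares the result with the definitional formula.

```python
import math

# Pull-off force measurements (lb) from Problem 1A.
force_lb = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(force_lb)
sum_x = sum(force_lb)                    # sum of x_i
sum_x2 = sum(x * x for x in force_lb)    # sum of x_i^2

# Assumed shortcut form: s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s_shortcut = math.sqrt(s2_shortcut)

# Definitional form, for comparison.
mean = sum_x / n
s2_definition = sum((x - mean) ** 2 for x in force_lb) / (n - 1)

print(f"sum(x) = {sum_x:.1f}, sum(x^2) = {sum_x2:.2f}")
print(f"s^2 (shortcut)   = {s2_shortcut:.4f}")
print(f"s^2 (definition) = {s2_definition:.4f}")
print(f"s = {s_shortcut:.2f} lb")
```

The shortcut subtracts two large, nearly equal numbers, so in floating-point arithmetic it can lose precision when the mean is large relative to the spread; the definitional form is usually the safer choice in code.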

Calculation of Sample & Population Variance
(Figure: relationship between a population and a sample)

Sample & Population Variance
1. Calculate the sample mean and sample standard deviation.
2. Construct a stem-and-leaf diagram for the data given in each problem and comment on any important features that you notice.
3. What are the median and the first and third quartiles? What is the IQR?
4. Draw a box plot, indicating any outliers in the diagram.
5. Construct a dot diagram.
Problem 2A: Eight measurements were made on the inside diameter of forged piston rings used in an automobile engine. The data (in millimeters) are 74.001, 74.003, 74.015, 74.000, 74.005, 74.002, 74.005, and 74.004.
Problem 3A: An article in the Journal of Structural Engineering describes an experiment to test the yield strength of circular tubes with caps welded to the ends. The first yields (in kN) are 96, 96, 102, 102, 102, 104, 104, 108, 126, 126, 128, 128, 140, 156, 160, 160, 164, and 170.

Sample & Population Variance
Complete the same five tasks (sample mean and standard deviation, stem-and-leaf diagram, median and quartiles with IQR, box plot with outliers, dot diagram) for the following problem.

Problem 4A: The following data are direct solar intensity measurements (watts/m²) on different days at a location in Pakistan: 562, 869, 708, 775, 775, 704, 809, 856, 655, 806, 878, 909, 918, 558, 768, 870, 918, 940, 946, 661, 820, 898, 935, 952, 957, 693, 835, 905, 939, 955, 960, 498, 653, 730, and 753.

Sample & Population Variance
Problem 5A: The following data are the numbers of
cycles to failure of aluminum test coupons subjected to
repeated alternating stress at 21,000 psi, 18 cycles per
second.
1115 865 1015 885 1594 1000 1416 1501 1310 2130
845 1223 2023 1820 1560 1238 1540 1421 1674 375
1315 1940 1055 990 1502 1109 1016 2265 1269 1120
1764 1468 1258 1481 1102 1910 1260 910 1330 1512
1315 1567 1605 1018 1888 1730 1608 1750 1085 1883
706 1452 1782 1102 1535 1642 798 1203 2215 1890
1522 1578 1781 1020 1270 785 2100 1792 758 1750
Does it appear likely that a coupon will “survive” beyond
2000 cycles? Justify your answer.


Problem 6A
The diameter of manufactured shafts is normally distributed with a mean of 3.0 cm and a standard deviation of 0.009 cm. Shafts with a diameter of 2.98 cm or less are scrapped, and shafts with a diameter of more than 3.02 cm are reworked. Determine the percentage of shafts scrapped and the percentage reworked.

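Problem 6A
Solution: Mean (µ) = 3.0 cm, standard deviation (σ) = 0.009 cm. Let the upper limit for rework be U = 3.02 cm and the lower limit at which shafts are scrapped be L = 2.98 cm.
Now let us determine the Z values corresponding to U and L.

$$Z_U = \frac{U - \mu}{\sigma} = \frac{3.02 - 3.0}{0.009} = 2.22$$

From standard normal tables, P(Z > 2.22) = 1 − 0.9868 = 0.0132.

That is, the percentage of rework = 1.32%.

Similarly,

$$Z_L = \frac{L - \mu}{\sigma} = \frac{2.98 - 3.0}{0.009} = -2.22$$

so P(Z ≤ −2.22) = 0.0132, and the percentage of shafts scrapped is also 1.32%.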
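
The table lookup can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch under the problem's stated assumptions of a normal distribution with µ = 3.0 cm and σ = 0.009 cm; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 3.0, 0.009        # mean and standard deviation in cm
lower, upper = 2.98, 3.02     # scrap limit and rework limit in cm

shaft = NormalDist(mu=mu, sigma=sigma)

# Standard scores of the two limits.
z_lower = (lower - mu) / sigma
z_upper = (upper - mu) / sigma

scrapped = shaft.cdf(lower)       # P(X <= 2.98)
reworked = 1 - shaft.cdf(upper)   # P(X > 3.02)

print(f"z_lower = {z_lower:.2f}, z_upper = {z_upper:.2f}")
print(f"scrapped = {scrapped:.2%}, reworked = {reworked:.2%}")
```

Because the code uses the unrounded Z value, the printed percentages may differ slightly in the last digit from a table lookup at Z = 2.22.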

Percentiles
A percentile provides information about how the data are spread over the interval from the smallest value to the largest value.
The pth percentile is a value such that at least p percent of the observations are less than or equal to this value and at least (100 − p) percent of the observations are greater than or equal to this value.
The interquartile range (IQR = Q3 − Q1) may also be used as a measure of variability. It is less sensitive to extreme values in the sample than the ordinary sample range.

Calculation of Percentiles
Step 1. Arrange the data in ascending order (smallest value to largest value).
Step 2. Compute an index i,

$$i = \left(\frac{p}{100}\right) n$$

where p is the percentile of interest and n is the number of observations.
Step 3.
(a) If i is not an integer, round up. The next integer greater than i denotes the position of the pth percentile.
(b) If i is an integer, the pth percentile is the average of the values in positions i and i + 1.
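
The three steps above can be sketched as a small function. This is an illustrative implementation, not a library routine: the name `percentile` is hypothetical, p is assumed to be a whole number strictly between 0 and 100, and the sample values are the twelve starting-pay figures from Self Assessment-1 in this lecture.

```python
def percentile(data, p):
    """Return the p-th percentile (integer p, 0 < p < 100) using the
    three-step index rule described above."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")

    ordered = sorted(data)   # Step 1: ascending order
    n = len(ordered)

    # Step 2: i = (p / 100) * n, kept as quotient and remainder so that
    # the integer test in Step 3 is exact for whole-number p.
    whole, remainder = divmod(p * n, 100)

    if remainder:
        # Step 3(a): i is not an integer, so round up to the next position.
        position = whole + 1            # 1-based position
        return ordered[position - 1]

    # Step 3(b): i is an integer, so average positions i and i + 1 (1-based).
    return (ordered[whole - 1] + ordered[whole]) / 2


starting_pay = [3450, 3550, 3650, 3480, 3355, 3310,
                3490, 3730, 3540, 3925, 3520, 3480]

q1 = percentile(starting_pay, 25)
median = percentile(starting_pay, 50)
q3 = percentile(starting_pay, 75)
p85 = percentile(starting_pay, 85)
print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, 85th percentile = {p85}")
```

Other percentile conventions exist (many libraries interpolate between positions), so results from this rule can differ from those of a statistics package on the same data.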

Quartiles
It is often desirable to divide data into four parts, with each part containing approximately one-fourth, or 25%, of the observations.

Q1 First quartile, or 25th percentile
Q2 Second quartile, or 50th percentile (also the median)
Q3 Third quartile, or 75th percentile

Self Assessment-1
Calculation of Percentiles

Graduate   Monthly Starting Pay   Graduate   Monthly Starting Pay
1          3450                   7          3490
2          3550                   8          3730
3          3650                   9          3540
4          3480                   10         3925
5          3355                   11         3520
6          3310                   12         3480

Calculate the mean, median, mode, range, and quartiles, and determine the 85th percentile.
Mean = 3540, Median = 3505, Mode = 3480, Range = 615; the 50th percentile is (3490 + 3520)/2 = 3505; Q1 = 3465, Q3 = 3600.
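
The mean, median, mode, and range quoted above can be checked with the standard `statistics` module. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

# Monthly starting pay for graduates 1 to 12.
starting_pay = [3450, 3550, 3650, 3480, 3355, 3310,
                3490, 3730, 3540, 3925, 3520, 3480]

mean_pay = statistics.mean(starting_pay)
median_pay = statistics.median(starting_pay)
mode_pay = statistics.mode(starting_pay)
pay_range = max(starting_pay) - min(starting_pay)

print(f"mean = {mean_pay}, median = {median_pay}, "
      f"mode = {mode_pay}, range = {pay_range}")
```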

Self Assessment-2
The Dow Jones Travel Index reported what business travelers pay for hotel
rooms per night in major U.S. cities (The Wall Street Journal, January 16,
2004). The average hotel room rates for 20 cities are as follows:

a) What is the mean hotel room rate?


b) What is the median hotel room rate?
c) What is the mode?
d) What is the first quartile?
e) What is the third quartile?

Standard Deviation
Two classes took a quiz. There were 10 students in each class, and each class had an average score of 81.5.
Since the averages are the same, can we assume that the students in both classes all did pretty much the same on the quiz?

The answer is no.

The average (mean) does not tell us anything about the distribution or variation in the grades.