6th Presentation
Copyright 2009
Recap
Mean
The average of a set of numerical values.
Standard deviation
A measure of the dispersion of data (how spread out the data are) from the mean. A smaller standard deviation indicates that most of the data points lie nearer to the mean.
Frequency table
The frequency table shows all possible outcomes of an event and the number of times each outcome occurs.
Table columns: Salary Range ($), Frequency, Relative Frequency
Relative Frequency = Frequency / Total Frequency
[Table rows not recoverable from the source; surviving values: 24 22 19 29 32 11 ..... 31 22 15 27 25 1000]
Note: Relative frequency can also be interpreted as the probability of the occurrence of a particular event. For example, the probability that a randomly selected executive earns a salary in the range of $1950 to $1975 is 0.027.
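As a small illustration of the note above, the sketch below divides each frequency by the total frequency. The function name `relative_frequencies` and the three counts are hypothetical and chosen only so that the first count (27 out of 1000) reproduces the 0.027 quoted in the note; they are not the rows of the original table.

```python
def relative_frequencies(frequencies):
    """Divide each frequency by the total frequency."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

# Hypothetical counts for illustration only (not the table from the slides).
counts = [27, 473, 500]
rel = relative_frequencies(counts)

# The first relative frequency can be read as a probability: 27 / 1000.
print(rel[0])
# Relative frequencies over all outcomes add up to 1.
print(sum(rel))
```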
[Figure: distribution of the individual salary entries; vertical axis: Relative Frequency]
The distribution shows that all outcomes have about the same chance of occurrence for our data.
The variance and standard deviation are 83849 and 289.6 respectively.
Sample size
Sample size is the number of observations that constitute the sample. For example, when we are calculating the average of 2 entries, the number of observations is 2. Hence, the sample size will be 2. Likewise, if we are calculating the average of 30 entries, then the sample size will be 30.
Average of 2 entries
The distribution of the average of 2 entries is as shown below:
[Figure: Distribution of the averages of two entries of salary; vertical axis: Relative Frequency; Mean = 1501]
The relative frequency is the lowest at the two tails of the distribution, and it increases as we move towards the mean.
The mean of the distribution is approximately $1494, the same as the mean of the distribution of the individual entries.
Compared with the individual entries, the variance is reduced by a factor of 2, which is the number of entries in each sample.
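The factor-of-2 reduction can be checked with a quick simulation. This is only a sketch: it assumes salaries are spread evenly between $1000 and $2000 (a stand-in for the roughly flat distribution described earlier, not the actual data set), and all variable names are illustrative.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

LOW, HIGH = 1000, 2000  # assumed salary range, not taken from the slides
TRIALS = 20000

# Individual entries.
entries = [rng.uniform(LOW, HIGH) for _ in range(TRIALS)]

# Averages of two independently drawn entries.
pair_means = [
    (rng.uniform(LOW, HIGH) + rng.uniform(LOW, HIGH)) / 2
    for _ in range(TRIALS)
]

var_single = statistics.pvariance(entries)
var_pairs = statistics.pvariance(pair_means)
ratio = var_single / var_pairs

# The ratio should come out close to 2, the number of entries per sample.
print(round(var_single), round(var_pairs), round(ratio, 2))
```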
Average of 10 entries
The distribution of the average of 10 entries is as shown below:
[Figure: Distribution of the averages of ten entries of salary; vertical axis: Relative Frequency; Mean = 1499]
The spread of the distribution is further reduced with the highest frequency occurring at the mean.
Average of 30 entries
The distribution of the average of 30 entries is as shown below:
[Figure: Distribution of the averages of thirty entries of salary; vertical axis: Relative Frequency; annotation: (83849 / 30)]
The mean is approximately $1494 and the spread of the distribution becomes even smaller.
The probability of obtaining an average that is approximately $1494 increases with the number of entries in each sample.
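To put numbers on the shrinking spread, the sketch below divides the variance of the individual entries (83849, from the recap) by the number of entries in each sample. It assumes the entries in a sample are drawn independently; the helper name `variance_of_mean` is illustrative.

```python
import math

POP_VARIANCE = 83849  # variance of the individual entries, as stated in the slides

def variance_of_mean(pop_variance, n):
    """Variance of the average of n independent entries."""
    return pop_variance / n

# n = 1 corresponds to the individual entries themselves.
results = {}
for n in (1, 2, 10, 30):
    var = variance_of_mean(POP_VARIANCE, n)
    results[n] = (var, math.sqrt(var))
    print(f"n={n:>2}  variance={var:10.1f}  standard deviation={math.sqrt(var):6.1f}")
```

The standard deviation falls with the square root of the sample size, so going from 1 entry to 30 entries narrows the spread by a factor of about 5.5.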
[Figure: mean labels $1501, $1499 and $1500]
Normal distribution
[Figure: Normal distribution curve with the mean marked]
The Normal curve is symmetrical about its mean. It is described by its mean and its standard deviation (or variance). The area under the Normal distribution curve represents the probability of an event occurring where the total area is 1.
Normal distribution
[Figure: Normal distribution curve of height; 90% of the area lies below 172 cm and 10% above; labels: Mean, 163 cm, 172 cm]
The height of a population is normally distributed. It can be modeled by a Normal distribution curve. For the example above, there is a 10% chance for a randomly chosen individual to have a height exceeding 172 cm.
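The height example can be reproduced with Python's `statistics.NormalDist`. This sketch assumes the mean marked on the figure is 163 cm; the slides do not state the standard deviation, so it is derived here from the 10% tail and should be read as an implied value, not a figure from the source.

```python
from statistics import NormalDist

MEAN_CM = 163    # assumed to be the mean marked on the figure
CUTOFF_CM = 172  # height exceeded by 10% of the population

# z-value below which 90% of a standard Normal distribution lies.
z90 = NormalDist().inv_cdf(0.90)

# Standard deviation implied by "10% are taller than 172 cm".
sigma = (CUTOFF_CM - MEAN_CM) / z90

heights = NormalDist(mu=MEAN_CM, sigma=sigma)
p_above = 1 - heights.cdf(CUTOFF_CM)  # area under the curve to the right of 172 cm
p_below = heights.cdf(CUTOFF_CM)      # area to the left; the two add up to 1

print(round(sigma, 2), round(p_above, 3), round(p_below, 3))
```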
Estimating average
From the discussion so far, the distribution of the average of a reasonably large sample approaches a Normal distribution, and its spread is very narrow. As the sample size gets larger, the average obtained from the sample is less likely to deviate far from the actual average of the population.
Thus, to estimate the average of a population, we can use the average computed from a randomly selected large sample (≥ 30) of the population.
In conclusion, adopting Peter's suggestion is an efficient and effective way of solving Serene's problem.
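The estimation idea can be tried out on made-up data. The sketch below builds a hypothetical population of 1000 salaries (evenly spread between $1000 and $2000, an assumption for illustration only), estimates its average from one random sample of 30, and then repeats the sampling to see how far such estimates typically stray.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population; the real salary data is not available here.
population = [rng.uniform(1000, 2000) for _ in range(1000)]
true_mean = statistics.fmean(population)

# One estimate from a single random sample of 30 entries.
sample = rng.sample(population, 30)
estimate = statistics.fmean(sample)

# Repeat the sampling to see the typical spread of such estimates.
estimates = [statistics.fmean(rng.sample(population, 30)) for _ in range(2000)]
spread = statistics.pstdev(estimates)

print(round(true_mean), round(estimate), round(spread, 1))
```

Under these assumptions the estimates cluster around the population average with a standard deviation of roughly $50, far narrower than the spread of the individual salaries.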
Learning points
Understand the relationship between frequency and relative frequency
Understand the relationship between relative frequency and probability
Understand that for large sample size (≥ 30), the Normal distribution curve is a fairly accurate representation of the distribution of the sample means
Understand that the variance of the distribution of sample means is inversely proportional to the sample size
Discussion
Suppose the distribution of the individual entries is as follows:
[Figure: Distribution of the individual entries of salary; vertical axis: Relative Frequency]
Suggest what the distribution of the average might look like when the sample size is large (≥ 30).