
STATISTICS: CONTINUOUS DATA

Dr Hjh Madihah Khalid

Discrete and Continuous Data


A set of data is said to be continuous if the values / observations belonging to it may take on any value within a finite or infinite interval. You can order and measure continuous data. Examples include height, weight, temperature, the amount of sugar in an orange, and the time required to run a mile.

A set of data is said to be discrete if the values / observations belonging to it are distinct and separate, i.e. they can be counted (1, 2, 3, ...). Examples might include the number of kittens in a litter; the number of patients in a doctor's surgery; the number of flaws in one metre of cloth; gender (male, female); blood group (O, A, B, AB).

Histogram
A histogram is a special type of graph used to display continuous data. It is similar to a bar graph but, because of the continuous nature of the variable, a number line is used for the horizontal axis and the columns are joined together. The rectangles may have different widths, but the area of each rectangle is proportional to the frequency of its class. The histogram is only appropriate for variables whose values are numerical and measured on an interval scale. It is generally used when dealing with large data sets (>100 observations), when stem and leaf plots become tedious to construct. A histogram can also help detect any unusual observations (outliers), or any gaps in the data set.

Box and Whisker Plot (or Boxplot)
A box and whisker plot is a way of summarising a set of data measured on an interval scale. It is often used in exploratory data analysis. It is a type of graph which is used to show the shape of the distribution, its central value, and variability. The picture produced consists of the most extreme values in the data set (maximum and minimum values), the lower and upper quartiles, and the median. A box plot (as it is often called) is especially helpful for indicating whether a distribution is skewed and whether there are any unusual observations (outliers) in the data set. Box and whisker plots are also very useful when large numbers of observations are involved and when two or more data sets are being compared.
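The five values that a box plot displays can be computed directly. The sketch below is only an illustration: the heights are hypothetical (invented here, not taken from the slides), and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one of several conventions in use.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    half = len(data) // 2
    lower_half = data[:half]   # observations below the middle position
    upper_half = data[-half:]  # observations above the middle position
    return (data[0], median(lower_half), median(data),
            median(upper_half), data[-1])

# Hypothetical heights in cm, invented for illustration only.
heights = [155, 158, 160, 163, 165, 168, 171, 174, 178, 183, 191]
print(five_number_summary(heights))
```

The box is drawn from the lower quartile to the upper quartile with a line at the median, and the whiskers reach out to the minimum and maximum.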

Example
Make a frequency distribution and graph it.

          Height (cm)                            Arm Span (cm)
          155-164  165-174  175-184  185-194     155-164  165-174  175-184  185-194  195-204
Female       6        4        1        1           7        4        0        1        0
Male         0        1        6        5           0        1        4        5        2
Total        6        5        7        6           7        5        4        6        2
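Making a frequency distribution means tallying each raw measurement into its class. The sketch below shows one way to do that, assuming heights recorded in whole centimetres. The raw values are hypothetical: the slides give only the counts, so these twelve heights were invented so that the tally reproduces the Female height row.

```python
def frequency_distribution(values, first_lower, width, n_classes):
    """Tally whole-number values into classes such as 155-164, 165-174, ..."""
    counts = [0] * n_classes
    for value in values:
        index = (value - first_lower) // width
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} falls outside the classes")
        counts[index] += 1
    labels = [f"{first_lower + i * width}-{first_lower + (i + 1) * width - 1}"
              for i in range(n_classes)]
    return dict(zip(labels, counts))

# Hypothetical raw heights (cm), invented to match the Female row above.
female_heights = [156, 158, 160, 161, 163, 164, 165, 168, 170, 174, 179, 186]
table = frequency_distribution(female_heights, first_lower=155, width=10, n_classes=4)
for label, count in table.items():
    print(label, count)
```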

Histogram
[Figure: histogram of frequency (vertical axis, 0 to 8) against height bin (155-164, 165-174, 175-184, 185-194)]

Steps for constructing a histogram:


1. Draw and label the x (horizontal) and the y (vertical) axes.
2. Represent the frequencies on the y axis and the class boundaries on the x axis.
3. Using the frequencies as the heights, draw vertical bars for each class.

Note: For the histogram we need the frequencies and the class boundaries.
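The class boundaries needed for the x axis can be derived from the class limits by moving each limit half a unit outward, so that neighbouring bars touch. The sketch below applies this to the height classes and the Total row of the first example and prints a rough text histogram. The function name and the `gap` parameter are illustrative choices, and the half-unit rule assumes measurements recorded to the nearest whole centimetre.

```python
def class_boundaries(limits, gap=1):
    """Turn class limits such as (155, 164) into boundaries (154.5, 164.5)."""
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]

height_limits = [(155, 164), (165, 174), (175, 184), (185, 194)]
total_freqs = [6, 5, 7, 6]  # Total row of the height table

bounds = class_boundaries(height_limits)
for (lower, upper), freq in zip(bounds, total_freqs):
    # One '#' per observation stands in for the height of the bar.
    print(f"{lower}-{upper} | {'#' * freq}")
```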

Frequency polygons/Ogives
A frequency polygon is a graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. In Cartesian coordinates, the midpoints are the x-coordinates of the vertices of the polygon and the frequencies are the y-coordinates.

An ogive is a graph that represents the cumulative frequencies for the classes in a frequency distribution. It shows how many of the data values lie below a given class boundary.

Steps for constructing a frequency polygon


1. Draw and label the x (horizontal) and the y (vertical) axes.
2. Represent the frequencies on the y axis and the midpoints on the x axis.
3. Plot the vertices of the polygon.
4. Connect adjacent points with line segments.
5. Draw a line back to the x axis at the beginning and the end of the graph, at the same distance that the previous and the next midpoints would be located.

Note: For the frequency polygon we need the frequencies and the midpoints.
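As a sketch of these steps, the vertices of the polygon can be listed before plotting. The code uses the class limits and frequencies from the 100-134 example in these slides, and assumes all classes have the same width so that the two closing points sit one class width beyond the first and last midpoints. The function name is illustrative.

```python
def polygon_vertices(limits, freqs):
    """Return the (midpoint, frequency) vertices, closed on the x axis."""
    midpoints = [(lower + upper) / 2 for lower, upper in limits]
    step = midpoints[1] - midpoints[0]  # assumes equal class widths
    start = (midpoints[0] - step, 0)    # where the previous midpoint would be
    end = (midpoints[-1] + step, 0)     # where the next midpoint would be
    return [start] + list(zip(midpoints, freqs)) + [end]

limits = [(100, 104), (105, 109), (110, 114), (115, 119),
          (120, 124), (125, 129), (130, 134)]
freqs = [2, 8, 18, 13, 7, 1, 1]

vertices = polygon_vertices(limits, freqs)
for x, y in vertices:
    print(x, y)
```

Joining consecutive vertices with straight segments gives the frequency polygon.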

Steps for constructing an ogive


1. Draw and label the x (horizontal) and the y (vertical) axes.
2. Represent the cumulative frequencies on the y axis and the class boundaries on the x axis.
3. Plot a point at each upper class boundary, at a height equal to the corresponding cumulative frequency.
4. Connect the points with line segments.
5. Connect the first point on the left with the x axis at the lowest lower class boundary.

Note: For the ogive we need the class boundaries and the cumulative frequencies.
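The points of an ogive follow from a running total of the frequencies. The sketch below uses the class boundaries and frequencies from the 100-134 example in these slides; the function name is illustrative.

```python
from itertools import accumulate

def ogive_points(boundaries, freqs):
    """Return (boundary, cumulative frequency) pairs, starting from zero."""
    cumulative = list(accumulate(freqs))
    # The first point sits on the x axis at the lowest lower class boundary.
    return [(boundaries[0], 0)] + list(zip(boundaries[1:], cumulative))

boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
freqs = [2, 8, 18, 13, 7, 1, 1]

points = ogive_points(boundaries, freqs)
for boundary, cum_freq in points:
    print(boundary, cum_freq)
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the arithmetic.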

Example
Construct a histogram, frequency polygon and an ogive for the data below:
Class Limits   Class Boundaries   Frequency (f)   Cumulative Frequency   Midpoints (Xm)
100-104        99.5-104.5               2                  2                  102
105-109        104.5-109.5              8                 10                  107
110-114        109.5-114.5             18                 28                  112
115-119        114.5-119.5             13                 41                  117
120-124        119.5-124.5              7                 48                  122
125-129        124.5-129.5              1                 49                  127
130-134        129.5-134.5              1                 50                  132

From the data, draw a histogram

Line graphs
A line graph is a graph that uses points connected by lines to show how something changes in value (as time goes by, or as something else happens).

Let's define the various parts of a line graph.

Title: the title of the line graph tells us what the graph is about.
Labels: the horizontal label across the bottom and the vertical label along the side tell us what kinds of facts are listed.
Scales: the horizontal scale across the bottom and the vertical scale along the side tell us how much or how many.
Points: the points or dots on the graph show us the facts.
Lines: the lines connecting the points give estimates of the values between the points.

The line graph shows people in a store at various times of the day.

QUESTION
1. What is the line graph about?
2. What is the busiest time of day at the store?
3. At what time does business start to slow down?
4. How many people are in the store when it opens?
5. About how many people are in the store at 2:30 pm?
6. What was the greatest number of people in the store?
7. What was the least number of people in the store?

Exercise
Age in years       Frequency
0 ≤ age < 20          56
20 ≤ age < 30         72
30 ≤ age < 40         96
40 ≤ age < 50         45
50 ≤ age < 70        135
70 ≤ age < 100        36

Draw the histogram of the data
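The age classes in this exercise have different widths, so plotting the raw frequencies as bar heights would break the rule that area is proportional to frequency. One common way to respect that rule is to plot frequency density (frequency divided by class width) as the height of each bar. The sketch below assumes each class runs from its lower value up to, but not including, its upper value; the function name is illustrative.

```python
def frequency_densities(edges, freqs):
    """Return frequency / class width for each class."""
    return [freq / (upper - lower)
            for lower, upper, freq in zip(edges, edges[1:], freqs)]

age_edges = [0, 20, 30, 40, 50, 70, 100]
age_freqs = [56, 72, 96, 45, 135, 36]

densities = frequency_densities(age_edges, age_freqs)
for lower, upper, density in zip(age_edges, age_edges[1:], densities):
    print(f"{lower} to {upper}: {density:.2f} people per year of age")
```

With these heights, the 50 to 70 class no longer towers over the others just because it is twice as wide.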

City         2000   2005   2010   2015
Cape Town    2715   3063   3316   3401
Durban       2370   2631   2804   2876
Gauteng      2732   3254   3574   3674

a. Use the information above and draw a line graph for population in thousands.
b. What is the trend of population?
c. Draw a line graph for the table given below and comment on the trends.

South Africa Urban Population (thousands), 2000-2030
Year   Population
2000   25948
2005   28119
2010   29505
2015   30722
2020   32017
2025   33312
2030   34523

South Africa Rural Population (thousands), 2000-2030
Year   Population
2000   19662
2005   19313
2010   18314
2015   17181
2020   16083
2025   14985
2030   13882
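Before drawing the line graphs, it can help to look at the change between successive years, since the sign of each change is the direction in which the line moves. The sketch below does this for the urban and rural figures; the function names and the three trend labels are illustrative choices, not terminology from the slides.

```python
def changes(series):
    """Differences between each value and the one before it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def trend(series):
    """Describe the direction of a series from the signs of its changes."""
    steps = changes(series)
    if all(step > 0 for step in steps):
        return "increasing"
    if all(step < 0 for step in steps):
        return "decreasing"
    return "mixed"

years = [2000, 2005, 2010, 2015, 2020, 2025, 2030]
urban = [25948, 28119, 29505, 30722, 32017, 33312, 34523]  # thousands
rural = [19662, 19313, 18314, 17181, 16083, 14985, 13882]  # thousands

for name, series in (("urban", urban), ("rural", rural)):
    print(name, trend(series), changes(series))
```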

From the frequency distribution above, answer the following questions:
a. Approximately what percentage of students is above 179 cm tall?
b. If you are 170 cm tall, approximately what percentage of students is taller than you?
c. To be in the top 20% of the class in terms of height, approximately how tall would you have to be?
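Questions of this kind can be approached by reading between the plotted points of an ogive, which amounts to linear interpolation inside the class that contains the height of interest. The sketch below is one way to do that under two loudly stated assumptions: that the distribution meant is the Total height row of the first example (6, 5, 7, 6 over the classes 155-164 to 185-194), and that heights are spread evenly within each class. The function name is illustrative.

```python
def percent_above(x, boundaries, freqs):
    """Estimate the percentage of observations above x by interpolation."""
    total = sum(freqs)
    below = 0.0
    for lower, upper, freq in zip(boundaries, boundaries[1:], freqs):
        if x >= upper:
            below += freq  # the whole class lies below x
        elif x > lower:
            # Assumption: values are spread evenly across the class.
            below += freq * (x - lower) / (upper - lower)
    return 100 * (total - below) / total

# Assumed data: class boundaries and Total frequencies for height (cm).
height_bounds = [154.5, 164.5, 174.5, 184.5, 194.5]
height_freqs = [6, 5, 7, 6]

for height in (170, 179):
    share = percent_above(height, height_bounds, height_freqs)
    print(f"about {share:.0f}% are taller than {height} cm")
```

Question c runs the other way: find the height at which the cumulative percentage reaches 80% and read it off the x axis.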
