
Statistics 312 Uebersax

http://www.john-uebersax.com/stat312/
06 Measures of Central Tendency & Dispersion

Old Business
Picking up pace
Mac issues
Topics to Cover
Frequency distributions/histograms in Mac Excel
Review last homework
Central tendency: mean, median, mode
Percentiles, quartiles
Descriptive statistics in JMP
Measures of Dispersion: the Variance
Homework assignment

1. Frequency Distributions/Histograms with Mac


Mac only: Status of Excel Data Analysis ToolPak

Not available for Mac as of 2008


Alternative: StatPlus:MacLE

Download and install: http://www.analystsoft.com/en/products/statplusmacle/


Might not make frequency distributions (so see below)

Frequency Distributions/Histograms via Excel FREQUENCY() function

Place data in one column (e.g., a1:a10)


Place bins in another column (e.g., b1:b4)

In another column, select a vertical range of blank cells containing one more cell
than the bin array (e.g., c1:c5); the extra cell counts values above the last bin

Type formula: =frequency(a1:a10, b1:b4), then press COMMAND+ENTER (Mac)
or CONTROL+SHIFT+ENTER (PC)

Note: double-check bin range address; formula editor may obscure first cell.
Explained here: http://support.microsoft.com/kb/100122
Class demonstration: Female Dover sole lengths
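
The counting rule behind FREQUENCY() can be sketched in Python. This is only an illustration, assuming the bins are in ascending order and each bin holds the upper limit of its range. The function name `frequency` and the sample lengths are hypothetical, not the Dover sole data used in class.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Count values per bin, mimicking Excel's FREQUENCY() for ascending bins."""
    # One count per bin, plus a final cell for values above the last bin.
    counts = [0] * (len(bins) + 1)
    for x in data:
        # bisect_left returns the index of the first bin limit that is >= x,
        # or len(bins) when x is larger than every bin limit.
        counts[bisect_left(bins, x)] += 1
    return counts

# Hypothetical lengths (not the class data) and bin upper limits.
lengths = [31, 35, 36, 38, 40, 40, 42, 45, 47, 52]
bins = [35, 40, 45, 50]

counts = frequency(lengths, bins)
print(counts)  # prints: [2, 4, 2, 1, 1]
```

The result has five entries for four bins, which is why the output range in Excel must be one cell longer than the bin array.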


2. Review Last Homework


1. For males and females separately, make a distribution table:
In separate columns show range, bin, frequency, percentage, cumulative percentage.
Label columns, include units of mg/dL.
2. For females only, make histogram:

Resize histogram to look nice

Place legend on bottom (not side)

Label x-axis: Upper Limit of Range (Cholesterol, mg/dL)

Add chart title: Distribution of Cholesterol for Females (mg/dL)

If necessary fix secondary y-axis to range from 0 to 100%


3. Make final report comparing male and female distributions:
Col 1: Range
Col 2: Male frequency
Col 3: Male percentage
Col 4: Female frequency
Col 5: Female percentage

Remember to save your worksheet.


4. Based on the results, what conclusions can you reach concerning differences between male
and female patients?
Place results of 1. (male and female), histogram (female), final comparison table, and answer
to question into Word document.
Using JMP: Enter data for females into Data Table; produce histogram & basic statistics; cut-and-paste results into the same Word document as above.


3. Measures of Central Tendency: Mean, Median, Mode


Review: Watch Khan Academy video on Average, Median, Mode
http://www.youtube.com/watch?v=uhxtUt_-GyM
The Arithmetic Mean

$\mu = \frac{\sum x}{N}$ (for a population)

$\bar{x} = \frac{\sum x}{n}$ (for a sample)

Ex: The data represent the number of textbooks purchased by a sample of seven students:
10 4 7 5 7 8 9

$\bar{x} = \frac{10 + 4 + 7 + 5 + 7 + 8 + 9}{7} = \frac{50}{7} = 7.14$

Excel AVERAGE() FUNCTION
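
For comparison, the same calculation can be sketched in Python with the standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean

# Textbooks purchased by the sample of seven students.
books = [10, 4, 7, 5, 7, 8, 9]

total = sum(books)          # numerator of the sample mean
sample_mean = mean(books)   # total divided by n = 7

print(total, round(sample_mean, 2))  # prints: 50 7.14
```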

The mean is affected by outliers and skew. Because the mean is nonresistant, alternative
measures that are more resistant to outliers and skew are often used.
The Median
The median is a resistant measure of central tendency that occupies the middle position of
data placed in order of magnitude.
If n is odd, the median is the middle number of the data placed in order of magnitude. It
occupies the $\frac{n+1}{2}$th position.


If n is even, the median is the average of the middle two numbers of the data placed in order
of magnitude. It is the average of the numbers in the $\frac{n}{2}$th and $\frac{n+2}{2}$th
positions.

Ex Reordering the sample of books: 4 5 7 7 8 9 10.


The median is 7. If there were an eighth person who purchased 12 books, the median would
be 7.5.
Excel MEDIAN() FUNCTION
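
The position rules for odd and even n can be sketched in Python as follows. The function name `median_by_position` is hypothetical, and the student who bought 40 books is an invented outlier added only to show how the mean and median react differently.

```python
from statistics import mean

def median_by_position(values):
    """Median found through the 1-based position rules for odd and even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: take the ((n + 1) / 2)-th ordered value.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the (n / 2)-th and ((n + 2) / 2)-th ordered values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

books = [10, 4, 7, 5, 7, 8, 9]
print(median_by_position(books))          # prints: 7
print(median_by_position(books + [12]))   # prints: 7.5

# Hypothetical outlier: an eighth student who bought 40 books.
with_outlier = books + [40]
print(mean(with_outlier), median_by_position(with_outlier))  # prints: 11.25 7.5
```

With the invented outlier the mean moves from 7.14 to 11.25, while the median is the same 7.5 obtained with an eighth value of 12.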

The Mode
The mode, by definition, is the most frequently occurring value in a series.

There can be more than one mode


There can be no mode
Excel MODE() FUNCTION
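
A short Python sketch of the three cases (one mode, several modes, no mode). The function name `modes` is hypothetical, and treating data with no repeated value as having no mode is a convention assumed here; this sketch is not meant to reproduce the behavior of Excel's MODE().

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # Assumed convention: every value occurs once, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([10, 4, 7, 5, 7, 8, 9]))  # prints: [7]
print(modes([1, 1, 2, 2, 3]))         # prints: [1, 2]
print(modes([1, 2, 3]))               # prints: []
```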


4. Percentiles and Quartiles


The kth percentile, Pk, is such that no more than k percent of the data are less than Pk and no
more than (100 - k) percent are greater than Pk. Usually used with large data sets.
The first quartile (Q1) is the point that separates the lower 25 percent of the values from the
upper 75 percent = value corresponding to the $\frac{n+1}{4}$th ordered observation.
The third quartile (Q3) is the point that separates the upper 25 percent of the values from the
lower 75 percent = value corresponding to the $\frac{3(n+1)}{4}$th ordered observation.
Ex Books: 4 5 7 7 8 9 10.
$\frac{n+1}{4} = 2$, so Q1 = 5; $\frac{3(n+1)}{4} = 6$, so Q3 = 9.

(If position = #.5, average two nearest values; else, if not integer, round.)
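
The quartile positions and the rounding rule above can be sketched in Python. The function names `value_at_position` and `quartiles` are hypothetical, and the sketch assumes at least two observations; other software may use different quartile conventions.

```python
import math

def value_at_position(ordered, pos):
    """Value at a 1-based position: average at #.5, otherwise round to nearest."""
    lower = math.floor(pos)
    frac = pos - lower
    if frac == 0:
        return ordered[lower - 1]
    if frac == 0.5:
        # Position ends in .5: average the two nearest ordered values.
        return (ordered[lower - 1] + ordered[lower]) / 2
    # Any other fraction: round the position to the nearest integer.
    nearest = lower + 1 if frac > 0.5 else lower
    return ordered[nearest - 1]

def quartiles(values):
    """Q1 and Q3 from the (n + 1) / 4 and 3(n + 1) / 4 positions."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("this sketch assumes at least two observations")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3

books = [4, 5, 7, 7, 8, 9, 10]
print(quartiles(books))  # prints: (5, 9)
```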

5. Descriptive Statistics in JMP


Method 1: Distribution Function

Enter data into a Data Table (Important: do not mix character and numerical values in a
column!)
Highlight column (takes some practice; hint: to refresh selection: Rows > Clear Row States)
Analyze > Distribution > OK


More statistics available by clicking red arrow beside Summary Statistics


Method 2: Summary Function
Tables > Summary
JMP Summary Statistics Menu

From Statistics drop-down menu (see above), select statistics one at a time. Selected
statistics will then appear in box to right. (Note: drop-down menu does not appear in picture
below)

For Q1 and Q3, choose Quantile statistic twice, specifying 25% and 75% in this box:

Click: OK


For more info: http://www.jmp.com/support/help/Summarize_Columns.shtml

6. Measures of Dispersion: the Variance


Range
Range = Maximum - Minimum
Ex Books: 4 5 7 7 8 9 10

Range = 10 - 4 = 6

Interquartile Range
IQR = Q3 - Q1
Ex The sample of books:

Q1 = 5, Q3 = 9,

IQR = 9 - 5 = 4
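
The range and IQR for the books sample can be checked with a few lines of Python. This sketch uses `statistics.quantiles` (Python 3.8+) with its default "exclusive" method, which interpolates linearly between ordered values; it agrees with the hand calculation here, but for other sample sizes it may differ from a rule that rounds the position.

```python
from statistics import quantiles

books = [4, 5, 7, 7, 8, 9, 10]

data_range = max(books) - min(books)   # Maximum - Minimum
q1, q2, q3 = quantiles(books, n=4)     # three cut points: Q1, median, Q3
iqr = q3 - q1

print(data_range, q1, q3, iqr)  # prints: 6 5.0 9.0 4.0
```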

Variance (Population and Sample)


The variance is the average squared distance of observations from the mean.
Population variance formula: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

The square root of the variance is the standard deviation.


Spreadsheet calculation of population variance:
Ex Books: 4 5 7 7 8 9 10


Variance = Average[(X - mu)^2] = 26.857/7 = 3.84
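
The spreadsheet steps for the population variance can be mirrored in Python as a rough check. Variable names are illustrative; `pvariance` and `pstdev` from the standard `statistics` module divide by N, matching the population formula.

```python
from statistics import mean, pvariance, pstdev

books = [4, 5, 7, 7, 8, 9, 10]

mu = mean(books)                                # 50 / 7
squared_devs = [(x - mu) ** 2 for x in books]   # one squared distance per value
sum_sq = sum(squared_devs)                      # numerator of the variance
variance = sum_sq / len(books)                  # average squared distance

print(round(sum_sq, 3), round(variance, 2))     # prints: 26.857 3.84
print(round(pvariance(books), 2))               # prints: 3.84
print(round(pstdev(books), 2))                  # prints: 1.96
```

The last line is the standard deviation, the square root of the variance.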


Video: Variance of a Population
http://www.youtube.com/watch?v=6JFzI1DDyyk

7. Homework
Read pp. 104-117, Prob 3.1, 3.2a [skip (4), (6), (10)], 3.2b
Data for 3.2b (bolts.xls) on course website
