Vous êtes sur la page 1sur 3

Statistics with Excel

These instructions tell you how to do some basic statistics operations on Excel. There is much
better software for statistics, but Excel is commonly available, and can do the thins we need in this
course.
When I use the word “select” below, it means clicking on a cell, or to select a bunch of cells,
clicking on the first one and dragging down to the last one.
After you select a group of cells in Excel, they should be highlighted.

• DATA: The first thing you need to know how to do is how to get data into Excel. There are
three ways:

1. You may already have the data in an Excel compatiable spreadsheet, in which case you
can simply open the spreadsheet in Excel.
2. You may have the data in a text file, in which case you should follow the following steps:
- Start Excel.
- go to the FILE menu and choose OPEN .
- a dialogue window should come up. You will have to set the “Enable” line at the top
to either “Enable Text Files” or “Enable All Documents.”
- Find the document you want and OPEN it.
3. You may have a small list of numbers, in which case you can open an empty Excel
worksheet, and enter the numbers by hand. Click on the top cell in the column where
you wish to enter numbers, type the first number in, then hit down-arrow, type the
second number, etc.

• BASIC OPERATIONS I can’t give an excel tutorial here, but there is one common thing
which you will need to do. Suppose you have a column of data in A2:A99, and another
column of data in B2:B99, and you wish to put the sum of the columns in column C.
Click on C2, and type “=a2+b2 ” (without quotes) and then return .
Now click on C2 again, and drag the mouse down to C99 so that all those cells are selected.
Then go up to the menu bar, and then choose EDIT , FILL , DOWN . This will enter in
each cell Cn the sum An+Bn.

• STATISTICS The basic procedure is almost the same for most of the statistics you need to
computer (mean, median, quartiles, minimum, maximum, standard deviation, etc). Suppose
your data is in cells A1 through A67.

1. Click on an empty cell in which you wish to enter the statistic you are computing.
2. Now click on the “=” sign next to the area in the toolbar for entering data. (This tells
Excel you will be entering some mathematical expression.) After you do this, an equals
sign will appear in the input area in the toolbar, and you can type in the statistic you
need to compute.
3. Now just type in the right command to get the statistic you need.
MEDIAN: median(A1:A67)
MEAN: average (a1:a67)
MINIMUM: min(a1:a67)
MAXIMUM: max(a1:a67)
STANDARD DEVIATION: stdev(a1:a67)
FIRST QUARTILE: quartile(a1:a67,1)
THIRD QUARTILE: quartile(a1:a67,3).
• HISTOGRAMS: Making a histogram is a little bit annoying, but not too hard. The main
problem is that the histogram package is an “add-on” which has to be turned on in Excel.
Again, I assume you are working on a spreadsheet where your data is already entered, say as
cells A1 to A67 again.
1. First you need to select your bins. Choose dividing points for your data. Enter them in
some empty column on the worksheet. I’ll suppose they are D1 to D9
2. Click on Tools in the toolbar and go down to Data Analysis .
If “Data Analysis” doesn’t appear, you have to click on Tools , then Add Ins and then
check the buttons next to the Analysis ToolPak entries on the menu which comes up,
and then click on OK.
3. Now choose Histogram from the menu which came up when you clicked on Tools and
Data Analysis in the step above. And click OK.
4. Now enter a1:a67 for the “Input Range” on the dialog box which comes up. Also click
on “Chart output” in that box, and to be safe, on “New worksheet ply” so that your
histogram won’t cover up any of your data.
Enter d1:d9 for the “Bin Range.” (If you leave it blank, Excel will choose a bin range
for you.)
5. Click OK. Excel will now make a really bad histogram which you need to fix.
6. First notice that you can move the histogram by clicking on it and dragging. You can
also change the size by clicking on an edge or a corner and dragging.
7. You need to get rid of the “gaps.” Double-click on one of the bars. A menu will come
up. Click on “Options” button, set “Gap Width” to 0, and click “OK.”
8. Now you need to label and title things decently. You can change the name from “his-
togram” to something resonable by clicking on the title, and typing in what you want.
Similarly for other labels.
9. Finally, if you wish to print, click on the body of the histogram, and then choose Print
from the File menu.
• SCATTERPLOTS: To make a scatterplot, you’ll need to be examining two variables, x
and y. Typically x will be in one column of the spreadsheet, and y will be in another.
There are several ways to make a scatterplot.

– The easiest is if x and y are in adjoining columns with x to the left of y. Then select the
data from both columns, click on the Chart Wizard button in the toolbar, select XY
(Scatter) under “Chart type” and the picture which looks like a bunch of dots under
“Chart sub-type.” Then click Next. You can probably click next again until you get to
the page which allows you to enter a title and labels for the axes.
– If x and y are not in adjoining columns but x is still left of y, you can select the data
from both columns (wihout selecting the intermediate columns) and proceed as in the
previous step.
– If x is not to the left of y, or if you are having trouble selecting the data you can enter
the data ranges directly into the scatterplot command. Suppose x is in cells E6 through
E123 and y is in cells C6 through C123. Then click on the Chart Wizard button in
the toolbar, select XY (Scatter) under “Chart type” and the picture which looks like
a bunch of dots under “Chart sub-type.” Then click Next. You’ll get a spot to enter
the x data (enter E6:E123 ) and the y data (enter Y6:Y123 ). Then continue as above
with the title and axes labels.

• CORRELATION COEFFICIENTS: Assuming, as above, that you have your data in E6


through E123 and in C6 through C123, double click on a cell where you wish to enter the
correlation coefficent and type =correl(e6:e123,c6:c123).

• REGRESSION LINE (least squares): Once you have made your scatterplot, you can click
on the plot, and go up to the Chart menu and select Add Trendline. . . . Click on Linear.
If you want to display the equation for the regression line or the r2 value, click on Options
and check the appropriate checkboxes.

• NORMAL DISTRIBUTION PROBABILITIES:

– To work with the standard normal distribution: if you double click on a cell, and enter
=normsdist(z) you will get the area under the standard normal curve to the left of z.
This is the equivalent of looking z up in Table A.
– To work with some other normal distribution: if you double click on a cell and enter
=normdist (x,µ,σ,TRUE) you will get the area under the normal curve with mean µ
and standard deviation σ to the left of x.
– Double-clicking on a cell and entering =normsinv(p) will give the z-vaue so that the
part of the curve to the left of z has area p. (This is using Table A backwards.)

• t-DISTRIBUTION PROBABILITIES: For a t-distribution of n degrees of freedom, to


get the probability that t ≥ a (or that t ≤ −a), double click on a cell, and enter =tdist
(a,n,1).
The “1” describes that you are asking for the area of one tail. If you want the area of two
tails, (this will be the probability of t ≥ a and t ≤ −a) enter =tdist (a,n,2).

Vous aimerez peut-être aussi