
SPSS for Windows Student Manual

Created by H Andrews (2008) Updated (2009)

Contents
Introduction
Section 1: Entering and Saving Data
Section 2: Descriptive Statistics
Section 3: Graphs
Section 4: Correlation Analysis
Section 5: T-tests
Section 6: Chi-Square Tests
Exercise Feedback

Introduction to the SPSS Student Manual


This manual is designed to introduce you to a computer package known as the Statistical Package for the Social Sciences, or more commonly SPSS. SPSS is used to analyse data. The manual is a self-study tool that is constructed to help you master the basics of data analysis in SPSS. It broadly covers three areas:
1. Entering and saving data;
2. Descriptive statistics and graphs;
3. Basic analyses.

Each step builds upon the last and the manual should be worked through in sequence from Section 1 to Section 6. In each section you will be asked to follow an example to familiarise yourself with the steps you need to take in SPSS. There will then be an exercise for you to work through alone. You do NOT need to remember how to perform each operation in SPSS; you can refer back to the instructions. You should work through the manual in your own personal lab time.

We will not cover the background theory of statistics and research methods in this manual. These issues will be addressed in other aspects of your Methods course. To get the most out of SPSS a good understanding of statistical theory and research design is essential. Please ensure that you attend your lectures and seminars in order to support your learning. You will not be able to conduct successful research simply by completing this manual.

To use this manual you should also be computer literate. You will ideally be familiar with Microsoft Office programs and able to use a computer to perform basic tasks, e.g. creating a Word document. If you are not confident in your computing skills please see your course tutor.

If you have any problems with completing the exercises in this manual or using SPSS then please see the psychology technicians or your Methods 1 tutor.

Section 1 Entering and Saving Data


Creating and saving an SPSS data file enables you to create an electronic record of your results that can be analysed. By the time you complete this session you will be able to:
- Open a new SPSS file
- Create variables in SPSS
- Enter data
- Save an SPSS file
- Open an existing SPSS file

1.1 Opening a new Data file in SPSS

1.1.1 Click on the start menu in the bottom left-hand corner of your screen.
1.1.2 Look for the SPSS Statistics 17.0 for Windows icon. If it is not immediately visible, click on All Programmes.
1.1.3 Click on SPSS Inc., then Statistics 17.0.

[Screenshot: SPSS Statistics 17.0 for Windows icon]

1.1.4 Click on the SPSS Statistics 17.0 for Windows icon. This will load SPSS.
1.1.5 As SPSS opens a dialogue box will appear. This provides you with options for opening SPSS. To create a new data file click in the circle next to Type in data.

You should then have a new SPSS data file open. This window is called the Data Editor.

1.2 Entering Data

The SPSS Data Editor has two views: the Data View and the Variable View. Data view displays something similar to a spreadsheet. This is where you enter your raw data. Variable view displays all the variables that you are using. You can provide information about each of your variables in this view to make your results easier to understand later. To switch between the two in SPSS, click on the tab at the bottom of the page.

[Screenshot: the Data View tab and the Variable View tab]

The first step in entering data is to define your variables. Throughout this manual we will be using a simple example to demonstrate how to use some of the basic functions in SPSS.

Example
A teacher was interested in the seemingly large differences in height between the pupils in her school. She wanted to investigate how these were linked to the pupils' age and sex. She collected data on age, sex and height from 15 pupils to investigate how the three were related.

From the example, you can see that there are two independent variables (age and sex) and a single dependent variable (height). Follow the steps below to create a list of the variables in SPSS.

1.2.1 Ensure that you are in Variable View by clicking on the Variable View tab. Each row on this sheet represents one variable. The columns are characteristics of the variables.
1.2.2 In row 1, click on the cell under the Name column. Here you can enter a name for your variable. It is good practice to make your first variable one that uniquely identifies each case. Call this variable ID. Type ID into the Name cell. SPSS will automatically fill in the other characteristics for you.

FYI There are several rules that apply to naming variables in SPSS. Variable names:
a. can be up to 64 characters in length;
b. must be unique;
c. must begin with a letter;
d. cannot include blanks or special characters (e.g. !, ?, *); full stops are best avoided;
e. cannot be words commonly used as commands by SPSS (all, ne, eq, to, le, lt, by, or, gt, and, not, ge, with).

1.2.3 You may wish to alter some characteristics in the Variable View. The two characteristics most commonly altered are Type and Measure. Type refers to the format your data will take (e.g. words, numbers, currency). To change Type click in the cell you want to change and click on the grey box. Choose the format of data that you will enter for that variable (for words choose String). Click OK. Measure refers to the scale of measurement of your data (e.g. ordinal, ratio). To change Measure click in the cell you wish to change and choose from the options that appear in a drop-down list (for interval/ratio data choose Scale).
1.2.4 Click the cell under the Name column in row two. Give this the name of Age.
1.2.5 Click onto the Name column in row three. Give this the name of Sex. Sex is a categorical variable. It is good practice to provide values in order to define which numerical value entered in SPSS relates to which category. To do this click on the cell under the Values column on row three. A grey box will appear at the right-hand side of the cell. Click on this. The following box will appear.
1.2.6 In the box labelled Value, type the number 1. In the box labelled Label, type male. Click on the Add button. This will move your entries into the large white box at the bottom. Type 2 into the Value box. Type female into the Label box. Click on Add. Now click OK. The box will close and you will return to the Variable View. As a result of these steps, in your output (when you have run analyses) instead of your groups being labelled 1 and 2, they will appear as male and female.
1.2.7 As sex is a categorical variable, we also need to change the scale of measurement. Click in the Measure cell and select Nominal.
1.2.8 Click on the cell under the Name column in row 4. Type in Height. Your resulting Variable View should look like this.

1.2.9 Click on the Data View tab. You should now see that your variable headings have been added as column headers in the Data View.
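The naming rules in the FYI above can be checked mechanically before you type a name into SPSS. The sketch below is an illustration only and is not part of SPSS: the function name is_safe_variable_name is hypothetical, and it takes a deliberately conservative reading of rules c and d (a letter first, then only letters, digits or underscores), so it may reject some names that SPSS itself would accept.

```python
import re

# Reserved words listed in the FYI above.
RESERVED = {"all", "ne", "eq", "to", "le", "lt", "by", "or",
            "gt", "and", "not", "ge", "with"}


def is_safe_variable_name(name, existing=()):
    """Return True if `name` passes a conservative reading of rules a-e."""
    if not 1 <= len(name) <= 64:                       # rule a: up to 64 characters
        return False
    # rule b: must be unique (compared ignoring case; an assumption of this sketch)
    if name.lower() in {n.lower() for n in existing}:
        return False
    # rules c and d, read conservatively: a letter first, then only
    # letters, digits or underscores
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        return False
    return name.lower() not in RESERVED                # rule e: no reserved words


names = []
for candidate in ["ID", "Age", "Sex", "Height", "2nd score", "with"]:
    ok = is_safe_variable_name(candidate, names)
    if ok:
        names.append(candidate)
    print(candidate, "->", "accepted" if ok else "rejected")
```

The four variable names used in the example are accepted, while a name that starts with a digit and contains a blank, and a name that is a reserved word, are rejected.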

You are now ready to input your raw data. The data for this example are in the table below.

ID   Age   Sex      Height (cm)
1    10    Male     140
2    7     Male     120
3    8     Female   100
4    9     Male     130
5    9     Female   110
6    10    Male     135
7    8     Female   100
8    8     Female   105
9    9     Female   115
10   8     Male     120
11   8     Male     130
12   9     Female   130
13   7     Male     105
14   10    Male     135
15   10    Female   120

1.2.10 To add data for participant one, click in the cell under ID in row one. Type in 1 using the keyboard. Staying in row 1, move to the cell below Age. Type in 10. Move over again to the cell under sex. Remember that sex is a categorical variable, and the data we enter is the NUMBER that represents the category we want. In this case the participant is male, so we enter 1, as this is the number we defined as male when we entered values for this variable. Finally, move along to the height column, still in row 1. Enter the height of participant 1, which is 140. The resulting data should look like this:

1.2.11 Repeat the process for each participant, starting on a new row each time. Once you have finished, your data view should look like this:
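To make the coding of the categorical variable concrete, here is a minimal Python sketch of the same data entry. It is an illustration outside SPSS: the names SEX_LABELS, SEX_CODES, raw and cases are made up for this sketch. It shows each pupil becoming one row, with sex stored as the number defined in the value labels (1 = male, 2 = female) rather than as a word.

```python
# Value labels defined for Sex in the Variable View: code -> label
SEX_LABELS = {1: "male", 2: "female"}
# Reverse lookup used when "typing in" the data: label -> code
SEX_CODES = {label: code for code, label in SEX_LABELS.items()}

# (ID, Age, Sex, Height in cm), copied from the data table above
raw = [
    (1, 10, "Male", 140), (2, 7, "Male", 120), (3, 8, "Female", 100),
    (4, 9, "Male", 130), (5, 9, "Female", 110), (6, 10, "Male", 135),
    (7, 8, "Female", 100), (8, 8, "Female", 105), (9, 9, "Female", 115),
    (10, 8, "Male", 120), (11, 8, "Male", 130), (12, 9, "Female", 130),
    (13, 7, "Male", 105), (14, 10, "Male", 135), (15, 10, "Female", 120),
]

# Each row of the Data View is one case; sex is stored as its numeric code
cases = [
    {"ID": i, "Age": age, "Sex": SEX_CODES[sex.lower()], "Height": height}
    for i, age, sex, height in raw
]

print(cases[0])   # first row, as entered in step 1.2.10
print(len(cases), "cases entered")
```

The first row prints with Sex equal to 1, matching what you typed for participant 1.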

This is now ready for us to analyse. Before we do this, however, we need to save the data that we have entered.

1.3 Saving Data in SPSS

It is crucial to save your data as you go along. Follow these steps to save this data file. You will need this file later on in the manual.

1.3.1 Click on File in the top left-hand corner of your screen. Click on Save As from the drop-down menu.
1.3.2 You need to select a location to save your data file in. To see all available locations click on the arrow at the far right-hand end of the box labelled Look in:.
1.3.3 From this list select the location you wish to save your file in by clicking on it. SPSS data files that are already held in that location will be displayed in the larger box.

FYI The most reliable place to save data is in your student account on the G:\ drive.

1.3.4 Once you are happy with the location you are saving your file to, you need to name the file. In this example we will call it Data 1. To do this, click in the box labelled File Name:. Type the name you have chosen for the file here.
1.3.5 Ensure that in the Save as type: box SPSS (*.sav) is selected. If it is not, click on the arrow and select it from the list.
1.3.6 Once you are happy with the location of the file and the name you have given it, click on Save.
1.3.7 Your file should now be saved to the location you specified. If your save is successful you will receive a message in a separate Output Window like the one below.

1.3.8 To exit this message click on the red cross at the top right-hand corner of the window. When asked if you want to save the contents of the output viewer click No.
1.3.9 To close your data file click on the red cross at the top right-hand corner of the window. When asked if you want to proceed click Yes. This will close SPSS completely.

1.4 Opening an Existing File in SPSS

Follow steps 1.1.1-1.1.4. When the dialogue box opens select Open an existing data source.

Recently opened items will appear in the box below Open an existing data source. Click on the file you wish to open. Click on OK.

If the data file you are looking for is not in the list of data sources provided, double click on More Files. Search for your file on your computer, as you would any other file. When you find the file you are looking for, click on it to highlight it, then click on Open. This will open the data file in the Data Editor window. An output window will open showing that you have retrieved a file.

FYI You can add to or alter the data and variables in an existing data file. To add new variables or data simply follow the same procedure as when entering them for the first time. To alter data or variables simply click into the cell you want to alter, delete the contents and replace them with the new content. When you have finished click on the File menu and click on Save.

Exercise 1
A researcher was interested in whether music can have an effect on heart rate in humans. He tested 30 participants in total. 10 participants listened to classical music, 10 to jazz and 10 to dance music. He recorded their average heart rate in beats per minute and the sex of each participant. At the end of the session he also asked each participant to record whether they felt energised, neutral or relaxed and how much they enjoyed the music on a scale of 1-10. The data he obtained are given in the table below.

1.1 Open a new data file in SPSS.
1.2 Create three new variables to represent the participant, the kind of music they listened to and their heart rate.

Handy Hints! Remember that type of music, sex and feelings are categorical variables and you need to create values to label your groups.

1.3 Enter the data shown in the table into the Data View.

Handy Hints! All data entered must be numerical. You cannot enter words.

1.4 Save the data file you have created to a place of your choice.

Participant  Music Group  Average Heart Rate  Sex     Feeling    Enjoyment
1            classical    65                  Female  Relaxed    7
2            jazz         70                  Female  Relaxed    8
3            classical    68                  Male    Neutral    4
4            dance        72                  Female  Neutral    5
5            dance        70                  Male    Relaxed    5
6            jazz         75                  Female  Energised  8
7            classical    72                  Male    Neutral    5
8            classical    70                  Male    Relaxed    4
9            jazz         75                  Female  Neutral    7
10           classical    71                  Male    Neutral    7
11           dance        78                  Female  Energised  6
12           dance        75                  Male    Energised  7
13           jazz         73                  Male    Energised  9
14           classical    69                  Female  Relaxed    2
15           jazz         75                  Female  Neutral    6
16           dance        80                  Female  Energised  7
17           classical    70                  Female  Relaxed    9
18           classical    68                  Male    Relaxed    8
19           classical    67                  Male    Relaxed    4
20           jazz         69                  Male    Relaxed    7
21           jazz         74                  Female  Neutral    7
22           dance        75                  Male    Neutral    5
23           jazz         78                  Female  Neutral    8
24           classical    71                  Male    Neutral    6
25           dance        81                  Female  Energised  9
26           jazz         76                  Male    Energised  8
27           dance        76                  Male    Energised  7
28           dance        78                  Female  Neutral    5
29           jazz         74                  Female  Relaxed    6
30           dance        80                  Female  Energised  7

Section 2 Descriptive Statistics


Descriptive statistics are a vital part of analyses in Psychology. The most important function of descriptive statistics is to outline the characteristics of your sample. Measures that provide this function include the mean, median, mode, minimum and maximum values, range and standard deviation. By the end of this session you will be able to:
- Obtain descriptive statistics for continuous variables
- Obtain descriptive statistics for categorical variables
- Obtain separate descriptive statistics for groups within a sample
- Save output created in SPSS
- Print descriptive statistics
- Import descriptive statistics into Word documents

2.1 Descriptive Statistics for Continuous Variables

2.1.1 Open data file Data 1 that you created in Section 1 (see section 1.4 for instructions on how to do this).
2.1.2 There are two continuous variables that we are interested in obtaining descriptive statistics for: age and height. To do this, begin with the Data Editor window open. Click on Analyse (spelled Analyze in the SPSS menu bar) from the menu at the top of the page.
2.1.3 Move your cursor down to Descriptive Statistics. Seven options should appear to the right of the original drop-down box.
2.1.4 Click on Descriptives. The following dialogue box should open.

2.1.5 We now need to move the variables of interest into the box labelled Variable(s):. To do this highlight Age by clicking on it. Then click the arrow button located between the two boxes. This should move Age from the left-hand box into the Variable(s): box.
2.1.6 Repeat step 2.1.5 with Height.
2.1.7 You now need to select which descriptive statistics you produce. To do this click on Options. The following box will be displayed.
2.1.8 Ensure that there are blue ticks in the boxes next to Mean, Std. deviation, Minimum and Maximum. You may also wish to select additional descriptive statistics from those available. In this case we want the Variance and Range also. Tick the box next to each of these by clicking in it. If you have selected it correctly a blue tick will appear in the box.
2.1.9 Click on Continue. This will take you back to the box shown in step 2.1.5.
2.1.10 To run the analyses click on OK. It may take a few seconds for SPSS to process the analyses.

FYI When you run analyses in SPSS, your results are presented in another window. This window is known as the SPSS Statistics Viewer (see below).

What to look for
Look at the table entitled Descriptive Statistics.

Descriptive Statistics
                     N    Range   Minimum  Maximum  Mean      Std. Deviation  Variance
Age                  15   3.00    7.00     10.00    8.6667    1.04654         1.095
Height               15   40.00   100.00   140.00   119.6667  13.42528        180.238
Valid N (listwise)   15

On the top row of the table are the results of the analyses for the Age variable. On the second row are the results for the Height variable. The first column (N) shows the number of cases (participants in this example) included in the analyses. Check that this is what you expect. If N is larger or smaller than the number of participants you tested, you have probably made a mistake in your data entry and you should check this. The second column gives the range of the scores in the dataset. The third column displays the minimum value of the variable. The fourth column gives the maximum value of the variable. The fifth column displays the mean value. The sixth column displays the standard deviation. The mean and standard deviation are the two descriptives most commonly reported. The seventh column displays the variance. For more information on what these numbers mean, and how to interpret them, please consult your notes from lectures and course text books.
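If you would like to see where the figures in the Descriptive Statistics table come from, the following Python sketch recomputes them from the raw data using only the standard library. It is an illustration and not how SPSS works internally; the function name describe is made up for this sketch. The standard deviation and variance use the sample (n - 1) formulas, which is what reproduces the values in the table.

```python
import statistics

# Age and height for the 15 pupils, in ID order (from the data table in Section 1)
age = [10, 7, 8, 9, 9, 10, 8, 8, 9, 8, 8, 9, 7, 10, 10]
height = [140, 120, 100, 130, 110, 135, 100, 105, 115, 120, 130, 130, 105, 135, 120]


def describe(values):
    """Return the statistics ticked in step 2.1.8 for one variable."""
    return {
        "N": len(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Mean": statistics.mean(values),
        "Std. Deviation": statistics.stdev(values),   # sample SD (divides by n - 1)
        "Variance": statistics.variance(values),      # sample variance (n - 1)
    }


for name, values in [("Age", age), ("Height", height)]:
    print(name, {key: round(value, 4) for key, value in describe(values).items()})
```

Comparing the printed values with the table is a quick way to confirm that the data were entered correctly.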

2.2 Descriptive Statistics for Categorical Variables

The procedure for obtaining descriptive statistics for categorical variables is different from that for continuous variables. When looking at categorical variables we are usually interested in knowing the number of cases in each category. In this example sex is the only categorical variable. We will now obtain the descriptive statistics for this variable.

2.2.1 In the Data Editor window click on Analyse.
2.2.2 Select Descriptive Statistics, then Frequencies.
2.2.3 In the Frequencies dialogue box that opens, highlight the variable Sex. Move this into the right-hand box labelled Variable(s): using the arrow button in the centre of the two boxes.
2.2.4 Click on the Statistics button. In the Dispersion section, click on Minimum and Maximum.
2.2.5 Click on Continue, then OK. The output will appear in the SPSS Viewer.

What to look for
From the output we can see a number of things. Firstly, look at the box labelled Statistics.

Statistics
Sex
N   Valid     15
    Missing   0
Minimum       1.00
Maximum       2.00

The first row shows the number of cases that were included in the analyses. The second row shows how many cases in the data set were not included in the current analysis. The most common reason for a case being missing is that there is no data for that case on the variable in question. We can also see that the minimum value entered for sex is 1 (row 3), and the maximum value is 2 (row 4). These numbers correspond with what we would expect. If they did not we would want to check the data file for errors. Now look at the box labelled Sex.

Sex
                 Frequency  Percent  Valid Percent  Cumulative Percent
Valid   male     8          53.3     53.3           53.3
        female   7          46.7     46.7           100.0
        Total    15         100.0    100.0

Column 1 shows the number of cases falling into each category. In our sample there are 8 males and 7 females. The second column tells us what percentage of the TOTAL sample falls into each category. In this case 53.3% of the sample is male and 46.7% female. The third column tells us the percentage of cases that fall into each category EXCLUDING cases that were missing (i.e. of all the cases with a value for sex, what percentage were male and female). As we had no missing cases the values in columns 2 and 3 are the same. The Cumulative Percent column adds the percentage in a category to the percentage in the previous categories, so the final category always reaches 100%.
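The difference between Percent, Valid Percent and Cumulative Percent is easiest to see by computing them by hand. The Python sketch below is an illustration only: the variable names are made up, and the use of None to mark a missing value is an assumption of the sketch (this data set has no missing values).

```python
from collections import Counter

SEX_LABELS = {1: "male", 2: "female"}
# Sex codes for the 15 cases, in ID order; None would mark a missing value
sex = [1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 1, 1, 2]

valid = [code for code in sex if code is not None]
counts = Counter(valid)

cumulative = 0.0
rows = []
for code in sorted(counts):
    percent = 100 * counts[code] / len(sex)            # share of ALL cases
    valid_percent = 100 * counts[code] / len(valid)    # share of non-missing cases
    cumulative += valid_percent                        # running total of Valid Percent
    rows.append((SEX_LABELS[code], counts[code], round(percent, 1),
                 round(valid_percent, 1), round(cumulative, 1)))

for row in rows:
    print(row)
```

With no missing cases the second and third percentages are identical, as in the SPSS table. If a value in the list were replaced by None, Percent and Valid Percent would start to differ.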

2.3 Separate Descriptive Statistics for Groups within a Sample

In some situations we are interested in examining descriptive statistics not for all of our sample, but for the subgroups within our sample. In this example, we are interested in looking at the average age and height of males and females in our sample. We do this because a first look at the group means tells us whether there is any difference worth testing: if the mean heights of males and females were identical, no further analysis of that difference would be needed.

2.3.1 In the Data Editor click on the Data menu and click on Split File.
2.3.2 In the Split File dialogue box click on the Organise output by groups option.
2.3.3 You then need to select the variable you will use for grouping. This must be a categorical variable. In this case we want our groups to be based on sex, so we get descriptive statistics for males and females separately. Click on sex and move it into the box labelled Groups Based on: using the arrow button.
2.3.4 Click OK. Your file will now be split by sex in ALL subsequent analyses.
2.3.5 Use the same procedure as in section 2.1 to produce the descriptive statistics.

FYI To turn off the split file function after producing separate descriptive statistics, click on the Data menu and Split File. Select Analyse all cases, do not create groups. Click OK.

What to look for
As you can see, the descriptive statistics are produced for males and females separately. The first table provides descriptive statistics on the age and height of male participants (footnote a. tells you which category is being analysed).

Descriptive Statistics(a)
                     N   Range   Minimum  Maximum  Mean      Std. Deviation  Variance
Age                  8   3.00    7.00     10.00    8.6250    1.30247         1.696
Height               8   35.00   105.00   140.00   126.8750  11.31923        128.125
Valid N (listwise)   8
a. Sex = male

We can see that the mean age of male participants was 8.625 and the mean height was 126.875.

The second table shows that the mean age of female participants was 8.7143 and the mean height was 111.4286.

Descriptive Statistics(a)
                     N   Range   Minimum  Maximum  Mean      Std. Deviation  Variance
Age                  7   2.00    8.00     10.00    8.7143    .75593          .571
Height               7   30.00   100.00   130.00   111.4286  11.07335        122.619
Valid N (listwise)   7
a. Sex = female
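What Split File does can be pictured as sorting the cases into one pile per category and then describing each pile on its own. The Python sketch below is an illustration under that reading, not a description of SPSS internals; the names cases, groups and summary are made up for the sketch.

```python
import statistics

SEX_LABELS = {1: "male", 2: "female"}
# (Age, Sex code, Height in cm) for the 15 pupils, in ID order
cases = [
    (10, 1, 140), (7, 1, 120), (8, 2, 100), (9, 1, 130), (9, 2, 110),
    (10, 1, 135), (8, 2, 100), (8, 2, 105), (9, 2, 115), (8, 1, 120),
    (8, 1, 130), (9, 2, 130), (7, 1, 105), (10, 1, 135), (10, 2, 120),
]

# "Split the file": collect the cases for each sex separately
groups = {}
for age, sex, height in cases:
    groups.setdefault(sex, []).append((age, height))

# Describe each group on its own
summary = {}
for code in sorted(groups):
    ages = [a for a, _ in groups[code]]
    heights = [h for _, h in groups[code]]
    summary[SEX_LABELS[code]] = {
        "N": len(ages),
        "mean_age": statistics.mean(ages),
        "mean_height": statistics.mean(heights),
        "sd_height": statistics.stdev(heights),   # sample SD (n - 1)
    }

for label, stats in summary.items():
    print(label, {key: round(value, 4) for key, value in stats.items()})
```

The group sizes and means printed here should agree with the two tables above.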

2.4 Saving Output

2.4.1 With the SPSS Statistics Viewer open click on File and select Save As.
2.4.2 Find the location you desire (as in section 1.3).
2.4.3 In the box labelled File Name enter the name you wish the file to be stored under (e.g. descriptive statistics 1).
2.4.4 Check that in the box labelled Save as type:, Viewer Files (*.spv) is selected. If it is not, select it from the drop-down menu.
2.4.5 Click on Save.

2.5 Printing Tables

You may wish to print the tables that SPSS produces in its output. In order to do this you need to ensure that the table will fit onto the page, so you can read it easily.

2.5.1 Double click with the left mouse button on the centre of the table that you wish to print. The table will become highlighted and the Formatting Toolbar will appear. A separate Pivoting Trays window will also open. To format for printing, however, we do not need to use the toolbar or pivoting trays. For more information on these consult your course text.
2.5.2 Click on Format, then Table Properties.
2.5.3 In the Table Properties dialogue box that opens, click on the Printing tab.
2.5.4 Make sure the Rescale wide table to fit page and Rescale long table to fit page boxes are selected (i.e. have a blue tick next to them). If they are not, click in the box next to them.
2.5.5 Click on OK.
2.5.6 To print the table go to the File menu and select Print.
2.5.7 In the Print range area of the Print dialogue box that opens ensure that Selection is highlighted. This will print only the table you are working on. If you would like to print all of your output, then select All visible output.

FYI Before you print all output make sure that you have followed steps 2.5.1-2.5.5 for all the tables in your output.

2.5.8 In the box labelled Name: ensure that the correct printer appears. Click on the arrow to display all printers available. Select the correct printer by clicking on it.
2.5.9 Click on OK. Your table will now print.

2.6 Copying Output into Word

You may wish to copy the tables produced in SPSS output into a Word document, e.g. a practical report.

2.6.1 Firstly, ensure that you have both SPSS and Word open.
2.6.2 In SPSS, right click over the centre of the table you wish to copy.
2.6.3 From the list that appears, select Copy.
2.6.4 Switch to Word. From the Edit menu, select Paste.
2.6.5 The table should now be copied into your Word document.

Exercise 2
Using the data file you created in Exercise 1 complete the following:

2.1 Obtain descriptive statistics for the variables heart rate, feeling and enjoyment. Print out the tables produced.

Handy Hint! Remember that different statistics are needed to describe different kinds of data.

2.2 Produce descriptive statistics for heart rate for each music category separately. What can you say about the relationship between heart rate and the type of music listened to? Would you suggest further analyses?
2.3 Import each of the tables into Word. Save the descriptive statistics you have produced.

Section 3 Graphs
This section will cover how to create visual representations of your data in the form of a graph. Graphs can be very useful in summarising data and in making key trends clearer. We will cover three kinds of graph in this section: histograms, bar charts and scatterplots. When you complete this session you will be able to:
- Produce a histogram
- Produce a bar chart
- Produce a scatterplot
- Edit a graph SPSS produces
- Save a graph
- Print a graph
- Import a graph into Word

3.1 Histograms

Histograms are used to display the distribution of scores on a single, continuous variable. The shape of a histogram gives information on whether a variable is normally distributed. In this case we will look at the distribution of height.

3.1.1 In the Data Editor click on Graphs from the main menu across the top of the page, then click on Chart Builder.
3.1.2 The Chart Builder will open in a new window. In order for Chart Builder to work effectively you must have set the correct measurement level for each of your variables (i.e. nominal, ordinal or scale). A warning message to this effect will be displayed. If you have set the correct level of measurement already (as we have) click OK. If you have not, click on Define Variable Properties and do this now.

3.1.3 Choose the type of graph you want to create in the Gallery tab. We want to create a histogram, so click Histogram. The display panel will then show the options for a histogram. Hovering the mouse over an image will display a description of the type of histogram. We wish to create a simple histogram, so drag the Simple Histogram (far left image) into the chart preview area. This begins the building process in the preview area and opens the Element Properties box.
3.1.4 To create your graph you need to drag variables of interest into the chart preview area. We wish to create a histogram of Height, so click on the height variable and drag it into the box labelled X-Axis?. This label will be replaced by Height and the bars will change.

FYI The chart presented in the preview area is based on example data, not your actual data. Your data will be used to create the chart once you have finished building it.

3.1.5 In some kinds of graph you also require a variable on the Y-axis. This is not the case with a histogram, where the Y-axis represents the frequency of the variable on the X-axis. We can therefore leave the Y-axis as it is.
3.1.6 Click on Display normal curve in the Element Properties box. Click Apply. This will superimpose a normal distribution curve over the top of your histogram.
3.1.7 To add a title to the histogram click on the Titles/Footnotes tab in the Chart Builder.
3.1.8 Click on Title 1. Title 1 will appear in the Element Properties box. Give the graph a title, e.g. Histogram of Height Scores, in the Content: box.
3.1.9 Click Apply. The title will show as T1 in the preview area but the text will display in the final output.
3.1.10 Click OK. The following output will be produced.

What to look for

Height is represented on the horizontal axis of the graph. Rather than representing one number or category, the bars on a histogram represent all the cases that fall between two specified values. For example, the bar to the far left of the histogram above represents cases where height is 97.5 or over, but less than 102.5. The number of cases that fall into this group is represented by the vertical, Frequency axis. In this example, 2 participants were between 97.5 and 102.5cm tall.

The overall shape of the histogram gives us information on whether the data is normally distributed. You should know from your statistics lectures that normally distributed data has fewer cases at the extremes with the majority of cases falling around the mean. This gives the characteristic bell-shaped curve. In this example, we have more cases at the extremes than we would expect in a classic normal distribution. The superimposed normal distribution curve shows the shape we are looking for: the more closely the bars follow it, the more nearly normal the data are. Because the bars here depart from the curve, we would want to look more closely at the data.

FYI Further analysis of the normality of data is beyond the scope of this manual. Please consult a statistics textbook or the SPSS Help menu for more information on how to do this.
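The way a histogram groups scores into bars can be sketched in a few lines of Python. This is an illustration only: the first bar's lower edge (97.5) is taken from the description above, while the assumption that every bar has the same width of 5cm is ours, so the later bars may not match the SPSS chart exactly.

```python
# Heights (cm) for the 15 pupils, in ID order
height = [140, 120, 100, 130, 110, 135, 100, 105, 115, 120, 130, 130, 105, 135, 120]

BIN_START = 97.5   # lower edge of the first bar, as described above
BIN_WIDTH = 5.0    # assumed constant bar width

# Count how many heights fall into each interval [lower, lower + BIN_WIDTH)
counts = {}
for h in height:
    index = int((h - BIN_START) // BIN_WIDTH)
    lower = BIN_START + index * BIN_WIDTH
    counts[lower] = counts.get(lower, 0) + 1

# Print a simple text histogram, one row per non-empty bar
for lower in sorted(counts):
    print(f"{lower:6.1f} to under {lower + BIN_WIDTH:6.1f}: {'#' * counts[lower]}")
```

The first row shows two cases between 97.5 and 102.5, matching the far-left bar discussed above.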

3.2 Bar Charts

Bar charts can be used to show the number of cases in a particular category or the score on a continuous variable for different groups. In this example we will create a bar chart to show the height of males and females at various ages.

3.2.1 In the Data Editor window click on Graphs, then on Chart Builder.
3.2.2 Click on Bar in the Gallery tab. Select Clustered Bar from the images (second from left on top row). Drag the image into the preview area.
3.2.3 To demonstrate this kind of graph we need to treat Age as an ordinal variable. To do this right click on Age in the Variables: box. Select Ordinal. The icon to the left of Age will now change to a bar chart. Drag Age to the X-Axis? box in the preview panel.
3.2.4 Drag Height to the Y-Axis? box. The chart automatically displays Mean Height on the Y-axis. As this is the statistic we want we do not need to change this. If you wish to change this statistic you can do this in the Element Properties box. Consult your course text or SPSS Help for more information on doing this.
3.2.5 To show males and females separately drag Sex from the Variables: box into the box labelled Cluster on X: set colour in the preview panel.
3.2.6 To add a title follow the instructions given in 3.1.7-3.1.9.
3.2.7 Click on OK. The following output should be produced.

What to look for

As you can see in the graph, the age groups are listed along the horizontal axis, with mean height represented on the vertical axis. If you look at the legend in the top right-hand corner, you will see that blue bars represent males and green bars represent females. From the graph we can therefore tell that the mean height of males aged 8 was 125cm and that of females aged 8 was just over 100cm. As well as displaying information in a clear manner, the bar chart can also give us information on trends within the data. In this example, mean height increases with age and, at every age where both sexes are represented, males are taller on average than females.
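Each bar in the clustered bar chart is simply the mean height of one age-by-sex group. The Python sketch below recomputes those means from the raw data as an illustration; the names cases, cells and means are made up for the sketch.

```python
from statistics import mean

SEX_LABELS = {1: "male", 2: "female"}
# (Age, Sex code, Height in cm) for the 15 pupils, in ID order
cases = [
    (10, 1, 140), (7, 1, 120), (8, 2, 100), (9, 1, 130), (9, 2, 110),
    (10, 1, 135), (8, 2, 100), (8, 2, 105), (9, 2, 115), (8, 1, 120),
    (8, 1, 130), (9, 2, 130), (7, 1, 105), (10, 1, 135), (10, 2, 120),
]

# One list of heights per (age, sex) combination: one bar each
cells = {}
for age, sex, height in cases:
    cells.setdefault((age, sex), []).append(height)

means = {key: mean(values) for key, values in cells.items()}

for age, sex in sorted(means):
    print(age, SEX_LABELS[sex], round(means[(age, sex)], 1))
```

Note that there are no 7-year-old females in the sample, so that combination produces no bar.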

3.3 Scatterplots

Scatterplots are typically used to explore the relationship between two continuous variables. In this case we will look at the relationship between age and height.

3.3.1 Click on Graphs, then Chart Builder.
3.3.2 Click on Scatter/Dot in the Gallery tab. From the summary images select Simple Scatter (far left of the top row). Drag the image into the preview area.
3.3.3 Click on the variable that you consider to be the dependent variable, in this case height, and drag it into the box labelled Y-Axis?. This variable will appear on the vertical axis of the graph.
3.3.4 In the same way, move the other variable (age) into the box labelled X-Axis?. This variable will appear on the horizontal axis.
3.3.5 You can mark which cases belong to a certain category in a scatterplot. In this case, for example, we can mark which cases are male and which are female. To do this click on the Groups/Point ID tab. Select Grouping/Stacking Variable. A box entitled Set Colour will appear in the preview panel.
3.3.6 Drag sex into the Set Colour box.
3.3.7 To add a title follow the instructions given in 3.1.7-3.1.9.
3.3.8 Click on OK. The output should look as follows.

What to look for

The age of participants is represented along the horizontal axis. Height is represented on the vertical axis. Each circle represents a single case. The age and height of the participant is represented by the position of the circle. For example, if you look at the circle closest to the bottom left-hand corner of the graph we can see from the horizontal position that this person is 7 years of age, and from the vertical position that this person is 105cm tall. If you look at the legend in the top right-hand corner, you can see that blue circles represent males and green ones females. The circle is blue, therefore this person is male.

You can see from the scatterplot that generally, as age increases so does height. This trend appears to be the same for males and females. This kind of relationship is known as positive. If height were to decrease as people got older (something we might expect with elderly people rather than children, for example) then the relationship would be negative. If the scatter diagram revealed points positioned randomly in the graph space this would suggest that there was no relationship between the variables. More information is given on what to look for in a scatterplot in section 4.1.
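The direction of the relationship seen in the scatterplot can also be summarised with a single number, the Pearson correlation coefficient, which is positive when the two variables rise together and negative when one falls as the other rises. The Python sketch below is an illustration only; the function name pearson_r is made up for the sketch and the formula is the standard textbook one.

```python
import math

# Age and height for the 15 pupils, in ID order
age = [10, 7, 8, 9, 9, 10, 8, 8, 9, 8, 8, 9, 7, 10, 10]
height = [140, 120, 100, 130, 110, 135, 100, 105, 115, 120, 130, 130, 105, 135, 120]


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(age, height)
if r > 0:
    print("positive relationship")
elif r < 0:
    print("negative relationship")
else:
    print("no linear relationship")
```

For these data the coefficient is positive, in line with the upward trend visible in the scatterplot.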

3.4

Editing Graphs

You can alter graphs in SPSS to make them suitable for use in your work (e.g. practical reports). To alter a graph you need to open the Chart Editor window. 3.4.1 Double left click on the graph you wish to alter. The Chart Editor window will open complete with additional menu options and icons (see below).

FYI You can alter many things about your graph in the Chart Editor. Examples include the text on the graph, the colours and patterns used and the positioning of the graph. The best way to learn is to experiment. However, as a guide, we will look here at how to change the colours in the graph. 3.4.2 To alter the colours of the graph click ONCE on the area that you wish to change (i.e. on the bars). This will create a blue outline around the selected area. 3.4.3 Right click once over the bars. This will produce the following box (see below).

FYI To make changes to any part of your graph, ensure that you click over that particular part of the graph. 3.4.4 Click on the Properties Window option. The following box will appear.

3.4.5 Click on the Fill and Border tab. 3.4.6 To change the colour used to fill the bars click on the coloured box to the left of Fill. This is the current colour of the graph. Then click on the colour you would like the graph to be from the selection on the right (there are more options available if you click on Edit). This will change the colour in the box next to fill to your selection. In this example we will choose a blue.

3.4.7 To change the colour of the border click on the box to the left of Border. Then choose your new selection from the right in the same way as when changing the fill colour (3.4.6). In this example we will choose red.

3.4.8 To add a pattern click on the drop down arrow box below Pattern. Select a pattern from those available by clicking on it. For this example choose spots.

3.4.9 To make the changes to the graph click on Apply. The resulting graph should look like this.

3.4.10 To exit the Chart Editor click on the red cross in the top right-hand corner of the window. You should then return to the SPSS viewer. The changes to your chart will be seen here.

FYI If your graphs are to be printed in black and white it is better to use patterns than colours to distinguish between different bars in a bar chart or different symbols to distinguish categories in a scatterplot.

3.5

Saving Graphs

3.5.1 In SPSS Viewer click on the File menu and select Save As. 3.5.2 Select the correct location in the drop down box labelled Look in: (see section 1.3). 3.5.3 Give your file a name in the box labelled File Name: (e.g. Height Histogram). 3.5.4 Check that in the Save as Type: box, Viewer Files (*.spv) is selected. If it is not click on the arrow button and select this file type from the drop down menu. 3.5.5 Click on Save.

3.6

Printing Graphs

Once you are happy with your graph you may wish to print it. The following steps outline this procedure. 3.6.1 In the SPSS Viewer, click once on the graph you wish to print. This will create a box around the graph (see below).

3.6.2 Click on the File menu, then on Print.

3.6.3 A dialogue box will open. Ensure that the correct printer (i.e. the one you intend to print from) is selected in the Printer: box. Use the drop down menu to see all printers available. Click on the correct one to select it. 3.6.4 Ensure that under Print Range, Selected output is highlighted.

3.6.5 Click on OK. This will print your graph.

3.7

Importing Graphs into Word

Graphs are often produced so they can be added to the results section of a report. To do this you need to import your graph into Word. 3.7.1 Ensure that Word and the SPSS Viewer containing your graph are both open. 3.7.2 Click once on the graph you would like to copy. A border should appear around it. 3.7.3 Click on Edit from the menu at the top of the page. Select Copy.

3.7.4 Move into Word. Place your cursor where you would like your graph to appear. 3.7.5 Click on Edit from the top menu, and click on Paste. Your graph will now be copied into your Word document.

Exercise 3 Using the data file created in Exercise 1 complete the following:

3.1 Produce a histogram of heart rate. Change the colour of the bars to red. Print the graph. What does the shape of the distribution tell you about heart rate?

3.2 Create a bar chart with music type as the categorical variable on the horizontal axis and heart rate as the continuous variable on the vertical axis. Add sex as an additional variable to break scores down by. Import the graph into a Word document. What does this graph tell you about the trends within the data?

3.3 Create a scatterplot showing the relationship between heart rate and enjoyment levels. Add markers that identify whether a case was in the classical, jazz or dance music group. Edit the graph so that the markers are shown as different symbols rather than different colours. Save this graph.

Handy Hint! To change the shape of one marker not all of them, click on the marker in the legend, not in the graph itself.

Section 4 Correlation Analysis


Simple bivariate correlation analysis is used to describe the strength and direction of the relationship between two variables. This section will describe the procedure for obtaining a Pearson product-moment correlation coefficient. This statistic can be used with two continuous variables, or with one continuous variable and one categorical variable that has only two categories. We will focus on the situation where you are analysing two continuous variables. Upon completing this section you will be able to:

Analyse a scatterplot prior to performing a correlation analysis
Conduct a Pearson product-moment correlation analysis

4.1

Analysing the Scatterplot

Before conducting a correlation analysis you should produce a scatterplot of your two variables (see section 3.3). The scatterplot should be analysed to ensure the suitability of the data to correlation analysis. The scatterplot also provides useful information on the relationship between the variables of interest. 4.1.1 From the scatterplot you should look for any outliers. These are data points that do not fit with the general trend of points. If you find any extreme outliers it is worth checking whether there is an error in the data entry. In the diagram below, the red data point may be an outlier and should be investigated further. For further information on outliers consult your course text.

4.1.2 You should also check the general distribution of points. For correlation analysis to be appropriate you need a linear relationship between your variables. If you have no relationship, a cone-shaped distribution or a curvilinear relationship, then correlation analysis is not appropriate. The ideal shape of the cluster of points is a cigar shape: roughly straight, wider in the centre and tapering to either end, representing a linear relationship. If your distribution roughly resembles this you should continue with correlation analysis.

No relationship

Cone-shaped distribution

Curvilinear relationship

Linear relationship

4.1.3 The scatterplot can also provide useful information on the kind of relationship that you expect between your variables. If the points on the graph fall pretty much along a straight line then you would expect a very strong relationship between the two variables. If they are more spread out then you would expect a weaker relationship.

Stronger relationship

Weaker relationship

4.1.4 The direction of the line also gives you information on the relationship. If the line goes upwards from left to right (i.e. low scores on one variable are associated with low scores on the other variable) then you have a positive relationship. If the line slopes downwards from left to right (i.e. high scores on one variable are associated with low scores on the other variable) then you have a negative relationship.

Positive relationship

Negative relationship
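
FYI If you would like to see how the direction of a relationship shows up in a number, the short Python sketch below calculates a Pearson correlation by hand. It is only an illustration and is not part of SPSS. The function name pearson_r and the three sets of scores are invented for this example.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of the deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores, invented for illustration only
hours_revised = [1, 2, 3, 4, 5, 6]
exam_mark = [42, 50, 49, 61, 63, 70]      # tends to rise with hours revised
errors_made = [30, 27, 28, 20, 19, 12]    # tends to fall with hours revised

r_positive = pearson_r(hours_revised, exam_mark)
r_negative = pearson_r(hours_revised, errors_made)
print(round(r_positive, 3), round(r_negative, 3))
```

The first value printed is positive (the line slopes upwards) and the second is negative (the line slopes downwards). The closer the points lie to a straight line, the closer the value gets to 1 or -1.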

4.2

Pearson product-moment correlation coefficient

Once you have ascertained from the scatterplot that your data is suitable for correlation analysis you can perform the procedure in SPSS. 4.2.1 In the Data Editor window click on the Analyze menu, then on Correlate then Bivariate.

4.2.2 In the box that appears select the two variables you wish to investigate (age and height) and move them into the box labelled Variables: using the arrow button. FYI You can put as many variables as you like into the variables box. The final output will show the correlation for each possible pair of variables.

4.2.3 Check that the Pearson and Two-tailed options are selected. If they are not then click in the box/circle next to them to select them (there will be a blue tick/dot if they are selected). 4.2.4 Click on the Options button. For Missing Values select Exclude cases pairwise. If you would like descriptive statistics as well as the correlation analysis, you can request those here. In this example we will click in the box to select Means and standard deviations in the Statistics area.

4.2.5 Click Continue, then OK. Your output should look like this (see below).

What to look for The first box labelled Descriptive Statistics provides the Mean, standard deviation and number of cases for Age and Height.

Descriptive Statistics

          Mean       Std. Deviation   N
Age       8.6667     1.04654          15
Height    119.6667   13.42528         15

The first row gives details for Age, the second row height. The mean is presented in the first column, the standard deviation in the second column and the number of cases for each variable in the third column. The second box labelled Correlations provides the Pearson Correlation Statistics.

Correlations

                               Age     Height
Age     Pearson Correlation    1       .627*
        Sig. (2-tailed)                .012
        N                      15      15
Height  Pearson Correlation    .627*   1
        Sig. (2-tailed)        .012
        N                      15      15

*. Correlation is significant at the 0.05 level (2-tailed).

You will see that Age and Height are listed on the rows and columns of the table. Each of the four cells provides the information on the relationship between the variable on that particular row and column. For example, if we look at the top right hand corner of the table, this cell provides details on the relationship between Age and Height.

The first thing to check in the correlations table is the N. This number tells you the number of cases that had a score for both variables (age and height in this case). Does this look correct? If there are too few cases you should check whether you have missed out cases in your data entry. Alternatively, you may have forgotten to select Exclude cases pairwise. Excluding cases listwise (the other option) may reduce your N unnecessarily as it removes cases that have a missing value on any of the variables you selected for correlation, not just the two being correlated. This will not affect the N when you have just two variables, but may affect it when you have three or more.

Secondly, you need to determine the direction of the relationship. This information can be found by looking at the r value (the Pearson Correlation). This value can range from -1 to 1. If the number is negative there is a negative relationship between the variables. If the number is positive there is a positive relationship between the variables. In this case there is a positive relationship between age and height. As age increases, so does height.

FYI Remember to think carefully about what high and low scores on your variable actually mean when interpreting the relationship.

You can then go on to look at the strength of the relationship. An r value of -1 or 1 indicates a perfect negative or positive correlation respectively. An r value of 0 indicates no relationship at all. All numbers in between indicate that there is a relationship to some extent. Numbers closer to zero indicate a weaker relationship, numbers closer to +/- 1 indicate a stronger relationship.

FYI Whether the number is positive or negative has no effect on the strength of the relationship, only the direction.

Cohen (1988) suggested the following guidelines for interpreting r values.

r = +/- .10 to +/- .29 small correlation
r = +/- .30 to +/- .49 medium correlation
r = +/- .50 to +/- 1.0 large correlation

In this example our r = .627, which indicates a large correlation between the two variables.

Finally, you can assess the significance of your result. If you look at the correlation table, you will see an asterisk (*) next to our r value. A footnote at the bottom of the table tells you that this means the result is significant at the 0.05 level (2-tailed). If there is no asterisk next to your r value then the result is non-significant.

FYI Significance in correlation analysis is greatly determined by sample size. Large correlations in small samples may be non-significant whereas small correlations in very large samples may be statistically significant.
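
FYI The interpretation steps above can be sketched in a few lines of Python. This is an illustration only, not something SPSS requires. The function names cohen_label and t_for_r are invented here, and the critical value of 2.160 (two-tailed, 0.05 level, 13 degrees of freedom) is an assumption taken from a standard t table rather than from this manual.

```python
import math

def cohen_label(r):
    """Label the size of a correlation using the Cohen (1988) bands."""
    size = abs(r)  # the sign gives the direction only, not the strength
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below the 'small' band"

def t_for_r(r, n):
    """Convert r to a t statistic with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.627, 15                  # values from the Correlations table
label = cohen_label(r)
t_value = t_for_r(r, n)

# Assumption: two-tailed .05 critical value of t for df = 13,
# looked up in a standard t table (not given in the manual).
T_CRITICAL_DF13 = 2.160
significant = abs(t_value) > T_CRITICAL_DF13

print(label, round(t_value, 2), significant)
```

Because the t statistic depends on n as well as on r, the same r value can be significant in a large sample and non-significant in a small one.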

Exercise 4 You have measured the arithmetic and reading abilities of twelve children by rating their performances on standard tests given to them by you. When completed the results below were obtained. The prediction is that there is a positive correlation between arithmetic ability and reading ability, i.e. that children who are good at reading are also good at arithmetic, children who are average readers are average at arithmetic and children who are poor readers score low on the arithmetic test.

Arithmetic Score    Reading Score
32                  17
54                  13
68                  14
93                  10
87                  16
24                  7
49                  6
35                  18
97                  19
62                  13
44                  9
73                  12

4.1 Create a new SPSS file containing the above data.
4.2 Create a scatterplot of the data above.
4.3 What can you tell about the relationship between arithmetic and reading ability?
4.4 Conduct a Pearson product-moment correlation analysis.
4.5 What can you conclude on the basis of your results?

Section 5 T-tests

T-tests are used to compare scores on the same continuous variable either between two groups or between the same group in two different situations. By the end of this session you will be able to:

Conduct an independent samples t-test
Interpret the results of an independent samples t-test
Conduct a paired-samples t-test
Interpret the results of a paired-samples t-test

5.1

Independent Samples t-test

An independent samples t-test is used when you are comparing two different groups on the same, continuous variable. For example, we will look at whether there is a difference in height between males and females in our sample. 5.1.1 In the Data Editor window click on the Analyze menu. Click on Compare Means, then on Independent-Samples T-Test.

5.1.2 In the box that opens move the dependent, continuous variable (height) from the panel on the left into the area labelled Test Variable(s): by clicking on it and then clicking the arrow button.

5.1.3 Move the independent, categorical variable (sex) into the area labelled Grouping Variable: using the arrow button in the same way as above.

5.1.4 Click on Define Groups. Here you need to use the numbers that you used in the data set to code each group. In this case we used the number 1 to represent males and the number 2 to represent females. In the Group 1: box type 1. In the Group 2: box type 2.

5.1.5 Click on Continue then OK. Your output should look as follows.

What to look for Firstly, look at the group statistics table. This table contains the number of cases, mean height, standard deviation and standard error of the mean for male and female participants. Check these values. Do these values seem correct? If not there may be an error in data entry or in the numbers you entered to code the groups (e.g. 0 and 1 rather than 1 and 2). Now we look at the box labelled Independent Samples Test.

Independent Samples Test

                                      Levene's Test  t-test for Equality of Means
                                      for Equality                    Sig.        Mean        Std. Error  95% Confidence Interval
                                      of Variances                    (2-tailed)  Difference  Difference  of the Difference
                                      F      Sig.    t       df                                           Lower     Upper
Height  Equal variances assumed       .002   .967    2.663   13       .020        15.44643    5.79987     2.91657   27.97629
        Equal variances not assumed                  2.667   12.809   .020        15.44643    5.79074     2.91733   27.97553
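
FYI The figures in the Independent Samples Test table are linked by simple arithmetic, which can be a useful check when you write up results. The Python sketch below is an illustration only. It uses the mean difference and standard error from the Equal variances assumed row, and it assumes a critical t value of 2.160369 for 13 degrees of freedom, taken from a standard t table rather than from this manual.

```python
# Values read from the "Equal variances assumed" row of the table
mean_difference = 15.44643
std_error_difference = 5.79987
df = 13

# Assumption: two-tailed .05 critical value of t for df = 13, from a
# standard t table (it does not appear in the manual itself).
T_CRITICAL = 2.160369

# t is the mean difference divided by its standard error
t_value = mean_difference / std_error_difference

# 95% confidence interval: mean difference +/- critical t * standard error
ci_lower = mean_difference - T_CRITICAL * std_error_difference
ci_upper = mean_difference + T_CRITICAL * std_error_difference

print(round(t_value, 3))
print(round(ci_lower, 3), round(ci_upper, 3))
```

The results should agree, to rounding, with the t value and the Lower and Upper confidence limits that SPSS reports on the same row.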

The first two columns provide information relating to Levene's Test for Equality of Variances. This tests whether the variance in each group is the same. If the significance value in the second column is greater than 0.05 then you should use the values in the first row of the table (Equal variances assumed) when assessing the significance of the test. If the significance value is less than 0.05 then you should use the second row in the table (Equal variances not assumed). In this example the significance level is greater than 0.05, therefore we will use the top row of the table in subsequent steps. To assess whether your t-test was significant look in the column labelled Sig. (2-tailed) which appears under the section labelled t-test for Equality of Means. As we are assuming equal variances we refer to the value on the first row of this column. The value is 0.02. As this is less than 0.05, we can conclude that there is a significant difference between the height of males and females. The significance of the result is based on the t-value and the degrees of freedom of the sample. These can be found in the table under the columns labelled t and df respectively. You will need these to write up your results.

5.2

Paired-samples t-test

Paired-samples t-tests are used when you have one group of participants from whom you collect data on two different occasions, for example at two different time points or under two different conditions. To demonstrate the paired-samples t-test we need to add some data to our example set. Let's assume that we measured the height of the children in our original sample again 1 year later. Add a new variable to your data set called Heighttime2 (see section 1.2). Add the following data under this variable.

Participant ID    Heighttime2
1                 146
2                 125
3                 134
4                 140
5                 126
6                 133
7                 107
8                 139
9                 107
10                115
11                104
12                108
13                120
14                136
15                125

Your data set should now look like this:

We will now use a paired-samples t-test to assess whether there is a significant difference between the participants' height at time 1 and time 2. 5.2.1 Click on the Analyze menu at the top of the screen. Click on Compare Means then Paired Samples T-test.

5.2.2 We will now identify the two variables that we are interested in comparing for each participant. In this case we want to compare Height and Heighttime2. Highlight Height and click on the arrow button. Height will move into the Variable 1 position in Pair 1 of Paired Variables:.

5.2.3 Highlight Heighttime2 and click on the arrow button. Heighttime2 will move into the Variable 2 position in Pair 1 of Paired Variables:.

FYI You can conduct more than one t-test at a time. Once you have added one pair of variables into the Paired Variables box just repeat the process with other pairs. 5.2.4 Click on OK. The resultant output should look like this.

What to look for Look at the table labelled Paired Samples Test.

Paired Samples Test

                               Paired Differences
                                          Std.        Std. Error  95% Confidence Interval
                               Mean       Deviation   Mean        of the Difference
                                                                  Lower      Upper      t         df   Sig. (2-tailed)
Pair 1  Height - Heighttime2   -4.66667   1.34519     .34733      -5.41161   -3.92173   -13.436   14   .000

The final column in this table gives the significance level of the t-test (labelled Sig. (2-tailed)). In this case the value is less than 0.05 so we can conclude that there is a significant difference in height between the original measurement at time 1 and the measurement one year later at time 2. When reporting your result you will also need the t value and degrees of freedom. These are to be found under the columns labelled t and df respectively.
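
FYI The paired-samples t value can be rebuilt from the other figures in the Paired Samples Test table. The Python sketch below is an illustration only. The mean difference is entered as a negative number because Height minus Heighttime2 is negative when height has increased, which is consistent with the negative confidence limits in the table.

```python
import math

# Values read from the Paired Samples Test table
mean_difference = -4.66667   # Height minus Heighttime2
sd_of_differences = 1.34519
n_pairs = 15

# Standard error of the mean difference = SD of differences / sqrt(n)
std_error = sd_of_differences / math.sqrt(n_pairs)

# t = mean difference / standard error, with n - 1 degrees of freedom
t_value = mean_difference / std_error
df = n_pairs - 1

print(round(std_error, 5), round(t_value, 3), df)
```

The sign of t only reflects the order in which the two variables were entered. The size of t and the degrees of freedom are what you need for your write-up.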

Now you need to assess whether there has been an increase or decrease in the continuous variable between your two time points/conditions. In this case, we can be fairly confident that height has increased over time, but with other variables you may not know the direction of the effect! To check, look at the box labelled Paired Samples Statistics.

Paired Samples Statistics

                      Mean       N    Std. Deviation   Std. Error Mean
Pair 1  Height        119.6667   15   13.42528         3.46639
        Heighttime2   124.3333   15   13.69393         3.53576

Looking down the first column we can see that the mean height from our first set of measurements was 119.7cm. Mean height at time 2 was 124.3cm. We can therefore conclude that there was a significant increase in height between time 1 and time 2.
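
FYI If you want to check the Heighttime2 row of the Paired Samples Statistics table by hand, the Python sketch below does so using the Heighttime2 values listed in section 5.2. It is an illustration only and uses Python's built-in statistics module rather than SPSS.

```python
import math
import statistics

# Heighttime2 values as listed in section 5.2
heighttime2 = [146, 125, 134, 140, 126, 133, 107, 139,
               107, 115, 104, 108, 120, 136, 125]

n = len(heighttime2)
mean = statistics.mean(heighttime2)
std_dev = statistics.stdev(heighttime2)   # sample SD (divides by n - 1)
std_error = std_dev / math.sqrt(n)        # standard error of the mean

print(n, round(mean, 4), round(std_dev, 5), round(std_error, 5))
```

The printed values should match the N, Mean, Std. Deviation and Std. Error Mean that SPSS reports for Heighttime2.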

Exercise 5.1 A teacher was interested in the effect of performance anxiety on memory. He thought that the stress of exams was affecting his pupils' ability to recall information and consequently reducing their grades. To investigate this he gave all of his students a memory test. He told half of them that the test was just for fun and the other half that their scores would count towards their grade. The results for each pupil are given below.

Fun test    Grade test
20          15
18          14
14          20
22          16
19          18
15          15
15          17
21          17
17          14
18          18
20          21
24          14
20          15
18          18
22          15
16          13

5.1.1 Create a data file in SPSS containing the data.
5.1.2 What type of t-test would you use to assess whether exam stress had an effect on memory?
5.1.3 Perform the t-test you have chosen.
5.1.4 Are there equal variances between your two groups?
5.1.5 Is stress affecting the pupils' ability to recall information?

Exercise 5.2 A prison psychologist had been commissioned to design a therapeutic programme to increase empathy with the victim in violent offenders. She was interested in knowing whether her programme was effective. To do this she measured programme attendants' victim empathy prior to beginning the course and again following completion of the course. The scores from the two sittings of the questionnaire are shown below.

Participant    Score prior to course    Score after course
1              20                       21
2              18                       18
3              25                       28
4              19                       22
5              23                       22
6              20                       24
7              16                       17
8              26                       29
9              21                       21
10             21                       23
11             19                       22
12             20                       21

5.2.1 Create a data file in SPSS containing the participants' data.
5.2.2 Conduct a paired-samples t-test on the data to assess whether there is a difference in empathy scores prior to beginning the course and after completion.
5.2.3 Was the programme successful?

Section 6 Chi-Square Tests


Chi-square tests are used when you are dealing with categorical data. The chi-square goodness-of-fit test is used when you have one variable. The chi-square test for independence is used when you have two categorical variables. By the end of this session you will be able to:

Perform a chi-square goodness-of-fit test
Interpret the results of a chi-square goodness-of-fit test
Perform a chi-square test for independence
Interpret the results of a chi-square test for independence

6.1

Chi-Square Goodness-of-Fit (one-sample Chi-Square)

This test examines how many cases fall into each category of a single variable and compares these with hypothesised values. In order to conduct a meaningful chi-square goodness-of-fit test we need to first create a categorical variable to analyse. Follow this procedure to create two height groups: taller and shorter than average.

6.1.1 Recall from the descriptive statistics for height (the original variable, not heighttime2) that the mean height of the children was approximately 119cm. We will use this as a cut-off point. Any case with a height up to or equal to 119cm will be coded as 1. Any case with a height over 119cm will be coded as 2.

6.1.2 In the Data Editor, click on the Variable View. Create a new variable, heightcat. Give this variable values 1=shorter than average, 2=taller than average (see section 1.2).

6.1.3 Next, switch to the Data View. Look at the number in the original height variable. If this number is 119 or lower then enter 1 in the same row of the new heightcat variable. If the number is larger than 119 then enter a 2. Repeat this for all 15 cases.
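
FYI The coding rule used to create heightcat can be written as a small function. The Python sketch below is an illustration only. The function name height_category and the five heights are invented for this example and are not the manual's data set.

```python
CUT_OFF = 119  # cut-off point in cm, as used in step 6.1.1

def height_category(height_cm):
    """Return 1 (shorter than average) for heights up to and including
    the cut-off, and 2 (taller than average) for anything above it."""
    return 1 if height_cm <= CUT_OFF else 2

# Hypothetical heights, invented for illustration (not the manual's data)
heights = [105, 119, 120, 131, 112]
heightcat = [height_category(h) for h in heights]
print(heightcat)
```

Note that a height of exactly 119 falls in category 1, which matches the rule of entering 1 when the number is 119 or lower.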

Your data view should now look like this.

Now we can perform the chi-square goodness-of-fit analysis. 6.1.4 Click on the Analyze menu, then Nonparametric Tests then Chi-Square.

6.1.5 In the Chi-Square Test dialogue box highlight the variable heightcat by clicking on it and move it into the box labelled Test Variable List: by clicking the arrow button.

6.1.6 SPSS gives you the option to specify your expected values (i.e. what you would expect the number of cases in each category to be if your null hypothesis is true). You can set this to be an equal number in each category. Alternatively, if your null hypothesis is supported by unequal numbers of cases in each category, you can specify the numbers of cases you expect to be found in each cell based on previous knowledge. These become the expected values, and the chi-square test assesses whether the observed values are significantly different from them.

6.1.7 In this example, we would expect that an equal number of people would be in the shorter and taller than average categories. To set this as the expected values ensure that the All categories equal selection is highlighted in the Expected Values section.

6.1.8 To specify unequal expected values, click on the Values: selection. In the box type in the value you expect for your first category (the one that is first numerically in the coding). Click on Add. Repeat with the expected value for your second category and so on until all your expected frequencies are in the larger white box.

6.1.9 The remaining steps are now the same regardless of your expected values. Click on Options.

6.1.10 Ensure that in the Missing Values section, Exclude cases test-by-test is selected. If you would like descriptive statistics as part of your output, select Descriptive in the Statistics section.

6.1.11 Click on Continue then OK. Your output should look as follows.

What to look for The first table, labelled descriptive statistics, provides these if selected. The second table, labelled Heightcat, provides the observed and expected frequencies (N) for each category. The residual statistic shows the difference between observed and expected frequencies. Here we can see that we expected there to be 7.5 cases in each category. In reality there were 6 people who fell into the shorter than average category and 9 people that fell into the taller than average category.

Heightcat

                       Observed N   Expected N   Residual
shorter than average   6            7.5          -1.5
taller than average    9            7.5          1.5
Total                  15

The final table gives the actual test statistics.

Test Statistics

              Heightcat
Chi-Square    .600a
df            1
Asymp. Sig.   .439

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 7.5.

The first row provides us with the Chi-Square value. The degrees of freedom are given on row 2 and the final row gives the associated significance. Here we can see that the significance is .439 which is larger than 0.05 and therefore our result is nonsignificant. We can conclude that there is no significant difference in the number of people above and below average height in this sample.
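
FYI The chi-square value in the Test Statistics table comes directly from the observed and expected frequencies. The Python sketch below is an illustration only. The shortcut used for the significance value works only when there is 1 degree of freedom, as there is here.

```python
import math

observed = [6, 9]        # shorter, taller (from the Heightcat table)
expected = [7.5, 7.5]    # an equal split of the 15 cases

# Chi-square = sum over categories of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# For df = 1 only, the upper-tail probability of chi-square equals
# erfc(sqrt(x / 2)); other df values need a different calculation.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 3), df, round(p_value, 3))
```

The residuals of -1.5 and 1.5 each contribute 0.3 to the total, which is why the chi-square value is so small for this sample.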

FYI In order for chi-square analyses to work accurately, the expected frequency of at least 80% of cells should be no less than 5. The number of cells that violate this assumption is given as a footnote to the chi-square statistic. This is true for all chi-square tests. If you encounter this situation consult a text book for help.

6.2

Chi-Square Test for Independence

This test is used to determine whether two categorical variables are related. Each variable can have two or more categories. In this example, we will use the categorical variables of sex and heightcat to assess whether there is a relationship between sex and height. FYI Transforming a continuous variable to a categorical is often used when the continuous variable is not normal and cannot be analysed with parametric statistics. 6.2.1 In the Data Editor click on the Analyze menu, then on Descriptive Statistics, then on Crosstabs.

6.2.2 Click on the variable sex. Click the arrow button to move it into the box marked Row(s):. 6.2.3 Click on the variable heightcat. Click the arrow button to move it into the box marked Column(s):.

6.2.4 Click on the Statistics button. Choose Chi-square.

6.2.5 Click on Continue, then on the Cells button. 6.2.6 In the Counts box, click on Observed and Expected boxes. In the Percentages section, click on the Row, Column and Total boxes.

6.2.7 Click on Continue, then OK. Your output should look like this.

What to look for The results of the chi-square analyses are contained in the table labelled Chi-Square Tests.

Chi-Square Tests

                               Value    df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                             (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             5.402a   1    .020
Continuity Correctionb         3.225    1    .073
Likelihood Ratio               5.786    1    .016
Fisher's Exact Test                                        .041         .035
Linear-by-Linear Association   5.042    1    .025
N of Valid Cases               15

a. 4 cells (100.0%) have expected count less than 5. The minimum expected count is 2.80. b. Computed only for a 2x2 table

The first thing to check in this table is whether you had enough cases for chi-square to work effectively. Chi-square requires a minimum expected cell frequency of 5 in at least 80 per cent of cells. This information is provided in footnote a. to the Chi-Square Tests table. Because of the small number of cases in our analysis all of our cells (100%) have an expected count of less than 5. This means that our test may not give us an accurate result, and this should be remembered when interpreting the results. Consult your course text and lecture notes for more information on this.

This table also contains the main chi-square statistic you are interested in. If both your categorical variables had only 2 categories each, as in this example, you should refer to the second row in the table, labelled Continuity Correction. This chi-square value is corrected for the fact that when you have a 2 by 2 table (i.e. 2 categories in each variable) the chi-square value is overestimated. If one or both of your variables have more than two categories, then you should use the first row of the table labelled Pearson Chi-Square. The chi-square value is presented in the first column (labelled Value), and the associated significance in the third column (labelled Asymp. Sig. (2-sided)). In this case, as we are using the corrected values, our chi-square value is 3.225 and the associated significance value is 0.073. As this value is greater than 0.05 we can conclude that there is not a significant difference between the proportion of males and females who are shorter or taller than average.

The second table, labelled Sex*Heightcat Crosstabulation, provides us with information on the numbers of cases that fall into each cell (i.e. male and shorter, male and taller, female and shorter and female and taller).

Sex * Heightcat Crosstabulation

                                     Heightcat
                                     shorter than   taller than
                                     average        average       Total
Sex     male    Count                1              7             8
                Expected Count       3.2            4.8           8.0
                % within Sex         12.5%          87.5%         100.0%
                % within Heightcat   16.7%          77.8%         53.3%
                % of Total           6.7%           46.7%         53.3%
        female  Count                5              2             7
                Expected Count       2.8            4.2           7.0
                % within Sex         71.4%          28.6%         100.0%
                % within Heightcat   83.3%          22.2%         46.7%
                % of Total           33.3%          13.3%         46.7%
Total           Count                6              9             15
                Expected Count       6.0            9.0           15.0
                % within Sex         40.0%          60.0%         100.0%
                % within Heightcat   100.0%         100.0%        100.0%
                % of Total           40.0%          60.0%         100.0%

There is a lot of information contained in the table, but it is quite easy to understand. Males and females are represented on the first and second row of the table respectively. Shorter and taller than average people are represented in the first and second column respectively. The final row combines information for males and females. The final column combines information for shorter and taller than average people. We are mainly interested in the four cells that represent each combination of our two variables. For example, the top left hand cell represents males who are shorter than average. We will work through the information in this cell. The same information is provided in the other cells for the respective combination of categories.

The first row within the cell gives the Count. This is the actual number of cases that were male and shorter than average. You can see that there was one participant falling into this group. The second row within the cell tells you how many cases you would expect there to be in this cell if there were no relationship between sex and height. In this case, we would have expected 3.2 participants to be male and shorter than average (we ignore the fact that we can't have 0.2 of a case when interpreting a chi-square!). These numbers are very useful in determining trends within the data. From the information we have just looked at, we can see that there are fewer males who were shorter than average than we would expect if there was no relationship between sex and height.

The third, fourth and fifth rows provide percentage information. The third row tells us what percentage of males were shorter than average (12.5%). The fourth row tells us what percentage of shorter than average people were male (16.7%). The fifth row tells us what percentage of the total sample were male and shorter than average (6.7%). This information can be useful when describing your results.
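
FYI The expected counts and both chi-square values reported for this example can be rebuilt from the four observed counts. The Python sketch below is an illustration only. The helper name p_value_df1 is invented here, and its shortcut for the significance value applies only to tests with 1 degree of freedom, such as a 2 by 2 table.

```python
import math

# Observed counts from the crosstabulation: rows = male, female;
# columns = shorter than average, taller than average
observed = [[1, 7],
            [5, 2]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for a cell = row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

pearson = 0.0
corrected = 0.0   # with the continuity correction used for 2 by 2 tables
for obs_row, exp_row in zip(observed, expected):
    for o, e in zip(obs_row, exp_row):
        pearson += (o - e) ** 2 / e
        corrected += (abs(o - e) - 0.5) ** 2 / e

def p_value_df1(chi_square):
    """Upper-tail probability for chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))

print([[round(e, 1) for e in row] for row in expected])
print(round(pearson, 3), round(p_value_df1(pearson), 3))
print(round(corrected, 3), round(p_value_df1(corrected), 3))
```

Comparing the two lines of results shows how the continuity correction reduces the chi-square value, which is why the corrected result is non-significant while the uncorrected one falls below 0.05.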


Exercise 6.1
A health psychologist was interested in whether having a diagnosis of cancer affected whether people smoked or not. She predicted that the proportion of smokers among cancer patients would differ significantly from the proportion in the general population. From previous studies the researcher determined that the base rate of smoking in the general population is 20%. She surveyed 25 cancer patients on their smoking habits. She found that 2 were smokers and 23 were non-smokers.

6.1.1 Create a data file in SPSS with the cancer patients' data.

6.1.2 Conduct a chi-square goodness-of-fit test on the data to assess whether there was a difference in rates of smoking in cancer patients compared to the general population.

Handy Hint! Think about what your expected values will be given the base rate of smoking in the general population.

6.1.3 What do the results tell you?


Exercise 6.2
A psychology student was interested in finding out whether the number of nights students spent socialising per week had an effect on their degree performance. He surveyed 30 final year students and asked them the average number of nights they spent out with friends per week. He also obtained permission to use their final degree classification. He created three categories for the number of nights out students had per week: 0-2, 3-4 and 5-7. He split the students' degree results into two groups: 2:1 or above, and 2:2 or below. He obtained the following data.

Participant   Number of Nights out per week   Degree Classification
1             3                               1st
2             3                               2:2
3             2                               2:1
4             5                               3rd
5             4                               2:2
6             0                               2:1
7             6                               2:2
8             3                               2:2
9             4                               2:2
10            1                               1st
11            6                               Fail
12            2                               2:1
13            4                               2:2
14            5                               3rd
15            2                               1st
16            1                               2:1
17            6                               3rd
18            5                               2:1
19            3                               2:1
20            3                               2:1
21            2                               1st
22            6                               3rd
23            1                               2:1
24            5                               2:2
25            3                               2:2
26            5                               3rd
27            3                               2:1
28            5                               2:2
29            2                               2:1
30            2                               1st

6.2.1 Create a data file in SPSS with the students' data.

Handy Hint! Don't forget to code the data into the correct categories.


6.2.2 Conduct a chi-square test for independence on the data to find out whether there is a relationship between the amount of socialising and degree classification.

6.2.3 Should students who want good grades stay in more?


Exercise Feedback
Feedback for Exercise 1

1.1 Open a new SPSS data file.

1.2 Enter the names for your variables into the first column. As music type is a categorical variable, create values for each of the three types of music, e.g. classical = 1, jazz = 2 and dance = 3. Do the same for sex and feelings. Your variable view should look like this.

1.3 Switch to the data view. Add the data for participant one into the first row. Ensure that you use the correct number to represent the music category, sex and feeling. Repeat for all participants.

1.4 Save your file where it will be safe and name it something clear, e.g. data exercise 1.

Feedback for Exercise 2

2.1 Heart rate and enjoyment are continuous variables. Analyse them using the Descriptives option under the Descriptive Statistics menu. Feelings is a categorical variable. Analyse this using the Frequencies option under the Descriptive Statistics menu. The results of your analyses should look like this.

Descriptives

Descriptive Statistics
                     N    Range   Minimum   Maximum   Mean      Std. Deviation   Variance
heartrate            30   16.00   65.00     81.00     73.1667   4.16954          17.385
enjoyment            30   7.00    2.00      9.00      6.4333    1.69550          2.875
Valid N (listwise)   30


Frequencies

Statistics
feeling
N   Valid     30
    Missing   2

feeling
                      Frequency   Percent   Valid Percent   Cumulative Percent
Valid     relaxed     10          31.3      33.3            33.3
          neutral     11          34.4      36.7            70.0
          energised   9           28.1      30.0            100.0
          Total       30          93.8      100.0
Missing   System      2           6.3
Total                 32          100.0

2.2 Split the data file using musictype as the variable to organise groups by. Conduct Descriptives selecting heartrate as the variable. Your output should look like this.

musictype = classical

Descriptive Statistics(a)
                     N    Range   Minimum   Maximum   Mean      Std. Deviation   Variance
heartrate            10   7.00    65.00     72.00     69.1000   2.13177          4.544
Valid N (listwise)   10
a. musictype = classical

musictype = jazz

Descriptive Statistics(a)
                     N    Range   Minimum   Maximum   Mean      Std. Deviation   Variance
heartrate            10   9.00    69.00     78.00     73.9000   2.68535          7.211
Valid N (listwise)   10
a. musictype = jazz

musictype = dance

Descriptive Statistics(a)
                     N    Range   Minimum   Maximum   Mean      Std. Deviation   Variance
heartrate            10   11.00   70.00     81.00     76.5000   3.59784          12.944
Valid N (listwise)   10
a. musictype = dance

From the results of the analyses we can see that participants who listened to classical music had the lowest mean heart rate, those who listened to dance had the highest, and the heart rate of participants who listened to jazz fell in the middle of the two groups. I would suggest conducting further analyses to assess whether these differences were significant.

2.3 Import the tables into a Word document.

2.4 Save the output to a desired location.

Feedback for Exercise 3

3.1 Create a histogram specifying heartrate as the variable. Your graph should look like this.

[Figure: histogram of heartrate. X-axis: heartrate; Y-axis: Frequency. Mean = 73.1667, Std. Dev. = 4.16954, N = 30]


To edit a graph remember to open the Chart Editor window. To change the colour of the bars ensure that you click over the bars and not another area of the graph. Change the colour to red. The edited graph should now look like this.

[Figure: histogram of heartrate with red bars. X-axis: heartrate; Y-axis: Frequency. Mean = 73.1667, Std. Dev. = 4.16954, N = 30]

The shape of the distribution is roughly normal. We can tell this from the shape of the normal distribution curve and the fact that there are more cases in the centre of the distribution than towards the ends.

3.2 Create a bar chart dragging heartrate to the Y-Axis, musictype to the X-axis and sex to the Cluster on X: set colour box. The graph should look like this.


[Figure: clustered bar chart of mean heartrate by musictype (classical, jazz, dance), clustered by sex (male, female). X-axis: musictype; Y-axis: Mean heartrate, 0.00 to 80.00]

Import the graph into a Word document. The graph shows that there is a slight increase in mean heart rate from the classical category to jazz to dance. In the classical group, males had a slightly higher mean heart rate than females. In the jazz and dance groups, females had a slightly higher mean heart rate than males.

3.3 Create a scatter plot with heartrate and enjoyment as the variables for the x and y axes. Add musictype to the Set markers by box. The graph should look like this.


[Figure: scatterplot of heartrate (Y-axis, 66.00 to 84.00) against enjoyment (X-axis, 2.00 to 9.00), with markers set by musictype (classical, jazz, dance)]

To change the symbols, open the graph in the Chart Editor and right click over a marker IN THE LEGEND. Select the Properties window. In the dialogue box you will see a tab labelled Marker. In this tab you can change the shape of the marker by clicking on the drop-down menu under Type. Select a different shape of marker to the circle. Repeat this for each of the markers and exit the Chart Editor. The resulting graph may look something like this.


[Figure: scatterplot of heartrate (Y-axis, 66.00 to 84.00) against enjoyment (X-axis, 2.00 to 9.00), with edited markers for each musictype (classical, jazz, dance)]

Save the graph.

Feedback for Exercise 4

4.1 Create three variables in SPSS representing the participant's ID, score on the arithmetic test and score on the reading test. Enter the data.


4.2 Use the Chart Builder to create a scatterplot of the data. You should use the arithmetic and reading scores as variables in the graph. The output should look as follows.

[Figure: scatterplot titled "The Relationship between Scores on an Arithmetic and Reading Test". X-axis: Reading, 7.50 to 20.00; Y-axis: Arithmetic, 20.00 to 100.00]


4.3 Looking at the graph, there are no obvious outliers, and the distribution is not cone-shaped or curvilinear. There is a slight cigar shape present in the distribution, suggesting that the data are suitable for correlation analysis. Looking at the distribution, I would expect there to be a weak, positive relationship between the two variables. This is because, in general, as scores on one variable increase so do scores on the other, but the points are not close to a straight line.

4.4 Conduct a Pearson's product-moment correlation analysis. You should use arithmetic and reading scores as your variables again. The resulting output should look as follows.
Descriptive Statistics
             Mean      Std. Deviation   N
Arithmetic   59.8333   24.38641         12
Reading      12.8333   4.23907          12

Correlations
                                   Arithmetic   Reading
Arithmetic   Pearson Correlation   1            .280
             Sig. (2-tailed)                    .378
             N                     12           12
Reading      Pearson Correlation   .280         1
             Sig. (2-tailed)       .378
             N                     12           12

4.5 The results of the correlation analysis show a small, positive relationship between children's scores on an arithmetic test and a reading test in this sample. The relationship was not significant, however. We therefore cannot conclude that arithmetic and reading ability are related.
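To see why a correlation of .280 is not significant with only 12 participants, it can help to convert it into a t statistic. The Python sketch below is an illustration only. It assumes the standard conversion t = r x sqrt(N - 2) / sqrt(1 - r squared) with N - 2 degrees of freedom, and it takes the two-tailed critical value of 2.228 (df = 10, .05 level) from a standard t table. The function name correlation_t is our own and is not part of SPSS.

```python
import math


def correlation_t(r: float, n: int) -> float:
    """Return the t statistic for testing whether a Pearson r differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.280, 12          # values taken from the Correlations table
t = correlation_t(r, n)
df = n - 2

# Two-tailed critical value of t for df = 10 at the .05 level,
# taken from a standard t table (assumption: alpha = .05).
T_CRITICAL_DF10 = 2.228

significant = abs(t) > T_CRITICAL_DF10
print(f"t({df}) = {t:.3f}; significant at the .05 level: {significant}")
```

The t value falls well short of the critical value, which agrees with the non-significant Sig. (2-tailed) value in the output.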

Feedback for Exercise 5.1

5.1.1 Create three variables in SPSS: participant ID, participant group (fun test or graded test) and memory test score. Remember to add values to the group variable to code which group received the fun instructions and which group the graded instructions. Your data set should look something like this.


5.1.2 You would conduct an independent-samples t-test, as a separate group of participants took part in each condition.

5.1.3 Analyse the data using an independent-samples t-test. Memory score should be the test variable and group the grouping variable. The resulting output should look like this.
Group Statistics
            group   N    Mean      Std. Deviation   Std. Error Mean
testscore   fun     16   18.6875   2.84532          .71133
            grade   16   16.2500   2.29492          .57373

Independent Samples Test (testscore)
                                              Equal variances   Equal variances
                                              assumed           not assumed
Levene's Test for Equality of Variances
  F                                           .698
  Sig.                                        .410
t-test for Equality of Means
  t                                           2.667             2.667
  df                                          30                28.713
  Sig. (2-tailed)                             .012              .012
  Mean Difference                             2.43750           2.43750
  Std. Error Difference                       .91387            .91387
  95% Confidence Interval of the Difference
    Lower                                     .57113            .56762
    Upper                                     4.30387           4.30738

5.1.4 Levene's Test for Equality of Variances is non-significant. We can therefore assume that there are equal variances in both groups.

5.1.5 Pupils who were told that the memory test was just for fun had a higher mean score than those who were told that their score would count towards their final grade. This difference was significant. We can therefore conclude that there is a negative effect of exam stress on memory recall.
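The two rows of the Independent Samples Test table can be reproduced from the Group Statistics alone. The Python sketch below is a rough illustration, assuming the usual pooled-variance formula for the "equal variances assumed" row and the Welch adjustment for the "equal variances not assumed" row. The variable names are our own.

```python
import math

# Group statistics from the SPSS output (fun versus graded instructions).
n1, mean1, sd1 = 16, 18.6875, 2.84532
n2, mean2, sd2 = 16, 16.2500, 2.29492

mean_diff = mean1 - mean2

# Equal variances assumed: pool the two variances, weighted by their df.
df_pooled = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df_pooled
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = mean_diff / se_pooled

# Equal variances not assumed (Welch): separate variances and adjusted df.
v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = mean_diff / se_welch
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(f"Equal variances assumed:     t({df_pooled}) = {t_pooled:.3f}, SE = {se_pooled:.5f}")
print(f"Equal variances not assumed: t({df_welch:.3f}) = {t_welch:.3f}, SE = {se_welch:.5f}")
```

Because the two groups are the same size, the two rows give the same t value and standard error here and differ only in their degrees of freedom.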

Feedback for Exercise 5.2

5.2.1 Create a data file in SPSS with three variables: participant ID, empathy score prior to programme, and empathy score after programme. Enter the data, ensuring each participant's data is on a single row. The final data file should look like this.

5.2.2 Conduct a paired-samples t-test. Empathy before and empathy after completing the course should be identified as the pair of variables for analysis. The output should look like this.

Paired Samples Statistics
                         Mean      N    Std. Deviation   Std. Error Mean
Pair 1   empathybefore   20.6667   12   2.83912          .81958
         empathyafter    22.3333   12   3.47284          1.00252

Paired Samples Correlations
                                        N    Correlation   Sig.
Pair 1   empathybefore & empathyafter   12   .897          .000

Paired Samples Test
Pair 1   empathybefore - empathyafter
  Paired Differences
    Mean                                        -1.66667
    Std. Deviation                              1.55700
    Std. Error Mean                             .44947
    95% Confidence Interval of the Difference
      Lower                                     -2.65594
      Upper                                     -.67740
  t                                             -3.708
  df                                            11
  Sig. (2-tailed)                               .003

5.2.3 There was a significant difference between offenders' empathy scores before and after completing the programme. There appears to be an increase in offenders' empathy for their victims after completing the programme. This suggests that the programme was successful in increasing victim empathy.
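The figures in the Paired Samples Test table are linked by some simple arithmetic, which the Python sketch below works through. It is an illustration only. It assumes the standard paired t-test formulas (standard error = standard deviation of the differences divided by the square root of N, and t = mean difference divided by standard error), and it takes the two-tailed critical value of 2.201 (df = 11, .05 level) from a standard t table.

```python
import math

# Values from the Paired Samples Test table.
n = 12
mean_diff = -1.66667      # mean of (empathybefore - empathyafter)
sd_diff = 1.55700         # standard deviation of the differences

se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t = mean_diff / se_diff
df = n - 1

# 95% confidence interval, using the two-tailed critical t for df = 11
# taken from a standard t table (assumption: alpha = .05).
T_CRITICAL_DF11 = 2.201
ci_lower = mean_diff - T_CRITICAL_DF11 * se_diff
ci_upper = mean_diff + T_CRITICAL_DF11 * se_diff

print(f"SE = {se_diff:.5f}, t({df}) = {t:.3f}")
print(f"95% CI: {ci_lower:.5f} to {ci_upper:.5f}")
```

The negative sign on t simply reflects the order of subtraction (before minus after), so a negative value here means that scores were higher after the programme.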


Feedback for Exercise 6.1

6.1.1 Create 2 variables in SPSS: participant ID and smoker. Add values to the smoker variable to code smokers and non-smokers. Give each participant an ID number from 1-25. Code 2 participants as smokers and the remaining 23 as non-smokers. Your data file should look something like this:

6.1.2 Conduct a chi-square goodness-of-fit test. Smoker should be your test variable. Due to the base rate of smoking in the general population, you cannot assume equal values in your categories. You need to specify the numbers of people you expect to be smokers and non-smokers if there was no difference between cancer patients and the general population. We know that 20% of the general population smokes. We would therefore expect 20% of the cancer patients to smoke. 20% of 25 is 5. Enter this as your expected value for smokers. That leaves 20 who will not smoke. Enter this as your expected value for non-smokers. Your output tables should look like this:
Smoker
             Observed N   Expected N   Residual
smoker       2            5.0          -3.0
non-smoker   23           20.0         3.0
Total        25

Test Statistics
                Smoker
Chi-Square(a)   2.250
df              1
Asymp. Sig.     .134
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 5.0.


6.1.3 The results showed that there was a lower proportion of smokers in the sample of cancer patients than in the general population. This result was not significant, however. We therefore cannot conclude that the proportion of cancer patients who smoke is different to that of the general population.
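The chi-square value in the Test Statistics table can be checked by hand from the observed and expected numbers. The Python sketch below is an illustration only. It assumes the standard goodness-of-fit formula, in which (observed - expected) squared divided by expected is summed over the categories. The p value uses a shortcut that is valid only when there is 1 degree of freedom, as there is here; the variable names are our own.

```python
import math

observed = {"smoker": 2, "non-smoker": 23}
base_rate = {"smoker": 0.20, "non-smoker": 0.80}   # general population rates

n = sum(observed.values())
expected = {group: base_rate[group] * n for group in observed}

# Chi-square: sum of (observed - expected)^2 / expected over the categories.
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)
df = len(observed) - 1

# Shortcut for df = 1 only: p = erfc(sqrt(chi_square / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print("Expected:", expected)
print(f"chi-square({df}) = {chi_square:.3f}, p = {p_value:.3f}")
```

Most of the chi-square value comes from the smoker category, because a difference of 3 people is large relative to an expected value of only 5.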

Feedback for Exercise 6.2

6.2.1 Create three variables in SPSS: ID, nights out and degree classification. Create values to represent the categories of nights out and degree classification. Enter the data into the data view, remembering to enter the correct codes for nights out and degree classification.

6.2.2 Conduct a chi-square test for independence. Nights out and degree classification are the variables that should be added to the rows and columns boxes. Your output tables should look like the ones below.
Nightsout * Degree Crosstabulation

                                        Degree
                                        2:1 or above   2:2 or below   Total
Nightsout   0-2     Count                     10              0          10
                    Expected Count           5.0            5.0        10.0
                    % within Nightsout    100.0%            .0%      100.0%
                    % within Degree        66.7%            .0%       33.3%
                    % of Total             33.3%            .0%       33.3%
            3-4     Count                      4              6          10
                    Expected Count           5.0            5.0        10.0
                    % within Nightsout     40.0%          60.0%      100.0%
                    % within Degree        26.7%          40.0%       33.3%
                    % of Total             13.3%          20.0%       33.3%
            5-7     Count                      1              9          10
                    Expected Count           5.0            5.0        10.0
                    % within Nightsout     10.0%          90.0%      100.0%
                    % within Degree         6.7%          60.0%       33.3%
                    % of Total              3.3%          30.0%       33.3%
Total               Count                     15             15          30
                    Expected Count          15.0           15.0        30.0
                    % within Nightsout     50.0%          50.0%      100.0%
                    % within Degree       100.0%         100.0%      100.0%
                    % of Total             50.0%          50.0%      100.0%

Chi-Square Tests
                               Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square             16.800(a)   2    .000
Likelihood Ratio               21.627      2    .000
Linear-by-Linear Association   15.660      1    .000
N of Valid Cases               30
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 5.00.


6.2.3 There is a significant relationship between the number of nights out a student goes on per week and their final degree classification. Students who went on fewer nights out were more likely than expected to receive a 2:1 or above, whereas students who went out for the majority of nights were more likely than expected to receive a 2:2 or below. It appears that the more nights out per week a student has, the less likely they are to achieve a top degree classification. On the basis of this I would advise students who want top grades to minimise the number of nights out they have per week.
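If you want to check your coding and the Pearson chi-square value outside SPSS, the Python sketch below recodes the raw data from the exercise into the two sets of categories, counts each combination and then calculates the statistic. It is an illustration only. The function names nights_category and degree_category are our own, the chi-square formula is the standard one (observed minus expected, squared, divided by expected, summed over all cells), and the p value uses a shortcut that is valid only when there are 2 degrees of freedom.

```python
import math
from collections import Counter

# Raw data from Exercise 6.2, in participant order (1 to 30).
nights = [3, 3, 2, 5, 4, 0, 6, 3, 4, 1, 6, 2, 4, 5, 2,
          1, 6, 5, 3, 3, 2, 6, 1, 5, 3, 5, 3, 5, 2, 2]
degrees = ["1st", "2:2", "2:1", "3rd", "2:2", "2:1", "2:2", "2:2", "2:2", "1st",
           "Fail", "2:1", "2:2", "3rd", "1st", "2:1", "3rd", "2:1", "2:1", "2:1",
           "1st", "3rd", "2:1", "2:2", "2:2", "3rd", "2:1", "2:2", "2:1", "1st"]

NIGHT_LEVELS = ["0-2", "3-4", "5-7"]
DEGREE_LEVELS = ["2:1 or above", "2:2 or below"]


def nights_category(n):
    """Recode a number of nights out into one of the three categories."""
    if n <= 2:
        return "0-2"
    if n <= 4:
        return "3-4"
    return "5-7"


def degree_category(d):
    """Recode a degree classification into one of the two groups."""
    return "2:1 or above" if d in ("1st", "2:1") else "2:2 or below"


# Count how many participants fall into each combination of categories.
counts = Counter(
    (nights_category(n), degree_category(d)) for n, d in zip(nights, degrees)
)
observed = [[counts[(r, c)] for c in DEGREE_LEVELS] for r in NIGHT_LEVELS]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi_square += (obs - expected) ** 2 / expected

df = (len(NIGHT_LEVELS) - 1) * (len(DEGREE_LEVELS) - 1)

# Shortcut for df = 2 only: p = exp(-chi_square / 2).
p_value = math.exp(-chi_square / 2)

print(observed)
print(f"chi-square({df}) = {chi_square:.3f}, p = {p_value:.6f}")
```

The counts should match the Count rows of the crosstabulation. SPSS displays the significance value as .000 because it rounds to three decimal places, so it is usually reported as p < .001.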
