Outline
Categorical variables
Continuous variables
Regression
Analysis of variance
Repeated measures analysis
Measurement scales
The possible outcomes from a qualitative variable are from a finite mutually exclusive set of categories. The outcomes from a quantitative variable are measurements taken on an interval scale (outcomes mutually exclusive, logical order, differences meaningful). If the variable additionally is on a scale where zero means the absence of the characteristic the scale is a ratio scale.
Hierarchy of scales
[Figure: hierarchy of measurement scales (labels: Measurement, Variable, Outcome, Ratio scale)]
Course conventions
The course is structured according to type of data. Sections are split into:
Data description and presentation (data exploration)
Statistical inference
Interpretation
[Frequency table (rows: Valid, Missing, Total) and bar chart for the variable depressi; categories shown: none, mild, moderate; y-axis: Count, 0 to 60]
Type the observed proportions of the four possible categories into an SPSS spreadsheet column
Name the column estimate
Transform - Compute... - Target Variable=lower - Numeric Expression=estimate-1.96*sqrt(estimate*(1-estimate)/110)
Repeat the previous step with Target Variable=upper - Numeric Expression=estimate+1.96*sqrt(estimate*(1-estimate)/110)
Example statement: The proportion of women suffering from mild depression was estimated to be 60.9% (95% CI from 51.8% to 70.0%).
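The Compute steps above can be sketched outside SPSS as well. The following is a minimal Python illustration of the same normal-approximation formula, using the sample size of 110 from the Numeric Expression and the 60.9% estimate from the example statement; the function name `wald_ci` is a hypothetical helper, not part of SPSS.

```python
import math

def wald_ci(estimate, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(estimate * (1 - estimate) / n)
    return estimate - half_width, estimate + half_width

# Estimate and sample size taken from the example statement and the Compute formula
lower, upper = wald_ci(0.609, 110)
print(f"95% CI from {lower:.1%} to {upper:.1%}")
```

The same function can be applied to each of the other observed proportions in the estimate column.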
choice of card   Observed N   Expected N   Residual
A                37           50.0         -13.0
B                73           50.0         23.0
C                40           50.0         -10.0
Total            150
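The observed and expected counts above are the ingredients of a chi-square goodness-of-fit test. The sketch below assumes that this is the test the table belongs to and that the null hypothesis is equal preference for the three cards (which is what an expected count of 50.0 per card implies). The p-value line relies on the fact that, for 2 degrees of freedom, the chi-square upper tail probability equals exp(-x/2).

```python
import math

observed = {"A": 37, "B": 73, "C": 40}
n = sum(observed.values())        # 150 choices in total
expected = n / len(observed)      # 50.0 per card under equal preference

# Sum of squared residuals divided by the expected count
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

# Closed form of the chi-square upper tail, valid only for df = 2
p_value = math.exp(-chi_square / 2)
print(f"chi-square = {chi_square:.2f}, d.f. = {df}, p = {p_value:.5f}")
```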
[Table: binomial test output for the variable COIN]
At the 5% level the proportion of heads did not differ significantly from the proportion of tails (binomial test, p=0.23). There was no evidence for a biased coin.
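As an illustration of how such a p-value arises, here is a minimal sketch of an exact two-sided binomial test of the hypothesis that heads and tails are equally likely. The counts used (7 heads in 10 tosses) are hypothetical and are not the course data, and the function name is an assumption made for this example.

```python
from math import comb

def binomial_test_two_sided(successes, n):
    """Exact two-sided binomial test of H0: p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-sided p-value
    is twice the smaller tail probability, capped at 1.
    """
    k = min(successes, n - successes)
    smaller_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * smaller_tail)

# Hypothetical counts for illustration only (not the course data)
p_value = binomial_test_two_sided(successes=7, n=10)
print(f"p = {p_value:.3f}")
```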
Contingency tables
When two variables have been measured on the same units the result can also be displayed as a two-way table, now called a contingency table. This is similar to displaying several one-way tables in an aggregated two-way table except that now neither the column totals nor the row totals have been fixed. For two categorical variables each cell in a contingency table contains the frequency at which the combination of its row and column categories occurred. The row and column totals represent the frequencies of the two variables singly. The concept can be extended into multi-way contingency tables.
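A contingency table can be built by counting how often each pair of categories occurs. The sketch below uses a handful of hypothetical paired observations (they are not the course data) to show the cell frequencies together with the row and column totals.

```python
from collections import Counter

# Hypothetical paired observations: (row category, column category), one pair per unit
observations = [
    ("normal", "no"), ("normal", "no"), ("normal", "yes"),
    ("severe", "yes"), ("severe", "yes"), ("severe", "no"),
]

cell_counts = Counter(observations)                    # joint frequencies
row_totals = Counter(row for row, _ in observations)   # frequencies of the row variable alone
col_totals = Counter(col for _, col in observations)   # frequencies of the column variable alone

for row in sorted(row_totals):
    cells = "  ".join(f"{col}={cell_counts[(row, col)]}" for col in sorted(col_totals))
    print(f"{row}: {cells}  total={row_totals[row]}")
```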
[Crosstabulation (Count and % within PILLNESS; rows: normal, mild, severe, Total) and bar chart by pillness category; y-axis 0 to 100]
Estimates of interest
In a 2×2 table the parameter of interest is the relative risk (RR) or odds ratio (OR) of a category (A1) of outcome A between two samples (or two categories of the second variable, B) B1 and B2.

                     Category A1   Category not A1   Total
Category/sample B1   a             b                 a+b
Category/sample B2   c             d                 c+d
Total                a+c           b+d               n=a+b+c+d

$$\text{RR of A1 comparing B1 and B2} = \frac{a/(a+b)}{c/(c+d)}$$

$$\text{OR of A1 ...} = \frac{a/b}{c/d}$$
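The two definitions translate directly into code. The following sketch uses the cell labels a, b, c and d from the 2×2 table; the counts in the usage lines are hypothetical and serve only to show the calls.

```python
def relative_risk(a, b, c, d):
    """Risk of A1 in B1, a/(a+b), divided by the risk of A1 in B2, c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds of A1 in B1, a/b, divided by the odds of A1 in B2, c/d."""
    return (a / b) / (c / d)

# Hypothetical counts for illustration only
a, b, c, d = 20, 80, 10, 90
print(f"RR = {relative_risk(a, b, c, d):.2f}")
print(f"OR = {odds_ratio(a, b, c, d):.2f}")
```

With these made-up counts the risk of A1 is 20% in B1 and 10% in B2, so the relative risk is 2, while the odds ratio is slightly larger because odds exceed risks.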
Crosstabulation totals by PILLNESS:

PILLNESS   Total
normal     99 (100.0%)
severe     99 (100.0%)
Total      198 (100.0%)

                                                      Value    95% CI lower   95% CI upper
Odds Ratio for PILLNESS (normal / severe)             19.118   8.582          42.590
For cohort NTHOUGHT = Has not thought about suicide   2.647    2.002          3.500
For cohort NTHOUGHT = Has thought about suicide       .138     .073           .262
N of Valid Cases                                      198

The prevalence of thinking about suicide was lower in the normal group than in the severely psychiatrically ill group (RR=0.14, 95% CI from 0.07 to 0.26).
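The confidence limits for a relative risk or odds ratio are commonly computed on the log scale and then transformed back. The sketch below follows that approach. The cell counts are an assumption: they are not printed in the source but were back-calculated from the reported estimates (9 of 99 in the normal group and 65 of 99 in the severe group having thought about suicide). With these counts the sketch reproduces the reported values to three decimals.

```python
import math

def ratio_with_ci(estimate, se_log, z=1.96):
    """Return the estimate with confidence limits computed on the log scale."""
    log_est = math.log(estimate)
    return estimate, math.exp(log_est - z * se_log), math.exp(log_est + z * se_log)

# Assumed counts, back-calculated from the reported estimates (not given in the source)
a, b = 9, 90     # normal group: has thought, has not thought
c, d = 65, 34    # severe group: has thought, has not thought

# Relative risk of "has thought about suicide", normal versus severe
rr = (a / (a + b)) / (c / (c + d))
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

# Odds ratio of "has not thought about suicide", normal versus severe
odds = (b / a) / (d / c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

rr_est, rr_low, rr_high = ratio_with_ci(rr, se_log_rr)
or_est, or_low, or_high = ratio_with_ci(odds, se_log_or)
print(f"RR = {rr_est:.3f} (95% CI from {rr_low:.3f} to {rr_high:.3f})")
print(f"OR = {or_est:.3f} (95% CI from {or_low:.3f} to {or_high:.3f})")
```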
                               Value
Pearson Chi-Square             91.253 (a)
Likelihood Ratio               100.535
Linear-by-Linear Association   73.501
N of Valid Cases               295

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.19.

At the 5% level the proportions of the thought categories differed significantly between the normal, the moderate and the severely psychiatrically ill groups (χ²=91.25, d.f.=6, p<0.001).
Use the variable nthought generated previously to consider the category "has thought about suicide":
Analyze - Descriptive Statistics - Crosstabs...
- Row(s):=pillness
- Column(s):=nthought
- Cells... - Counts=Observed - Percentages=Row - Continue
- Statistics... - Chi-square - Continue
- OK
PILLNESS * NTHOUGHT Crosstabulation
[Table: counts and row percentages of NTHOUGHT by PILLNESS group]

                               Value         Sig.
Pearson Chi-Square             73.353 (a)    .000
Likelihood Ratio               82.877
Linear-by-Linear Association   64.262
N of Valid Cases               295

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 42.09.

At the 5% level the proportions of subjects that had thought about suicide differed significantly between the normal, the moderate and the severely psychiatrically ill groups (χ²=73.35, d.f.=2, p<0.001).
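The Pearson chi-square statistic compares each observed cell count with the count expected if rows and columns were independent (row total times column total divided by n). The sketch below implements this for a general table. The cell counts are an assumption: they do not appear in the source and were back-calculated from the reported statistic, the group sizes and the minimum expected count. The p-value line uses the closed form exp(-x/2), which holds only for 2 degrees of freedom.

```python
import math

def pearson_chi_square(table):
    """Return the Pearson chi-square statistic, its d.f. and the smallest expected count."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, min_expected

# Assumed counts [has not thought, has thought], back-calculated from the reported output
table = [
    [90, 9],    # normal
    [43, 54],   # moderate
    [34, 65],   # severe
]
statistic, df, min_expected = pearson_chi_square(table)
p_value = math.exp(-statistic / 2)  # upper tail probability, valid for d.f. = 2 only
print(f"chi-square = {statistic:.2f}, d.f. = {df}, minimum expected count = {min_expected:.2f}")
```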
[Table: Chi-Square Tests for the 2×2 comparison (rows: Pearson Chi-Square (b), Continuity Correction (a), Likelihood Ratio, Fisher's Exact Test, Linear-by-Linear Association, N of Valid Cases; columns: Value, df)]

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 37.00.

There was a significant difference in the prevalence of thinking about suicide between the normal and the severely psychiatrically ill group (χ²=67.7, d.f.=1, p<0.001).
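For a 2×2 table the Pearson statistic has the shortcut form n(ad-bc)²/((a+b)(c+d)(a+c)(b+d)), and with one degree of freedom the upper tail probability can be written with the complementary error function. The sketch below applies this without the continuity correction. The counts are an assumption, back-calculated from the reported results rather than taken from the source.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for a 2x2 table."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(statistic / 2))  # upper tail for d.f. = 1
    return statistic, p_value

# Assumed counts (back-calculated): normal 90 not thought / 9 thought, severe 34 / 65
statistic, p_value = chi_square_2x2(90, 9, 34, 65)
print(f"chi-square = {statistic:.1f}, d.f. = 1, p < 0.001: {p_value < 0.001}")
```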