Notes on Discovering SPSS Statistics


Chapter 1:
nominal variable: two things equivalent in some sense are given the same name or
number, but there are >2 possibilities (think #s on a football team; omnivore, vegan,
vegetarian, etc.)
ordinal variable: nominal, but categories have logical order; think ratings at a beauty
pageant (we know the order of finish, but not how far apart the contestants were)
continuous variable: score for each person and can take any value on measurement scale

o interval variable: equal intervals on the scale represent equal differences in the
property being measured
o ratio variables: interval variable; ratios of values along scale are meaningful;
scale has true and meaningful 0 point (lecturer rated as 4 is 2x as helpful as one
rated 2)
discrete variable: can only take on certain values (e.g. age in whole years)
validity: does an instrument measure what it's set out to measure

o criterion validity: does instrument measure what it claims to measure, judged against
an objective, established criterion
o content validity: self-report measure/questionnaires can assess degree to which
individual items represent the construct being measured
reliability: can an instrument be interpreted consistently across different situations

o test-retest reliability: test the same group of people twice; a reliable instrument
produces similar scores both times
correlational/cross-sectional research: observe what naturally goes on in the world
(ecological validity)
confounding variables (confounds): unmeasured variables that could explain an observed
relationship; e.g. breast implants + elevated suicide rate (not caused by the breast
implants themselves; must be an external factor)

o systematic variation: variation due to experimenter doing something to all of the
participants in one condition but not the other
o unsystematic variation: variation in performance due to unknown/random factors
o practice effects: participants perform differently in 2nd condition because of
familiarity with experimental situation and/or measures being used
o boredom effects: participants perform differently in 2nd condition because they
are tired/bored from having completed first condition

o skew: lack of symmetry (most frequent scores clustered at one end of the scale)
o kurtosis: pointiness
o leptokurtic: many scores in the tails (heavy tails)
o platykurtic: thin in the tails and tends to be flatter than normal
central tendency

o mode: occurs most frequently in data set
o median: middle score when scores ranked in order of magnitude (less affected by
extreme scores)
o mean: the average (influenced by extreme scores)
o z score: center data around zero, with standard deviation of 1: subtract the mean of
all scores from each score, then divide by the standard deviation
tells us whether scores of a certain size are likely or unlikely to occur in a
distribution of a particular kind
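
A minimal Python sketch of the central-tendency measures and z scores above, using only the standard library. The scores are made-up illustration data (not from the book), and using the sample standard deviation (N - 1 denominator) is an assumption.

```python
import statistics

# Hypothetical scores for illustration only; 40 is a deliberately extreme score.
scores = [2, 3, 3, 4, 5, 6, 40]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle score once the scores are ranked
mean = statistics.mean(scores)      # average; pulled upward by the extreme score

# z score: subtract the mean from each score, then divide by the standard deviation.
# statistics.stdev uses the sample formula (N - 1 in the denominator).
sd = statistics.stdev(scores)
z_scores = [(x - mean) / sd for x in scores]

print(mode, median, mean)
print([round(z, 2) for z in z_scores])
```

Here the median (4) stays close to the bulk of the scores while the mean (9) is dragged toward the extreme score, and the z scores end up with mean 0 and standard deviation 1.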
two hypotheses

o null hypothesis: opposite of the alternative hypothesis; states that an effect is
absent
o alternative hypothesis: the prediction you're testing; states that an effect is present

Chapter 2: Everything you ever wanted to know about statistics


o deviance: difference between observed data and fitted model
o sum of squared errors (SS): good measure of accuracy of model, but it grows with the
number of observations, so the variance is used instead
variance = SS/(N-1); the average error between the mean and the
observations made (how well the model actually fits the data)
degrees of freedom: think 15 football players and 15 positions; there is 1 fewer
free choice every time you assign a position, so the last player's position is fixed
(N-1 values are free to vary)
standard deviation: square root of the variance --> tells you how well
the mean represents the data
o the statistics equation: outcome = model (or mean) + error
deviation = Σ (observed - model)^2, where Σ means summing over all observations
o standard error: how well a particular sample represents the population; standard
deviation of sample means
small standard error: likely that the sample is an accurate reflection of
the population
o sampling variation: samples vary because they contain different members of the
population
o sampling distribution: frequency distribution of sample means from the same
population
o standard error of the mean (SE): standard deviation of sample means;
measure of how much variability there is between the means of different samples
central limit theorem: for large samples (roughly N > 30) the sampling distribution
of the mean is approximately normal, and SE = s/sqrt(N)
if sample is large, we can use central limit theorem to approximate
standard error
if sample is small, we use t-distribution
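
The chain from deviance to standard error can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observations are hypothetical, the mean is taken as the model, and the variable names are my own.

```python
import math

# Hypothetical observations for illustration only.
data = [1, 2, 3, 3, 4]
n = len(data)
mean = sum(data) / n  # the mean is the simplest "model" of the data

# Deviance: observed value minus the value predicted by the model.
deviances = [x - mean for x in data]

# Sum of squared errors (SS): grows as more observations are added.
ss = sum(d ** 2 for d in deviances)

# Variance: SS divided by the degrees of freedom (N - 1).
variance = ss / (n - 1)

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

# Standard error of the mean, estimated as s / sqrt(N).
se = sd / math.sqrt(n)

print(round(ss, 2), round(variance, 2), round(sd, 3), round(se, 3))
```

Dividing by N - 1 rather than N is what makes the variance comparable across data sets of different sizes, which SS on its own is not.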
confidence intervals: range of values within which we believe the population mean falls

o normal-based (z) confidence intervals: good for large samples
o t-tests/t-distribution: for small samples, whose sampling distribution is not normal;
its shape changes as sample size gets bigger
o usually displayed as error bars
o 95% CI: range of scores constructed such that the population mean will fall
within this range in 95% of samples; it is NOT an interval within which we are
95% confident the population mean will fall
o when experimental manipulation is successful, we expect to find samples have
come from different populations. Otherwise, if it's unsuccessful, we expect to find
they came from the same population
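
The "95% of samples" reading of a confidence interval can be checked with a small simulation. The sketch below is an assumption-laden illustration, not the book's procedure: the population (mean 100, SD 15), the sample size, and the use of the large-sample z value instead of t are all my own choices.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 100.0, 15.0  # hypothetical population
N, N_SAMPLES = 50, 2000         # sample size and number of repeated samples

# Critical z for a 95% interval (about 1.96); a large-sample approximation.
z_crit = statistics.NormalDist().inv_cdf(0.975)

hits = 0
for _ in range(N_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)  # SE = s / sqrt(N)
    lower, upper = m - z_crit * se, m + z_crit * se
    if lower <= POP_MEAN <= upper:
        hits += 1  # this sample's interval contains the population mean

coverage = hits / N_SAMPLES
print(f"{coverage:.3f} of the intervals contain the population mean")
```

The proportion printed should land near 0.95: the population mean is fixed, and it is the intervals that vary from sample to sample.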
test statistic: (variance explained by the model)/(variance not explained by the model)

o non-significant results do not mean that no effects are happening
o cannot conclude that the null hypothesis is false when our results are significant
o one-tailed test tests directional hypothesis
o two-tailed test tests non-directional hypothesis
o type I error: when we believe there is a genuine effect in our population, when in
fact there isn't
o type II error: when we believe there is no effect in the population, when in fact
there is
o effect size: objective, standardized measure of the size of an observed effect
r=.10 (small effect); effect explains 1% of the total variance
r=.30 (medium effect); effect explains 9% of the total variance
r= .50 (large effect); effect explains 25% of the total variance
o meta-analysis: combine effect sizes from different studies researching the same
question to get better estimates of population effect sizes
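
The percentages quoted for r follow from squaring it. A short Python sketch (the function name `variance_explained` is hypothetical, chosen for illustration):

```python
def variance_explained(r: float) -> float:
    """Return the percentage of total variance explained by an effect of size r."""
    return r ** 2 * 100


for label, r in [("small", 0.10), ("medium", 0.30), ("large", 0.50)]:
    print(f"{label}: r = {r:.2f} explains {variance_explained(r):.0f}% of the variance")
```

Because the relationship is quadratic, r = .50 explains 25 times as much variance as r = .10, not 5 times as much.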
power: the ability of a test to detect an effect of a given size; i.e. the probability
that a given test will find an effect assuming one exists in the population

o want to aim for power of .80; or 80% chance of detecting an effect if one
genuinely exists
o usually use algorithm to determine how many patients need to be recruited in
order to have sufficient power
alternatively can calculate the power of a test
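
As a rough illustration of a sample-size calculation, the sketch below estimates how many participants are needed to detect a correlation of size r with power .80. It rests on an assumption not stated in these notes: the Fisher z normal approximation, N = ((z_alpha + z_power) / atanh(r))^2 + 3, for a two-tailed test. The function name `n_for_correlation` is hypothetical, and dedicated power software may give slightly different numbers.

```python
import math
from statistics import NormalDist


def n_for_correlation(r: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate N needed to detect a population correlation r (two-tailed).

    Assumption: Fisher z approximation, N = ((z_alpha + z_power) / atanh(r))^2 + 3.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)


for r in (0.10, 0.30, 0.50):
    print(f"r = {r:.2f}: about {n_for_correlation(r)} participants")
```

Smaller effects need far more participants for the same power, which is why the expected effect size has to be chosen before the sample size can be planned.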

Chapter 3: The SPSS Environment

saving tips

o edit > options > file locations tab > browse and navigate to data box
use labels to write longer variable descriptions; this is really helpful for looking at the
data later when you go back
coding variable: variable that uses numbers to represent different groups of data
(numeric, but numbers represent names)

o values > ... > value labels box > click white space next to value > type in a code
o ...figure it out yourself
o specify nominal or ordinal (order matters) > go to "measure" and select
missing values

o discrete: single values that represent missing data
o range of values (exclude data from a range of numbers)
o both
the SPSS viewer

o graphs and results of statistical analyses
SPSS Syntax

o File > New > Syntax

Chapter 4: Exploring Data with Graphs

