Académique Documents
Professionnel Documents
Culture Documents
Handbook
Module 10
Data Analysis and Interpretation
Introduction
Now that you understand how evaluation questions begin the
process, which then continues through design, data collection,
and sampling, you will be able to think about the next stage –
data analysis and interpretation. In this module, you will look
at the ways to analyze and interpret your data so that you can
answer the questions about the intervention or policy that you
are evaluating.
This module has four topics. They are:
• Data Analysis Strategy
• Analyzing Qualitative Data
• Analyzing Quantitative Data
• Linking Quantitative Data and Qualitative Data.
Data Analysis
Strategy
Intervention Qualitative Data
or Quantitative Data
Policy Linking
Module 10
Learning Objectives
³ Key Words
You will find the following key words or phrases in this module.
Watch for these and make sure that you understand what they
mean and how they are used in the course.
qualitative analysis
quantitative analysis
themes and patterns
content analysis
affinity diagrams
descriptive statistics
measures of central tendency
mode
median
mean
nominal data
ordinal data
interval/ratio data
measures of dispersion
range
standard deviation
normal distribution
descriptive statistics
frequencies
percent (proportion) distributions
percent change over two points in time (rate of change)
comparisons
measures of association
measures of correlation
direct relationship
inverse relationship
correlation
independent variable
dependent variable
crosstabs
inferential statistics
statistical significance tests
Data
Analysis
Hours Spent
Pilot
Data Collection
Time
kkk
Fig 10.1A: Qualitative Data Analysis Worksheet (example, blank).
Content Analysis
Content analysis is a systematic approach to qualitative data
analysis that identifies and summarizes the messages that the
data are sending. We usually use the term content analysis to
refer to the analysis of such things as books, brochures,
written documents, speeches, transcripts, news reports, and
visual media.
A typical content analysis might be to examine the content of
children’s textbooks to see whether they cover the necessary
material for learning a particular subject, presented in a way
that is reading level appropriate and fits the context in which
the children live and study. A deeper analysis might examine
whether the textbooks convey a specific political agenda or
biased interpretation of history.
Sometimes content analysis is used to when working with
narratives such as diaries or journals, or to analyze qualitative
responses to open-ended questions on surveys, interviews, or
focus groups.
Measures of Dispersion11
There are two measures which are commonly used to measure
the spread of quantitative variables: the range and the
standard deviation.
Range
The range is defined as the difference between the highest and
lowest value of variable. Using the data in Table 10.6, the
range for the percent (%) urban population is 75 – 41 = 34.
Although it is simple to calculate, the range is not very
revealing, because it is determined exclusively by two
observations, with all other cases ignored.
Standard Deviation
The most commonly used measure of dispersion for interval or
ratio data is standard deviation. Standard deviation is a
measure of the spread of the scores on either side of the mean.
The more the scores differ from the mean, the larger the
standard deviation will be. One way to think of the standard
deviation is as a kind of “mean of the mean.”
11 G. Swimmer “Qualitative Data Analysis, Part I,” IPDET 2005 course notes,
1
For example, you have data about the height of a large group of
people. The frequency of each height will probably fall into a
normal distribution. Most of the people will be towards the
middle (mean) with fewer people being shorter and taller. At
the extreme ends there will be the very shortest person and at
the other end, the very tallest person. Most people’s height will
be close to the mean because most people are somewhere in
between short and tall.
Figure 10.1 shows a normal distribution. The “y” axis shows
the frequency of the data, the x axis shows the value of the
data. For example if this were a normal distribution of the
height of a population, the “x” axis would show the value
(measure) of the height, the “y” axis would show the number of
people with this measure.
y
Frequency
0 Value
x
Not all sets of data will have data that match the normal
distribution. Some will have steeper curves; others will have
more flat curves, while others will have the curve closer to one
end than the other. Figure 10.2 shows examples of some of
these differences. But all normally distributed data will look
very similar to the normal distribution.
y
Mean
One standard deviation
from the mean
0 x
Three standard
68% deviations from the mean
95%%
98%
Strongly
Disagree Neither Agree
Strongly
disagree agree
I wait a long time
10% 20% 10% 35% 25%
before being seen.
They are willing to
5 10 30 30 25
answer my questions
I receive good health
15 25 10 25 25
care at the clinic
N=36
Source: Fabricated Data, 1999 Survey.
1
If the results were different, one might use a different strategy.
For example, consider if the results looked like those in Table
10.10.
Table 10.10: Client Views on Health Care Services at the Local Clinic
1. Considering your experiences with the local health clinic,
do you agree or disagree with the following statements?
Strongly Strongly
Disagree Neither Agree
disagree agree
I wait a long time
50% 20% 10% 15% 5%
before being seen.
The staff are willing
to answer my 0 5 0 30 65
questions
I receive good health
0 20 0 55 25
care at the clinic
N=36
Source: Fabricated Data, 1999 Survey.
12 Swimmer, Gene. Quantitative Data Analysis, Part I,” IPDET 2005 course
notes, p. 22. This example and the text following used with the permission of
the author.
Measures of Relationship
Measures of relationship (or association) tell you how strongly
variables are related. Association never proves cause, but it
can suggest a causal relationship if (and only if) there is a
strong measure of association.
While there are many kinds of measures of association, they
are usually reported in terms of a zero to 1 scale to indicate the
strength of the relationship. A perfect relationship would score
1. A relationship showing no association at all would score
zero. In other words, the closer the measure is to zero, the
weaker the relationship and the closer the measure is to 1, the
stronger the relationship.
Measures of correlation are calculated on a scale of -1 to +1.
These measures show the direction of the relationship through
the sign (positive or negative). A measure with a positive sign
means that as the variables change in the same direction: both
go up or both go down. This is called a direct relationship: for
example, as the years of education increases, individual
income increases. A negative sign indicates that the variables
have an inverse relationship meaning that they move in the
opposite directions: for example, as age increases, health
decreases. A measure of association of -1 would therefore
mean a perfect inverse relationship. A measure of association
of -.1 would be close to zero and is a very weak inverse
relationship.
You can do a correlation between two interval or ratio variables
or use a multiple regression technique to estimate the impact
of several variables simultaneously on the dependent variable.
This technique works with interval and ratio level data. There
are other techniques that can be used for ordinal and nominal
data but they are less commonly used.
Inferential Statistics
Inferential statistics enable you to make an estimate about a
population based on a random sample selected from that
population. Whenever you are using sample data, your major
concern is whether the results are a function of some
quirkiness of your sample rather than a true picture of the
population. If you had picked a different sample, would your
results be fairly similar?
Statisticians have developed tests to estimate this. These are
called statistical significance tests and do a very simple
thing:
Statistical significance tests allow you to estimate how
likely it is that you have gotten the results you see in your
analysis by chance alone.
notes, p. 29.
Chi Square
Chi-square is one of the most popular statistical tests because
it is easy to calculate and interpret. The purpose of chi square
is to determine whether the observed frequencies (counts)
markedly differ from the frequencies that we would expect by
chance. Chi square is used when you want to compare two
nominal values (for example, marital status and religious
affiliation). Or you want to compare two ordinal variables
(scaled responses) or a combination of nominal and ordinal
variables.
The chi square statistic is the sum of the contributions from
each of the individual cells in a data table. Every cell in the
table contributes something to the overall chi square statistic.
If a given cell differs markedly from the expected frequency,
then the contribution of that cell to the overall chi square is
large. If a cell is close to the expected frequency for that cell,
then the contribution of that cell to the overall chi square is
low. A large chi square statistic indicates that somewhere in
the table, the observed frequencies differ markedly from the
expected frequencies. It does not tell which cell (or cells) is
causing the high chi-square, only that they are there.
Chi square measures whether two variables are independent of
one another based on observed data.
t-Test
A t-Test is a statistical technique that can determine whether
one group of numerical scores is statistically higher or lower
than another group of scores.
This analysis is appropriate whenever you want to compare the
means of two groups. It is especially appropriate for the
analysis of a project where you want to compare the mean
scores of the group affected by the project with the mean
scores for the control group (unaffected by the project). This
leads us to a very important conclusion: when we are looking
at the differences between scores for two groups, we have to
judge the difference between their means relative to the spread
or variability of their scores. The t-test does just this.
1
In a study by Kell 24, researchers looked at the effects of
computers on classroom teaching and learning. The research
team repeatedly visited classrooms in six school districts over a
school year. During the visits, they interviewed teachers and
administrators and systematically observed classroom activity.
They also asked teachers to complete two different
standardized questionnaires at three times during the year.
These questionnaires covered not only the teachers’ ideas
about using computers to enhance the learning of reading
skills, but also their concerns about the adoption and use of
such innovations. In this way, the two data types were linked,
and allowed researchers to study changes occurring over the
whole school year.
Mutimethod designs’
1
In a study of care for the elderly, Wilson and Bolland 25 tested
competing theoretical models. They collected data on referrals
and goal-oriented actions for 49 organizations and then built a
nominated sample of community leaders. They analyzed the
referral and goal oriented data to determine which
organizations were most central in the network of
organizations. They examined the leader data to see which
leaders were most central in their network. They found the
combined analysis supported one of the theoretical models.
They could not have done this with either one or the other data
set alone.
Ev
If you are analyzing qualitative data consider the following:
• start your analysis as you collect the data by making
good notes
• draw out themes and patterns
• summarize your data
• control for bias
• if working with several people, consider using affinity
diagrams.
• If you are working with quantitative data you will need
to use statistics to help you analyze your data. You will
use one or more of the following:
• measures of central tendency (how similar) – mode,
median, and mean
• measures of dispersion (how different) – standard
deviation and variance.
To analyze the data, look at it and use these guidelines:
• choose a standard way to analyze the data and then
apply it consistently
• focus on one side of the scale or the other when
analyzing ordinal data
• do not combine the middle category with either ends of
the scale with ordinal data
• when reporting on one category out of the five, choose
an extreme category (strongly agree or strongly disagree)
• do not report an agree (or disagree) category without
also reporting the strongly agree or strongly disagree
category
• analyze and report percentages (not numbers, and
definitely not averages)
• provide the number of respondents
• if there is a little difference in the data, raise the
benchmark: what do the results look like when you
focus on the questions that received a majority response
of “very satisfied”?
• remember that analyzing data is an art and a skill; it
gets easier with training and practice.
Summary
In this module, you learned about how to analyze qualitative
and quantitative data. Use the following checklist to check if
you have learned the important items. Check those that you
know and review those that you do not.
describe the guidelines to determine the size of
sample that is needed
discuss the when to use qualitative and quantitative
data
discuss how to analyze qualitative data. including:
− making good notes
− drawing out themes and patterns
− content analysis
− summarizing qualitative data
− controlling for bias
− affinity diagrams
discuss how to analyze quantitative data
discuss measures of dispersion including standard
deviation
discuss ways to analyze survey results
define commonly used descriptive statistics,
including:
− frequency
− percent
− mean
− median
− mode
− money vs. non-money costs
− percent change over two points in time
− ratio
− comparisons
discuss measures of relationship
discuss inferential statistics
Quiz Yourself
Reflection
h Instructions
Obtain three or four fairly substantial newspaper articles about
an issue in the area where you work. Read each of the articles
to determine the major themes.
If you are working with others on this module, have them do
the same. Then together use the affinity diagram process to
identify themes. [If you are working alone, please skip this and
go on to Exercise 10-2.]
You will be given pieces of paper. Write one theme or idea on
each of the pieces of paper. This is to be done in silence. Once
you have written all your themes, stick the pieces of paper
randomly on the wall. As others add their pieces of paper to
the wall, begin to arrange into similar groups—again in silence.
Once the pieces of paper begin to take shape according to
themes, people may talk as they continue to re-arrange the
post-its.
The large group will then identify the themes.
h Instructions
Obtain three or four fairly substantial newspaper articles about
an issue in the area where you work. Read each of the articles
to determine the major themes.
If you are working by yourself and have access to a computer,
set up an Excel grid (or other spreadsheet program, or even a
large, handwritten paper grid if you prefer). In the top left cell,
write “article,” and at the top of the second column, write the
word “excerpt.” Next, enter each of the comments from the
interviews, one per row, in the column labeled “excerpt.”
Identify themes from the articles, enter them as column
headings, and mark the appropriate cell where a particular
article excerpt contains each theme. Finally, write a narrative
summarizing the findings from the three articles.
h Instructions:
Please complete this survey, and have two colleagues who are
also working through these training modules also complete it.
[If no one else is working through these modules, have
colleagues complete the first half only.]
Collect the surveys and tally the results. Working by yourself
or with others, summarize all the results. Include some
conclusion about the overall findings.
IPDET Survey ID: _______
1. To what extent, if at all, would you say that you currently
had the following ability to do each of the following?
Little or Some Moderate Great Very
no extent extent extent great
extent extent
a. Design an evaluation
b. Analyze data
c. Develop a survey
d. Conduct a focus group
e. Facilitate a stakeholders’
meeting
f. Write an evaluation report
g. Prepare an oral briefing
Any comments?
Denzin, N. & Lincoln, Y. (Eds.) (2000). Handbook of Qualitative
Research, (2nd Ed.) Thousand Oaks, CA: Sage Publications.
Glass, G. & Hopkins, K. (1996). Statistical methods in education
and psychology (3rd Ed.). Boston: Allyn and Bacon.
Jaeger, R.M. (1990). Statistics: A spectator sport (2nd Ed.).
Thousand Oaks, CA: Sage Publications.
Miles, M.B., & Huberman, A.M. (1994). Qualitative data
analysis: An expanded sourcebook (2nd ed.). Thousand
Oaks, CA: Sage Publications.
Patton, M.Q. (2002). Qualitative research and evaluation
methods (3rd ed.). Thousand Oaks, CA: Sage Publications.
Porteous, Nancy L., Sheldrick, B.J., and Stewart, P.J. (1997).
Program evaluation tool kit: A blueprint for public health
management. Ottawa, Canada: Ottawa-Carleton Health
Department. (ISBN 0-9694956-3-3.) Available online at
http://www.phac-aspc.gc.ca/php-psp/toolkit.html
(English) or http://www.phac-aspc.gc.ca/php-
psp/toolkit_fr.html (French)
Websites:
5 2. a
3. a
4. b
5. c
6. d
7. a
8. c