
Applying Social Science in the Real World

Informing practice with the best available research and making research more relevant to practice are easier said than done, whether in health care, education, or adult learning. Making a measurable difference in people's lives is harder still.

The following short reflections on these challenges point to how we might make headway applying

what is learned from research studies to the real worlds of practice and policy.

The researchers who contributed here work in different fields and research traditions, but all hope to

prime conversation and collaboration with policymakers and practitioners, strengthening both

research and practice.

Both David Osher and Terry Salinger see the messiness of real-world settings, compared with the controlled conditions of many research studies, as a high but surmountable hurdle to generalizing, adopting, or scaling up evidence-based practices.

Commercial applications pose particular problems related to educating both developers and consumers, as George Rebok points out in his think piece on adopting cognitive training programs for older adults. And so do efforts to inform government policies, as George Bohrnstedt contends: witness researchers' frustration when decades of research on achievement gaps between Black and White students go largely unheeded.

Addressing these challenges requires keener understanding of the real world that researchers hope to help improve. One way is to walk in the shoes of your research's intended beneficiaries. In her personal account of navigating the healthcare system as both a health and aging researcher and a caregiver, Marilyn Moon sees the issues from a new, bottom-up vantage point. This perspective also underlies Bea Birman's call for knowing how individuals and organizations get information and learn new approaches.


To better grasp implementation challenges, researchers may need to interact more with the

practitioners and policymakers who might use research to inform their work. Steven

Garfinkel describes a new way for researchers to work in real-world settings through rapid-cycle

evaluation, which allows both researchers and practitioners to better understand how an innovation

is being implemented and provides practitioners with a steady flow of information so they can keep

improving practice in response. Such new ways of working challenge some of researchers' well-honed traditional skills, along with the underpinnings of some traditional research paradigms.

Research can help solve problems of practice, and practitioners can help make research relevant.

The key, these commentaries suggest, is balancing the needs of practitioners and policymakers with

the requirements of research rigor, often through shared work.

SIGNIFICANT DIFFERENCES!! WHAT DO THEY REALLY MEAN?

Terry Salinger

In simple terms, education evaluations typically compare classrooms or schools that receive specific programs (the treatment) with schools that receive "business-as-usual" services. In a recent study of a program designed to improve teacher practice and student reading, for example, all kindergarten to Grade 2 teachers in participating schools received the same business-as-usual services, while teachers in schools randomly assigned to the treatment received extra resources, summer professional development institutes, and instructional coaching throughout the school year.

Even the best-designed study can't stop the inevitable and often unexpected fluctuations in research settings like schools and districts, especially in urban areas. There's an inherent messiness in schools and districts.

An equally simple description of our job as evaluators is that we must collect the data needed to investigate whether extra resources and services seemed to have a positive impact on teachers or students or both. To go on with the kindergarten example, after the second and third (and final) years of implementation, statistically significant differences between treatment and comparison schools emerged in teacher practice, overall reading achievement in kindergarten and second grade, and other variables. The positive findings, solid because we had validated both the method and the data, affirmed the program's promise for improving teaching and learning.

Rarely do evaluations of interventions in early reading find such significant differences between

treatment and comparison conditions. In a 2003 meta-analysis, only nine studies out of over 1,300

met standards for high-quality, rigorous research. And this shortage of well-designed studies makes

it difficult to generalize about how strong an impact professional development really can have on

teacher and student outcomes.

But here we were with a rigorous study and statistically significant results, and a pressing need to understand what the results did and did not mean. In a nutshell, the evaluation found positive impacts for the program in specific schools in specific districts, but the findings did not guarantee that positive impacts would be found in other districts or even in other schools in the same study districts. The study met standards for rigor (including matched schools, large sample size, consistent data collection over three years of implementation, and equal attrition rates under the two conditions). But standing by the results is one thing; overgeneralizing from them is another. So we cautioned the program's developer that the findings were indeed a big deal but still needed to be viewed realistically.

Why? For starters, even the best-designed study can't stop the inevitable and often unexpected fluctuations in research settings like schools and districts, especially in urban areas. There's an inherent messiness in schools and districts. Teachers, students and administrators move around frequently, and curriculum changes often, too. Amid such instability, even positive findings like ours may not justify districts' adoption of the new approach.

More messiness: schools and districts grappling with poor student performance on state reading

tests or other accountability measures often search for whatever is marketed as new or special or

guaranteed to improve student achievement. They put their trust in the next big thing instead of in

the slow and steady process of building professional knowledge and teachers' instructional capacity.

Then, too, the business-as-usual professional development and training or overall instructional

procedures in study districts may be intrinsically strong, raising the possibility that all teachers are

getting the support needed to improve their skills. As other scholars have pointed out, the nature and

quality of instruction in comparison classes and the training provided to those teachers need to be

measured carefully if researchers are to understand the real impact of positive program results.

All these factors can cloud the story that evaluation data tell about treatment and comparison

schools, making it difficult to determine the extent to which the program being evaluated has

produced real change. Evaluators like to assume that the messiness will be equally distributed

across treatment and comparison schools, but experimental studies rarely collect the data to prove

or disprove this assumption.

So while studies may find real, significant differences, there's no guarantee that the program evaluated would have the same impact in other settings, even those nearby. This is the impact evaluator's dilemma.

IMPLEMENTING EVIDENCE-BASED INTERVENTIONS IN REAL-WORLD


SETTINGS

David Osher
So why don't many practitioners implement evidence-based programs and practices? And, when practitioners do practice what is preached, why don't they strictly follow the recipe? And, when they implement the research with fidelity, why don't they get the results that efficacy studies say are possible?

It is possible to implement evidence-based practices and programs successfully in


earthen trenches. But doing so takes time, and ... organizational readiness, support [for]
practitioners ... and the ability to adapt evidence-based programs to individual contexts
while maintaining the programs' core ingredients.

These questions point to three research-to-practice challenges. Addressing them now is particularly

important as the "rotten social outcomes" identified by Lisbeth Schorr and Paul Steele and the "wicked" policy problems that relate to them get increasing attention.

The first two challenges have been referred to as the research-to-practice gap and the third as the

gap between efficacy research (which is implemented under relatively ideal conditions) and

effectiveness research (implemented under more normal conditions). Addressing these challenges

requires (in the words of Peter Jensen, Kimberly Hoagwood, and Edison Trickett) moving research

from "ivory towers," where graduate and postdoctoral students implement interventions with well-selected samples, to "earthen trenches," where children are more complex and resources exigent, to examine what is palatable, feasible, durable, affordable, and sustainable in real-world settings.
Earthen trenches are messy and complex, contextually rich and interdependent, where in-the-

moment (hot action) decisions are often required and practitioners must grapple with multiple and

competing demands for their time, attention, energy and cognitive reserve. Teachers, for example,

work in what Michael Huberman referred to as "busy kitchens," while other practitioners (to borrow Donald Schön's metaphor) confront tough and complex decisions in the "swampy lowlands" of practice, where situations are confusing 'messes' incapable of technical solution. Think here about

how the diverse academic, social, emotional and behavioral needs of every student in a classroom

can change from day to day or even hour to hour.

Think, too, about having to decide whether a child has been abused or neglected by family members

or whether a youth accused of delinquent behavior should be diverted from the juvenile justice

system. Change, rarely easy, is especially hard in highly stressed settings, particularly without ample

resources and support for learning, reflecting, collaborating and mastering new approaches and

technologies.

Paradoxically, successful implementation of evidence-based strategies and programs may depend

on moving from a developer/research-centric perspective to one focused on setting. Research-

based interventions are not just matters of adhering to blueprints and implementing plans faithfully.

Rather, their ecology includes other programs and competing demands on practitioner and

consumer time and attention. These so-called setting effects can either amplify or diminish

intervention effects. In short, research, evaluation and technical assistance should account for how a
multiplicity of evidence- and non-evidence-based practices affect particular outcomes.

All this said, it is possible to implement evidence-based practices and programs successfully in

earthen trenches. I have seen it happen as a researcher, evaluator and technical assistance

provider. But doing so takes time, and success also depends on organizational readiness, the support that practitioners receive as they change their practices, and the ability of those promoting scale-up to adapt evidence-based programs to individual contexts while maintaining the programs' core ingredients.

APPLYING RESEARCH TO PRACTICE ON A PERSONAL LEVEL


Marilyn Moon

A major challenge of being a health and aging researcher arises

when facing those issues personally. It's humbling to try to reconcile theory and research with

practice. But understanding how issues and policies play out in real life can help. As health care

becomes more complicated and fragmented, consumers are increasingly responsible for making

good choices and even managing what happens at various stages of treatment. Consequently,

researchers have worked hard to both measure quality and good practice and to develop materials

that consumers can use in decision-making. All that said, practical advice during times of need is

hard to come by. Most of us are just-in-time information users, seeking advice while in the throes

of our complex and fragmented health care system. More needs to be done to empower consumers

so the tools that have been developed get used.

Research tells us that we don't want health care providers steering people to their own agencies or best friends, so we need a better way of providing decision-making information than a midnight-to-1 a.m. search activity by an exhausted caregiver.

The fragmented system we have is difficult to navigate. My firsthand experience with helping my

spouse get care following a stroke is pretty typical. While there is a fairly common path to getting

care, it wends into different settings managed by different organizations, with almost no coordination

or even shared knowledge. Even when the same overarching institution is presumably involved,

each handoff occurs with uncertainty and with little sense of how one set of services helps or informs

the next. Even knowing the formal rules surrounding health care policy, as I do, helps little since the
practice can look quite different from what is implied in the regulations governing Medicare, for

example.

For a stroke victim and other patients requiring hospitalization and considerable follow-up care, the

usual progression is inpatient hospital, inpatient rehabilitation hospital, home health care, and then

outpatient therapy. Technically, discharge planning is offered or required at various stages, but it can

amount to as little as handing the family a list of eligible providers, with no supporting information or

documentation. Research tells us that we don't want health care providers steering people to their own agencies or best friends, so we need a better way of providing decision-making information than a midnight-to-1 a.m. search activity by an exhausted caregiver (my experience).

Care providers should be knowledgeable about the quality and ratings information available and share copies of such materials with those moving on to the next site of care. Currently, this is one missing link in health care decision-making. Busy professionals in one setting have little knowledge of how the other settings operate, so they can offer little guidance. Materials developed won't be used if they don't make sense to both patients and care providers.

AIR research done several years ago found that health care professionals and consumers often talk

past each other: They are looking for different things and often express very different reasons for

ignoring quality information, for example. Getting them on the same page can be challenging.

Other AIR research has also found that many people use proxy information as a shorthand for
quality, such as equating higher prices with higher quality care. But many studies have shown that lower-cost providers may provide equivalent or higher quality care.

Timing also complicates information-seeking. In my husband's case, each time there was to be a handoff to another setting, I would be reassured that I had several days to make arrangements, but would actually be forced to make a decision on the spot. Quality information that is supposed to help

with these decisions is difficult to access and understand when under the stress of both a deadline

and the general worry over being the caregiver for someone who is very ill. For that matter, other

information, such as on the availability of services, often does not exist.


Some researchers have suggested adding a care coordination specialist to the mix. That might help,

but only if that person follows the patient and isn't housed in a single caregiving setting. And even then, who would coordinate and oversee the coordinators? How would they be accessed, or compensated? At the moment, such activities are largely cottage industries. And services are

available only to those who can afford to pay out of pocket.

One answer might be to have a single organization provide all necessary care at each stage of the

process. Integrated health care systems promise that they will manage the handoffs and see that the

care is seamless. In practice, though, it does not always work that way. In one short-turnaround

handoff, it seemed the best approach would be to work with the home health agency affiliated with

the rehabilitation facility. But, absent coordination and any advantage of using related entities, I had

to fire the home health agency. After a brief orientation, it was our responsibility to call all the

individual aides to set up appointments; the nurse finally called back after two weeks (a week after I

had informed the agency that we were going elsewhere) and said she was ready to meet with us. A

homebound, very ill patient in a coordinated situation was not going to go untreated for over two

weeks! Fee-for-service gave us the option of finding another provider. In a managed care

environment, we would have to use the designated agency. Again, research indicates that, overall,

quality is fairly equivalent for Medicare Advantage (coordinated) plans and traditional Medicare. But

it is hard to find information on the various practical dimensions of receiving care when choosing

among health plan options.

Our second experience with home health was more successful, but only because I used personal

connections. None of the information on quality or availability indicated anything about actual access

to services and timeliness of care. Someone without a network of professional friends would have

been hard-pressed to figure out what to do. Moreover, research on this topic needs to recognize the

subtle differences between acute care needs and supportive services when both are needed but, in

our system, do not come from the same providers.

Every new twist and turn in the caregiving process further convinces me that there must be a better

way. And now I know firsthand that it's nearly as confusing to look at the problems facing the U.S.

health care system from the standpoint of a participant as from the standpoint of a researcher.
(When I find the time and insight to combine my practical experience and research knowledge, I

expect to have more lessons learned to share.) Research needs to help inform consumers but can

do so only if researchers choose to study the key questions that matter to patients. So far,

consumers must learn the hard way that there are no easy paths for navigating our current

healthcare system.

WHAT IS FIDELITY IN EVALUATION RESEARCH ANYWAY?

Steve Garfinkel

Evaluation design in the social sciences is a puzzle, literally. As government's role in everyday life expanded during the 20th century, the demand for accountability, and with it evaluation, grew too. Investigators proposed designs, identified flaws, puzzled out solutions, and so on. My favorite puzzle guide is Campbell and Stanley's Experimental and Quasi-Experimental Designs for Research. In it, the authors concisely synthesize 13 classic threats to the

validity of inferences made from evaluation research and 16 evaluation designs that address those

threats.

Fidelity has become a challenging concept, particularly in evaluating health care


insurance and delivery system interventions.

Campbell and Stanley popularized the use of X to indicate the intervention being evaluated and O

to indicate observations or measurements of the intervention's effects. Rereading this work recently,
I was struck by what a great choice X was. Undoubtedly, it was chosen to represent any intervention

that readers might consider. But X also conveys, perhaps unintentionally, the notion of the

intervention as a black box. Interventions in social interaction (teaching, providing health care) are

hard to implement precisely, and implementers can take various approaches. Without addressing

implementation fidelity explicitly, Campbell and Stanley do recognize it in their discussion of threats

to validity. However, they implicitly treat X as a single, coherent intervention common to all

participating organizations and persons and treat outside events (history) and internal growth

(maturation) as alternative explanations that compete with the uniform X.
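For readers new to this notation, a Campbell and Stanley diagram reads left to right in time, with each row a group. Their classic pretest-posttest control group design, for example, looks roughly like this:

    R   O   X   O
    R   O       O

Here R marks random assignment, the first O in each row is a pretest, X is the intervention delivered only to the first group, and the final O in each row is the posttest; comparing the posttests, in light of the pretests, estimates the intervention's effect.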

Since this influential text was written, fidelity has become a challenging concept, particularly in

evaluating health care insurance and delivery system interventions. In 2010, the Affordable Care Act

(ACA) accorded unprecedented importance and funding to the design, implementation, and

evaluation of innovations that would improve quality and safety, control costs, and optimize patient

outcomes in Medicare, Medicaid, and the Children's Health Insurance Program (CHIP). Congress

created the Center for Medicare & Medicaid Innovation (CMMI) at the Centers for Medicare &

Medicaid Services (CMS) to carry out this work. Expanding funding and authority to act on the

results of the kinds of rigorous evaluations that CMS had long carried out raised the stakes for all

Medicare, Medicaid, and CHIP evaluations. Under the ACA, the Secretary of Health and Human

Services can expand an innovation demonstration program widely without congressional

authorization if the CMS Chief Actuary construes evaluation results and actuarial analysis to mean

that certain cost and quality criteria are met.

To achieve cost and quality goals like the ACA's, the Institute for Healthcare Improvement has, since

the early 1990s, promoted the identification and diffusion of best practices through continuous

quality improvement. Since 2010, organizational learning and diffusion of best practices have

become essential elements of CMMI's vision, mission and operations. This drive came at about the

same time that Congress raised the stakes for evaluations.

With the importance of rigorous evaluation greater than ever and innovation evolving during the

demonstrations that are being evaluated, the notion of fidelity, so long central to rigor in evaluation,

has been challenged. On the one hand, why should the intervention remain static when we already
know how to improve its implementation? Defending scientific rigor, Campbell and Stanley might

think of these improvements as threats to validity from history or maturation that should be

minimized through experimental design and statistical control. But, by definition, organizational

learning and diffusion within the demonstration change X intentionally while it is being evaluated.

CMMI itself embraces rapid-cycle evaluation (RCE) as the answer to this conundrum. If you are

continually changing the intervention, then you must also measure outcomes as you go along to see

if those changes are harmful or helpful. This means both feeding back results to the demonstration

organizations periodically for rapid-cycle improvement and drawing evaluation conclusions from

them. Obviously, RCE can identify only short-term effects, but more traditional summative evaluation

at the demonstration's end can capture longer-term effects using the kind of rigorous evaluation

designs described by Campbell and Stanley.

All this said, do rapid-cycle improvement (RCI), intentional organizational learning, and their

challenge to traditional notions of implementation fidelity threaten or enhance the chances of getting

accurate results from the overall rigorous evaluation? With or without RCI and RCE, adherence

across demonstration sites to a well-specified intervention model (fidelity) is challenging when the

pace and direction of history and maturation vary.

At first blush, fidelity seems degraded when the intervention is altered intentionally while it is being

evaluated. But the changes made by communities of practice and rapid feedback of standard

performance measures might also move diverse participants toward consistency in implementation
and, thus, greater fidelity, at least by the end of the demonstration.

We don't yet fully understand these trade-offs' impact on our ability to draw actionable conclusions

from demonstration evaluations. Still, it is clear that carefully measuring the shifts and changes

introduced by active organizational learning activities throughout a demonstration and considering

them as explicit variables in the summative evaluation should help define fidelity for a new research

age.
TRAINING THE AGING BRAIN: FACT OR FICTION?

George Rebok

There has been ongoing debate for a decade now over whether

cognitive stimulation, through such everyday activities as completing crossword puzzles, learning to play a musical instrument, and participating in a book club, or through more formal cognitive training interventions, can help maintain or even enhance cognitive functioning as people age.

An equally important question is whether the results of cognitive stimulation and training will transfer

to both laboratory and real-life tasks. For example, will training people on a laboratory memory task

help them better recall the names and faces of people they meet in their everyday lives? Or does

improving processing speed on a simulated driving task improve people's actual driving ability and

on-road safety?

Too often, in their haste to sell brain-improvement products and games, developers rely
on one or two studies to back their claims of effectiveness rather than drawing on an
accumulated body of research.

Fortunately, a growing number of randomized controlled trials on the effects of cognitive training

programs, including adaptive computer training, are assessing the immediate and long-term benefits

to cognitive performance and whether such training will generalize to abilities and skills besides

those targeted by training.

The Advanced Cognitive Training for Independent and Vital Elderly (ACTIVE) clinical trial, the largest test of whether cognitive training can improve the cognitive and speed-of-processing abilities of healthy older adults, so far shows promising results for cognitive stimulation and cognitive training. It demonstrates that older adults can improve their cognitive abilities, though not as fast as younger adults can, and the improvements last for several months or even years, up to 10 years in the case of the ACTIVE trial.

The evidence for whether training transfers is more mixed. Relatively few studies show transfer to

non-trained tasks, including those involving everyday skills. However, in the ACTIVE trial, trained

participants self-reported fewer daily living problems, and those getting processing speed training

were less likely to cease driving or have at-fault automobile crashes.

Despite positive results, there is often a disconnect between laboratory research findings on

cognitive stimulation and cognitive training and their use in commercial brain training products

designed to stave off mental decline and forgetfulness. Brain training products have become a

billion-dollar industry worldwide; revenues are projected to surpass $6 billion by 2020. However, the

promised real-life benefits from cognitive training products are often unwarranted, and some

products aren't based on current research evidence.

What works in the laboratory may not work in the real world, so claims about the efficacy of these

commercial programs may be premature. For example, no study has shown that brain training

programs cure or prevent Alzheimer's disease, despite claims to the contrary by some commercial

vendors.

So one key question is why research is not used more in the development of brain training programs

for older adults. Although there is steadily growing scientific evidence for the benefits of cognitive

training, many program developers are not trained scientists and often cite research findings that are

only tangentially related to their scientific claims about a product.

Developers may also be reluctant to use research findings because the results of many training

studies are modest or fleeting, not the stuff of strong advertising claims. Too often, in their haste to sell brain-improvement products and games, developers rely on one or two studies to back their claims of effectiveness rather than drawing on an accumulated body of research (which may not exist for a particular program, might take time to collect, or might pose product validity questions that product developers can't answer). Although pharmaceutical claims are subject to regulatory review, so far brain fitness programs aren't, so some developers cherry-pick results and make

unsubstantiated advertising claims.

Further complicating the issue are important questions about implementing and disseminating

cognitive training programs for older adults in community settings. Many such programs are

computer-based, and thus inaccessible to those who lack adequate computer or literacy skills, don't know such programs exist, or find them hard to use for other reasons.

Researchers and developers need to pay more attention to making cognitive training programs

accessible and affordable for the increasingly diverse population of older persons, especially those

who are most in need. Guidelines for designing training and instructional programs for older learners

are available and could inform this translational effort.

Other important unanswered questions are how early cognitive training should begin, how much a

person should train, and how long the training can be expected to last. Until we know how to answer

these questions, potential consumers should ask questions and require scientific evidence that a

cognitive training program works. Which questions? For starters, are there scientists (ideally

neuropsychologists) and a scientific advisory board behind the program? Have these advisers

published peer-reviewed scientific papers? How many? What benefits are being claimed for using

this program? And, does the program fit my personal goals? (For more questions, see

this SharpBrains checklist.)

USING RESEARCH TO IMPROVE PRACTICE: WHICH RESEARCH MAKES


A DIFFERENCE?

Bea Birman
Efforts by policymakers and program administrators to identify what

works in education are legion. During the 1970s, the Joint Dissemination Review Panel evaluated

the impacts of educational interventions so that the federal government could share them more

widely. In recent years, the Education Department's What Works Clearinghouse has identified practices that improve outcomes, relying primarily on research's gold standard: randomized controlled trials. Yet, despite some positive changes in student outcomes (such as the modest narrowing of achievement gaps between minority and nonminority students), simply identifying effective practices hasn't yielded widespread or system-wide improvement. Is the

research on education practices partly to blame? Does it lack rigor or, on the other hand, the breadth

needed to make results generalizable?

Simply identifying effective practices hasn't yielded widespread or system-wide improvement... Changing what individuals and organizations do is best done in a durable community that supports both individual and organizational learning.

Certainly one difficulty is that finding practices that work, however rigorous the research behind

them, requires taking into account what is known about the organizations using the practices

successfully and how the people in these organizations (principals and teachers) learn. Too often, research to determine whether interventions work ignores knowledge from both research and practice about what it takes for teachers and schools to implement effective practices: about how people and organizations develop the capacity to improve.


By the same token, few policymakers design programs that create the optimal conditions for

improving education practices. Understanding how people and organizations learn could help shape

policies that support practice improvements rather than impede them.

Take teacher learning. Available evidence suggests that teachers learn best in an atmosphere of

trust. To improve, teachers must be able to learn new skills and unlearn old habits and behaviors.

This means making mistakes, at least at first. To risk trying something new, and to practice enough

to develop expertise, teachers require the kind of trust that takes time to develop, along with

supportive colleagues.

Beyond individual teachers, school improvement requires organizational learning. Implementing new

practices often requires breaking with entrenched organizational routines, monitoring how the new

practices are working, and making improvements along the way. Such changes don't happen

overnight! Changing what individuals and organizations do is best done in a durable community that

supports both individual and organizational learning.

Schools can be such learning communities, and some already are. But education policies and

practices beyond the school level can undermine the very conditions that these communities need to

thrive. For example, schools cant initiate or sustain effective practices without a stable teaching

force. Yet, district, state or federal policies can foster churn in the teaching force if district rules

don't incentivize teachers to stay in challenging schools or if rules mandate blanket staffing changes

(if, for example, School Improvement Grants require some schools to replace leaders or half of the
teaching force). And, beyond fostering a stable teaching force, continuous school improvement

requires leadership and resources from outside the school. Here, time for ongoing professional

learning springs to mind.

Some researchers and technical assistance providers recognize that identifying evidence-based

interventions is only one part of changing practice. AIR's National Center on Intensive Intervention,

for instance, employs randomized controlled trials and other rigorous research on data-based

individualization as the foundation for designing a five-step process of diagnosis, intervention,

progress monitoring, analysis and adaptation. Beyond rigorous research, "build[ing] district and school capacity to support implementation of data-based individualization in reading, mathematics, and behavior for students with severe and persistent learning and behavioral needs" (the Center's mission) requires helping schools prepare to initiate change and to commit to the long haul. Since implementation is multifaceted, it can't succeed without a host of supports ranging from strong leadership and teacher and parent involvement to opportunities for professional learning and data systems to monitor progress. There's no one-size-fits-all formula, but these ingredients are all

needed in some form.

Identifying interventions that work, no matter how high the research standards, is only one part of

improving education practices and outcomes. Long-term improvement requires knowledge about the

ongoing individual and organizational learning inherent in implementation itself.

CLOSING THE BLACK-WHITE ACHIEVEMENT GAP: GOOD NEWS, BAD


NEWS

George W. Bohrnstedt

With each National Assessment of Educational Progress (NAEP) release, we read how sluggish American students' progress is in subjects such as mathematics,

reading and U.S. history and, especially, how poor the achievement of Blacks is and, consequently,

how large the Black-White achievement gaps are. The most recent release of the 2015 NAEP

results was no different, except there were declines in Grades 4 and 8 mathematics and Grade 8

reading, and the Black-White achievement gaps remained large.


Results as a whole are very encouraging: both our White and Black students are
showing academic performance growth... And the bad news? When we compare the
Black-White achievement gaps over time, we can see that the gaps are closing but at a
snail's pace.

The NAEP assesses changes in the educational achievement of the nation's fourth- and eighth-

graders in mathematics and reading every other year and several other subjects less frequently;

U.S. history is currently assessed every four years. When we examine roughly 25 years of

achievement assessments for White and Black students in mathematics, reading and U.S. history, a

good news/bad news picture emerges.

What is the good news? Save for 2015, scores have gone up for all students, and the gains have

been greater for Blacks than for Whites. For example, scores at Grades 4 and 8 in mathematics for

Whites and Blacks have all risen. In the past 25 years, the scores for Whites in Grade 4 have risen

29 points; for Blacks, 36 points. The somewhat smaller gains at Grade 8 follow this same pattern,

although the gain for Black students is only 1 point greater than for Whites: 22 points for Whites and

23 points for Blacks.

As a way to understand what these gains mean, consider that roughly 40 NAEP points separate the

average Grade 8 and Grade 4 scores, which implies that, on average, students gain 10 NAEP points

per year. Thus, these are considerable increases in student performance in the past 25 years, but

especially for Black students at Grade 4.
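Using that rough yardstick of 10 NAEP points per grade-level year (a back-of-envelope conversion, not an official NAEP equivalence), the Grade 4 mathematics gains translate roughly as follows:

    White students:  29 points, or roughly 3 grade-level years ahead of their 1990 counterparts
    Black students:  36 points, or roughly 3.5 grade-level years ahead of their 1990 counterparts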

In reading, the same pattern holds, though the overall gains are less than for mathematics. Between

1992 and 2015, White fourth-graders gained 8 points, but Blacks gained 14 points. In Grade 8,

White students gained 7 points, compared to 11 for Black eighth-graders. This is all pretty good

news so far.

The pattern for U.S. history is similar. At Grade 4, the growth for Black students between 1994 and 2011, 22 points, far exceeded that for White students, 9 points. For eighth-graders, who most recently took the assessment in 2014, the results are similar but not as dramatic: 13 points for Black students compared to 11 points for White students.


These results as a whole are very encouraging: both our White and Black students are showing

academic performance growth. Most impressive, Black student growth exceeds that of White

students for both fourth and eighth grades and in all three subjects.

And the bad news? When we compare the Black-White achievement gaps over time, we can see

that the gaps are closing but at a snail's pace. Most progress has been made in Grade 4 history:

Over a 16-year period, the gap has closed 12 points. But progress has been much slower in the

other grade-subject combinations: 8 points in Grade 4 mathematics, 6 points in Grade 4 reading, 3

points in Grade 8 reading, 2 points in Grade 8 history, and 1 point in Grade 8 mathematics.

One way to gauge how fast gaps are closing is to examine the performance of Black students in a

given grade and subject area in the most recent assessment and compare that to the White students' score at the earliest point for which we have data. There is but a single instance, fourth-grade mathematics, in which Black students' most recent score equals or exceeds that earned by White students two or more decades earlier. The average score for Black students in 2015 was 224, just 4 points higher than White students scored in 1990. Still focusing on Grade 4 mathematics, it

took 15 years, until 2005, for Black student achievement to reach the 1990 level of White student

achievement. Importantly, Black students still have not caught up to early-1990s White student

achievement for any of the other grade-subject combinations.

So the good news is that Black students are improving their academic performance faster than

White students in key subject areas. But the bad news is that, at the current rate, closing the gaps
will take impossibly long. Even for Grade 4 mathematics, where progress has been greatest, it would

take a century to close the gap!
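To see where that century figure comes from, here is a rough, linear back-of-envelope projection for Grade 4 mathematics that uses only the numbers cited above (an illustration, not a formal forecast):

    Gap narrowed, 1990-2015:   8 points in 25 years, or roughly 0.3 points per year
    White average, 1990:       about 220 (the 2015 Black average of 224, minus the 4-point margin)
    White average, 2015:       about 249 (220 plus the 29-point White gain)
    Remaining gap, 2015:       about 25 points (249 minus 224)
    Years to close at 0.3/yr:  on the order of 80 years, and longer still if White scores keep rising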

While the data do not tell us which policies would close this unacceptable Black-White achievement

gap, we know enough from other studies to implement changes that could speed up progress. Most

important is the need for early childhood education: education from birth through a child's arrival at

kindergarten. The Early Childhood Longitudinal Study indicates that Black children arrive at

kindergarten scoring over 20 percent lower on tests of cognitive ability than White students. To

address this disparity, the evidence suggests the importance of wrap-around childhood education

programs that include emotional, nutritional, and health supports in addition to learning activities in
reading and mathematics. Finally, the evidence is clear that the most effective interventions begin at

or shortly after birth.

Black students also have higher absence rates than White students and are more likely to be in

schools with less-experienced and more non-credentialed teachers. And a recent AIR study showed

that the average eighth-grade Black student attends a school that is 48 percent Black, while the

average eighth-grade White student's school is about 10 percent Black, a differential negatively related to Black male students' academic performance when socio-economic status, teacher

qualifications and classroom practices are taken into account.

If we as a nation care about closing the Black-White achievement gaps, research tells us that early

childhood education, reducing segregation, and providing better teachers for our Black students

would be good places to start.

FURTHER READING

Using Social Marketing and Community Engagement to Help Low-Income Children Get Ready to Read
AIR Experts Available to Discuss Education, Health Issues Raised in President Obama's State of the Union Address
ESSA Health and Wellness
Long Story Short: How Can Schools Reduce Disparities in Disciplinary Action and Promote Student Mental Health?
Moving Forward, Looking Back: Landmark Legislation for Americans with Disabilities
