Publisher Routledge
Assessment in Education: Principles, Policy & Practice
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713404048
The Role of Assessment in Science Curriculum Reform
Graham Orpwood
To cite this Article Orpwood, Graham (2001) 'The Role of Assessment in Science Curriculum Reform', Assessment in
Education: Principles, Policy & Practice, 8: 2, 135-151
To link to this Article: DOI: 10.1080/09695940125120
URL: http://dx.doi.org/10.1080/09695940125120
Assessment in Education, Vol. 8, No. 2, 2001
The Role of Assessment in Science
Curriculum Reform
GRAHAM ORPWOOD
York/Seneca Institute for Science, Technology and Education (YSISTE), 70 The Pond
Road, Toronto, Canada M3J 3M6
ABSTRACT The argument of this article is that changes in curriculum need to be closely
linked to changes in assessment and that this is true as much of the forms of assessment as
it is of its content. Using science as the case in point, the changes in the goals of science
education in the 1960s towards a greater emphasis on inquiry skills were matched some 20
years later with a change in assessment to include performance assessment. Now the new
goals of science education are focused on the need to link science to the broader social
context, but assessment practices have yet to catch up with this change. Given the relatively
greater importance placed on assessment in the present era, the new curriculum emphasis may well
be ignored unless new approaches to assessment are designed and implemented soon.
Developing valid and reliable assessment instruments is complex at the best of times.
However, at times of major change in the curriculum, additional challenges and
dilemmas present themselves to test developers and force questions about the
sometimes competing roles of assessment in the larger educational context. In times
gone by, such competing roles might have been of only academic interest. At the
present, however, assessment (whether international, national or local) has become
of such importance, both educationally and politically, that clarifying the roles and
purposes of assessment has become a priority.
In the past 10 years, I have had the opportunity of participating in or closely
observing several science curriculum development projects and also two science
assessment projects. The curriculum development projects have been in the Canadian
context and specifically in the province of Ontario, but in the course of
undertaking these projects I have also had reason to analyse science curriculum
developments elsewhere in the world. The assessment projects with which I have
been associated have included the Third International Mathematics and Science
Study (TIMSS), a large international study for which I acted as science
co-ordinator, and the Assessment of Science and Technology Achievement Project
(ASAP), an Ontario project to develop curriculum and assessment resources for
classroom teachers. Despite the differences in purpose of these two projects, they
shared the challenge of developing valid science assessment instruments in a period
of significant curriculum change.
ISSN 0969-594X print/ISSN 1465-329X online/01/020135-17 2001 Taylor & Francis Ltd
DOI: 10.1080/09695940120062629
Curriculum development should, of course, include the development of appropriate
assessment, both for the classroom teacher to use and for any external assessment
that is used for summative purposes. However, in my
experience, many curriculum guides or policy directions are determined by one
institution, learning materials (such as textbooks) by another and assessment (both
classroom and external) by yet other individuals or examination boards. Among
these various players, there may or may not be consistency of understanding or
commitment concerning the curriculum, and the assessments (by whomever given)
may or may not have high validity. This consistency and validity (or their absence)
form the central theme of this article, which takes science as its case in point.
However, the central issue may be equally applicable to other subject areas.
The argument of the article begins with a brief contextual account of some of the
changes that have been taking place in science curriculum during the past 50 years,
at least in the English-speaking world. I shall argue that, while some of these have
constituted what can be called 'normal' curriculum change, others (using the new
Ontario science curriculum as a case in point) warrant the label of 'curriculum
revolutions'. Next, I examine the role of assessment during these periods of
curriculum revolution and identify one corresponding revolution in this area.
Finally, through reflecting on these experiences, I shall argue that leadership in
assessment in support of curriculum change must come through research and the
professional development of teachers, rather than through large-scale international
assessment projects.
Science Curriculum Change
From the long-term perspective, the science curriculum can be seen always to be in
a state of flux. While governments or official curriculum agencies or examination
boards may not issue a new curriculum every year, teachers are always finding new
ways to present the curriculum to students and, therefore, students always experience
a new curriculum. Sometimes, the change is minor and simply constitutes
changing instructional routines, but at other times, especially after a new official
curriculum has been issued, the changes called for at the classroom level may be
more significant.
The International Association for the Evaluation of Educational Achievement
(IEA) has developed a useful framework for distinguishing three different senses in
which the term 'curriculum' is used (Robitaille et al., 1993):
the intended curriculum, as set out or mandated in official statements of the
curriculum;
the implemented curriculum, as actually taught or delivered in schools;
the attained curriculum, as achieved by the students.
This is a useful framework as it enables us to conceptualise important relationships
among the three levels. In general (and to over-simplify the complexities of the
relationships among these three), governments (or other official agencies) control
the intended curriculum, teachers the implemented curriculum and students the
attained curriculum. The first two levels are never entirely synchronised and, at
times, there may be very significant slippage between them. Those involved with
assessment want to make claims about the third level, the attained curriculum.
However, since this is not directly observable, we have to use indicators of
achievement in the form of assessment instruments, from which the attained
curriculum can be inferred. The first part of this article is focused on ways in which
the intended curriculum has changed over the years. Later, I shall consider how
approaches to assessment have reflected these changes.
Normal Curriculum Change
Of course, the science content of the intended curriculum (the 'what should be
taught and learned' that is the substance of all curricula) is constantly undergoing
refinement as science itself evolves and as Spencer's century-old question, 'what
knowledge is of most worth?', is constantly given new answers. As genetics,
microbiology and ecology become recognised as critically important elements of
biology, the traditional botany and zoology that characterised curricula of the 1950s
have given way more and more to these newer aspects of life science. Earth and
space science has moved from Geography to Science in several jurisdictions as the
importance of the scientific (as opposed to the social) aspect of these areas has
increased. Chemistry increasingly focuses its attention on matters of structure,
mechanism and energy change, rather than on the more traditional
classification and properties of materials. Curiously, school physics appears to have
retained a more traditional view of appropriate content, with the term 'modern
physics' being used to characterise aspects of the subject discovered largely in the
period 1890-1920. Even the inclusion of technology in some science courses can
be seen as yet another adjustment to the course content. If Kuhnian terminology
can be employed in this context, these changes in content can be seen as aspects of
'normal' curriculum change.
Science Curriculum Revolutions
However, overlaying this 'normal' evolutionary change of the science curriculum, the
past 50 years have seen at least two more important changes, changes that I believe
warrant the term 'curriculum revolutions' [1]. These revolutionary changes (like
the paradigm shifts Kuhn described to explain the growth of science) were intended
to change, in a fundamental way, how the science curriculum was to be understood,
taught and learned. They focused less on the content of the science curriculum, and
more on the goals for or purposes of teaching and learning science. Scientific
knowledge still represents the core of the curriculum, but the question 'Why are we
learning this stuff?' is given a new set of answers.
Roberts' (1982) concept of 'curriculum emphasis' helps capture the nature of the
change involved. For Roberts, the content of science teaching is always presented in
a context (he calls it a 'curriculum emphasis'), which communicates to the student
(often implicitly) the purpose of learning the science content. Roberts has also
described a series of seven such curriculum emphases that have characterised science
curricula during this century. While elements of most of these can be found in some
classrooms today, it was not always so. In the 1950s, barely three of Roberts' seven
emphases were to be found, and two of these ('correct explanations' and 'firm
foundations') both see acquiring scientific knowledge as an end in itself, the only
worthwhile outcome of the curriculum: in one case ('correct explanations') because
it (science) is true, and in the other ('firm foundations') because it sets a foundation
for the further study of science. In this context, 'normal' change of the curriculum
can be seen as the adjustment of which scientific knowledge should be learned and
at which stage. By contrast, 'revolutionary' change involves the introduction of one
or more brand new or radically different curriculum emphases, a phenomenon
we observed first some 30 or 40 years ago, and are observing, once again, at the
present time. The new emphasis not only adds a dimension to the curriculum. It
also changes radically the selection of science content seen to be important and
changes the ways in which students are expected to interact with that science
content.
The first of these periods of revolutionary change began in the late 1950s and
1960s in both the British and American (i.e. USA) education systems, as well as
elsewhere in the world. During this period, science curricula became focused on the
nature and processes of the scientific discipline itself. This was the period in
England of the Nuffield science projects (and those that followed in the same
tradition) at both secondary and primary school levels. In the USA, similar
emphases were being incorporated into science curriculum projects such as PSSC
physics, ChemStudy chemistry, BSCS biology (at the secondary school level)
and Science: a process approach (SAPA), the Elementary Science Study (ESS) and
the Science Curriculum Improvement Study (SCIS, at the elementary school
level).
A whole literature sprang up, which both articulated the rationale underlying
these curriculum projects, and advocated them to teachers and schools (Hurd &
Gallagher, 1968; Hurd, 1969, 1970). For example, in a withering critique of
traditional school science, Schwab (1965) characterised it as a 'rhetoric of
conclusions', which ignored the underlying processes of inquiry that, he argued, more truly
represented the essence of science. Spurred on by the launch of the Russian Sputnik
in 1957, the US government poured millions of dollars into the new science
curricula with a concern to generate more and better scientists to support the
national security imperatives.
In studying science using these curricula, students were expected not just to learn
the concepts and theories of science, but also to acquire an understanding of how
science functions as a discipline and the skills associated with scientific investigation.
Hodson (1993, p. 106) has summarised the purposes of science education following
this period in terms of students 'learning science', 'learning about science', and 'doing
science'. Learning scientific concepts, laws and theories was still seen as important,
but equally important was the context in which the content was to be set. 'Doing
science' meant acquiring the skills, strategies and habits of mind associated with
scientific investigation. 'Learning about science' referred to understanding how
science functioned as a discipline, its practice, its methods, its logic and its
epistemology. Two new emphases ('scientific skill development' and 'the structure of
science'; Roberts, 1982) had been born, at least on paper.
This science curriculum revolution influenced science curriculum 'talk' (Orpwood,
1998), in curriculum guides, textbooks and professional development workshops,
for nearly four decades. However, classroom teachers were, for the most part,
unprepared to teach these new emphases. Few had had any personal experience of
hands-on scientific inquiry or received any formal background study in the philosophy
of science. Despite the enormous amounts of money and effort that the
curriculum projects put into in-service teacher education, the innovations were
rarely fully taken up in schools, at least in the form that their developers had in mind
(e.g. Stake & Easley, 1978).
Over the subsequent decades, another chapter of the research literature was
devoted to explaining why the curriculum revolution had failed to take root fully in
American schools. There were many factors involved, but one was the failure of
assessment in school science to match the changes in direction adopted by the
curriculum. To this point we shall return after considering the second great curricu-
lum revolution of the past half-century.
The second period of revolutionary change began slowly in the early 1980s and
has now (in the late 1990s) gathered significant momentum in many countries of the
world. If the first revolution focused attention inward towards the structure and
processes of science itself, the second balances this with attention outward towards
society and the complex relationships among science, technology, society and the
environment. A significant literature has now described the development and
rationale for the many versions of this new focus for science education (e.g. Hurd,
1975; Aikenhead, 1980; Solomon, 1981; Bybee, 1985; Fensham, 1988; Cheek,
1992a; Solomon & Aikenhead, 1994; Yager, 1996; Black & Atkin, 1996, to name
but a few). Now, in addition to acquiring basic scientific knowledge and the skills of
scientific investigation, students are being expected to understand how science is
related to technology, and how both science and technology impact on society and
the environment. This new curriculum emphasis has even acquired its own
acronym, STS (for science, technology and society) [2].
Once again, national standards and curriculum guides have begun to embrace this
second revolution (e.g. American Association for the Advancement of Science,
1995; National Research Council, 1996; Council of Ministers of Education,
Canada, 1997; Government of Ontario, 1998, 1999). Another factor is influencing
this revolution in a way that it did not in the 1960s. In the past, goals or aims were
usually stated quite independently from the science content. The result was that
textbooks, teachers and assessors were free to embrace or ignore them. Now, since
many of the newer curriculum guides are stated in the form of outcomes, which
incorporate both goals and content, the new emphasis has become an integral part
of the curriculum specifications (Orpwood & Barnett, 1997). By way of illustration,
I will describe the new Grades 1-8 science and technology curriculum in the
Canadian province of Ontario.
The Ontario Curriculum in Science and Technology: a case in point
The document that eventually became The Ontario Curriculum, Grades 1-8, Science
and Technology (Government of Ontario, 1998) was developed by a consortium of
teachers and school districts led by science educators at York University as a product
of the Assessment of Science and Technology Achievement Project (ASAP). It had a series of
features that were new for the province. It represented the first curriculum in
Ontario in 30 years that clearly articulated expectations in science for each grade of
the elementary school. It integrated the study of science with that of technology, the
first time technology education was specifically mandated in Ontario. It introduced
the study of earth and space sciences into the science curriculum (these areas having
been regarded previously as physical geography). It was set out in the form of
'outcomes' for what students should know and be able to do by the end of each
year. All of these could be regarded as 'normal' changes to the curriculum, even
though these were major changes and presented significant challenges for classroom
teachers.
However, the three goal statements represented the element of the curriculum that
was revolutionary. These are that students are intended:
to understand the basic concepts of science and technology;
to develop the skills, strategies and habits of mind required for scientific inquiry
and technological design;
to relate scientific and technological knowledge to each other and to the world
outside the school (Government of Ontario, 1998, p. 4).
These goals emerged from a complex project design that combined analysis of the
following factors:
an up-to-date view of the nature of science and technology;
curriculum trends nationally and internationally;
research on children's capacity to learn;
the experience of classroom teachers;
consideration of the needs of Canadian society;
a deliberated consensus about all of these and the needs of Ontario's children.
This is not the place for an exhaustive account of all of these factors. However, two
can serve to demonstrate the origins of the goals adopted by this curriculum. For
example, the project sought to link the concepts of science and technology as
school subjects to the array of concepts that each embodies in the real world (see
Orpwood & Bloch, 1998, pp. 7-9). Both are, first, systems of knowledge: science
seeking to describe and explain the natural and physical world, and technology
seeking to meet human needs through inventing or modifying devices, structures,
systems or processes. Secondly, both science and technology are processes of
investigation and exploration: science through the processes of inquiry and technology
through those of design. Thirdly, and this represented a new element for
many, science and technology are both social enterprises, which exist in social,
economic, political and environmental contexts. Omission of these contexts means
that only a partial view of both science and technology is presented.
The changing needs of students in the new millennium were another major
component that emerged from the research that examined trends nationally and
internationally (Orpwood & Barnett, 1997). The project held deliberations involving
a wide variety of stakeholders that led to a clear consensus about the desirable aim
of the curriculum. We should ensure that every student received the opportunity to
develop basic scientific literacy and technological capability. These, in turn, involved
three elements:
understanding the core concepts of science and technology;
acquiring the skills important for life and work in the twenty-first century;
being able to relate the knowledge and skills acquired in school to real-life
situations.
The goals that emerged from this process (which occupied many hundreds of people
and lasted two full years) are, not coincidentally, very similar to those appearing in
many other new curricula in jurisdictions around the world. Indeed, analysing these
curricula was a component of ASAP. However, they do differ significantly from
those of the first curriculum revolution and even more from those from before that
time.
Incidentally, at the same time that ASAP was undertaking its curriculum
development work, the Council of Ministers of Education, Canada, completed the
development of its own framework for science curriculum known as the Pan-
Canadian Science Framework (Council of Ministers of Education, Canada, 1997).
There was significant interaction between the two projects, since the principal
architect of the ASAP document (Marietta Bloch) was also a member of the
Pan-Canadian development team. The goals articulated by the Pan-Canadian
framework are entirely compatible with those in Ontario and, thus, this document
belongs to the new generation of second revolution curriculum frameworks (Aiken-
head, 2000).
The three goals, once articulated, form the conceptual glue that binds the rest of
the document together. The content is organised in five strands, which effectively
integrate the science and technology content knowledge:
life systems;
matter and materials;
energy and control;
structures and mechanisms;
earth and space systems.
For each of these strands, at each of the eight grades, the three goals are interpreted
in the form of three overall expectations and three sections of specific expectations.
The goals are clearly integrated with the content so as to ensure their not being
omitted during the implementation.
Assessment in the Context of Curriculum Change
If the curriculum went through ongoing 'normal' change and discrete periods of
revolutionary change, then it would be reasonable to expect that the cousin activity
of assessment would experience parallel types of change and that current forms of
assessment are co-ordinated well with the current curricula. However, I shall argue
that this does not turn out to be the case.
If the teaching of science knowledge for its own sake (Roberts' (1982) 'correct
explanations' and 'firm foundations' emphases) represents the basic curriculum
paradigm of the pre-1960s period, then measuring how much scientific knowledge
a student has acquired represents the corresponding assessment paradigm.
Throughout the world, science assessment, both in classrooms and in national or
international projects, focused on students demonstrating their scientific knowledge
chiefly by responding to questions that required recall of memorised information,
solving problems through memorised algorithms, and analysis of contrived data
or situations that parallel those encountered in school science. This pattern of
school science assessment mirrors in many ways patterns experienced in university
examinations.
The pattern described here is not restricted to the use of multiple choice items,
though these are popular in North America because of the ease and reliability of
scoring. The essay-type constructed response items and the short-answer format
(more familiar to teachers and students in Europe) are equally likely to call for recall
or the simple processing of memorised information. The point I am making here is
that 'normal' science assessment comprises a very limited range of student cognitive
activities, regardless of the types of assessment item used.
Given this paradigm, validity issues in science assessments usually amount to
analysis of the distribution of items of various science content areas compared to the
distribution of the science content topics in the curriculum. One of the problems of
assessment thus consists of developing an assessment that is balanced with respect
to the many science topics covered, while still maintaining an assessment of
reasonable length. In classroom tests, teachers handle this by having frequent unit
tests covering small areas of the curriculum. School examinations handle it by a
judicious selection from the topics covered, leading, of course, to the students
having to try to second-guess which topics will be on the exam, with those who
guess best being more successful than those whose predictions are less accurate.
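The content-balance check described in this paragraph can be sketched in a few lines of Python. This is an illustrative sketch only, not a procedure taken from TIMSS or ASAP: the function name content_balance, the topic weights and the item list are all hypothetical, and the strand names from the Ontario curriculum are borrowed purely as labels.

```python
from collections import Counter


def content_balance(curriculum_weights, item_topics):
    """Compare each topic's share of test items with its share of the curriculum.

    curriculum_weights: dict mapping topic -> intended share of the curriculum.
    item_topics: list giving the topic assessed by each test item.
    Returns a dict mapping topic -> (item share - curriculum share);
    a positive value means the topic is over-represented in the test.
    """
    if not item_topics:
        raise ValueError("the test must contain at least one item")
    counts = Counter(item_topics)
    n_items = len(item_topics)
    topics = set(curriculum_weights) | set(counts)
    return {
        topic: counts.get(topic, 0) / n_items - curriculum_weights.get(topic, 0.0)
        for topic in topics
    }


# Hypothetical figures, invented for illustration only.
weights = {
    "life systems": 0.25,
    "matter and materials": 0.25,
    "energy and control": 0.5,
}
items = (
    ["life systems"] * 2
    + ["matter and materials"]
    + ["energy and control"]
)
gaps = content_balance(weights, items)
```

With these invented figures, life systems supplies half of the items but only a quarter of the curriculum, so its gap is positive, while energy and control is under-represented by the same margin. A check of this kind addresses only the distribution of content and says nothing about the cognitive demands of the items.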
In large-scale achievement tests, such as TIMSS, the problem is magnified, since
the range of topics covered by curricula in the many countries is very broad. This
problem can be resolved in part through a complex test design involving a very large
pool of items and the use of multiple test booklets (Adams & Gonzalez, 1996). Even
so, the problem of test development in TIMSS was significant. With a blueprint
based loosely on the science curricula of participating nations (McKnight et al.,
1993), it involved making many compromises based on such factors as field-test
results, national preferences and the avoidance of large item-by-country interactions
(Garden & Orpwood, 1996). While, from a technical (reliability and Item Response
Theory (IRT) scaling) perspective, the TIMSS written achievement tests worked
effectively, they have continued to attract criticism from observers in a variety of
countries (e.g. Fensham, 1998). The over-arching criticism has been a challenge to
their validity, particularly in these times of curriculum change.
There was much more to TIMSS than the simple assessment of students' science
content knowledge and I shall return to further discussion of TIMSS later. First,
however, I want to discuss a revolution in assessment that corresponds to the first
curriculum revolution described earlier.
Assessment for the 1960s Curriculum Revolution
While many of the curriculum projects that incorporated the new goals (concerning
the nature of science and the acquisition of science inquiry skills) attempted to
develop their own measures of achievement, these rarely became commonplace in
schools or in large-scale assessments. Rather, teachers and national/international
assessment projects continued to use traditional assessment measures, measures
that, in the main, called for recall of memorised scientific knowledge.
There were perhaps four major reasons for this. First, the assessment technology
that would permit valid assessment of students' abilities to conduct investigations in
science had not been designed in the 1960s. The first significant performance
assessments (as they have now become known) were designed in England in the
early 1980s by the Assessment of Performance Unit (APU, 1983), fully 20 years after
the goal of instilling inquiry skills in students had first been introduced into the
curriculum. Even after its initial successes in the UK, the APU saw its funding cut
and it was even longer before the sort of assessment using performance tasks became
familiar in North America.
Secondly, even when newer, more 'authentic' assessments had been developed, the
psychometric community (particularly in the United States), who were anxious to
maintain the reliability of the multiple choice and other objectively scored tests,
expressed scepticism over such new measures of assessment. It is only in the last
decade that significant research on the characteristics of performance assessments
has become commonplace in the educational literature.
The third reason had to do with public credibility: universities and the public
thought they knew what traditional tests measured, and new, unproven forms of
assessment lacked the familiarity and thus the credibility of the traditional ones. This
reason is still, as we shall note later, a problem for science educators who try to make
their assessments match the intended goals of the curriculum.
The final reason, I would submit, was a professional inertia amongst teachers
themselves, particularly at the secondary school level, for whom assessment has
often tended to mimic the examinations experienced at university. Thus, when a
national assessment was developed or reviewed by a committee of teachers, the
items most likely to be considered acceptable were those of the most traditional
variety.
These factors led to a significant delay in the paradigm shift in assessment
corresponding to the curriculum revolution of the 1960s. It was the 1980s before
performance assessment even made its first significant appearance and the 1990s
before it became at all widespread. Even in TIMSS, the performance assessment
component (Harmon et al., 1997), which had initially been described as integral to
the study, was later treated as a national option, was reported (at the international
level) separately from the paper-and-pencil assessment, and was dropped entirely
from the TIMSS-R replication study taking place in 1999.
It appears that the goals that formed the essence of the science curriculum
revolution of the 1960s are still not being assessed with the same degree of attention
as those that focus on simple recall of scientific information. Stake & Raizen (1997),
commenting on this situation in a recent review of curriculum innovations in the
United States, observe that:
most reformers in the eight projects we studied agreed that the reconceptu-
alization of science education is incomplete if it leaves out the reconceptu-
alization of assessment. Yet systemic educational reform calls for the use of
rigorous, objectively scored, standardized tests as bottom-line criteria.
(p. 138)
They also point out the political dilemma:
it is difficult to assure parents, taxpayers, and sceptical teachers that the
new curricula and teaching strategies will provide students the information
that achievement testing has traditionally required. Reformers who claimed
that back in the 1960s failed to be persuasive (Stake & Easley, 1978).
(Stake & Raizen, 1997, p. 132)
Assessment for the 1990s Curriculum Revolution
If assessment of the curriculum goals that characterised the 1960s revolution has
been delayed and still seeks credibility, that for the STS revolution in science
curriculum has barely surfaced at all beyond the research level. Some researchers
have recognised the problem (e.g. Aikenhead et al., 1987; Bybee, 1991; Cheek,
1992b) and some projects have attempted to tackle it (e.g. American Chemical
Society, 1988; Aikenhead & Ryan, 1992). Cheek (1992b) reports that STS compo-
nents are contained in the work of the New York State Education Department, the
South Australia Senior Secondary Assessment and several examination boards in the
UK. In Canada, Alberta Education's assessment branch has also attempted to
ensure that the STS components of the curriculum are truly reflected in their
provincial assessments.
However, for most classroom teachers and large-scale assessment projects there
remains little guidance or exemplary work that addresses how to assess student
achievement in the context of an STS-orientated curriculum. In times gone by, this
might not have mattered, provided teachers were convinced that an STS emphasis
was right to integrate into their science programmes. However,
with the increasing importance of assessment and the measuring of students'
achievement of the intended outcomes, the absence of STS from classroom and
large-scale assessment is likely to have a profound influence on whether the
STS-related outcomes in the curriculum are taken seriously.
Item A1
Nuclear energy can be generated by fission or fusion. Fusion is not currently being used
in reactors as an energy source. Why is this?
A. The scientific principles on which fusion is based are not yet known.
B. The technological processes for using fusion safely are not yet developed.
C. The necessary raw materials are not yet readily available.
D. Waste products from the fusion process are too dangerous.
FIG. 1. Item A1.
The test development experience of TIMSS once again provides an illustration of
some of the difficulties associated with trying to include assessment items that
address the STS emphasis in the science curriculum. The main TIMSS item pools
for 9- and 13-year-olds (TIMSS populations 1 and 2) contained few items (5% at
most) that addressed STS issues and no more than this number that focused on the
nature of scientific investigation. However, the Mathematics and Science Literacy
(MSL) component of the third TIMSS population (school-leavers) represented the
most systematic attempt to include such items, in a category of the test labelled
'Reasoning and Social Utility' (RSU) [3]. While, in the end, RSU was not used as
an independent reporting category, several STS items were included in this aspect
of the MSL achievement tests. Some of them, such as Item A1, call for students to
recall previously learned scientific or technological information (Fig. 1).
Others, such as Item A7, call for the application of scientific principles to a social
situation (Fig. 2).
A third type of item, of which there were very few examples in TIMSS, but which
perhaps illustrates the STS emphasis more faithfully, is exemplified by Item A11
(Fig. 3).
This item was based on a real-life scenario (described in a newspaper article) and
the original item only contained part (B). This version of the item was challenged by
the TIMSS subject-matter specialists as 'containing no science' and thus part (A)
was added. The second part of the item is clearly an attempt to assess students' STS
understanding in that it invites consideration of the social and economic
consequences of the introduction of a new technology.
Item A7
Some high-heeled shoes are claimed to damage floors. The base diameter of these very
high heels is about 0.5 cm and that of ordinary heels about 3 cm. Briefly explain why
the very high heels may cause damage to floors.
FIG. 2. Item A7.
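The physical reasoning that Item A7 appears to invite can be sketched numerically. What follows is only an illustration and is not taken from the TIMSS item or its scoring guide: it assumes circular heel bases and the same load on each heel, and the 300 N load and the function name heel_pressure are hypothetical choices made for the example.

```python
import math


def heel_pressure(force_newtons, base_diameter_cm):
    """Pressure in pascals under a heel, assuming a circular base."""
    radius_m = (base_diameter_cm / 100) / 2
    area_m2 = math.pi * radius_m ** 2
    return force_newtons / area_m2


# Assumed load on one heel, in newtons (hypothetical, not given in the item).
force = 300.0

narrow = heel_pressure(force, 0.5)    # very high heel, 0.5 cm base diameter
ordinary = heel_pressure(force, 3.0)  # ordinary heel, 3 cm base diameter

# The ratio does not depend on the assumed load, only on the two diameters.
ratio = narrow / ordinary
print(f"pressure ratio, narrow to ordinary heel: {ratio:.0f}")
```

Under these assumptions the narrow heel exerts roughly 36 times the pressure of the ordinary heel, because contact area scales with the square of the diameter. That is the kind of application of a scientific principle to an everyday situation that the item seems to be seeking.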
Item A11
It takes 10 painters 2 years to paint a steel bridge from one end to the other. The paint
that is used lasts about 2 years, so when the painters have finished painting at one end
of the bridge, they go back to the other end and start painting again.
A. Why must steel bridges be painted?
B. A new paint that lasts 4 years has been developed and costs the same as the old
paint. Describe two consequences of using the new paint.
FIG. 3. Item A11.
The item very nearly failed to survive in the MSL achievement test because of a
variety of additional factors, including:
• difficulties with scoring, since the notion of a 'correct' answer is dependent on the
socio-political context;
• the fact that educators in many countries did not consider it appropriate for
a science achievement test, even one that focused on science literacy;
• the implicit introduction of values into a science assessment, since the response one
gives to part (B) of this item requires one to adopt a value-laden position, and
think through the assumptions and consequences of that position.
Nevertheless, the item did remain in TIMSS and the results, which are currently
being analysed for another paper, show some interesting patterns of response across
the world and even within countries.
However, the difficulties encountered in the development and use of this item
remain and would appear to be endemic to STS assessment. Aikenhead and his
colleagues at the University of Saskatchewan have suggested that 'a new generation
of standardized instruments' (Aikenhead et al., 1987) is required or, in the language
of this article, a new revolution in assessment. Their work in developing the Views
on Science-Technology-Society (VOSTS) instrument certainly represents a challenge
to the normal conception of assessment in science. In their words, VOSTS
'requires students to write an argumentative response: a reaction to a statement
about a STS topic. Rather than analyzing right and wrong answers, we let
students' arguments define various positions or viewpoints on each STS topic'.
While the original VOSTS was not suitable for use in large-scale assessments, it has
since been adapted to describe students' views on STS in Ontario (Crelinsten et al.,
1993). VOSTS overcame the problem of values by allowing the student to adopt
any position, but assessed the quality of the argument.
The new OECD study, Programme for International Student Assessment, known
as PISA, is also attempting to push the bounds of assessment in the area of STS.
However, it is resisting the inclusion of items that require values to be analysed.
Rather, it is presenting students with scenarios from real life and asking them to
demonstrate their abilities at using scientific processes in the analysis of the issues
involved (for more information see the PISA Framework document, Programme for
International Student Assessment, 1999).
Task 1LS/PT02 (for Grade 1, Life Systems strand)
GAME TIME
Design and make a game for a child who is not able to see. Name your game and
describe the rules so that others can play it.
What other senses will people who play your game have to use?
Name the materials you used to make the game and describe why you chose them.
Draw a picture of the game and label the parts.
Describe the rules of your game.
FIG. 4. Task 1LS/PT02.
The Assessment of Science and Technology Achievement Project (ASAP) has
developed a wide range of assessment tasks for classroom use (Orpwood et al.,
1999) corresponding to the full range of the expectations contained in the Ontario
science and technology curriculum described earlier. The focus of the collection of
500 tasks covering eight grades is on 'what students can do with what they know',
rather than on the traditional 'what they know'. In the area of STS, some of the
tasks put students into real-world situations and ask them to reflect on some
important aspect of the situation. In this respect, the ASAP collection bears
similarities to the PISA science assessment. A few sample tasks can serve to
illustrate the point (Fig. 4).
The Grade 1 Life Systems unit is entitled 'Characteristics and Needs of Living
Things' and the task is focused on several expectations from the 'skills of inquiry and
design' section of the curriculum, including asking questions about the needs of
living things, planning investigations and communicating results. In addition, the
task addresses the STS expectations of comparing the ways in which humans use
their senses to meet their needs, and describing ways in which people adapt to the
loss or limitation of sensory ability.
Not all the tasks are 'hands-on' in the sense of requiring students to undertake
practical work in a laboratory setting. Consider the following task, for example, from
the Grade 7 Life Systems unit on 'Interactions within Ecosystems' (Fig. 5).
Task 7LS/EA04 (for Grade 7, Life Systems strand)
A construction company is about to bulldoze a wood lot with a pond nearby so that
a new housing development can be built. Devise a plan so that the new houses get built
and yet the environment, and the plants and animals in it, get protected. We want this
to be a win-win situation. How can the new houses be built and yet the environment
still protected?
FIG. 5. Task 7LS/EA04.
These tasks call for students to think holistically about a real-world situation,
taking into account the competing demands of apparently conflicting positions. They
call for the creative development of solutions to a problem that clearly has no 'right'
or 'wrong' answers. In both cases, students must have developed prior knowledge
and skills, and in both the responses will demonstrate their abilities at these. The
focus here is on integrated, open-ended thinking of a kind not usually sought in
science assessments. Scoring responses to such questions will be hard, especially
if reliability considerations are paramount. Yet both tasks would appear to be entirely
appropriate given the expectations of the curriculum. While these examples are
not presented as ideal examples of STS assessment items, they represent the
direction that the needed assessment revolution must pursue if the latest curriculum
revolution is to be reflected adequately in classroom assessments.
Concluding Thoughts: what counts as science assessment?
Clearly, new directions (arguably revolutions) are emerging in science curricula in
various parts of the world. The assessments required to determine students'
achievement of the new goals of science curricula, however, have been slow to catch up.
While recent progress in the use of performance assessments has focused attention
on what students can do in science, as well as on what they know, the new
challenges presented by the STS revolution in science education have not been
systematically addressed by most assessments. Indeed, the problem of 'what counts
as science assessment' has in many cases not developed much from the
pre-revolutionary era when measuring the quantity of students' knowledge of science
was the major focus.
Of course, the new STS revolution appears in many varieties. It is not the case
that all versions of STS curriculum are focused on the same specific goals or
integrate STS with science content in the same way or to the same extent, as
Aikenhead (1994) has pointed out. For example, the five items sampled above all
reflect some aspect of STS in that all of them link science topics with STS content.
However, each of them does so in a different way. Some (e.g. A7 and A11a) call for
students simply to apply their scientific knowledge, albeit in an STS context. Others
(such as A1) call for students to recall specific STS information. Yet others (e.g.
item A11b) require students to demonstrate little knowledge of science content, but
rather to be able to reason about the impact of science and technology in a social
context.
Those of us who advocate STS in science education have a responsibility to clarify
more precisely what we expect students to be able to demonstrate in an assessment
context if we expect STS to appear more consistently in science assessments of any
kind. What is needed is a framework that enables analysis of the varieties of STS
objective incorporated in a curriculum, and thus of the types of assessment that are
appropriate. Aikenhead's framework provides a useful start, but it focuses on the
percentage of a complete assessment that is STS. As the items shown here demonstrate,
the issue is not simply one of how much of an assessment is STS, but also
what types of student performance are called for and how these relate to the intent
of the STS curriculum. Any move towards a more comprehensive framework for the
assessment of STS must take these complexities into account.
International and other large-scale assessments face a particular dilemma. On the
one hand, their validity is sometimes determined (as was the case in TIMSS) not
only in reference to the content of the intended curricula, but also partly in relation
to the implemented curricula. Even in countries that intend the curriculum to
include STS, implementation may lag well behind the intended curriculum changes.
It is hard, therefore, for such international projects to provide leadership in terms
of promoting new forms of assessment having higher validity in some countries,
while also remaining acceptable to all participants. At the same time, the political
status and high-profile consequences of large-scale international studies such
as TIMSS may encourage the maintenance of the status quo, or even slow down the
spread of curriculum revolutions across and within the countries that participate.
Leadership is therefore required from all quarters to ensure that innovations such
as performance assessment and STS assessment are not allowed to be regarded as
second-class or entirely optional ways of assessing achievement in science education.
In the case of large-scale assessments, this requires new models for addressing
validity to be introduced, such as that proposed (but not implemented) for
TIMSS by Shavelson et al. (American Educational Research Association, 1993). They
proposed what became known within TIMSS as the 'flower and petals' model,
involving a core cluster of items as an assessment for all countries, and other clusters
of items, which would be taken by those countries choosing to do so. Such a model
might have gone some way to resolving the dilemma of validity across the many
countries participating in TIMSS.
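The structure of such a core-plus-options design can be sketched in a few lines. This is a hypothetical illustration only, since the proposal was never implemented: the function name assemble_test, the cluster names and the item labels are all invented for the example and do not come from any TIMSS document.

```python
def assemble_test(core_items, optional_clusters, selected):
    """Return one country's item list: the core plus the clusters it selects."""
    unknown = [name for name in selected if name not in optional_clusters]
    if unknown:
        raise KeyError(f"unknown clusters: {unknown}")
    items = list(core_items)
    for name in selected:
        items.extend(optional_clusters[name])
    return items


# Hypothetical core cluster, taken by every participating country.
core = ["C1", "C2", "C3"]

# Hypothetical optional clusters (the "petals").
petals = {
    "performance": ["P1", "P2"],
    "sts": ["S1", "S2"],
}

country_a = assemble_test(core, petals, ["sts"])  # opts into the STS cluster
country_b = assemble_test(core, petals, [])       # takes the core only

# Items shared by both countries: the basis for any international comparison.
common = [item for item in country_a if item in country_b]
print(common)
```

In this sketch the core items remain common to every country and so support international comparison, while a country whose curriculum includes STS can add a cluster that improves the validity of the assessment for its own purposes.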
At the same time, the professional inertia that resists change in assessment at the
classroom, local and national levels needs to be addressed. Here, I believe that the
key move is to integrate assessment with the professional development of teachers,
as is already the practice of the Ontario provincial assessment programme and in the
next phase of ASAP currently under way. Teachers will work on developing new
forms of assessment for their classrooms as part of an ongoing series of professional
development workshops, and thereby address together the challenges of a new STS
curriculum and of assessing it in an appropriate way.
Finally, academic leadership must be shown through greater collaboration be-
tween the curriculum and psychometric research communities. One of the casualties
of academic specialisation is that those of us schooled in the issues of curriculum,
teaching and learning are not often also up-to-date with developments in assess-
ment, while those whose expertise lies in assessment have not had time or interest
to understand the complexities of the revolutions that have taken place in the
curriculum. Dialogue across this divide is required if the revolutions of the intended
science curriculum are to be reflected in the real and reported achievements of
students, in whose interests the entire enterprise is undertaken.
NOTES
[1] The term 'revolution' is also used in this way by Atkin et al. (1996).
[2] Sometimes, 'Environment' is added as an additional element to STS, making the acronym
'STSE' (see Council of Ministers of Education, Canada, 1997, for example).
[3] Orpwood & Garden (1998) describe the test development for the MSL component of
TIMSS in detail.
REFERENCES
ADAMS, R. & GONZALEZ, E. (1996) The TIMSS test design, in: M. MARTIN & D. KELLY (Eds)
Third International Mathematics and Science Study, Technical Report, Volume 1: design and
development (Chestnut Hill, Boston College).
AIKENHEAD, G. (1980) Science in Social Issues: implications for teaching (Ottawa, Science Council
of Canada).
AIKENHEAD, G. (1994) What is STS science teaching? in: J. SOLOMON & G. AIKENHEAD (Eds) STS
Education: international perspectives on reform, pp. 47–59 (New York, Teachers College
Press).
AIKENHEAD, G. (2000) STS science in Canada: from policy to student evaluation, in: D. KUMAR
& D. CHUBIN (Eds) Science, Technology, and Society: a source book on research and practice, pp.
49–89 (Kluwer, Plenum Press).
AIKENHEAD, G. & RYAN, A. (1992) The development of a new instrument: Views on science-
technology-society (VOSTS), Science Education, 76, pp. 477–491.
AIKENHEAD, G., FLEMING, R. & RYAN, A. (1987) High school graduates' beliefs about science-
technology-society, Science Education, 71, pp. 145–161.
AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE (AAAS) (1995) Project 2061: science
literacy for a changing future, a decade of reform (Washington, AAAS).
AMERICAN CHEMICAL SOCIETY (ACS) (1988) ChemCom: chemistry in the community (Dubuque,
Kendall/Hunt).
AMERICAN EDUCATIONAL RESEARCH ASSOCIATION (AERA) (1993) TIMSS achievement test item
pools, unpublished report (Vancouver, University of British Columbia).
ASSESSMENT OF PERFORMANCE UNIT (APU) (1983) Science at Age 11 (London, Department of
Education and Science).
ATKIN, M., BLACK, P., BRITTON, E. & RAIZEN, S. (1996) A global revolution in science, mathe-
matics and technology education, Education Week (April 10).
BLACK, P. & ATKIN, M. (Eds) (1996) Changing the Subject: innovations in science, mathematics, and
technology education (New York, Routledge).
BYBEE, R. (1985) The Sisyphean question in science education: what should the scientifically and
technologically literate person know and be able to do as a citizen? in: R. BYBEE (Ed.)
Science-Technology-Society, 1985 NSTA Yearbook, pp. 79–93 (Washington DC, National
Science Teachers Association).
BYBEE, R. (1991) Science-Technology-Society in science curriculum: the policy-practice gap,
Theory into Practice, 30(4), pp. 294–302.
CHEEK, D. (1992a) Thinking Constructively about Science, Technology, and Society Education
(Albany, SUNY Press).
CHEEK, D. (1992b) Evaluating learning in STS education, Theory into Practice, 31(1), pp. 64–72.
COUNCIL OF MINISTERS OF EDUCATION, CANADA (CMEC) (1997) Common Framework of Science
Learning Outcomes (Toronto, CMEC).
CRELINSTEN, J., DE BOERR, J. & AIKENHEAD, G. (1993) Measuring Students' Understanding of Science
in its Technological and Social Context (Toronto, Ministry of Education).
FENSHAM, P. (1988) Approaches to the teaching of STS in science education, International Journal
of Science Education, 10, pp. 346–356.
FENSHAM, P. (1998) Insights from TIMSS for Australian science education, unpublished paper
presented at the Annual Meeting of the National Association for Research in Science
Teaching, San Diego, April 21–25, 1998.
GARDEN, R. & ORPWOOD, G. (1996) Development of the TIMSS achievement tests, in: M.
MARTIN & D. KELLY (Eds) Third International Mathematics and Science Study, Technical
Report, Volume 1: design and development, pp. 2.1–2.19 (Chestnut Hill, Boston College).
GOVERNMENT OF ONTARIO (1998) The Ontario Curriculum, Grades 1–8, Science and Technology
(Toronto, Ministry of Education and Training).
GOVERNMENT OF ONTARIO (1999) The Ontario Curriculum, Grades 9–10, Science (Toronto, Ministry
of Education and Training).
HARMON, M., SMITH, T., KELLY, D., BEATON, A., MULLIS, I., GONZALEZ, E. & ORPWOOD, G.
(1997) Performance Assessment in IEA's Third International Mathematics and Science Study
(Chestnut Hill, Boston College).
HODSON, D. (1993) Towards a more critical approach to practical work in school science, Studies
in Science Education, 22, p. 106.
HURD, P. (1969) New Directions in Teaching Science in Secondary Schools (Chicago, Rand
McNally).
HURD, P. (1970) New Directions in Teaching Science for Junior High Schools (Belmont, Wadsworth).
HURD, P. (1975) Science, technology and society: new goals for interdisciplinary science teaching,
Science Teacher, 42, pp. 27–30.
HURD, P. & GALLAGHER, J. (1968) New Directions in Elementary Science Teaching (Chicago, Rand
McNally).
MCKNIGHT, C., SCHMIDT, W. & RAIZEN, S. (1993) Test blueprints: a description of the TIMSS
Achievement Test Content Design, TIMSS document, ICC797/NRC357 (Vancouver,
University of British Columbia).
NATIONAL RESEARCH COUNCIL (NRC) (1996) National Science Education Standards (Washington
DC, NRC).
ORPWOOD, G. (1998) The logic of science curriculum talk, in: D. ROBERTS & L. ÖSTMAN (Eds)
Problems of Meaning in Science Curriculum, pp. 133–149 (New York, Teachers College
Press).
ORPWOOD, G. & BARNETT, J. (1997) Science in the National Curriculum: an international
perspective, Curriculum Journal, 8(3), pp. 331249.
ORPWOOD, G. & BLOCH, M. (1998) Implementing the Ontario Curriculum, Grades 1–8: science and
technology (Toronto, Ontario English Catholic Teachers' Association).
ORPWOOD, G. & GARDEN, R. (1998) Assessing Mathematics and Science Literacy, TIMSS monograph
No. 4 (Vancouver, Pacific Educational Press).
ORPWOOD, G., BLOCH, M., BARTLEY, A., HERRIDGE, D. & MARKS, M. (1999) Classroom Assessment
in Science and Technology: a resource handbook for teachers (Toronto, Nelson).
PROGRAMME FOR INTERNATIONAL STUDENT ASSESSMENT (PISA) (1999) Measuring Student
Knowledge and Skills: a new framework for assessment (Paris, Organisation for Economic
Co-operation and Development).
ROBERTS, D. (1982) Developing the concept of 'curriculum emphases' in science education,
Science Education, 66, pp. 243–260.
ROBITAILLE, D., SCHMIDT, W., RAIZEN, S., MCKNIGHT, C., BRITTON, E. & NICOL, C. (1993)
Curriculum Frameworks for Mathematics and Science, TIMSS monograph No. 1
(Vancouver, Pacific Educational Press).
SCHWAB, J. (1965) Science as Inquiry, in: J. SCHWAB & P. BRANDWEIN (Eds) Science as Inquiry, pp.
1–103 (Cambridge, Harvard University Press).
SOLOMON, J. (1981) Science and society studies in the curriculum, School Science Review, 82,
pp. 213–220.
SOLOMON, J. & AIKENHEAD, G. (Eds) (1994) STS Education: international perspectives on reform
(New York, Teachers College Press).
STAKE, R. & EASLEY, J. (1978) Case Studies in Science Education (Urbana, Center for Instructional
Research and Curriculum Evaluation, University of Illinois).
STAKE, R. & RAIZEN, S. (1997) Underplayed issues, in: S. RAIZEN & E. BRITTON (Eds) Bold
Ventures: 1. Patterns among Innovations in Science and Mathematics Education, pp. 11153
(Dordrecht, Kluwer).
YAGER, R. (1996) Science/Technology/Society as Reform in Science Education (Albany, SUNY Press).