Assessing speaking in the revised FCE
Nick Saville and Peter Hargreaves
This paper describes the Speaking Test which forms part of the revised
First Certificate in English (FCE) examination produced by the University of
Cambridge Local Examinations Syndicate (UCLES), and introduced for the
first time in December 1996 (see First Certificate in English: Handbook,
UCLES, 1997). The aim is to present the new test as the outcome of a
rational process of test development, and to consider why the new design
provides improvements in the assessment of speaking within the FCE
context.
While examinations by their nature tend to be conservative, the
Cambridge examinations produced over the years have kept pace with
changes in English teaching, so that modifications to the examinations
have taken place in an evolutionary way. FCE, first introduced in 1939
under the title Lower Certificate in English, has been revised
periodically over the years in order to keep pace with changes in
language teaching and language use, and also as part of an ongoing
commitment to test validation. Prior to the revision introduced in 1996,
FCE underwent major revisions in 1984, and before that in 1973. By
changing in this way, it has been possible to continue to achieve positive
impact in the contexts where the examinations are used - especially in
relation to English language learning and teaching around the world.
In this respect, one of the key features of UCLES EFL examinations has
been a focus on the assessment of speaking by means of a face-to-face
speaking test as an obligatory component of the examinations. As part
of the revisions to FCE and CPE, in order to keep up with developments
in the field, UCLES has introduced new procedures and a number of
different speaking-test formats have been used. Across the range of
examinations that are now produced by UCLES, there is currently no
single model for testing speaking. Some examinations, like the
International English Language Testing System (IELTS), employ a
speaking test in the one-to-one format (i.e. with one candidate and one
examiner), and all tests are recorded so that they can be rated by other
examiners at a later stage. Other examinations make use of a group
format in the speaking tests, with more than two candidates assessed
together (as in the Certificate in English for English Language
Teachers - CEELT).
Elicitation and ratings
In designing a face-to-face speaking test such as those employed by UCLES, the test developer has to produce a suitable procedure which involves two main aspects:
a) the elicitation of an appropriate sample of spoken English;
b) the rating of that sample in terms of pre-defined descriptions of
performance in spoken English, whether as a whole, or broken
down into different criteria (e.g. accuracy, range, pronunciation,
etc.).
These aspects in turn depend on two factors:
- the availability of valid and reliable materials and criterion rating scales;
- the development and support of a professional oral examiner cadre.
In designing a speaking test, there are no right or wrong solutions to this
problem; as Bachman and Palmer (1996) point out, an appropriate
(useful) outcome is achieved by balancing the essential qualities of
validity, reliability, impact, and practicality to meet the requirements of
the testing context. Despite the variety of formats which are used for testing speaking, UCLES has taken steps in recent years towards harmonizing its approach across the following examinations, which form the Cambridge 5-Level System:
Cambridge Level 5 - Certificate of Proficiency in English (CPE)
Cambridge Level 4 - Certificate in Advanced English (CAE)
Cambridge Level 3 - First Certificate in English (FCE)
Cambridge Level 2 - Preliminary English Test (PET)
Cambridge Level 1 - Key English Test (KET)
The aim is to establish common features which can be applied
appropriately at the different levels. Some of the more important
features which have been identified in this process have been
incorporated into the revision of the FCE, and can be summarized as
follows:
a) A paired format, based on two candidates and two oral examiners.
b) Of the two oral examiners, one acts as interlocutor, and his or her
most important role is to manage the discourse (i.e. ensure that an
appropriate sample is elicited from each of the paired candidates);
the other acts as assessor, and is not involved in the interaction.
c) There are different phases or parts to the FCE Speaking Test,
which facilitate the assessment of different patterns of interaction,
participant roles, discourse, rhetorical functions, etc.
d) Standardization of formats is achieved partly by the use of
controlled interlocutor frames, and partly by the use of tasks based
on visual stimuli from generic sets appropriate to the level and
nature of the examination.
e) Both the interlocutor and the assessor rate the candidates'
performance, but the interlocutor provides a global/holistic
assessment, while the assessor provides an analytical assessment.
With the revision of FCE, there are now four examinations which make
use of the paired format - KET, PET, FCE, and CAE (CPE is currently
under review, and still retains the option of the one-to-one approach).
The paired format
The decision to use the paired format as the standard model for the main suite speaking tests has been a key feature in balancing the essential test qualities in relation to the contexts where these examinations are used.
The paired test was first used with FCE and CPE as an optional format
during the 1980s. When CAE was introduced in 1991, the paired format
was established as an obligatory feature of one of the main suite tests for
the first time. This was extended to KET in 1993, to the revised PET in
1995, and most recently to the revised FCE in 1996. Before the decision
was made to extend the use of the paired format across the range of
examinations, and especially to FCE, the various alternative formats
were evaluated. In particular, feedback was collected from a wide range
of stakeholders in the tests from around the world (including oral
examiners, teachers, students, and candidates taking the tests). In
addition, a range of validation projects carried out by the UCLES EFL
Division in the 1990s have contributed to a greater understanding of this
kind of assessment procedure (e.g. Lazaraton 1996a, 1996b; Milanovic, Saville, Pollitt, and Cook 1996; Young and Milanovic 1992).
One of the major advantages of the paired format is the use of two
examiners to assess a candidate. This adds to the fairness of the
assessment, and helps to make candidates feel reassured that their mark
does not just depend on one person. The paired format also allows more
varied patterns of interaction during the examination; whereas in the
one-to-one model there is only one interaction pattern possible (i.e.
interaction between one examiner and one candidate), the paired format
provides the potential for various interaction patterns between each
candidate and the examiner, and between the candidates themselves. In
addition, the paired format has the potential for positive washback, in
encouraging more interaction between learners in the classroom.
Any test format presents test developers with a range of potential
problems and issues which need to be addressed, and the paired format
of the speaking tests is no exception. Critics of the paired format are
often concerned with issues relating to the pairing of the candidates. It is
argued, for example, that the paired format may not provide each
candidate with an equal opportunity to perform to the best of their
ability, or that the pairings may influence the assessment (e.g. due to
mismatch of language level - a good candidate paired with a weaker
one - or if one candidate is paired with another of a different age,
gender, or nationality).
Many of these concerns cannot be addressed with definitive answers.
However, the potential problems need to be seen within the context of
the overall design of the examination, and balanced against the positive
advantages. For example, UCLES has attempted to address the issue of
how much spoken language is produced by each candidate:
a) by paying close attention to the design of the different parts of the
tests;
b) by providing examiners with an interlocutor frame to follow whilst
administering the examination.
These features, together with comprehensive training for oral examiners
(described below), help to ensure that a balanced sample of speech is
elicited, and that each candidate receives an equal opportunity to
perform to the best of his or her ability during the examination.
Moreover, the EFL Division at UCLES has been conducting research
since 1992 on the discourse produced in paired-format speaking tests,
and work specifically related to the speech of candidates in FCE has
been going on since 1995. The purpose of this research is to better
understand the features of the language produced during a paired-format test. Initial findings related to the revised FCE suggest that the
features of candidate language predicted by the test specifications were
present in the samples which were analysed. In this regard, it is
important to understand how the format of the revised FCE Speaking
Test was arrived at, and the steps taken to ensure that the assessment is
standardized. The second half of this paper describes in more detail the
features of the revised FCE Speaking Test, and the way that oral
examiners are trained and co-ordinated to carry out the test procedures,
and to make appropriate ratings.
Features of the revised FCE Speaking Test
The initial context for the most recent revision of FCE was provided by the existing uses of the examination, and the nature of the current candidature. What was already known of existing standards - the
expected level of performance by FCE candidates (centred on passing
candidates with a grade C) - provided the background for the revised
assessment criteria, and the application of the rating scales. The rating
scales themselves (for use by oral examiners) were redeveloped in
relation to the harmonized approach to the assessment of speaking,
described above.
Within this approach, all criteria used in the assessment were defined,
and related to a model of Communicative Language Ability (CLA). The
criteria and scales for revised FCE are derived from the same model which underpins the revision project as a whole, based on the developments in this area during the 1980s, e.g. the work of Canale and Swain (1980), Bachman (1990), and the Council of Europe specifications for Waystage and Threshold (1990). See Figure 1.

Figure 1: Spoken language ability - Language competence, comprising Grammatical competence (syntax, morphology, vocabulary, pronunciation), Discourse competence (rhetorical organisation, coherence, cohesion), and Pragmatic competence (e.g. sensitivity to illocution); and Strategic competence, comprising interaction skills and non-verbal features of interaction.
The revised FCE has five assessment criteria in all, four analytical and one global: grammar and vocabulary, discourse management, pronunciation, interactive communication, and global achievement.
Test format and task features
In the revised FCE, the Speaking Test consists of four parts, each of which focuses on a different type of interaction: between the interlocutor
and each candidate, between the two candidates, and among all three.
The patterns of discourse vary within each part of the test, and
candidates are encouraged to prepare for the Speaking Test by
practising talking individually, and in small groups with the teacher
and with peers. The aim is to help them to be aware of, and to practise,
the norms of turn-taking, and the appropriate ways of participating in a
conversation, or taking up a topic under discussion. This is seen as one
aspect of positive impact that the test can achieve.
Oral examiners make use of a task features specification which
summarizes the features of the task which are appropriate to the level and purpose of the examination. Each part of the test is a separate task,
with the following specific features:
interaction pattern (examiner to candidate(s); candidate to candidate,
etc.), input (verbal and/or visual), and output by candidates.
The expected output of the candidates is predicted from the combination of features for each task, and is judged in relation to their
performance in these tasks, which have been designed according to the
level of FCE, and in order to provide an appropriate level of difficulty
for the typical FCE candidature. As noted above, the tasks include
different interaction patterns, different discourse types (short turn, long
turn, etc.), and have features such as turn-taking, collaborating,
initiating/responding, and exchanging information. Examples of other
task features include functions such as describing and comparing, stating
and supporting an opinion, agreeing and disagreeing, speculating,
expressing certainty and uncertainty. This is summarized in Table 1.

Table 1: Task features (discourse features and functions describe the expected candidate output)

Part 1 - Interview (3 minutes)
Interaction pattern: the interlocutor interviews the candidates
Input: verbal questions
Discourse features: responding to questions; expanding on responses
Functions: giving personal information; talking about present circumstances; talking about past experience; talking about future plans

Part 2 - Individual long turn (4 minutes)
Interaction pattern: the interlocutor delegates an individual task to each candidate
Input: visual stimuli with verbal rubrics
Discourse features: sustaining a long turn; managing discourse (coherence and clarity of message; organization of language and ideas; accuracy and appropriacy of linguistic resources)
Functions: giving information; expressing opinions, e.g. through comparing and contrasting; explaining and giving reasons

Part 3 - Two-way collaborative task (3 minutes)
Interaction pattern: the interlocutor delegates a collaborative task to the pair of candidates
Input: visual/written stimuli, with verbal rubrics
Discourse features: turn-taking (initiating and responding appropriately); negotiating
Functions: exchanging information and opinions; expressing and justifying opinions; agreeing and/or disagreeing; suggesting; speculating

Part 4 - Three-way discussion (4 minutes)
Interaction pattern: the interlocutor leads a discussion with the two candidates
Input: verbal prompts
Discourse features: initiating and responding appropriately; developing topics
Functions: exchanging information and opinions; expressing and justifying opinions; agreeing and/or disagreeing

Each task has its own focus:
Part 1 - Interview
The interlocutor directs the conversation, by asking each candidate to
give some basic personal information about him or herself. The
candidates do not need to talk to each other in this part of the test,
though they may if they wish.
Part 2 - Long turn
Each candidate is given the opportunity to talk without interruption on
his or her own for about one minute. Each candidate is asked to
compare and contrast two colour photographs, commenting on the
pictures, and giving some personal reaction to them. They are not
required to describe the photographs in detail.
Part 3 - Two-way collaborative task
The candidates are provided with a visual stimulus (one or several
photographs/line drawings/computer graphics, etc.) to form the basis for
a task which they attempt together. Sometimes the candidates may be
asked to agree on a decision or conclusion, whereas at other times they
may be told that they may agree to disagree. In all cases, it is the
working towards the completion of the task that counts, rather than the
actual completion of the task.
Part 4 - Three-way discussion
The interlocutor again directs the conversation by encouraging the
candidates to broaden and discuss further the topics introduced in Part 3.
In the information about the test which is provided to candidates, it is
made clear that they must be prepared to provide full but natural
answers to questions asked either by the interlocutor or the other
candidate, and to speak clearly and audibly. They should not be afraid to
ask for clarification if they have not understood what has been said. If
misunderstandings arise during the test, candidates should ask the
interlocutor, or each other, to explain further. Obviously, no marks are
gained by remaining silent, and equally, no marks are lost for seeking
clarification on what is required. On the contrary, this is an important
feature of strategic ability, which is one of the criteria for assessment
(under the interactive communication scale).
While it is the role of the interlocutor, where necessary, to manage or
direct the interaction, ensuring that both candidates are given an equal
opportunity to speak, it is also the responsibility of the candidates to
maintain the interaction as much as possible. Candidates who are able to
balance their turns in the interchange will utilize to best effect the
amount of time available, and so provide the oral examiners with an
adequate amount of language to assess.
From the point of view of ratings, an advantage of the paired format is
that two independent ratings are obtained for each candidate, thus
making the examination fairer. In the revised FCE both the assessor and
interlocutor record marks using the same criteria, although the two
examiners are expected to have slightly different perspectives on the
performance due to their different roles - the interlocutor as participant,
and the assessor as observer. To reflect this, the revised FCE makes use
of two types of rating scale: a set of analytical scales derived from the
criteria in the model of Communicative Language Ability, and a global
scale which combines the criteria in the analytical scales in an
appropriate way. This rating procedure involves the assessor marking
each candidate on the four analytical scales as the test is in progress, and
at the end, the interlocutor giving a single global score for each
candidate based on the global achievement scale. There is no
requirement for the examiners to discuss and agree the marks, and
the final assessment is derived from the two ratings when the mark
sheets are returned to Cambridge.
Examiner training
Successful elicitation and accurate ratings are to a large extent
dependent on the knowledge and ability of the oral examiners. In the
first instance, careful test design can help to ensure that the examiners
are likely to find the elicitation procedures and rating scales easy to
apply. However, the successful functioning of a speaking test, such as
that used in the revised FCE and most other Cambridge examinations,
relies heavily on a system for training and standardizing the oral
examiners. For UCLES this is a major undertaking, as there are
currently about 7,000 approved UCLES EFL oral examiners around the
world involved in conducting one or more of the Speaking Tests for the
Cambridge EFL examinations. The major objectives in regard to the
performance of these oral examiners are that:
a) they consistently apply the Speaking Test procedures to obtain
representative, valid samples of the candidates' spoken English in
accordance with the test specifications;
b) they rate the samples of spoken English accurately and consistently,
in terms of the pre-defined descriptions of performance, using the
rating scales provided by UCLES.
Over the years UCLES has developed a two-pronged approach to
ensuring these objectives can be met, based, firstly, on a network of
professionals with various levels of (overlapping) responsibility, and,
secondly, on a set of procedures which apply to each professional level.
In the network of professionals there are three levels, in addition to
UCLES' own staff. At the operational level there are the oral
examiners. At the next level up, in countries where there are sufficient
numbers of oral examiners to merit it, team leaders are engaged by local secretaries, with responsibility for the professional supervision of oral examiners, in a ratio of about one team leader to between five and 30
oral examiners, depending on such factors as distribution of oral
examiners, location of centres, etc. Finally, in countries where the
number of team leaders (and hence oral examiners) merits it, senior team
leaders have been appointed by UCLES to supervise team leaders in an
average ratio of one senior team leader to 15 team leaders. This forms a
hierarchy of responsibilities. See Figure 2.

Figure 2: The examiner hierarchy - oral examiners are supervised by team leaders, who are supervised by senior team leaders, who in turn report to UCLES.
The levels in this hierarchy are not sealed off from each other: it is a
requirement that team leaders and senior team leaders must also be
practising oral examiners, in order to ensure that they can draw on their
experience when it comes to dealing with the concerns of oral
examiners.
The set of procedures which regulate the activities of these three
professional levels is summarized by the acronym R-I-T-C-M-E, where
the initials stand for:
Recruitment, Induction, Training, Co-ordination, Monitoring, and
Evaluation.
Each of these procedures is defined by a list of Minimum Professional
Requirements (MPRs) appropriate to the level of professional
responsibility. These MPRs set down the minimum levels and standards
(for recruitment, induction programmes, etc.) which must be achieved in
order to meet the professional requirements of administering Cambridge EFL Speaking Tests, and to sustain a fully effective team leader
system.
The first two procedures covered by R-I-T-C-M-E, recruitment and
induction, typically apply only once to an applicant oral examiner for a
given examination. The remainder of the procedures are recurrent, and
to some extent cyclical for each examination, in so far as the outcome of
monitoring and evaluation feeds into training and co-ordination.
After initial training of examiners, standardization of assessment is maintained by annual co-ordination sessions for oral examiners approved for the relevant examination, and by monitoring visits
to centres by team leaders. During co-ordination sessions, examiners
watch and discuss sample speaking tests recorded on video, and then
conduct practice tests with volunteer candidates in order to establish a
common standard of assessment. The sample tests on video are selected
by UCLES to demonstrate a range of task types and different levels of
competence, and are pre-marked by a team of experienced assessors. In
this context, monitoring and evaluation refer to both the test procedures
(e.g. whether the procedure elicits an appropriate sample), and to the
performance of the oral examiners. This latter kind of monitoring and
evaluation forms part of the human resource appraisal system, which is
necessary to guarantee the quality of the assessments. During monitoring, team leaders complete evaluation sheets for the oral examiners being monitored; they discuss results with the oral examiners themselves, and with the local secretaries as part of a planning/review
meeting. The evaluation sheets are then sent to the senior team leaders,
and finally on to Cambridge for analysis. In addition, greater use is now
being made of audio recordings to monitor both candidate output and
examiner performance.
Conclusion
This paper has described the Speaking Test in the revised FCE in
relation to the format of the test, and the way in which assessments are
made, focusing in particular on the role of the oral examiners. While no
test achieves a perfect balance of necessary qualities, it is believed that
the current balance, with the recent revisions, takes several steps
forward in terms of improvements over earlier solutions. An ongoing commitment to validation involving data collection, monitoring, and evaluation will ensure that this evolutionary process of change continues. In this way, as our knowledge of the complexities of spoken language grows, further revisions can be expected in the future.
References
Bachman, L. F. 1990. Fundamental Considerations in Language Testing. Oxford: Oxford University Press.
Bachman, L. F. and A. S. Palmer. 1996. Language Testing in Practice. Oxford: Oxford University Press.
Canale, M. and M. Swain. 1980. 'Theoretical bases of communicative approaches to second language teaching and testing'. Applied Linguistics 1/1: 1-47. Oxford: Oxford University Press.
van Ek, J. A. and J. L. M. Trim. 1990. Threshold Level 1990. Strasbourg: Council of Europe.
van Ek, J. A. and J. L. M. Trim. 1990. Waystage 1990. Strasbourg: Council of Europe.
Lazaraton, A. 1996a. 'Interlocutor support in oral proficiency interviews: the case of CASE'. Language Testing 13: 151-72. London: Edward Arnold.
Lazaraton, A. 1996b. 'A qualitative approach to monitoring examiner conduct in the Cambridge Assessment of Spoken English (CASE)' in Studies in Language Testing 3: Performance testing, cognition and assessment: Selected papers from the 15th Language Testing Research Colloquium (LTRC): 18-33. Cambridge: Cambridge University Press/UCLES.
Milanovic, M., N. Saville, A. Pollitt, and A. Cook. 1996. 'Developing rating scales for CASE: theoretical concerns and analyses' in Validation in Language Testing. Clevedon: Multilingual Matters.
University of Cambridge Local Examinations Syndicate. 1997. First Certificate in English: Handbook. Cambridge: University of Cambridge Local Examinations Syndicate.
Young, R. and M. Milanovic. 1992. 'Discourse variation in oral proficiency interviews'. Studies in Second Language Acquisition 14: 403-24. Cambridge: Cambridge University Press.
The authors
Nick Saville has been Group Manager for Test
Development and Validation within the EFL Divi-
sion of the University of Cambridge Local Examina-
tions Syndicate (UCLES) since 1994. His own
research interest is in the development and valida-
tion of procedures for oral assessment, and he was a
member of the UCLES development team that
worked on the revision of the FCE Speaking Test.
E-mail: <saville.n@ucles.org.uk>
Peter Hargreaves joined UCLES as Director of the
EFL Division after working for the British Council
for over twenty years. His early background in ELT
was in teacher training, but he moved into testing
with the Council after obtaining his doctorate in the grammar of English. He now heads a team
of about 70 staff at UCLES, working on the
Cambridge EFL examinations and Integrated
Language Teaching Schemes.
