
System 80 (2019) 60–72


“I know English”: Self-assessment of foreign language reading and writing abilities among young Chinese learners of English

Huan Liu*, Cindy Brantmeier

Department of Education, Campus Box 1183, Washington University in St. Louis, St. Louis, MO, 63130, USA

Article history:
Received 18 March 2018
Received in revised form 10 September 2018
Accepted 19 October 2018
Available online 22 October 2018

Keywords:
Self-assessment (SA)
English reading and writing
Young language learners

Abstract

A growing number of studies have examined self-assessment (SA) of language abilities; however, SA has not been investigated extensively in the context of language teaching and learning in China. This study aims to explore SA of reading and writing abilities among young Chinese learners of English, and the relationship between SA and objective tests of reading and writing. 106 Chinese learners of English (ages 12 to 14) completed a Reading Comprehension Test (captured by Free Recall, Sentence Completion, and Multiple-Choice Questions), a Writing Task (a picture-based writing prompt), and criterion-referenced SA items. Correlational analyses revealed a significant correlation between scores of SA reading and reading comprehension. The correlation between scores of SA writing and writing production was also found to be significant. Findings indicate that young learners tend to self-assess their foreign language reading and writing abilities accurately. Findings add empirical information useful for a better understanding of the trajectories of SA with young learners. Young learners' self-perceived strengths and weaknesses in reading and writing abilities are presented. The potential to use SA as a tool to promote foreign language instruction for young learners is explored.

© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Learning a second (L2) or foreign language (FL) in school is becoming a requirement for young learners in more and more countries (Rea-Dickins, 2000; Zangl, 2000), where young learners are defined as being between the ages of five and twelve (McKay, 2006). This trend has amplified the need for research on language assessment and its corresponding washback effects (Wolf & Butler, 2017). Though current research on language assessment for young learners has limitations in both theoretical and empirical respects (Butler & Zeng, 2014), the consensus is that assessment for this unique population should be cognitively and socially appropriate to learners' developmental stages and maximally tailored to their English learning contexts (Butler, 2016; Wolf & Butler, 2017). It should be conceptualized as a tool for monitoring language learning and promoting positive learning attitudes (Weigle, 2002).
Among a variety of assessment formats, alternative assessment accompanied by a component of self-assessment (SA)
has been highlighted for appropriate use as a metacognitive tool. SA is an “internal” assessment from learners’ perspectives
to self-rate their knowledge and skills (Oscarson, 1989). Research has found that SA helps learners make decisions about

* Corresponding author.
E-mail addresses: huan.liu@wustl.edu (H. Liu), cbrantme@wustl.edu (C. Brantmeier).

https://doi.org/10.1016/j.system.2018.10.013

their language abilities and set learning goals and objectives (Chapelle & Brindley, 2010; Chen, 2008). In practice, SA has been adopted by the Common European Framework of Reference for Languages (CEFR), the European Language Portfolio (ELP), and the Bergen “Can-Do” project (see Hasselgren, 2000) as an instrument to capture and understand language performance. In Japan and South Korea, the implementation of SA in classrooms has been promoted by national policy (Butler, 2018; Butler & Lee, 2006), and various types of SA have been developed and incorporated in textbooks for young learners (Butler, 2018).
Since the implementation of English instruction in secondary schools in China in the late twentieth century (Hu, 2002; Wang & Lam, 2009), Chinese students have been predominantly evaluated through large-scale standardized exams, which has led to a culture of teacher-centered and examination-oriented pedagogy (Carless, 2005). Students are seldom encouraged to independently self-assess their knowledge and skills. One reason may have to do with the power dynamic between teachers and students, as teachers may regard SA as a threat to their authority (Towler & Broadfoot, 1992; Butler & Lee, 2010). Another reason may stem from the lack of research on SA in the Chinese context. Teachers have concerns about the validity and reliability of SA instruments, as well as how best to implement SA (Butler, 2018). Without adequate justification, teachers doubt the rationale for incorporating SA in classroom settings.
The present study attempts to explore SA of reading and writing abilities among young Chinese learners of English and its relationship to objective tests of reading and writing; the findings may justify implementing SA as a tool to promote language learning for young learners. The findings may also provide empirical insights useful for exploring the trajectories of SA among young language learners.

2. Literature review

2.1. FL reading and writing among young language learners

Reading and writing in a FL is not easy for young learners. Koda (2007) defines reading as a “product of a complex
information-processing system” (p. 3) that includes a combination of three components: decoding, text-information building,
and reader-model construction. Decoding is how readers extract written text information based on their linguistic knowl-
edge. Text-information building is how readers organize the information that has been extracted from the written text.
Finally, reader-model construction is readers’ syntheses and interpretations of the written text based on their background
knowledge. Reading is a process complicated by text-level variables such as topic familiarity, genre, and text organization, as
well as other variables beyond the text level such as linguistic skills, motivation, affect, and learner characteristics (Alderson,
2006; Bernhardt, 2011).
Similar to reading, writing is a process that involves cognitive, social, and cultural dimensions. Writing is an interplay
between linguistic knowledge and communicative competence. It is a social act, as it is goal-directed and serves to communicate with a particular audience (Grabe & Kaplan, 1996). Writing is also a cultural phenomenon, as cultural norms
influence variations in writing patterns (Grabe & Kaplan, 1996) and coherence of texts (Leki, 1992). The cognitive load for
writing is heavy as writers need to engage in the time-consuming process of pre-writing, writing, revising, and editing
(Weigle, 2002). The impact from first language (L1) rhetorical knowledge also complicates the task of writing in a FL in
aspects such as communicative purposes, register use, and intertextuality (Silva & Matsuda, 2002).
With limited cognitive and social development, young learners can be overwhelmed by the aforementioned complexities
of FL reading and writing. Young learners must exert great efforts to master the knowledge and skills needed for successful
reading and writing in a FL, which is especially challenging while they are simultaneously developing their cognitive and
literacy skills in L1 (McKay, 2006). The discrepancy between young learners' limited cognitive development and the complex skills needed for successful FL literacy development heightens the importance of FL reading and writing assessments that are cognitively and socially appropriate for this population.

2.2. Assessing young language learners

When assessing young language learners, two critical factors need to be taken into account. The first is the developmental stage of young learners: they have short attention spans, non-linear information-processing patterns, and inadequate L1 literacy skills, which to a large extent determine the format and the degree of autonomy of assessment (Butler, 2016). The second is affect. Compared with adults, young learners are more vulnerable in assessment (Chik & Besser, 2011) and lose the motivation to learn more easily (Haggerty & Fox, 2015). Young learners' experience with assessment can significantly impact their future learning motivation and outcomes (Moss, 2013); therefore, ensuring that young learners have positive experiences with assessment is critical for successful language learning (Wolf & Butler, 2017).
Given the above unique characteristics of young learners, assessment for this population should not focus solely on the
assessment of language learning outcomes; instead, it should be a tool through which they can develop their language
abilities, monitor their language learning process, and experience positive emotions and motivation to learn (Weigle, 2002).
Multiple assessment formats must be employed to capture the full range of language performance of young learners in
diverse contexts (Rea-Dickins & Gardner, 2000). Assessment tasks should be varied and engaging to highlight what young learners can do in the target language (Hasselgren, 2000). Individual language profiles need to be built for a better understanding of young learners' language performance (Rea-Dickins, 2000).

In China, young students are evaluated by numerous large-scale and high-stakes standardized exams developed by provincial or municipal education authorities for different purposes (Jin, Wu, Alderson, & Song, 2017). From the
perspective of pedagogical implications, imposing standardized exams on young learners, unfortunately, does not enhance
classroom teaching (McKay, 2006) and fails to bring the expected washback effect (Qi, 2005, 2007). Alternative assessment
(e.g., performance assessment, self-assessment, language portfolio) is particularly needed in the Chinese context so that
teachers can expand their understanding of learners’ learning progress and then respond to individual learner differences
effectively (Rea-Dickins, 2000; Rea-Dickins & Gardner, 2000).

2.3. SA of language abilities

SA, as an alternative assessment, has been defined as the “procedures by which learners themselves evaluate their lan-
guage skills and knowledge” (Bailey, 1998, p. 227). It has received a great deal of attention given its multiple benefits. Research
has found that SA raises self-awareness of learning (Babaii, Taghaddomi, & Pashmforoosh, 2016; Oscarson, 1989), promotes
learner autonomy (Dann, 2002; Oscarson, 1989), improves self-regulated learning (Butler, 2016, 2018), and allows learners to assess themselves in an interactive and low-anxiety way (Bachman & Palmer, 1996). It has also been found to
be positively associated with learner confidence and performance (Butler & Lee, 2010; De Saint-Leger, 2009; Little, 2009). It
narrows the gap between learner perception and actual performance (Andrade & Valtcheva, 2009), and minimizes mismatches between learner assessment and teacher evaluation (Babaii et al., 2016). SA also serves a number of other purposes,
for instance, expanding the range of assessment (Oscarson, 1989), supporting a learner-centered curriculum (Little, 2005),
and fostering the perception of assessment as a responsibility shared by both teachers and learners (Little, 2005; Oscarson,
1989). As Butler and Lee (2010) summarized, SA is a useful tool to promote language education by fostering self-regulated
learning and learner-centered instruction.
Validity and reliability of the instruments are the biggest concerns with SA (Ashton, 2014; Patri, 2002). One approach to
address this concern is to examine the relationships between SA and objective language performance (Butler, 2018).
Empirically, a number of studies have revealed positive correlations between SA of language abilities and objective language
tests (Ashton, 2014). Ross's (1998) meta-analysis indicated that the correlation coefficients between SA scores and objective performance ranged from 0.52 to 0.65 across the four language skills (reading, writing, listening, and speaking), effect sizes that are medium to large according to Plonsky and Oswald's (2014) scale for interpreting correlations and effect sizes in second language research.
An extensive review of prior research shows that whether SA is an accurate predictor of language abilities varies by several factors. The first is the SA construct (e.g., criterion-referenced or not, contextualized or decontextualized). In general, criterion-referenced and more contextualized SA items have stronger correlations with objective test performance. Butler and Lee (2006) compared two types of SA (decontextualized off-task SA and contextualized on-task SA) with the Cambridge Young Learners' English Test and with teachers' assessment among young Korean learners of English in a primary school. Findings indicated that both the on-task SA and the off-task SA were significant predictors of the Cambridge test as well as teachers' assessment; however, the on-task SA was the better predictor. The findings were echoed by Butler (2018), who explored SA among young Japanese learners of English in a primary school. The second factor is the assessment task. Brantmeier (2005) and Brantmeier and Vanderplank (2008), focusing on L2 reading by adult learners of Spanish, revealed that the accuracy of SA reading varied by the task type (written recall, sentence completion, and multiple-choice questions) used to capture reading comprehension, findings that underlined the need to incorporate a combination of assessment tasks for a better understanding of L2 reading comprehension.
SA training and experience also played a critical role. Dolosic et al. (2016) examined the relationship between criterion-
referenced SA items and oral production in French in a language summer camp in the U.S. A pre- and post-test of SA
speaking showed that learners who could not accurately self-assess at the beginning of the program were better able to self-
assess at its end. It was concluded that SA could be used as a metacognitive tool for learners to identify their strengths and
weaknesses in language learning if they could have consistent SA practice. Butler and Lee (2010) scrutinized SA among Korean
learners of English in sixth grade and found that they were able to accurately self-assess their English language abilities, and
could self-assess even better after SA practice and training for a whole semester. The authors stressed that SA has a
constructive role in the language learning environment in Asian classrooms.
Other variables associated with the accuracy of SA are L2/FL proficiency level (Alderson, 2006; Heilenman, 1990;
Sahragard & Mallahi, 2014), specific language skills (Brantmeier, 2005; Ross, 1998; Wan-a-rom, 2010), and the target
language itself (Ashton, 2014). A proficiency threshold exists beyond which learners can self-assess their language abilities more accurately, and higher-proficiency learners can self-assess their abilities more accurately than lower-proficiency learners can (Brantmeier et al., 2012). The correlation between SA and receptive skills (reading and listening) was
stronger than that between SA and productive skills (writing and speaking) (Ross, 1998). Furthermore, SA patterns
varied by different target languages, for instance, L2 learners of German and Japanese tended to overestimate their
reading proficiency, whereas L2 learners of Urdu tended to underestimate and were more aware of what they could not
do (Ashton, 2014).

3. The present study

3.1. Research questions

Though research on SA of language abilities has expanded recently, there is a dearth of research examining SA with young Chinese learners of English. An investigation of SA is also a foundation for realizing any benefits of implementing SA in classrooms in China. To address this lacuna, the present study explores the following four research questions:

1) What are the self-ratings of English reading and writing abilities of young Chinese learners of English?
2) Is there a relationship between scores of SA reading ability and three different reading comprehension tasks (free recall,
sentence completion, and multiple-choice questions)?
3) Do scores of SA reading ability correlate with the scores of the overall reading comprehension?
4) Do scores of SA writing ability correlate with the scores of writing production?

3.2. Participants and context

106 Chinese learners of English (56 males and 50 females) from two seventh-grade classes at a public middle school in a metropolitan area in China participated in the present study at the end of the spring 2016 semester in June. The participants were between 12 and 14 years old (Mean = 12.93, SD = 0.51). More demographic details are presented in the section Demographic Questionnaire. The entire data collection was completed on a weekday in a regular classroom, with a total time commitment of 2 h. An English teacher and a researcher of the present study monitored the entire process of data collection.
The public school where the study was conducted was large, with approximately 400–500 students in each grade (seventh, eighth, and ninth). The students received 90 min of formal English instruction each day from Monday to Friday, for a total of 450 min per week. The English book series used is Go For It!, developed by the People's Education Press in China. The book series has two volumes comprising 12 learning units each, and aims to develop all four skills: listening, speaking, reading, and writing. The assessment format is characterized by traditional standardized exams, including midterm and final exams developed by the school. Other standardized exams are administered by the Department of Education at the provincial or municipal level (Jin et al., 2017). All exams are comprehensive, including sections on listening, multiple-choice questions (focusing on grammatical knowledge), a cloze test, reading comprehension (all multiple-choice questions), sentence/conversation completion, and prompt writing (a total of more than 70 words required).
According to the 2011 English Curriculum Standards for Compulsory Education issued by the Department of Education
in China, students should have reached a Level-Three English proficiency by the end of the seventh grade. At that level,
students can read and comprehend simple stories in English, summarize reading texts, and read over 40,000 words
outside the classroom. According to the curriculum, the required vocabulary size for junior secondary school is 1500
words in 941 word families (Jin et al., 2017). It is noteworthy that these are all guiding principles for English education,
and no national language proficiency scales exist in China (Zeng & Fan, 2017). The China Standards of English (CSE), a set
of English proficiency standards designed to meet the specifications of the Chinese context, was proposed by the State
Council of China in 2014; however, it was not released until February 2018. CSE Level 3 corresponds to the proficiency
level at junior high schools in China. The instruments utilized in the present study were developed in spring 2016 with no
reference to CSE.

3.3. Materials

3.3.1. Reading Comprehension Test


The Reading Comprehension Test (see Appendix A for sample test) consisted of two passages. A summary of the two
reading passages is presented in Table 1. A total of 1.5 h was given to participants to complete the test. Each reading passage
was followed by three tasks: Free Recall, Sentence Completion, and Multiple-Choice Questions.
Free Recall, with no tester interference or retrieval cues (Bernhardt, 1991), asked participants to write down in English as
much as possible about the reading passage without looking back at the passage. Participants were told that after they
finished reading the passages, they should turn the page over and complete the written recall, and they were not allowed to

Table 1
Summary of reading passages.

           Title          Number of Words  Number of Sentences  Number of Embedded Clauses  Number of Pausal Units
Passage 1  Ann and Frank  180              13                   5                           14
Passage 2  A Restaurant   210              14                   9                           14

look back at the reading. A teacher and one researcher of this study were present during data collection to ensure that no participants looked back at the reading while completing tasks. The number of pausal units was used as the benchmark when scoring free recall. A pausal unit is a unit that has a “pause on each end of it during normally paced oral reading” (Bernhardt, 1991, p. 208). To determine pausal units for each passage, a native speaker of English read the passage out loud, and another native speaker marked the natural pauses throughout the oral reading. This process was done twice with different native speakers of English, and results were compared to determine the final pausal units in each passage. A scoring matrix following Bernhardt (2011) was created as the codifying system to reflect the agreed pausal units, and readers were scored on the number of correct pausal units recalled (see Bernhardt, 2011, p. 104, for a detailed example). Each of the two reading passages had 14 pausal units. The pausal unit was utilized because researchers (e.g., Brantmeier et al., 2012; Bernhardt, 1991) have found it to be the most effective protocol when compared with others such as the idea unit, a recall protocol highlighting the idea, proposition, and constituent structure (Riley & Lee, 1996).
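To make the scoring procedure concrete, here is a minimal sketch in R (the software used for the study's analyses) of how a pausal-unit scoring matrix can be represented and summed into free-recall scores. The data are simulated for illustration; this is an assumed reconstruction of the procedure described above, not the authors' actual scoring script.

```r
# Minimal sketch of pausal-unit scoring for free recall (simulated data).
# Rows = participants, columns = the 14 pausal units of one passage;
# a cell is 1 if the rater judged that unit correctly recalled, else 0.
set.seed(1)
n_participants <- 106
n_units <- 14
recall_matrix <- matrix(rbinom(n_participants * n_units, 1, 0.4),
                        nrow = n_participants, ncol = n_units)

# A participant's free-recall score for the passage is the number of
# pausal units correctly recalled.
free_recall_scores <- rowSums(recall_matrix)
summary(free_recall_scores)
```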
The Sentence Completion task (5 items for each passage; one point for each item) asked participants to complete
sentences according to the reading passages, and the Multiple-Choice Questions (5 items for each passage; one point for
each item) asked participants to select the one correct answer from four choices. Both tasks set certain limits on answers. Sentence completion was open-ended, but all possible answers were determined in advance by native speakers of English and used in the scoring procedure. Multiple-choice answers were likewise pre-determined, and participants were scored either right or wrong on each question. Adopting a
combination of the three tasks (Free Recall, Sentence Completion and Multiple-Choice Questions) allows participants to
give “a full range of responses, i.e., a score and some insight about the reader” (Bernhardt, 2011, p. 103). Quantitative
measures alone fail to elicit complete understandings of learners’ reading comprehension (Bernhardt, 2011).
The present study chose gender-neutral reading topics to eliminate a possible interaction effect between gender and passage content on reading comprehension (Brantmeier, 2003a, 2003b). After participants finished all three tasks for each reading passage, one question about topic familiarity was presented. Participants were asked to self-rate how familiar they were with the topic of the reading passage on a 5-point Likert scale: (1) not familiar at all, (2) not very familiar, (3) somewhat familiar, (4) very familiar, and (5) really familiar. Preliminary independent-samples t-test analyses (see Appendix B) indicated no significant gender difference in topic familiarity for either reading passage; thus, concerns about a possible impact of gender on topic familiarity can be set aside.

3.3.2. Writing task


The Writing Task (see Appendix C) asked participants to describe the characters in a picture and create a story about what might happen between them. Participants were asked to write the story down on a sheet of paper provided. This picture-based writing prompt was presented in both English and Chinese. Jacobs, Zinkgraf, Wormuth, Hartfield, and Hughey's (1981) criteria were used for scoring five subscales: content, organization, vocabulary, mechanics, and language use. The maximum score for the task was 20, with each subscale proportioned to five points. Two trained and qualified raters scored the writing task, and the average of their scores was the final score for each participant. Inter-rater reliability (Cronbach's alpha) was 0.90. Thirty minutes was given to complete the task, and a minimum of 100 words was required.
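As an illustration of how the reported inter-rater reliability and final scores could be computed in R, the sketch below applies the psych package's alpha() function to two simulated rater-score vectors; it is an assumed reconstruction under simulated data, not the authors' actual code.

```r
# Sketch: inter-rater reliability and final writing scores (simulated).
library(psych)

set.seed(2)
rater1 <- round(pmin(pmax(rnorm(106, mean = 10.75, sd = 2.9), 0), 20), 1)
rater2 <- round(pmin(pmax(rater1 + rnorm(106, sd = 1.2), 0), 20), 1)

# Cronbach's alpha across the two raters (the study reports 0.90).
psych::alpha(data.frame(rater1, rater2))$total$raw_alpha

# The final writing production score is the average of the two raters.
writing_production <- (rater1 + rater2) / 2
```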

3.3.3. SA items
SA items on reading and writing abilities were completed before the reading and writing tests. In China, the education management structure at different provincial and municipal levels makes it difficult to define learning objectives and outcomes consistently in the English education curriculum across educational stages (Jin et al., 2017). There is no consensus on what constitutes English reading and writing abilities at different proficiency levels, which makes it challenging to construct SA items tailored to the Chinese context. The SA items used in the present study were therefore adapted and modified from multiple sources, including the CEFR self-assessment grids (Reading and Writing), the ELP, and the SA questionnaires from Brantmeier et al. (2012). The CEFR self-assessment grids and the ELP illustrate proficiency levels using “Can Do” statements whose descriptors focus on what language learners can do with the target language; both have been widely recognized as useful tools for identifying the language skills of individuals. The SA items from Brantmeier et al. (2012), modified from the DIALANG project, have been utilized in a number of empirical studies, and their validity has been supported.
To best tailor the SA items to the Chinese context, the present study followed Hasselgren's (2000, 2005) proposal on how to adapt the CEFR and ELP for specific uses. Key considerations in developing the present SA items included the topics and text types that junior secondary students are familiar with, the curriculum guide for English education, the English textbooks and materials used, and the general teaching practices for English reading and writing at junior secondary schools in China. The development of the SA items also involved consultation with experts and English teachers from the school where the study was conducted. All the SA items are criterion-referenced. A total of eleven SA reading items and fourteen SA writing items were constructed. Participants were asked to indicate how they would rate their English reading and writing abilities in the specific situation presented in each item. Each item was rated on a 5-point Likert scale: “1 (Strongly Disagree)”, “2 (Disagree)”, “3 (Neutral)”, “4 (Agree)”, and “5 (Strongly Agree)”. Participants circled the appropriate rating accordingly. All items were translated by a professional translator and presented to the participants in L1 Chinese.


3.3.4. Demographic Questionnaire

Each participant completed a Demographic Questionnaire (in L1 Chinese) on which they self-reported their name, age, gender, number of years studying English, years living in an English-speaking country, enjoyment of English learning, reasons for learning English, whether their parents speak English, and their perceived English proficiency level from five options: Novice, Intermediate, Advanced, Superior (Native-like), and Distinguished (Native). Descriptive statistics from the questionnaire showed that no participants had lived in an English-speaking country, and no participants' parents spoke English. Ninety-one (85.84%) of the participants indicated that they enjoyed English learning. The most frequently reported reasons for learning English were that it is a compulsory course and that learning English is important. Ninety-eight (92.45%) of the participants rated themselves as “Novice” or “Intermediate” English language learners.

4. Data analysis and results

4.1. Preliminary analysis – internal consistency

R Software (Version 0.99.903) was used for data analysis. The internal consistency of the SA reading items and SA writing items was checked by Cronbach's alpha if item deleted, the item-total correlation, and the average inter-item correlation. Two criteria were evaluated: 1) drop an item if its deletion increased alpha by at least .01; and 2) drop an item if its item-total correlation was smaller than .30. In terms of the inter-item correlation, an average inter-item correlation between 0.20 and 0.40 indicates that the items, although homogeneous, “contain sufficiently unique variance so as to not be isomorphic with each other” (Piedmont, 2014). Based on these criteria, no SA reading or SA writing items were dropped. Table 2 summarizes the statistics.

Table 2
SA items – internal consistency.

            Number of Items  Cronbach's alpha  Average Item-Total Correlation  Average Inter-Item Correlation
SA Reading  11               0.80              0.57                            0.32
SA Writing  14               0.88              0.62                            0.38
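A minimal R sketch of these internal-consistency checks follows, using the psych package with simulated Likert responses in place of the study's data; variable names are illustrative only.

```r
# Sketch of the internal-consistency screening for the SA items.
library(psych)

set.seed(3)
# Simulated 106 x 11 data frame of 1-5 Likert responses (SA reading items).
sa_reading <- as.data.frame(replicate(11, sample(1:5, 106, replace = TRUE)))

res <- psych::alpha(sa_reading)
res$total$raw_alpha        # overall Cronbach's alpha
res$total$average_r        # average inter-item correlation
res$alpha.drop$raw_alpha   # alpha if each item is deleted
res$item.stats$r.drop      # corrected item-total correlations

# Criterion 1: flag items whose deletion raises alpha by at least .01.
flag_alpha <- res$alpha.drop$raw_alpha >= res$total$raw_alpha + 0.01
# Criterion 2: flag items whose item-total correlation falls below .30.
flag_itc <- res$item.stats$r.drop < 0.30
which(flag_alpha | flag_itc)   # items that would be dropped
```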

4.2. Descriptive statistics

Table 3 shows the descriptive statistics for the scores of free recall, sentence completion, multiple choice, overall reading comprehension, writing production, SA reading, and SA writing. The overall reading comprehension score was the composite of free recall, sentence completion, and multiple-choice questions. The writing production score was the average of the two trained raters' scores. The SA reading score was the average of the 11 SA reading items, and the SA writing score was the average of the 14 SA writing items. Reading comprehension, free recall, sentence completion, and multiple choice did not pass the Shapiro-Wilk normality test; therefore, Spearman's correlation was employed in some analyses.

Table 3
Descriptive statistics.

     Mean   SD    Min   Max   Skewness  Kurtosis  Shapiro W  p value
RC   14.51  7.98  2     33    0.50      0.86      0.94       <.001
FR   5.58   4.04  0     20    1.10      0.98      0.91       <.001
SC   3.76   2.72  0     10    0.49      0.69      0.94       <.001
MC   5.16   2.39  1     10    0.24      0.75      0.96       <.01
WP   10.75  2.86  3.5   17.5  0.13      0.21      0.99       >.05
SAR  3.64   0.56  1.55  4.82  0.40      0.58      0.98       >.05
SAW  3.24   0.62  1.5   4.57  0.23      0.07      0.99       >.05

RC = Reading Comprehension; FR = Free Recall; SC = Sentence Completion; MC = Multiple Choice; WP = Writing Production; SAR = Self-Assessment of Reading; SAW = Self-Assessment of Writing.
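The normality screening that motivated the choice of Spearman's correlation can be sketched in R as follows; the simulated vector stands in for any of the study's score variables.

```r
# Sketch: descriptive statistics and Shapiro-Wilk normality screening.
library(psych)

set.seed(4)
reading_comprehension <- rpois(106, lambda = 14)  # stand-in for RC scores

psych::describe(reading_comprehension)  # mean, SD, min, max, skew, kurtosis
shapiro.test(reading_comprehension)     # W statistic and p value; p < .05
                                        # suggests a non-normal distribution
```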

4.3. Results

4.3.1. RQ1. What are the self-ratings of English reading and writing abilities of young Chinese learners of English?
Overall, the mean for SA reading was 3.64 (SD = 0.56). Table 4 details each SA reading item with its mean and standard deviation, along with one-sample t-test results comparing the mean score of each item against the mean score of all SA reading items.


Table 4
SA Reading ratings and one-sample t-test results.

Item Description Mean SD One-sample t-test Sig. (2-tailed)
2 I can understand the general idea of simple informational texts and short descriptions, especially if they contain pictures that help explain the text. 4.32 0.74 p < .001
9 I can locate specific information I need to help me complete a task. 4.01 0.85 p < .001
1 I can identify the characters, settings, problems, and solutions occurring in a story. 4.00 0.83 p < .001
3 I can infer the meaning of new vocabulary based on the text I read. 3.90 0.95 p < .01
11 I can use some reading strategies (such as rereading) to help me understand the text. 3.75 1.06 p > .05
10 I can identify the main idea discussed in a text and how the idea is supported. 3.55 0.91 p > .05
4 I can choose a reading text appropriate to my reading ability for myself. 3.51 0.99 p > .05
8 I can come up with questions (such as why, what, and how) by myself when I am reading. 3.51 1.08 p > .05
7 I can make predictions when I am reading. 3.34 1.05 p < .01
6 I can make connections between the text I read and my life experience. 3.11 1.04 p < .001
5 I can make connections between the text I read and other texts I have read. 3.09 1.11 p < .001

The one-sample t-test analyses revealed that the means for Items 2, 9, 1, and 3 were significantly higher than the overall mean of the SA reading items, whereas the means for Items 7, 6, and 5 were significantly lower. Participants perceived their strengths to be the following: understanding the general idea of simple informational texts and short descriptions; locating specific information for task completion; identifying the characters, settings, problems, and solutions occurring in a story; and inferring the meaning of new vocabulary from the reading text. They perceived weaknesses in making predictions when reading, making connections between the text they read and their life experience, and making connections between the text they read and other texts they have read before.
Table 5 lists each SA writing item with its mean and standard deviation, along with one-sample t-test results comparing the mean score of each SA writing item against the overall mean score for all SA writing items. Overall, the mean for SA writing was 3.24 (SD = 0.62). The one-sample t-tests indicated that the means for Items 14, 5, and 10 were significantly higher than the overall mean of SA writing, whereas the means for Items 12, 8, and 11 were significantly lower. Participants identified their strengths as using appropriate spelling, punctuation, and capitalization when writing; writing personal feelings and emotions; and writing a story ending that readers can clearly understand. However, they identified insufficient abilities in keeping readers in mind when writing a story, writing an opening paragraph that attracts readers' attention, and using appropriate metaphors when writing.

Table 5
SA Writing ratings and one-sample t-test results.

Item Description Mean SD One-sample t-test Sig. (2-tailed)
14 I can use appropriate spelling, punctuation and capitalization when I am writing. 3.81 1.01 p < .001
5 I can write my personal feelings and emotions when I am writing my personal story. 3.57 0.91 p < .001
10 I can write a story ending that makes readers clearly understand what I am writing about. 3.53 1.04 p < .01
6 I can write a story step-by-step, introducing the characters, the problem/conflict, and then the solution. 3.38 1.07 p > .05
4 I can add dialogues to images when I am writing a story. 3.36 1.03 p > .05
7 I can separate a story into several paragraphs and make the idea of each paragraph clear. 3.36 1.08 p > .05
3 I can write vivid details about a story. 3.29 0.93 p > .05
1 I can write about a personal story from my life experience. 3.27 0.90 p > .05
2 I can write a story about other characters. 3.20 0.97 p > .05
13 I can use appropriate grammar and sentence structures when I am writing. 3.14 0.92 p > .05
9 I can use different word choices when I'm writing a story. 3.12 0.94 p > .05
12 I can keep my readers in mind when I am writing a story. 2.93 1.05 p < .01
8 I can write an opening paragraph that attracts readers' attention. 2.83 1.03 p < .001
11 I can use metaphors when I am writing a story. 2.62 1.05 p < .001
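The item-level comparisons in Tables 4 and 5 can be outlined with base R's t.test(), testing each item's ratings against the mean of all items; the data below are simulated, so this is a sketch of the reported procedure rather than the authors' script.

```r
# Sketch: one-sample t-tests of each SA item against the overall SA mean.
set.seed(5)
sa_reading <- as.data.frame(replicate(11, sample(1:5, 106, replace = TRUE)))

overall_mean <- mean(rowMeans(sa_reading))  # e.g., 3.64 for SA reading

# Two-tailed test of a single item (here, the second column) against
# the overall mean of all SA reading items.
t.test(sa_reading[[2]], mu = overall_mean)

# Collect the two-tailed p values for every item, as in Tables 4 and 5.
p_values <- sapply(sa_reading, function(item) t.test(item, mu = overall_mean)$p.value)
sort(p_values)
```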

4.3.2. RQ2. Is there a relationship between scores of SA reading ability and three different reading comprehension tasks (free recall,
sentence completion, and multiple-choice questions)?
Spearman's correlation coefficient, a non-parametric statistic measuring the strength and direction of the association between two variables, was used because the distributions of the free recall, sentence completion, multiple choice, and overall reading comprehension scores violated the assumption of normality. Non-parametric tests are generally believed to be less powerful than parametric tests; however, that holds only when the assumptions of the parametric test are met. If the data are not normally distributed, the Type I error rate of tests based on the sampling distribution will not be set at 5%; therefore, there is no way to calculate power, as power is related to the Type I error rate (see Field, 2013, p. 551).
Spearman's rho (rs) was computed for each pair. Results indicated that scores of SA reading were significantly related to the scores of free recall (rs = 0.41, p < .0001), sentence completion (rs = 0.46, p < .0001), and multiple-choice questions (rs = 0.43, p < .0001).
According to Plonsky and Oswald's (2014) scale of correlation and effect size, all of these correlations had medium to large effect sizes. See Appendix D for the scatterplots of the correlations.
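For illustration, the correlational analyses for RQ2-RQ4 can be sketched in R as below. The score vectors are simulated from the study's reported means and SDs, so the resulting coefficients will not reproduce the published values; the code simply shows the two correlation methods used.

```r
# Sketch of the correlational analyses (simulated score vectors).
set.seed(6)
sa_reading_score   <- rnorm(106, mean = 3.64, sd = 0.56)
free_recall        <- rpois(106, lambda = 5.6)
reading_total      <- rpois(106, lambda = 14.5)
sa_writing_score   <- rnorm(106, mean = 3.24, sd = 0.62)
writing_production <- rnorm(106, mean = 10.75, sd = 2.86)

# Spearman's rho for the non-normally distributed reading measures
# (exact = FALSE avoids the exact-p restriction when ties are present).
cor.test(sa_reading_score, free_recall, method = "spearman", exact = FALSE)
cor.test(sa_reading_score, reading_total, method = "spearman", exact = FALSE)

# Pearson's r for SA writing vs. writing production (both near-normal).
cor.test(sa_writing_score, writing_production)
```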

4.3.3. RQ3. Do scores of SA reading ability correlate with the scores of the overall reading comprehension?
Spearman's correlational analysis showed that scores of SA reading were significantly correlated with the scores of overall reading comprehension (rs = 0.51, p < .0001; see Appendix D for the scatterplot), a correlation with a medium to large effect size (Plonsky & Oswald, 2014).

4.3.4. RQ4. Do scores of SA writing ability correlate with the scores of writing production?
Pearson's correlation revealed a significant positive relationship between SA writing and writing production (r = 0.30, p < .01), with a small to medium effect size (Plonsky & Oswald, 2014). Young Chinese learners of English tended to self-assess their skills and knowledge in English writing accurately. Appendix D shows the scatterplot of the correlation.

5. Discussion and implications

The findings revealed that scores of SA reading and writing abilities of young Chinese learners of English were significantly correlated with scores on objective tests of reading comprehension and writing production. Scores of SA reading were also significantly correlated with all three reading comprehension tasks: free recall, sentence completion, and multiple-choice questions. The correlation between SA reading and reading comprehension had a medium to large effect size, whereas the effect size of the correlation between SA writing and writing production was small, which confirms Ross's (1998) finding that the correlation between SA and receptive skills is stronger than that between SA and productive skills. However, the correlation coefficients found in the present study were smaller than the averages synthesized by Ross (1998). Boud and Falchikov (1989) noted the importance of learners' familiarity and experience with SA. The participants in the present study had no prior experience with SA items, and this unfamiliarity might account for the relatively lower correlation coefficients. Studies by Dolosic et al. (2016) and Butler and Lee (2010) have shown that young learners are able to self-assess their language abilities better after consistent practice with SA. Given these findings, the correlations might have been stronger if the participants in the present study had had consistent training or practice with SA. Additionally, the learning context has a strong impact on the operation of SA (Butler & Lee, 2010). The tension between the practice of learner self-assessment and a culture of teacher-led, large-scale standardized assessment, especially in China, cannot be disregarded. This tension might partly shape how young learners perceive and respond to SA, and the extent to which they can self-evaluate their knowledge and skills accurately.
It is noteworthy that, although the present findings revealed significant correlations between SA and objective tests, these correlations do not fully capture the accuracy of the SA scores (Ashton, 2014). The significant correlations only indicate a tendency for young language learners to self-assess their knowledge and skills in reading and writing accurately. Additional studies on whether young learners at different proficiency levels over- or under-estimate their abilities are needed before any generalizations can be made. Furthermore, the positive correlations identified do not inform language teachers and researchers about the exact process by which young learners respond to SA items. Future research is needed to understand this process as well as the underlying knowledge and skills needed for accurate SA, given the cognitively demanding and socially complex nature of responding to SA (Butler, 2018).
The present findings have important implications for language instruction in classrooms. The self-perceived weaknesses identified in reading (e.g., the inability to make predictions or to make connections between texts) and writing (e.g., the inability to keep readers in mind when writing a story, to write an opening paragraph that attracts readers' attention, or to use appropriate metaphors) can inform teachers of learners' particular struggles so that teachers can adjust their reading and writing instruction effectively. Such washback effects help promote a learner-centered pedagogy that is particularly needed in most Chinese secondary schools, where classroom instruction emphasizes teacher-centered pedagogies and teaching towards examinations (Carless, 2005). Teacher-centered pedagogies and teaching towards examinations inhibit learner autonomy and ultimately prevent learners from independently setting goals and making decisions for their learning, or from implementing remedies for their weaknesses. Implementing SA, along with its washback effects, would foster the learner autonomy expressed in a learner-centered pedagogy and ultimately aid young learners as they grow into independent language learners.
Pedagogically, teachers could use SA as part of both formative and summative assessment (Boud & Falchikov, 1989). It
could be formative given its role in guiding the learning process. For each learning unit or single lesson, teachers can design SA items contextualized to the learning content and objectives of that particular unit or lesson, and ask students to self-assess
the knowledge and skills they have mastered. At the same time, to maximize the benefits of SA as a formative assessment tool,
it is crucial that students are clearly guided to use SA items, and that teachers provide corresponding feedback consistently
and address the mismatches between students' self-evaluation and teachers' assessment appropriately (Dann, 2002). SA
could also be summative as a part of learners' grades. Although more research is needed to explore how the summative role of
SA could be best implemented, teachers can utilize it in a variety of ways, for instance, evaluating students' involvement in SA,
determining the match or mismatch between students' SA and teachers' observations, and gauging the consistency or
discrepancy between students' SA and objective tests. The formative and summative uses of SA would encourage learners to

take more responsibility for their own learning, shifting teachers’ role from “marking to planning and moderating assessment
activities” (Boud & Falchikov, 1989).
The positive finding that young Chinese learners of English tended to self-assess their English reading and writing abilities accurately opens an opportunity for future research to investigate the interface between SA, English learning motivation, and high-stakes exam achievement in China. Traditionally considered a fair and reliable way to measure learning outcomes, large-scale standardized tests are widely employed by local and national agencies in China (Jin et al., 2017). English language assessment at all levels of education, as with other subjects, consists predominantly of large-scale standardized tests (Cheng, 2008). However, large-scale standardized testing decreases learning motivation (Carless & Wong, 2000). By contrast, SA, which places learners in a natural and low-anxiety setting where teachers can consistently monitor their progress and provide feedback, has been found to be positively related to learners' motivation and engagement (e.g., Butler & Lee, 2010). Given these findings, it would be of great benefit to explore whether consistent use of SA within the classroom environment could strengthen English learning motivation and enthusiasm, and whether it could promote English language performance on large-scale, high-stakes standardized tests in China.

6. Conclusion

In conclusion, the present study reveals that young Chinese learners of English tended to self-assess their reading and writing abilities accurately. The findings provide an empirical basis for pedagogical implications that incorporate SA into curriculum design in foreign language classrooms. In addition, the findings add empirical information critical for examining the trajectories of SA with young learners. Future research is needed to determine whether young learners over- or under-estimate their language abilities, and whether consistent use of SA promotes performance on large-scale, high-stakes standardized tests. Examining how best to implement SA as both formative and summative assessment would also benefit the field of language instruction.


Appendix A. Sample reading passages, free recall, sentence completion and multiple-choice questions

Passage 1 Ann and Frank

One day Ann and Frank went to the lake with Rover. Rover can swim well, so Frank made him go into the water after a stick.
“Jump, Rover! Jump in and get the stick,” said Frank, and into the water Rover went with a big splash. Pretty soon he came
out with the stick in his mouth.
Rover did not like the game as much as Frank because the water was a little cold.
They had a great time for a while with Rover, and then set out for home because it was late in the day, and they could not
stay long.
On the way home, Rover saw a rabbit and away he went after it as fast as he could go. Ann and Frank ran too but could not
keep up with Rover and the rabbit.
When they got home, Rover was there, and Frank said, “Where is the rabbit, Rover?” Rover gave Frank a funny look and
went away.
“Oh I know,” said Frank. “The rabbit ran so fast you could not catch it."
Free Recall. Please write down in English as much as you can remember about the reading passage you just read. Please do
not look back at the reading passage you just read. 请根据刚刚阅读的短文内容用英文写出你所能记住的内容 。请不要再次阅读
该短文。
Sentence Completion. Please complete the following sentences according to the reading passage you just read. Please do
not look back at the reading passage you just read. 请根据刚刚阅读的短文内容补全下列句子。请不要再次阅读该短文。
Sample 1:
When Rover was going back home, he saw_____________.
Sample 2:
Rover gave Frank a funny look after coming back home because__________________.
Multiple-Choice Questions. Please circle the correct answer. 请在下列选项中选出一个正确的选项。
Sample 1. What did Rover see when going back home?

a. Ann and Frank
b. a stick
c. a lake
d. a rabbit

Sample 2. Why didn't Rover like the game of jumping into the water?

a. Because he was afraid.
b. Because the water was cold.
c. Because he didn't like to play with Ann and Frank.
d. Because he wanted to go back home.

Passage 2 A Restaurant

You can go to a restaurant to eat breakfast, lunch, or dinner when you want. Restaurants serve food to you and to other people
who go and eat at them. First, you look at a menu that lists all of the foods and drinks that you can order from the restaurant.
Then you tell the waiter or waitress what you want to eat and drink. A “waiter” is the word you use for a man and a
“waitress” is the word you use for a woman. Those are the people who take your order and bring you your food.
Restaurants cook the food you order. Some restaurants have lots of foods and some have some foods like only salads or
soups, hotdogs or desserts. Some restaurants are open part of the day and only serve food at some times of the day. Other
restaurants are open all day and serve food all the time.
After you eat the food, you need to pay for the food. The waiter or waitress gives you a bill that tells you how much money
you need to pay for the food. You also need to leave a tip after paying for your bill. A tip is some extra money you give for good
help, good food, and good service.
Free Recall. Please write down in English as much as you can remember about the reading passage you just read. Please do
not look back at the reading passage you just read. 请根据刚刚阅读的短文内容用英文写出你所能记住的内容 。请不要再次阅读
该短文。
Sentence Completion. Please complete the following sentences according to the reading passage you just read. Please do
not look back at the reading passage you just read. 请根据刚刚阅读的短文内容补全下列句子。请不要再次阅读该短文。
Sample 1:
When you go to a restaurant, the first thing you do is to___________________________.
Sample 2:
After you pay for your bill, you will _________________________________________.
Multiple-Choice Questions. Please circle the correct answer. 请在下列选项中选出一个正确的选项。
Sample 1. In a restaurant, who will you tell if you want to eat or drink?

a. the chef
b. your friends
c. a waiter or a waitress
d. people sitting around you

Sample 2. What should you do after you pay for your bill?

a. to leave a tip
b. to leave the restaurant
c. to check the bill again
d. to ask for extra drinks

Appendix B. Independent-samples t-test: gender difference in topic familiarity

                   Mean  SD    t value  p value
Passage 1  Male    2.64  0.98  0.21     >.05
           Female  2.68  0.84
Passage 2  Male    2.82  1.08  0.70     >.05
           Female  2.96  0.92

N: male = 56, female = 50.

Appendix C. Writing Task

Writing Task. What do you think is happening in the picture? Who is the boy? Who is the man? What are the boy and the
man talking about? Please write a story about the picture you see. 请仔细看这幅图片。图片里面发生了什么?这个小男孩是
谁?这个男人是谁?他们在说什么?请根据图片发挥你的想象描写一小段故事.

Appendix D. Scatterplots of correlations


Appendix E. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.system.2018.10.013.

References

Alderson, C. J. (2006). Diagnosing foreign language proficiency: The interface between learning and assessment. London: Continuum.
Andrade, H., & Valtcheva, A. (2009). Promoting learning and achievement through self-assessment. Theory Into Practice, 48(1), 12–19.
Ashton, K. (2014). Using self-assessment to compare learners' reading proficiency in a multilingual assessment framework. System, 42(1), 105–119.
Babaii, E., Taghaddomi, S., & Pashmforoosh, R. (2016). Speaking self-assessment: Mismatches between learners' and teachers' criteria. Language Testing, 33(3), 411–437.
Bachman, L. F., & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University Press.
Bailey, K. M. (1998). Learning about language assessment: Dilemmas, decisions, and directions. Boston: Heinle & Heinle.
Bernhardt, E. B. (1991). Reading development in a second language. Norwood: Ablex.
Bernhardt, E. B. (2011). Understanding advanced second-language reading. New York: Routledge.
Boud, D., & Falchikov, N. (1989). Quantitative studies of student self-assessment in higher education: A critical analysis of findings. Higher Education, 18(5), 529–549.
Brantmeier, C. (2003a). Does gender make a difference? Passage content and comprehension in second language reading. Reading in a Foreign Language, 15(1), 1–27.
Brantmeier, C. (2003b). Beyond linguistic knowledge: Individual differences in second language reading. Foreign Language Annals, 36(1), 33–43.
Brantmeier, C. (2005). Nonlinguistic variables in advanced second language reading: Learners' self-assessment and enjoyment. Foreign Language Annals, 38(4), 494–504.
Brantmeier, C., & Vanderplank, R. (2008). Descriptive and criterion-referenced self-assessment with L2 readers. System, 36(3), 456–477.
Brantmeier, C., Vanderplank, R., & Strube, M. (2012). What about me? Individual self-assessment by skill and level of language instruction. System, 40(1), 144–160.
Butler, Y. G. (2016). Assessing young learners. In D. Tsagari (Ed.), Handbook of second language assessment (pp. 359–375). Berlin: Mouton de Gruyter.
Butler, Y. G. (2018). The role of context in young learners' processes for responding to self-assessment items. The Modern Language Journal, 102(1), 242–261.
Butler, Y. G., & Lee, J. (2006). On-task versus off-task self-assessments among Korean elementary school students studying English. The Modern Language Journal, 90, 506–518.
Butler, Y. G., & Lee, J. (2010). The effects of self-assessment among young learners of English. Language Testing, 27(1), 5–31.
Butler, Y. G., & Zeng, W. (2014). Young foreign language learners' interactions during task-based paired assessments. Language Assessment Quarterly, 11(1), 45–75.
Carless, D. (2005). Prospects for the implementation of assessment for learning. Assessment in Education: Principles, Policy & Practice, 12(1), 39–54.
Carless, D. R., & Wong, P. M. J. (2000). Teaching English to young learners in Hong Kong. In M. Nikolov, & H. Curtain (Eds.), An early start: Young learners and modern languages in Europe and beyond (pp. 209–224). Strasbourg: Council of Europe Publishing.
Chapelle, C. A., & Brindley, G. (2010). Assessment. In N. Schmitt (Ed.), An introduction to applied linguistics (pp. 247–267). London: Hodder Education.
Chen, Y. M. (2008). Learning to self-assess oral performance in English: A longitudinal case study. Language Teaching Research, 12(2), 235–262.
Cheng, L. (2008). The key to success: English language testing in China. Language Testing, 25(1), 15–37.
Chik, A., & Besser, S. (2011). International language test taking among young learners: A Hong Kong case study. Language Assessment Quarterly, 8(1), 73–91.
China Standards of English. (2018). China standards of English language ability. http://www.moe.gov.cn/srcsite/A19/s229/201804/t20180416_333315.html. (Accessed 7 June 2018).
Dann, R. (2002). Promoting assessment as learning: Improving the learning process. London: Routledge.
De Saint-Leger, D. (2009). Self-assessment of speaking skills and participation in a foreign language class. Foreign Language Annals, 42, 158–178.
Dolosic, H. N., Brantmeier, C., Strube, M., & Hogrebe, M. C. (2016). Living language: Self-assessment, oral production, and domestic immersion. Foreign Language Annals, 49(2), 302–316.
Field, A. P. (2013). Discovering statistics using IBM SPSS statistics: And sex and drugs and rock 'n' roll (4th ed.). Los Angeles: Sage.
Grabe, W., & Kaplan, R. B. (1996). Theory and practice of writing: An applied linguistic perspective. New York: Longman.
Haggerty, J. F., & Fox, J. (2015). Raising the bar: Language testing experience and second language motivation among South Korean young adolescents. Language Testing in Asia, 5(11), 1–16.
Hasselgren, A. (2000). The assessment of the English ability of young learners in Norwegian schools: An innovative approach. Language Testing, 17(2), 261–277.
Hasselgren, A. (2005). Assessing the language of young learners. Language Testing, 22(3), 337–354.
Heilenman, L. K. (1990). Self-assessment of second language ability: The role of response effects. Language Testing, 7, 174–201.
Hu, G. (2002). Recent important developments in secondary English-language teaching in the People's Republic of China. Language Culture and Curriculum, 15(1), 30–49.
Jacobs, H. L., Zinkgraf, S. A., Wormuth, D. R., Hartfield, V. F., & Hughey, J. B. (1981). Testing ESL composition: A practical approach. Rowley, MA: Newbury House.
Jin, Y., Wu, Z., Alderson, C., & Song, W. (2017). Developing the China standards of English: Challenges at macropolitical and micropolitical levels. Language Testing in Asia, 7(1), 1.
Koda, K. (2007). Reading and language learning: Crosslinguistic constraints on second language reading development. Language Learning, 57(1), 1–44.
Leki, I. (1992). Understanding ESL writers: A guide for teachers. Portsmouth, NH: Heinemann.
Little, D. (2005). The Common European Framework and the European Language Portfolio: Involving learners and their judgements in the assessment process. Language Testing, 22(3), 321–336.
Little, D. (2009). Language learner autonomy and the European Language Portfolio: Two L2 English examples. Language Teaching: Surveys and Studies, 42(2), 222–233.
McKay, P. (2006). Assessing young language learners. Cambridge, UK: Cambridge University Press.
Moss, C. M. (2013). Research on classroom summative assessment. In J. H. McMillan (Ed.), SAGE handbook of research on classroom assessment (pp. 235–255). Thousand Oaks: SAGE Publications.
Oscarson, M. (1989). Self-assessment of language proficiency: Rationale and applications. Language Testing, 6(1), 1–13.
Patri, M. (2002). The influence of peer feedback on self- and peer-assessment of oral skills. Language Testing, 19(2), 109–131.
Piedmont, R. L. (2014). Inter-item correlations. In A. C. Michalos (Ed.), Encyclopedia of quality of life and well-being research (pp. 3303–3304). Dordrecht, Netherlands: Springer.
Plonsky, L., & Oswald, F. L. (2014). How big is "big"? Interpreting effect sizes in L2 research. Language Learning, 64, 878–912.
Qi, L. (2005). Stakeholders' conflicting aims undermine the washback functions of a high-stakes test. Language Testing, 22(2), 142–173.
Qi, L. (2007). Is testing an efficient agent for pedagogical change? Examining the intended washback of the writing task in a high-stakes English test in China. Assessment in Education, 14, 51–74.
Rea-Dickins, P. (2000). Assessment in early years language learning contexts. Language Testing, 17(2), 115–122.
Rea-Dickins, P., & Gardner, S. (2000). Snares and silver bullets: Disentangling the construct of formative assessment. Language Testing, 12(2), 215–243.
Riley, G. L., & Lee, J. F. (1996). A comparison of recall and summary protocols as measures of second language reading comprehension. Language Testing, 13(2), 173–189.
Ross, S. (1998). Self-assessment in second language testing: A meta-analysis of experimental factors. Language Testing, 15, 1–20.
Sahragard, R., & Mallahi, O. (2014). Relationship between Iranian EFL learners' language learning styles, writing proficiency and self-assessment. Procedia - Social and Behavioral Sciences, 98, 1611–1620.
Silva, T., & Matsuda, P. K. (2002). Writing. In N. Schmitt (Ed.), An introduction to applied linguistics (pp. 251–266). London: Hodder Education.
Towler, L., & Broadfoot, P. (1992). Self-assessment in the primary school. Educational Review, 44(2), 137–151.
Wan-a-rom, U. (2010). Self-assessment of word knowledge with graded readers: A preliminary study. Reading in a Foreign Language, 22(2), 323–338.
Wang, W., & Lam, A. S. L. (2009). The English language curriculum for senior secondary school in China: Its evolution from 1949. RELC Journal, 40(1), 65–82.
Weigle, S. (2002). Assessing writing. Cambridge: Cambridge University Press.
Wolf, M., & Butler, Y. G. (2017). An overview of English language proficiency assessments for young learners. In M. Wolf, & Y. G. Butler (Eds.), English language proficiency assessment for young learners (pp. 3–22). New York: Routledge.
Zangl, R. (2000). Monitoring language skills in Austrian primary (elementary) schools: A case study. Language Testing, 17(2), 250–260.
Zeng, Y., & Fan, T. (2017). Developing reading proficiency scales for EFL learners in China. Language Testing in Asia, 7(1), 8.
