Test Design

Language Assessment Design
Dordt College Placement Exam

Vicky Fang & Hala Sun
Original Test Design: Placement Exam
Table of Contents
BACKGROUND INFORMATION
    OVERVIEW
        History
        Target population
        Purpose of the placement test
        Test description
    TEST DEVELOPMENT PROCESS
    GENERAL ADMINISTRATION PROCEDURES
    CONSTRUCTS
        I. Listening comprehension
        II. Grammar
        III. Reading comprehension
        IV. Writing ability
        V. Oral skills
SCORING AND INTERPRETATION
ANALYSIS
    APPLYING WESCHE'S (1983) FOUR COMPONENTS
    APPLYING SWAIN'S (1983) FOUR PRINCIPLES
EXAMINING VALIDITY AND RELIABILITY
    VALIDITY OF M-C QUESTIONS
        Item facility
        Distractor analysis
        Item discrimination
        Response frequency distribution
    RELIABILITY
        Inter-rater reliability
SUBTEST RELATIONSHIPS
DISCUSSION
CONCLUSION
REFERENCES
Background Information

Overview

History. In 2012, as a language assessment project with Dr. Kathleen M. Bailey, we (Hala Sun and Vicky Fang of the Monterey Institute of International Studies [MIIS]) redesigned the Dordt College Placement Test (DCPT); Hala Sun is a Dordt College alumna. The DCPT is a specialized assessment tool that measures incoming international and exchange students' academic English language proficiency, specifically their listening, reading, writing, and speaking skills as well as their grammatical knowledge. Unlike the previous DCPT, the newly designed test includes a Grammar section. Concurrent with this language assessment project, we also designed an academic writing course curriculum for the English for Academic Purposes (EAP) program at Dordt for our curriculum design project. As part of the design process, we surveyed current and past international students and interviewed the EAP and English instructors at Dordt College. Our needs analysis showed that improving international students' grammatical competence was crucial. We also found that the DCPT, which determines whether students need to take the EAP courses during their first semester, was designed 16 years ago, in 1996, by the current EAP course instructor, Sanneke Kok. We strongly felt the need to update the stimulus materials presented in the previous DCPT because we believe that the relevance and authenticity of stimulus materials affect the abilities that we want to assess (Wesche, 1983). According to our interview, Instructor Kok had attempted to update the DCPT, but due to limited time and resources, she was only able to make minor changes to the scoring rubrics 2 years ago; the content and the test methods were not revised. The stimulus material for the listening comprehension subtest (a mock lecture) was slightly changed
over the years: the professor giving the lecture was changed. In addition, no tests had been conducted to examine the reliability and validity of the DCPT (Sanneke Kok, personal communication, September 24, 2012). Applying the concepts from our language assessment course, we envisioned a test that is comprehensive and appropriate to the needs of the stakeholders. The newly designed DCPT is still evolving and may need follow-up revisions once it is administered to incoming international students at Dordt College. Nevertheless, we feel confident about the foundations of this test because (1) we designed it following the decision-making format presented by Alderson, Clapham, and Wall (1995); (2) we pre-piloted the new DCPT and piloted it with current international students at Dordt; and (3) we ran various statistical tests to examine the validity and reliability of this test.

Target population. All students admitted to Dordt College whose native language is not English (this includes exchange students and ESL students) are required to take the DCPT. According to Instructor Kok, the number of international students admitted to Dordt varies every year, but on average, 10 to 12 students take her EAP courses every semester. To pass the old DCPT, students needed to score at least 80% on the essay writing subtest and at least 70% on each of the remaining subtests (listening, reading comprehension, and speaking). Through our needs analysis, we found that the English professors have high academic expectations of their students, especially in writing and grammatical competence. Therefore, we decided to keep the current 70%/80% standard with one minor change: we placed the newly added grammar subtest in the 80% category. Students who do not score 80% or above on both the essay writing and the grammar subtests are required to take the EAP reading and writing
course (Academic Writing from Sources). Similarly, students who do not score 70% or above on each of the remaining subtests have to take the EAP speaking and listening course (Academic Interaction).

Purpose of the placement test. The revised DCPT is designed to provide an accurate evaluation of international students' academic English language skills, assessing their potential to succeed in their college academic life. Specifically, this test helps determine whether these international students have sufficient academic English skills and knowledge to take the general courses at Dordt, especially English courses. International students who do not pass the placement exam have to take either or both of the EAP courses offered in their first semester. Once they complete these EAP courses, they can register for general English core courses. As we redesigned the DCPT, we made sure that the constructs assessed in our DCPT matched the two EAP courses offered. We reviewed the curricula of these two courses to examine whether the skills in which students need further improvement, based on their DCPT results, are covered in the current EAP courses. Currently, the Academic Writing from Sources course helps students improve their academic reading and writing skills, focusing especially on how to integrate various sources and make appropriate citations using standard documentation styles. The Academic Interaction course focuses on helping students develop and strengthen their academic English speaking and listening skills.

Test description. To understand the new DCPT, we must first examine the original placement test created by the EAP instructor, Sanneke Kok. The previous DCPT had the following four subtests:
Subtest 1: Oral interview. This subtest required the test-takers to answer several questions posed by a test administrator. It assessed the test-takers' oral fluency and accuracy in speaking.

Subtest 2: Mini article. This subtest comprised 10 multiple-choice questions based on a sample reading. It assessed the test-takers' vocabulary knowledge and reading comprehension.

Subtest 3: Mini lecture. This subtest required the test-takers to watch a video clip of a mock lecture, testing their listening comprehension. After watching the clip, students had to answer the true or false questions presented orally by the lecturer in the video. In addition, students had to fill in the blanks of a given table.

Subtest 4: Writing prompt. For this subtest, the test-takers had to compose an essay according to the given prompt. It assessed the test-takers' academic writing ability, which includes organization skills and grammar.

Subtests 1, 2, and 3 were used to determine whether the test-takers needed to take Academic Interaction, and subtest 4 was used to decide whether the test-takers had to take Academic Writing from Sources.
For our newly designed test, we have an overarching theme of language learning. This test has the following five subtests:

Subtest 1: Listening comprehension. This subtest consists of five short-answer questions and measures the test-takers' ability to comprehend a speech from a video clip (a TED Talk). In this video, the presenter discusses the concept of English mania and the implications of the spread or dominance of the English language. The test-takers have to identify important information from the video to answer the short-answer questions. The maximum allotted time for this subtest is 10 minutes.

Subtest 2: Grammar. This is a cloze-elide subtest, in which the test-takers have to identify 15 extra words that make the sentences within the given text ungrammatical. The test-takers are required to cross out these extra words. The instructions indicate that there are exactly 15 extra words to cross out. The text, taken from the New York Times
newspaper, relates to the topic of language learning and immersion. The maximum allotted time for this subtest is 5 minutes.

Subtest 3: Reading comprehension. This subtest consists of 10 multiple-choice (M-C) questions and measures the test-takers' vocabulary knowledge, reading comprehension, and grammar. There are two articles within this subtest, each with five M-C questions. The first article is a short narrative about the world's oldest learner. The second is an excerpt from an article that discusses the influence of mother tongue and language learning experience. The maximum allotted time for this subtest is 20 minutes.

Subtest 4: Mini-essay writing. This subtest measures the test-takers' academic writing ability. Specifically, the essay's content and organization are assessed, as well as the test-takers' correct use of grammar. This subtest requires the test-takers to explain, using specific examples, whether or not they think learning English is important. The maximum allotted time for this subtest is 20 minutes. The test-takers are required to write at least 180 words.

Subtest 5: Oral interview. This subtest measures the test-takers' speaking ability, specifically their fluency and accuracy in speech (e.g., grammar, pronunciation, and coherence). Their content and comprehension are also assessed by examining the relevance of their answer to the given prompt, including the examples they use to support their stance. The test-takers are given 2 minutes to prepare and up to 3 minutes to answer the prompt. The prompt is written as follows:

In the United States, many universities require students to learn an additional language
other than their native language. Do you support the idea that university students should be required to learn an additional language (other than their native languages)? Why or why not?

For this new DCPT, subtests 1, 3, and 5 will be used to determine whether the test-takers need to take Academic Interaction, and subtests 2 and 4 will be used to decide whether the test-takers have to take Academic Writing from Sources.

Test Development Process

The following table shows the steps we took to design this new DCPT:

Table 1
Dordt College Placement Test Development Process

Step 1: Decision-making
- Examined the (old) DCPT
- Familiarized ourselves with the target population and the setting (including college goals and curricula)
- Conducted a needs analysis of the stakeholders
- Chose the constructs and the types of subtests
- Provided a definition for each construct
- Determined the test methods for each construct

Step 2: Designing
- Gathered relevant, useful, and motivating stimulus materials
- Designed one subtest at a time
- Allocated specific time for each subtest
- Created the scoring criteria for each subtest

Step 3: Pre-piloting
- Pre-piloted the test with 3 TESOL MIIS classmates and the course instructor
- Revised the test based on the feedback and test results (e.g., reduced the time allotted for each subtest and revised M-C items that were misleading, confusing, or too obvious)

Step 4: Piloting
- Sent the revised DCPT to Instructor Kok to pilot/administer
- Received from Instructor Kok the completed DCPTs of 10 current international students, including the oral interviews recorded on DVD
- Scored the exams

Step 5: Analysis
- Analyzed the validity of the M-C subtest, using item facility, item discriminability, distractor analysis, and response frequency distribution
- Analyzed the reliability of the objectively scored subtests using the split-half method
- Analyzed the reliability of the scorers using inter-rater reliability

Step 6: Reflections & Revisions
- Decided to make the oral interview prompt simpler (some students did not answer the question asked)
- Made minor changes to the oral scoring criteria
- Created a model or an example for the cloze-elide test (instead of crossing out the words, some students underlined or circled them)
- Added more lines to the essay answer sheet; some students did not meet the minimum word requirement, and we assume that they concluded their writing when they saw the lines running out
General Administration Procedures

For test administration, Dordt College has two teams: the logistics team and the oral interview team. The logistics team members are student volunteers recruited by Instructor Kok in advance. Instructor Kok provides a 1-hour training session for the student volunteers. We adapted the current logistics guide and made minor changes. See Appendix A for the new logistics guide we created and Appendix B for the original guide. In addition, Instructor Kok recruits the oral interview team, which she refers to as the Entrance Interview for International/ESL Students (EIIS) team. Every year, there are about five to six EIIS teams, each consisting of three faculty members from various disciplines (both male and female). Like the logistics team, the EIIS team members receive an hour of training from Instructor Kok. During the training session, Instructor Kok briefly discusses the topic of language acquisition, as well as the benefits of being an EIIS team member, such as gaining a snapshot of the new international students' abilities and needs (personal communication, November 14, 2012). For a sample oral interview schedule (provided by Instructor Kok), see Appendix C.

Constructs

With Alderson, Clapham, and Wall's (1995) guidance on developing test specifications, we identified five constructs for the new placement test. These are listening comprehension,
grammatical knowledge, reading comprehension, writing ability, and oral skills. In addressing the issue of test methods, Bailey (1998) points out that indirect tests may fail to provide a valid assessment of a construct and may also have negative washback on test-takers. Wesche (1983) also argues that integrative and direct tests are better at predicting students' use of the target language in real life. Thus, when designing the placement test, we tried to incorporate direct and integrative test methods to measure each construct.

I. Listening comprehension. In defining the listening construct, Buck (2001) argues that listening tests need to be contextualized, knowledge-independent, require fast, automatic, online processing of texts and go beyond literal meaning (p. 113). In line with this definition, we made a listening task that requires test-takers to respond to five short-answer questions after watching a four-minute video. By doing so, we simulated a mini lecture to test students' ability to recall specific words as well as their comprehension of the overall speech.

II. Grammar. From our interviews with Instructor Kok and the four English professors at Dordt College, we learned that the institution's educational philosophy emphasizes students' grammatical competence. Therefore, we added a grammar section to the test. In defining the concept of grammar, the Longman dictionary (2010) states that it usually takes into account the meanings and functions these sentences have in the overall system of the language (p. 252). Citing Larsen-Freeman (1991, 1997), Brown (2010) also argues that grammatical knowledge includes grammatical forms, grammatical meanings, and pragmatic meaning. To implement the idea that grammatical forms are intimately associated with grammatical meanings as well as pragmatic meaning, we inserted the grammar problems that English as a Second Language (ESL) learners may encounter into an article. The grammar problems we addressed in the test include the use of articles and prepositions, adjective usage, verb tense, and subject-verb agreement. These grammar problems were intentionally selected from the grammar criteria in the analytic rubric for writing (see Appendix D for the scoring criteria). By doing this, we hope to raise the test-takers' awareness of these grammar problems when they compose their writing.

III. Reading comprehension. Hedgcock and Ferris (2009) mention that from a bottom-up view of reading, the reader starts from small units such as words and works towards large units such as written discourse; from the top-down view, the reader's understanding of a text is the product of the reader's background knowledge of the text and the information given by the text. Thus, we designed items that lead the test-takers to adopt both approaches to comprehend the reading passage (Alderson, 2000). The bottom-up items include questions asking for the interpretation of specific words. The top-down items include questions that require the test-takers to paraphrase a sentence and recognize the implied message of a text. We included two readings in the section, which consists of 10 multiple-choice questions.

IV. Writing ability. To study academic writing, students need to master the process of structuring ideas into a piece of writing that shares the conventions of a specific type of text (Ferris & Hedgcock, 1998).
To measure the students' writing skills, we decided to assess the students' ability to write an expository essay, a common genre that college students often encounter in academic life (Purdue Online Writing Lab, 2010). Thus, test-takers need to write an essay of about 180-250 words to state, explain, and support their views on the given prompt. According to the Purdue Online Writing Lab (2012), the structure of the expository essay consists of the following main components:
- A clear, concise, and defined thesis statement that occurs in the first paragraph of the essay.
- Clear and logical transitions between the introduction, body, and conclusion.
- Body paragraphs that include evidential support (whether factual, logical, statistical, or anecdotal).
We used these descriptions to revise the analytic rubric developed by Instructor Kok to assess students' writing ability.

V. Oral skills. Luoma (2004) defines speaking tasks as "activities that involve speakers in using language for the purpose of achieving a particular goal or objective in a particular speaking situation" (p. 31). To assess the construct effectively, we created a prompt that requires test-takers to expound on an argument based on a given topic. Test-takers have two minutes to prepare their speech and three minutes to deliver it orally. During the preparation time, test-takers are also allowed to jot down notes for their speech. By having students relate the issue to a familiar environment, we hope the students will gain confidence in discussing the topic. We also hope to maximize their opportunity to express themselves in English by letting them provide concrete personal examples.
Scoring and Interpretations
Different scoring criteria are used to evaluate each construct. Reading comprehension and Grammar are both objectively scored subtests. The Reading comprehension subtest uses multiple-choice questions to test students' reading ability. There are 10 multiple-choice questions, each worth 1 point. For the Grammar subtest, we used a cloze-elide test method to measure students' grammatical knowledge. The test-taker receives one point each time he/she crosses out an extra word in the article. If the test-taker crosses out a wrong word, no points are deducted from his/her score. The cloze-elide test contains 15 extra words, so the grammar subtest is worth 15 points.

We used both the exact word method and the acceptable word method to evaluate the listening comprehension construct. Bailey (1998) introduces these two scoring methods for evaluating cloze tests. Under the exact word scoring method, the test-taker gets credit only when he/she writes down the exact word in the response. In contrast, with acceptable-word scoring, the test-taker can get credit when his/her response is grammatically correct and makes good sense in the context (p. 61). Both methods have merits and demerits. We used the exact word method for responses that require accurate information from the listening, and we used the acceptable word method to assess the test-takers' comprehension of the overall content of the listening. For each item scored by the acceptable word method, we made a list of acceptable answers. The listening subtest is worth 10 points in total, and each question is worth 2 points. One point is deducted if the test-taker does not respond to an acceptable word question in a complete sentence.

The oral and writing subtests are both scored subjectively according to their respective analytic rubrics. In setting up the rubrics for the oral presentation and essay writing, we revised the analytic rubrics used in the original placement test. The analytic rubric for writing evaluates three aspects: content, organization, and grammar (see Appendix D). Based on the needs analysis we conducted for our curriculum design project, we knew that both the international students and the English department at Dordt College value grammatical competence in language learning. Therefore, we kept grammar weighted at 50% of the total possible writing score (100 points). The rubric for the oral test was retained at first, but we found that it was not appropriate for scoring the oral test we designed. The old DCPT oral test was in the form of an interview, so the rubric included comprehension of the interview questions. However, the oral test we designed is a presentation in a given scenario, so our rubric needs to assess whether the student provides an appropriate answer to the given prompt. In designing the new rubric for the oral test, we emphasized three main aspects of a speech: content, accuracy, and fluency (see Appendix D for the scoring criteria). With the new rubric, the total for the oral test increased to 40 points. Table 2 presents our descriptive statistics based on the results of the new DCPT:

Table 2
Dordt College Placement Test Descriptive Statistics (N = 10)

Test       Points Possible  Mean   Mode    Median  Range  Standard Deviation  Variance
Listening  10               8.4    10      10      8      2.76                7.6
Grammar    15               9      10, 13  10      14     4.64                21.56
Reading    10               6.7    8, 6    7       7      1.95                3.79
Writing    100              69.4   N/A     71.75   52.5   16.84               283.6
Oral       40               28.9   N/A     29      19.5   5.27                27.82
Total      175              122.4  N/A     127.75  101    31.46               344.37
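The descriptive statistics in Table 2 can be recomputed from the individual scores. The following is a minimal Python sketch, assuming the ten listening and reading scores reported in Table 3 and assuming that the standard deviation and variance use the sample (n - 1) formulas; the helper name `describe` is hypothetical and is not part of the original analysis.

```python
import statistics

# Scores of the 10 pilot test-takers, as reported in Table 3.
listening = [10, 10, 9, 2, 5, 8, 10, 10, 10, 10]
reading = [8, 6, 6, 2, 7, 6, 8, 7, 8, 9]


def describe(scores):
    """Return the Table 2 style descriptive statistics for one subtest."""
    return {
        "mean": statistics.mean(scores),
        "mode": statistics.multimode(scores),  # every most-frequent value
        "median": statistics.median(scores),
        "range": max(scores) - min(scores),
        # Assumption: sample (n - 1) formulas for spread.
        "sd": round(statistics.stdev(scores), 2),
        "variance": round(statistics.variance(scores), 2),
    }


for name, scores in (("Listening", listening), ("Reading", reading)):
    print(name, describe(scores))
```

Under these assumptions, the sketch gives a mean of 8.4 and a variance of 7.6 for listening, and a mean of 6.7 and a variance of 3.79 for reading, in line with the values reported in Table 2.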
Apart from the listening and reading subtests, which share a 10-point scale, the subtests are graded on different scales. These subtest scores are not aggregated, so that the EAP/English Department can decide from the individual subtests whether an international student has to take either or both of the EAP courses. Tables 3 and 4 present the subtest scores, and the comments following Table 4 explain how subtest scores are used to make placement decisions (Alderson et al., 1995).

Table 3
Placement Test Scores for Academic Interaction

         Subtest (Points Possible)
Learner  Listening (10)  Reading (10)  Oral (40)
1        10              8             R1 (30) R2 (33) = 31.5
2        10              6             R1 (38) R2 (38) = 38
3        9               6             R1 (28) R2 (26) = 27.5
4        2               2             R1 (17) R2 (20) = 18.5
5        5               7             R1 (30) R2 (27) = 28.5
6        8               6             R1 (34) R2 (34) = 34
7        10              8             R1 (30) R2 (30) = 30
8        10              7             R1 (23) R2 (26) = 24.5
9        10              8             R1 (28) R2 (31) = 29.5
10       10              9             R1 (27) R2 (27) = 27
Mean     8.4             6.7           28.9
Table 4
Placement Test Scores for Academic Writing

         Subtest (Points Possible)
Learner  Grammar (15)  Writing (100)
1        14            R1 (77) R2 (75) = 76
2        13            R1 (90) R2 (90) = 90
3        10            R1 (85) R2 (85) = 85
4        0             R1 (36) R2 (38) = 37.5
5        9             R1 (62) R2 (61) = 61.5
6        11            R1 (79) R2 (77) = 78
7        13            R1 (87) R2 (87) = 87
8        8             R1 (53) R2 (57) = 55
9        2             R1 (57) R2 (56) = 56.5
10       10            R1 (68) R2 (67) = 67.5
Mean                   69.4
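The way the scores in Tables 3 and 4 feed a placement decision can be sketched in Python. This is a minimal illustration, assuming the cut-offs described under Target population (at least 70% on each of listening, reading, and oral for Academic Interaction; at least 80% on each of grammar and writing for Academic Writing from Sources). The names `POINTS_POSSIBLE`, `CUTOFFS`, and `required_courses` are hypothetical.

```python
# Maximum points per subtest, taken from Table 2.
POINTS_POSSIBLE = {"listening": 10, "grammar": 15, "reading": 10,
                   "writing": 100, "oral": 40}

# Assumed cut-offs: each course maps to (its subtests, minimum proportion).
CUTOFFS = {
    "Academic Interaction": (("listening", "reading", "oral"), 0.70),
    "Academic Writing from Sources": (("grammar", "writing"), 0.80),
}


def required_courses(scores):
    """Return the EAP courses a test-taker must take, given raw subtest scores."""
    courses = []
    for course, (subtests, minimum) in CUTOFFS.items():
        # A course is required if any of its subtests falls below the cut-off.
        if any(scores[s] / POINTS_POSSIBLE[s] < minimum for s in subtests):
            courses.append(course)
    return courses


# Learner 1 from Tables 3 and 4.
learner_1 = {"listening": 10, "reading": 8, "oral": 31.5,
             "grammar": 14, "writing": 76}
print(required_courses(learner_1))
```

For Learner 1, the writing score of 76 out of 100 falls below the assumed 80% cut-off, so the sketch returns only Academic Writing from Sources.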
To be exempt from the Academic Interaction course, students must obtain a score of 70% or higher on each of these three subtests: listening comprehension, reading comprehension, and oral presentation. To be exempt from the Academic Writing from Sources course, students must obtain a score of 80% or higher on each of these two subtests: grammar and mini-essay writing. To further analyze students' scores on each subtest, we created the following frequency polygons for the listening subtest, which was scored partially subjectively, and for the reading and grammar subtests, which were both scored objectively:
Figure 1. Frequency Polygon for Listening and Reading Subtests

Figure 2. Frequency Polygon for Grammar Subtest
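The counts behind a frequency polygon are a tally of how many test-takers obtained each score. A minimal Python sketch, assuming the grammar scores reported in Table 4 (the helper name `frequency_table` is hypothetical):

```python
from collections import Counter

# Grammar scores (out of 15) of the 10 pilot test-takers, from Table 4.
grammar = [14, 13, 10, 0, 9, 11, 13, 8, 2, 10]


def frequency_table(scores, points_possible):
    """Count how many test-takers obtained each score from 0 to points_possible."""
    counts = Counter(scores)
    # Include scores nobody obtained so the polygon drops to zero there.
    return [(score, counts.get(score, 0)) for score in range(points_possible + 1)]


# Print a simple text rendering: one asterisk per test-taker.
for score, frequency in frequency_table(grammar, 15):
    print(f"{score:>2}: {'*' * frequency}")
```

Plotting the second element of each pair against the first gives the points that a frequency polygon connects.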
Looking at the frequency polygons and the descriptive results from Table 3 and Table 4, we wondered whether the listening comprehension subtest is too easy for the students. The mean of the listening subtest is 8.4, well above the 70% cut-off (7 out of 10). On the reading subtest, although the mean is only 6.7 out of 10, 60% of the students obtained a score of 70% or higher. In contrast, in the grammar, writing, and oral subtests, only 20-30% of the students met the requirements. Based on these scores, it seems that these students need to improve their writing and oral skills, with an emphasis on grammatical competence.

Analysis

Applying Wesche's (1983) Four Components

The following table shows the application of Wesche's (1983) four components framework to our test:

Table 5
Wesche's (1983) Framework

Subtest: Listening
Stimulus Materials: The test-taker watches a video clip of English Mania presented by Jay Walker (2009). The test also contains five short-answer questions related to the content of the video.
Task Posed to the Learner: The test-taker must watch and listen to the video and identify important information.
Learner's Response: The test-taker must write down their responses to the questions.
Scoring Criteria*: Questions 2 and 3 (requiring a specific number and country names) are marked using the exact word method. The remaining questions are marked using the acceptable word method. Students are given either 2 points or 0 points. For Question 3, partial credit (1 pt) is given when at least two correct countries are mentioned.

Subtest: Grammar
Stimulus Materials: The test-taker reads an article from the New York Times (Bahanoo, 2012).
Task Posed to the Learner: The test-taker must identify 15 extra words, each inserted within a sentence, that make the sentence ungrammatical based on the structural rules of English; the test-taker must pay attention to the details of the reading to find multiple grammar errors, such as in the use of articles and tenses.
Learner's Response: The test-taker must cross out the extra words.
Scoring Criteria*: The test-taker gets points when he/she crosses out the exact incorrect words.

Subtest: Reading
Stimulus Materials: The test-taker reads 1 long passage and 1 short passage. The test contains 5 multiple-choice questions for each passage.
Task Posed to the Learner: The test-taker must identify the main ideas of the readings and define the meaning of words within the given context.
Learner's Response: The test-taker must circle the letter representing the answer to a question.
Scoring Criteria*: The test-taker gets points when he/she circles the correct letters of the multiple-choice questions, as determined by the established key.

Subtest: Mini-essay Writing
Stimulus Materials: An essay prompt is presented to the test-taker.
Task Posed to the Learner: The test-taker must read and answer the given prompt. He/She must compose an organized piece of writing with sufficient examples and correct use of vocabulary and grammar.
Learner's Response: The test-taker must write an essay of about 180-250 words that states, explains, and supports his/her opinion on the given prompt.
Scoring Criteria*: The test-taker's essay is subjectively scored based on an analytic rubric set by the test designers. The rubric consists of three sections: content, organization, and grammar.

Subtest: Oral Interview
Stimulus Materials: A role-play scenario is given to the test-taker.
Task Posed to the Learner: The test-taker must read the prompt, understand the context, and adopt the role given in the scenario.
Learner's Response: The test-taker must take 2 minutes to prepare a persuasive speech that states, explains, and supports his/her opinion on the given topic and deliver it within 3 minutes.
Scoring Criteria*: The test-taker's speech is subjectively scored based on an analytic rubric set by the test designers. The rubric evaluates two aspects of a speech: content, and fluency and accuracy.
*Note. The keys and rubrics of the scoring criteria were all pre-established by the test designers, although the rubric of the oral interview was modified subject to the students responses from the piloting tests. Wesche (1983) points out the importance of using authentic materials in language testing. Therefore, the stimulus materials we selected to include in the test were sourced from
Original Test Design: Placement Exam authoritative publications, such as the New York Times and the National Geographic Learning.
18
We also decided to create a theme-based test to help scaffold students' knowledge, as well as to make the testing constructs more integrated with one another. Considering the background of our test-takers, we chose language learning as the overarching theme, because all test-takers share an experience of learning an additional language, English. In addition, we sequenced the test from the receptive skills (listening, grammar and reading) to the productive skills (speaking and writing) to enhance the production stage of the exam.

Applying Swain's (1980) Four Principles of Communicative Language Development

The following table shows how we applied Swain's four principles to our test:

Table 6
Swain's (1980) Framework

Subtest: Listening
  Start from somewhere: Our choice of this procedure is motivated by our intention to simulate an academic situation in which students are given a lecture.
  Concentrate on content: Since the test-takers are international students, the topic of English learning is relevant to them, and the video also serves to activate test-takers' schemata.
  Bias for best: The test-takers can get visual support besides the audio input. Also, they are allowed to take notes when watching the video. Spelling errors are not marked in test-takers' responses to the comprehension questions.
  Work for washback: The test-takers can: Experience a situation of taking a real academic lecture. Practice note-taking skills.

Subtest: Grammar
  Start from somewhere: Citing Larsen-Freeman (1991, 1997), Brown (2010) defines grammatical knowledge as comprising grammatical forms, grammatical meanings, and pragmatic meanings.
  Concentrate on content: Students can relate the content to their own experience in language learning.
  Bias for best: The subtest assesses multiple grammar points, such as the use of articles, adjectives, and verb tense.
  Work for washback: The test-takers can: Learn to pay attention to the details of the reading passages. Understand how meanings are associated with grammatical forms.

Subtest: Reading
  Start from somewhere: The design of the subtest was driven by both the top-down processing and the bottom-up processing of reading comprehension (Longman dictionary, 2012).
  Concentrate on content: Consistent with the content of the previous subtests, the two articles are also about language learning.
  Bias for best: The definitions of some difficult vocabulary terms are given in the test. Key words and key sentences are either underlined or bolded for attention. Paragraphs are marked with alphabetic letters for convenience of reference.
  Work for washback: The test-takers can: Expand their vocabulary knowledge. Learn to use context to interpret the meanings of the words. Identify the main ideas from the readings. Paraphrase the reading.

Subtest: Mini-essay Writing
  Start from somewhere: Through the essay writing task, we are able to identify students' strengths and weaknesses, including grammar usage and vocabulary knowledge.
  Concentrate on content: The essay prompt, whether learning English is important or not, has been developed through the previous subtests.
  Bias for best: The test-takers can use the materials provided on the test to support their opinions.
  Work for washback: The test-takers can: Write in a simulated academic context. Compose an argumentative essay. Incorporate sufficient sources into the writing.

Subtest: Oral Interview
  Start from somewhere: Besides the concern of using a direct test to measure the test-takers' oral competence, the construct of the oral test was also inspired by the frequent situations in academic settings where students are required to orally express their opinions supported by examples.
  Concentrate on content: The content is related to the theme of the test, language learning.
  Bias for best: The test-takers can use the materials provided on the test to support their opinions. The test-takers have 2 minutes to prepare and jot down some notes for their speech.
  Work for washback: The test-takers can: Experience a simulated academic presentation. Give a persuasive speech.
Examining Validity and Reliability

The Dordt College Placement Test (DCPT) is an important test not only for the English Department but also for the international students. Since the results of this test will be used to decide whether the incoming students need to take the English for Academic Purposes (EAP) classes in their first semester, we had to make sure that our newly revised test is valid and
reliable. Therefore, we piloted the DCPT with the current EAP students at Dordt and decided to conduct several analyses on our subtests, including Item Facility (I.F.), Item Discrimination (I.D.), Distractor Analysis, Response Frequency Distribution, Split-Half Reliability, Inter-Rater Reliability, and Subtest Relationships. Specifically for validity, we analyzed one of the objectively scored portions of our test, the reading comprehension subtest, using I.F., Distractor Analysis, I.D., and Response Frequency Distribution. To test the reliability, we evaluated the subjectively scored parts of our test using Inter-Rater Reliability and one of the objectively scored parts, the multiple-choice (M-C) test, using Split-Half Reliability. Finally, we assessed the correlation between scores on each of our subtests and the total test.

As mentioned previously, for our reading comprehension subtest, we designed an M-C test. Bailey (1998) notes that many teachers and test-makers use an M-C test as a method to assess students' ability because of the ease of test administration and scoring. Moreover, students may perceive an M-C test to be much fairer and/or more reliable since this test can be scored objectively (Bailey, 1998, p. 130). Despite its scoring practicality, the reality is that it is difficult to design an M-C test. In fact, Bailey (1998) mentions that it is quite labor-intensive because test-makers need to consider various factors. For instance, getting the questions (stems) and the options right takes time. To ensure that our M-C subtest is working well and that this subtest
provides valid information the college needs to make placement decisions, we conducted four different types of validity analyses.

Validity of M-C Questions

Item facility. Item facility (I.F.) is an index of how easy an individual item was for the people who took it (Bailey, 1998, p. 132). To calculate the I.F. of the reading comprehension multiple-choice (M-C) subtest, we used the following formula, taken from Bailey (1998, p. 132):

\[ \text{I.F.} = \frac{\text{number of test-takers answering the item correctly}}{\text{number of test-takers}} \]

According to Bailey (1998), the I.F. number ranges from 0.0, which signifies that every test-taker missed the item, to 1.0, which means everyone answered the item correctly. Table 7 presents the I.F. data for the DCPT reading comprehension M-C subtest.

Table 7
Reading Comprehension Subtest Item Facility (n=10)

Item   Students who answered the item correctly   Item Facility (I.F.)
1      8                                          0.80 (80%)
2      8                                          0.80
3      10                                         1.00
4      9                                          0.90
5      9                                          0.90
6      3                                          0.30
7      6                                          0.60
8      4                                          0.40
9      5                                          0.50
10     5                                          0.50
Average I.F. = 0.67
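As a minimal sketch, the I.F. values in Table 7 can be reproduced with a few lines of Python. The function and variable names are illustrative and not part of the original analysis; the 0.15 to 0.85 band is Oller's (1979) preferred range as cited in this report.

```python
# Number of test-takers (out of 10) who answered each item correctly (Table 7).
correct_counts = [8, 8, 10, 9, 9, 3, 6, 4, 5, 5]
n_test_takers = 10


def item_facility(n_correct, n_total):
    """I.F. = test-takers answering correctly / all test-takers."""
    return n_correct / n_total


facilities = [item_facility(c, n_test_takers) for c in correct_counts]
average_if = sum(facilities) / len(facilities)

# Oller's (1979) preferred range, as cited in the report (about 0.15 to 0.85).
LOW, HIGH = 0.15, 0.85
outside_range = [item for item, f in enumerate(facilities, start=1)
                 if not LOW <= f <= HIGH]

for item, f in enumerate(facilities, start=1):
    print(f"Item {item}: I.F. = {f:.2f}")
print(f"Average I.F. = {average_if:.2f}")
print("Items outside the preferred range:", outside_range)
```

Run on the Table 7 counts, the sketch flags the items whose I.F. exceeds 0.85, which are the same items singled out for revision in the discussion.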
Oller (1979) states that items falling somewhere between about 0.15 and 0.85 are usually preferred (p. 247). Based on our I.F. data, item 3 (1.00) and items 4 and 5 (0.90) need serious revisions, because Oller claims that in tests that are intended to reveal differences among the students who are better and worse performers on whatever is being tested, there is
nothing gained by including test items that every student answers correctly or that every student answers incorrectly (p. 246). Excluding items 3, 4, and 5, the remaining items fall well within Oller's preferred range. Half of the items (items 6, 7, 8, 9, and 10) appear to be in the medium difficulty range, from 0.30 to 0.60. Based solely on these results, we would not change the items with medium difficulty (items 6 to 10), but would revisit items 3, 4, and 5.

Distractor analysis. We conducted a Distractor Analysis to improve the validity of our M-C test and to make sure that each option for each test item was distracting. Also, since we assume that students have variable skills and knowledge, we do not want to have a test item option that is too obvious and serves no purpose in distinguishing students who know the correct answer from those who do not. Table 8 shows the number of students that selected each option.

Table 8
Reading Comprehension Subtest Distractor Analysis (n=10)

Item   A     B     C     D
1      8*    2     0     0
2      0     8*    1     1
3      0     10*   0     0
4      0     0     9*    1
5      9*    0     0     1
6      2     2     3     3*
7      2     2     0     6*
8      5     0     4*    1
9      1     4     5*    0
10     5*    1     2     2
Note. (*) indicates the correct answer to the item.

Based on the results of the Distractor Analysis, we can see that items 1 through 5 need attention. Previously, our I.F. analysis indicated that items 3, 4, and 5 should be revised because these items were too easy for the students. In the Distractor Analysis, we can verify that the options in items 3 to 5 should be changed considerably because the majority of the students chose one option over the others. The options in items 1, 2, 7, 8, and 9 should also be reviewed to ensure that all of the options are distracting, as they are in items 6 and 10.
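One simple way to screen the distractor data is to list, for each item, the incorrect options that no test-taker chose. The sketch below does this for the Table 8 counts; the "chosen by nobody" rule and all names are illustrative assumptions, not the authors' stated procedure.

```python
# Option counts per item from Table 8 (n = 10).
RESPONSES = {
    1: {"A": 8, "B": 2, "C": 0, "D": 0},
    2: {"A": 0, "B": 8, "C": 1, "D": 1},
    3: {"A": 0, "B": 10, "C": 0, "D": 0},
    4: {"A": 0, "B": 0, "C": 9, "D": 1},
    5: {"A": 9, "B": 0, "C": 0, "D": 1},
    6: {"A": 2, "B": 2, "C": 3, "D": 3},
    7: {"A": 2, "B": 2, "C": 0, "D": 6},
    8: {"A": 5, "B": 0, "C": 4, "D": 1},
    9: {"A": 1, "B": 4, "C": 5, "D": 0},
    10: {"A": 5, "B": 1, "C": 2, "D": 2},
}
# Correct option for each item (the starred cells in Table 8).
KEY = {1: "A", 2: "B", 3: "B", 4: "C", 5: "A",
       6: "D", 7: "D", 8: "C", 9: "C", 10: "A"}


def unused_distractors(counts, correct):
    """Return the incorrect options that nobody selected."""
    return [opt for opt, n in counts.items() if opt != correct and n == 0]


unused = {item: unused_distractors(counts, KEY[item])
          for item, counts in RESPONSES.items()}
# Items in which every distractor attracted at least one test-taker.
fully_functioning = [item for item, opts in unused.items() if not opts]

for item, opts in unused.items():
    print(f"Item {item}: unused distractors = {opts or 'none'}")
print("Items with all distractors functioning:", fully_functioning)
```

With only ten test-takers, a zero count is weak evidence, so a screen like this is best read as a list of options to inspect rather than a verdict.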
Item discrimination. The Item Discrimination method was used to find out how the top scorers and low scorers performed on each item in the M-C subtest. Using Flanagan's method of computing item discriminability (Oller, 1979), the top scorers and the low scorers were ranked based on the scores of the entire exam, including the other four subtests. We took the top 33% of the exams and the bottom 33% of the exams and calculated the I.D. using the following formula, taken from Bailey (2008, p. 136):

\[ \text{I.D.} = \frac{(\text{number of high scorers who got the item right}) - (\text{number of low scorers who got the item right})}{0.33 \times \text{total number of students (10)}} \]

Using Flanagan's formula, Table 9 reflects the calculated I.D. values for the M-C subtest. Investigating the I.D. value is helpful to us as test makers because we are able to know whether our low I.F. items were truly difficult and our high I.F. items were too easy for the students. Table 10 shows the I.F. and I.D. values side by side. We will use Table 10 to better analyze the results of the I.D. values.

Table 9
Reading Comprehension Subtest Item Discrimination (n=10)

Item   High scorers (top three) with correct answers   Low scorers (bottom three) with correct answers   Item Discrimination (I.D.)
1      2                                               2                                                 0.00
2      3                                               1                                                 0.61
3      3                                               3                                                 0.00
4      3                                               2                                                 0.30
5      3                                               2                                                 0.30
6      1                                               1                                                 0.00
7      2                                               2                                                 0.00
8      2                                               1                                                 0.30
9      2                                               1                                                 0.30
10     1                                               2                                                 -0.30
Average I.D. = 0.151

Table 10
Reading Comprehension Subtest Item Discrimination and Item Facility (n=10)

Item   High scorers (top three) with correct answers   Low scorers (bottom three) with correct answers   Item Discrimination (I.D.)   Item Facility (I.F.)
1      2                                               2                                                 0.00                         0.80 (80%)
2      3                                               1                                                 0.61                         0.80
3      3                                               3                                                 0.00                         1.00
4      3                                               2                                                 0.30                         0.90
5      3                                               2                                                 0.30                         0.90
6      1                                               1                                                 0.00                         0.30
7      2                                               2                                                 0.00                         0.60
8      2                                               1                                                 0.30                         0.40
9      2                                               1                                                 0.30                         0.50
10     1                                               2                                                 -0.30                        0.50
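The I.D. column can be checked with a short Python sketch. It assumes the denominator is taken literally as 33% of the 10 test-takers (3.3), since that is what reproduces the reported 0.61 for item 2; the 0.25 cut-off is Oller's (1979) lowest acceptable value as cited in this report, and the names are illustrative.

```python
# Correct answers among the top three and bottom three scorers, items 1-10.
high_correct = [2, 3, 3, 3, 3, 1, 2, 2, 2, 1]
low_correct = [2, 1, 3, 2, 2, 1, 2, 1, 1, 2]
n_students = 10

# Assumption: the denominator is 33% of all test-takers, i.e. 3.3.
group_size = 0.33 * n_students


def item_discrimination(high, low, size):
    """I.D. = (high scorers correct - low scorers correct) / group size."""
    return (high - low) / size


ids = [item_discrimination(h, l, group_size)
       for h, l in zip(high_correct, low_correct)]

MIN_ACCEPTABLE = 0.25  # Oller's (1979) lowest acceptable value, per the report
below_threshold = [item for item, d in enumerate(ids, start=1)
                   if d < MIN_ACCEPTABLE]
negative = [item for item, d in enumerate(ids, start=1) if d < 0]

for item, d in enumerate(ids, start=1):
    print(f"Item {item}: I.D. = {d:.2f}")
print("Items below 0.25:", below_threshold)
print("Items with negative discrimination:", negative)
```

Note that item 10 comes out negative because more low scorers than high scorers answered it correctly.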
Based solely on the I.D. values, items 1, 3, 6, 7, and 10 should probably be revised since Oller's (1979) lowest acceptable value is 0.25. From our previous discussion on Item Facility, we mentioned that item 3 should be addressed because the item was too easy for the students (I.F. = 1.00). In addition, we also mentioned keeping items 6, 7, and 10 because these items had medium difficulty and were in Oller's preferred I.F. range; but our I.D. values for items 6 and 7 show that they need to be revisited because equal numbers of top-scoring students and low-scoring students answered these items correctly. In fact, in item 10 (I.D. = -0.30), two high scorers missed the item while only one low scorer answered it incorrectly. Oller points out that we would be disturbed if we found an item that good readers (high scorers) tended to miss more frequently than weak readers (low scorers) (1979, p. 251). However, we need to consider two important factors before making changes to any items. First, because our sample size is
small (n=10), it is difficult to decide whether to revise these items using Oller's recommended range, especially for items 4 and 5: the high I.F. value (0.90) signals change, but the I.D. value (0.30) indicates that they are acceptable. If our sample size were larger, there might be more variance in our results; thus, we could analyze these items better. Second, the top scorers and the low scorers were divided based on the results of the entire test, which consists of five subtests evaluating different language constructs. The DCPT that we designed heavily emphasizes grammatical competence and writing skills; it was designed as such based on our needs analysis. As a result, we may have included students with high grammar knowledge and writing skills among the top three high scorers, even though we are analyzing students' reading comprehension ability. Students with high reading comprehension skills may have scored lower in other sections, and thus may not have been included in the top three high scorer sample. Therefore, keeping these factors in mind, we will revisit and mindfully revise the items that need attention.

Response frequency distribution. Prior to looking at the results of the Response Frequency Distribution, Table 11 provides a brief overview of the items that need attention and/or revision based on the I.F., Distractor Analysis, and I.D. analyses:

Table 11
Overview of Items that Need Attention

Analysis                 Item(s)
Item Facility            3, 4, and 5
Distractor Analysis      3, 4, and 5 (maybe 1, 2, 7, 8, and 9)
Item Discriminability    1, 3, 6, 7, and particularly 10 (-0.30)
Based on the information from Table 11, items 3, 4, 5, and 10 (since it is showing a negative discrimination) seem to need revisions, and items 7, 8, and 10 should be revisited. To further assist us in making decisions and analyzing the validity of our M-C subtest, we conducted the Response Frequency Distribution. Response Frequency Distribution analysis is a useful
method because it provides a detailed picture of what the top scorers and low scorers answered for each item; this analysis reflects the combination of the Distractor Analysis and I.D. analysis (see Table 12).

Table 12
Response Frequency Distribution on Reading Comprehension Subtest

Item   High/Low Scorers   A     B     C     D
1      High               2*    1     0     0
       Low                2*    1     0     0
2      High               0     3*    0     0
       Low                0     1*    1     1
3      High               0     3*    0     0
       Low                0     3*    0     0
4      High               0     0     3*    0
       Low                0     0     2*    1
5      High               3*    0     0     0
       Low                2*    0     0     1
6      High               1     0     1     1*
       Low                1     1     0     1*
7      High               0     1     0     2*
       Low                1     0     0     2*
8      High               1     0     2*    0
       Low                2     0     1*    0
9      High               0     1     2*    0
       Low                1     1     1*    0
10     High               1*    1     0     1
       Low                2*    0     1     0
Note. (*) indicates the correct answer to the item.
As shown in Table 12, items 1, 3, 7, and 10 need to be revisited because these items did not discriminate between the high scorers and the low scorers. For items 1, 3, and 7, an equal number of high scorers and low scorers answered the items correctly; for item 10, as mentioned previously, more low scorers than high scorers had the right answer. Item 6 also had an equal number of high and low scorers, but in terms of Distractor Analysis, this item is ideal. In addition, it is important to note that since we only selected the top three and
bottom three scorers, and because there are only four options, there will still be zero counts in our Response Frequency Distribution data even when responses are evenly distributed, as is the case in items 2, 6, and 9. The results in items 2, 8, and 9 are quite interesting because a majority (if not all three) of the top scorers answered these items correctly, while the low scorers were distracted by other options. Referring back to Table 11, the items that were easy were items 3, 4, and 5. Thus, we consider items 2, 8, and 9 to be useful items for separating the performance of high and low scorers. Finally, based on Table 12, it is worth revisiting items 4 and 5 because all three high scorers as well as 2 out of 3 low scorers answered these items correctly.

Through several validity analyses, we have learned that some of the items (either the questions or the options) need to be reviewed. Table 13 shows the overall summary of items that need attention.

Table 13
Overall Summary of Items that Need Attention

Analysis                                  Item(s)
Item Facility                             3, 4, and 5
Distractor Analysis                       3, 4, and 5
Item Discriminability                     1, 3, 6, 7, and particularly 10 (-0.30)
Response Frequency Analysis               1, 3, 7, and 10
Overall Items That Need Attention:        1, 3, 4, 5, 7, and 10
Reliability

Aside from ensuring validity, it is important to examine whether a test is reliable. According to Brown (2005), test reliability is defined as the extent to which the results can be considered consistent or stable (p. 175). As mentioned above, the DCPT consists of five subtests, of which the reading and grammar subtests are objectively scored. We consider the listening subtest a partially subjectively scored test because its scoring criteria adopted the acceptable word method. Internal-consistency measures can only be applied to the objectively scored tests. We also are not able to calculate the internal consistency of the grammar subtest because it is a cloze-elide test, and its items are dependent on one another. Therefore, we only measured the internal-consistency reliability of the reading subtest, which is composed of 10 M-C items. We used the split-half method and the Spearman-Brown prophecy formula to estimate the full subtest reliability (see Table 14).
Table 14
Internal Consistency Measures of the Reading M-C Subtest

Subtest: Reading
  Split-Half Reliability: 0.56
  Reliability after using Spearman-Brown Prophecy Formula: 0.72
  Standard Deviation: 1.85
  Standard Error of Measurement (SEM): 0.98
  Points Possible: 10.00
The reliability result of 0.72 using the Spearman-Brown formula means that the scores of the reading M-C subtest are 72% consistent, with a 28% measurement error (100% - 72% = 28%). The statistical results also suggest that our reading subtest has good reliability in terms of internal consistency. We make this claim in consideration of the following factors: (1) the sample size of the test is small, with only 10 test-takers; (2) the test is newly launched; and (3) the total number of items (10) on the M-C subtest is small. In addition, Brown (2005) points out that all the methods used to estimate internal-consistency reliability underestimate the actual value. Therefore, we are confident enough to conclude that the reading M-C subtest of the DCPT is fairly reliable.
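The step from the split-half coefficient to the full-test estimate can be sketched as follows. This uses the standard Spearman-Brown prophecy formula for doubling test length, r_full = 2r / (1 + r); the variable names are illustrative.

```python
def spearman_brown(r_half):
    """Estimate full-test reliability from a split-half correlation."""
    return 2 * r_half / (1 + r_half)


split_half_r = 0.56  # split-half reliability reported for the reading subtest
full_test_r = spearman_brown(split_half_r)
print(f"Estimated full-test reliability: {full_test_r:.2f}")
```

The correction is needed because each half contains only 5 of the 10 items, and shorter tests tend to yield lower reliability estimates.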
With the reliability estimate, we calculated the standard error of measurement (SEM) using the following formula, in which S stands for the standard deviation and r_{xx'} stands for the reliability estimate for the test:

\[ \text{SEM} = S\sqrt{1 - r_{xx'}} \] (Brown, 2005, p. 189)
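As a worked check, the formula can be evaluated with the standard deviation (1.85) and the Spearman-Brown reliability estimate (0.72) reported for the reading subtest. The names below are illustrative, and the one-SEM band around an observed score of 7 is shown only as an example.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = S * sqrt(1 - r_xx')."""
    return sd * sqrt(1 - reliability)


sd = 1.85           # standard deviation of the reading subtest scores
reliability = 0.72  # Spearman-Brown estimate for the full subtest
sem = standard_error_of_measurement(sd, reliability)

observed = 7        # example observed score
band = (observed - sem, observed + sem)
print(f"SEM = {sem:.2f}")
print(f"Band of one SEM around {observed}: {band[0]:.2f} to {band[1]:.2f}")
```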
As the formula suggests, the SEM value is related to the internal consistency of the test. It refers to the range within which a test-taker's scores can be expected to fall if he/she takes the test repeatedly. In other words, it expresses the precision of test scores. The SEM value we obtained is 0.98, which can be rounded up to 1 point. Thus, if a test-taker gets a score of 7 on the reading subtest, his/her true ability score lies, with a certain level of probability, between 6 and 8. Considering that the reading subtest is worth 10 points in total and each item is worth 1 point, the SEM value of 0.98 is quite good and reasonable. Thus, our SEM value further supports the reliability of our reading comprehension section.

Inter-rater reliability. Since the reliability of the reading section is confirmed, we now proceed to examine the reliability of our subjectively scored tests: the oral and the essay writing subtests. Using analytic scoring rubrics, we each rated the tests; upon scoring, the inter-rater reliability was measured. We calculated the final score of each of the subjectively scored sections by averaging our ratings (Rater 1's rating and Rater 2's rating). According to Bailey (1998), coefficient alpha is usually used to compare the scoring of the two raters. To calculate the coefficient alpha, the variance for each rater and the total variance for both raters were computed (see Table 15 and Table 16).
Table 15
Inter-rater Reliability for Oral Test

Learner              Rater 1   Rater 2   Rater 1 + Rater 2
1                    30        33        63
2                    38        38        76
3                    28        24        52
4                    17        22        39
5                    30        27        57
6                    34        34        68
7                    30        37        67
8                    23        26        49
9                    28        32        60
10                   27        27        54
Mean                 29        30        59
Standard Deviation   5.41      5.25      10.13
Variance             29.25     27.60     102.65
Coefficient Alpha = .89
As Table 15 suggests, the standard deviation for Rater 1 was slightly higher than the standard deviation for Rater 2, but the mean of Rater 1's scores was marginally lower than the mean of Rater 2's scores. Thus, we can infer that Rater 1 was slightly tougher and had a little more variability in scoring the oral test. Moreover, as shown in Table 15, the calculated coefficient alpha was 0.89. Bailey (1998) mentions that the closer the value is to the whole number 1.00, the greater the inter-rater reliability (p. 182). Therefore, based on the coefficient alpha value, the ratings on the oral test were quite reliable.
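The coefficient alpha in Table 15 can be reproduced with the sketch below. It assumes the common form alpha = k/(k-1) * (1 - sum of rater variances / variance of the summed ratings) with k = 2 raters, and it assumes population variances, since those match the variances reported in the table; the names are illustrative.

```python
from statistics import pvariance

# Oral test ratings for the 10 learners (Table 15).
rater1 = [30, 38, 28, 17, 30, 34, 30, 23, 28, 27]
rater2 = [33, 38, 24, 22, 27, 34, 37, 26, 32, 27]


def coefficient_alpha(*raters):
    """Coefficient alpha across raters, using population variances."""
    k = len(raters)
    totals = [sum(scores) for scores in zip(*raters)]
    rater_variance_sum = sum(pvariance(r) for r in raters)
    return (k / (k - 1)) * (1 - rater_variance_sum / pvariance(totals))


alpha_oral = coefficient_alpha(rater1, rater2)
print(f"Rater 1 variance: {pvariance(rater1):.2f}")
print(f"Rater 2 variance: {pvariance(rater2):.2f}")
print(f"Coefficient alpha (oral): {alpha_oral:.2f}")
```

The same function can be applied to any other pair of rating lists of equal length.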
Table 16
Inter-rater Reliability for Essay Writing Test

Learner              Rater 1   Rater 2   Rater 1 + Rater 2
1                    77        82        159
2                    90        90        180
3                    85        85        170
4                    33        41        74
5                    62        61        123
6                    82        75        157
7                    92        85        177
8                    53        62        115
9                    59        54        113
10                   76        67        143
Mean                 70.9      70.2      141.10
Standard Deviation   17.81     15.07     32.43
Variance             317.29    226.96    1051.49
Coefficient Alpha = .96
Similar to the analysis of the results shown in Table 15, the means and the standard deviations in Table 16 indicate that Rater 1's ratings were slightly more lenient and had more variability in scoring the writing subtest than Rater 2's. The coefficient alpha, 0.96, indicates an extremely high correlation between the two raters. In other words, the ratings of the two raters were highly reliable. Since the ratings of the two raters on both subjectively scored subtests are quite reliable, we infer that one reason for these results may be the use of analytic scoring scales. The analytic scoring rubrics outline the components of writing, such as content, organization, and grammar, in detail, so raters can easily identify the measurable concepts when evaluating the tests. It is also worth noting that the inter-rater reliability of the writing subtest is higher than that of the oral subtest. The reason for this difference may be that the analytic scoring scale of the writing subtest is more specific and detailed than that of the oral subtest.
Although the inter-rater reliability appears to be substantial, we acknowledge a limitation: we both served as test developers and raters. In real-life cases, the raters are often not involved in the test development. Although training is given to ensure the reliability of raters, raters sometimes have different perspectives on and interpretations of the scoring criteria. However, in our case, since we created the scoring criteria, we knew exactly what we were looking for. To ensure the successful application of the DCPT, we recommend that raters attend a rater conference, where they can be trained on how to score each criterion of the analytic scoring rubrics before they start to evaluate the test.

Subtest Relationships
To further strengthen the validity of our test, we evaluated the relationship between each pair of the subtests by conducting a statistical analysis of the correlation among the subtests of the DCPT. As mentioned before, the DCPT consists of five subtests. The listening comprehension subtest is worth 10 points; the grammar section is worth 15 points; the reading comprehension section has 10 total points; the writing section contains 100 points in total; and the oral subtest is worth 40 points (a total of 175 points). At first, we used the raw score formula to calculate Pearson's r, which is the correlation coefficient between each pair of the subtests. However, since each subtest has a different scoring scale and number of points, we decided to convert all the subtest scores to a standardized scale (z scores) to calculate Pearson's r. There was no difference in the results between using z scores and raw scores, which is expected because Pearson's r is unaffected by a linear rescaling of either variable. Table 17 reflects the statistical results of the subtest relationship.
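The agreement between the raw-score and z-score calculations can be illustrated with a short sketch. The per-student subtest scores are not reproduced in this report, so the two score lists below are hypothetical placeholders; the function names are likewise illustrative.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson's r computed from deviations around the means."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def z_scores(values):
    """Standardize scores to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical scores for 10 test-takers (NOT the DCPT pilot data).
reading = [7, 5, 9, 6, 8, 4, 7, 6, 5, 8]             # out of 10
writing = [70, 55, 90, 62, 85, 40, 77, 60, 58, 80]   # out of 100

r_raw = pearson_r(reading, writing)
r_z = pearson_r(z_scores(reading), z_scores(writing))
print(f"r from raw scores: {r_raw:.4f}")
print(f"r from z scores:   {r_z:.4f}")
```

Because standardizing only shifts and rescales each variable, the two coefficients agree up to floating-point error regardless of the point totals of the subtests.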
Table 17
Subtest Relationships: Correlation Coefficients (Pearson's r)

Test        Listening   Grammar   Reading   Writing
Grammar     0.58
Reading     0.79        0.52
Writing     0.67        0.89      0.47
Oral        0.57        0.69      0.44      0.78
As evident in Table 17, the values of the correlation coefficient are all positive. This positive correlation suggests that as scores in one subtest increase, the scores in another subtest tend to increase as well. Thus, if a test-taker improves his/her performance on the listening comprehension, he/she may also perform better on any of the other four subtests. Brown (2005) notes that relatively strong correlations would be those that range from +0.80 to +1.0, or -0.80 to -1.0 (p. 141). The greatest value indicated in Table 17 is 0.89, which is the value of the correlation coefficient between the grammar and writing subtests. The correlation coefficient between oral and writing is 0.78, and the correlation coefficient between listening and reading is 0.79. Both of them can be rounded up to 0.80. These three high values indicate a strong correlation between the paired subtests mentioned in comparison to the other pairs of subtests. For example, the reading and writing subtests as well as the reading and oral subtests show relatively low correlations (0.47 and 0.44, respectively). The high correlation between grammar and writing is expected, because half of the scores in the writing section depend mainly on the test-takers' grammatical competence. However, we cannot claim that our grammar section measures the same construct as the writing section. Oller (1979) strongly argues that a low correlation does not indicate that two tests are measuring
different constructs, nor does a high correlation indicate that two tests are measuring the same constructs. In fact, there are many factors that may impact the correlation between two tests. Oller also points out that high correlations have been observed between a wide variety of testing techniques with a wide range of tested populations (p. 193). Since our test is fairly new and our sample size of test-takers is small, it is normal that we do not have consistently high subtest correlations. To observe whether two different tests measure the same thing, we decided to square the correlation coefficients to obtain the values of overlapping variance. Table 18 reflects the values of overlapping variance between each pair of the subtests.

Table 18
r-squared for Subtest Relationships: Overlapping Variance

Test        Listening   Grammar   Reading   Writing
Grammar     0.34
Reading     0.62        0.27
Writing     0.45        0.79      0.22
Oral        0.32        0.48      0.19      0.61
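The overlapping variance values follow directly from squaring the Table 17 coefficients, as the sketch below shows. The dictionary layout and names are illustrative; the coefficients are the two-decimal values reported in Table 17, so results are subject to rounding.

```python
# Pearson's r for each pair of subtests (Table 17).
correlations = {
    ("Listening", "Grammar"): 0.58,
    ("Listening", "Reading"): 0.79,
    ("Listening", "Writing"): 0.67,
    ("Listening", "Oral"): 0.57,
    ("Grammar", "Reading"): 0.52,
    ("Grammar", "Writing"): 0.89,
    ("Grammar", "Oral"): 0.69,
    ("Reading", "Writing"): 0.47,
    ("Reading", "Oral"): 0.44,
    ("Writing", "Oral"): 0.78,
}

# Overlapping (shared) variance is the squared correlation coefficient.
overlap = {pair: round(r ** 2, 2) for pair, r in correlations.items()}
strongest = max(overlap, key=overlap.get)

for (a, b), r2 in overlap.items():
    print(f"{a} / {b}: r^2 = {r2:.2f}")
print("Largest overlap:", strongest, overlap[strongest])
```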
The highest value in Table 18, 0.79, is the overlapping variance between the grammar and writing subtests. This means that the writing section and the grammar section share almost 80% overlapping variance, which we attribute to their both measuring grammatical competence. As we mentioned previously, 50% of the scores in the writing section are dependent on grammar. Based on our needs analysis of Dordt College ESL students and the English Department, Dordt College strongly values grammatical competence and the writing skills of students. Therefore, we were glad to see that the grammar and writing sections have high reliability and a strong relationship within the DCPT.
Discussion
According to Oller (1979), there are four traditional criteria for evaluating tests: validity, reliability, practicality, and washback. First, the validity of a test refers to how well the test does what it is supposed to do, namely, to inform us about the examinees' progress toward some goal in a curriculum [...] or to differentiate levels of ability among various examinees on some task (p. 4). In terms of face validity, although a few items from the M-C subtest need improvement, we feel that overall the newly revised DCPT has face validity because the test was designed with careful consideration of the language constructs that need to be tested. Moreover, we made sure that the overall difficulty level of the entire test was appropriate, the instructions were clear, and the tasks were uncomplicated. It is also worth mentioning that we pre-piloted and piloted this test with the current international students who are taking EAP classes at Dordt College.

According to Mousavi (1999), a test is considered valid when the content of the test measures the language skills and structures with which it is meant to be concerned. To ensure that the content of our test is valid, especially for Dordt College, we reviewed and adapted all the skills that were tested on the original DCPT. However, based on our needs analysis, we found out that the English Department at Dordt, as well as the past and current international students, considered grammatical competence an important language skill. Therefore, with Alderson, Clapham, and Wall's (1995) guidance on developing test specifications, we identified five constructs for the new placement test: listening comprehension, grammatical knowledge, reading comprehension, writing ability, and oral skills. We also used Wesche's (1983) four components framework (see Appendix G) and Swain's (1980) four principles of communicative language test development (see Appendix H) as our guidance when developing and validating the content of our test.
Finally, since the target population for this test is incoming international
students who are language learners, we selected language learning as the overarching theme of the exam.

Oller's (1979) second criterion for evaluating tests is reliability. Oller states that the reliability of a test is a matter of how consistently it produces similar results on different occasions under similar circumstances (p. 4). Furthermore, Baker (1989) defines the term reliability as stability in the measure (p. 60). Based on the several reliability analyses that we conducted, it is safe to claim that the DCPT is a reliable test for assessing incoming international students' overall English language abilities. Using the split-half method, we confirmed the reliability of the reading M-C subtest. To examine the reliability of the subjectively rated scores, we used coefficient alpha to calculate inter-rater reliability, and we found that the ratings of the two raters are highly reliable. Finally, we also evaluated the subtest relationships to determine the strength of the correlation between each pair of subtests. Our results showed positive correlation coefficients between each pair of subtests.

The third criterion we used to evaluate our test is practicality. According to Oller (1979), practicality includes the preparation, administration, scoring, and interpretation of the test (p. 4). To ensure the practicality of our test, we pre-piloted and piloted our exam to measure and adjust the time limits. We also referred to the test administration specification sent by the EAP instructor. We adapted this specification and modified it accordingly (see Appendix A for Test Administration Procedures). We also made sure that the entire exam, especially the oral interview subtest, was not too lengthy, not only for the benefit of the students, but also to increase the ease of scoring and interpretation, as well as to lessen the burden on the volunteer students and faculty members who are involved in the test administration and scoring.
Finally, it is important to consider the washback of a test, the effect a test has on teaching and learning (Bailey, 1998, p. 249). Applying Swain's (1980) four principles of communicative language test development, the newly revised DCPT has the following washback for each subtest, as shown in Table 19.

Table 19
Washback of Subtests: Applying Swain's (1980) Framework

Subtest: Listening Comprehension
Washback: The test-takers can:
  Experience the situation of attending a real academic lecture.
  Practice note-taking skills.

Subtest: Grammar
Washback: The test-takers can:
  Learn to pay attention to the details of the reading passages.
  Learn that meanings are associated with grammatical forms.

Subtest: Reading Comprehension
Washback: The test-takers can:
  Expand their vocabulary knowledge.
  Learn to use context to interpret the meanings of words.
  Identify the main ideas of the readings.

Subtest: Mini-Essay Writing
Washback: The test-takers can:
  Write in a simulated academic context.
  Compose an argumentative essay.
  Incorporate sufficient sources into the writing.

Subtest: Oral Interview
Washback: The test-takers can:
  Experience a simulated academic presentation.
  Give a persuasive speech.
Conclusion

In the process of designing the DCPT, we used Wesche's (1983) four components framework (see Appendix G) and Swain's (1980) four principles of communicative language test development (see Appendix H) to ensure the quality of the test. Test specifications were also used to guide the test development process and to establish good comparability of scores across test forms (Alderson, Clapham, & Wall, 1995). Furthermore, we pre-piloted the test with three native (or near-native) English speakers before piloting it with students from the target group: current Dordt College ESL students. The pre-piloting stage allowed us to revisit some of our test items and to readjust the time allotted to each subtest to increase the practicality of the test. Thanks to the support of the ESL Department of Dordt College, our test was piloted in the environment where it would actually be adopted and administered. Therefore, the results of the DCPT were highly informative and indicative of its future performance.

The latter sections of this report specifically discussed the validity and the reliability results of the DCPT. In summary, the overall test is valid because the content measures the language skills and structures with which it is meant to be concerned (Mousavi, 1999). We conducted four statistical analyses (item facility, distractor analysis, item discrimination, and response frequency distribution) to test the quality of the M-C items on the reading subtest. Based on our statistical results, although some items need to be revisited to strengthen the overall validity of the reading subtest (see Table 13), we can confirm the overall validity of our reading subtest, especially considering that this is a new test. In addition, our reliability procedures yielded positive results, showing significant inter-rater reliability. Therefore, based on all the statistical results, we can safely conclude that the DCPT is valid and reliable for its use as a placement test. We can also confirm the practicality of the DCPT because we carefully designed the test and conducted several piloting procedures. Finally, we ensured the washback of the DCPT by applying Swain's (1980) four principles of communicative language test development.
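Two of the item analyses summarized above, item facility and item discrimination, can be sketched in a few lines of Python. This is a hedged illustration rather than the exact computation behind the reported tables: the use of top and bottom thirds as comparison groups, the function names, and the response data are assumptions made for the example.

```python
def item_facility(responses, item):
    """Proportion of test-takers who answered the given item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, group_divisor=3):
    """Difference in correct answers between high and low scorers.

    Assumption: the upper and lower groups are the top and bottom
    1/group_divisor of test-takers ranked by total score; other splits
    are also used in practice.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, len(ranked) // group_divisor)
    upper, lower = ranked[:n], ranked[-n:]
    correct_upper = sum(row[item] for row in upper)
    correct_lower = sum(row[item] for row in lower)
    return (correct_upper - correct_lower) / n


# Hypothetical responses: 6 test-takers x 3 items (1 = correct, 0 = incorrect).
hypothetical_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
for i in range(3):
    print(i + 1, item_facility(hypothetical_responses, i),
          item_discrimination(hypothetical_responses, i))
```

Item facility values near 0 or 1 flag items that are too hard or too easy, and low or negative discrimination values flag items that fail to separate stronger from weaker test-takers.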
References

Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Angeli, E., Wagner, J., Lawrick, E., Moore, K., Anderson, M., Soderlund, L., & Brizee, A. (2012, May 30). General format. Retrieved from http://owl.english.purdue.edu/owl/resource/560/01/
Bahanoo, S. (2012, April 3). How immersion helps to learn a new language. The New York Times. Retrieved from http://www.nytimes.com/2012/04/03/science/how-immersion-helps-to-learn-a-new-language.html
Bailey, K. M. (1998). Learning about language assessment: Dilemmas, decisions, and directions. Boston, MA: Heinle & Heinle Publishers.
Baker, D. (1989). Language testing: A critical survey and practical guide. London: Edward Arnold.
Brown, H. D. (2010). Language assessment, principles and classroom practice. New York, NY: Pearson Education.
Deutscher, G. (2010, August 26). Does your language shape how you think? The New York Times. Retrieved from http://www.nytimes.com/2010/08/29/magazine/29language-t.html?pagewanted=all
Ferris, D., & Hedgcock, J. (Forthcoming). Teaching L2 composition: Purpose, process, and practice (3rd ed.). New York, NY: Routledge.
Mousavi, S. A. (1999). A dictionary of language testing. Tehran: Rahnama Publications.
Oller, J. W. (1979). Language tests at school. London: Longman Group.
Richards, J. C., & Schmidt, R. (2010). Longman dictionary of language teaching & applied linguistics. London, UK: Pearson Education Limited.
Swain, M. (1984). Large-scale communicative language testing: A case study. In S. J. Savignon & M. Berns (Eds.), Initiatives in communicative language teaching (pp. 185-201). Reading, MA: Addison-Wesley.
Walker, J. (2009, February). Jay Walker: World's English mania [Video file]. Retrieved from http://www.ted.com/talks/lang/en/jay_walker_on_the_world_s_english_mania.html
Wesche, M. B. (1983). Communicative testing in a second language. Modern Language Journal, 67, 41-55.
Vargo, M., & Blass, L. (2013). Pathways 1: Reading, writing, and critical thinking. Boston: National Geographic Learning.
Appendices

Appendix A: New Logistics Guide

Instructions for Logistics Team for Test Administration
DATE
NAMES OF ADMINISTRATORS
9:00 AM - 12:00 noon

General Information

1. There will be a reception desk in front of the circulation desk in the library. All students taking the interview will be told to report to the reception desk. Your team members will be at that desk to welcome students as they arrive. There will also be a few chairs for students who may have to wait a minute for your attention.
2. You will have a schedule with a list of the interview rooms, interview teams, interview times, and student names. As each student comes to the reception desk, you will find the interview station where they are expected and bring them to the door/entrance.
3. After the first part of the EIIS, the oral interview, students will be directed to return to the reception desk, where one of you will take the student to the appropriate place for the next part of the interview.
4. Here follows a list of locations for the various parts of the interview:
   Part I: Listening: watching a TED Talk video on YouTube and answering questions: computer bank in TRC. Each student will need a set of headphones. These are available from a librarian at the circulation desk.
   Part II, III, IV: Grammar, Reading, & Essay Writing: reading of article and answering objective questions: large tables or individual chairs with writing surface on the right side of the Teaching Resource Center (TRC). Place one student at each table or chair. Be sure each student has plenty of elbow space and privacy.
   Part V: Oral Interview: assigned station; see schedule.

Turn to the next page
Specific Procedure

1. Part I: Listening: Give students pages 1, 2, and 3. Take students to one of the computers in the TRC and give them a set of headphones. Point out and remind students to carefully read the instructions on pages 2 and 3. Also point out that page 2 should be used for taking notes on the video presentation. Tell students that they have 10 minutes to complete this section. Note the starting and stopping times. Collect the pages and place them in the student folder, from which I will retrieve them. Guide the students back to the reception desk.
2. Part II & III: Grammar & Reading: When students return to the reception desk, give them a copy of pages 4, 5, 6, 7, and 8. Bring students to a table or chair in the Teaching Resources Center (TRC). Instruct the student to read the instructions on both pages carefully. Tell the students they have 25 minutes to complete Part II and III. Write down the student starting and stopping times on the sheet provided. Collect the pages and place them in the student folder, from which I will retrieve them.
3. Part IV: Essay Writing: At the same location (TRC), give students a copy of pages 9 and 10. Remind students to read all of the instructions before they begin to write. Remind them to write at least 180 words. Students have 20 minutes to complete this essay. Record starting and stopping times. If the student does not come to you, you should go to the student and inform him or her that time is up. When the student has completed the essay, collect the pages and place them in the student folder, from which I will retrieve them. Ask the student to sign up for an oral interview.
4. Photocopies and distribution: As soon as you have each student's essay/paragraph, make two copies (the reference librarian will give you money) and bring the three copies of the paragraph to the team that interviewed the student. (Be sure the team is not in the middle of an interview with another student.) If the team is waiting till all paragraphs are done, store the three copies in the student folder and make sure the copies get to the right place.
5. Part V: Oral Interview: As students come in, they will be helped right away, or asked to take a seat until a team member is available. Greet the student and ask for the student's name. Check your schedule to see where the student will be interviewed and accompany the student to the entrance of the interview station. The interview team will take over from there.

THANK YOU!
Appendix B: Original Logistics Guide

Instructions for Logistics Team for EIIS Administration
August 24, 2012
Kerrie Best, Fanny Gonzales Garcia, Giovi Romero, Yushin Tsai
9:00 AM - 12:00 noon

General Information

1. There will be a reception desk in front of the circulation desk in the library. All students taking the interview will be told to report to the reception desk. Your team members will be at that desk to welcome students as they arrive. There will also be a few chairs for students who may have to wait a minute for your attention.
2. You will have a schedule with a list of the interview rooms, interview teams, interview times, and student names. As each student comes to the reception desk, you will find the interview station where they are expected and bring them to the door/entrance.
3. After the first part of the EIIS, the oral interview, students will be directed to return to the reception desk, where one of you will take the student to the appropriate place for the next part of the interview.
4. Here follows a list of locations for the various parts of the interview:
   a. Part I: Oral Interview: assigned station; see schedule
   b. Part II: Reading of article and answering objective questions: large tables or individual chairs with writing surface on the right side of the Teaching Resource Center (TRC). Place one student at each table or chair. Be sure each student has plenty of elbow space and privacy.
   c. Part III: watching video lecture and answering questions: computer bank in TRC. Each student will need a set of headphones. These are available from a librarian at the circulation desk.
   d. Part IV: writing prompt: large tables or individual chairs with writing surfaces on right-hand side of TRC

Specific Procedure

1. Part I: As students come in, they will be helped right away, or asked to take a seat until a team member is available. Greet the student and ask for the student's name. Check your schedule to see where the student will be interviewed and accompany the student to the entrance of the interview station. The interview team will take over from there.
2. Part II: When students return to the reception desk, give them a copy of pages 9 and 10. Bring students to a table or chair in the Teaching Resources Center (TRC). Instruct the student to read the instructions on both pages carefully. Tell the students they have 20 minutes to complete this part of the interview. Write down the student starting and stopping times on the sheet provided. If students do not return to you 20 minutes after they have started, go to the CRC and politely inform them that time is up. Collect the pages and place them in the student folder, from which I will retrieve them.
3. Part III: Give students pages 16, 17, and 18. Take students to one of the computers in the TRC and give them a set of headphones. Point out and remind students to carefully read the instructions on page 18. Also point out that page 17 should be used for taking notes on the video-taped lecture and that the chart on page 16 is a copy of a chart shown briefly during the lecture. Tell students that they have 20 minutes to complete this section. Note the starting and stopping times. Again, if the student does not come to you after 20 minutes, you should go to the student. You should collect the answer sheet, page 18, but instruct the student to keep the chart and the notes for use with the final part of the interview.
4. Part IV: Take students, with their pages 16 and 17, back to a table or chair in the TRC. Give them copies of pages 20 and 21 (this is one sheet that has the instructions and a lined area for writing the essay). Remind students to read all of the instructions before they begin to write. Remind them also that they can refer to their notes. Students have 30 minutes to complete this final part of the interview. Record starting and stopping times. Ask students to bring their completed mini-essay to one of the team at the reception desk. As always, if the student does not come to you, you should go to the student and inform him or her that time is up. When the student has completed this final part of the interview, please direct him or her to the Commons for lunch.
5. Photocopies and distribution: As soon as you have each student's essay/paragraph, make two copies (the reference librarian will give you money) and bring the three copies of the paragraph to the team that interviewed the student. (Be sure the team is not in the middle of an interview with another student.) If the team is waiting till all paragraphs are done, store the three copies in the student folder and make sure the copies get to the right place.

THANK YOU! THANK YOU! THANK YOU!
Appendix C: Oral Interview Schedule Sample

INTERVIEW SCHEDULE
ENTRANCE INTERVIEW FOR INTERNATIONAL/ESL STUDENTS
Friday, August 24, 2012, John and Louise Hulst Library, Upper Level

ORAL INTERVIEW BEGINS: 9:00 AM (10:30), 9:20 AM (10:50), 9:40 AM (11:10), 10:00 AM (11:30)

ROOM 262 (L. VAN BEEK, J. VERSLUIS, C. HENTGES)
  Ivy Mangeli, Kenya, ex.
  Alba Garcia Macias, Mexico, fr.
  Bit Null Ryu, South Korea, ex.
  Carolyne Muthoni, Kenya, fr.

ROOM 263 (H. SCHAAP, D. ROTH, S. GRONECK)
  Winnie Obiero, Kenya, fr.
  Eun Hye Jee, South Korea, ex.
  Jung Eun Sun, South Korea, ex.
  Dong Hyun Park, South Korea, fr.

ROOM 264 (L. ZUIDEMA, B. KUIPER, K. SANDOUKA)
  Yonatan Ashenafi, Ethiopia, fr.
  Young In Kim, South Korea, ex.
  Fortunate Magara, Uganda, ex.

ALCOVE (M. DENGLER, N. VAN GAALEN, A. FOREMAN)
  Henry Murray, Panama, tr.
  Eui Shin Kim, South Korea, ex.
  David Baldusi Alves, Brazil, fr.

REFERENCE CORNER (S. TAYLOR, I. MULDER, M. DRISSEL)
  Juan Benitez Gonzalez, Paraguay, fr.
  Ju Eun Park, South Korea, ex.
  Ji Eun Kim, South Korea, ex.
There will be a reception desk in front of the main circulation desk of the library and a logistics team to welcome and move our students to and from various parts of the interview. Team members are: Kerrie, Giovi, Yuhsin, Fanny, and Sanneke Kok, Coordinator of Academic Services for International Students.
Appendix D: Answer Key with Scoring Criteria

I. Listening Comprehension: Watch the video "English Mania," a TED Talk presented by Jay Walker (about 4 minutes), and answer the following questions. Please answer in fewer than 50 words (ALL ANSWERS MUST BE IN COMPLETE SENTENCES EXCEPT FOR QUESTIONS 2 & 3)

Transcript¹

Let's talk about manias. Let's start with Beatle mania: hysterical teenagers, crying, screaming, pandemonium. Sports mania: deafening crowds, all for one idea -- get the ball in the net. Okay, religious mania: there's rapture, there's weeping, there's visions. Manias can be good. Manias can be alarming. Or manias can be deadly. The world has a new mania. A mania for learning English. Listen as Chinese students practice their English by screaming it.

Teacher: ... change my life! Students: I will change my life. T: I don't want to let my parents down. S: I don't want to let my parents down. T: I don't ever want to let my country down. S: I don't ever want to let my country down. T: Most importantly ... S: Most importantly ... T: I don't want to let myself down. S: I don't want to let myself down.

Jay Walker: How many people are trying to learn English worldwide? Two billion of them.

Students: A t-shirt. A dress.

JW: In Latin America, in India, in Southeast Asia, and most of all in China. If you are a Chinese student you start learning English in the third grade, by law. That's why this year China will become the world's largest English-speaking country. (Laughter) Why English? In a single word: Opportunity. Opportunity for a better life, a job, to be able to pay for school, or put better food on the table. Imagine a student taking a giant test for three full days. Her score on this one test literally determines her future. She studies 12 hours a day for three years to prepare. 25 percent of her grade is based on English. It's called the Gaokao, and 80 million high school Chinese students have already taken this grueling test. The intensity to learn English is almost unimaginable, unless you witness it.

Teacher: Perfect! Students: Perfect! T: Perfect! S: Perfect! T: I want to speak perfect English. S: I want to speak perfect English. T: I want to speak -- S: I want to speak -- T: perfect English. S: perfect English. T: I want to change my life! S: I want to change my life!
¹ Walker, J. (2009, February). Jay Walker: World's English mania [Video file]. Retrieved from http://www.ted.com/talks/lang/en/jay_walker_on_the_world_s_english_mania.html
JW: So is English mania good or bad? Is English a tsunami, washing away other languages? Not likely. English is the world's second language. Your native language is your life. But with English you can become part of a wider conversation: a global conversation about global problems, like climate change or poverty, or hunger or disease. The world has other universal languages. Mathematics is the language of science. Music is the language of emotions. And now English is becoming the language of problem-solving. Not because America is pushing it, but because the world is pulling it. So English mania is a turning point. Like the harnessing of electricity in our cities or the fall of the Berlin Wall, English represents hope for a better future -- a future where the world has a common language to solve its common problems.

Short-answer questions: Spelling errors are allowed; deduct 1 pt when the sentences are not complete, except for questions 2 & 3 (2 pts per question = 10 pts total)

1. In your own words, define the word "mania."
   *Acceptable words: enthusiasm, passion, craze, popular trend generating wide enthusiasm, hysteria, craziness, alarming
2. How many people are trying to learn English worldwide?
   *Answer: 2 (two) billion
3. Name at least 3 countries/regions that the speaker mentioned that have manias for English.
   *Answer: Latin America, India, Southeast Asia, and China
4. According to the speaker, why are so many people trying to learn English?
   *Answer: 2 pts (full credit): opportunity, for better life, hope, language of problem solving, world's second language; 1 pt: acceptable words: job, pay for school, put better food on the table, academic achievement; 0 pts: no mention of any of the words
5. What is the speaker's opinion on English mania?
   *Answer: English mania is more positive than negative; English mania is positive; it is a turning point = 2 pts; no mention of good = 0 pts.
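The scoring rule for the short-answer questions (2 points per question, 1 point deducted for an incomplete sentence except on questions 2 and 3) can be expressed as a small Python sketch. The function name, the input format, and the choice never to let a question score fall below zero are assumptions for illustration; the rater still judges content points and sentence completeness by hand.

```python
# Questions exempt from the complete-sentence requirement (per the answer key).
EXEMPT_QUESTIONS = {2, 3}


def score_listening(content_points, is_complete_sentence):
    """Total the listening subtest score out of 10.

    content_points: rater-assigned points (0-2) for questions 1-5, in order.
    is_complete_sentence: True/False per question, as judged by the rater.
    Assumption: a deduction cannot push a question's score below 0.
    """
    total = 0
    for number, (points, complete) in enumerate(
        zip(content_points, is_complete_sentence), start=1
    ):
        if number not in EXEMPT_QUESTIONS and not complete:
            points = max(0, points - 1)
        total += points
    return total


# Hypothetical test-taker: full content credit, but question 4 is a fragment.
print(score_listening([2, 2, 2, 2, 2], [True, False, False, False, True]))
```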
II. Grammar: The passage was taken from the New York Times newspaper, published on April 3, 2012. Read the following passage and cross out 15 extra words that make the sentences grammatically incorrect (15 pts total).

Example: The boys is are singing the national anthem.

How Immersion Helps to Learn a Language²

Answer key: The crossed out words are bolded

Learning (1) a the foreign language is never easy, but contrary to common wisdom, it is possible for adults to process a language (2) a the same way (3) a the native speaker does. And over time, processing improves even when the skill goes unused, researchers are reporting. For (4) there their study, (5) in on the journal PloS One, the scientists used an artificial language of 13 words, completely different from English. "It's totally (6) unpractical impractical to follow someone to high proficiency because it takes years and years," said the lead author, Michael Ullman, a neuroscientist at Georgetown University Medical Center. The language dealt with pieces and moves in (7) a the computer game, and the researchers tested proficiency by asking test subjects to play (8) a the game. The subjects (9) are were split into two groups. One group studied the language in a formal classroom setting, while the other (10) was were trained through immersion. After five months, both groups retained the language (11) even though because they had not used it at all, and both displayed brain processing similar to that of a native speaker. But the immersion group displayed the full brain patterns (12) for of a native speaker, Dr. Ullman said. The research has several applications, Dr. Ullman said. "This should help us understand how foreign-language learners can achieve native-like processing with (13) increase increased practice," he said. "It makes sense that you'd want to have your brain process like (14) a the native speaker." And though it may (15) take takes time, and more research, the work "also could or should help in rehabilitation of people with traumatic brain injury," he added.

² Bahanoo, S. (2012, April 3). How immersion helps to learn a new language. The New York Times. Retrieved from http://www.nytimes.com/2012/04/03/science/how-immersion-helps-to-learn-a-new-language.html
III. Reading Comprehension: (10 pts, 1 pt each)

Passage 1: The World's Oldest First Grader

1. Based on the passage, we can infer that before 2003, primary education in Kenya was:
   a. Not cheap
   b. Not available
   c. Prohibited
   d. Free
2. Why was Maruge motivated to study?
   a. To be in one of the top five students in his class.
   b. To use his education to read the Bible.
   c. To become the school's student leader.
   d. To study Swahili, English, and math.
3. Who did NOT want Maruge to be in school?
   a. Kenyan government
   b. First grade parents
   c. Jane Obinchu
   d. None of the above
4. The main idea in paragraph (E) is:
   a. People were fighting and burning houses in the village.
   b. It was too difficult to live in a tent at a refugee camp.
   c. Maruge did not stop studying, even during those difficult times.
   d. Maruge taught other residents of the home to read and write.
5. The main idea in paragraph (G) is:
   a. Maruge was an inspiration to other adult Kenyans.
   b. Kenyans enjoyed the movie The First Grader.
   c. Thoma Litei decided to go to school to learn.
   d. The First Grader was created after Maruge's death.

Passage 2:

1. The author's attitude to Whorf's theory is
   a. Ambivalent
   b. Neutral
   c. Supportive
   d. Contemptuous
2. The word "trauma" in the passage is closest in meaning to
   a. Physical injury
   b. Torture
   c. Emergency
   d. Agony
3. All of the following can be inferred from the text EXCEPT
   a. Learning our mother tongue can lead to positive experiences.
   b. The influence of mother tongue on our thoughts is significant.
   c. Whorf's theory was based on hard facts and solid common sense.
   d. Whorf failed to provide any evidence to support his theory.
4. The author uses the word "crash-landed" to imply that Whorf's theory was _________ hard facts and solid common sense.
   a. in favor of
   b. based on
   c. inconsistent with
   d. critical of
5. Which of the sentences below best expresses the essential information in the boldfaced sentence in the passage?
   a. Exploring the relationship between the mother tongue and our thoughts was frowned upon for decades.
   b. People reacted severely and they explored the relationship between the mother tongue and our thought.
   c. Whorf's theory succeeded in exploring the relationship between the mother tongue and our thoughts.
   d. Whorf's claims were so credible that no researcher made an attempt to dishonor Whorf for decades.
IV. Mini-essay writing: Write a mini-essay of about 180-250 words according to the following prompt. You will be tested on the following criteria: content, organization, and grammar.

Do you think learning English is important? If so, why or why not? Please provide personal examples to support your stance (Total 100 pts).

Content (Scoring: circle the appropriate score)
  Clearly relates or answers to the given topic or question: Clear 5 4 3 2 1 0 Missing
  Gives sufficient examples/references: Sufficient 5 4 3 2 1 0 Lacking
  Clear connection between examples/references and main ideas: Clear 5 4 3 2 1 0 Missing
  Correct use of vocabulary words: Correct 5 4 3 2 1 0 Incorrect
  Sufficient number of words (180-250): Target #: 5; 160-179 words: 4; 140-159 words: 3; 120-139 words: 2; 100-119 words: 1; less than 100 words: 0
  Subtotal: points for content _________/25

Organization (Scoring: circle the appropriate score)
  Topic or introductory sentence: Clear 5 4, Not Clear 3 2 1, Missing 0
  Concluding sentence: Clear 5 4, Not Clear 3 2 1, Missing 0
  Coherence (logical progression and development of ideas, good flow): Always 5 4, Sometimes 3 2 1, Never 0
  Cohesion (good connections between sentences): Always 5 4, Sometimes 3 2 1, Never 0
  Sentence variety (both simple and compound and/or complex): Good Variety 5 4, Some Variety 3 2 1, Never 0
  Subtotal: points for organization ________/25

Grammar (Scoring: take off one point for each error in the categories indicated. Circle the # of remaining pts.)
  Correct spelling (subtract 1 pt for each new error): 5 4 3 2 1 0
  Correct use of articles and prepositions: 5 4 3 2 1 0
  Standard capitalization: 5 4 3 2 1 0
  Standard punctuation (periods, commas, semicolons): 5 4 3 2 1 0
  Standard sentence word order: 5 4 3 2 1 0
  Agreement between subjects & verbs, nouns and pronouns/antecedents: 5 4 3 2 1 0
  Correct verb tense and usage: 5 4 3 2 1 0
  Correct adverb and adjective usage: 5 4 3 2 1 0
  Appropriately placed phrasal modifiers: 5 4 3 2 1 0
  Standard academic diction (avoidance of slang and informal language): 5 4 3 2 1 0
  Subtotal: points for grammar _______/50

TOTAL POINTS _______/100
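The two mechanical parts of the essay rubric, the word-count band and the one-point-per-error grammar categories, lend themselves to a short Python sketch. It is illustrative only: the function names are hypothetical, and treating essays longer than 250 words as meeting the target is an assumption, since the rubric does not say how over-length essays are scored.

```python
def word_count_points(n_words):
    """Map an essay's word count to the 0-5 'sufficient number of words' score.

    Assumption: any essay of 180 words or more earns the target score of 5,
    including essays over 250 words (the rubric is silent on that case).
    """
    bands = ((180, 5), (160, 4), (140, 3), (120, 2), (100, 1))
    for minimum, points in bands:
        if n_words >= minimum:
            return points
    return 0


def grammar_category_points(n_errors):
    """Start from 5 and take off one point per error, never going below 0."""
    return max(0, 5 - n_errors)


def essay_total(content_scores, organization_scores, grammar_error_counts):
    """Sum the three rubric sections: content /25, organization /25, grammar /50."""
    grammar = sum(grammar_category_points(n) for n in grammar_error_counts)
    return sum(content_scores) + sum(organization_scores) + grammar


# Hypothetical essay: 172 words, rater scores for the other criteria,
# and error counts for the ten grammar categories.
content = [5, 4, 4, 5, word_count_points(172)]
organization = [5, 4, 4, 3, 3]
errors = [1, 2, 0, 0, 1, 0, 3, 0, 0, 1]
print(essay_total(content, organization, errors))
```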
V. Oral Interview: 25 pts total

In the United States, many universities require students to learn an additional language other than their native language. Do you think universities in your home country should require students to learn an additional language (other than your native language)? Why or why not?

You have 2 minutes to prepare. Use the space below to write down an outline or important points that you want to discuss. You will be given a maximum of 3 minutes to answer the question. You can use your notes to talk, but do not read aloud what you have written out. Please relate the issue to your personal experience and cultural background.

*For this subjectively scored portion, the following criteria will be assessed: Content and Fluency & Accuracy; we will use an analytic scale:

Oral Interview Criteria (Scoring: circle the appropriate score)

Content
  Clearly relates or answers to the given topic or question: Clear 5 4 3 2 1 0 Missing
  Gives adequate and meaningful examples/references: Sufficient 5 4 3 2 1 0 Lacking
  Clear connection between examples/references and main ideas: Clear 5 4 3 2 1 0 Missing
  Correct use of vocabulary words: Correct 5 4 3 2 1 0 Incorrect

Accuracy
  Correct use of grammar: Correct 5 4 3 2 1 0 Incorrect
  Clear pronunciation of words: Clear 5 4, Not Clear 3 2 1 0, Missing

Fluency
  Coherence (logical progression and development of ideas, good flow): Always 5 4, Sometimes 3 2 1, Never 0
  Fluency in speech (with little circumlocution and few hesitations): Fluent 5 4, Somewhat Fluent 3 2, Not Fluent 1 0

TOTAL POINTS ______/40
Appendix E: Dordt College Placement Test (DCPT)

Instruction: The placement test consists of 5 sections (about 1 hour total). For the listening comprehension, the test instructor will play a short video.

I. Listening Comprehension (10 minutes)
II. Grammar (5 minutes)
III. Reading Comprehension (20 minutes)
IV. Mini-Essay Writing (20 minutes)

After you complete all four sections, submit your test to the test instructor and schedule a time to do the oral interview section. The test instructor will provide you with the oral interview section of the test. You will be interviewed individually and the interview will be audio-recorded.

V. Oral Interview (about 5 minutes)
Name: ___________________
I. Listening Comprehension: Short-Answer Questions

Watch the video "World's English Mania," a TED Talk presented by Jay Walker (about 4 minutes).³ Use the space below to take notes. After watching the video, answer the following five questions. Please answer in fewer than 50 words (ALL ANSWERS MUST BE IN COMPLETE SENTENCES EXCEPT FOR QUESTIONS 2 & 3)

Use this space to take notes
TURN TO NEXT PAGE FOR QUESTIONS

³ Walker, J. (2009, February). Jay Walker: World's English mania [Video file]. Retrieved from http://www.ted.com/talks/lang/en/jay_walker_on_the_world_s_english_mania.html
Please answer in fewer than 50 words (ALL ANSWERS MUST BE IN COMPLETE SENTENCES EXCEPT FOR QUESTIONS 2 & 3)

1. In your own words, define the word "mania."
2. How many people are trying to learn English worldwide?
3. Name at least 3 places (countries or regions) that the speaker mentioned that HAVE manias for English?
4. According to the speaker, why are so many people trying to learn English?
5. What is the speaker's opinion of English manias?
II. Grammar: The passage was taken from the New York Times newspaper, published on April
Original Test Design: Placement Exam 3, 2012.4 Read the following passage and cross out 15 extra words that make the sentences grammatically incorrect. Example: The boys is are singing the national anthem. How Immersion Helps to Learn a Language
57
Learning a the foreign language is never easy, but contrary to common wisdom, it is possible for adults to process a language a the same way a the native speaker does. And over time, processing improves even when the skill goes unused, researchers are reporting. For there their study, in on the journal PloS One, the scientists used an artificial language of 13 words, completely different from English. Its totally unpractical impractical to follow someone to high proficiency because it takes years and years, said the lead author, Michael Ullman, a neuroscientist at Georgetown University Medical Center. The language dealt with pieces and moves in a the computer game, and the researchers tested proficiency by asking test subjects to play a the game. The subjects are were split into two groups. One group studied the language in a formal classroom setting, while the other was were trained through immersion. After five months, both groups retained the language even though because they had not used it at all, and both displayed brain processing similar to that of a native speaker. But the immersion group displayed the full brain patterns for of a native speaker, Dr. Ullman said. The research has several applications, Dr. Ullman said. This should help us understand how foreign-language learners can achieve native like processing with increase increased practice, he said. It makes sense that youd want to have your brain process like a the native speaker. And though it may take takes time, and more research, the work also could or should help in rehabilitation of people with traumatic brain injury, he added.
⁴ Bahanoo, S. (2012, April 3). How immersion helps to learn a new language. The New York Times. Retrieved from http://www.nytimes.com/2012/04/03/science/how-immersion-helps-to-learn-a-new-language.html
III. Reading Comprehension:
Passage 1: The passage was taken from National Geographic Learning.⁵ Read the passage below and answer the multiple-choice questions following the passage. Circle the letter of the best answer.
The World's Oldest First Grader
On January 12, 2004, Kimani Maruge knocked on the door of the primary school in his village in Kenya. It was the first day of school, and he was ready to start learning. The teacher let him in and gave him a desk. The new student sat down with the rest of the first graders, six- and seven-year-old boys and girls. However, Kimani Maruge was not an ordinary first grader. He was 84 years old, the world's oldest first grader.
Kimani Maruge was born in Kenya in 1920. At that time, primary education in Kenya was not free, and Maruge's family didn't have enough money to pay for school. When Maruge grew up, he worked hard as a farmer. In the 1950s, he fought with other Kenyans against the British colonists. After years of fighting, Kenya became independent in 1963. In 2003, the Kenyan government began offering free primary education to everyone, and Maruge wanted an education, too.
However, it wasn't always easy for Maruge to attend school. Many of the first graders' parents didn't want an old man in their children's class. School officials said that a primary education was only for children. But the school principal, Jane Obinchu, believed Maruge was right. With her help, he was able to stay in school.
Maruge was a motivated and successful student. In fact, he was one of the top five students in his first grade class. In second grade, Maruge became the school's student leader. He went as far as seventh grade, the final year of primary school. Over the years, Maruge studied Swahili, English, and math. He wanted to use his education to read the Bible and to study veterinary medicine.
In 2008, there were problems in Kenya after an election. People were fighting and burning houses in Maruge's village. Maruge moved to a refugee camp for safety and lived in a tent. However, even during those difficult times he continued to go to school. Later that year, he moved to a home for the elderly. He continued going to school, and even taught other residents of the home to read and write.
In 2005, Maruge flew in a plane for the first time in his life. He traveled to New York City, where he gave a speech at the United Nations. He spoke about the importance of education and asked for help to educate the people of Kenya. Maruge also wanted to improve primary education for children in Africa.
⁵ The passage was printed in Vargo, M., & Blass, L. (2013). Pathways 1: Reading, writing, and critical thinking. Boston: National Geographic Learning.
Maruge died in 2009, at age 89. However, his story lives on. The 2010 movie The First Grader showed Maruge's amazing fight to get an education. Many older Kenyans decided to start school after seeing The First Grader. One of those people was 19-year-old Thoma Litei. Litei said, "I knew it was not too late. I wanted to read, and to know more language, so I came [to school] to learn. That is why it is important for his story to be known."
1. Based on the passage, we can infer that before 2003, primary education in Kenya was:
a. Not cheap
b. Not available
c. Prohibited
d. Free
2. Why was Maruge motivated to study?
a. To be one of the top five students in his class.
b. To use his education to read the Bible.
c. To become the school's student leader.
d. To study Swahili, English, and math.
3. Who did NOT want Maruge to be in school?
a. Kenyan government
b. First graders' parents
c. Jane Obinchu
d. None of the above
4. The main idea in paragraph (E) is:
a. People were fighting and burning houses in the village.
b. It was too difficult to live in a tent at a refugee camp.
c. Maruge did not stop studying, even during those difficult times.
d. Maruge taught other residents of the home to read and write.
5. The main idea in paragraph (G) is:
a. Maruge was an inspiration to other adult Kenyans.
b. Kenyans enjoyed the movie The First Grader.
c. Thoma Litei decided to go to school to learn.
d. The First Grader was created after Maruge's death.
Passage 2: The following extract was taken from the article "Does Your Language Shape How You Think?" published in the New York Times magazine.⁶ Read the passage below and answer the multiple-choice questions following the passage. Circle the letter of the best answer.
Benjamin Lee Whorf's theory crash-landed on hard facts and solid common sense, when it transpired¹ that there had never actually been any evidence to support his fantastic claims. The reaction was so severe that for decades, any attempts to explore the influence of the mother tongue on our thoughts were relegated² to the loony³ fringes⁴ of disrepute⁵. But 70 years on, it is surely time to put the trauma of Whorf behind us. And in the last few years, new research has revealed that when we learn our mother tongue, we do after all acquire certain habits of thought that shape our experience in significant and often surprising ways.
Vocabulary word-bank:
1. transpire: occur, happen
2. relegate: assign, transfer
3. loony: crazy
4. fringe: border, trimming
5. disrepute: dishonor
1. The author's attitude to Whorf's theory is
a. Ambivalent
b. Neutral
c. Supportive
d. Contemptuous
2. The word "trauma" in the passage is closest in meaning to
a. Physical injury
b. Torture
c. Emergency
d. Agony
3. All of the following can be inferred from the text EXCEPT
a. Learning our mother tongue can lead to positive experiences.
b. The influence of mother tongue on our thoughts is significant.
c. Whorf's theory was based on hard facts and solid common sense.
d. Whorf failed to provide any evidence to support his theory.
Turn to the next page
⁶ Deutscher, G. (2010, August 26). Does your language shape how you think? The New York Times. Retrieved from http://www.nytimes.com/2010/08/29/magazine/29language-t.html?pagewanted=all
4. The author uses the word "crash-landed" to imply that Whorf's theory was _________ hard facts and solid common sense.
a. in favor of
b. based on
c. inconsistent with
d. critical of
5. Which of the sentences below best expresses the essential information in the boldfaced sentence in the passage?
a. Exploring the relationship between the mother tongue and our thoughts was frowned upon for decades.
b. People reacted severely and they explored the relationship between the mother tongue and our thoughts.
c. Whorf's theory succeeded in exploring the relationship between the mother tongue and our thoughts.
d. Whorf's claims were so credible that no researcher made an attempt to dishonor Whorf for decades.
IV. Mini-essay writing
Write a mini-essay of about 180-250 words according to the following prompt. You will be tested on the following criteria: content, organization, and grammar. Feel free to use the back page for more space.
Do you think learning English is important? Why or why not? Please provide personal examples to support your stance (in addition, you may refer to what you have learned from the video "English Mania").
END OF SECTION IV SUBMIT YOUR TEST AND SCHEDULE AN ORAL INTERVIEW
NAME:_________________
V. Oral Interview:
In the United States, many universities require students to learn an additional language other than their native language. Do you think universities in your home country should require students to learn an additional language (other than your native language)? Why or why not?
You have 2 minutes to prepare. Use the space below to write down an outline or important points that you want to discuss. You will be given a maximum of 3 minutes to answer the question. You can use your notes while you talk, but do not read aloud what you have written. Please relate the issue to your personal experience and cultural background.
Appendix F: Getting Started Worksheet (Alderson, Clapham, & Wall, 1995)
Worksheet for Getting Started on Your Original Test
1. What is the purpose of the test? (How will the information you gather be used? Are you measuring achievement or progress? Are you placing students in a program?)
This test serves as an entrance (placement) test for incoming international students (normally 10-12 students per semester). These international students will either attend college for all four years (we'll call them "regular" students) or just for one year as exchange students. All international students have to take this test; it will determine whether they can take the general English core requirement (e.g., ENG 101). Students who do not pass this test are required to take the ESL courses: Reading & Writing and/or Speaking & Listening. There is only one ESL level. Students will be required to take either one of the ESL courses or both.
2. What sort of learners will be taking the test? (Describe the 2LLs' age, first language[s], purpose for learning the target language, etc.)
All international students who are admitted to Dordt College will be required to take this test. As mentioned previously, these international students could be either regular or exchange students. They come from many different countries and have diverse L1s. However, based on our interview with the ESL professor, most of the students are from South Korea; there are some students from Turkey, Mexico, and various African countries. Their ages range from 18 to 25 years old.
3. What language skills should be tested (reading, writing, speaking and/or listening)?
We will be testing all four skills. Since there are not very many international students (around 10-12 per semester) and only one ESL level, we decided to revise the current placement/entrance exam. Dordt College is Hala Sun's alma mater.
4. What language elements should be tested (grammar, vocabulary, pronunciation, speech acts, etc.)?
I. Listening comprehension: Content, Comprehension
II. Grammar: Grammar
III. Reading comprehension: Vocabulary, Comprehension, Grammar
IV. Mini-essay writing: Content, Organization, Grammar
V. Oral interview: Content, Fluency and Accuracy (Grammar, Pronunciation, Coherence, and Fluency)
5. What target language situation is envisaged for the test, and is this to be simulated in some way in the test content and method? (For instance, is this a test of academic French? Of English for international TAs? Of Japanese for hotel workers in California?)
English for Academic Purposes in college
6. What text types should be chosen as stimulus material: written and/or spoken?
I. Listening comprehension: 1 approximately four-minute video/audio clip (a speech about the world's English mania, taken from a TED Talk)
II. Grammar: 1 written text with grammatical errors
III. Reading comprehension: 2 written texts (academic in nature)
IV. Mini-essay writing: 1 written essay question
V. Oral interview: The administrator will give students a role-play scenario, for which students will have two minutes to prepare their speech and three minutes to deliver it orally.
7. What sort of tasks are required: discrete point, integrative, simulated authentic, objectively assessable? (That is, what will the test-takers actually do?)
I. Listening comprehension: Students will watch and listen to a video clip (a speech from a TED Talk). Students can take notes while watching/listening to the video clip. Students will then answer 5 short-answer questions (they can answer the questions while they are watching/listening to the video).
II. Grammar: This is a cloze-elide test. Students have to read a text and cross out 15 extra words that make the sentences grammatically incorrect. This requires students' editing skills as well as their knowledge of grammar. It is also objectively assessable, since there are exact answers (the words that need to be crossed out). Test scorers will count only the correctly crossed-out words; students do not lose points for incorrect crossings (students will not be told about this aspect of the scoring method, so that they do not simply cross out as many words as they can).
III. Reading comprehension: This test is objectively assessable (multiple-choice questions). After reading two passages, students are required to answer the MC questions by choosing the best answer (questions will cover comprehension, vocabulary, and grammar).
IV. Mini-essay writing: For this test, students have to read the prompt (subject/question/topic of the essay) and write a hand-written essay of at least 180 and at most 250 words.
V. Oral interview: This is an integrative test examining the use of language elements (Grammar, Vocabulary, Fluency, Comprehension, and Pronunciation). The test-takers will have two minutes to prepare their speech and three minutes to deliver it orally. An analytic scale will be used to assess these language elements.
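The scoring rule described for the cloze-elide grammar subtest (count only the correctly crossed-out words, with no penalty for wrong crossings) can be sketched as follows. This is a minimal illustration, not the designers' actual procedure: the function name `score_cloze_elide`, the use of 0-based word positions, and the reading of "is" as the extra word in the example sentence are all assumptions made for the sketch.

```python
def score_cloze_elide(crossed_out, answer_key):
    """Return the number of correctly crossed-out words.

    Both arguments are collections of word positions (an assumed
    representation). Wrong crossings are ignored, so they cost nothing.
    """
    return len(set(crossed_out) & set(answer_key))


# The worksheet's example sentence, split into words (0-based positions).
sentence = "The boys is are singing the national anthem".split()

# Assumption: "is" (position 2) is the extra word in the example.
answer_key = {2}

# A hypothetical student crosses out "is" (correct) and "the" (wrong).
student_marks = {2, 5}

points = score_cloze_elide(student_marks, answer_key)
print(points)
```

Because wrong crossings are simply ignored, a student who crossed out every word would still earn full marks under this rule, which is why the worksheet keeps the rule hidden from test-takers.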
8. What test methods (what item formats) are to be used? (One multiple-choice subtest is required.)
I. Listening comprehension: 5 short-answer questions
II. Grammar: 1 cloze-elide test; crossing out extra (grammatically incorrect) words from 1 written text (15 crossed-out words in total)
III. Reading comprehension: two sets of 5 multiple-choice questions (total of 10 questions)
IV. Mini-essay writing: 1 written essay question
V. Oral interview: Responding to 1 question (given orally)
9. How many sections should the test have, how long should they be and how will they be differentiated? (There will be at least three sections, more if you are working with another student.)
I. Listening comprehension: about 10 minutes
II. Grammar: 5 minutes to read and cross out extra/grammatically incorrect words
III. Reading comprehension: 20 minutes
IV. Mini-essay writing: 20 minutes to answer 1 essay question
V. Oral interview: about 5 minutes
10. How many items are required for each section? What is the relative weight for each item?
I. Listening comprehension: 5 short-answer questions (2 pts each; 10 pts max)
II. Grammar: 1 written text with 15 extra/grammatically incorrect words (15 items; 1 pt each; 15 pts max)
III. Reading comprehension: 10 questions (two sections of 5 questions; 1 pt each; 10 pts max)
IV. Mini-essay writing: 1 essay question (scored on multiple criteria; 100 pts max)
V. Oral interview: One 3-minute speech (40 pts max)
TOTAL Maximum Points: 175 pts
11. What rubrics are to be used as instructions for candidates? (That is, what instructions and guidance are printed in the test and/or announced by the test administrator?)
Instructions and guidance are printed in the test in English. For the oral interview, test administrators will read the instructions to the student. The student will then be given 2 minutes to prepare his/her speech in response to the prompt. Once the time is up, the test administrator notifies the student and gives him/her 3 minutes to respond. For listening comprehension, test administrators will play the audio/video file. Students can take notes and proceed to answer the short-answer questions as they listen to/watch the clip. The audio/video file will be played only once.
12. Which criteria will be used for assessment by markers? (In other words, describe how the answer key will be developed for the objectively scored portion, and explain the rating system for the subjectively scored portion.)
I. Listening comprehension: For this subjectively scored portion, the following criteria will be assessed: Content and Comprehension (understanding the main points).
Listening Comprehension Criteria
Spelling errors are allowed. Deduct 1 pt when the sentences are not complete, except for questions 2 and 3. Total possible points: 10 pts.
1. In your own words, define the word "mania."
*Acceptable words: enthusiasm, passion, desire, craze, popular trend generating wide enthusiasm, hysteria, craziness, alarming, deeply fascinated
2. How many people are trying to learn English worldwide?
*Answer: 2 (two) billion
3. Name at least 3 countries/regions that the speaker mentioned that are manias for English?
*Answer: Latin America, India, Southeast Asia, and China; 2 pts when three countries are mentioned; only 1 pt when two countries are correct (one country is incorrect); 0 pts when there is no answer or none of these countries is mentioned
4. According to the speaker, why are so many people trying to learn English?
*Answer: 2 pts (full credit): opportunity, for a better life, hope, language of problem solving, world's second language; 1 pt (acceptable words): job, pay for school, put better food on the table, academic achievement; 0 pts: no mention of any of these words
5. What is the speaker's opinion on English mania?
*Answer: "English mania is more positive than negative"; "English mania is positive"; "it is a turning point" = 2 pts ("the speaker's opinion is neutral" is also acceptable); no mention of "good" = 0 pts.
II. Grammar: Objectively scored. For item analysis, each extra word (choice a) together with the correct word next to it (choice b) will be treated as one item (e.g., Question 1). Students get one mark for each correct answer (15 total).
III. Reading comprehension: MC questions. Students get one mark for each correct answer (10 total). After piloting this test, we will carry out the following analyses: item discrimination, item facility, distractor analysis, and response frequency distribution. These analyses will enable us to find out more about the questions and the choices we wrote.
IV. Mini-essay writing: Subjectively scored. The following criteria will be assessed: Content, Organization, and Grammar. We will calculate the scores based on our essay criteria and categorize them into the following (analytic) scoring system. We will use inter-rater reliability to check the consistency of the scoring in this section.
Essay Criteria

Content (scoring: circle the appropriate score)
Clearly relates to or answers the given topic or question: Clear 5 4 3 2 1 0 Missing
Gives sufficient examples/references: Sufficient 5 4 3 2 1 0 Lacking
Clear connection between examples/references and main ideas: Clear 5 4 3 2 1 0 Missing
Correct use of vocabulary words: Correct 5 4 3 2 1 0 Incorrect
Sufficient number of words (180-250): Target #: 5; 160-179 words: 4; 140-159 words: 3; 120-139 words: 2; 100-119 words: 1; less than 100 words: 0
Subtotal: points for content _________/25

Organization (scoring: circle the appropriate score)
Topic or introductory sentence: Clear 5 4; Not Clear 3 2 1; Missing 0
Concluding sentence: Clear 5 4; Not Clear 3 2 1; Missing 0
Coherence (logical progression and development of ideas, good flow): Always 5 4; Sometimes 3 2 1; Never 0
Cohesion (good connections between sentences): Always 5 4; Sometimes 3 2 1; Never 0
Sentence variety (both simple and compound and/or complex): Good Variety 5 4; Some Variety 3 2 1; Never 0
Subtotal: points for organization _________/25

Grammar (scoring: take off one point for each error in the categories indicated; circle the # of remaining pts)
Correct spelling (subtract 1 pt for each new error): 5 4 3 2 1 0
Correct use of articles and prepositions: 5 4 3 2 1 0
Standard capitalization: 5 4 3 2 1 0
Standard punctuation (periods, commas, semicolons): 5 4 3 2 1 0
Standard sentence word order: 5 4 3 2 1 0
Agreement between subjects & verbs, nouns and pronouns/antecedents: 5 4 3 2 1 0
Correct verb tense and usage: 5 4 3 2 1 0
Correct adverb and adjective usage: 5 4 3 2 1 0
Appropriately placed phrasal modifiers: 5 4 3 2 1 0
Standard academic diction (avoidance of slang and informal language): 5 4 3 2 1 0
Subtotal: points for grammar _________/50

TOTAL POINTS _________/100
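The inter-rater check planned for the essay scores could be computed along the following lines. The worksheet does not say which statistic the designers used, so the Pearson correlation coefficient here is an assumption, and the two lists of essay totals (out of 100) are invented purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: raises ZeroDivisionError if a rater gave everyone the same score.
    return sxy / sqrt(sxx * syy)


# Hypothetical essay totals (out of 100) from two raters for five test-takers.
rater_a = [72, 85, 60, 91, 78]
rater_b = [70, 88, 65, 90, 75]

r = pearson_r(rater_a, rater_b)
print(round(r, 2))
```

A value close to 1 would suggest that the two raters rank the essays in much the same way. It would not show that they award the same number of points, so comparing the raters' mean scores is a useful companion check.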
V. Oral interview: For this subjectively scored portion, the following criteria will be assessed: Grammar, Vocabulary, Fluency, Comprehension, and Pronunciation. We will use an analytic scale, and we will use inter-rater reliability to check the consistency of the scoring in this section.

Oral Interview Criteria (scoring: circle the appropriate score)

Content
Clearly relates to or answers the given topic or question: Clear 5 4 3 2 1 0 Missing
Gives adequate and meaningful examples/references: Sufficient 5 4 3 2 1 0 Lacking
Clear connection between examples/references and main ideas: Clear 5 4 3 2 1 0 Missing
Correct use of vocabulary words: Correct 5 4 3 2 1 0 Incorrect

Accuracy
Correct use of grammar: Correct 5 4 3 2 1 0 Incorrect
Clear pronunciation of words: Clear 5 4; Not Clear 3 2 1; Missing 0

Fluency
Coherence (logical progression and development of ideas, good flow): Always 5 4; Sometimes 3 2 1; Never 0
Fluency in speech (with little circumlocution and few hesitations): Fluent 5 4; Somewhat Fluent 3 2; Not Fluent 1 0

TOTAL POINTS ______/40
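Two of the item analyses planned for the multiple-choice reading subtest, item facility and item discrimination, can be sketched as below. This is a hedged illustration rather than the designers' procedure: the function names, the top-third/bottom-third grouping (`fraction=1/3`), and the pilot data are assumptions. Item facility is taken as the proportion of test-takers answering an item correctly, and item discrimination as the facility in the high-scoring group minus the facility in the low-scoring group.

```python
def item_facility(item_scores):
    """Proportion of test-takers who answered the item correctly (0 to 1)."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(score_matrix, item, fraction=1 / 3):
    """Facility among top scorers minus facility among bottom scorers.

    score_matrix holds one row per test-taker and one 0/1 entry per item.
    The size of the comparison groups (fraction) is an assumed choice.
    """
    ranked = sorted(score_matrix, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper = [row[item] for row in ranked[:k]]
    lower = [row[item] for row in ranked[-k:]]
    return item_facility(upper) - item_facility(lower)


# Hypothetical pilot results: 6 test-takers, 4 items (1 = correct).
pilot = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]

facility_item_1 = item_facility([row[0] for row in pilot])
discrimination_item_1 = item_discrimination(pilot, 0)
discrimination_item_4 = item_discrimination(pilot, 3)
print(facility_item_1, discrimination_item_1, discrimination_item_4)
```

In this invented data set the fourth item has a negative discrimination value, meaning that low scorers answered it correctly more often than high scorers. An item behaving that way in a real pilot would be a candidate for revision.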
Appendix G
Wesche's (1983) Four Components Framework

Subtest: Listening
Stimulus Materials: The test-taker watches a video clip of "English Mania" presented by Jay Walker (2009). The test also contains five short-answer questions related to the content of the video.
Task Posed to the Learner: The test-taker must watch and listen to the video and identify important information.
Learner's Response: The test-taker must write down his/her responses to the questions.
Scoring Criteria*: Questions 2 and 3 (requiring a specific number and country names) are marked using the exact word method. The remaining questions are marked using the acceptable word method. Students are given either 2 points or 0 points. For Question 3, partial credit (1 pt) is given when at least two correct countries are mentioned.

Subtest: Grammar
Stimulus Materials: The test-taker reads an article from the New York Times (Bahanoo, 2012).
Task Posed to the Learner: The test-taker must identify 15 extra words inserted into sentences that make those sentences ungrammatical according to the structural rules of English; the test-taker must pay attention to the details of the reading to find multiple grammar errors, such as in the use of articles and tenses.
Learner's Response: The test-taker must cross out the extra words.
Scoring Criteria*: The test-taker gets points when he/she crosses out the exact incorrect words.

Subtest: Reading
Stimulus Materials: The test-taker reads 1 long passage and 1 short passage. The test contains 5 multiple-choice questions for each passage.
Task Posed to the Learner: The test-taker must identify the main ideas of the readings and define the meaning of the words within the given context.
Learner's Response: The test-taker must circle the letter representing the answer to a question.
Scoring Criteria*: The test-taker gets points when he/she circles the correct letters of the multiple-choice questions, as determined by the established key.

Subtest: Mini-essay Writing
Stimulus Materials: An essay prompt is presented to the test-taker.
Task Posed to the Learner: The test-taker must read and respond to the given prompt. He/She must compose an organized piece of writing with sufficient examples and correct use of vocabulary and grammar.
Learner's Response: The test-taker must write an essay of about 180-250 words that states, explains, and supports his/her opinion on the given prompt.
Scoring Criteria*: The test-taker's essay is subjectively scored based on an analytic rubric set by the test designers. The rubric consists of three sections: content, organization, and grammar.

Subtest: Oral Interview
Stimulus Materials: A role-play scenario is given to the test-taker.
Task Posed to the Learner: The test-taker must read the prompt, understand the context, and adopt the role given in the scenario.
Learner's Response: The test-taker must take 2 minutes to prepare a persuasive speech that states, explains, and supports his/her opinion on the given topic and deliver it within 3 minutes.
Scoring Criteria*: The test-taker's speech is subjectively scored based on an analytic rubric set by the test designers. The rubric evaluates two aspects of a speech: content, and fluency and accuracy.

*Note. The keys and rubrics of the scoring criteria were all pre-established by the test designers, although the rubric of the oral interview was modified in light of the students' responses in the pilot tests.
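The partial-credit rule for listening Question 3 can be expressed as a small scoring function. This is only a sketch under stated assumptions: the accepted answers come from the answer key in Appendix F, the function name `score_question_3` is hypothetical, and an answer naming exactly one correct country is assumed to earn 0 points, since the key does not cover that case explicitly. Misspelled answers, which the key allows, would still need a human rater because the sketch matches names exactly.

```python
# Accepted countries/regions, taken from the Question 3 answer key.
ACCEPTED = {"latin america", "india", "southeast asia", "china"}


def score_question_3(answers):
    """Return 2, 1, or 0 points for a list of country/region names."""
    # Normalize case and whitespace; duplicates count only once.
    correct = {a.strip().lower() for a in answers} & ACCEPTED
    if len(correct) >= 3:
        return 2
    if len(correct) == 2:
        return 1
    # Assumption: one correct country, or none, earns no credit.
    return 0


print(score_question_3(["China", "India", "Latin America"]))
print(score_question_3(["China", "Japan", "India"]))
print(score_question_3(["Japan"]))
```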
Appendix H
Swain's (1980) Four Principles of Communicative Language Test Development

Subtest: Listening
Start from somewhere: Our choice of this procedure is motivated by our intention to simulate an academic situation in which students are given a lecture.
Concentrate on content: Since the test-takers are international students, the topic of English learning is relevant to them, and the video also serves to activate test-takers' schemata.
Bias for best: The test-takers get visual support in addition to the audio input. They are also allowed to take notes while watching the video. Spelling errors are not marked in test-takers' responses to the comprehension questions.
Work for washback: The test-takers can: experience the situation of attending a real academic lecture; practice note-taking skills.

Subtest: Grammar
Start from somewhere: Citing Larsen-Freeman (1991, 1997), Brown (2010) defines grammatical knowledge as comprising grammatical forms, grammatical meanings, and pragmatic meanings.
Concentrate on content: Students can relate the content to their own experience in language learning.
Bias for best: The subtest assesses multiple grammar points, such as the use of articles, adjectives, and verb tense.
Work for washback: The test-takers can: learn to pay attention to the details of the reading passages; learn that meanings are associated with grammatical forms.

Subtest: Reading
Start from somewhere: The design of the subtest was driven by both the top-down processing and the bottom-up processing of reading comprehension (Richards & Schmidt, 2010).
Concentrate on content: Consistent with the content of the previous subtests, the two articles are also about language learning.
Bias for best: The definitions of some difficult vocabulary terms are given in the test. Key words and key sentences are either underlined or bolded for attention. Paragraphs are marked with alphabetic letters for convenience of reference.
Work for washback: The test-takers can: expand their vocabulary knowledge; learn to use context to interpret the meanings of words; identify the main ideas of the readings; paraphrase the reading.

Subtest: Mini-essay Writing
Start from somewhere: Through the essay writing task, we are able to identify students' strengths and weaknesses, including grammar usage and vocabulary knowledge.
Concentrate on content: The essay prompt, whether learning English is important or not, has been developed through the previous subtests.
Bias for best: The test-takers can use the materials provided on the test to support their opinions.
Work for washback: The test-takers can: write in a simulated academic context; compose an argumentative essay; incorporate sufficient sources into the writing.

Subtest: Oral Interview
Start from somewhere: Besides our interest in using a direct test to measure the test-takers' oral competence, the construct of the oral test was also inspired by the frequent situations in academic settings where students are required to express their opinions orally and support them with examples.
Concentrate on content: The content is related to the theme of the test, language learning.
Bias for best: The test-takers can use the materials provided on the test to support their opinions. The test-takers have 2 minutes to prepare and jot down some notes for their speech.
Work for washback: The test-takers can: experience a simulated academic presentation; give a persuasive speech.

