June 2010
Special Issue on Technology and Learning Vocabulary
ARTICLES
COLUMNS
Emerging Technologies
From Memory Palaces to Spacing Algorithms:
Approaches to Second-Language Vocabulary Learning
Robert Godwin-Jones
pp. 4–11
Announcements
News from Sponsoring Organizations
pp. 12–16
REVIEWS
Edited by Sigrun Biesenbach-Lucas
Learning Language and Culture via Public Internet
Discussion Forum
Barbara Hanna & Juliana de Nooy
Reviewed by Sonja Lind
pp. 17–21
Information Technology in Languages for Specific
Purposes: Issues and Prospects
Elisabet Arnó Macià, Antonia Soler Cervera, & Carmen
Rueda Ramos
Reviewed by Ishaaq Akbarian
pp. 22–25
Language Learning & Technology is a refereed journal which began publication in July 1997. The journal
seeks to disseminate research to foreign and second language educators in the US and around the world
on issues related to technology and language education.
Language Learning & Technology is sponsored and funded by the University of Hawai'i National
Foreign Language Resource Center (NFLRC) and the Michigan State University Center for
Language Education And Research (CLEAR), and is co-sponsored by the Center for Applied
Linguistics (CAL).
Language Learning & Technology is a fully refereed journal with an editorial board of scholars in
the fields of second language acquisition and computer-assisted language learning. The focus of
the publication is not technology per se, but rather issues related to language learning and
language teaching, and how they are affected or enhanced by the use of technologies.
Language Learning & Technology is published exclusively on the World Wide Web. In this way,
the journal seeks to (a) reach a broad audience in a timely manner, (b) provide a multimedia
format which can more fully illustrate the technologies under discussion, and (c) provide
hypermedia links to related background information.
Beginning with Volume 7, Number 1, Language Learning & Technology is indexed in the
exclusive Institute for Scientific Information's (ISI) Social Sciences Citation Index (SSCI), ISI
Alerting Services, Social Scisearch, and Current Contents/Social and Behavioral Sciences.
Language Learning & Technology is currently published three times per year (February, June,
and October).
CO-SPONSOR
Center for Applied Linguistics (CAL)
Richard Schmidt
Editorial Board
Sigrun Biesenbach-Lucas, Georgetown University
Klaus Brandl, University of Washington
Thierry Chanier, Université Blaise Pascal
Tracey Derwing, University of Alberta
Robert Godwin-Jones, Virginia Commonwealth University
Lucinda Hart-González, Second Language Testing, Inc.
Regine Hampel, The Open University
Philip Hubbard, Stanford University
Claire Kennedy, Griffith University, Brisbane
Markus Kötter, University of Münster
Marie-Noelle Lamy, The Open University
Lina Lee, University of New Hampshire
Meei-Ling Liaw, National Taichung University
Lara Lomicka, University of South Carolina
Jill Pellettieri, Santa Clara University
Bryan Smith, Arizona State University
Patrick Snellings, University of Amsterdam
Maggie Sokolik, University of California Berkeley
Susana Sotillo, Montclair State University
Paige Ware, Southern Methodist University
Mark Warschauer, University of California, Irvine
Editorial Staff
Editors
Dorothy Chun
Irene Thompson
Trude Heift
Carla Meskill
Managing Editor
Matthew Prior
Carol Wilson-Duffy
Sigrun Biesenbach-Lucas
Georgetown University
Robert Godwin-Jones
Copy Editors
Dennis Koyama
Daniel Jackson
Associate Editors
note of revision. External links will be validated at the time of publication. Broken links will be fixed at
the author's request.
Articles and Commentaries
1. LLT publishes articles that report on original research or present an original framework that links
second language acquisition theory, previous research, and language learning and teaching practices
that utilize technology. Articles containing only descriptions of software, pedagogical procedures, or
those presenting results of surveys without providing empirical data on actual language learning outcomes
will not be considered.
2. General guidelines are available for reporting on both quantitative and qualitative research
(http://llt.msu.edu/resguide.html).
3. Articles should be no more than 8,500 words in length, including references and a 200-word abstract.
Appendices should be limited to 1,500 words. Lengthy appendices should be included as hyperlinks and
sent as separate files in .html or .pdf format.
4. Commentaries are short articles, typically 2,000-3,000 words, that discuss material either previously
published in LLT or otherwise offering interesting opinions on issues related to language learning and
technology.
5. Titles should not exceed 10 words and should be adequately descriptive of the content of the article.
6. All articles and commentaries go through a two-step review process:
Step 1: Internal Review. The editors first review each manuscript to see if it meets the basic
requirements (i.e., that it reports on original research or presents an original framework linking
previous research, second language acquisition theory, and teaching practices), and that it is of
sufficient quality to merit external review. Manuscripts that do not meet these requirements and
are principally descriptions of classroom practices or software are not sent out for further review.
The internal review generally takes 1-2 weeks. Following the internal review, authors are notified
of the results.
Step 2: External Review. Submissions which meet the basic requirements are then sent out for
blind peer review by 3 experts in the field. The external review takes approximately 2-3 months.
Following the external review, the authors are sent copies of the external reviewers' comments
and are notified as to the decision (accept as is, accept pending changes, revise and resubmit, or
reject).
Reviews
1. LLT publishes reviews of professional books and software related to the use of technology in language
learning, teaching, and testing.
2. LLT does not accept unsolicited reviews. Contact Review Editor Sigrun Biesenbach-Lucas (lltreviews@hawaii.edu) if you are interested in having material reviewed or in serving as a reviewer. Send
materials you wish to be reviewed to:
Sigrun Biesenbach-Lucas
2133 Comus Court
Ashburn, VA 20147
3. Reviews should provide a constructive critique of the book/software and include references to theory
and research in second language acquisition, computer-assisted language learning, pedagogy, or other
relevant disciplines. They should also include specific ideas for classroom implementation and
suggestions for additional research.
4. Reviews should be limited to 2,000 words. Reviewers are encouraged to incorporate images (e.g.,
screen shots or book covers) and hypermedia links that provide additional information.
5. The following information should be included in a table at the beginning of the review:
Books
Software
Author(s)
Title
Series (if applicable)
Publisher
City and country
Year of publication
Number of pages
Price
ISBN
We have an important reminder for our authors and reviewers. All articles, columns, and
reviews should now be submitted online through ScholarOne Manuscripts. Full
instructions and support are available at http://mc.manuscriptcentral.com/llt and a user
ID and password can be obtained by authors and reviewers on their first visit. We thank
you for your patience during the transition period.
If you are not already a subscriber, please take a few minutes to fill out our free
subscription form. This enables us to compile useful statistics about the readership of
our journal.
We wish you a restful summer and look forward to receiving contributions from all over
the world and especially those dealing with L2s other than English.
Sincerely,
Irene Thompson and Dorothy Chun
Editors-in-Chief
EMERGING TECHNOLOGIES
FROM MEMORY PALACES TO SPACING ALGORITHMS:
APPROACHES TO SECOND-LANGUAGE VOCABULARY LEARNING
Robert Godwin-Jones
Virginia Commonwealth University
An essential element of language learning is building one's personal store of words and expressions, a
necessary component of improving competency in all areas of communication. In classroom settings,
introductory textbooks provide a controlled set of new vocabulary items unit by unit, usually introduced
first in dialogs and then gathered together in a list at the end of the chapter. While teachers urge students
to learn new words in context, through dialogs or reading, many ignore the advice and focus on learning
from lists, one column in the target language (L2), the other in the student's native language (L1). Flash
cards continue to be a popular method of working with vocabulary, typically with the L2 and L1 on
opposite sides. Today electronic texts and dedicated software programs provide considerable help in
increasing vocabulary through reading (or listening) and in learning a targeted set of words or
expressions. Glosses in L1 or L2 are often available, as may be multimedia annotations in the form of
graphics, audio, or even video clips. Links to electronic dictionaries or to other on-line resources are also
common. There has been considerable experimentation on the part of CALL (computer-assisted language
learning) practitioners with different kinds and combinations of glosses and other comprehension and
vocabulary study aids. While many studies have analyzed the results of these projects, there has been less
attention paid to programs which are dedicated, not to incidental vocabulary learning through reading, but
to the intentional study of sets of vocabulary. This is an area of SLA (second language acquisition) that
has been out of fashion, as the sense of rote learning it evokes is out of step with the model of
communicative language learning, fostering as it does the idea of language as something mechanical and
fixed. Yet, in this area, some interesting and innovative work is being done which goes well beyond
simplistic electronic flash cards. In fact, integrating sophisticated tools for dedicated vocabulary study
into on-line environments for language learning offers an important resource for serious, long-term
language learning.
VOCABULARY STUDY THROUGH READING
Language learners at the novice and elementary levels typically work with specific lists of new
vocabulary. As learners advance in their language study, they tend to learn new words not from formal
introduction in textbooks but through reading or listening, deciphering unknown expressions through their
contextual use, root meaning, structure, or similarity to known items, or by simply looking them up in a
reference work. Of course, part of this process may include keeping one's own list of new words, but
principally one learns by seeing or hearing words repeatedly, used in different ways, and gradually
acquiring their meaning. Learning vocabulary in this way, through context, makes it much more likely
that the learner will understand a word's correct usage than does learning an item from a list, or
from its appearance in a single (inauthentic) dialog. Seeing the new item in actual use also provides more
information on variations it may undergo, such as stem changes, inflections, or affixes, all important
aspects of being able to actually use a recently acquired item in real communication.
For language learners of my generation, there were few options other than a dictionary to discover the
meaning of a new word whose meaning couldn't be guessed through contextual clues. I can recall well
the experience of reading my first full-length novel in German, The Magic Mountain, by Thomas Mann, a
wonderful novel, but a foolish choice for an intermediate-level German student. My list of new words was
long and kept on note cards in the order they appeared in the novel. The volume of unknown words made
the cards unwieldy to survey and inefficient to search, leading to the frustration of repeatedly looking up the same
Copyright 2010, ISSN 1094-3501
word and adding redundant items to my cards. Of course, one could argue that the multiple look-up and
recording activity increased my time on task and the likelihood the word would be retained, but it was
only my stubbornness that prevented me from giving up a few chapters into the first volume of the novel.
No sensible language teacher today would encourage such a reading choice, ignoring the findings of SLA
research that provide guidance on the kinds of text to choose and the percentage of assumed new
vocabulary best suited to learners at different levels (not above 5%; see Dodigovic, 2005). Also, today the
text to be used is likely to be available in an electronic format, as digitized by the teacher, found in a
collection of e-texts, supplied by a textbook publisher, or discovered on the Web. This allows for look-up
in one of many on-line electronic dictionaries, some of which can be configured to look up automatically
an item through an action such as double clicking or mouse hovering. For Japanese texts, for example, an
extension to the Firefox browser, rikaichan, provides pop-up definitions; Maryamsoft does the same for
Persian. Additionally, tools for automatic glosses of texts are available, including Gymn@zilla and
the Berkeley Interlinear Text Collector (BITC). Looked-up items can be put into an electronic list
alphabetically arranged or even into a database, for future look-up or study. Of course, if an edition has
been especially prepared for language learning, even more help is likely to be available, including a
dedicated glossary, notes, grammar explanations, and cultural information. There may be a dedicated
vocabulary tool as well, which not only keeps lists of new words but also might offer exercises or games
for working with and retaining the new vocabulary. Lextutor (also known as the Compleat Lexical Tutor)
features a powerful array of tools for working with vocabulary from texts.
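The 5% guideline mentioned above can be checked mechanically before assigning a text. The following is a minimal sketch in Python, not the method used by Dodigovic or by any of the tools named here; the function name, the sample sentence, and the tiny known-word set are all hypothetical, and a real check would need lemmatization so that inflected forms count as known.

```python
import re

def unknown_word_ratio(text, known_words):
    """Return the share of running words in `text` that are not in `known_words`."""
    # Runs of letters only; digits, underscores, and punctuation are skipped.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    known = {word.lower() for word in known_words}
    unknown = sum(1 for token in tokens if token not in known)
    return unknown / len(tokens)

# Hypothetical learner vocabulary and sample text.
known = {"the", "cat", "sat", "on", "a", "mat"}
sample = "The cat sat on the mat. A cat sat on a sofa."

ratio = unknown_word_ratio(sample, known)
suitable = ratio <= 0.05  # threshold taken from the guideline cited above
print(f"{ratio:.1%} unknown; suitable: {suitable}")
```

Here one running word in twelve ("sofa") is unknown, so this sample would exceed the threshold.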
On the other hand, one may elect to go in the other direction, starting with a particular set of vocabulary
and finding a text that illustrates its use. This can be done by using a customized word list (for example,
from items included in given chapters of a textbook) or a standard list such as the Academic Word
List for English or the HSK character set (Hànyǔ Shuǐpíng Kǎoshì, Chinese Proficiency Test) for
Mandarin. One could then do a search based on that vocabulary set, or a subset, to find appropriate texts
in language corpora, or on the Web. Using a language corpus has the added benefit of the availability of
an associated tool such as a concordance, to provide helpful resources, such as a keyword in context
listing. Lextutor and Antconc provide useful tools for working with language corpora for vocabulary
study. One recent study describes using word frequency lists to generate appropriate readings in English
for Chinese students from a corpus generated by an on-line journal (Huang, 2007).
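To make the idea of a keyword-in-context listing concrete, here is a rough sketch in Python. It is not how Lextutor or Antconc work internally; the function name, the context width, and the sample text are assumptions chosen for illustration, and matching is on exact word forms only.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence of `keyword`."""
    # Words are runs of letters, optionally joined by an apostrophe or hyphen.
    tokens = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Hypothetical two-sentence "corpus".
corpus = "The bank raised rates. She sat on the river bank all day."
for left, word, right in kwic(corpus, "bank", width=2):
    print(f"{left:>12} [{word}] {right}")
```

Lining up the occurrences in this way lets a learner see at a glance that the same form is used in two different senses.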
CALL literature of the past decade is rich in studies of incidental vocabulary learning and the efficacy of
various approaches to glossing. Some point to the desirability of dual-coding (i.e., providing both a
translation/definition and an image; Sadoski, 2005). Others show advantages of multimedia glosses (Xu,
2010). On the other hand, some results of extensive glossing have been puzzling (Yoshii & Flaitz, 2002,
p. 49) or discouraging, and even troubling (Chun, 2007, p. 242). Studies highlight how dependent the
usefulness of glosses is on individual learners and the context in which they are working (Ma & Kelly,
2006); factors such as motivation, learning style, and cultural background all may play a significant role.
Because there does not seem to be a consensus on the best universal approach to providing
comprehension aids to texts, it may be advisable to provide a variety of options to the learner and allow
for some degree of personal choice and customization.
MEMORY PALACES AND MNEMONIC ELABORATION
Some studies have shown that collaborative or peer work with vocabulary building can be beneficial
(Horst, Cobb, & Nicolae, 2005; Jones, 2006). One approach which moves in the opposite direction calls
for individual learners to create a highly individualized system for vocabulary retention. This involves
linking in one's mind new vocabulary to something concrete and familiar to the learner, such as items in a
room in one's home. The familiar locale provides a memory hook which can be used to retrieve linked
items by perusing mentally the trajectory through that physical space. This is, in fact, the classical ars
memoriae or method of loci evoked by Cicero in De oratore, used by classical and medieval scholars to
remember speeches and for aid in recalling all kinds of systemizable knowledge. It was famously used by
the Jesuit Matteo Ricci in 16th-century China to help prepare candidates in learning language and culture
for the all-important imperial exam. In our day, it is known to be used by winners of memory contests.
The technique goes by a variety of names including the Roman Room, the Peg System, and the Nook And
Cranny method; also used is the term journeys, as trips through familiar scenes can also be used as a
pegging mechanism. One language teacher has devised a Movie Method, in which films play this same
role, in this case, for helping students remember Japanese vocabulary. An interesting electronic
implementation of the concept of the Memory Palace is the Skill Builder, a vocabulary tool in the Tactical
Iraqi language learning program, developed for the U.S. military. The program makes use of external
structures in an Iraqi environment to introduce and systematize new material. This kind of embodied
language learning seems worth future exploration, perhaps in the context of Second Life or other virtual
environments.
The idea of associating a linguistic term to a concrete image is related to the concept of dual-coding, just
used here in a more elaborate system. Associating a series of images or sequence of events to help in
learning vocabulary seems to be particularly useful for languages such as Chinese, which by the nature of
its Hànzì characters seems to invite such a technique. Remembering the Kanji by James Heisig provides
a well-known example for Japanese. His method involves breaking down characters into constituent parts
(primitives) and assigning a meaning to each, derived from real or imagined etymology. To remember a
character, a fanciful story is provided which includes the primitives and their meanings. Heisig's method
has a lot of believers as well as a number of detractors, but it has inspired creation of a host of software
tools geared to using it for vocabulary study, including KanjiCan (features additional character stories)
and KanjiGym (includes stroke order animations). The popular on-line Reviewing the Kanji also is built
around the Heisig method. Studies have shown (Kuo & Hooper, 2004) that having students create their
own stories or mnemonic elaborations (as Heisig encourages his readers to do) for remembering
vocabulary can be particularly effective. Reviewing the Kanji encourages users to do that and then to
share on-line their character stories with others. Some Japanese learners have taken a further step, by
linking stories together in so-called kanji chains. Idiomteach is a program using a similar system of
etymological elaboration to help students learn English idiomatic expressions (Boers, Demecheleer, &
Eyckmans, 2004). Studies have demonstrated that the more effort a learner puts into figuring out a
meaning and how to retain it (depth of processing), the more likely it is to be remembered (Loucky,
2006). From that perspective, any image, story, word association, or other mnemonic aid that a
student actively generates is all to the good.
The keyword method, dating back to the 1970s, also uses graphics as a memory aid. It involves a two-step
process: first finding an L1 word which sounds similar to the pronunciation of the L2 item to be
learned, then associating an image or story with both the meaning of the word and the keyword used to
approximate its pronunciation. There has been considerable research on the keyword technique (summary
in Nation, 2001, p. 311), with some doubts raised over how well it works for longer-term retention. It has
fallen out of favor recently, but there are software programs such as Linkword which encourage its use.
Using the L1 in this way may seem counter-intuitive, but in fact there have been projects with promising
results that take advantage of code-mixing for vocabulary study (Celik, 2003).
Another technique using graphic representation as a memory aid is concept mapping. Several studies,
such as that by Bahr and Dansereau (2001), discuss its use in language learning. An intriguing example of
a free but powerful program for creating concept maps is VUE (Visual Understanding Environment),
from Tufts University, a multi-platform tool which incorporates sophisticated search with a flexible
graphic environment for linking items. Using VUE, one can not only link items visually to one another on
a Web page in various ways, but one can also link to information on the Web. It's easy to imagine how
such a program could be used in language learning, to create maps of related vocabulary items or in
techniques such as kanji chaining. VUE allows different kinds of visual linking, so one could conceivably
use different connectors for different kinds of word associations, such as collocations, meaning families,
synonyms, etc. A similar free concept mapping program is CmapTools, which features synchronous
shared map editing, a great feature for group work. The kind of focus here on lexical fields and word trees
would seem to be especially useful for more advanced students, who could use it for the study of
idiomatic expressions or even proverbs. The Collins Language Revolution system uses mind-mapping
combined with an audio program as its principal strategy for language learning.
SUPERMEMO AND SPACED REPETITION SOFTWARE
A popular blog, Kanjitown, explores an individual experience in learning Japanese. The author uses
extensively the memory techniques described in Remembering the Kanji and implements them through
the use of a software program called SuperMemo. SuperMemo is a Windows-only program created by
Piotr Wozniak (not related to Steve Wozniak, who co-founded Apple Computer). The first version of his
software was released in 1987, the most recent incarnation in 2006. The program, since expanded in
scope, was originally designed to help learn vocabulary on the basis of findings in cognitive psychology,
which go back to the 19th century (German psychologist Hermann Ebbinghaus's forgetting curve),
namely that there is a frequent pattern for how people learn (and forget). This pattern dictates a particular
rhythm for reviewing items to be learned until they are committed to long-term memory. Instead of
studying or testing one's knowledge of a set of items every day, it is better to study them one day, wait
perhaps 3 days to study them again, then wait another 7 days after that. In the 1970s, Sebastian Leitner
devised a 5-step process, using index cards in a learning box (Lernkartei, still used in Germany today),
a card file box divided into 5 compartments. Flash cards are moved from the initial compartment (daily
review) to the next if they are remembered; if not, they stay put. Each subsequent compartment has a
longer time lag before having its cards reviewed. Being able to remember a card in the final compartment
(reviewed only after a lengthy interval) allows it to exit the system, with the assumption that it is now stored
in permanent memory. A number of electronic flash card systems are based on the Leitner system such
as StudyProf (includes a Palm app), Teachmaster (open source, Windows only), and phase-6 (partners
with a number of German publishers). The MemoryLifter used in the Multimedia Learning Suite and
other educational software uses a version of the Leitner method called the box system.
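The box procedure just described is simple enough to sketch in a few lines of Python. This is an illustration only, not the code of any program named above. The review intervals for the five compartments are assumed values, since the description gives only the daily review of the first compartment, and the class and function names are hypothetical. The sketch follows the rule as stated here, in which a missed card stays in its compartment; other descriptions of the Leitner system send a missed card back to the first compartment.

```python
from dataclasses import dataclass

# Assumed review interval in days for each of the 5 compartments.
REVIEW_EVERY_DAYS = [1, 2, 4, 8, 16]

@dataclass
class Card:
    front: str
    back: str
    box: int = 0           # index of the current compartment
    due_day: int = 0       # day number on which the card is next reviewed
    retired: bool = False  # True once the card has left the system

def review(card, remembered, today):
    """Update a card after one review on day `today`."""
    if remembered:
        if card.box == len(REVIEW_EVERY_DAYS) - 1:
            card.retired = True  # remembered in the final compartment
            return card
        card.box += 1            # promote to the next compartment
    # A card that was not remembered stays in its current compartment.
    card.due_day = today + REVIEW_EVERY_DAYS[card.box]
    return card

def due_cards(cards, today):
    """Return the cards that should be reviewed on day `today`."""
    return [c for c in cards if not c.retired and c.due_day <= today]

card = Card("das Haus", "the house")
review(card, remembered=True, today=0)
print(card.box, card.due_day)
```

A daily session would call due_cards with the current day number and then review on each card returned.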
SuperMemo utilizes a similar approach, calculating when it is necessary to review an item just before it is
likely to be forgotten. The process is built around the user's actions on reviewing an item, choosing on a
scale of 0 to 5 how well or poorly the item was recalled. The system then schedules a review of that item
based on that score and on the scores of previous viewings. This is not a trivial system to devise, and is
far from being an exact science, as demonstrated by the eleven different spacing algorithms (the first, SM
0, for pen and paper, the latest, SM 11) Wozniak has developed for SuperMemo; a glance at the walkthrough of the various algorithms shows how complex they are. The program in the last few years has
been expanded beyond just vocabulary study, implementing what Wozniak calls incremental reading, a
way to systematically collect texts and schedule the reading on all kinds of subjects. It resembles in a way
the memex universal knowledge system imagined by Vannevar Bush in the 1930s, a system which
some credit as an inspiration for the hypertextual linking system of the World Wide Web. Others have
developed different algorithms for spaced learning, including the Low-First Method, developed by Rika
Mizuno. In contrast to Wozniak's work, there have been a number of studies on the effectiveness of
Mizuno's approach in language learning (Nakata, 2008).
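To give a sense of how a score on the 0 to 5 scale can drive the schedule, the following Python sketch follows the general shape of the early, publicly documented SM 2 algorithm. It is a simplified illustration and not the algorithm used in current versions of SuperMemo. The class and function names are hypothetical, and the first two intervals of 1 and 6 days come from published descriptions of SM 2 rather than from the 3-day and 7-day example given earlier. Implementations also differ on details such as whether the easiness value changes after a failed recall; here it is left unchanged.

```python
from dataclasses import dataclass

@dataclass
class ItemState:
    repetitions: int = 0   # consecutive successful recalls
    interval_days: int = 0 # days until the next review
    easiness: float = 2.5  # multiplier that stretches later intervals

def sm2_update(state, quality):
    """Return a new ItemState after a review scored `quality` (0 to 5)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Poor recall: start the item over, keeping its easiness.
        return ItemState(0, 1, state.easiness)
    # Good recall: adjust easiness (5 raises it, 4 keeps it, 3 lowers it).
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    easiness = max(1.3, state.easiness + delta)
    repetitions = state.repetitions + 1
    if repetitions == 1:
        interval = 1
    elif repetitions == 2:
        interval = 6
    else:
        interval = round(state.interval_days * easiness)
    return ItemState(repetitions, interval, easiness)

state = ItemState()
for score in (5, 5, 4):
    state = sm2_update(state, score)
    print(state.repetitions, state.interval_days, round(state.easiness, 2))
```

Each review returns a new state, so a program would store that state with the item and schedule the next review interval_days later.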
SuperMemo is a very powerful program and has many dedicated users. However, its user interface is
awkward, and it is notoriously difficult to customize. In recent years a number of competing spaced
repetition software (SRS) programs have emerged, with better interfaces, more features, and enhanced
flexibility. Mnemosyne is an open source program using the same 0-5 scale but is much simpler to use
than SuperMemo. It also, interestingly, gathers data (anonymously) from users, which is
used to try to improve the spacing algorithm (which is based on SM 2). Smart.fm (formerly iKnow) is an
on-line SRS system which features an API (application programming interface) for developers, allowing
it to be integrated easily into other systems or Web sites. It also features additional ways of working with
vocabulary beyond virtual flash cards, including dictations, games, and quizzes. Other SRS programs of
note include Pauker (includes a Java client for smart phones), FullRecall (touts a self-learning spacing
algorithm) and Surusu (specifically designed to work with Remembering the Kanji).
A program that seems to be attracting a lot of interest recently is Anki. Anki uses the concept of facts,
which is a basic vocabulary item and its definition/explanation/annotation. Fields such as graphics, links,
audio, and notes can be included. A card can be created in a variety of ways based on the fields entered
for that fact. For Chinese, for example, one could have six fields such as character, pinyin, English
equivalent, note/character story, picture, and pronunciation. Multiple cards could be generated from these
fields, one card given in English and asking for the character, then showing when flipped, the character,
plus other fields. Another card could show the character, yet another the pinyin, etc. The ability to easily
create multiple kinds and directions of cards makes Anki much more flexible than other SRS programs.
There are a large number of ready-made card sets, or decks in multiple languages which have been
created for use in Anki. The program can import decks from a variety of other programs. It also features a
plug-in system which allows for further expansion and customization.
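The relationship between a fact and the cards generated from it can be sketched as follows. This Python example is not Anki's code or its plug-in interface; the template class, the function, and the field names are hypothetical, with the fields modeled on the Chinese example above. A template names the field shown as the question and the fields revealed when the card is flipped.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CardTemplate:
    name: str
    question_field: str   # field shown on the front of the card
    answer_fields: tuple  # fields revealed when the card is flipped

def make_cards(fact, templates):
    """Generate one card per template whose question field is filled in."""
    cards = []
    for template in templates:
        question = fact.get(template.question_field)
        if not question:
            continue  # no card if the fact lacks the question field
        answer = {name: fact[name] for name in template.answer_fields if fact.get(name)}
        cards.append({"template": template.name, "question": question, "answer": answer})
    return cards

# One fact with six fields; the last three are left empty in this illustration.
fact = {
    "character": "水",
    "pinyin": "shuǐ",
    "english": "water",
    "note": "",
    "picture": "",
    "pronunciation": "",
}

TEMPLATES = [
    CardTemplate("english_to_character", "english", ("character", "pinyin", "note")),
    CardTemplate("character_to_english", "character", ("pinyin", "english", "note")),
    CardTemplate("pinyin_to_character", "pinyin", ("character", "english")),
]

cards = make_cards(fact, TEMPLATES)
for card in cards:
    print(card["template"], "|", card["question"], "|", card["answer"])
```

Because the cards are derived from the fact rather than typed in separately, correcting a field once changes every card that uses it.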
OUTLOOK
Anki synchronizes with a free on-line site. It also features a mobile version, with an iPhone app under
development. These features point to the current trends for electronic flash card programs, namely to
permit full Web and mobile access, with desktop syncing. There are also more and more Web based
flashcard systems that allow users to post their own lists to share with others. The most popular of these
sites, including the Flashcard Exchange and Quizlet, have accumulated thousands of sets in multiple
languages. These include lists from a number of standard beginning language textbooks. With the
growing popularity and availability of smarter, full-featured mobile phones, most of what has been
written in recent years about mobile assisted language learning (MALL) is fast becoming irrelevant.
Mobile applications are more and more expected to be as full-featured as desktop applications. An
example of the increasing sophistication of programs running on mobile devices is Flashcard Deluxe, a
powerful SRS program for the iPhone, which syncs with desktops. It offers Leitner style or spaced
repetition study and in-app browsing/download of decks from on-line sites such as Quizlet.
Programs like SuperMemo and Anki are designed for motivated individual learners. For such systems to
work the way they are designed, users must have the discipline to work with the program on a regular
basis, preferably daily. However, due to the spacing schedule such systems use, the amount of time spent
is reduced by working with only the items that are new or need to be reviewed before they are forgotten.
In that way, they are more efficient than a program like Rosetta Stone, which includes in its system
material familiar to the user, along with newer, more challenging items. The simple multiple choice items
and encouraging feedback in Rosetta Stone may make the learner feel good, but it would be interesting to
see a study of its efficacy compared to SRS systems. This is true as well for another well-known language
learning approach, the Pimsleur method, which uses its own kind of constant review (graduated interval
recall) in an audio-only environment. The Gradint program uses the Pimsleur approach. Another area for
investigation would be incorporation of SRS into instructed language learning, including how to
maximize its use with textbooks. One of the obvious integration possibilities is for publishers to provide
vocabulary lists or export options from proprietary flash card programs, so as to allow integration into
popular SRS programs. It would be welcome to see as well more experimentation along the lines of
Virtual Vocabulary (ViVo) from the University of Illinois-Chicago, an on-line environment for learning
German, which integrates SRS into a more comprehensive on-line language environment, featuring a
multimodal presentation with audio, image, sample sentences, writing with spell check, and
grammar/cultural information (Schuetze & Weimer-Stuckmann, 2010). The WUFUN project, for
teaching English to Chinese students, also features an interesting combination of mnemonic elaborations
with listening comprehension (Ma & Kelly, 2006).
Considerable work has been done in recent years developing frequency word lists in a variety of
languages. This kind of data can be especially useful for integrating into a SRS system, to bring novice
learners up to the point where they can begin incidental vocabulary acquisition through reading. SRS
programs may be particularly useful for languages significantly different from one's own, such as
Chinese or Japanese for speakers of Western European languages. Working with an SRS also probably
makes the most sense for learners without ready access to an immersive environment for the language(s)
they are studying. It may also be particularly useful for language maintenance, especially for those
studying multiple languages. However, while SRS can be thought of as an extended rehearsal for
language use, it is not close to simulating real language use, particularly when the SRS, as is usually the
case, does not provide contextual examples. SRS involves students taking responsibility for their
language learning. Tom Cobb (2005) advocates going further in that direction, encouraging language
learners to use linguists' tools, including language corpora, to provide appropriate materials for their
language study. This might involve using concordances to add examples in context to flashcards. With the
increasing interest in self-guided language study and language maintenance, it seems a no-brainer for
motivated learners to consider such an approach. It would be useful for CALL investigators to study
independent language study using different combinations of tools such as SRS, commercial programs like
Rosetta Stone, on-line services such as Livemocha, podcast sites such as ChinesePod, and social
networking sites. Needed too are studies analyzing longer-term retention, more than the 1-2 weeks
typically considered in vocabulary research projects as long-term recall. Finally, the efficacy of
dedicated vocabulary study for enhancing actual communicative competency needs to be addressed.
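The idea of using concordances to add examples in context to flashcards can be made concrete with a short sketch. The following Python fragment is illustrative only: the function names (kwic, flashcard_examples) and the three-sentence sample corpus are hypothetical, and a real project would draw on a full corpus and a proper tokenizer. It shows a keyword-in-context (KWIC) listing and a helper that picks whole sentences for the back of a card.

```python
import re


def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each occurrence of keyword."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left.strip(), m.group(0), right.strip()))
    return lines


def flashcard_examples(text, keyword, limit=2):
    """Pick up to `limit` whole sentences containing keyword, e.g. for the back of a card."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption, not a real segmenter).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)][:limit]


# Hypothetical miniature corpus, for illustration only.
corpus = (
    "The learner reviews each card at growing intervals. "
    "A card that is forgotten returns to the start of the queue. "
    "Every card carries one example sentence."
)

for left, word, right in kwic(corpus, "card"):
    print(f"{left:>30} [{word}] {right}")

print(flashcard_examples(corpus, "card"))
```

A learner could paste the selected sentences into the example field of an SRS card; how the card is then imported depends on the particular SRS program and is not shown here.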
REFERENCES
Bahr, G. S., & Dansereau, D. F. (2001). Bilingual knowledge maps (BiK-Maps) in second-language
vocabulary learning. The Journal of Experimental Education, 70(1), 5–24.
Boers, F., Demecheleer, M., & Eyckmans, J. (2004). In P. Bogaards & B. Laufer (Eds.), Vocabulary in a second
language: Selection, acquisition, and testing (pp. 53–78). Amsterdam: John Benjamins.
Celik, M. (2003). Teaching vocabulary through code-mixing. ELT Journal, 57(4), 361–369.
Chun, D. (2007). Come ride the wave: But where is it taking us? CALICO Journal, 24(2), 239–252.
Cobb, T. (2005). Foundations of linguistics – Approaches and concepts: Constructivism, applied
linguistics, and language education. Encyclopedia of language and linguistics (2nd ed.). Retrieved from
http://www.lextutor.ca/cv/constructivism_entry.htm
Dodigovic, M. (2005). Vocabulary profiling with electronic corpora: A case study in computer assisted
needs analysis. Computer Assisted Language Learning, 18(5), 443–455.
Horst, M., Cobb, T., & Nicolae, I. (2005). Expanding academic vocabulary with an interactive on-line
database. Language Learning & Technology, 9(2), 90–110. Retrieved from
http://llt.msu.edu/vol9num2/horst/default.html
Huang, H. (2007). Vocabulary learning in an automated graded reading program. Language Learning &
Technology, 11(3), 64–82. Retrieved from http://llt.msu.edu/vol11num3/huangliou/default.html
Jones, L. (2006). Effects of collaboration and multimedia annotations on vocabulary learning and
listening comprehension. CALICO Journal, 24(1), 33–58.
Kuo, M., & Hooper, S. (2004). The effects of visual and verbal coding mnemonics on learning Chinese
characters in computer-based instruction. Educational Technology Research and Development, 52(3), 23–38.
Loucky, J. (2007). Maximizing vocabulary development by systematically using a depth of lexical
processing taxonomy, CALL resources, and effective strategies. CALICO Journal, 23(2), 363–399.
Ma, Q., & Kelly, P. (2006). Computer assisted vocabulary learning: Design and evaluation. Computer
Assisted Language Learning, 19(1), 15–45.
Nakata, T. (2008). English vocabulary learning with word lists, word cards and computers: Implications
from cognitive psychology for optimal spaced learning. ReCALL, 20(1), 3–20.
Nation, P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.
Sadosky, M. (2005). A dual coding view of vocabulary learning. Reading & Writing Quarterly, 21(3),
221–238.
Schuetze, U., & Weimer-Stuckmann, G. (2010). Virtual vocabulary: Research and learning in lexical
processing. CALICO Journal, 27(3), 517–528.
Yoshi, M., & Flaitz, J. (2002). Second language incidental vocabulary retention: The effect of text and
picture annotation types. CALICO Journal, 20(1), 33–58.
Xu, J. (2010). Using multimedia vocabulary annotations in L2 reading and listening activities. CALICO
Journal, 27(2), 311–327.
RESOURCE LIST
Memory Palaces and Mnemonic Elaboration
Adventures in Kanji-Town: How does Kanji-Town work? How do I create my own Kanji-Town
All Japanese All The Time Dot Com: How to learn Japanese. On your own, having fun and to
fluency.
Chain Method
Context Maps
Dr. Gruneberg's Linkword Language Learning System - French, Spanish, German, Italian
KanjiCan
KanjiGym
Mind Mapping: A Wonderful Tool for Managing Vocabulary, Organizing Your Writing, and
Working With Your Tutor - Foreign Language Mastery
Flashcards Deluxe
Pauker
SuperMemo Algorithm
SuperMemo is useless!
The Theory Underlying Concept Maps and How to Construct and Use Them
CLEAR's Rich Internet Applications initiative has been underway for over three years. RIA is a
research and development lab where our programmers are working on free tools that language
teachers can use to create online language teaching materials, or have their students create
activities themselves! There are currently ten RIAs, including:
o NEW! QuizBreak! (gameshow template for creating Jeopardy-like activities)
o NEW! Scribbles (script teaching tool for non-Roman characters)
o Revisions (process writing and feedback tool)
o Broadcasts (create your own podcasts)
o Worksheets (add multimedia elements to online worksheets)
o Audio Dropboxes (put a dropbox in any web page; students' recordings get put into your
dropbox automatically)
Celebrating the World's Languages: A Guide to Creating a World Languages Day Event (guide)
This free publication provides a step-by-step guide to planning World Languages Day, a
university event for high school students designed to stimulate interest in learning languages and
to highlight the importance of cultural awareness.
Coming Soon!
CAL News
CAL News is our electronic newsletter created to provide periodic updates about our projects and
research as well as information about new publications, online resources, products, and services of
interest to our readers. Visit our web site to sign up.
Center for Research on the Educational Achievement and Teaching of English Language
Learners (CREATE)
Visit the CREATE web site to learn more about CREATE, its research, and free resources. To keep
current on CREATE activities sign up to receive its electronic newsletter, CREATE News.
Featured Publications:
Education for Adult English Language Learners in the United States: Trends, Research, and
Promising Practices
Foreign Language Teaching in U.S. Schools: Results of a National Survey
Realizing the Vision of Two-Way Immersion: Fostering Effective Programs and Classrooms
Refugees from Iraq (Expanded Refugee Backgrounder)
Using the SIOP Model: Professional Development Manual for Sheltered Instruction
What's Different About Teaching Reading to Students Learning English?
Visit CAL's Web site to learn more about our projects, resources, and services.
Sonja Lind
As a result, Hanna and de Nooy posit that language teachers should include discussion forums in their
curriculum, in order to help language learners develop intercultural communicative competence. They
claim that discussion forums can be taught as part of a genre-based approach to writing: "Not only may
cultural differences in, for example, letter writing be carried over into a new genre, but a quite different
genre altogether (casual conversation, for example) may emerge as the primary model for practice" (p.
37). Appropriate cultural behavior can be learned in forums through "explicit commentary [by
moderators] on the appropriateness of contributions, implicit commentary [by moderators or other
users] on [cultural] appropriateness, informal induction of newcomers to the forum by seasoned
contributors, comparisons made with other genres and situations, and, of course, instances of protest
or conflict [by posters who use the forum as a way to express their opinions]" (p. 8). However, not all
language learners receive feedback from teachers or other forum users that informs their cultural
positioning, and some may even receive negative responses from forum users. As a result, the authors
emphasize the need for students to be prepared for negative reactions prior to entering a discourse
community in which there are certain cultural expectations.
At this point, it is helpful to draw a parallel between Hanna and de Nooy's genre theory and Jim Gee's
model of primary and secondary discourses (Gee, 2001, p. 54). Primary discourses are native social
and linguistic habits of any person; in contrast, secondary discourses are acquired later, learned as part of
socialization into other social and cultural discourses beyond those of one's family; for example,
secondary discourses are learned through socialization in outside groups or institutions, such as schools,
churches, and businesses. As more layers of secondary discourses are added throughout a person's life,
the person's background knowledge becomes more culturally complex. Gee (2001) states that "[t]hese
secondary discourses all build on, and extend, the uses of language we acquired as part of our primary
discourse" (p. 541). Similarly, Hanna and de Nooy liken the process through which language learners are
gradually exposed to an unfamiliar genre to foreign language discussion forums: with the teacher's help,
learners can step further in and become confident and even culturally fluent in the secondary discourse
practices of a forum.
However, as Gee (2001) notes, the primary discourse often impacts the learning of a secondary discourse,
and teachers need to be aware of the cultural influences of the first language. According to Hanna and de
Nooy, just because a forum contributor uses colloquial language and writes in brief sentences in one
language does not mean that this is an appropriate way to contribute to a forum in another language. No
learner can assume that the Internet removes cultural differences (p. 20). The authors continue, "Online
behaviour, then, is linked to other culturally determined modes of behaviour, but not in predictable ways"
(p. 64). Furthermore, "What happens in one online context may not happen in another" (p. 39). In other
words, language learners need to be shown by their teachers how one genre, such as discussion forums,
can be read and written differently.
Learning Language and Culture via Public Internet Discussion Forums contains ten chapters and is
divided into two parts, the first part focusing on cultural differences between British and French online
newspapers, and the second part focusing on individual language learners and intercultural
communication styles. Both parts interweave research studies with theory and pedagogical implications.
All research methods are qualitative, and include ethnography, cultural studies, and discourse analysis of
online text. The most interesting chapters are those in which the authors describe their research on
French-language forums. In Chapter 3, for instance, the authors analyze the similarities and differences between
two French newspaper forums (Le Monde; Le Nouvel Observateur) and two British newspaper forums
(The Guardian; BBC). The data was gathered between 2000 and 2002. Hanna and de Nooy found that, on
the British sites, comments often dispersed into tangents; contributors typically did not debate with each
other, language was more conversational, contributors used an informal register, and comments were
brief. However, on the French sites (see Figure 1 for screenshot of Le Monde discussion board),
contributors remained strictly on topic even after pages of discussion, debating was robust, language was
more formal, contributors used a formal register, and comments were lengthy, up to 500 words each.
Another chapter, Chapter 6, focuses on four learners of French and their participation strategies in the online
newspaper forum Forums Le Monde in 1999 and 2000. Two of the students were British and two were
American, but all four were studying French. The two British students posted introductions in French
(Hello! My name is...) and asked for penpals, but their introductions were considered trite by other
users, and although they were mostly positively received, they also received a few negative responses. As
a result, the British students became discouraged and did not continue to participate in the discussions.
However, the two American students prefaced their comments by apologizing for their French language
skills. These two students also introduced themselves as foreigners. In sharp contrast to the two British
students, both Americans were welcomed by forum participants. Interestingly, one of the American
students, David, wrote mostly in English, and he too was welcomed as much as Laura, the other
American student, who wrote mostly in French. This study indicates that neither friendliness, an approach
the British students used, nor linguistic accuracy, a skill David lacked, indicated intercultural competence.
Cultural positioning was more important than either fluency or friendliness, and the authors imply that
teachers should emphasize content over accuracy in teaching writing.
third-year university French class. Students were asked to post at least five contributions to a French
newspaper forum over a four-month semester. However, before posting to the forums, the instruction was
scaffolded, and students familiarized themselves with the communicative conventions in the forums.
Hanna and de Nooy note, "It seems advisable therefore to preface any student involvement with an
investigation of what successful participation would mean for a particular forum" (p. 117). In this context,
successful participation meant that responses to student postings would be mostly positive, contrary to
the students' initial doubts. This finding implies that communicating in the target language may be
viewed positively even if the communication is not free of linguistic errors, as was the case with David,
the American student who mostly wrote in English. Again, the emphasis is not on fluency or errors, but
on how the students positioned themselves culturally.
However, Hanna and de Nooy also caution that while the teachers enjoyed teaching learners how to
contribute to the forums, students were not always enthusiastic about the forum assignments. One class
evaluation revealed that most students preferred oral discussions to online postings, and only five of the
32 students felt that "forum participation [was] the most valuable aspect of the course" (p. 177). Perhaps
this was because the students were required to write at least five 300-word posts on
the forums and received grades for their work even though they were not rewarded for accuracy, just for
participation. Unfortunately, the authors do not explore the effect of this evaluation on the value students
placed on the activity.
Nevertheless, Hanna and de Nooy insist that learning through forums takes students beyond classroom
cultures and learner-to-learner communication. Such forums provide opportunities to join in an authentic
cultural practice in the foreign language on its own terms, for neither teachers nor students determine the
rules and conventions of the online community (p. 186). In other words, the authors maintain that
authentic communication in a foreign language helped students become part of the discourse community
in a real way, not in a way that resembled role-play. Language teachers have long emphasized the need
for authentic communication, and this book certainly helps outline what communication may look like in
practice.
However, Hanna and de Nooy could have addressed the following potentially negative aspects of online
forum participation for language learners. First, while forums do give students, particularly shyer ones, a
chance to communicate with native speakers, most students preferred oral discussions with classmates
and native speakers to forum participation. Implications include a greater emphasis on oral discussions
rather than written assignments in language classes. Second, the text-focused approach of forums can be
alienating for learners with different learning preferences, such as visual, auditory, or kinesthetic styles.
Furthermore, the most popular discussion forums online are gamer or picture rating forums, where the
emphasis is on kinesthetic (gaming) or visual (photos) information. If the authors' goal was for their
students to interact with native speakers in an authentic context, they might have focused on a less text-based genre as well.
Certainly the newspaper forums will provide language learners with insider cultural knowledge, as the
authors contend, but the authors do not state whose culture would be learned. Not all French speakers read or
participate in newspaper forums. However, the authors seem to imply that only formal French is
authentic. Alternatively, the authors might have focused on more colloquial French in other forums, such
as a French video gaming forum. Supplemental research on other forums, such as gaming or hobby
forums, would provide a wider perspective on learning and teaching with this technology.
Additionally, a comparison between blogs, traditional course management software, and forums could
also have benefited this book. Blogs and wikis are popular with many language teachers, probably more
so than discussion forums. A recent Alexa review of the top 500 sites on the web indicated that search
engines (e.g., Google), social networking sites (e.g., Facebook), media sites (e.g., YouTube), and blogs
and microblogs (e.g., Blogger, Twitter) represent the most visited sites internationally
REFERENCES
Black, R. W. (2006). Language, culture, and identity in online fanfiction. E-Learning, 3(2), 170–184.
Bloch, J. (2007). Abdullah's blogging: A generation 1.5 student enters the blogosphere. Language
Learning & Technology, 11(2), 128–141. Retrieved from http://llt.msu.edu/vol11num2/bloch/default.html
Gee, J. P. (1997). Thinking, learning, and reading: The situated sociocultural mind. In D. Kirshner & J. A.
Whitson (Eds.), Situated cognition: Social, semiotic, and psychological perspectives (pp. 235–259).
Mahwah, NJ: Lawrence Erlbaum.
Gee, J. P. (2001). What is literacy? In E. Cushman, E. R. Kintgen, B. Kroll, & M. Rose (Eds.), Literacy:
A critical sourcebook (pp. 537–544). Boston, MA: Bedford/St. Martin's.
Krashen, S. D. (1988). Second language acquisition and second language learning. New York: Prentice-Hall International.
Lam, W. S. E. (2000). Second language literacy and the design of the self: A case study of a teenager
writing on the Internet. TESOL Quarterly, 34, 457–482.
Ishaaq Akbarian
she found that interaction in lectures was associated with a higher frequency of the use of "I", whereas
monologic language in the MICASE showed a tendency towards more frequent use of "you". Her
research highlights the usefulness of MICASE as a specialized corpus.
In "Exploring Epistemic Modality in Academic Discourse Using Corpora," Rizomilioti examines
epistemic modality (i.e., "the speaker's confidence or lack of confidence in the truth of the proposition
expressed," p. 55) in three small corpora from articles in biology, literary criticism, and archaeology
journals, aiming to find similarities and differences across disciplines in terms of the expression of both
reduced and emphasized certainty. Overall, the researcher found a high degree of uncertainty in
archaeology articles, whereas certain claims were made with certainty in biology. The highest levels of
certainty were observed in literary criticism. While showing the need for further investigation, this study
produces evidence-derived pedagogical materials for teaching academic writing in the three fields.
Part 2, "Computer-Mediated Communication," includes three chapters. Apple and Gilabert Guerrero
report on a collaborative tandem e-mail project between LSP students in Dublin and Barcelona in their
article "Finding Common Ground in LSP: A Computer-Mediated Communication Project." They
investigated two tandem groups who wrote to each other, one with and one without assigned tasks. The
results showed that in task-assigned learning, LSP learners "produc[ed] more language, more regularly
and in a sustained way" (p. 85), suggesting that task-based learning is a flexible framework for
implementing tasks and catering to learners' specific needs in computer-mediated communication
contexts.
In "Uncovering Tasks and Texts – Teaching ESP Through Online Workshops," Hussin describes two
interactive online workshops: one to develop cross-cultural communication skills of ESL nursing
students, and the other to teach ESL business students to write a research conference paper and avoid
plagiarism. Hussin discusses the positive features and challenges of each workshop. She concludes that
online workshops are effective in helping students develop and manage their communication skills and
academic writing.
In "The SMAIL Project: A Dialogic Approach to Computer-Assisted Language Learning for the LSP
Classroom," Caballero Rodríguez and Ruiz Madrid report on developing a multimedia learning
environment, called SMAIL, to promote autonomous language learning. SMAIL was implemented in
accordance with the European Portfolio for Languages following the guidelines proposed by the
European Council. The system includes a learning styles questionnaire and a proficiency test in the
foreign language (French, Spanish, German, and English) to build a learner profile. The materials are
organized according to different genres, such as instruction leaflets, car ads, and argumentative texts.
Additionally, journey metaphors (i.e., discovering a language through a journey in which the learner
enters or discovers a new culture through its language) act as a blueprint for presenting learning activities
while taking into account learner diversity and autonomy. The authors describe a case study of how
learners interact with SMAIL and choose activities based on their personal interests.
Part 3 is titled "Specific Technology-Based Projects in Different Educational Settings." It contains two
chapters. In "Technology for Trust, Collaboration, and Autonomy among Asian Students at the
University Level," Devaux, Otterbach, and Cheng, who represent three institutions (China, the United
States, and Taiwan), describe a collaborative technology-based project to help LSP students from Taiwan
and China to transition from an educational approach that stresses the accumulation of information and
memorization to a more active one. By situating their project in the zone of proximal development
(Vygotsky, 1978), the authors created a scaffolding environment within which they helped students build
trust, move from isolation to collaboration, and develop autonomous learning. The authors reported a
substantial improvement in writing academic papers, working as a team, building a community,
communicating and collaborating, and conducting original research.
In "Networking for Learning and Teaching English for Specific Purposes," Healey describes a three-year
project in which specialists from Oregon conducted workshops aimed at improving skills in e-mail and
Internet use in an EFL/ESP environment in Tunisia, where the use of technology and the Internet is limited.
The project resulted in collaboration both among faculty members of different departments and between the
participating institutions.
Part 4, titled "Technology and Learner Autonomy in Higher Education," contains three chapters. In
"Learning English with Computers at the University Level," Lasagabaster and Sierra used a questionnaire
to investigate the impressions of 59 undergraduate students about their learning experiences in a
university multimedia laboratory in a self-access center in the Basque Autonomous Community. Based on
the results, they suggest that the teachers could analyze the software programs used in the courses and
consider the students' reactions to refine the pedagogical quality of these materials and their usage to
better meet LSP students' needs (p. 171).
In "Using the Internet to Promote Autonomous Learning in ESP," Luzón Marco and González Pueyo
explore the use of the Internet for ESP teaching as well as ways in which teachers can exploit Internet-based materials to design activities to promote autonomous learning by ESP learners. They present a
sample Internet-based activity (http://webquest.sdsu.edu/) in which a group of students is presented with
an authentic situation along with a specific task such as solving a problem or making a decision. The
authors conclude that such activities provide scaffolding and increase motivation, integration, learner-centeredness, and strategy use. They caution, however, that the use of technology does not per se
guarantee autonomy building, unless teachers take into account the students' language level and the
relevance of the activities to the students' goals, as well as provide guidance, feedback, and support.
In "Integration of E-learning into a Tertiary Educational Context," Trinder describes the integration of an
e-learning component into face-to-face Business English classes in Austria. She discusses considerations
in designing web-delivered courses (Chapelle, 2001) such as learner fit, language learning potential, and
focus on meaning, and presents students' perceptions concerning the effectiveness of the classes. As
Trinder argues, integrating an e-learning component into a program involves "a predisposition of the
learner to exploiting it to its best advantage" (p. 205). This predisposition depends on learner-internal
factors, such as needs and learning styles, interacting with contextual factors (e.g., endorsement by
teachers and peers), and courseware-intrinsic factors (e.g., motivation).
Part 5, "Terminology and Lexis: Teaching and Translation," includes two chapters. Piqué-Angordans,
Posteguillo, and Melcion, in "The Development of a Computer Science Dictionary, or How to Help
Translate the Untranslateable," report on a collaborative effort by researchers at two Spanish and one
British university to develop a bilingual dictionary of computing (English-Spanish/Spanish-English). The
dictionary is based on a corpus of 1,125,768 words, consisting of sub-corpora of different texts and
glossaries, and consists of words selected for relevance, clarity, and economy. Entries include head word,
part of speech, quotations, and technical commentary.
In "The Importance of Key Words for LSP," Scott considers the application of the notion of keyness to
LSP. Keyness presupposes an interest in text and textuality (p. 232). There are two main underlying
aspects of keyness: namely, importance and aboutness, or what a communicative event is about. Scott
argues that certain procedures might be used to identify the items with great aboutness as opposed to
those with no aboutness and then to consider what the relationships may be between such keywords and
other items in the middle of such a continuum (p. 234). He then suggests using technological tools to help
students identify the main point(s) of a text. He also discusses different levels and types of context and
gives a useful example for detecting keyness using WordSmith Tools (Scott, 1999). In the remainder of
the chapter Scott looks at keywords from a pedagogical perspective, concluding that they can provide a
further way of raising language awareness (p. 241).
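To make the notion of keyness more concrete, the following Python sketch scores words by log-likelihood, a statistic commonly used in corpus linguistics to compare a word's frequency in a study text against a reference corpus. This is a sketch under stated assumptions, not a reproduction of the example in the chapter: the function names (log_likelihood, keywords) and the two toy token lists are hypothetical, and whether this statistic matches the settings Scott used is not known from the review.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score.

    a, b: frequency of the word in the study and reference corpora.
    c, d: total number of tokens in the study and reference corpora.
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_tokens, reference_tokens, top=5):
    """Rank words that are relatively more frequent in the study text."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / c > b / d:  # keep positive keywords only
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top]


# Hypothetical toy data, far too small for real keyness analysis.
study = "the compiler parses the source and the compiler emits code".split()
reference = "the cat sat on the mat and the dog sat on the rug".split()

for word, score in keywords(study, reference):
    print(f"{word}\t{score:.2f}")
```

With realistic corpora, the top-ranked words give a rough indication of what a text is about, which is the sense of aboutness discussed above; function words such as "the" tend to drop out because their relative frequency is similar in both corpora.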
In "Conclusions," Arnó, Soler, and Rueda discuss the individual chapters from the perspectives of
specialized language, online communication, CALL in LSP, distance education, and learner autonomy.
They believe that it is no longer a matter of how to incorporate technology, but rather how to adapt LSP
practice to a context of constant technological changes (p. 257). The editors list key factors in the
process of adaptation to the changes brought about by IT. Among them are the need to adapt LSP practice
to technological changes; the need for LSP teachers to catch up with their students' technological skills;
technological innovation and collaboration among LSP practitioners; and external factors that impact the
adoption of technology by LSP practitioners.
Readers of Information Technology in Languages for Specific Purposes: Issues and Prospects will benefit
from the variety of perspectives on the use of technologies in LSP teaching, learning, and research.
However, much more work remains to be done. For instance, future practitioners and researchers could
focus on the application of IT to listening, speaking, reading, and writing skills, as well as grammar,
vocabulary, and pronunciation.
One limitation of the book is that it is restricted to papers presented at the 6th International Conference on
Languages for Specific Purposes. As a result, some important areas of LSP are not included. For instance,
the volume could have benefitted from the inclusion of a chapter on the use of IT in LSP assessment. A
chapter dealing with the limitations of technology in teaching LSP would have provided a needed balance
between the positive and negative effects of technology in teaching LSP.
To conclude, Information Technology in Languages for Specific Purposes: Issues and Prospects is an
informative, well-written, and timely contribution to the field of second language learning. Despite the
above-mentioned shortcomings, the volume is a helpful reference for LSP practitioners and researchers.
REFERENCES
Chapelle, C. (2001). Computer applications in second language acquisition. Cambridge: Cambridge
University Press.
Dudley-Evans, T., & St. John, M. (1998). Developments in English for specific purposes: A
multidisciplinary approach. Cambridge: Cambridge University Press.
Scott, M. (1999). WordSmith Tools 4. Oxford: Oxford University Press. (Also version 5 available at
http://www.lexically.net/wordsmith/)
Vygotsky, L. (1978). Mind in society. Cambridge, MA: Harvard University Press.
Why do learners seem to prefer certain types of expressions over others in their discourse?
How can a bundle like "the nature of the" or "at the end of the" be incorporated into formal language
instruction?
With regard to the first question, it is now clear that non-natives actually use more of certain favorite
formulaic sequences which they know well and tend to overuse as safe bets, compared to natives (de
Cock, 2000; Foster, 2001; Granger, 1998). However, exactly why some sequences are apparently more
familiar than others is not entirely clear, and most likely involves a complexity that we are only just
beginning to comprehend. For example, Xu, McKenny, and Morgan (2010) also compared Chinese
learner and native corpora and found an overuse of the bundle "as we all know." The researchers were able
to trace the phenomenon to a confluence of various factors, including a particular textbook which
prescribed the expression, differences in academic conventions in Chinese, and the relative pervasiveness
of an equivalent of "as we all know" in Mandarin. And that is just one bundle, from one particular
demographic.
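For readers unfamiliar with how bundles of this kind are found in a corpus, the sketch below counts recurrent four-word sequences and keeps those that pass a frequency cut-off and occur in more than one text. It is a minimal illustration only: the function name lexical_bundles, the three sample sentences, and the cut-off values are hypothetical and are not the criteria used in any of the studies cited here.

```python
import re
from collections import Counter, defaultdict


def lexical_bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return n-word sequences occurring at least min_freq times in at least min_texts texts."""
    freq = Counter()
    spread = defaultdict(set)  # which texts each sequence appears in
    for i, text in enumerate(texts):
        tokens = re.findall(r"[a-z']+", text.lower())
        for j in range(len(tokens) - n + 1):
            gram = " ".join(tokens[j:j + n])
            freq[gram] += 1
            spread[gram].add(i)
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and len(spread[gram]) >= min_texts
    }


# Hypothetical sample texts, for illustration only.
texts = [
    "At the end of the day we went home.",
    "She arrived at the end of the lesson.",
    "As we all know, practice helps.",
]

print(lexical_bundles(texts))
```

Running the same count separately on a learner corpus and a native corpus, and then comparing the normalized frequencies of each sequence, is one simple way to look for the kind of overuse described above.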
Furthermore, it seems unlikely that widespread incorporation of lexical bundles of the type described in
Chen and Baker will occur on any significant level unless teachers and students alike can be convinced
that their use will result in some kind of concrete advantage (e.g., higher scores on essays). There is some
indirect evidence that this is generally the case (Lewis, 2008; Ohlrogge, 2009), but it is still unknown
whether an increase in any particular type of lexical bundle, such as the ones natives use to hedge in
the Chen and Baker study, will ultimately translate into the kinds of positive gains that could motivate a
more widespread and systematic inclusion of them in courses.
Finally, and perhaps of greatest relevance to pedagogy: if and when such lexical bundles become
recognized as important, how does one go about learning and teaching them? The bundles identified in the
Chen and Baker study are extremely useful to the extent that they shed light on differences that merit
further exploration, but the ease with which the average teacher could explicitly teach those particular
word strings is less clear. Until very recently, no extensive lists were available specifically intended for
pedagogic use, but with the Academic Formulas List (Simpson-Vlach & Ellis, forthcoming) and the
Phrasal Expressions List (Martinez & Schmitt, under review), there seems to be an emerging trend of
resources becoming available that will help enable the selection of multiword lexical items that are
teachable and testable. However, the challenge will of course ultimately be how those items can best be
learned, and there is little doubt that the growing variety of multimedia described in the other three
articles in this special edition will contribute in that regard in ways we have yet to explore.
One such medium can already be considered relatively common in schools today: video. As pointed out
by Sydorenko in her paper "Modality of input and vocabulary acquisition," although some research already
exists into the effectiveness of learning vocabulary through video, very few studies have investigated the
vocabulary-learning advantages of using captions (i.e., subtitles) with video, and fewer still have looked
at whether any particular type of modality (e.g., watching with or without captions) may be more
beneficial to the recall of lexis. Her findings have direct implications for language pedagogy: learners
appear to develop better form-meaning links with new vocabulary when video (and audio) is combined
with captions. Naturally, as with the Chen and Baker study, Sydorenko's research will prove valuable for
the evidence it provides, but perhaps even more valuable for the questions it provokes, such as "Is there a
type of vocabulary that is learned better through captioned video?" and "Would the results be any different
in a non-laboratory setting?"
Although Sydorenko does not specifically seek to answer the question of what kinds of lexical items were
learned better than others in the study, some clues appear in the written reports provided by participants:
"Most words I learned were accompanied by actions on screen, such as sadites [sit down],
proshu vas [after you]" (p. 44). As acknowledged by a number of researchers (e.g., Coulmas, 1981),
language is full of formulae which are attached to certain social routines, and formulaic items like "sit
down" and "after you" may be ideally learned through the medium of captioned video.
Learners in Sydorenko's study reported two difficulties: the speed of the video and/or captions and, when
they did notice a new word or expression, retaining that form-meaning link in their heads. Since
participants did not have control over the video in the laboratory setting
in which the study was conducted, one wonders if those complaints would subside if they could control
functions like pause and rewind.
Clearly, one important aspect of technology today is its growing mobility. Ten years ago the use of video
in language instruction was only discussed in the context of the classroom (or the self-access center) and,
much more rarely, at home. However, given the popularization of mobile devices that are increasingly
usable as mobile video-viewing devices (e.g., the iPod and iPhone), the concept of using video in the
fashion described by Sydorenko, but with learners at the controls, is far from science fiction. Although
watching captioned video on such devices may be an eye strain, with the release of the iPad and similar
devices this obstacle can be overcome.
Unlike video, which has a fairly established tradition in the language classroom, little is
known about the pedagogic advantages of learning a language through electronic game play, and less still
is known about what, if any, vocabulary gets picked up by players. To explore that issue, deHaan, Reed,
and Kuwada, in "The effect of interactivity with a music video game on second language vocabulary recall,"
asked participants to either play or watch a video game, and tested them on what words were learned
under each condition. There was a large difference in the immediate uptake of vocabulary
(immediate post-test: watchers 57% vs. players 18%), but, as argued in Schmitt (2010), delayed scores are
what really count when evaluating durable learning. Here we find that the watchers still maintained a sizable
advantage (watchers 39% vs. players 13%); however, even the players achieved very good results
considering that the test required them to remember and write the written form in the cloze blanks. Laufer
and Goldstein (2004) found that form recall is the most difficult level of mastery of the form-meaning
link, and many other studies into incidental learning do not show anything near this amount of learning
(see Schmitt, 2010, for an overview of incidental vocabulary learning, mainly from reading). This of
course raises the question: How can video games be optimized for vocabulary acquisition? As deHaan et
al. point out, the results of their study suggest that working memory limitations and language-focused
game interactivity will certainly play a role, but there is so much that is yet to be explored. What we
know now is that it seems to be worth the exploration.
The final study by Stockwell, "Using mobile phones for vocabulary activities: examining the effect of
platform," is a very good example of the value of carrying out studies in non-laboratory settings,
particularly when mobile technology is involved (as is increasingly the norm). It is clear that in addition
to more conventional functions such as telephoning and text messaging, mobile devices are now regularly
used for everyday tasks once done exclusively on desk- and laptop computers like sending email and
Internet browsing. Researchers have begun to investigate the extent to which these relatively new
functions can be used in the classroom. However, as pointed out by Stockwell, the true value in a mobile
phone lies, obviously, in its mobility, and in order to truly understand its viability as a vocabulary-learning
tool it needs to be observed as it was intended to be used: outside the classroom. After
collecting data from three cohorts spanning three years, Stockwell found that when learners had the
option to perform the same activities using a personal computer, participants preferred the computer
nearly 80% of the time. Causes cited for their aversion to the use of the mobile phone included issues with
screen size, keyboard size, and data transmission speed. However, once more, given today's trends in the
mobile market, one is compelled to ask what difference, if any, a multi-touch enabled device (e.g.,
iPhone, Android) might make. Such handsets are slowly becoming the norm, pushing out conventional
keyboard phones, and allowing for a kind of interactive experience a conventional desktop computer does
not generally afford. Moreover, there is a plethora of apps released on a daily basis for such mobile
phones, and instead of having to access the Internet (as the participants did in the Stockwell study), it is
quite conceivable that a dedicated app might produce more satisfying results, particularly one which
provided such a tactile user interface. Finally, although at the time of writing there is still much to be
understood about them, mobile-like devices such as the iPad and its inevitable imitators could completely
overcome the challenge of screen size, while still remaining as portable as the conventional mobile phone.
In summary, what can be said regarding all four papers is that they contribute valuable evidence that
technology is one more medium through which vocabulary learning and teaching can be enhanced, but we
are still very far away from being able to claim that electronic media can play as large a role in
vocabulary acquisition as, say, reading a variety of authentic material on a regular basis. However, the
majority of the studies presented also suggest that the multimodality afforded by technology provides a
way for the vocabulary learner to engage with language input in ways not possible with more paper-based
media. Moreover, with the increasing integration of media and technologies into mobile devices that once
only existed in DVD players and personal computers, the suggestion that electronic vocabulary resources
may one day overtake more conventional ones (e.g., paper textbooks) as the main learning tool is not
far-fetched. In other words, while we still have much to understand about how vocabulary is learned through
multimedia, and though technology still has a long way to go as a vocabulary teaching and learning tool,
it is far from game over.
REFERENCES
Coulmas, F. (1981). Conversational routine. The Hague: Mouton.
de Cock, S. (2000). Repetitive phrasal chunkiness and advanced EFL speech and writing. In C. Mair &
M. Hundt (Eds.), Corpus linguistics and linguistic theory (pp. 51-68). Amsterdam: Rodopi.
Foster, P. (2001). Rules and routines: A consideration of their role in the task-based language production
of native and non-native speakers. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic
tasks: Second language learning, teaching, and testing (pp. 75-94). Harlow: Longman.
Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and formulae. In A. P.
Cowie (Ed.), Phraseology: Theory, analysis and applications (pp. 145-160). Oxford: Oxford University
Press.
Laufer, B., & Goldstein, Z. (2004). Testing vocabulary knowledge: Size, strength, and computer
adaptiveness. Language Learning, 54, 399-436.
Lewis, M. (2008). The idiom principle in L2 English: Assessing elusive formulaic sequences as indicators
of idiomaticity, fluency and proficiency (Unpublished doctoral thesis). Stockholm: Stockholm University.
Martinez, R., & Schmitt, N. (under review). A phrasal expressions list.
Ohlrogge, A. (2009). Formulaic expressions in intermediate EFL writing assessment. In R. Corrigan, E.
A. Moravcsik, H. Ouali, & K. M. Wheatley (Eds.), Formulaic language volume 2: Acquisition, loss,
psychological reality, and functional explanations (pp. 375-386). Amsterdam: John Benjamins
Publishing Company.
Schmitt, N. (2010). Researching vocabulary: A vocabulary research manual. Basingstoke: Palgrave
Macmillan.
Simpson-Vlach, R., & Ellis, N. C. (forthcoming). An academic formulas list: New methods in phraseology
research. Applied Linguistics.
Xu, J., McKenny, J., & Morgan, M. (2010, March). As we all know a confidence builder? Paper
presented at the FLaRN Interdisciplinary Conference on Formulaic Language, Paderborn, Germany.
bundles. For example, most bundles in conversation are clausal, whereas most bundles in academic prose
are phrasal. Other studies of bundles have focused primarily on comparisons between expert and non-expert
writing. Cortes (2002) investigated bundles in native freshman compositions and found that the
bundles used by these novice writers were functionally different from those in published academic prose.
In another study, Cortes (2004) compared native student writing with that in academic journals,
concluding that students rarely used the lexical bundles identified in the corpus of published writing. Even
if they did, the students used these bundles in a different manner. Working with academic writing only,
Hyland (2008b) indicated that there was disciplinary variation in the use of lexical bundles. He also
investigated the role of lexical bundles in published academic prose and in postgraduate writing and
found that postgraduate students tended to employ more formulaic expressions than native academics in
order to display their competence (Hyland, 2008a).
To date, only a few studies of L2 written data have performed structural and functional categorization of
lexical bundles. Although Hyland, in his two studies (2008a, 2008b), included master's theses and
doctoral dissertations produced by L2 English students in Hong Kong, he did not begin from a
perspective of second-language learning. Instead, he treated L2 postgraduate writing as "highly
proficient," on the grounds that all the data in his corpus of texts had been awarded high passes. Drawing
on the previous research, the present study aims to compare the use of recurrent word combinations in
native-speaker and non-native speaker academic writing in order to reveal the potential problems in
second language learning. Quantitative and qualitative analyses were carried out on three corpora in order
to identify similarities and differences in recurrent word combinations at different levels of writing
proficiency. One corpus (the L2 or learner corpus) contained writing from L1 Chinese learners of L2
English, while the other two comprised L1 writing: one from academics (whom we term expert writers)
and the other from university students (who are similar in background to the L1 Chinese learners, aside from
their first language). "Lexical bundles" is adopted as the primary term throughout this study, as it is used by
Biber in a series of studies upon which the theoretical and analytical framework of the current study is
based. Another term, "recurrent word combination," is also used interchangeably, given its transparent
literal meaning.
DATA AND METHODOLOGY
Data
Two existing corpora are used in the present study: the Freiburg-Lancaster-Oslo/Bergen (FLOB) corpus,
and the British Academic Written English (BAWE) corpus. To ensure comparability, only part of each
corpus was selected for investigation. The FLOB corpus is a one-million-word corpus of written British
English from the early 1990s, comprising fifteen genre categories. For the current study, only the category
of academic prose, FLOB-J, was used to represent native expert writing. FLOB-J contains eighty 2,000-word
excerpts from published academic texts, retrieved from journals or book sections. With regard to L1
and L2 student academic writing, parts of the BAWE corpus were utilized. The BAWE corpus, released
in 2008, contains approximately 3,000 pieces (approx. 6.5m. words) of proficient assessed student writing
from British universities. Two subcorpora were selected from the BAWE corpus: BAWE-CH contains
essays produced by L1 Chinese students of L2 English, and BAWE-EN is a comparable dataset
contributed by peer L1 English students. FLOB-J, BAWE-CH and BAWE-EN cover a wide range of
disciplines, including arts and humanities, life sciences, physical sciences and social sciences (for BAWE,
see Alsop & Nesi, 2009; for FLOB, see Hundt, Sand & Siemund, 1998). The size of each finalized corpus
for investigation is around 150,000 words (see Table 1).
Corpus      Word count   No. of texts
FLOB-J      164,742      80
BAWE-EN     155,781      60
BAWE-CH     146,872      53
Operationalization
Several key criteria have been pinpointed in the literature regarding how to generate a list of lexical
bundles using automated corpus tools. The first criterion is the cut-off frequency, which determines the
number of lexical bundles to be included in the analysis. The normalized frequency threshold for large
written corpora generally ranges between 20 and 40 per million words (e.g., Biber et al., 2004; Hyland,
2008b), while for relatively small spoken corpora, a raw cut-off frequency is often used, ranging from 2 to 10
(e.g., Altenberg, 1998; De Cock, 1998). The second criterion is the requirement that combinations
have to occur in different texts, usually in at least 3-5 texts (e.g., Biber & Barbieri, 2007; Cortes, 2004), or
10% of texts (e.g., Hyland, 2008a), which helps to avoid idiosyncrasies from individual writers/speakers.
The last issue concerns the length of word combinations, usually 2-, 3-, 4-, 5-, or 6-word units. Four-word
sequences are found to be the most researched length for writing studies, probably because the number of
4-word bundles is often within a manageable size (around 100) for manual categorization and
concordance checks. The frequency and dispersion thresholds adopted vary from study to study, and even
the sizes of corpora and subcorpora differ drastically, ranging from around 40,000 to over 5 million words.
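The three criteria above (sequence length, cut-off frequency, and dispersion across texts) can be sketched in a few lines of Python. This is only an illustrative sketch, not the corpus tool used in the study: the function name `extract_bundles`, the whitespace tokenization, the default parameter values, and the toy texts are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_freq=4, min_texts=3):
    """Return {bundle: raw frequency} for n-word sequences that pass both thresholds."""
    freq = Counter()            # raw frequency of each n-word sequence
    spread = defaultdict(set)   # indices of the texts each sequence occurs in
    for index, text in enumerate(texts):
        words = text.lower().split()  # naive tokenization: an assumption
        for start in range(len(words) - n + 1):
            gram = tuple(words[start:start + n])
            freq[gram] += 1
            spread[gram].add(index)
    return {
        " ".join(gram): count
        for gram, count in freq.items()
        if count >= min_freq and len(spread[gram]) >= min_texts
    }


# Toy corpus of four very short "texts" (invented for illustration).
toy_texts = [
    "as a result of the change",
    "as a result of this",
    "as a result of that",
    "as a result of it",
]
bundles = extract_bundles(toy_texts)
```

Keeping a set of text indices per sequence is what enforces the dispersion criterion: a sequence repeated many times by a single writer passes the frequency cut-off but not the minimum number of texts.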
After repeated experiments with the corpus data under investigation, the frequency and distribution
thresholds for determining 4-word lexical bundles were set to 4 times or more (approximately 25 times
per million words on average), occurring in at least three texts. This resulted in an optimum number of
bundles, which was considered sufficiently representative of the corpora being examined. One might
argue that an identical standardized threshold, such as 20 or 40 times per million words, should be applied
to each of the corpora investigated, as generally reported in the literature. However, when a normalized
rate is converted to raw frequencies, it substantially affects the number of generated word combinations
when comparing corpora of various sizes. For instance, if we compare an 80,000-word corpus with a
40,000-word corpus with a cut-off standardized frequency set at 40 times per million words, it means that
the converted raw-frequency threshold for the larger corpus is 3.2, whereas the converted raw-frequency
threshold for the smaller corpus is much lower, at 1.6. Any decimals have to be rounded up or down in
order to function as an operational cut-off frequency. Yet rounding down 3.2 to 3 results in a normalized
rate of 37.5 whereas rounding up 1.6 to 2 generates a normalized rate of 50, both of which are different
from the originally reported frequency threshold of 40 times per million words. Reporting only the
standardized frequency criterion could therefore be misleading, because a standardized cut-off frequency
would inevitably lose its expected impartiality after being converted into raw frequencies corresponding
to different corpus sizes. In this study, it could be argued that both the raw cut-off frequency and
corresponding normalized frequency should be reported in order to reflect transparently the threshold
adopted. For the sake of comparison, if the frequency threshold is set at 25 times per million words for the
present study, the converted raw frequencies are 3.7 (BAWE-CH), 3.9 (BAWE-EN) and 4.1 (FLOB-J) times,
which are all rounded up or down to 4 (cf. Table 2 and Table 3).
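The arithmetic in this paragraph can be checked with a short sketch. The helper names `raw_cutoff` and `normalized_rate` are hypothetical; the corpus sizes are those reported for the three corpora, and Python's built-in `round` stands in for the "rounded up or down" step.

```python
def raw_cutoff(per_million, corpus_size):
    """Convert a normalized threshold (per million words) into a raw frequency."""
    return per_million * corpus_size / 1_000_000


def normalized_rate(raw_count, corpus_size):
    """Convert a raw frequency back into a rate per million words."""
    return raw_count * 1_000_000 / corpus_size


# The worked example from the text: 40 per million in two small corpora.
for size in (80_000, 40_000):
    exact = raw_cutoff(40, size)
    operational = round(exact)  # the cut-off that can actually be applied
    print(size, exact, operational, normalized_rate(operational, size))

# The three corpora of the present study at 25 per million words.
corpus_sizes = {"FLOB-J": 164_742, "BAWE-EN": 155_781, "BAWE-CH": 146_872}
study_cutoffs = {name: round(raw_cutoff(25, size)) for name, size in corpus_sizes.items()}
```

The loop shows why a reported normalized threshold alone is ambiguous: the operational cut-offs of 3 and 2 correspond to different effective rates in the two corpora, whereas in the present study all three corpora round to the same raw cut-off.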
After automatic retrieval of 4-word clusters using the corpus tool WordSmith 4.0 (Scott, 2007), word
sequences containing content words that were present in the essay questions (e.g., "financial and non
financial"), or any other context-dependent bundles, usually incorporating proper nouns (e.g., "in the UK
and", "the Second World War"), were manually excluded from the extracted bundle lists. It was also found
that overlapping word sequences could inflate the results of quantitative analysis. Overlaps were thus
checked manually via concordance analyses. Two major types of overlaps are discussed here. One is
"complete overlap", referring to two 4-word bundles which are actually derived from a single 5-word
combination. For example, "it has been suggested" and "has been suggested that" both occur six times,
coming from the longer expression "it has been suggested that". The other type of overlap is "complete
subsumption", referring to a situation where two or more 4-word bundles overlap and the occurrences of
one of the bundles subsume those of the other overlapping bundle(s). For example, "as a result of" occurs
17 times, while "a result of the" occurs five times, both of which occur as a subset of the 5-word bundle "as
a result of the". Each case of the above overlapping word sequences (12 cases in total) was combined into
one longer unit so as to guard against inflated results.
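The two overlap types can be expressed as simple frequency conditions. The sketch below is an automated approximation of a check that the study carried out manually through concordance analysis; the function name `classify_overlaps` is hypothetical, and it assumes that raw frequencies for the relevant 5-word sequences are available alongside the 4-word bundle counts.

```python
def classify_overlaps(four_grams, five_grams):
    """Label 5-word sequences whose two embedded 4-word bundles overlap.

    Both arguments map tuples of words to raw frequencies.
    """
    findings = []
    for long_seq, n_long in five_grams.items():
        left, right = long_seq[:4], long_seq[1:]
        if left not in four_grams or right not in four_grams:
            continue
        n_left, n_right = four_grams[left], four_grams[right]
        if n_left == n_right == n_long:
            kind = "complete overlap"      # both bundles always occur together
        elif n_long == min(n_left, n_right):
            kind = "complete subsumption"  # the rarer bundle never occurs alone
        else:
            continue
        findings.append((" ".join(long_seq), kind))
    return findings


# Frequencies taken from the two examples given in the text.
four = {
    ("it", "has", "been", "suggested"): 6,
    ("has", "been", "suggested", "that"): 6,
    ("as", "a", "result", "of"): 17,
    ("a", "result", "of", "the"): 5,
}
five = {
    ("it", "has", "been", "suggested", "that"): 6,
    ("as", "a", "result", "of", "the"): 5,
}
overlaps = classify_overlaps(four, five)
```

A frequency match of this kind only flags candidates; confirming them against the concordance lines, as was done in the study, remains necessary.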
A further potential problem when comparing bundles across corpora involves what is actually counted
(i.e., type/token distinction). Should we count the number of types of bundles (e.g., counting "as a result of"
and "it is possible to" each as one type of bundle), or should we count the total occurrence of bundles (e.g.,
"as a result of" might occur 20 times in one corpus and 50 times in another)? One corpus could exhibit a
very narrow range of bundles but have very high frequencies of them, while another might have the
opposite pattern. We therefore distinguished between different types of bundles (types) and frequencies of
bundles (tokens).1 The numbers of bundle types and tokens, before and after data refinement, including
removing context-dependent bundles and overlapping ones, are shown in Table 4 below.
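The type/token distinction amounts to counting distinct bundles versus summing their frequencies. A minimal sketch follows, with invented frequencies for two hypothetical corpora that show the contrasting patterns described above; none of these numbers come from the study.

```python
def count_types_and_tokens(bundle_freqs):
    """Return (number of distinct bundles, total number of occurrences)."""
    return len(bundle_freqs), sum(bundle_freqs.values())


# Hypothetical corpus A: a narrow range of bundles, each used very often.
corpus_a = {"as a result of": 50, "it is possible to": 40}
# Hypothetical corpus B: a wider range of bundles, each used only a few times.
corpus_b = {
    "as a result of": 5,
    "it is possible to": 4,
    "the extent to which": 6,
    "on the other hand": 4,
    "in terms of the": 5,
}

types_a, tokens_a = count_types_and_tokens(corpus_a)
types_b, tokens_b = count_types_and_tokens(corpus_b)
```

Corpus A has fewer types but far more tokens than corpus B, which is why the two counts are reported separately.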
Table 4. Number of Bundles Before and After the Removal of Context-Dependent Bundles and Overlaps
            Before refinement                  After refinement
Corpus      No. of lexical   No. of lexical    No. of lexical   No. of lexical
            bundles (types)  bundles (tokens)  bundles (types)  bundles (tokens)
FLOB-J      118              749               108              704
BAWE-EN     120              757               104              667
BAWE-CH     90               554               80               507
Table 5. Proportional Distribution of Lexical Bundles (Types) Across the Structural Categories in LSWE,
FLOB-J, BAWE-EN and BAWE-CH (cf. Biber et al., 1999)
Category   Pattern                                                      ACAD (LSWE)  FLOB-J  BAWE-EN  BAWE-CH  Example
NP-based   (1) noun phrase with post-modifier fragment                  30%          32.5%   15.4%    15%
PP-based   (2) preposition + noun phrase fragment                       33%          36%     28.8%    32.5%    as a result of
VP-based   (3) copula be + NP/AdjectiveP                                2%           2.6%    10.6%    6.3%     is one of the
           (4) VP with active verb                                      --           0.9%    2.9%     6.3%     has a number of
           (5) anticipatory it + VP/adjectiveP + (complement clause)    9%           8.8%    5.8%     8.8%     it is possible to
           (6) passive verb + PP fragment                               6%           7%      10.6%    5%       is based on the
           (7) (VP +) that-clause fragment                              5%           2.6%    4.8%     6.3%     should be noted that
           (8) (verb/adjective +) to-clause fragment                    9%           7%      18.3%    15%      are likely to be
Others     (9) others                                                   6%           2.6%    2.8%     4.8%     as well as the
Total                                                                   100%         100%    100%     100%
Corpus      Bundle
FLOB-J      the degree to which (5)*, the extent to which (6), the fact that this (4)
BAWE-EN     the extent to which (8), the fact that the (8), the fact that they (4), the way in which (7)
Total (token)   27, 33
* The raw frequency is indicated in brackets, and this practice is used throughout this paper.
Secondly, a great number of NP + of and PP + of bundles can be grouped into two productive frames: "the
+ Noun + of the/a" and "in the + Noun + of". The professional writing in FLOB-J manifests a relatively
wide range of nouns that collocate with these two frames (Table 7 and Table 8). In this regard, it appears
that the patterns emerging from FLOB-J lend support to the finding reported by Biber et al. (2003), who
described the same two fixed frames (termed "phrase-frame" by Stubbs, 2007a), used for 43 and 17
different lexical bundles respectively in their academic prose, as "extremely productive frames" (Biber et
al., 2003, p. 78). In comparison, neither the British students nor the Chinese students seem to have
recognized the importance of these nominal or prepositional expressions in their academic writing.
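Grouping bundles into frames with one variable slot can be sketched as follows. The function name `group_into_frames` is hypothetical, the `*` placeholder is just a notational choice, and the frequencies attached to the example bundles are for illustration only.

```python
from collections import defaultdict


def group_into_frames(bundle_freqs, slot):
    """Group bundles that are identical except for the word at index `slot`."""
    frames = defaultdict(dict)
    for bundle, freq in bundle_freqs.items():
        words = bundle.split()
        frame = " ".join(words[:slot] + ["*"] + words[slot + 1:])
        frames[frame][words[slot]] = freq  # record the slot filler and its frequency
    return dict(frames)


# Illustrative bundles; the variable slot is the second word (index 1).
example = {
    "the nature of the": 17,
    "the end of the": 10,
    "the rest of the": 11,
    "the extent to which": 6,
}
frames = group_into_frames(example, slot=1)
```

The number of distinct fillers collected under a frame is one way to express how productive that frame is in a given corpus.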
[Table 7. Nouns in the frame "the + Noun + of the/a"; the BAWE-EN and BAWE-CH rows survive only in part]
FLOB-J: end (10), creation (4), existence (4), history (7), impact (4), magnitude (4), results (4), nature (17),
rest (11), role (5), rules (5), size (7), status (4), strength (5), structure (4), value (5). Total: 16 types, 100 tokens.
BAWE-EN, BAWE-CH (fragments): (7), quality (4), (5), rest (8). Totals as given: 69, 37.
* The bundles appearing in two or three corpora are indicated in bold. This practice is used throughout this paper.
[Table 8. Nouns in the frame "in the + Noun + of"; rows survive only in part]
FLOB-J, BAWE-EN, BAWE-CH (fragments): form (8), form, (23), case, (5), (10), context, (4), form (4).
Totals as given: 10, 87, 35, 19.
As seen in Table 5, both groups of student writing generally had more VP-based bundles than native
expert writing, and this tendency is particularly marked in certain subcategories. For example, we found
that the student writers in BAWE-EN and BAWE-CH used considerably more to-clause fragments
(see Table 9), showing a preference for the frame "in order to + Verb". L1 Chinese students in particular
used six different verbs that fit in the slot: achieve, avoid, be, maintain, make and understand, while
British students had two such bundles: "in order to make" and "in order to minimise". For this subcategory,
we see more similarity between BAWE-EN and BAWE-CH.
Table 9. Bundles in the Subcategory of to-Clause Fragments
[Table survives only in part. FLOB-J: to be able to (5). BAWE-EN, BAWE-CH: to be able to (4), to ensure that the (4).
Token totals as given: 36, 40.]
Although there is a substantial number of VP-based bundles in BAWE-CH, L1 Chinese students did not
use the passive verb + prepositional phrase (PassPP) form as frequently as native speakers did. As can
be seen in Table 10, there are seven passive-verb bundles in FLOB-J and eleven in BAWE-EN, both of
which make up around 20% of the VP-based bundle types within each individual corpus. In comparison,
the four passive-verb bundles in BAWE-CH constitute merely 10% of the total VP-based bundle types.
Additionally, none of the four passive bundles was shared with either group of native writers.
Table 10. Bundles in the Subcategory Passive Verb + Prepositional Phrases
Bundle: be seen as a (5), be included in the (4), be taken into account (5), be used in the (5),
can be applied to (7), can be found in (6), can be seen as (5), can be seen in (4), can be used for (5),
could be seen as (5), should be placed on (4)
Total       FLOB-J   BAWE-EN   BAWE-CH
type        7        11        4
token       34       55        19
Discourse Functions
The functional categorization adopted here follows the taxonomy devised by Biber and colleagues (Biber
& Barbieri, 2007; Biber et al., 2003, 2004). Three major categories were distinguished: referential
bundles, stance bundles, and discourse organizers.
Referential expressions are characterized by the function of attribute specification. The first type,
framing bundles, are used to specify a given attribute or condition (e.g., "in terms of the"). Another common
type of referential bundle is quantifying expressions (e.g., "per cent of the"), which qualify a proposition
with expressions related to anything potentially measurable, such as size, number, amount or extent. The
last subcategory of referential expressions includes place/time/text-deictic bundles (e.g., "at the beginning
of").
Stance bundles are often used to express a writer's evaluation of a proposition in terms of certainty or
uncertainty (epistemic) (e.g., "seems to have been"). They can also convey the writer's attitude about a
proposition (obligation/directive) (e.g., "it is important to"). If the writer's judgment on the ability to do
something is involved, then they are grouped under ability (e.g., "will be able to").
Epistemic: are more likely to, it can be argued, the fact that the
Discourse organizers are used to structure texts. They can introduce a topic (e.g., "essay is going to"),
elaborate on the topic (e.g., "be taken into account"), or make inferences (e.g., "in the sense that"). In addition,
a large number of the discourse organizers discovered here function to identify the focus that the writer is
making (e.g., "bear in mind that").
Topic introduction: essay is going to, last but not least, in this essay I
Topic elaboration: in more detail in, on the other hand, can be used to
Identification/focusing: one of the most, there would be no, we can see that
As can be seen from Figure 1, FLOB-J contains a higher proportion of referential expressions (60%),
whereas they are much less frequent in both BAWE-EN (37%) and BAWE-CH (41%). On the other hand,
discourse organizers rank as the largest category in both BAWE-EN and BAWE-CH, having very similar
proportions at 39% and 42% respectively, while discourse organizers in FLOB-J make up only about half
of that (21%). As for stance bundles, BAWE-EN has the highest percentage of use at 24%, but this
category is the smallest one in each of the three corpora.
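[Figure 1. Functional distribution of bundle types by corpus (percent). BAWE-CH: referential expressions 41%,
stance bundles 16%, discourse organisers 42%. BAWE-EN: referential expressions 37%, stance bundles 24%,
discourse organisers 39%. FLOB-J: referential expressions 60%, stance bundles 19%, discourse organisers 21%.]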
As can be seen from Table 11, only two cells, the referential expressions and discourse organizers in
FLOB-J, have an absolute value of R greater than 1.96, which suggests that these two categories in
FLOB-J made a statistically significant contribution to the rejection of the null hypothesis. We can use the
information from the values of R to conclude that there are significantly more referential expressions and
fewer discourse organizers in native academic writing in comparison with academic student writing.
Table 11. Standardized Residuals in a Chi-Square Contingency Table for Functional Distribution (Types)
χ² = 16.4, df = 4, p = 0.003, Cramér's V = 0.167
Corpus     Measure          Referential expressions   Stance bundles   Discourse organizers
FLOB-J     Observed Count   65                        20               23
           Expected Count   50.3                      21.5             36.2
           R                2.1                       -0.3             -2.2
BAWE-EN    Observed Count   38                        25               41
           Expected Count   48.4                      20.7             34.9
           R                -1.5                      0.9              1.0
BAWE-CH    Observed Count   33                        13               34
           Expected Count   37.3                      15.9             26.8
           R                -0.7                      -0.7             1.4
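The figures in Table 11 can be reproduced from the observed counts alone. The sketch below assumes that R is the residual (observed - expected) / sqrt(expected), which agrees with the reported values; the function name `chi_square_residuals` is hypothetical, and the p-value is not computed here because it needs a chi-square distribution function from a statistics library.

```python
import math


def chi_square_residuals(observed):
    """Return (chi-square, Cramér's V, expected counts, standardized residuals)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    residuals = [
        [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    chi2 = sum(r * r for row in residuals for r in row)
    k = min(len(observed), len(observed[0])) - 1
    cramers_v = math.sqrt(chi2 / (n * k))
    return chi2, cramers_v, expected, residuals


# Observed bundle types: rows FLOB-J, BAWE-EN, BAWE-CH;
# columns referential expressions, stance bundles, discourse organizers.
types = [[65, 20, 23], [38, 25, 41], [33, 13, 34]]
chi2, v, expected, residuals = chi_square_residuals(types)
significant_cells = [
    (i, j)
    for i, row in enumerate(residuals)
    for j, r in enumerate(row)
    if abs(r) > 1.96
]
```

With the type counts, only the FLOB-J cells for referential expressions and discourse organizers exceed the 1.96 criterion, in line with the interpretation given for Table 11.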
The token distribution of functions among the three corpora is virtually the same as for type distribution.
As can be seen in Figure 2, the proportion of referential expressions remains the most marked difference
between FLOB-J, BAWE-EN and BAWE-CH, as referential expressions make up almost two thirds of
the bundles in FLOB-J. On the other hand, both BAWE-EN and BAWE-CH rely more heavily on
discourse organizers, having proportions as high as 39% and 48% respectively.
A chi-square test indicates that there is a significant difference, in terms of the functional distribution of
bundle tokens, among the three groups at the 0.05 level (χ² = 148.5, df = 4, p < 0.0005, Cramér's V =
0.199). The standardized residuals were again calculated. As can be seen from Table 12, apart from the
stance bundles in FLOB-J, every cell in this contingency table contributed significantly to the differences.
On the basis of the information provided by R, the referential expressions and discourse organizers in
FLOB-J are still found to make the largest contribution to rejecting the null hypothesis, just as in the type
distribution. On the whole, there are significantly more referential expressions and fewer discourse
organizers in native expert writing, while both groups of student writing contain significantly fewer
referential expressions and more discourse organizers. In addition, the British students, represented by
BAWE-EN, used more stance bundles than expected, whereas the Chinese students in BAWE-CH used
fewer stance bundles.
Drawing on the standardized residuals from functional analysis (Table 11 and Table 12), the FLOB-J
corpus appears to represent the group which differs the most from the other two groups of university
student writing. Given that the texts retrieved from FLOB-J are published academic texts, written by
native academics, and must therefore have been repeatedly edited by experienced editors, it is not too
surprising to see that FLOB-J distinguishes itself among the three groups of writing. The similarities
between BAWE-EN and BAWE-CH revealed by the standardized residuals also meet with our
expectations to a certain extent. The student writing in both BAWE-EN and BAWE-CH was produced by
university students, who can be regarded as novice academic writers. In addition, both groups of student
writing were originally extracted from the same BAWE corpus, although it should be borne in mind that
the topics of the assignments varied to a very large degree in these two student subcorpora,
covering many disciplines. It should be noted also that the text types and constituents in the FLOB-J and
BAWE subcorpora might have had an impact on the analysis, and this will be discussed further in the
Discussion section.
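[Figure 2. Functional distribution of bundle tokens by corpus (percent). BAWE-CH: referential expressions 39%,
stance bundles 13%, discourse organisers 48%. BAWE-EN: referential expressions 37%, stance bundles 24%,
discourse organisers 39%. FLOB-J: referential expressions 62%, stance bundles 18%, discourse organisers 20%.]
Table 12. Standardized Residuals in a Chi-Square Contingency Table for Functional Distribution (Tokens)
Corpus     Measure          Referential expressions   Stance bundles   Discourse organizers
FLOB-J     Observed Count   437                       125              142
           Expected Count   329.5                     132.3            242.2
           R                5.9                       -0.6             -6.4
BAWE-EN    Observed Count   246                       161              260
           Expected Count   312.2                     125.4            229.4
           R                -3.7                      3.2              2.0
BAWE-CH    Observed Count   196                       67               244
           Expected Count   237.3                     95.3             174.4
           R                -2.7                      -2.9             5.3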
As revealed in the type distribution (see Figure 1 and Table 11) and token distribution (see Figure 2
and Table 12), we already know that referential expressions are highly frequent in expert academic
writing, whereas university students do not rely on this discourse function as much. Among the referential
expressions, one type of quantifying bundle is noteworthy: the extent/degree modifiers, which are
present in both groups of native writing but not in learner writing. There are four such bundles in FLOB-J: in so far as (6), the degree to which (5), the extent to which (6), and to a large extent (4), and two in
BAWE-EN: the extent to which (8) and to a certain extent (4). It appears that learners do not use this type
of modifier very much, whereas native speakers tend to use it to modify the extent or degree of their
propositions, as the following examples demonstrate:
On the other hand, Chinese student writers seem to use certain referential deictic expressions, such as in
the long run (13), in the recent years (6), and all over the world (6), as exemplified below. These deictic
expressions do not appear in the repertoire of word combinations used by professional writers or by their British
student peers.
They are more or less equivalent way of paying out retained earning,
while stock repurchases indeed have become an important source of payout
in the recent years. (BAWE-CH)
This strategy is now very popular all over the world, for it maximizes
the value of limited monetary amount of fringe benefits and gives the
employees some controls over their own rewards. (BAWE-CH)
The first word combination, in the long run, is an idiomatic expression, occurring 13 times in BAWE-CH
but only once in FLOB-J. This idiom is actually more characteristic of non-academic text
than of academic prose, and is quite frequent in speech, as indicated by the British National Corpus
(BNC),4 although dictionaries do not always identify it as an informal expression (e.g., Macmillan
Dictionary, Rundell, 2007). The second bundle, in the recent years, was generally expressed as in (more)
recent years and recently by native writers in FLOB-J and BAWE-EN. Interestingly, we found 2,344
instances of in recent years and only 2 instances of in the recent years in the BNC. This suggests that in
the recent years is a learner bundle rather than a native bundle. The third expression, all over
the world, might reflect a general tendency of learners to be categorical and to over-generalize, as this
expression appears to be favored by learners at various proficiency levels (Chen, 2009).
Turning now to stance bundles, it was found that the supposedly least-competent writers, represented by
the L2 writers in BAWE-CH, employed the smallest range of epistemic bundles, whereas the most
proficient writers in FLOB-J manifested the widest range of epistemic expressions. Further investigation
of the epistemic markers used by the native writers shows that both native groups are quite capable of
taking advantage of comprehensive measures to hedge their statements. The frame copula be + likely to
is frequently used in native writing to mitigate a proposition, with a few variations such as is likely to be
(7), are likely to be (9), and are more likely to (13) (see Note 5). In addition to this frame, native writers are also capable of
flexibly employing other hedging devices, including the Anticipatory it + adjective fragment frame (it
is clear that (19), it is not clear (4), it is possible to (6)), modal verbs (would have to be (12), would need to
be (4), would be difficult to (5)), hedging verbs (seems to have been (6), it has been suggested (4), it
can/could be argued (19), it is estimated that (4)), and hedging nouns (there is no evidence (4), there is
evidence that (5), the fact that the (8), etc.) to qualify their propositions.
This change indicates that two relatively dissimilar clusters have been
merged and that the number of clusters prior to this merger is likely to
be the most appropriate. (FLOB-J)
By contrast, there are only four bundles in the L2 writing that can be regarded as hedging expressions: are
more likely to (5), is considered to be (4), it has been suggested that (6), and it is believed that (5).
Both British and Chinese students used a relatively high number of discourse organizers in their writing
when compared to the academic prose in FLOB-J. In particular, they used more discourse organizers to
elaborate and/or clarify a topic, the majority of which are VP-based bundles, such as Passive verb +
prepositional phrase fragment (can be regarded as, be included in the, etc.), Verb + to-clause fragment
(can be used to, in order to make, etc.), and Subject + verb (this means that the, that is to say).
An impression of the instances above is that they all seem to be rather verbose. The most noticeable
example of tautology might be the last one, which repeatedly refers to travelling for social purposes in
India, using various paraphrases. The contrast with that is to say in the professional academic writing
below demonstrates one of the major differences between L1 expert writing and learner writing. In the
following example, by use of the expression that is to say, the native academic does not simply
paraphrase what has already been written as learners do, but instead progresses further, using other means
(e.g., giving a specific example) to illustrate the previous proposition.
It is now accepted on all sides that Britain needs more of its workforce
to be vocationally trained to intermediate levels; that is to say, to
craft or technician standards as represented, for example, by City and
Guilds examinations (at part 2) or BTEC National Certificates and
Diplomas. (FLOB-J)
DISCUSSION
The analysis in the previous sections set out to compare the use of recurrent word combinations, in terms
of their structures and functions, in native expert writing, native student writing and L2 student writing. A
deeper investigation, however, suggested that the quantitative analysis needed to be complemented and
supported by qualitative analysis based on an examination of expanded concordance lines. By
utilizing such a hybrid methodology, a number of distinctive features, which vary according to level of
writing proficiency, have been unveiled.
L2 academic writing has been found to be stylistically more verbose (cf. Lorenz, 1998, 1999) and to show
less control of cautious language (cf. Hyland, 1994; Hyland & Milton, 1997). Consider the use of hedging
in cautious language for example. L1 Chinese learners of L2 English in the current study are found to
show some control of this feature in their academic writing, but do not demonstrate it as diversely and
robustly as native writers do. Indeed, Hyland and Milton (1997) compared expressions for qualification
and certainty in the writing of L1 and L2 students and found that Chinese students in Hong Kong in
particular had some problems in this pragmatic area. They concluded that this could be partly attributed to
a lack of introduction of hedging devices in EAP textbooks. Another aspect relating to L2 writers'
underuse of hedging devices is their tendency to be categorical and to over-generalize. As Ringbom
(1998) discovered, even at advanced level, learner language was still in some respects more, in others less,
vague than native speaker language, although this was a word-based perspective rather than a
phraseological one. Investigating learners' writing development using IELTS candidate scripts across
band scores, Kennedy and Thorp (2007) also pointed out that L2 learners at lower proficiency levels tend
to express their opinions in a more categorical manner, and that their writing is modified less by hedging.
The finding here, therefore, reinforces this distinctive aspect of L2 writing from a phraseological
viewpoint. The tendency to hedge less and instead adopt an overstating tone seems to be universal for
learners from different L1 backgrounds, as the studies discussed above are not exclusive to L1 Chinese
learners of L2 English. What is more, it appears that these features may change with proficiency
development, as evidenced by Kennedy and Thorp (2007). Learner writing is likely to improve as
proficiency progresses, most likely by edging closer to the norms of native expert writing and showing
better control of cautious language.
Another interesting issue is the relationship between the number of recurrent word combinations and
writing proficiency. As shown in Table 4, the number of recurrent word combinations increases with
advancing writing proficiency, which is the case both for the range of lexical bundles used (types), and
the overall occurrence of lexical bundles (tokens). It appears that the use of formulaic expressions grows
with writing proficiency. This finding is, nonetheless, contrary to some of the results reported in the
literature (De Cock, 2000; Hyland, 2008a). It should be noted that these studies did not remove
overlapping bundles or context-dependent ones, while the current research does. Take Hyland's study
(2008a), for example. He compared academic clusters among published research articles, PhD
dissertations, and Master's theses. In his conclusion, Hyland indicated that the least confident or proficient
students, at Master's level, relied on formulaic expressions most, while the expert writers used the fewest
clusters. Comparisons across studies like these, however, call for extreme caution. Firstly,
Hyland included all the topic-related clusters occurring in his study (e.g., in the Hong Kong), while such
context-dependent bundles are excluded in the present paper. Next, our repeated experiments have
revealed that the number of recurrent word combinations retrieved might relate to corpus size to a large
extent. On the whole, larger corpora will generate fewer recurrent word combinations than smaller corpora at the same normalized cut-off frequency, because large corpora will elicit higher
converted raw frequencies, as discussed in the section on Operationalization. Furthermore, the dispersion
requirement (e.g., occurring in at least three texts or 10% of texts) also affects the number of
recurrent word combinations. It is virtually impossible to find different corpora of exactly the same size,
composed of the same number of texts, for direct comparison. For cross-study comparisons, we have to
bear these limitations in mind. As a result, it is still not conclusive as to whether there is a relationship
between proficiency and the number of formulaic expressions used, particularly when the student groups
are not identical, as in Hyland (2008a) and the current study. Interestingly, the results from the current
study are in line with those in De Cock's study (2004), in which she compared recurrent word
combinations between native and non-native speech. She found that, after discarding the repeats (e.g., I I
or the the) and the hesitation items (e.g., er or erm), native data actually contain more recurrent word
combinations of different lengths than non-native speech does.
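The interaction between corpus size, the normalized cut-off, and the dispersion requirement can be illustrated with a small sketch. The code below is not the WordSmith Tools procedure used in the study; it is a simplified Python illustration in which the function names, the whitespace tokenizer, the four-word bundle length, and the default cut-off of 25 occurrences per million words are all assumptions made for the example. Only the "at least three texts" dispersion criterion is taken from the discussion above.

```python
from collections import Counter, defaultdict
from math import ceil


def raw_cutoff(per_million, corpus_tokens):
    """Convert a normalized cut-off (per million words) into a raw frequency."""
    return max(1, ceil(per_million * corpus_tokens / 1_000_000))


def extract_bundles(texts, n=4, per_million=25, min_texts=3):
    """Return {bundle: raw frequency} for n-grams meeting both thresholds.

    Assumed, simplified pipeline: lower-casing and whitespace tokenization,
    no removal of overlapping or context-dependent bundles.
    """
    freq = Counter()
    spread = defaultdict(set)  # bundle -> ids of the texts it occurs in
    total_tokens = 0
    for text_id, text in enumerate(texts):
        tokens = text.lower().split()
        total_tokens += len(tokens)
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(text_id)
    cutoff = raw_cutoff(per_million, total_tokens)
    return {
        gram: count
        for gram, count in freq.items()
        if count >= cutoff and len(spread[gram]) >= min_texts
    }


# The same normalized cut-off translates into very different raw thresholds.
small = raw_cutoff(25, 150_000)    # 4 occurrences
large = raw_cutoff(25, 1_000_000)  # 25 occurrences

sample = [
    "it is clear that the model works",
    "we think it is clear that",
    "however it is clear that costs rise",
]
bundles = extract_bundles(sample)
```

Under these assumptions a bundle needs 4 occurrences to qualify in a 150,000-word corpus but 25 in a one-million-word corpus, so a sequence that passes in the smaller corpus may not pass in the larger one unless its raw frequency grows in step with corpus size.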
It has to be acknowledged that the use of FLOB-J to represent native expert academic writing might have
had some impact on the word combinations derived. First of all, a large proportion of the texts included in
FLOB-J are hard-science based. This is probably why we found bundles such as a function of the, the
magnitude of the, the structure of the, and a high level of in FLOB-J, which appear to be strongly
concerned with the disciplines of hard science. Meanwhile, the journal papers or book sections selected in
FLOB-J are all 2,000-word excerpts, rather than complete texts like those included in the BAWE student-writing corpus. It is probable that there are more occasions in BAWE student writing to use discourse
organizers, as student essays are mostly structured as Introduction, Body and Conclusion. However, it
should also be noted that when examining concordance lines, we found very few discourse organizers
which could be attributed to the differences between excerpts and full texts (i.e., topic-introduction
bundles such as in this essay I or last but not least). In addition to the use of FLOB-J, clearly there have
been other constraints on the present study. For one, this corpus-driven approach cannot cater for
discontinuous word combinations, and thus certain information might be missing. For another, it is
notoriously difficult to obtain large quantities of quality learner data, and the learner writing investigated
in this paper is not error-tagged. We cannot know for sure if there are any learner errors which might have
affected the generation of word combinations, although these assignments have been assessed as being
good university essays.
CONCLUSION
This comparative study has revealed the fundamental differences and similarities between native and
learner academic writing. Through structural and functional comparisons, it has been found that the use of
lexical bundles in non-native and native student essays is surprisingly similar. They both contain many
more VP-based bundles and discourse organizers than native expert writing does, which appears to be a
sign of immature writing. On the other hand, native professional writers exhibit a wider range of NP-based bundles and referential markers. A further qualitative examination revealed, however, that native
student writing actually shares with native professional writing a few features distinctive of academic writing, such as the control of
cautious language. Non-native writing, however, demonstrates a tendency
that seems to be exclusive to L2 writing (e.g., over-generalizing and favoring certain idiomatic
expressions and connectors).
With the development of corpus techniques, the importance of corpus-extracted word combinations as
building blocks in constructing discourse has been increasingly recognized. However, the growing
interest in identifying phraseology with corpus tools during the past decade does not appear to have
encouraged ELT publishers or practitioners to put more emphasis on computer-retrieved formulaic
language in the curriculum and/or materials. In the current study, through investigation of three groups of
academic writing, it was found that there was a gap, in terms of the use of lexical bundles, between native
expert academic writing and university student writing (native and non-native alike). We argue that, after
careful selection and editing, the frequency-driven formulaic expressions found in native expert writing
can be of great help to learner writers to achieve a more native-like style of academic writing, and should
thus be integrated into ESL/EFL curricula.
NOTES
1. All the frequencies of bundles indicated in this study are raw frequencies rather than normalized ones.
2. In LSWE, the data of academic prose is as large as 5.3 million words.
3. The statistical package used is SPSS 17.0 (2008).
4. In the BNC, for academic writing, the frequency per million words of in the long run is 6.72. This
figure is 8.27 for non-academic prose and 4.23 for speech.
5. For reasons of space, the frequencies in brackets are the sum of both FLOB-J and BAWE-EN.
ACKNOWLEDGEMENTS
Some of the data in this study come from the British Academic Written English (BAWE) corpus, which
was developed at the Universities of Warwick, Reading and Oxford Brookes, under the directorship of
Hilary Nesi and Sheena Gardner (formerly of the Centre for Applied Linguistics [previously called
CELTE], Warwick), Paul Thompson (Department of Applied Linguistics, Reading), and Paul Wickens
(Westminster Institute of Education, Oxford Brookes), with funding from the ESRC (RES-000-23-0800).
REFERENCES
Alsop, S., & Nesi, H. (2009). Issues in the development of the British Academic Written English
(BAWE) corpus. Corpora, 4(1), 71–83.
Altenberg, B. (1998). On the phraseology of spoken English: the evidence of recurrent word-combinations. In A. P. Cowie (Ed.), Phraseology: theory, analysis and applications (pp. 101–122).
Oxford: Oxford University Press.
Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for
Specific Purposes, 26, 263–286.
Biber, D., & Conrad, S. (1999). Lexical Bundles in Conversations and Academic Prose. In H. Hasselgard
& S. Oksefjell (Eds.), Out of corpora: studies in honour of Stig Johansson (pp. 181–190). Amsterdam:
Rodopi.
Biber, D., Conrad, S., & Cortes, V. (2003). Lexical bundles in speech and writing: an initial taxonomy. In
A. Wilson, P. Rayson & T. McEnery (Eds.), Corpus linguistics by the Lune: a festschrift for Geoffrey
Leech (pp. 71–93). Frankfurt: Peter Lang.
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at ...: Lexical bundles in university teaching and
textbooks. Applied Linguistics, 25(3), 371–405.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). The Longman Grammar of Spoken
and Written English. London: Longman.
Chen, Y.-H. (2009). Lexical Bundles across Learner Writing Development. Unpublished doctoral thesis,
Lancaster University, Lancaster, UK.
Conklin, K., & Schmitt, N. (2008). Formulaic sequences: Are they processed more quickly than
nonformulaic language by native and nonnative speakers? Applied Linguistics, 29(1), 72–89.
Cortes, V. (2002). Lexical bundles in Freshman composition. In R. Reppen, S. M. Fitzmaurice & D. Biber
(Eds.), Using corpora to explore linguistic variation (pp. 131–145). Amsterdam: John Benjamins
Publishing Company.
Cortes, V. (2004). Lexical bundles in published and student disciplinary writing: Examples from history
and biology. English for Specific Purposes, 23, 397–423.
De Cock, S. (1998). A recurrent word combination approach to the study of formulae in the speech of
native and non-native speakers of English. International Journal of Corpus Linguistics, 3(1), 59–80.
De Cock, S. (2000). Repetitive phrasal chunkiness and advanced EFL speech and writing. In C. Mair &
M. Hundt (Eds.), Corpus Linguistics and Linguistic Theory (pp. 51–68). Amsterdam: Rodopi.
De Cock, S. (2004). Preferred sequences of words in NS and NNS speech. Belgium Journal of English
and Literatures (BELL), New Series 2, 225–246.
De Cock, S., Granger, S., Leech, G., & McEnery, T. (1998). An automated approach to the phrasicon of
EFL learners. In S. Granger (Ed.), Learner English on computer (pp. 67–79). London: Longman.
Granger, S., & Meunier, F. (Eds.). (2008). Phraseology: An interdisciplinary perspective. Amsterdam:
John Benjamins.
Granger, S., & Paquot, M. (2008). Disentangling the phraseological web. In S. Granger & F. Meunier (Eds.),
Phraseology: An interdisciplinary perspective. Amsterdam: John Benjamins.
Hundt, M., Sand, A., & Siemund, R. (1998). Manual of Information to accompany The Freiburg-LOB
Corpus of British English (FLOB). Retrieved from http://khnt.hit.uib.no/icame/manuals/flob/INDEX.HTM
Hyland, K. (1994). Hedging in academic writing and EAP textbooks. English for Specific Purposes, 13(3),
239–256.
Hyland, K. (2008a). Academic clusters: Text patterning in published and postgraduate writing.
International Journal of Applied Linguistics, 18(1), 41–62.
Hyland, K. (2008b). As can be seen: Lexical bundles and disciplinary variation. English for Specific
Purposes, 27(1), 4–21.
Hyland, K., & Milton, J. (1997). Qualification and certainty in L1 and L2 students' writing. Journal of
Second Language Writing, 6(2), 183–205.
Jiang, N., & Nekrasova, T. M. (2007). The processing of formulaic sequences by second language
speakers. The Modern Language Journal, 91, 433–445.
Kennedy, C., & Thorp, D. (2007). A corpus investigation of linguistic responses to an IELTS Academic
Writing task. In L. Taylor & P. Falvey (Eds.), IELTS collected papers: research in speaking and writing
assessment (pp. 316–378). Cambridge: Cambridge University Press.
Lorenz, G. (1998). Overstatement in advanced learners' writing: stylistic aspects of adjective
intensification. In S. Granger (Ed.), Learner English on Computer (pp. 53–66). New York: Addison
Wesley Longman Limited.
Lorenz, G. (1999). Adjective intensification: Learners versus native speakers. A corpus study of
argumentative writing. Amsterdam: Rodopi.
Meunier, F., & Granger, S. (Eds.). (2007). Phraseology in Foreign Language Learning and Teaching.
Amsterdam: John Benjamins Publishing.
Ringbom, H. (1998). Vocabulary frequencies in advanced learner English. In S. Granger (Ed.), Learner
English on Computer (pp. 41–52). London: Addison Wesley Longman Limited.
Rundell, M. (Ed.). (2007). Macmillan English Dictionary For Advanced Learners (Second Edition).
Oxford: Macmillan Education.
Schmitt, N. (2004). Formulaic sequences: acquisition, processing, and use. Amsterdam: John Benjamins.
Schmitt, N., Grandage, S., & Adolphs, S. (2004). Are corpus-derived recurrent clusters
psycholinguistically valid? In N. Schmitt (Ed.), Formulaic Sequences (pp. 127–152). Amsterdam: John
Benjamins Publishing.
Scott, M. (2007). Oxford WordSmith Tools (Version 4.0) [Computer software]. Oxford: Oxford
University Press.
SPSS for Windows (2008), Rel. 17.0.0. [Computer software]. Chicago: SPSS Inc.
Stubbs, M. (2007a). An example of frequent English phraseology: Distribution, structures and functions.
In R. Facchinetti (Ed.), Corpus Linguistics 25 years on (pp. 89–105). Amsterdam: Rodopi.
Stubbs, M. (2007b). Quantitative data on multi-word sequences in English: The case of word world. In
M. Hoey, M. Mahlberg, M. Stubbs & W. Teubert (Eds.), Text, Discourse and Corpora: Theory and
Analysis (pp. 163–189). London: Continuum.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University Press.
Wray, A. (2008). Formulaic Language: Pushing the Boundaries. Oxford: Oxford University Press.
BAWE-CH (raw frequencies in rank order; the corresponding bundle labels are missing from this part of the list): 36, 24, 17, 16, 13, 12, 11, 10, 10, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 6 (×10), 5 (×12), 4 (×24)
BAWE-EN (bundle, with raw frequency in brackets)
in the case of (23); can be used to (19); as a result of + (the) (17); it is important to (17); it could be argued that + (the) (14); (at) + the end of the (13); in terms of the (13); (for) + the development of the (13); it is possible to (13); is one of the (12); the rest of the (12); it can be seen + (that) (12); one of the main (11); as well as the (10); the use of the (9); in the same way (9); to be able to (8); (due) + to the fact that (8); an example of this + (is) (8); in the form of (8); it is clear that (8); the fact that the (8); in order to make (8); the nature of the (7); one of the most + (important) (7); and as a result (7); it is necessary to (7); the way in which (7); can be applied to + (the) (7); than that of the (7); and the use of (6); are more likely to (6); not be able to (6); the extent to which (6); was one of the (6); could be used to (6); is an example of + (a) (6); it is difficult to (6); the structure of the (6); can be found in (6); is the fact that (6); the length of the (6); with respect to the (6); would have to be (6); can be seen as + (5); + be seen as a (5); at the same time (5); be taken into account (5); be used in the (5); could be seen as (5); it would have been (5); this is due to + (the) (5); with the development of (5); would be able to (5); are likely to be (5); can also be used + (to) (5); in relation to the (5); in terms of its (5); it can be argued + (that) (5); of the number of (5); through the use of (5); to a lack of (5); can be used for (5); for each of the (5); there would be no (5); a great deal of (4)
FLOB-J (bundle, with raw frequency in brackets)
in the context of + (the/a) (19); in the case of (19); on the other hand (19); the nature of the (17); as a function of (15); on the basis of (14); in terms of the (14); it is necessary to (14); the way in which + (the) (14); (at) + the end of the (13); it is clear that (11); the rest of the (11); one of the most (10); at the same time (10); by the fact that (9); a wide range of (9); as a result of (9); that there is a (9); per cent of the (9); in the form of (8); on the one hand (8); the extent to which (8); in the sense that (8); would have to be (8); as we have seen (8); in the presence of (8); in the absence of (7); is likely to be (7); the size of the (7); are more likely to (7); as we shall see (7); that there is no (7); the history of the (7); the turn of the century (7); can be found in (6); in the first place (6); it is difficult to (6); it is possible to (6); of a number of (6); on the part of (6); in so far as (6); in the number of (6); it is important to (6); seems to have been (6); with respect to the (6); are shown in fig (6); can be used to (6); in the light of (6); a function of time (6); as shown in fig (6); be taken into account (5); for each of the (5); in a number of (5); it can be seen + that (5); to the fact that (5); would be difficult to (5); at the time of (5); be found in the (5); in the hands of (5); the degree to which (5); the role of the (5); the rules of the (5); the strength of the (5); the value of the (5); to be able to (5); a function of the (5)
BAWE-CH (continued; each bundle has a raw frequency of 4)
a high level of; a large number of; as soon as the; essay is going to; in the form of; is considered to be; it has to be; it is easy for; meet the requirement of; must be able to; of the number of; pay more attention to; the role of the; this is due to
BAWE-EN (continued; each bundle has a raw frequency of 4)
an integral part of; as part of the; by the presence of; in an attempt to; is by no means; it is estimated that; it should be noted; of some of the; on the other hand; should be able to; the fact that they; there is no evidence; this means that the; to a certain extent; to be added to; to enable them to; to take into account; would need to be; as a way of; at the heart of; be included in the; because it is not; can be seen in; for the use of; in order to minimise; in the absence of; in this essay I; should be placed on; taking into account the; that is to say; that need to be; the quality of the; the size of the; this may be due + to; to cope with the; will be able to; will be used to; with the addition of
FLOB-J (continued; bundle, with raw frequency in brackets)
in the course of (5); there is evidence that (5); whether or not to (5)
Each of the following has a raw frequency of 4: a large number of; an example of this; be seen in the; by a variety of; in contrast to the; in more detail in; in relation to the; in terms of a; in view of the; it has been suggested; it has not been; it is not always; on a number of; the fact that this; the right hand side; the status of the; the structure of the; the ways in which; to a large extent; a high level of; are likely to be; at each end of; at the beginning of; end of the spectrum; has a number of; in the face of; is concerned with the; it is not clear; the creation of a; the existence of a; the impact of the; the magnitude of the; the point of view; the results of the; the second half of; to that of the; was followed by a; was not so much; was one of the
Tetyana Sydorenko
visual images with verbal information. Vocabulary learning from written text (Al-Seghayer, 2001; Chun
& Plass, 1996a, 1996b; Plass et al., 1998, 2003) and aural passages (Jones & Plass, 2002) can be
enhanced if new words are annotated with both verbal input and images rather than when they are
annotated with only one of these stimuli. However, there are conflicting results on the effect of still
pictures versus dynamic images. Al-Seghayer found that multimedia annotations consisting of video and
text led to better vocabulary learning than annotations that combined pictures and text. However, the
reverse was found by Chun and Plass (1996a). A possible reason for the different findings is the
characteristics of the images, for example, their concreteness or familiarity. In studies with video input,
gestures and facial expressions have been found to aid listening comprehension in the L2 (Hernandez,
2005; Sueyoshi & Hardison, 2005). However, Baltova (1994) argues that authentic videos help with
global comprehension of information due to visual images, but they do not increase understanding of the
language per se. To help language learners with comprehension of the language, videos are often
augmented with on-screen text.
On-screen text can appear in various forms: subtitles (L1 text, L2 sound), reversed subtitles (L1 sound, L2
text), or captions (sound is in the same language as the text). Concerning comprehension, it is yet to be
resolved which of these on-screen text presentations is most beneficial (Baltova, 1999; Lambert, Boehler,
& Sidoti, 1981; Markham & Peter, 2003; Markham, Peter, & McCarthy, 2001). Danan (2004) suggests
that subtitles should be used with very difficult material; otherwise, the use of captions in the L2 is
advised. In vocabulary learning, L2 captions and reversed subtitles result in similar gains, and they are
better than subtitles for recall (Baltova, 1999; Danan, 1992) and recognition (Lambert et al., 1981). This
study focuses on captions in the L2 because captions appear to have an advantage over subtitles for
vocabulary acquisition and they provide more exposure to the L2 than reversed subtitles.
Vocabulary Acquisition from Captioned Videos
While captions facilitate listening comprehension in a foreign language (Baltova, 1999; Garza, 1991;
Guillory, 1998; Markham, 1993, 2001), their effect on vocabulary learning is not as transparent. Several
studies investigated the influence of captions on learning vocabulary as assessed by written tests (Baltova,
1999; Danan, 1992; Neuman & Koskinen, 1992). At least one other study assessed the learning via aural
tests (Markham, 1999). These researchers' tests, following Nation (2001), can be classified as recognition
of form, recall of form (c-cloze, fill-in-the-blank, free recall), and recall of meaning (L2 to L1 translation).
Table 1 summarizes these four studies.
Table 1. Summary of Research on Vocabulary Acquisition from Captioned Videos

Test         Modality  Study                     Specific Task
Recognition  Written   Neuman & Koskinen, 1992
Recognition  Aural     Markham, 1999
Recall       Written   Baltova, 1999             c-cloze
Recall       Written   Danan, 1992               fill-in-the-blank, translation
Recall       Written   Neuman & Koskinen, 1992   free recall
Recall       Aural     No studies

Note. The entries of the Participants column are missing.
The studies cited above indicate that the presentation of video, audio, and captions (VAC) leads to better
performance on both written and aural tests than the presentation of video and audio (VA), yet there are at
least two gaps in the literature which make it difficult to generalize the findings. First, the studies have
been conducted with learners of different ages and at different levels of proficiency. It is difficult to
ascertain from the descriptions of the participants in the studies exactly how they would compare in terms
of proficiency. Roughly, the participants in the studies can be categorized as beginning, intermediate, and
advanced. Following this classification, it appears that aural vocabulary acquisition of beginning and
intermediate learners has not been studied. Additionally, previous research has investigated the
recognition, but not recall, of aural vocabulary. Form recognition is different from form recall or
translation because less cognitive processing is required for form recognition (Nation, 2001). To
determine which forms of vocabulary (written, aural, or both) are being learned from the VAC input,
learner performance on recognition and recall vocabulary tests in written and aural modalities should be
compared. Although acquisition of both written and aural forms of vocabulary is important in language
learning, commonly used tests of vocabulary are based on orthography, not phonology (Milton &
Hopkins, 2006). While it is generally assumed that learners can transfer their orthographic word
knowledge to phonological word knowledge, this might not be the case. Previous research suggests that
linguistic performance is best in the same modality in which new information was learned. This applies at
least to semantic categorization in L1 (Dodd, Oerlemans, & Robinson, 1988) and vocabulary recognition
in L1 (Nelson, Balass, & Perfetti, 2005) and L2 (Bird & Williams, 2002). Following this research, it is
expected that after receiving aural input, learners will perform better on aural than on written vocabulary
tests, and vice versa. However, it is necessary to investigate the outcome when both types of input are
presented. Will the input in both aural and written modes be learned, and if so, at what ratio?
What Modalities do Learners Attend to?
It is important to investigate what learners pay attention to for both practical and theoretical reasons.
While existing research indicates that the presentation of a video with audio and captions is superior at
least for written vocabulary learning, the concern of many teachers is that learners might not attend to
audio when they also have captions, which would hinder their listening skills development (Borras &
Lafayette, 1994). Garza (1991) argues that captions help develop listening skills, but he did not
investigate this claim empirically. Markham (1999) found that on an aural recognition test, performance
of advanced learners was better when they watched captioned rather than non-captioned videos. He
concluded that participants were attending to audio; however, it is not clear whether the same would be
true for lower-level learners.
The hypothesized reason why learners might not be paying attention to audio when watching captioned
video is that they have to divide their attention among three types of stimuli: visual images, text, and
audio. Because their attentional capacity is limited, learners have to use attention selectively (Robinson,
2003; Wickens, 2007). Otherwise, learners' attempts to pay attention to all three modalities may result in
cognitive overload. Cognitive overload occurs even when tasks are performed in the native language and
is attributed to the limits of working memory (Baddeley, 1986, 1992; Chandler & Sweller, 1991; Miller,
1956; Sweller, 1999). The redundancy principle of the cognitive load theory (Sweller, 2005) is relevant
for the current study. This principle assumes that redundant material slows down information processing
and learning. Mayer (1997, 2001) applied cognitive load theory to the generative theory of multimedia
learning. In one study, Mayer, Heiser, and Lonn (2001) found that native speakers of English who saw an
animation and listened to a concurrent corresponding narration in their L1 were able to retain more
information from the narration than those who also received captions as a third modality. The researchers
concluded that captions are distracting when audio is also present because they carry the same
information, which follows the redundancy principle. Thus, according to the cognitive load theory,
non-captioned videos will be easier to process than captioned videos.
However, it appears that the prediction above is not borne out for second language learners. As the studies
on captioned videos indicate, three modes of presentation (i.e., video, audio, and captions) are more
beneficial for listening comprehension and vocabulary learning than two modes (i.e., video and audio).
The question is, how do language learners as opposed to native speakers attend to the three types of
input? Do they continuously attend to audio, but switch between images and captions? Or could it be that
language learners at times are not paying a lot of attention to visual images and instead focus on captions
and audio, especially when the images do not carry useful information? It is also possible that language
learners do not attend to audio as much as they do to captions or that they are not able to process audio as
well as captions. For example, Lambert et al. (1981) found that on a combined written and aural test, L2
learners recognized more words learned from reading than those learned through listening. This might
depend on the nature of instruction or input learners receive. If learners regularly receive more reading
than listening practice, they might process captions better than audio. On the other hand, captions may be
less beneficial for the learners whose L1 has a writing system that is very different from that of the L2. In
the studies reviewed above, writing systems in the L1 and L2 are similar. In the current study with
learners of Russian, the L1 and L2 orthographies differ, but learners of Russian at a college level master
the Cyrillic alphabet within two weeks. Therefore, the difference in orthography is not a factor in this
study.
To answer the question of whether learners attend to audio, one must compare the performance of the
VAC (captioned video with audio) and the VC (captioned video without audio) groups. If both groups
perform similarly post treatment, it would perhaps indicate that audio is not a necessary component
for acquisition. However, if the VAC group outperforms the VC group, attention to audio could be
deemed beneficial to the learning process. If the VAC group underperforms the VC group, it might
mean that attention was too divided for the VAC group to succeed.
Only two studies have investigated what modalities L2 learners pay attention to when presented with
three types of stimuli simultaneously: audio, video, and captions. In a study by Vanderplank (1988),
European and Arabic high-intermediate to advanced college students learning English watched captioned
British TV programs over the course of nine weeks. The European students were from France, Germany,
Austria, Denmark, Italy, and Spain. Two learners reported that they tried but could not pay much attention
to audio because the captions were present. Many students reported that initially they were distracted by
the presence of captions. However, over the course of the study, European students found captions useful
and not distracting, while the Arabic students mentioned that the captions changed too fast. Vanderplank
suggested that European students are used to captions and can utilize them better. However, the difference
between the L1 and L2 scripts might be the underlying factor (Winke, Gass, & Sydorenko, 2008).
Another study was conducted by Taylor (2005) with beginning college learners of Spanish. Captions were
found to be distracting for many learners with little exposure to Spanish, while more experienced learners
could utilize all three types of stimuli when processing videos. The majority of all students (26 of 35)
reported that they attended to the audio, suggesting that availability of captions does not make all students
ignore the audio completely. Both Vanderplank and Taylor concluded that over time learners can develop
strategies for processing input in the three modalities. While these studies provide useful information on
learners' processing of captioned video, more studies are needed with a variety of learners and viewing
conditions.
Research Questions
The major gaps in research on captioned video concern (a) how captions affect learners' acquisition of
vocabulary, especially the difference between written and aural word forms, (b) how exactly learners
are able to acquire vocabulary from videos, and (c) what input modalities learners pay attention to when
watching videos. To fill these gaps, the present study investigates the following four research questions.
1. Does the modality of input have a differential effect on the learning of written and aural forms of
vocabulary?
2. Is the overall learning of words (i.e., combined written and aural vocabulary) affected by different
input modalities?
3. What input modalities do the learners attend to when watching videos?
4. What strategies do learners use to acquire new vocabulary from videos?
The hypotheses are as follows.
1a. The VC group will score higher on written than on aural tests, and the VA group will score higher
on aural than on written tests because studies suggest that performance is best in the modality in
which new information was learned.
1b. The VAC group will score higher on written than on aural tests because Lambert et al. (1981)
found that learners remembered more words from written than from aural input.
2a. The VAC group will perform better than the VA group based on the previous research on
captioned video.
2b. The VC group will outperform the VA group since written input appears to have an advantage
over aural input (Lambert et al., 1981).
2c. The VAC group will outperform the VC group since there is some indication from previous
research that learners pay attention to audio.
3a. Based on the existing studies, all learners will pay attention to captions.
3b. Learners will also pay attention to visuals because they have been shown to increase listening
comprehension.
3c. Learners will pay less attention to audio than to captions because written input seems to be more
beneficial than aural.
4. Learners will associate visual images with written and/or aural verbal information to learn new
vocabulary, as has been shown to occur in studies on multimedia annotations.
METHODOLOGY
Participants
The participants were 26 non-heritage learners from two sections of second-semester (beginning) Russian
at a large Midwestern university. The participants' ages ranged from 18 to 26, with a mean of 20. The L1
was English for 25 participants and Cantonese for one. Of the native English speakers, one participant
also considered French as an L1 and another considered Italian and Spanish as L1s in addition to English.
There were 14 females and 12 males. The participants were compensated for their time with a gift
certificate for lunch (valued at 5 dollars). They also received extra credit in their Russian course for
participating.
The participants were divided randomly into three stimulus conditions: video with audio and captions
(VAC), video with audio (VA), and video with captions (VC). There were eight participants in the VAC
group, and nine participants in each of the other two groups. To ensure that the VAC, VA, and VC groups
did not differ in terms of their abilities to learn written and aural word forms, these abilities were tested
before the study.
Group Equality Test
All participants read one text and listened to another text two times, with the order and modality of texts
counterbalanced (see Appendix A for one of the texts). The topics of the texts, professions and buying,
had not been studied by the participants prior to the study. Almost every sentence in each text contained
one new word accompanied by a visual image. None of the new words were cognates. Every sentence of
the text, together with an accompanying picture when applicable, was presented on a PowerPoint slide for
five seconds. The participants then took written and aural recognition and translation tests. On the
recognition test, the participants had to mark the words that they thought appeared in the texts; these
words were mixed with non-words. On the written recognition test, the new words from prior reading
were presented on paper, and on the aural recognition test, the new words from prior listening were
presented aurally. For the written and aural translation tests, the task was to translate from Russian into
English the same target words that were on recognition tests. After that, the participants were asked to
indicate for each target word whether they had (1) never encountered it before, (2) encountered it before,
(3) knew its meaning, or (4) used it. Only those target words that were new for the participants were
included in the analysis. The results of a mixed-design ANOVA [Test (written recognition, aural
recognition, written translation, aural translation) x Group (VAC, VA, VC)] showed a non-significant
main effect of group, F(2, 17) = .44, p = .65, which means that the three treatment groups were
comparable on learning new vocabulary. The results also revealed a significant main effect of test, F(3,
51) = 18.42, p < .001, r = .51 and a non-significant Test x Group interaction, F(6, 51) = .96, p = .461. As
shown in Table 2 and Figure 1, the participants scored higher on written than on aural tests, suggesting
that their reading abilities were better than their listening abilities.
Table 2. Descriptive Statistics on the Group Equality Test
                      VAC (N = 8)    VA (N = 9)     VC (N = 9)     All Groups
                      M      SD      M      SD      M      SD      M      SD
Written recognition   .84    .10     .83    .08     .87    .12     .85    .10
Aural recognition     .68    .11     .74    .11     .69    .19     .71    .14
Written translation   .62    .22     .69    .27     .57    .34     .62    .28
Aural translation     .27    .18     .27    .24     .48    .19     .35    .22
Note. The scores represent percentage of words learned from the new target words.
(2001), Bird and Williams (2002), and Danan (1992), the word knowledge test was not given before the
study in order to avoid prompting the participants to pay special attention to new words. At the end of the
study, the participants completed a final questionnaire. The whole procedure took between 50 and 60
minutes.
Materials
The materials consisted of three video clips, each 2 to 3 minutes long, from a popular Russian comedy
series for native Russian speakers. The clips contained target words that were highly unlikely to be known
since they did not appear in the participants' textbook. In addition, the participants' instructor felt that
these words were highly unlikely to be heard in class. However, the participants could learn the target
words from video context, mainly due to a high correlation between visual images and dialogs. For
example, in one of the video clips a boy says "A u menya brat boksyor" ("My brother is a boxer"), after
which a male boxer appears. That is, the criteria for target word selection were that words or phrases must
be well-supported visually and most likely unknown to the participants.1
The topics of the video clips included professions, eating, and compliments. The participants knew only a
few words on the topics of food and professions, and they had not studied the compliments topic prior to
the study. There were 6 target words in Clip 1, 12 target words in Clip 2, and 10 target words in Clip 3
(see Appendix B). Of these 28 words, there were 13 nouns, 5 verbs, 5 adjectives, 1 adverb, 3 phrases
consisting of 2 words, and 1 single-word expression. About one fourth of the words were abstract, others
were concrete and well-supported visually. Of all target words, six were cognates, and two were false
cognates. Usually, learners can guess new words from context when at least 95% of the words in the text
are known (Nation, 2001). In the judgment of the participants' Russian instructor, the videos in
this study had a much lower percentage of known words.2
Instruments
Comprehension test
The comprehension test consisted of three very general true-false questions that did not require the
knowledge of the target words. Although a larger number of questions is desirable to test comprehension,
this was not possible because the videos were too short. The comprehension test was included to
encourage the participants to pay attention not only to the language used in the videos, but also to the
main ideas of the clips. Because comprehension of captioned videos has been widely studied, it was not
the focus of this study and thus was not one of the dependent variables.
Recognition test
Since recognition of new lexical forms is considered to be an initial step in vocabulary learning,
recognition of the target word forms was measured (see Appendix C). The recognition test involved
discriminating between words presented in the input (target items) and those that were not presented in
the input (non-target items) (Pulido, 2004). In this study, non-target items were non-words. Half of the
target words and half of the non-words were presented in a written form, the other half aurally. Following
Huibregtse, Admiraal, and Meara (2002), one point was awarded for a "yes" response to a target item,
and one point was awarded for a "no" response to a non-word. Responses to non-words had to be taken
into account to correct for guessing. Words that participants knew before the study were excluded from
the analysis. The scores from the three videos were combined and calculated as the percentage of
recognized new words from the total number of new words, following Smith (2004) (see Equation 1).
Non-words were also added to the equation.
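\[
\text{recognition score} = \frac{\text{recognized new words} + \text{unselected non-words}}{\text{total new words} + \text{total non-words}} \tag{1}
\]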
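As a minimal sketch of the scoring rule described for the recognition test, the function below awards one point for each "yes" to a new target word and one point for each non-word left unselected, then divides by the total number of items. The denominator (new words plus non-words) is an assumption inferred from the description, and the function name and the placeholder item labels are hypothetical rather than taken from the study materials.

```python
def recognition_score(yes_responses: set, new_targets: set, nonwords: set) -> float:
    """Proportion of correct decisions on a yes/no recognition test.

    yes_responses: items the participant marked as having appeared in the input.
    new_targets:   target words that were new for this participant.
    nonwords:      distractor items that never appeared in the input.
    """
    total_items = len(new_targets) + len(nonwords)
    if total_items == 0:
        raise ValueError("at least one target word or non-word is required")
    recognized = len(yes_responses & new_targets)   # "yes" to a target item
    unselected = len(nonwords - yes_responses)      # "no" to a non-word
    return (recognized + unselected) / total_items


# Hypothetical response sheet: two of four targets recognized, one false alarm.
targets = {"boksyor", "lyotchik", "target_c", "target_d"}
distractors = {"nonword_a", "nonword_b", "nonword_c", "nonword_d"}
marked_yes = {"boksyor", "lyotchik", "nonword_a"}
print(recognition_score(marked_yes, targets, distractors))
```

Counting unselected non-words means that a participant who marks every item "yes" does not receive full credit, which is the guessing correction the text refers to.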
Translation Test
This test consisted of the same target words as the recognition test, but non-words were not included
(see Appendix C). Half of the words were presented in a written form, the other half aurally. One point
was given for each translation of the new word that was possible in the context of the video. For example,
in clip 1 the word lyotchik means "pilot," but the translation "airplane" would have been accepted because
in the clip the word lyotchik appeared when the airplanes were flying. Although this measure does not
reflect the number of words the participants learned correctly, it shows some level of learning and the
ability to remember new form-meaning associations. The scores from the three videos were combined and
calculated as the percentage of new words translated from the total number of new words.
Word Knowledge Test
The purpose of the word knowledge test was to check which target words were new for each participant.
The participants rated their knowledge of the words prior to the study as 1 = never encountered before, 2
= encountered before, 3 = know the meaning, 4 = use it (see Appendix C). Some non-words from the
recognition tests were used to adjust for guessing. If a participant knew the meaning of the word prior to
the study or used it, such a word was not considered new and was excluded from the analysis.
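The exclusion rule above can be sketched as a small filter applied before scoring. In this illustration, words rated 3 or 4 on the word knowledge test are dropped, and the translation score is the share of the remaining new words that were translated acceptably. The function names, the constant, and the ratings shown are hypothetical and serve only to illustrate the rule.

```python
NEW_WORD_MAX_RATING = 2  # 1 = never encountered, 2 = encountered; 3 and 4 count as known


def new_words(ratings: dict) -> set:
    """Return the target words a participant did not know before the study."""
    return {word for word, rating in ratings.items() if rating <= NEW_WORD_MAX_RATING}


def translation_score(accepted: set, ratings: dict) -> float:
    """Proportion of new words whose translation was accepted."""
    new = new_words(ratings)
    if not new:
        raise ValueError("participant has no new words to score")
    return len(accepted & new) / len(new)


# Hypothetical ratings for one participant; "sadites" was already known (rating 3).
ratings = {"lyotchik": 1, "boksyor": 2, "sadites": 3, "proshu vas": 1}
accepted_translations = {"lyotchik", "sadites"}
print(sorted(new_words(ratings)))
print(translation_score(accepted_translations, ratings))
```

Because the known word is removed from both the numerator and the denominator, a correct translation of an already known word does not raise the score.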
Final Questionnaire
In the final questionnaire, the participants were asked open-ended questions on whether they liked the
videos and why, what strategies they used to learn the new words, what difficulties they had, and what
could have helped them to understand the videos better. Two Likert-scale questions asked the participants
to rate how much attention they paid to visual images and how helpful they were (following the design of
Sueyoshi & Hardison, 2005). The participants were asked the same questions about audio and captions if
they had them (see Appendix D).
Analysis
The between-participant independent variable was input modality (VAC, VA, and VC), and the
within-participant independent variable was test modality (written and aural). There were two dependent
variables: recognition and translation test scores. A mixed-design ANOVA3 was conducted for each of
the two dependent variables.
The data from open-ended questions on the final questionnaire were analyzed qualitatively, using a
content analysis approach. In a content analysis approach, themes emerging from the data are grouped
into categories, which should also emerge from the data (Berg, 2001). The data can then be subjected to
frequency counts and descriptive statistics. The data from the two Likert-scale questions on the final
questionnaire were analyzed using descriptive statistics. These data were not subjected to statistical tests
due to the small number of tokens.
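The frequency-count step of the content analysis can be illustrated with a short sketch. It assumes that each open-ended response has already been coded by hand into a category and stored as a (participant, group, category) record; this data layout, the function name, and the example records are assumptions made for illustration, not a description of the procedure actually used.

```python
from collections import Counter


def count_by_group(coded_responses):
    """Count how many participants in each group mentioned each category.

    coded_responses: iterable of (participant_id, group, category) tuples.
    A participant who mentions the same category twice is counted once.
    """
    unique_records = set(coded_responses)
    return Counter((group, category) for _, group, category in unique_records)


# Hypothetical coded data.
coded = [
    ("p01", "VAC", "fast dialog"),
    ("p01", "VAC", "fast dialog"),  # repeated mention by the same participant
    ("p02", "VAC", "fast dialog"),
    ("p10", "VA", "too much new vocabulary"),
]
counts = count_by_group(coded)
print(counts[("VAC", "fast dialog")])
print(counts[("VA", "too much new vocabulary")])
```

Counting participants rather than mentions corresponds to tables whose cells report how many learners within a group gave a particular response.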
RESULTS
Does the modality of input have a differential effect on the learning of written and aural forms of
vocabulary?
The results of a mixed-design ANOVA on the vocabulary recognition test [Test (written, aural) x Input
Modality (VAC, VA, VC)] showed a non-significant main effect of test, F(1, 23) = 1.32, p = .263. The
results also revealed a significant Test x Input Modality interaction, F(2, 23) = 4.06, p = .031, r = .39.
Groups with captions scored higher on written than on aural recognition, while the VA group scored
higher on aural than on written recognition (see Table 3 and Figure 2). The results of a mixed-design
ANOVA on the vocabulary translation test [Test (written, aural) x Input Modality (VAC, VA, VC)]
showed a non-significant main effect of test, F(1, 23) = 2.43, p = .133, and a non-significant Test x Input
Modality interaction, F(2, 23) = .53, p = .599. This indicates that there were no significant differences
between written and aural translation for each group.
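The text does not state how the effect size r was obtained. One common conversion, assumed here rather than confirmed by the article, is r = sqrt(F / (F + df_error)). The sketch below applies it to the reported Test x Input Modality interaction on the recognition test; the function name is illustrative.

```python
import math


def effect_size_r(f_value: float, df_error: int) -> float:
    """Convert an F statistic to r, assuming r = sqrt(F / (F + df_error))."""
    if f_value < 0 or df_error <= 0:
        raise ValueError("F must be non-negative and df_error must be positive")
    return math.sqrt(f_value / (f_value + df_error))


# Test x Input Modality interaction on recognition: F(2, 23) = 4.06
print(round(effect_size_r(4.06, 23), 2))
```

Under this assumption the interaction yields approximately .39, in line with the reported value. The same conversion gives roughly .515 for F(3, 51) = 18.42 on the group equality test, where .51 is reported, so the original computation may have differed slightly in formula or rounding.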
Is the overall learning of words (i.e., combined written and aural vocabulary) affected by different
input modalities?
The results of the mixed-design ANOVA on the vocabulary recognition test (mentioned above) showed a
non-significant main effect of input modality, F(2, 23) = 1.10, p = .351. This means that combined written
and aural recognition did not differ significantly among the groups. The results of the mixed-design ANOVA on the
vocabulary translation test (described above) showed a significant main effect of input modality, F(2, 23)
= 3.75, p = .039, r = .37. A post-hoc Tukey's HSD test revealed a significant difference between the VAC
and the VA groups. The VAC group scored significantly higher than the VA group and non-significantly
higher than the VC group on overall translation. This suggests that the VAC combination is more
favorable than the VA combination for learning the meanings of new words.
Table 3. Descriptive Statistics on Vocabulary Tests
                      VAC (N = 8)    VA (N = 9)     VC (N = 9)     All Groups
                      M      SD      M      SD      M      SD      M      SD
Written recognition   .73    .10     .63    .10     .76    .12     .71    .12
Aural recognition     .67    .15     .69    .08     .68    .06     .68    .10
Written translation   .36    .11     .25    .13     .28    .10     .30    .12
Aural translation     .35    .18     .18    .13     .24    .12     .25    .15
Note. The scores represent percentage of words learned from the new target words.
VA group, the participants said they paid the same amount of attention to audio and video. In the VC
group, the participants reported paying more attention to captions than to video. That is, both the VAC
and the VC groups seemed to pay most attention to captions.
Table 4. Participants' Perception of the Amount of Attention Paid to Audio, Captions, and Video
              Audio          Captions       Video
              M      SD      M      SD      M      SD
VAC (N = 8)   3.75   1.28    4.5    1.07    4.25   0.71
VA (N = 9)    4.33   0.71    -      -       4.33   0.71
VC (N = 9)    -      -       4.11   0.78    3.56   0.88
The participants were also asked to rate the utility of video, as well as captions and audio when available,
for their understanding of the clips. The descriptive statistics from these data are reported in Table 5. In
all three groups, the participants found video to be most helpful, although, as mentioned above, none of
the three groups reported paying most attention to video. The VAC group found audio to be the least
helpful.
Table 5. Participants' Perception of the Amount of Help Obtained from Audio, Captions, and Video
              Audio          Captions       Video
              M      SD      M      SD      M      SD
VAC (N = 8)   3.25   1.39    4.13   0.99    4.75   0.46
VA (N = 9)    3.11   0.93    -      -       4.78   0.44
VC (N = 9)    -      -       4.00   1.00    4.67   0.71
Reported difficulties with watching the videos and completing vocabulary tests and their solutions also
reveal to what modalities the learners were attending (see Table 6). Participants in all groups had some
difficulty with audio or captions. Such difficulties were attributed to the speed of the dialogs (they were
too fast), lack of time to read all captions, and the heavy burden of reading captions while watching the
videos at the same time. One learner also mentioned focusing only on known words due to captions.
Another learner reported reading captions and scanning the images, but having no time to listen to audio.
One more learner tried to sound out captions because there was no audio. Because learners in the VAC
group did not specify whether "a fast dialog" and "very fluent Russian" referred to audio, captions, or
both, it is not clear which kind of stimulus as a whole was more difficult to process, captions or audio.
Participants in all groups reported that they had difficulty with unknown vocabulary and vocabulary tests.
Many participants mentioned that they could not figure out the meanings of new words or that there was
too much new vocabulary. One participant pointed out that it was difficult to learn words not supported
by images. Learners also mentioned not remembering which specific words were used in the videos,
especially their aural forms. Some participants reported that they could guess the meanings of the words
while watching the videos, but forgot the actual words by the time they had to take a vocabulary test.
Table 6. Difficulties with Watching Videos and Completing Vocabulary Tests and their Mitigation
VAC (N = 8)
VA (N = 9)
VC (N = 9)
Difficulties
With audio or captions
fast dialog
fast flow of captions
very fluent Russian
trying to read captions and watch videos at once
captions make one focus on known words
only read captions and scanned visual images
trying to sound out captions due to lack of audio
With vocabulary in the videos and tests
too much new vocabulary
figuring out meanings of new words
learning words not supported by images
were in the video
remembering aural word forms from the video
translating
remembering word-meaning associations
5
1
1
1
1
1
1
3
2
1
1
5
4
3
1
3
3
3
4
3
2
6
1
3
2
7
1
3
5
1
1
Note. The numbers indicate how many within the group provided the given response.
The learners were also asked what could have helped them understand the videos better. The participants
mentioned various types of scaffolding (depending on the group they were in), such as shorter sentences,
captions, sound, more time to read captions, or slower dialog. It appears that most learners wanted to have
access to all modalities (video, audio, and captions) and have more time to process the information in
these modalities. However, while only two learners in the VC group did not state that they would like to
have audio, six learners in the VA group did not say anything about captions. That is, it is either more
important or more natural for the learners to watch videos with audio than with captions, as one would
expect. Several learners provided another solution to the problem of not understanding the videos: a better
knowledge of Russian in general or vocabulary in particular.
What strategies do learners use to acquire new vocabulary from videos?
On the final questionnaire, the participants were asked whether they learned new words from the videos
and how they were able to do that. The strategies learners reported and their frequencies are provided in
Table 7 and can be divided into two categories: modality-specific strategies and common vocabulary
guessing strategies.
Regarding modality-specific strategies, six learners from each group reported using visual images to help
them figure out the meanings of new words. One participant wrote, "Most words I learned were
accompanied by actions on screen, such as sadites [sit down], proshu vas [after you], and boksyor
[boxer]." The participants in the VAC group did not say whether they matched visual images to
captions, audio, or both.
Concerning common vocabulary guessing strategies, only one participant in the VAC group reported
using them; specifically, this learner relied on familiar roots. More participants in the other two groups
mentioned that they used guessing strategies: five participants in the VA group and four in the VC group.
The participants thought they had understood new words which were similar to their L1s (English,
Spanish, or Italian), although they did not realize that some of these words were false cognates. They also
reported using roots of familiar words, relying on the verbal context, and employing their knowledge of
grammar to understand new words.
Table 7. Learners' Strategies and their Frequencies for Learning New Words
Strategies
Modality-specific strategies
matching visual images with words
reading captions
Common vocabulary guessing strategies
recognizing words that are similar to L1
using the roots of known words
paying attention to verbal context
paying attention to grammar
VAC (N = 8)
VA (N = 9)
VC (N = 9)
6
1
1
1
1
1
1
1
1
Note. The numbers indicate how many within the group provided the given response.
DISCUSSION
The intent of this study was to investigate vocabulary acquisition from different types of video input
when the goal is to both understand the content of the videos and to learn new words from them. The
results suggest that for beginning learners with better reading than listening skills: (a) captions facilitate
recognition of written word forms, while audio facilitates recognition of aural word forms; (b) more word
meanings are learned when videos are shown with both audio and captions than with either audio or
captions; (c) participants think they pay most attention to captions, then to video, then to audio, but they
consider video to be the most helpful; some participants have difficulty attending to all three modalities;
and (d) the meanings of some new words can be learned from very difficult authentic videos when the
language is well-supported by visual images. These findings are discussed below in detail.
Does the modality of input have a differential effect on the learning of written and aural forms of
vocabulary?
Since the VA group performed better on the aural than on the written recognition test, and the performance of the
VC group resulted in the reverse pattern, the results support the hypothesis that recognition of form is best
when modality of input and test modality are the same, as was found by Bird and Williams (2002). Jones
(2004), on the other hand, found that vocabulary recognition was not affected by the modality of the test.
However, in her study text was in L1, while in this study captions and vocabulary tests were in L2.
Additionally, in Jones's study test modalities were written and pictorial, while in this study they were
written and aural.
Contrary to the hypothesis, it was found that for recall of meaning test modality does not interact with
input modality. As mentioned earlier, recognition and translation of vocabulary are different skills
(Nation, 2001). The recognition test indicates whether learners have noticed the forms in the input, that is,
their episodic memory of the forms, while the translation test indicates whether learners have understood
the meaning of the forms (Pulido, 2004). Compared to form recognition, production of meaning requires
deeper processing because learners need to deduce the meaning of the form while they watch the videos,
and then recall the meaning of the form when they take the test. It is possible that if learners understood
and remembered the meaning of the word, they have built the connections between the meaning and both
of its forms (written and aural). Thus it does not matter whether they have to produce the meaning of the
written or of the aural form of the same word, or at least the differences are not substantial. On the other
hand, if learners have only noticed the form either through reading or listening, but did not understand its
meaning, one can suppose that they have not built connections between the written and aural forms.
For form recognition, the results also support the hypothesis that given both written and aural input,
learners presented with video, audio, and captions would perform better on written than on aural tests.
This could be due to instruction because the participants in this study appeared to have better reading than
listening skills, or due to the previous finding that people in general process written input better than aural
(Nelson et al., 2005) at least when non-logographic script is used.
Is the overall learning of words (i.e., combined written and aural vocabulary) affected by different
input modalities?
The hypothesis was that learners would be able to recognize and translate more vocabulary in the VAC
group than in the VA group. That is, captions were predicted to increase the acquisition of vocabulary.
This was only partially confirmed: while the VAC group scored significantly higher than the VA group
on overall translation, there were no differences between the groups on overall recognition. It was
mentioned earlier that recognition of written and aural forms depends on modality of input, so there is an
interaction effect. It appears that for acquiring word meaning, the most beneficial condition is video
combined with both written and aural verbal input, but for form recognition, there is no effect from
combined written and aural verbal input.
The fact that learners in the VAC group were able to acquire more word meanings than those in the VA
group is in line with findings in previous research on captions (Baltova, 1999; Danan, 1992; Neuman &
Koskinen, 1992). However, this runs counter to Robinson's (2003) hypothesis that attentional division
among many stimuli may be negatively taxing for L2 learners. Even though the learners in the VAC
group had to divide their attention among more stimuli than those in the VC and the VA groups, and even
reported that following audio or captions was cognitively taxing, the process of doing so did not hinder
their language learning. In fact, in this study it was found that performing three tasks was better than
performing two. This finding also opposes the predictions of cognitive load theory: according to the
redundancy principle, the presentation of the same information simultaneously through two modalities
(text and audio) negatively affects information processing at least in L1 (Sweller, 2005). One explanation
why native speakers and L2 learners may process information in multiple modalities differently is the
idea that certain kinds of redundancy, such as repetitions, topic-fronting, or paraphrase, can be beneficial
for learners' comprehension of input (Larsen-Freeman & Long, 1991). In multimedia environments,
redundancy stemming from different modalities might also be beneficial. Jones (2003) and Grgurović and
Language Learning & Technology
62
Tetyana Sydorenko
Hegelheimer (2007) attribute the advantage of redundancy through multiple modalities to learners'
individual preferences: learners choose what they need to focus on (video, audio, or captions).
Additionally, while the processing of speech is natural for native speakers, learners, especially at the
beginning level, might not even know where each word begins and ends. As Ellis (2003, p. 77) put it,
"learning to understand a language involves parsing the speech stream into chunks which reliably mark
meaning." Captions could be more useful than distracting because they help learners parse the stream of
speech, as was suggested by Vanderplank (1993) and reflected in the learners' comments in Winke et al.
(2008).
It is also possible that captions help beginners more than audio in learning new vocabulary from videos
because the VC group scored higher than the VA group. However, because this difference was not
significant, further research is needed. On the other hand, even if learners can process captions better than
audio, there is evidence in this dataset that they still attend to audio because the VAC group was able to
recall more word meanings than the VC group, although this difference was not significant. In other
words, attention (at some level) was paid to audio and can be considered as the factor that increased word
recall. This supports Markham's (1999) finding that learners attend to audio when they also have captions.
Although the VC group slightly outperformed the VA group, it is not suggested that learners should be
exposed to the VC rather than the VA input. The VC group was used only for research purposes and is
generally not suitable for instruction because, as evident from the participants' comments, this condition
is unnatural. One learner tried to sound out captions because there was no audio, and most learners in the
VC group said they wanted audio.
What do the learners attend to when watching videos?
The participants reported paying most attention to captions, then video, then audio. Thus, as predicted,
learners paid attention to all three modalities, and they paid more attention to captions than to audio. This
suggests that captions might have an advantage over audio, which goes along with the results of the
vocabulary tests: in the VAC group written recognition was higher than aural, and the VC group
outperformed the VA group on the translation test. The participants also reported paying some attention to
audio rather than completely ignoring it, although one participant in the VAC group did mention that
he/she ignored the audio while trying to process video and captions. It is not surprising that participants
paid attention to video because studies on listening comprehension (Hernandez, 2005; Sueyoshi &
Hardison, 2005) and multimedia annotations (Al-Seghayer, 2001; Chun & Plass, 1996a) found that video
enhances comprehension and vocabulary learning as compared to verbal information. Wagner (2007)
found that when listening to a lecture or a dialog, all learners paid attention to video, although to varying
degrees.
While learners seem to attend more to captions than to video, the participants reported that video is more
helpful than captions. A plausible explanation is that language learners can only understand a portion of
captions in L2, but they have no difficulty processing visual images and thus find them more helpful. This
goes along with the fact that learners in all groups were reporting difficulties following audio or captions,
but not video, which supports previous findings (Taylor, 2005; Vanderplank, 1988). In Jones (2003),
some learners reported that although annotations in the form of L1 translations were helpful, they did not
encourage deep thinking. On the other hand, pictures did lead to deep processing of aural input, especially
when combined with text. If learners find L1 translations less helpful than images, they will surely find
captions in L2 less helpful than video. Studies reviewed in Paivio (2007) support learners' reports that
imagery has an advantage over verbal input for information processing.
An interesting finding was that many learners preferred to have access to all modalities (video, audio, and
captions), even though they also reported difficulties processing all three modalities. While this seems to
be a contradiction, it could be that some cognitively taxing tasks are beneficial for language learners,
especially when multimedia is involved. In Jones (2003), learners who did not have access to annotations
during a listening task were frustrated because they could not comprehend the input, while learners who
had annotations in various modalities did not express such concerns. Thus, for language learners a
combination of input in multiple modalities may be more of an advantage than a distraction.
they paid attention to provide directions for further research, these results should be interpreted with
caution because no statistical analyses were performed, and the reliability of self-reported data is not clear.
Additionally, on the final questionnaire the participants reported some of the strategies they employed for
learning the new words from videos, but their explanations were not very detailed, possibly because
learners themselves are not aware of this process. Verbal reports, such as stimulated recall or think-aloud
protocols, might be a better method for investigating the processing of video. Eye-tracking technology
could also be used to investigate how learners attend to video and captions, but not audio.
The generalizability of the findings is also restricted by the nature of the videos and participant pool. The
videos were of the comedy genre rather than informative, and they contained a large number of vivid
images, which was conducive to the learning of new vocabulary. If videos had a heavier information load,
for example, depicting historic events, the learners could have paid more attention to the propositional
content rather than new vocabulary. The participants in this study were beginners at a university foreign
language program in the US. They did not often watch videos with captions in Russian, at least as part of
their course, and thus might not have developed strategies for dealing with different types of input
simultaneously. These learners also appeared to have better reading than listening skills. Thus, the results
cannot be generalized to learners of different proficiency levels, in different contexts, such as the target
language environment, or with different types of instruction. Finally, script differences can play a role in
the way learners process captions (Winke et al., 2008). Native speakers of English learning Chinese or
Arabic could receive fewer benefits from captions in vocabulary learning than learners of Russian or
German, or their processing strategies might be different.
Future studies could take different directions to investigate vocabulary acquisition from captioned videos.
One factor not taken into account in this study is individual differences in modality preferences, which
could influence vocabulary acquisition from videos. Dörnyei (2005) suggested that some people are
visual and others are auditory learners. Spatial and verbal working memory might also contribute to
individual differences affecting cognitive processing of videos since this was a factor in studies on
multimedia annotations (Plass et al., 1998, 2003). The researchers could also look into other beneficial
ways of using captions in language instruction. For example, as one anonymous reviewer suggested,
captions in the form of dynamic glosses are now possible with recent technological developments and
should be investigated. Giving learners some control over their learning can increase the benefits of
instruction. In Jones (2003), participants liked the fact that they could choose from multiple modalities,
and in this study learners wished they had control over the number of times they could play the videos.
Finally, more research is needed to understand how language learners as opposed to native speakers learn
in multimedia environments as the results in this study suggest that the processes might be different.
NOTES
1. The initial criteria for target word selection were not strict, which made it possible to identify the
maximum number of possible target words. This was necessary in order to maximize the length of the
vocabulary tests, thus making them more reliable. For example, even though many participants might
have known the word holodnij (cold), it was nevertheless included in the vocabulary test. Ideally,
videos with a large number of unknown target words that were well-supported visually would be used,
but I was not able to find such videos. I was also not able to find more videos of this kind. The solution to
this limitation was the use of the word knowledge test to identify which words were new for each
participant. In addition, the target words were counterbalanced across written and aural vocabulary tests
in such a way that half of the words predicted to be possibly known were on the aural test, and the other
half on the written test. For the same reason, words that are easy to learn, such as cognates, were included.
However, they were evenly divided between written and aural vocabulary tests.
2. While a more principled approach of measuring the ratio of known/unknown words in the videos
would have been to give the participants a word knowledge test of all words in the videos, this was not
done because the participants were willing to volunteer for a limited amount of time. The use of the
instructor's judgments was considered acceptable because the participants were in a foreign language
environment, and only three participants had completed a study-abroad program (as they indicated in the
background questionnaire). That is, most students had limited exposure to vocabulary not used in the
classroom. Since the ratio of known/unknown words in the videos was used to describe the nature of the
videos rather than as a variable, a more stringent measure was not considered crucial for this study.
3. Although the sample size in each group was relatively small, the homogeneity of variances assumption
was met; thus, it was appropriate to use an ANOVA test.
ACKNOWLEDGMENTS
I would like to thank Dr. Paula Winke, Dr. Susan Gass, Dr. Shawn Loewen, Sara Hillman, and Maren
Schierloh for reviewing earlier drafts of this report. I am also thankful to Drs. Diana Pulido and Senta
Goertler for valuable suggestions. Finally, I would like to thank Dr. Dennie Hoopingarner and Michael
Kramizeh for technical support. All remaining errors are my own.
Chun, D. M., & Plass, J. L. (1996a). Effects of multimedia annotations on vocabulary acquisition. The
Modern Language Journal, 80, 183–197.
Chun, D. M., & Plass, J. L. (1996b). Facilitating reading comprehension with multimedia. System, 24,
503–519.
Danan, M. (1992). Reversed subtitling and dual coding theory: New directions for foreign language
instruction. Language Learning, 42(4), 497–527.
Danan, M. (2004). Captioning and subtitling: Undervalued language learning strategies. Meta, 49(1),
67–77.
Dodd, B., Oerlemans, M., & Robinson, R. (1988). Cross-modal effects in repetition priming: A
comparison of lipread, graphic and heard stimuli. Visible Language, 22, 58–77.
Dörnyei, Z. (2005). The psychology of the language learner: Individual differences in second language
acquisition. Mahwah, NJ: Erlbaum.
Ellis, N. C. (2003). Constructions, chunking, and connectionism: The emergence of second language
structure. In C. J. Doughty & M. H. Long (Eds.), The handbook of second language acquisition
(pp. 63–103). Malden, MA: Blackwell.
Garza, T. J. (1991). Evaluating the use of captioned video materials in advanced foreign language
learning. Foreign Language Annals, 24(3), 239–258.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Erlbaum.
Grgurović, M., & Hegelheimer, V. (2007). Help options and multimedia listening: Students' use of
subtitles and the transcript. Language Learning & Technology, 11(1), 45–66. Retrieved
from http://llt.msu.edu/vol11num1/pdf/grgurovic.pdf
Guillory, H. G. (1998). The effects of keyword captions to authentic French video on learner
comprehension. CALICO Journal, 15(1-3), 89–108.
Hernandez, S. S. (2005). The effects of video and captioned text and the influence of verbal and spatial
abilities on second language listening comprehension in a multimedia learning environment. Dissertation
Abstracts International, 65(8), 2958-A-2959-A. (UMI No. DA3142667)
Huibregtse, I., Admiraal, W., & Meara, P. (2002). Scores on a yes-no vocabulary test: Correction for
guessing and response style. Language Testing, 19, 227–245.
Jones, L. (2003). Supporting listening comprehension and vocabulary acquisition with multimedia
annotations: The students' voice. CALICO Journal, 21(1), 41–65.
Jones, L. (2004). Testing L2 vocabulary recognition and recall using pictorial and written test items.
Language Learning & Technology, 8(3), 122–143. Retrieved
from http://llt.msu.edu/vol8num3/jones/default.html
Jones, L., & Plass, J. L. (2002). Supporting listening comprehension and vocabulary acquisition with
multimedia annotations. The Modern Language Journal, 86, 546–561.
Lambert, W. E., Boehler, I., & Sidoti, N. (1981). Choosing the languages of subtitles and spoken
dialogues for media presentations: Implications for second language education. Applied Psycholinguistics,
2, 133–148.
Larsen-Freeman, D., & Long, M. (1991). An introduction to second language acquisition research.
London: Longman.
Markham, P. (1993). Captioned television videotapes: Effects of visual support on second language
comprehension. Journal of Educational Technology Systems, 21(3), 183–191.
Markham, P. (1999). Captioned videotapes and second-language listening word recognition. Foreign
Language Annals, 32(3), 321–328.
Markham, P. (2001). The influence of culture-specific background knowledge and captions on second
language comprehension. Journal of Educational Technology Systems, 29(4), 331–343.
Markham, P., & Peter, L. (2003). The influence of English language and Spanish language captions on
foreign language listening/reading comprehension. Journal of Educational Technology Systems, 31(3),
331–341.
Markham, P., Peter, L. A., & McCarthy, T. J. (2001). The effects of native language vs. target language
captions on foreign language students' DVD video comprehension. Foreign Language Annals, 34(5),
439–445.
Mayer, R. E. (1997). Multimedia learning: Are we asking the right questions? Educational Psychologist,
32, 1–19.
Mayer, R. E. (2001). Multimedia learning. New York: Cambridge University Press.
Mayer, R. E., Heiser, J., & Lonn, S. (2001). Cognitive constraints on multimedia learning: When
presenting more material results in less understanding. Journal of Educational Psychology, 93, 187–198.
Miller, G. (1956). The magical number seven, plus or minus two: Some limits on our capacity for
processing information. Psychological Review, 63, 81–97.
Milton, J., & Hopkins, N. (2006). Comparing phonological and orthographic vocabulary size: Do
vocabulary tests underestimate the knowledge of some learners? The Canadian Modern Language Review,
63(1), 127–147.
Nation, I. S. P. (2001). Learning vocabulary in another language. New York: Cambridge University
Press.
Nelson, J., Balass, M., & Perfetti, C. (2005). Differences between written and spoken input in learning new
words. Written Language and Literacy, 8(2), 101–120.
Neuman, S. B., & Koskinen, P. (1992). Captioned television as comprehensible input: Effects of
incidental word learning from context for language minority students. Reading Research Quarterly, 27,
94–106.
Paivio, A. (1986). Mental representations: A dual coding approach. Oxford: Oxford University Press.
Paivio, A. (1991). Dual coding theory: Retrospect and current status. Canadian Journal of Psychology, 45,
255–287.
Paivio, A. (2007). Mind and its evolution: A dual coding theoretical approach. Mahwah, NJ: Erlbaum.
Plass, J. L., Chun, D. M., Mayer, R. E., & Leutner, D. (1998). Supporting visual and verbal learning
preferences in a second language multimedia learning environment. Journal of Educational Psychology,
90, 25–36.
Plass, J. L., Chun, D. M., Mayer, R. E., & Leutner, D. (2003). Cognitive load in reading a foreign
language text with multimedia aids and the influence of verbal and spatial abilities. Computers in Human
Behavior, 19, 221–243.
Plass, J., & Jones, L. (2005). Multimedia learning in second language acquisition. In R. Mayer (Ed.), The
Cambridge handbook of multimedia learning (pp. 467–488). New York: Cambridge University Press.
Pulido, D. (2004). The relationship between text comprehension and second language incidental
vocabulary acquisition: A matter of topic familiarity? Language Learning, 54(3), 469–523.
Robinson, P. (2003). Attention and memory during SLA. In C. J. Doughty & M. H. Long (Eds.), The
handbook of second language acquisition (pp. 631–678). Malden, MA: Blackwell.
Smith, B. (2004). Computer-mediated negotiated interaction and lexical acquisition. Studies in Second
Language Acquisition, 26, 365–398.
Sueyoshi, A., & Hardison, D. (2005). The role of gestures and facial cues in second language listening
comprehension. Language Learning, 55(4), 661–699.
Sweller, J. (1999). Instructional design in technical areas. Camberwell, Australia: ACER Press.
Sweller, J. (2005). The redundancy principle in multimedia learning. In R. Mayer (Ed.), The Cambridge
handbook of multimedia learning (pp. 159–168). New York: Cambridge University Press.
Taylor, G. (2005). Perceived processing strategies of students watching captioned video. Foreign
Language Annals, 38(3), 422–427.
Vanderplank, R. (1988). The value of teletext sub-titles in language learning. English Language Teaching
Journal, 42(4), 272–281.
Vanderplank, R. (1993). A very verbal medium: Language learning through closed captions. TESOL
Journal, 3(1), 10–14.
Wagner, E. (2007). Are they watching? Test-taker viewing behavior during an L2 video listening
test. Language Learning & Technology, 11(1), 67–86. Retrieved
from http://llt.msu.edu/vol11num1/wagner/default.html
Wickens, C. D. (2007). Attention to the second language. IRAL, 45(3), 177–191.
Winke, P., Gass, S., & Sydorenko, T. (2008, August). The effects of captioning on video-based listening
activities in the second language classroom. Paper presented at the International Association of Applied
Linguistics conference, Essen, Germany.
Olya: For my mom, I bought panty hose.
Olya: For my aunt, I bought a head scarf.
Olya: For my dad, I bought a belt.
Olya: And what did you buy?
Lena: For my dad, I bought gloves.
Lena: For my mom, I bought a hat.
Lena: For my sister I bought a jewelry box.
Lena: And my brother will get money.
Note. Target words supported by visual images are underlined here, but not in the study. Translation is provided for the journal
readers only, not for the participants.
Video 2
yesh (eat)
ne budu (I won't)
morkovka (carrot)
sol (salt)
lapsha (noodles)
lupa (magnifying glass)
goryachij (hot)
holodnij (cold)
lopaj (colloquial "eat")
luk (onion)
kompot (compote)
korotkaya (short)
Video 3
sadites (sit down)
tsvetochki (flowers)
obaldet (wow)
divnij (nice)
proshu vas (after you)
pohozhi (look like)
naryadnaya (dressed up)
prichyoska (hairdo)
potresayushche (super)
kulonchik (pendant)
Note. All captions (including target words) appeared in Cyrillic in the study.
APPENDIX C. Sample Comprehension, Vocabulary, and Word Knowledge Tests for Videos
Mark the following as true or false.
____ The boys' friends will help them fight.
____ The boys are saying they will be in the military when they grow up.
____ The boys are brothers.
Now you will hear six words or phrases. Check all of those that were in the video.
1.
4.
2.
5.
3.
6.
2. _______________________________________________________________
3. _______________________________________________________________
[Note. Translation test was not on the same page as the recognition test.]
What was your knowledge of these words before today? Circle the corresponding number.
Never encountered it
1
1
1
Encountered it
2
2
2
Use it
4
4
4
[Note. Words marked with * are non-words (they were not marked in the original test).]
Yes ___
No ___
Why?
2. Did you learn any new words when watching the videos?
Yes ___
No ___
If yes, describe what you did to figure out the meanings of new words.
3. For each of the following statements, please circle the option that applies to you.
a. When watching the videos, I was listening to the sound
All the time
Not at all
Not at all
c. When watching the videos, I was paying attention to the visual images
All the time
Not at all
4. What were the most difficult things for you when watching the videos?
5. What would have helped you to understand the videos better?
6. What was difficult while you were doing the exercises after the videos?
7. Please rate the following statements on a scale from 5 to 1.
a. The captions helped me to understand the videos.
Strongly agree
5
Strongly disagree
4
Strongly disagree
4
Strongly disagree
4
game-like virtual world. These studies have high ecological validity and pedagogical and practical
significance, but they did not investigate how a second language video game might be played by only one
individual. Since many video games can be played by a single player, our study focused on the human-computer interaction of a single-player game.
Second language teachers, students or media designers may not be especially helped by general claims about
video games' support of second language acquisition that focus on the benefits of playful learning
(Hubbard, 1991; Prensky, 2001), motivation (Baltra, 1990; Carrier, 1991; deHaan, 2005a; Hubbard,
1991; Li & Topolewski, 2002), rewards (Li & Topolewski, 2002), or positive affect (Garcia-Carbonell,
Rising, Montero, & Watts, 2001). Understanding how language learning may happen with video games
requires more than noting that language is involved in the play (Hubbard, 1991, pp. 221–222). "It is easy to
blindly accept something as valuable for language learning simply because it involves language and
problem solving and students enjoy it. ... [M]edia selection should be done on the basis of whether it
really promotes language learning" (Hubbard, 1991, p. 222).
Language teachers or learners may also not be aided by discussions focusing on game features such as
comprehensible input, self-study opportunities, subtitles, repetition, and authentic language (Baltra, 1990;
deHaan, 2005a; Hubbard, 1991; Meskill, 1990; Purushotma, 2005); these are features games share with
other educational technologies such as DVD movies. Second language research, teaching, and design
should focus on what distinguishes games from other multimedia. We agree with Clark (2001) that all
instructional media contain both technological affordances and educational communication between the
designer and user. Video games incorporate various technological and pedagogical elements to both
entertain and train the player. Pedagogical scaffolds may be essential for a user's understanding of many
complex games and learning environments (Allen & Otto, 1996; Gee, 2003; Kulikowich & Young, 2001;
Um & deHaan, 2005). These scaffolds are worthy of study and hold enormous potential for assisting
teaching with, playing and designing games, but they are not found in every game. The starting point for
our project was a focus on what constitutes a game rather than what can be added to a game to make it
more enjoyable or educational.
Study Framework
Interactivity, "the extent to which users can participate in modifying the form and content of a mediated
environment in real time" (Steuer, 1993, p. 84), is a defining characteristic of video games (Murray, 1997;
Wolf & Perron, 2003). While many games include animated or video sequences that do not allow or
require player interaction, games invariably necessitate some degree of player interaction in order to
advance (Tamborini et al., 2001, p. 22). Turkle (1985) suggested that, while television is something you
watch, a video game is something you do, a world that you enter and, to a certain extent something
you become (pp. 66–67). While interactivity certainly characterizes video games, it is a challenging
construct to frame and to study as it is "overused and underdefined" (Heeter, 2000, p. 75) and perhaps
"the most grossly misunderstood and callously misused term associated with computers" (Crawford,
2005, p. 25).
Numerous taxonomies of interactivity in educational media exist (e.g., Sims, 1997). Although
classifications help to identify player actions, research has not yet been conducted to demonstrate the
effects of these various interactivities on the second language acquisition process. Complex materials can
prevent learning (Paas, Renkl, & Sweller, 2003), and it has not yet been determined if or what complexity
is added by physical interaction with a second language technology or media. This article reports a study
in which the physical interactivity of a second language music video game was manipulated to investigate
the effect of interactivity on vocabulary acquisition and cognitive load. Before describing the methods,
results and conclusions of the project, an addition to a cognitive theory of language learning with
multimedia is suggested.
Plass and Jones (2005) presented an interactionist and cognitivist model of language learning with
multimedia by synthesizing Gass's (1997) second language acquisition model and Mayer's (2001)
cognitive theory of multimedia. Plass and Jones reviewed examples and studies of second language
instructional multimedia that assist language learning: (a) glosses: text and/or images that provide
additional information for unknown lexical items (e.g., Chun & Plass, 1996a, 1996b; Laufer & Hill, 2000)
and (b) simultaneously-presented aural and video information (e.g., Hernandez, 2004). Plass and Jones
(2005) stated that "the level of cognitive load induced by the input enhancement and the role this load
may play in the acquisition of vocabulary and construction of meaning needs to be taken into
consideration" (p. 483). Their consideration of apprehended input seemed to be concerned only with
words or pictures selected (by simple computer mouse movements) and viewed; neither the research
studies they review nor their theoretical framework address the physical interactivity required of more
complex learning environments such as virtual worlds and video games.
When language students watch a video with subtitles, they are required only to attend to input, and their
cognitive resources may not be taxed as heavily as they are while playing a video game. Players of a video game in their
second language must perform additional playful and spontaneous tasks (which depend on the specific game
genre; for example, pressing a button in time with music, as in the game used in this study) while
simultaneously attending to aural and textual language. Not only the cognitive load of the input
enhancement, but also that of the fundamental interactivity with the learning environment (whether
simple computer mouse actions, or quick and complex video game controller movements) should be a
focus of inquiry. Cognitive load theory (Paas et al., 2003) provides a framework for understanding the
effect of interactivity on the language learning process. Since human cognitive architecture consists of a
limited short-term (or working) memory (Baddeley, 1992; Miller, 1956), and a game's complex elements
(e.g., music and subtitles) can create an unalterable high demand on working memory (intrinsic cognitive
load), it is important to understand whether a media feature, such as interactivity, presents a student with
unnecessary extraneous cognitive load, which interferes with learning, or germane load, which enhances
learning.
There is some evidence for interactivity increasing mental activity. Pellouchoud, Smith, McEvoy, and
Gevins (1999) compared (using electroencephalograms) the mental effort required of children playing or
watching a Super Nintendo puzzle game for 15 minutes. The subjects experienced higher cognitive load
(i.e., higher theta rhythms and lower alpha and mu rhythms were recorded) when playing the game. No
learning outcomes were measured in their study. Brett (2001) reported a unique study in which language
learners were required not only to attend to language in various audio-visual presentations, but also to
simultaneously perform an interactive task. He found that students exposed to video and subtitles
performed best on written summaries, followed by subjects who used video, subtitles and simultaneous
on-screen comprehension tasks. Brett concluded that the complex learning environment of videos,
subtitles, and tasks caused cognitive overload. Since neither Pellouchoud et al. nor Brett specify which
type of cognitive load was caused by interaction with the media in their studies, further research is
required.
Our study was designed to investigate whether the interactivity (and simultaneously presented text, audio
and animation) of video games imposes extraneous cognitive load (thus having a negative effect on learning) or
germane load (thus having a positive effect on learning). In addition to cognitive load theory, the impact
of attention on language learning outcomes was also of conceptual value to the study. The importance of
attention's role in second language acquisition has been established both theoretically and empirically
(Leow, 2000; Mackey, Gass, & McDonough, 2000; Robinson, 1995; Rosa & O'Neill, 1999; Schmidt,
2001; Tomlin & Villa, 1994). Not being able to attend to second language input in a media environment
(such as a video game) will prevent subsequent analyses, integration, and use of that language (Chapelle,
1998).
Kalyuga, Chandler, and Sweller (1999) described instructional media that cause split attention and
extraneous cognitive load as having "several sources of information [that] are difficult or impossible to
understand in isolation and must be mentally integrated to achieve understanding" (pp. 367–368). Although
the audio-visual split attention Kalyuga et al. described can take place in video games (e.g., animations,
subtitles, and spoken dialogue), a more primary focus of research on split attention should be the split
between the audio-visual elements and the typifying physical interactivity of video games, since video
games' physical interface requires frequent input from the player and the input required can disrupt the
player's involvement with the game space (Taylor, 2002, p. 20). Because games can contain useful
linguistic information, it is necessary to determine whether interactivity helps or hinders a
student in noticing it.
deHaan (2005b) reported one Japanese as a foreign language student's experience of playing a Japanese
baseball video game for one month. Although anecdotal positive learning outcomes were documented
(Kanji character reading improved 57% on the post-test from the pre-test), the participant reported that his
attention was divided between playing the game and listening to and reading the Japanese ("I can hear
them talking, but I'm concentrating on hitting the ball; I'm not listening to them" and "I'm trying to
listen [to what the announcers are saying] . . . I'm not paying attention to pitching," p. 284) and he could
not focus on both at the same time, a result that supports Brett's (2001) findings and Kalyuga et al.'s
(1999) suggestions.
Second language acquisition theory delineates the importance of noticing linguistic information in a
media environment (Chapelle, 1998; Schmidt, 2001), and video games seem to be a medium with various
features that can support the language acquisition process (deHaan, 2005a), yet the particular influence of
playful interactivity is not yet well understood. Cognitive load theory (Paas et al., 2003) and research
seem to suggest that physical interactivity will increase mental effort (Pellouchoud et al., 1999) and
hinder noticing and vocabulary acquisition (Brett, 2001; deHaan, 2005b). However, the question of how
interaction with a video game environment may affect second language acquisition has not been
adequately investigated.
RESEARCH QUESTIONS
1. What is the effect of the degree of interaction (i.e., watching or playing) with a music video game
on immediate written vocabulary recall? Are there additional effects for interaction and language
proficiency or video game proficiency on vocabulary recall?
2. What is the effect of the degree of interaction (i.e., watching or playing) with a music video game
on delayed written vocabulary recall?
3. What is the effect of the degree of interaction (i.e., watching or playing) with a music video game
on cognitive load? Are there additional effects for interaction and language proficiency or video
game proficiency on cognitive load?
4. Was there a difference between the attitudes of players versus watchers of the music video game?
Non-directional hypotheses were assumed and null hypotheses of no differences were tested in this study;
no expectations were made in regard to vocabulary recall, cognitive load or opinions for any group.
METHOD
Participants and design
This experimental study investigated to what degree, if at all, video game interactivity would help or
hinder the noticing and recall of second language vocabulary. Independent variables included
interactivity, language proficiency, and video game proficiency. Dependent measures included cognitive
load, vocabulary written recall (immediate and 2-week delayed), and participant opinions of the
treatment.
Eighty undergraduates (65 males, 15 females, ages 18-24) from a computer science university in rural
Japan participated. The participants spoke Japanese as their first language, had between 6 and 11 years of
formal English education, and were taking one or two weekly English for Specific Purposes (Computer
Science) classes. Very few participants had taken a standardized English proficiency test and very few
students studied English with media (e.g., movies, music, and books) outside of class. The participants
rated, on a scale ranging from 1 (much worse) to 7 (much better), their reading skills as slightly worse (M
= 3.49), their listening skills as slightly worse (M = 3.32), and their music video game skills as worse (M
= 2.75) than other students in their year at the university. Nine participants had traveled (all for less than a
month) in an English-speaking country. Only individuals who had never played Parappa the Rapper 2
(NanaOn-Sha, 2002) or similar music video games participated. The participants reported many years of
video game playing experience (M = 11.85), most considered role-playing and action/adventure games to
be their favorite video game genres, and 16 had played English-language video games. Fifty-six of the
participants liked music video games, and the participants played video games, on average, for 7.16 hours
each week (SD = 9.16, Mode = 3.0). The participants were randomly selected from the university via
flyers and email. The two experimental groups (players and watchers of a video game) did not differ
significantly in gender, age, level of education, familiarity with music video games, overall
language proficiency, overall video game proficiency, or self-reported pre-treatment knowledge of the
vocabulary of the video game used in the study (players M = 35.7 words, watchers M = 35.8 words, t(39)
= .152, p = .880). The written pre-test required each participant to indicate whether he or she knew a
particular vocabulary item. The test comprised 41 distinct words from the video game lyrics used
as answers on the cloze test (duplicate vocabulary items from the lyrics were not included) and 21
additional distracters. There were 62 items in total on the vocabulary pretest, and these items were
arranged randomly. A cloze test of the game's lyrics was not used to gauge prior knowledge in order to
avoid priming the subjects for the post-tests.
This study manipulated video game interactivity. Forty pairs of students participated and, in each pair, one
played the game and the other watched an identical video signal of the partner's game (see Figure 1). For
the interactive treatment, participants played Stage One of the game Parappa the Rapper 2 on a Sony
PlayStation 2 connected to a 25-inch TV. English subtitles were displayed on the screen. Players did not
pause the game. Players used stereo headphones and sat in a cubicle. For the non-interactive treatment,
participants watched the video of the player's game on a linked identical 25-inch TV in an adjoining cubicle.
The watchers could also see English subtitles on the screen and used stereo headphones. Participants were
grouped in order to ensure that each pair was exposed to identical language. The players and watchers
could not see each other or interact.
Materials
Parappa the Rapper 2 was used because of its authentic English language (it was designed for the
commercial North American video game market), its simultaneously presented aural and textual language
(English raps and English subtitles), and its prototypical rhythm game elements (gameplay requires
players to keep time with a musical rhythm; Wolf, 2001, p. 130).
Figure 1. A bird's-eye view of the treatment cubicles for the player and watcher of the game.
Note. Arrows indicate the delivery of the game console's audio and video signal to both screens.
Each level of this game is a short rap; the player completes lines of the rap by pressing controller buttons
at the correct times. If the player's timing is off, the line is not completed and if many lines are not
completed, the player fails the stage. The rap for the game stage played in this project contains
instructions about how to make a burger (e.g., "heat the grill" and "turn the patty over"). Some of the
lines of the rap are given by a non-player character (Beard Burger Master). English is not central to
gameplay. In other words, a player need not comprehend the game's language to interact successfully
with the game. The game's display includes a fast food restaurant (various ingredients and cooking
implements can be seen on the screen), a rhythm meter, a score meter, and subtitles of the game's lyrics
(Figure 2). Parappa the Rapper 2's elements all relate and interact in complex ways. One button press
(cued by a moving icon on the rhythm bar) produces a heard word (e.g., "burger") coupled with an action
in the 3D environment (e.g., flipping a burger) that is semantically linked to other words to create a line of
the rap and to other actions to create a fast food meal. The correctness of the button press is indicated by
an animation on the screen, a change in score, and a sound effect. The game seems to have high
interactivity since its elements need to be simultaneously noticed and understood by a player (or watcher).
Physical interactivity can have many forms and functions; the type that most closely approximates the
physical interaction the player had with this study's game is "object interactivity" in which "... things are
activated using ... a device [causing] some form of audio-visual response" (Sims, 1997, p. 162),
categorized as "basic stimulus-response" (p. 161). The interactivity of the game in this study is quite
simple compared to many of the interactive learning environments used in teaching and research with
technology, for example Nelson and Ketelhut's (2007) virtual inquiry worlds. Still, since any extraneous
load can disrupt learning (Sweller, 1994), it is important to examine the effect of the interactivity that this
(or any) game has on learning processes and outcomes.
into high and low language and video game proficiency groups.
Next, participants were randomly assigned to one of the two treatments. The pairs participated for 20
minutes. Each participant was instructed to play (or watch) the game and learn the words of the rap. The
video game level was repeated five times by all pairs. This was done to approximate the repetitions of a
level, either for fun or because the player failed the stage, which might happen in authentic gameplay with
this particular genre. The time and repetitions were decided after a pilot test (six months prior to the
experiment) in which students who had participated for 30 minutes doing eight repetitions reported
boredom and shifting attention. The participants were not allowed to take notes or use their dictionaries.
Following the treatment, the participants each completed (presented in Japanese and English) a cognitive
load measurement, a vocabulary written recall test, and an opinion questionnaire. Finally, two weeks
following the treatment, the participants completed the same vocabulary written recall test.
Instruments
Cognitive load
The participants completed a Cognitive Load Subjective Experience Questionnaire targeting invested
mental effort (based on Paas, 1992; Cronbach's alpha > 0.85 as cited in Paas, Van Merrienboer, & Adam,
1994) and perceptions of material difficulty (based on Kalyuga et al., 1998; Cronbach's alpha = .4583).
Mental effort may not be the same as task difficulty (i.e., a particular learner may find the material
difficult, but not be willing to invest any mental effort to understand it). Items in the questionnaire
distinguished between the cognitive load from playing or watching the game and cognitive load from the
game's language. The four questions were:
1. How much mental effort did you invest in playing (watching) the video game? (nine-point Likert
scale from "extremely low mental effort" to "extremely high mental effort")
2. How easy or difficult was the video game to play (watch)? (seven-point Likert scale from
"extremely easy" to "extremely difficult")
3. How much mental effort did you invest in studying the video game's language? (nine-point Likert
scale from "extremely low mental effort" to "extremely high mental effort")
4. How easy or difficult was the video game's language to understand? (seven-point Likert scale
from "extremely easy" to "extremely difficult")
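The internal consistency values cited for these scales are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is computed from item responses, the following uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the response data are hypothetical illustrations, not the study's data or analysis script.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list per questionnaire item, each holding one score per
    respondent, with respondents in the same order in every list.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total scale score for each respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical responses from five respondents to two nine-point effort items.
effort_game = [7, 5, 8, 4, 6]
effort_language = [6, 5, 7, 3, 6]
alpha = cronbach_alpha([effort_game, effort_language])
print(f"alpha = {alpha:.3f}")
```

Alpha approaches 1 when respondents answer the items consistently and drops when the items vary independently of one another, which is why low coefficients are treated as a measurement limitation.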
Vocabulary written recall (immediate and 2-week delayed)
The lyrics of the game level were used as a written cloze test (i.e., stressed words from the lyrics were
replaced with blanks). There were 41 unique vocabulary words in the cloze test. Participants were
required to write the missing words from the game's rap in the test's blanks. Answers were scored using
the "acceptable" scoring method, meaning misspelled but recognizable answers (e.g., "musterd" instead of
"mustard") and answers with the correct stem (e.g., "round" instead of "around") were accepted. Notably,
though, answers that matched semantically (e.g., "watch the fire" instead of the correct answer "watch the
grill") were not accepted. Each correct answer was counted as one point. The same vocabulary written
recall test was given two weeks following completion of the treatment to measure vocabulary retention.
See Appendix A for the test (underlined words appeared as blanks for the participants).
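The scoring rule described above (accept recognizable misspellings and correct stems, reject semantic substitutes) can be approximated in code. The sketch below is only an illustration: it stands in for rater judgment with a string-similarity ratio from Python's difflib, and the 0.8 threshold and function names are assumptions, not part of the study's procedure.

```python
from difflib import SequenceMatcher

# Hypothetical cut-off; the study relied on human scoring, not a threshold.
SIMILARITY_THRESHOLD = 0.8


def is_acceptable(answer, target, threshold=SIMILARITY_THRESHOLD):
    """Return True if the written answer is close enough in form to the target."""
    answer = answer.strip().lower()
    target = target.strip().lower()
    if not answer:
        return False  # a blank left empty earns no point
    return SequenceMatcher(None, answer, target).ratio() >= threshold


def score_cloze(answers, key):
    """One point per acceptable answer, paired by blank position."""
    return sum(is_acceptable(a, t) for a, t in zip(answers, key))


answers = ["musterd", "round", "fire", ""]
key = ["mustard", "around", "grill", "patty"]
print(score_cloze(answers, key))
```

Because the comparison is purely orthographic, a semantically related answer such as "fire" for "grill" is rejected, in line with the rule above. A form-based ratio can also wrongly accept a different real word that happens to be spelled similarly, so it would not replace a human rater.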
Player and watcher opinions
Each subject reported his/her enjoyment of the game or video, the usefulness of the game or video for
studying English, and any questions or comments about his/her experience. These opinions were
analyzed: (a) to determine trends in the participants' experiences of the video game and the video and (b)
to supplement the measures of vocabulary recall and cognitive load.
RESULTS
The data did not violate assumptions of normality, linearity, homogeneity of regression slopes, equality of
variance, or homogeneity of intercorrelations. For the statistical analyses, the alpha level was .05, the
power was .80, and the effect size was .46 (medium) based on G*Power 3 (Faul, Erdfelder, Lang, &
Buchner, 2007).
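To illustrate how an alpha level, a target power, and an effect size relate to sample size, the following sketch uses a normal approximation for a two-tailed paired-samples test. It assumes the .46 effect size is a standardized mean difference for paired data, which the text does not state, and it does not reproduce G*Power's exact noncentral t computation; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_pairs_needed(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation number of pairs for a two-tailed paired test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-tailed test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


pairs = approx_pairs_needed(0.46)
print(pairs)
```

Under these assumptions the approximation gives about 38 pairs. An exact t-based calculation yields a slightly larger number, which is in the neighborhood of the 40 pairs who took part in the study.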
Research question 1. What is the effect of the degree of interaction (i.e., watching or playing) with a
music video game on immediate written vocabulary recall? Are there additional effects for
interaction and language proficiency or video game proficiency on vocabulary recall?
A paired-samples t-test revealed that the watchers of the video game recalled significantly more
vocabulary items (M = 21.70, SD = 6.94) than the players (M = 7.23, SD = 4.76), t(39) = 11.63, p < .05.
See Table 1. The eta squared statistic (.78) indicated a large effect size; Cohen's d = 2.43 also showed a
large effect, and observed power was 0.99. No statistically significant main or interaction effects for
language proficiency or video game proficiency on vocabulary recall were found.
Table 1. Mean Scores on Immediate Post-Procedure Vocabulary Recall Test
                         Recall Test Scores
Treatment Group    n     Mean      SD
Players            40     7.23     4.76
Watchers           40    21.70     6.94
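The effect sizes reported for Research Question 1 can be checked against the test statistic and the group descriptives. The sketch below assumes the conventional formulas: eta squared computed from t as t^2 / (t^2 + df), and Cohen's d computed as the mean difference divided by the square root of the average of the two group variances. The text does not say which formulas were used, so this is a consistency check rather than a reconstruction of the original analysis.

```python
from math import sqrt


def eta_squared_from_t(t, df):
    """Eta squared recovered from a t statistic and its degrees of freedom."""
    return t ** 2 / (t ** 2 + df)


def cohens_d_pooled(m1, sd1, m2, sd2):
    """Cohen's d using the root mean of the two group variances."""
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd


# Values as reported in the text and Table 1.
eta_sq = eta_squared_from_t(11.63, 39)
d = cohens_d_pooled(21.70, 6.94, 7.23, 4.76)
print(f"eta squared = {eta_sq:.2f}, d = {d:.2f}")
```

Both results round to the reported values (.78 and 2.43), which suggests these conventional formulas are consistent with the figures in the text.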
Research question 2. What is the effect of the degree of interaction (i.e., watching or playing) with a
music video game on delayed written vocabulary recall?
The average immediate (Time 1) and delayed (Time 2) vocabulary recall scores of the players and
watchers are presented in Table 2. Missing delayed posttest data required that 14 participants (i.e., 7 pairs
of participants) be removed from the data set, leaving data from 66 participants (33 pairs). A two-factor
(interactivity: watching or playing the video game) analysis of variance (ANOVA) with a repeated
measure on one factor (time: immediate or delayed posttest) revealed that the main effect for time was
significant, F (1,64) = 76.82, p < .05, Eta-squared = .546 (a strong effect), Cohen's d = 0.91 (a large
effect), observed power = 0.99. The immediate recall scores were much higher than the delayed recall
scores. A significant main effect for interactivity was also obtained, F (1,64) = 129.01, p < .05,
Eta-squared = .668 (a strong effect), Cohen's d = 2.54 (a large effect), observed power = 1.00. The vocabulary
recall scores of the watchers were significantly higher, on average, than the scores of the players.
Table 2. Mean Scores on Immediate and Delayed Vocabulary Recall Tests
                   Time 1                            Time 2
                   (immediately post-procedure)      (2 weeks after procedure)
Treatment Group    n     Mean      SD                n     Mean      SD
Players            33     7.42     5.07              33     5.15     3.81
Watchers           33    23.27     6.09              33    16.03     5.79
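The eta-squared values reported for Research Question 2 appear to be consistent with partial eta squared, which can be recovered from an F ratio and its degrees of freedom as (F x df_effect) / (F x df_effect + df_error). The sketch below applies that formula to the three F ratios reported in this section; treating the reported statistic as partial eta squared is an assumption, since the text labels it only as eta-squared.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F ratio, reported eta-squared) for each effect, all with df = (1, 64).
reported = {
    "time": (76.82, 0.546),
    "interactivity": (129.01, 0.668),
    "time x interactivity": (20.96, 0.247),
}

for effect, (f_value, reported_value) in reported.items():
    computed = partial_eta_squared(f_value, 1, 64)
    print(f"{effect}: computed {computed:.3f}, reported {reported_value:.3f}")
```

All three computed values round to the reported figures, so the effect sizes and F ratios in this section agree with one another under this assumption.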
A significant interaction effect (i.e., Time x Interactivity) was also obtained, F (1,64) = 20.96, p < .05,
Eta-squared = .247 (a medium effect), observed power = 0.98. There was a much larger decrease in scores
for the watchers (from 23.27 to 16.03) than for the players (from 7.42 to 5.15), although the very low
scores of the players on the recall test immediately following the procedure may explain some of the
game level (foods, cooking tools) and the sentences describing the pictures made it easy to learn the
vocabulary.
Players' concentration
Many players commented that it was very difficult for them to pay attention to both the game and the
language simultaneously. They reported difficulty pressing the buttons accurately when they focused on
the language and that they could not listen to or read the English when they concentrated on the
gameplay. A few of the players commented that they wanted to be a watcher to learn the English. One
player commented that other video game genres would be better reading practice.
In summary, both the players and the watchers of the video game recalled vocabulary from the game, but
the players recalled significantly less vocabulary than the watchers. This seems to be a result of the
extraneous cognitive load induced by the interactivity of the game; the players perceived the game and its
language to be significantly more difficult than the watchers did. Both players and watchers forgot
significant amounts of vocabulary over the course of the study. Players reported difficulty simultaneously
attending to gameplay and vocabulary.
DISCUSSION
The players of the video game did recall some of its language, confirming various proposals (Baltra, 1990;
deHaan, 2005a; Hubbard, 1991; Meskill, 1990) that video games are potential sources of linguistic
information for language learners. However, the watchers of the game recalled more vocabulary items
than the players. It is important to note that the difference between the players' and watchers' mental
effort on the game's language was not statistically significant. The players and watchers invested
comparable mental effort on media and language they perceived to be of different difficulties; the players'
poorer recall of vocabulary seems to be attributable to the interactivity of the video game.
It is also important to note that the players and watchers did not differ significantly in their pre-treatment
knowledge of the video game vocabulary. If their knowledge had differed, the difference in vocabulary
recall scores could be attributable to prior knowledge. Additionally, although both players and watchers
reported knowing many of the game's words prior to the study (players M = 35.7 words, watchers M =
35.8 words), the players recalled only 7.2 words and the watchers recalled only 21.7 words. The
participants were not able to notice or recall many known words in either the video or the video game
environment.
It can be concluded that the physical interactivity of this particular game was extraneous cognitive load,
that is, the interactivity was not conducive to learning and seems to have unnecessarily diverted the
players' attention from the vocabulary and hindered recall. The interactivity of this video game was not
germane cognitive load; it did not contribute to schema development. The watchers of the game were not
exposed to the additional extraneous load of the physical interactivity and were able to devote more
cognitive resources to the intrinsic load of the game and its language. This finding extends the research
conducted by Pellouchoud et al. (1999); their participants also experienced higher cognitive load when
playing a video game than when watching a video game.
Previous research (Brett, 2001; deHaan, 2005b) illustrated that interactivity with foreign language
multimedia learning environments can hinder language acquisition. Our study reinforces those findings;
the players recalled fewer vocabulary items than the watchers. With continued research on other
interactivities (such as those mentioned by Sims, 1997), learning environments (like the multiplayer
games studied by Sykes et al., 2008, and Zheng et al., 2009) and language elements, it may be possible to
broaden Plass and Jones's (2005) discussion of apprehended input in second language multimedia
environments to include a consideration of the user's physical manipulation of the media's interface.
As Hubbard (1991) suggested, it is important to be critical about the potential of video games to support
second language acquisition. This study demonstrated that even though a video game can be enjoyable
and contain foreign language vocabulary, its interactivity hindered the language acquisition process;
players were unable to recall the game's vocabulary as well as watchers could.
The reason for the lack of more positive learning outcomes for the players in this study may be framed by
Murray's (1997) discussion of interactivity and agency:
Activity alone is not agency. Some games, like chess, can have relatively few or infrequent actions
but a high degree of agency, since the actions are highly autonomous, selected from a large range of
possible choices, and wholly determine the course of the game. (p. 128)
This study's video game did not in any way allow the players to navigate through the rap, or to make
meaningful choices about the game's language. The "object interactivity" (Sims, 1997, p. 161) of the
video game lacked user agency, and the controlled experimental research design seemed to have
prevented the learners from "select[ing] and organiz[ing] their own learning resources" (Schwienhorst,
2002, p. 197) or focusing the learners on the "meaning and purpose" (Culley, Mulford, & Milbury-Steen,
1986, p. 70) of language. More agentive (i.e., "immersive virtual"; Sims, 1997, p. 167) interactivities
(e.g., video games with more controllable in-game tasks that foster deeper consideration of language) may
support better incidental second language vocabulary recall outcomes than the "basic stimulus-response"
(Sims, 1997, p. 161) interactivity of this study's video game. Also, English was not crucial for gameplay;
this may have contributed to the results. Games that integrate language use and play may be better for
language acquisition.
The players' inability to simultaneously attend to the game and its language supports the theoretical and
empirical work on noticing and second language acquisition (e.g., Schmidt, 2001) and split attention
effects (Kalyuga et al., 1999). The players of the video game were asked to play the game and attend to
the vocabulary simultaneously and these multiple foci of attention prevented them from noticing and
recalling more vocabulary items than the watchers.
As Gass (1997) has delineated, before language can be truly acquired, it must be comprehended,
integrated with prior knowledge, and used purposefully. Our results do not suggest that vocabulary can be
acquired more effectively through non-interactive environments than interactive environments, only that
the initial exposure to the vocabulary of this game, for these players, was made more difficult by their
simultaneous interaction with the video game. As in previous studies involving incidental vocabulary
acquisition (Knight, 1994; Rott, 1999), both the watchers and the players of the game forgot vocabulary
over time.
This study did not capture casual gameplay; neither the players nor the watchers were permitted to pause
or take a break from the game to reflect. Long (1991) explicated the need for language learners to
self-initiate momentary transfers of attention to elements of language; if the player and watcher had had more
agency with their media (i.e., been allowed to pause the game, or control how they interacted with the
game), the results may have differed. This study was conducted in a laboratory setting, and since context
is important to cognition and learning processes, results may differ if this game were used at home or in a
classroom setting.
Limitations
The results of this study must be considered in light of its limitations. An important limitation is the low
number of female participants (15 out of 80, although there were no between-group gender differences).
Another limitation is that the instrument used to measure cognitive load was a self-report questionnaire
that did not have strong internal consistency (the mental effort scale had an alpha coefficient of .551, and
the material difficulty scale had an alpha coefficient of .565). The vocabulary-based cloze test should also
not be taken as an ultimate measure or predictor of learning with games, and the production task may not
have captured the players' and watchers' focus on comprehending game language. A further limitation is
the manner in which the participants were grouped into high and low language and game
proficiencies. One threat to generalizability is the treatment media; the results from watching or playing a
game in a typical video game playing environment (e.g., on a sofa in a friend's living room) might be
different, and longer or repeated playing or watching sessions may have resulted in better learning
outcomes for both groups. The results of this study are also limited in their ability to generalize to the
English as a Foreign Language population at large, or to other topics or aspects of second language
acquisition.
Implications
This study did not attempt to determine that one medium (e.g., a video game or a video) was better or worse
than another at supporting second language acquisition, since all media are complex combinations of
technological affordances and communication between designers and users (Clark, 2001). This study
carefully manipulated one aspect of a video game in order to provide an initial understanding of how
interactivity affects recall of and attention to second language vocabulary. As some have argued (e.g.,
Arnseth, 2006; Squire, 2002), the power of games for educational purposes may not reside in the games
themselves, but in the context and activities related to and extending from play. Further educational game
research (from various perspectives) and careful design and pedagogy are required.
Implications for research
Although the "basic stimulus-response" interactivity of this study's game negatively impacted attention and
recall, the interactivity of other games may have different outcomes. Further studies should investigate
interactivities that more closely align with the language of the game (e.g., many sports games' voice
commentary describes player actions), or give the learner deeper choices about in-game actions (e.g.,
many simulation and strategy games allow great agency). Purushotma (2005) describes the link between
player choice and goal-related incidental language learning (with remediated text glossed by images and
animations) in the life simulation game The Sims; that game's interactivity might lead to positive
learning outcomes for the players in a replication of the current study. Continued interactivity research
could utilize Sims's (1997) taxonomy of interactivity as well as perspectives such as endogenous fantasy
(Habgood, Ainsworth, & Benford, 2005) or action memory (Engelkamp, 2001).
This study focused on the initial exposure to vocabulary items and immediate and delayed recall. Future
research should examine how interactivity helps or hinders other stages of the second language
acquisition process. Studies that examine how game vocabulary is integrated with previous knowledge
(i.e., how interactivity affects understanding), or is used communicatively, could greatly benefit digital
game-based language learning research. Other affordances of video games (e.g., stories, play, subtitles,
repetition, feedback, and visual representations of language) should continue to be examined. As well,
studies should focus not only on vocabulary acquisition, but also on phonetic, syntactic, and pragmatic
knowledge building, and the transfer of these skills to communicative use.
Further research regarding language acquisition through video games could examine how various
instructional techniques affect the learning process. These studies could focus not only on the scaffolds
game designers use to introduce players to the mechanics of a particular game (outlined in Gee, 2003;
Prensky, 2003; Um & deHaan, 2005), but also strategies that multimedia designers and classroom
teachers use to encourage learners to focus more explicitly on the linguistic content of materials.
Paribakht and Wesche (1997) suggest that explicit instruction may support a more complete acquisition of
vocabulary; studies on instructional techniques might range from pre-gameplay activities (i.e., schema
activation) to multimedia glosses of vocabulary (as suggested by the studies reviewed by Plass & Jones,
2005) or intelligent feedback in the video game environment to individual or class debriefings following
gameplay (according to the focus on form approach delineated by Long, 1991).
In this study, the players of the interactive video game experienced higher extraneous cognitive load (i.e.,
perception of material difficulty) than the watchers of the non-interactive video did. As Ben-Shaul (2003)
emphasizes, it is important to explore how the interactivity of video games affects cognitive load and
knowledge building. Researchers should investigate the wide variety of video game interactivities and
their various cognitive effects.
The use of various empirical perspectives could provide a deeper understanding of how video games
affect second language acquisition. Experimental studies can be useful for systematic comparisons and
generalizations of features, but ethnographic work may provide a more valid understanding of gameplay
and learning with foreign language video games. Naturalistic gameplay studies (as called for by Squire,
2002) are needed of players of games in a second language. Action research (i.e., precise accounts and
analyses of actual classroom practice) may provide valuable insights regarding teaching methods and
learning outcomes with foreign language video games. Observations, interviews, and stimulated recalls of
more natural gameplay habits (e.g., turn taking, pausing to reflect on gameplay, discussing the game with
friends, using Internet resources) may provide valuable insights into how to design and teach with digital
games. Conducting mixed methods research with a variety of game genres seems to be a logical next step
for digital game-based language research due to the early nature of the field.
Implications for instructional design
Designers of educational games for the teaching of foreign languages should consider the type of
interactivity they are requiring from their users, especially if their game is based on interactivity similar
to that used in this study. The incorporation of additional ludic (e.g., cooperative or competitive modes of
play) or social tasks (e.g., recording and sharing video of gameplay) may foster better attention to and
processing of the game's language. Although the results of this study cannot be widely generalized to
other types of video games or other multimedia environments used for language instruction (e.g.,
interactive DVDs or websites), designers of those media should carefully consider the degree and type of
interaction they require from their users in order to avoid overwhelming them. Purushotma, Thorne and
Wheatley (2008) offer numerous principles for the design of digital games for language learning.
Implications for self study
Students use a variety of media to autonomously learn a foreign language. As video games continue to
gain popularity, it seems likely that learners will import or download foreign language games. Students
should realize that not all video games are useful for language learning; they should choose their study
materials carefully. If students want to use a game like Parappa the Rapper 2, they should be aware of the
difficulty in balancing their attention between gameplay and language. Students may be less
overwhelmed than the players in this study if they repeat levels, take breaks between sessions, use notes
and dictionaries, record their play to watch later, consult online forums and guides, and play
with friends (perhaps alternating and discussing the game and its language after each turn). Video games
used in conjunction with learner strategies may be more beneficial than this experimental study's
controlled play and tasks. Students, for example, may choose a game of a different genre than the game
used in this study.
Implications for pedagogy
Because of students' enjoyment of video games, language teachers may be interested in using games in
their classrooms. While games do contain a wealth of comprehensible language, the results of this study
suggest that teachers should carefully consider the interactivity of the games they want to use in class and
design pedagogical strategies for scaffolding students' play and language learning with mindfully-selected games. Teachers may like to choose a game of a different genre than the game used in this study.
Scaffolds might be used before, during or after gameplay.
Before gameplay, a teacher might ask the students to brainstorm vocabulary of the situation in the game
(e.g., a fast food restaurant), since more complete schema can contribute to lower cognitive load (Douglas
& Hargadon, 2001; Sweller, 1994). A teacher might discuss the difficulty in balancing gameplay and
attention to language, perhaps emphasizing this point by having one student play the game in front of the
class and talk about her experience and foci of attention during play. Pre-teaching vocabulary using drills
or dictionary work, or viewing a non-interactive video of the game before play might also be effective
pre-play scaffolding.
During gameplay, a teacher might suggest that pairs of players and watchers alternate playing and
watching to balance gameplay and linguistic analysis. These pairs could be asked to transcribe or
complete a cloze activity of the game's language together. During gameplay, a teacher could use a focus
on form approach to draw students' attention to unique phonetic, morphological, or syntactic elements of
the game's language (especially after errors made during students' meaningful L2 communication about
the game), and then have students continue playing to examine the language in its natural context.
After gameplay, students could write definitions and original sentences for the unfamiliar words noticed
during gameplay. Students could also create gameplay tips for other players in order for the other players
to free up cognitive resources for attention to and analysis of language. In order to push the students
toward linguistic output (Swain, 1995), a teacher might have the students write and perform original role-plays based on the vocabulary of the video game (e.g., co-workers in a fast food restaurant). If a variety of
video games are used in the classroom, the class might create a database of language noticed during
gameplay and then investigate word collocations in various video game contexts.
ACKNOWLEDGEMENTS
We are grateful to Jan L. Plass, Lixing (Frank) Tang, the editors, and the reviewers for their useful and
insightful suggestions.
ABOUT THE AUTHORS
Jonathan deHaan is an Associate Professor in the Faculty of International Relations at the University of
Shizuoka. He earned his Ph.D. in Educational Communication and Technology from New York
University. His research focuses on second language learning and teaching with games and simulations.
E-mail: dehaan@u-shizuoka-ken.ac.jp
W. Michael Reed was a retired professor of Educational Communication and Technology at New York
University and the IRB/IACUC Administrator for Radford University in Virginia. His research interests
spanned a 25-year period and focused on educational computing, problem-solving, metacognition,
and composing processes.
Katsuko Kuwada is a doctoral student in the International Cultural Studies program at Tohoku University.
She investigates language and culture; her current research compares the use of first-person subjects in
Japanese and English based on different cultural backgrounds.
E-mail: k-kuwada@u-aizu.ac.jp
REFERENCES
Allen, B. S., & Otto, R. G. (1996). Media as lived environments: the ecological psychology of
educational technology. In D. Jonassen (Ed.), Handbook of research for educational communications and
technology (pp. 199-225). New York: Macmillan.
Arnseth, H. C. (2006). Learning to play or playing to learn - A critical account of the models of
communication informing educational research on computer gameplay. Game Studies, 6(1). Retrieved
from http://gamestudies.org/0601/articles/arnseth
Baddeley, A. (1992). Working memory. Science, 255, 556-559.
Baltra, A. (1990). Language learning through computer adventure games. Simulation and Gaming
Journal, 21, 445-452.
Ben-Shaul, N. (2003). Split attention problems in interactive moving audiovisual texts. Retrieved from
http://hypertext.rmit.edu.au/dac/papers/BenShaul.pdf
Brett, P. (2001, June). Too many media in my multimedia? A study of the effects of combinations of media
on a recall task. Paper presented at Escuela Superior de Administracion y Direccion de Empresas,
Barcelona, Spain.
Carrier, M. (1991). Simulations in English language teaching: A cooperative approach. Simulation and
Gaming Journal, 22(2), 224-33.
Chapelle, C. (1998). Multimedia CALL: Lessons to be learned from research on instructed SLA.
Language Learning & Technology, 2(1), 21-39. Retrieved from http://llt.msu.edu/vol2num1/article1/
Chun, D. M., & Plass, J. L. (1996a). Effects of multimedia annotations on vocabulary acquisition. The
Modern Language Journal, 80, 183-198.
Chun, D. M., & Plass, J. L. (1996b). Facilitating reading comprehension with multimedia. System, 24,
503-519.
Clark, R. E. (Ed.) (2001). Learning from media. Greenwich, CT: Information Age Publishing.
Crawford, C. (2005). Chris Crawford on interactive storytelling. Berkeley, CA: New Riders.
Culley, G., Mulford, G., & Mulbury-Steen, J. (1986). A foreign language adventure game: Progress report
on an application of AI to language instruction. CALICO Journal, 4, 69-94.
deHaan, J. (2005a). Learning language through video games: A theoretical framework, an analysis of
game genres and questions for future research. In S. Schaffer & M. Price (Eds.), Interactive Convergence:
Critical Issues in Multimedia (vol. 10), Chapter 14, pp. 229-239. Retrieved from http://www.interdisciplinary.net/publishing/idp/eBooks/icindex.htm
deHaan, J. (2005b). Acquisition of Japanese as a foreign language through a baseball video game.
Foreign Language Annals, 38(2), 278-282.
Douglas, Y., & Hargadon, A. (2001). The pleasures of immersion and engagement: Schemas, scripts and
the fifth business. Digital Creativity, 12(3), 153-166.
Engelkamp, J. (2001). Action memory: A system-oriented approach. In H. D. Zimmer, R. Cohen, M.
Guynn, J. Engelkamp, R. Kormi-Nouri, & M. N. Foley (Eds.), Memory for action: A distinct form of
episodic memory? (pp. 49-96). New York: Oxford University Press.
Entertainment Software Association. (2008). Industry Facts. Retrieved from http://www.theesa.com/
facts/index.asp
Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible statistical power
analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39,
175-191.
Garcia-Carbonelli, A., Rising, B., Montero, B., & Watts, F. (2001). Simulation/gaming and the
acquisition of competence in another language. Simulation and Gaming Journal, 32(4), 481-491.
Gass, S. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum
Associates Publishers.
Gee, J. P. (2003). What videogames have to teach us about learning and literacy. New York: Palgrave
Macmillan.
Gee, J. P. (2007). Good video games and good learning: Collected essays on video games, learning and
literacy (New literacies and digital epistemologies). New York: Peter Lang Publishers.
Habgood, M. P. J., Ainsworth, S. E., & Benford, S. (2005). Endogenous fantasy and learning in digital
games. Simulation and Gaming Journal, 36(4), 483-498.
Heeter, C. (2000). Interactivity in the context of designed experiences. Journal of Interactive Advertising,
1(1), 75-89.
Hernandez, S. S. (2004). The effects of video and captioned text and the influence of verbal and spatial
abilities on second language listening comprehension in a multimedia environment (Unpublished doctoral
dissertation). New York University, New York.
Hubbard, P. (1991). Evaluating computer games for language learning. Simulation and Gaming Journal,
22, 220-223.
Kalyuga, S., Chandler, P., & Sweller, J. (1998). Levels of expertise and instructional design. Human
Factors, 40(1), 1-17.
Kalyuga, S., Chandler, P., & Sweller, J. (1999). Managing split-attention and redundancy in multimedia
instruction. Applied Cognitive Psychology, 13(4), 351-372.
Knight, S. (1994). Dictionary: The tool of last resort in foreign language reading? A new perspective.
Modern Language Journal, 78(3), 285-299.
Kulikowich, J. M., & Young, M. F. (2001). Locating an ecological psychology methodology for situated
action. The Journal of Learning Sciences, 10(1&2), 165-202.
Laufer, B., & Hill, M. (2000). What lexical information do L2 learners select in a CALL dictionary and how
does it affect word retention? Language Learning & Technology, 3(2), 58-76. Retrieved from
http://llt.msu.edu/vol3num2/laufer-hill/index.html
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware vs. unaware
learners. Studies in Second Language Acquisition, 22, 557-584.
Li, R. C., & Topolewski, D. (2002). ZIP & TERRY: A new attempt at designing language learning
simulation. Simulation & Gaming Journal, 33, 181-186.
Long, M. H. (1991). Focus on form: A design feature in language teaching methodology. In K. de Bot, R.
Ginsberg, & C. Kramsch (Eds.), Foreign Language Research in Cross-Cultural Perspective (pp. 39-52).
Amsterdam: John Benjamins.
Mackey, A., Gass, S., & McDonough, K. (2000). Learners' perceptions about feedback. Studies in Second
Language Acquisition, 22, 471-497.
Mayer, R. E. (2001). Multimedia learning. New York: Cambridge University Press.
Meskill, C. (1990). Where in the world of English is Carmen Sandiego? Simulation and Gaming Journal,
21(4), 457-460.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for
processing information. Psychological Review, 63, 81-97.
Murray, J. (1997). Hamlet on the holodeck. Cambridge, MA: MIT Press.
NanaOn-Sha. (US Release 2002). Parappa the Rapper 2 [computer software]. Sony Computer
Entertainment.
Nelson, B. C., & Ketelhut, D. J. (2007). Scientific inquiry in educational multi-user virtual environments.
Educational Psychology Review, 19(3), 307-326.
Paribakht, T. S., & Wesche, M. (1997). Vocabulary enhancement activities and reading for meaning in
second language vocabulary acquisition. In J. Coady and T. Huckin (Eds.), Second language vocabulary
acquisition: A rationale for pedagogy (pp. 174-200). Cambridge: Cambridge University Press.
Paas, F. (1992). Training strategies for attaining transfer of problem-solving skill in statistics: A
cognitive-load approach. Journal of Educational Psychology, 84, 429-434.
Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent
developments. Educational Psychologist, 38, 1-4.
Paas, F., Van Merrienboer, J., & Adam, J. (1994). Measurement of cognitive load in instructional
research. Perceptual and Motor Skills, 79, 419-430.
Pellouchoud, E., Smith, M. E., McEvoy, L., & Gevins, A. (1999). Mental effort related EEG modulation
during video game play: Comparison between juvenile epileptic and normal control subjects. Epilepsia,
40(4), 38-43.
Piirainen-Marsh, A., & Tainio, L. (2009). Other-repetition as a resource for participation in the activity of
playing a video game. Modern Language Journal, 93, 153-169.
Plass, J. L., & Jones, L. C. (2005). Multimedia learning in second language acquisition. In R. E. Mayer
(Ed.), Cambridge handbook on multimedia learning (pp. 467-488). Cambridge, MA: Cambridge
University Press.
Prensky, M. (2001). Digital game-based learning. New York: McGraw Hill.
Prensky, M. (2003). Escape from planet Jar-Gon. Or, what video games have to teach academics about
teaching and writing. Retrieved from http://www.marcprensky.com/writing/Prensky%20%20Review%20of%20James%20Paul%20Gee%20Book.pdf
Purushotma, R. (2005). Commentary: You're not studying, you're just .... Language Learning &
Technology, 9(1), 80-96. Retrieved from http://llt.msu.edu/vol9num1/purushotma/default.html
Purushotma, R., Thorne, S. L., & Wheatley, J. (2008). 10 Key Principles for Designing Video Games for
Foreign Language Learning. Retrieved from http://knol.google.com/k/ravi-purushotma/10-keyprinciples-for-designing-video
Robinson, P. (1995). Attention, memory, and the noticing hypothesis. Language Learning, 45, 283-331.
Rosa, E., & O'Neill, M. D. (1999). Explicitness, intake and the issue of awareness: Another piece to the
puzzle. Studies in Second Language Acquisition, 21, 511-556.
Rott, S. (1999). The effect of exposure frequency on intermediate language learners' incidental
vocabulary acquisition and retention through reading. Studies in Second Language Acquisition, 21, 589-619.
Sawyer, B., & Smith, P. (2008). Serious games taxonomy. Retrieved from
http://www.dmill.com/presentations/serious-games-taxonomy-2008.pdf
Schmidt, R. W. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp.
3-32). New York: Cambridge University Press.
Schwienhorst, K. (2002). Why virtual, why environments? Implementing virtual reality concepts in
computer-assisted language learning. Simulation and Gaming Journal, 33(2), 196-209.
Sims, R. (1997). Interactivity: A forgotten art? Computers in Human Behavior, 13, 157-80.
Squire, K. D. (2002). Cultural framing of computer/video games. Game Studies, 2(1). Retrieved from
http://www.gamestudies.org/0102/squire/
Squire, K. D. (2006). From content to context: Video games as designed experiences. Educational
Researcher, 35(8), 19-29.
Steuer, J. (1993). Defining virtual reality: Dimensions determining telepresence. Social Responses to
Communication Technologies Paper #104. Retrieved from http://www.presenceresearch.org/papers/steuer92defining.pdf
Steinkuehler, C. (2007). Massively multiplayer online gaming as a constellation of literacy practices.
eLearning, 4(3), 297-318.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer
(Eds.), Principle and practice in applied linguistics (pp. 125-144). Oxford, England: Oxford University
Press.
Sweller, J. (1994). Cognitive load theory, learning difficulty, and instructional design. Learning and
Instruction, 4, 295-312.
Sykes, J., Oskoz, A., & Thorne, S. L. (forthcoming). Second language use, socialization, and learning in
Internet interest communities and online gaming. Retrieved from
http://language.la.psu.edu/~thorne/Thorne_etal_MLJ_2009_Draft.pdf
Tamborini, R., Eastin, M., Lachlan, K., Skalski, P., Fediuk, T., & Brady, R. (2001, May). Hostile
thoughts, presence and violent virtual video games. Paper presented at the 51st annual convention of the
International Communication Association, Washington, D.C. Retrieved from
http://info.cas.msu.edu/icagames/HTP.pdf
Taylor, L. N. (2002). Video games: Perspective, point-of-view, and immersion (Unpublished Master's
Thesis). University of Florida, Florida. Retrieved from http://purl.fcla.edu/fcla/etd/UFE1000166
Tomlin, R., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in
Second Language Acquisition, 16, 183-204.
Turkle, S. (1985). The second self. New York: Simon & Schuster.
Um, E. J., & deHaan, J. (2005). Video games and cognitive apprenticeship-based learning. Proceedings of
the Association for Educational Communications and Technology, USA, 28, 499-505.
Wolf, M. J. P. (2001). The medium of the video game. Austin: University of Texas Press.
Wolf, M. J. P., & Perron, B. (2003). The video game theory reader. New York: Routledge.
Zheng, D., Young, M. F., Brewer, R. B., & Wagner, M. (2009). Attitude and self-efficacy change:
English language learning in virtual worlds. CALICO Journal, 27, 205-231.
Glenn Stockwell
or details of upcoming television programs they wanted the learners to watch. Each of these studies
capitalises on different features of mobile phones (e-mail, web browsers and SMS) and illustrates the
broad potential of the phone as a learning tool.
Mobile Phones and Vocabulary Learning
Studies investigating using mobile phones for learning vocabulary have also started to appear in the
literature, and the nature of the activities and the focuses of the research have been varied. Browne and
Culligan (2008), for example, provide an overview of an environment where learners complete activities
on a computer, after which time they can access vocabulary flash cards on their mobile phones that are
generated based on items that the system predicts that they need to work on. In their study, a description
is given of how the activities are beneficial, specifically that targeted items are provided for learners to
study at a time and place that suits them, but details are not given regarding how the system was actually
used by the learners. Another example is described by Thornton and Houser (2005), where learners were
asked to access video lessons about English idioms from their mobile phones during class time and
complete short multiple choice activities about the idioms they had learnt, also on their mobile phones in
class. The materials were given a positive evaluation by the learners, who found them both fun and useful.
One study that attempted to investigate the way in which learners acquire vocabulary through mobile
phones was conducted by Chen, Hsieh, and Kinshuk (2008). Learners deemed to have varying verbal and
visual learning skills according to an online survey into short-term memory abilities were provided with
four different types of annotations for learning English vocabulary depending on their learning
preferences determined in the survey. Flashcards were sent to their mobile phones via SMS which
included one of four different types of annotation: English word only, English word with written
annotation, English word with pictorial annotation, and English word with both written and pictorial
annotation. The flashcards were viewed in the classroom and learners were given 50 minutes to learn 24
vocabulary items. In a post-test carried out immediately after the activities on desktop computers in the
classroom, they found that the pictorial annotation assisted learners who had lower verbal and higher
visual ability to retain vocabulary, at least in the short term.
Studies that have looked at actual mobile phone use outside the classroom include research into sending
messages to learners' mobile phones by Thornton and Houser (2005) and Kennedy and Levy (2008). Both
studies were based on the "push" mode of operation, that is, where teachers control the frequency and the
timing of messages sent to learners. Thornton and Houser sent short mini-lessons for learning vocabulary
via e-mail to learners' mobile phones three times a day, using new words in multiple contexts to allow
learners to infer the meanings. Similarly, learners in Kennedy and Levy's study were given the option to
receive messages which presented known words in new contexts and new words in contexts that were
familiar to the learners through SMS to their mobile phones at an average of nine to ten messages per
week. A survey was administered in both studies, and in each case indicated that learners felt that these
messages were very helpful for learning vocabulary, although some indicated that the messages were too
frequent. To determine effectiveness, the learners who received the e-mail mini-lessons in Thornton and
Houser's study were compared against learners who could access the same materials through a website
designed for the mobile phone and learners who were given the materials on paper. The study only ran for
a two-week period but showed that the learners who received the e-mails scored better on post-tests
compared with the other two groups. No measures of effectiveness were conducted in Kennedy and
Levy's study.
A limitation plaguing research into using mobile phones for language learning, however, is that much of it
occurs in artificial environments, generally within the classroom itself. In order to get a real indication of
the nature of mobile learning, it is necessary to view its use in naturalistic settings. One study in which
learners' mobile phone usage was tracked outside the classroom was conducted by Stockwell (2008) with
75 pre-intermediate learners of English. Learners were provided with tailored vocabulary
activities based on listening activities covered in class, and were able to complete these either on mobile
phones through the Internet browser function on their phones or on a normal desktop or laptop computer
(PC). Surveys and server log data revealed that learners used the PC in preference to the mobile phone in
the vast majority of cases, and many learners indicated from the outset that they did not intend to use the
mobile phone for their vocabulary study, citing problems such as the cost of Internet access, the screen
size, the keypad and the study environment as the primary reasons. The study showed that 61% of
learners did not use the mobile phone at all, with a further 24% of learners using the mobile phone for less
than 20% of the vocabulary activities. Depending on their usage patterns across the period investigated,
Stockwell classified learners as non-users, try-and-quit users, sporadic users, balanced users, or heavy
users of the mobile phone. Many learners who indicated that they intended to use the mobile phone in a
pre-survey either did not do so or used the mobile phone very minimally (and, incidentally, there were
also learners who indicated that they did not initially intend to use the mobile phone, but later used it
relatively frequently). While the reasons cited by learners such as cost, screen size and the inconvenient
keypad shed some light on why some learners chose not to use the mobile phone for the activities, they
give us only minimal insight into sporadic or balanced usage of the phone.
This leads us to ask what it is that causes learners to choose to use one platform or another at specific
times. Obviously, environmental factors such as access to a PC at a particular time would be expected to
play some role, but are there other factors involved as well? Data from the Stockwell (2008) study tend to
indicate that reasons behind platform selection are complex. There was a clear novelty effect, with the
mobile phone being used for 17% of the first lesson but only 3.2% of the last lesson. However, the increases
and decreases in use across the semester suggest that this was not the only factor contributing to choice of
platform. Of interest was the fact that even those learners who did not use the mobile phone provided
positive comments about the concept of learning through this platform, but nonetheless, it appears that
their concerns outweighed the perceived benefits of using the mobile phone.
These studies have suggested that while learners have a positive view of mobile learning, and feel that
there are potential benefits, not all students are willing to engage in it. What is it that leads learners to
make such decisions? This question is dealt with in the following section.
Effects of the Mobile Platform on Learning
Current discussions on the use of mobile devices in learning environments generally focus on the
affordances of the device, the skills and attitudes of the learner, and the environmental constraints of
learning through a mobile platform. In her FRAME model (Framework for the Rational Analysis of
Mobile Education), Koole (2009) proposes that mobile learning occurs in an intersection of device,
learner and social aspects. Because the device acts as the interface between the learner and the activities,
she argues that it is important to assess characteristics such as the physical characteristics (e.g., size and
weight), input capabilities (e.g., keypad or touchpad), output capabilities (e.g., screen size and audio
functions), file storage and retrieval, processor speed, and the error rates (i.e., malfunctions which result
from flaws in hardware, software and/or interface design). In addition, she proposes that learner skills also
play a central role, and prior knowledge and experience with mobile devices for learning, as well as
feelings towards activities, can either positively or negatively affect the way in which learners engage
themselves with mobile-based tasks. Though not directly discussed by Koole, consideration of the
environment in which learning occurs and the psychological barriers is paramount (Wang & Higgins,
2006). If learners feel that the mobile environment is not conducive to learning, it can have a detrimental
effect on the way in which activities are undertaken. Learners in Stockwell's (2008) study, for example,
wrote that the mobile phone was "not a tool for studying" and that they "couldn't get into study mode
with the mobile" (p. 260) as reasons for not using the mobile phone to complete the activities.
This perception of the problems with the mobile device as a learning tool is one that must be overcome if
mobile phones are to enter the mainstream. In saying this, however, it is a problem that may eventually
cease to be an issue by itself as acceptance of the mobile phone as a learning tool becomes more
widespread. Where once many students and teachers may have had a less than positive view towards
computers for learning, we now see computers used with learners of all ages, and in the spirit of "digital
natives," a term coined by Prensky (2001), there are learners who perceive learning through computers as
more natural than through more traditional means. Until we reach this stage with mobile learning,
however, there is a need to design our learning environments around the existing technologies in such a
way as to encourage learners to feel comfortable with working within them. Unfortunately, as Kukulska-Hulme (2005) argues, many of the mobile devices that learners have access to are simply not designed for
educational purposes, meaning that learners find them difficult to use for the activities that teachers
expect them to undertake. While some of the blame for this may lie with developers, in many cases it is
not that such devices do not exist, but rather that those that are appropriate for specific purposes are too
expensive for many learners. The goal for teachers, then, is to be aware of what tools learners possess,
and to choose and/or adapt resources to suit these tools.
Stockwell's (2008) study showed that even when materials were adapted to their mobile phones, when
given a choice of whether to use their mobile phone or a PC for completing vocabulary-learning activities,
the overwhelming majority of learners chose the PC. The purpose of the current study was to look
specifically at whether there were any features inherent to completing activities on the mobile platform
that may have affected learners' decisions to use a PC rather than their mobile phone (and vice-versa). Do
they achieve higher scores with the PC and so opt to use the PC instead? Alternatively, do learners find that
the activities take too long to complete on the mobile phone? Rather than looking at subjective survey
data, the current study looks at the actual scores achieved by learners depending on the activity type, the
amount of time spent engaged on the activities, and any longitudinal differences that were apparent across
the period investigated. In order to investigate these issues, the following research questions were posed:
1. Are there differences in the scores achieved in activities completed on mobile and PC platforms?
2. Are there differences in the time required to complete activities on mobile and PC platforms?
3. Do learners improve in speed and scores over time on each platform?
The method used for the study is described in the following section.
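As an illustration only, the three research questions above can be read as simple aggregations over per-attempt records. The sketch below is in Python and assumes a hypothetical record layout (learner, lesson, platform, score, seconds) with invented values; it does not reproduce the data or the statistical procedures of the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical attempt records: (learner_id, lesson, platform, score_pct, seconds).
# All values are invented for illustration and are not data from the study.
ATTEMPTS = [
    ("s01", 1, "pc", 80.0, 95),
    ("s01", 2, "mobile", 75.0, 140),
    ("s02", 1, "pc", 90.0, 80),
    ("s02", 2, "pc", 95.0, 70),
    ("s03", 1, "mobile", 70.0, 160),
    ("s03", 2, "mobile", 80.0, 120),
]


def summarise_by_platform(attempts):
    """Return {platform: (mean score, mean seconds)}, as in questions 1 and 2."""
    grouped = defaultdict(list)
    for _learner, _lesson, platform, score, seconds in attempts:
        grouped[platform].append((score, seconds))
    return {
        platform: (mean(s for s, _ in rows), mean(t for _, t in rows))
        for platform, rows in grouped.items()
    }


def trend_by_platform(attempts):
    """Return {platform: {lesson: (mean score, mean seconds)}}, as in question 3."""
    grouped = defaultdict(lambda: defaultdict(list))
    for _learner, lesson, platform, score, seconds in attempts:
        grouped[platform][lesson].append((score, seconds))
    return {
        platform: {
            lesson: (mean(s for s, _ in rows), mean(t for _, t in rows))
            for lesson, rows in sorted(lessons.items())
        }
        for platform, lessons in grouped.items()
    }
```

Grouping by platform first, and then by lesson within each platform, keeps the cross-platform comparison separate from the longitudinal one. Any inferential test would be applied on top of these summaries.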
METHOD
Participants
The study was conducted over a three-year period with three individual cohorts coming to a total of 175
learners. All learners were enrolled in a compulsory first-year English-language subject in the School of
Law at Waseda University, Tokyo, which focussed predominantly on improving listening skills and
vocabulary. The three cohorts were made up of students enrolled in a total of seven classes; three classes
in 2007 (n = 80), two classes in 2008 (n = 50) and two classes in 2009 (n = 45). All seven classes were
taught by the author and the teaching approach and content were the same. The classes were not streamed,
and learners were randomly assigned by the faculty's student affairs office. Learners were mostly fresh
out of high school, and were all aged between 18 and 21. They were generally considered as being of a
pre-intermediate level, with TOEIC scores ranging between 450 and 650, the majority being at the lower
end of the range. Most learners were quite motivated to achieve high scores in the subject in order to
maintain their GPA (Grade Point Average) to enable them to enter the graduate law school. According to
a survey administered at the end of the semester, none of the learners had ever used a mobile phone for
language learning before taking the subject. The survey also indicated that nearly two-thirds of the
learners had planned to use the mobile phone at the beginning of the semester.
Classes were held once a week for a 15-week period, and as class time was dedicated mainly to listening
activities, learners were required to study vocabulary outside of class (more details are provided in the
following section). Vocabulary activities based on the textbook materials were developed and made
available to learners either on PC or on their mobile phones. An orientation on how to use the activities
was given in the first class in the semester, which included showing learners how to log in and complete
each activity type. Time was also spent on ensuring that learners understood how to complete the
activities, and learners were given the opportunity to ask questions about how to use the system on both
the PC and the mobile platforms in class. The vocabulary activities were included as part of the
assessment for the subject (10% of the overall score for the subject), and learners were told that they
could choose between using a PC, their mobile phone, or any combination that they wanted to during the
semester. It was emphasised that there was no pressure to use one platform or the other, and that activities
completed on either platform would be included in their progress equally. They were also informed that
they could switch between platforms at will, and could even start a lesson on one platform and complete it
on the other if they saw fit. They were told in advance of the study that data would be collected and used
for research and further development purposes, but that records would be collected anonymously with no
information linking their scores to their identity. Scores for individual activities were not included as part
of the assessment, and learners were assigned a grade depending on how many of the ten lessons they
completed. Records of completed lessons were correlated with real names at the end of each semester so
that grades could be assigned, with a score of 10 being awarded if all activities were completed by the end
of the semester.
Learners were able to access the PC version of the materials from any computer, including the university
computer laboratories or their homes. The mobile phone version was also available anywhere that
learners had a signal, and provided that they had Internet capabilities on their phones. All 175 learners
indicated orally in the first class that they did have a phone with this function. The mobile system was
trialled with Internet-capable handsets of each of the three major mobile phone carriers in Japan (NTT
DoCoMo, KDDI AU, and Softbank) and no incompatibility issues were apparent. Apart from fixing some
minor bugs such as formatting errors, the system remained essentially unchanged across the three-year
period.
Vocabulary Activity System
The vocabulary activity system entitled VocabTutor was the same system that was used in both the
Stockwell (2007b and 2008) studies, so only a brief description has been included here. The system was
written in PHP and MySQL, and integrated into Moodle, which was used for management of class grades
and for provision of the audio passages for the lessons covered in the textbook. When students undertook
the vocabulary activities on PC, they were required to log into the Moodle system, and then follow a link
to the activities. The mobile system was the same as the PC system in content, but had a simplified
interface to fit the smaller screen, and graphic images were removed to make loading faster. Learners
accessed the mobile version from a different address on the same server, and logged in using the same
user name and password as they used for Moodle. Both the PC and mobile versions accessed the same
databases, and regardless of which platform was used for completing the activities, they were recorded in
exactly the same way, but a record was kept of which platform was used.
The vocabulary activities for each lesson were designed to include more passive activities in the
beginning, only requiring learners to select the correct word from a list of alternatives, through to more
productive activities at the end, where learners were required to write the correct word in the appropriate
tense. There were initially six different activity types included with the system, but one was dropped in
the first week of the first cohort because of a programming error, and to maintain consistency was
excluded from the two subsequent cohorts. The remaining activity types included: (a) choosing the
appropriate word for an English sentence, (b) choosing the appropriate English word for a Japanese
meaning, (c) choosing the appropriate English word for an English definition, (d) writing a word in
English for an English definition, and (e) writing the appropriate English word for an English sentence.
Writing consisted of no more than a single word per question in both the PC and mobile systems. This
was to maintain a simpler interface when considering the mobile platform, and to keep as much
consistency as possible between the mobile and PC platforms. Examples of each of the question types are
included in the Appendix. Note that the question type and format were the same for each platform, so the
examples given in the appendix apply to both platforms.
The system consisted of an intelligent engine that adapted to the learners depending on whether they
scored correctly or incorrectly for each vocabulary item. Items for each lesson were presented in random
order, and then assigned a competency score depending on learner responses. If the learner answered
correctly the first time an item appeared, it was assigned an initial competency score of 6, whereas if they
scored incorrectly, the item was assigned an initial competency score of 3. For each correct response, the
competency score for each item increased by 1, and decreased by 1 for each incorrect response. An item
was considered as known by the system if it reached a competency score of 8. These numbers were
selected to ensure that even if the learner got an item correct the first time, it would need to be correct a
further two times in a row before being considered as known, and that if they got an item wrong the first time
it appeared, the learner would need to get it correct a further five times in a row for it to be considered as known. Even
items considered as known were included in the activities periodically to ensure that the learner was
still able to answer correctly. Items with a lower competency score were presented to the learner more
frequently than those with a higher competency score, and learners needed to attain a competency score
of at least 8 with all items in a given lesson before they were able to go on to the next lesson. As a result,
those learners who scored more correct answers were able to progress through the lessons more quickly
than those that made errors. Each lesson included 13-17 vocabulary items which were selected from the
commercially produced textbook. Where multiple-choice questions were given to learners, distracters
were automatically generated by the system, matching the required word in both part of speech and tense.
Extensive testing was done to ensure that duplicate correct answers did not appear in the questions.
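The scoring rules described above can be sketched in a few lines of Python. This is an illustrative reconstruction only, not the VocabTutor source (which was written in PHP and MySQL): the function names are hypothetical, and the weighting used to present lower-scoring items more often is an assumption, since the text does not give the actual selection formula.

```python
import random

KNOWN_THRESHOLD = 8   # score at which an item is considered as known
FIRST_CORRECT = 6     # initial score when the first response is correct
FIRST_INCORRECT = 3   # initial score when the first response is incorrect


def update_score(score, correct):
    """Return the competency score after one response.

    `score` is None until the item has appeared once.
    """
    if score is None:
        return FIRST_CORRECT if correct else FIRST_INCORRECT
    return score + 1 if correct else score - 1


def is_known(score):
    """An item counts as known once its score reaches the threshold."""
    return score is not None and score >= KNOWN_THRESHOLD


def lesson_complete(scores):
    """The next lesson opens only when every item has reached the threshold."""
    return all(is_known(s) for s in scores.values())


def pick_next_item(scores, rng=random):
    """Pick the next item, favouring lower scores.

    ASSUMPTION: the linear weighting below is hypothetical. Known items keep
    a weight of 1 so that they still reappear periodically.
    """
    items = list(scores)
    weights = []
    for item in items:
        s = FIRST_INCORRECT if scores[item] is None else scores[item]
        weights.append(max(1, KNOWN_THRESHOLD + 1 - s))
    return rng.choices(items, weights=weights, k=1)[0]
```

Under these rules an item answered correctly on first appearance starts at 6 and reaches 8 after two further correct responses, while an item answered incorrectly starts at 3 and needs five further correct responses in a row.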
The vocabulary activities were linked closely to in-class activities. As mentioned above, the primary focus
of the class was listening, and the textbook contained several comprehension activities and
communication tasks based on short videos. Each lesson of the textbook contained a number of key
vocabulary items which were both highlighted in example sentences and the subject of a simple word-matching
task. Learners could listen to an audio version of the video either directly through the Moodle
system, or they could download an MP3 version of the audio that could be transferred to their phone or
other MP3 player. The class in the following week included a short listening cloze quiz where the
vocabulary items were omitted from a transcript of the video. The vocabulary activities were designed to
assist in learning the vocabulary which came up in the lessons to help with both the in-class activities and
the listening cloze quiz.
Data Collection and Analysis
The data were collected through detailed server logs that were automatically kept by the system. No
vocabulary pre- or post-test was undertaken as the objective of the study was not to investigate learner
development with the vocabulary but rather to identify how learners used their mobile phones for
language learning when they had alternative methods of completing the activities, and their performance
on both platforms. The server logs kept records of, among other things, the platform the learners used to
complete an activity, the lesson number, the type of activity, the time the activity was started and
completed, and the score attained for the activity. Other information including the number of attempts on
each item, along with the overall accuracy measures of the vocabulary items was also recorded in each
learner's profile, but these were not used in the current study.
As the combined logs contained over 54,000 entries, the author developed software to organize the data
for analysis. Data were broken down into the total number of activities completed by each learner, the
platform used, the time taken to complete each activity, and the scores that were achieved. Detailed
figures were generated to determine whether learners used one platform or the other consistently, or in
bursts across the semester. Comparisons across the three years were also made, which were examined for
patterns in mobile usage. The results of the study are presented in the following section.
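As a rough illustration of the kind of processing described above, the following Python sketch groups log entries by learner and platform and derives activity counts, mean completion time, and mean score. The author's actual software is not described, so the `LogEntry` fields and function names here are hypothetical; they simply mirror the items the server logs are said to record.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEntry:
    # Hypothetical record layout based on the fields listed in the text.
    learner_id: int
    platform: str        # "pc" or "mobile"
    lesson: int
    activity_type: int
    started: datetime
    completed: datetime
    score: float


def summarise(entries):
    """Group entries by (learner, platform) and compute summary figures."""
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry.learner_id, entry.platform)].append(entry)

    summary = {}
    for key, group in groups.items():
        minutes = [(e.completed - e.started).total_seconds() / 60 for e in group]
        summary[key] = {
            "activities": len(group),
            "mean_minutes": sum(minutes) / len(minutes),
            "mean_score": sum(e.score for e in group) / len(group),
        }
    return summary


def mobile_share(summary, learner_id):
    """Percentage of a learner's completed activities done on the mobile phone."""
    mobile = summary.get((learner_id, "mobile"), {}).get("activities", 0)
    pc = summary.get((learner_id, "pc"), {}).get("activities", 0)
    total = mobile + pc
    return 100 * mobile / total if total else 0.0
```

Keeping the platform in the grouping key means the same summary can feed both the per-platform comparisons and the per-learner proportion of mobile use.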
RESULTS
The data were first analysed to determine the proportion of mobile phone use by learners. A summary of
mobile phone usage across all three cohorts (N = 175) is presented in Figure 1. There was a significant
number of learners who did not use the mobile phone at all, but rather elected to complete all activities on
the PC. Learners who completed all of the activities on mobile phone (100%) or all activities on PC (0%)
are indicated separately. As can be seen in the figure, 60% (105 learners) did not use the mobile phone
at all for the activities, and a further 18.9% (33 learners) used the mobile phone for 20% or less of the
activities completed. Only very small numbers of learners used the mobile phone for the majority of the
activities, with just 3 learners (1.7%) electing to use the mobile phone for all of the vocabulary activities.
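The usage bands and percentages reported above can be reproduced with a short Python sketch. The band labels follow those used in the study; how a fractional share such as 20.4% is assigned to a band is an assumption made here for illustration, and the function names are hypothetical.

```python
def usage_band(mobile_pct):
    """Map a learner's mobile share (0 to 100) to a usage band.

    ASSUMPTION: any share above 0 and up to 20 falls in "1-20", above 20 and
    up to 40 in "21-40", and so on; the source does not state how fractional
    shares were handled.
    """
    if not 0 <= mobile_pct <= 100:
        raise ValueError("mobile_pct must be between 0 and 100")
    if mobile_pct == 0:
        return "0"
    if mobile_pct == 100:
        return "100"
    for upper, label in ((20, "1-20"), (40, "21-40"), (60, "41-60"), (80, "61-80")):
        if mobile_pct <= upper:
            return label
    return "81-99"


def percentage(count, total=175):
    """Share of the 175 learners, rounded to one decimal place."""
    return round(100 * count / total, 1)
```

With the counts given in the text, `percentage(105)`, `percentage(33)`, and `percentage(3)` give 60.0, 18.9, and 1.7 respectively.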
[Figure 1: bar chart. Horizontal axis: Percentage of Activities; vertical axis: Percentage of Students. Values: 0: 60.0; 1-20: 18.9; 21-40: 6.9; 41-60: 5.1; 61-80: 2.9; 81-99: 4.6; 100: 1.7]
Figure 1. Percentage of the vocabulary activities completed by the learners on the mobile phone (N =
175).
A breakdown of the mobile phone usage in the three cohorts can be seen in Table 1. The number of
learners who did not use the mobile phone at all came to 58.8% in 2007, rising to 78% in 2008, with a
sharp drop in 2009 to only 42.2%. A much wider spread of mobile usage was evident in 2009 when
compared to earlier years, and although 2009 was the only year in which no learners opted to use the
mobile phone for all activities, there was a larger proportion of learners compared with previous years
(11.1%) who used the mobile phone for more than 80% of the activities.
Table 1. Percentage of Activities Completed on Mobile Phones in Each Year (N = 175)
Percentage of Tasks
Completed on Mobile    2007          2009
100                    1 (1.3%)      0 (0.0%)
81-99                  3 (3.8%)      5 (11.1%)
61-80                  2 (2.5%)      2 (4.4%)
41-60                  2 (2.5%)      6 (13.3%)
21-40                  6 (7.5%)      3 (6.7%)
1-20                   19 (23.8%)    10 (22.2%)
0                      47 (58.8%)    19 (42.2%)
n                      80            45
Analysis of the data shows that there was not a great difference in the scores achieved as a result of the
platform. As Figure 2 illustrates, while the degree of difficulty of the activity seemed to be higher as the
tasks became more active (as shown by the lower scores), the scores achieved on both platforms were
generally very similar. Some scores were marginally higher on the PC (Activity 3 and Activity 5) and
others were marginally higher on the mobile (Activity 4). There was some degree of variation in learner
scores for each of the activities, with standard deviations of 15.44, 22.32, 25.14, 29.98, and 29.26
respectively for the PC, and 15.70, 22.81, 26.29, 32.15, and 32.24 for the mobile, indicating that there
was a greater variation in the scores for the more difficult writing activities.
[Figure 2: bar chart comparing PC and Mobile. Horizontal axis: Activity Type (Activity 1 to Activity 5); vertical axis: Score (0 to 100)]
Figure 2. The scores achieved for each activity on both PC and mobile phone (N = 175).
A general overview of the amount of time required to complete each activity on the PC and mobile
platforms for all learners was generated, as can be seen in Figure 3. The figure shows that each of the
activities took significantly longer to complete on the mobile phone when compared with the PC,
generally requiring around 1.4 minutes more for each activity. Despite the fact that Activities 1, 2, and 3
were multiple choice questions where the learner simply clicked on a radio button before each option, and
that Activities 4 and 5 required learners to write complete words in English, there did not seem to be a
particularly big difference in time for the production activities compared with the multiple choice.
[Figure 3: bar chart comparing PC and Mobile. Horizontal axis: Activity Type (Activity 1 to Activity 5); vertical axis: Minutes (0 to 5)]
Figure 3. The number of minutes required to complete each activity on both PC and mobile phone (N =
175).
While general data regarding all 175 participants in the study were informative, comparisons of data
specific to the activities on each platform required sufficient activities to have been completed on both the
PC and the mobile phone. As such, those learners who completed between 21% and 80% of the activities
on mobile phone were selected for comparison, which came to a total of 26 learners. Samples of the usage
patterns during the semester have been included in Figure 4. Mobile phone usage was indicated with a
circle (o) while PC usage was indicated with a line (-). The column marked Number indicates the
number of the student in the database records (not their university student number). The results showed
that there was a great deal of variation in how the learners used the mobile phone. As the examples shown
in Figure 4 demonstrate, there was extended usage of one platform followed by short bursts on the other
for some learners (Student 602), while others used both platforms in a relatively balanced manner
(Student 667). For learners who mainly used the PC, there were rather random instances of a single
attempt on the mobile phone, but this pattern was rare for users who mainly used the mobile. There were
also cases where some learners started mainly on one platform, and then over a period of time changed to
predominantly use the other, sometimes with occasional use of the original platform until the end of the
semester. There was also variation in the total number of activities that were completed, ranging from 95
through to 167. This depended on how correct the responses were, and if there were a lot of errors in the
responses, learners were required to do a higher number of activities in order to complete all ten lessons.
The learner who was only required to complete 95 activities achieved perfect scores for almost every
activity.
Student   Pattern
602       ---o-o--oooooooooooo----oooooo---o-o--oooo-----ooo------ooooo---ooo
667       ----ooooo--------------oooooooooooo--------oooooooooooooo----------
Lesson   PC     Mobile   Overall
1        81.1   83.4     82.2
2        81.0   77.5     79.6
3        81.4   86.3     84.3
4        84.5   79.7     82.2
5        84.6   82.2     83.8
6        83.8   83.1     83.4
7        85.0   82.5     83.5
8        89.7   86.7     88.0
9        86.2   93.1     89.5
10       87.1   89.0     87.7
The mean amount of time taken to complete the activities showed a much more definitive difference. The
overall time has been broken down by lesson in Table 3. The table shows that it took consistently longer
to complete the lessons on the mobile phone when compared with the computer, and that this time
difference did not appear to decrease throughout the semester. For the most part, lessons that took slightly
less time on the PC also took a comparatively shorter amount of time on the mobile phone, and the
difference was relatively consistent at around 1.3 to 2.0 minutes. As with the scores for the activities,
there were no real trends that became apparent across the semester. It was interesting to note, however,
that only eight of the 26 learners completed the final lesson, Lesson 10, on their mobile phone, preferring
to use the computer instead. Similarly, only nine learners used it to complete Lesson 5. This pattern may
have been a result of the tests which were held after the 5th and the 10th lessons, and was a tendency that
was also seen in the earlier study (Stockwell, 2008). Apart from these two lessons (and a slightly lower
figure for Lesson 9), overall usage remained relatively consistent on both platforms throughout the
semester.
Table 3. Mean Time (Minutes) Taken to Complete the Activities per Lesson (n = 26)
Lesson   PC     Mobile   Overall
1        2.23   3.97     3.08
2        1.91   3.78     2.68
3        2.54   4.49     3.72
4        2.13   3.46     2.77
5        1.61   3.20     2.18
6        2.30   3.58     2.99
7        2.01   3.62     2.95
8        2.00   4.03     3.15
9        1.89   3.48     2.65
10       1.60   3.16     2.14
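The per-lesson gap between the two platforms can be checked directly from the PC and mobile means in Table 3. The short Python sketch below is only an arithmetic check on the published means, not part of the original analysis, and the variable names are illustrative.

```python
# Mean minutes per lesson (Lessons 1 to 10), taken from Table 3.
pc = [2.23, 1.91, 2.54, 2.13, 1.61, 2.30, 2.01, 2.00, 1.89, 1.60]
mobile = [3.97, 3.78, 4.49, 3.46, 3.20, 3.58, 3.62, 4.03, 3.48, 3.16]

# Extra minutes needed on the mobile phone for each lesson.
diffs = [round(m - p, 2) for p, m in zip(pc, mobile)]

smallest_gap = min(diffs)
largest_gap = max(diffs)
mean_gap = sum(diffs) / len(diffs)

for lesson, diff in enumerate(diffs, start=1):
    print(f"Lesson {lesson}: mobile took {diff:.2f} minutes longer")
```

The differences range from 1.28 minutes (Lesson 6) to 2.03 minutes (Lesson 8), in line with the range of around 1.3 to 2.0 minutes reported in the text.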
One observation that was noted regarding platform choice was that learners for the most part completed a
lesson on one platform or the other, and although some learners occasionally made a single attempt on the
other platform midway through a lesson, the data indicated that usage was generally in blocks according
to lesson. Examination of the access logs confirms that learners did not tend to leave a long time in
between activities within a single lesson, and the majority of learners did the activities for each lesson in
quick succession, and then stopped after the lesson was completed.
In some cases it appeared that learners decided to swap from the mobile phone to the PC after activities as
a result of the amount of time taken to complete the activities. This swapping of platform took a couple of
different forms. The first type is exemplified by a learner who started on the activities exclusively using
the mobile phone, but after the first two lessons decided to try to use the PC in conjunction with the
mobile phone. While there was not a great difference in scores, the learner took significantly longer on
the mobile phone for the third and fourth lesson, and then chose to use the PC only for the fifth and sixth
lessons, swapping back and forth between the mobile and the PC for the remainder. The second type can
be seen in a learner that started the first two lessons using both the mobile phone and the PC, but did not
display a great difference in time between the two platforms. For the third lesson, they completed all of
the activities on the mobile phone, but took much longer than for the activities done on the phone for the
previous two lessons. As a result, they chose to use the PC for the fourth lesson, and completed the
activities very quickly, so then stuck with using the PC for the rest of the semester. It should be pointed
out, however, that this was not consistent for all of the learners. There were learners who took
considerably longer to complete activities on the mobile phone when compared with the PC but did not
change to the PC. In one such example, the learner completed the first two lessons using both the PC and
the mobile phone, but took much longer to do the activities on the mobile phone than the PC. Despite this,
they decided to use the mobile phone almost exclusively from the third lesson onwards for all of the
remaining lessons.
DISCUSSION
Features of the Mobile Platform
The study set out to shed light on the use of a mobile phone as a tool for completing interactive
vocabulary activities when compared with completing the same activities on a desktop (or laptop)
computer. There were three main questions that the study sought to answer, dealing with the scores
achieved and the time taken on each platform, and any longitudinal patterns that occurred across the
semester. The first question looked at the differences in scores achieved in activities completed on both
mobile and PC platforms. This showed no consistent difference between the mobile phone and the PC,
with some activities scoring higher on one platform and others scoring higher on the other. It was initially
expected that learners might score lower on the productive activities (Activity 4 and 5) on the mobile
phone when compared with the PC, as the smaller screen and keypad were definitely less convenient for
entering text. Surprisingly, no such difference was apparent, with the learners actually scoring slightly
higher on Activity 4 on the mobile than the PC, and a very marginal difference in favour of the PC for
Activity 5.
The second research question focused on the differences in time required to complete activities on mobile
and PC platforms. The results of the study demonstrated that, with the activities used in the current
study at least, there was a clear difference in the amount of time required to do activities on the mobile
phone compared with the computer. Although the interface was simplified as much as possible
to minimise the effects of the small screen and keypad, learning through the
mobile phone simply took much longer. The results indicated that the time taken to complete different types
of activity (i.e., multiple choice or word production) did not seem to be greatly affected by the platform,
and activities that took longer on PC seemed to take a proportionately longer amount of time on the
mobile phone.
It is difficult to tell whether the extra time taken to complete the activities on the mobile phone was a
result of the mobile phone interface or because of other unforeseen environmental issues that might arise
when completing tasks on a mobile phone. From an interface perspective, care was taken to ensure that
when a question was shown, all of the possible responses to that question could be seen on the screen at
the same time. However, if learners wished to check the answers of several questions in an activity, they
would have to scroll up and down through the screen in order to see all of the questions, whereas on the
PC this could be done with little or no scrolling at all. This scrolling may have contributed to the time
taken, particularly if the learner wanted to confirm all questions repeatedly before submitting their
answers.
Environmental factors which may have contributed to the extra time taken on the activities are more
difficult to determine, but it is possible to consider scenarios that might shed some light on why it took
longer on the mobile phone compared to the PC. The mobility of the platform means that learners could
do the activities in any conceivable environment where the phone can access the Internet, such as in a
train, in the library, walking down the street or in a coffee shop. Learners who complete activities on a
busy train, for example, may find it difficult to concentrate, as they are preoccupied with other things
around them, such as ensuring they do not miss their stop or keeping their balance if they are standing. As
a result, they may not do the activity as a single unit where they answer all questions at once, but instead
answer one question at a time, with their mind focussed elsewhere in between. This would likely give the
impression of taking much longer to complete the activities when in fact the actual time spent inputting
was much the same, but with additional periods where the activity window was open without working on
it. Obviously it is unlikely that all learners were in non-ideal environments as they completed the
activities, but there is the potential that it contributed in some cases. This could only be confirmed,
however, by having learners keep a detailed journal themselves of where they did each of the activities on
the mobile phone.
The final research question sought to determine whether learners improve in speed and scores over time
on each platform. The data did not seem to suggest that this was the case, and activities in some lessons
took longer than other lessons, regardless of the platform. Similarly, some lessons appeared to be more
difficult than others, and learners achieved higher scores in some than others with very little difference
between the scores on the mobile phone or PC. Was there, then, any effect evident after completing the
activities for a period of time? One interesting observation was that in the first couple of lessons, learners
were more likely to swap around between platforms during a single lesson. Once they had gotten used to
each platform, however, they were far less likely to change platform midway through the lesson. It
appears that the learners were using the initial lessons to determine how they felt about the platforms, and
then the subsequent lessons were completed on the platform that they felt best suited their surroundings or
feelings at a given time.
mobile phones for learning before the study, experience with other technologies was not determined. It is
conceivable that if learners had experience with PCs in advance of the study, they would naturally use the
platform that they were more comfortable with, hence this information would have been useful.
A further limitation was that the study did not specifically identify how many of the learners commuted and
how long was spent on commuting. Given the potential importance of commuting time for completing the
vocabulary activities on the mobile phone, this information could shed light on whether those who did not
use the mobile phone chose not to simply because they did not have relatively unbroken periods of time
without a computer in front of them where they could engage in the activities, such as might occur while
commuting, or if there were other factors involved.
A final limitation was that learners' usage habits with the mobile phone for non-learning purposes were
not known. The post-questionnaire identified that learners used their mobile phones for sending personal
e-mails, checking news and train information, but questions asking for details such as frequency were not
included on the questionnaire. Knowing what learners use their mobile phones for in non-learning
environments could provide valuable information to determine how mobile usage might be improved.
Implications for Further Research
As could be seen from the limitations above, several questions remain regarding learning vocabulary
through mobile phones. Firstly, the impact of how the questions are presented on both platforms requires
further investigation. In the current study, both the mobile and the PC displayed, depending on the
question type, eight to ten questions at a single time. If the number were reduced for the mobile phone, it
is possible that it may take less time, and hence learners may be more inclined to use the mobile. It should
be noted, however, that if the number of questions per activity is reduced, there would need to be more
activities undertaken to get through the vocabulary, hence there is the possibility of more access costs.
Further research would shed light on whether learners are prepared to do more activities (bearing in
mind the potential increased cost) in order to have a slightly simpler interface, or whether they prefer to
complete the activities on the mobile even though it is likely that it will take longer.
Secondly, as was noted in the limitations, the research did not reveal very much about the locations where the
mobile phones were used for completing the activities. Having learners keep a journal of exactly when
and where they used the mobile activities, along with detailed information of how they use their mobile
phones for non-learning activities, would be very valuable in terms of activity design. For example, if
learners frequently check e-mail on their mobile phones, sending reminders to their phones including a
link to activities, such as in the "push" mode employed by Kennedy and Levy (2008), may prompt them
to complete the activities rather than passively waiting for learners to access them of their own accord.
Obviously there are cost concerns (receiving e-mails on a mobile phone is not free) that mean that some
learners may not want this imposed on them. It would be interesting to see, however, whether providing
this option contributes to improved usage.
CONCLUSION
Mobile learning for language learning has reached a stage where it is starting to move out of the
classroom and into the real world. Through mobile phones, we have the potential to provide a rich
learning environment for our learners, but there are still issues that must be considered before they can
reach their full potential. Obviously, there is still the problem of the lack of willingness to try new mobile
technologies, but this is something that may slowly become less of an issue as perceptions change.
The fact that activities may take longer on mobile phones compared with computers does not necessarily
detract from their usefulness. Just as teachers needed time to find appropriate times and places for using
computers in the early days of CALL, the same could be said about the need to find a time and place for
mobile phones. Even with computers these concepts have changed as computers became connected to the
Internet, and learners no longer needed to visit laboratories but could access materials from home at times
that were convenient to them. With the mobile phone, there is an even greater sense of freedom of time
and place, but this freedom also can make it more difficult to make decisions about which times and
places are the most suitable.
Thus, freedom is about experimentation and making choices. This experimentation and decision-making
must occur on the part of both teachers and learners. Many learners in the current study were able to make
decisions about how to use the mobile phone as a direct result of trying both the PC and mobile platforms,
and determining what they felt was best for them. Examining how learners use mobile phones in natural
contexts can help teachers design activities and materials, and teachers can then
experiment to find what works best with their learners. As attitudes change over time, so too will learner
preferences and expectations with mobile technologies. Successful use of mobile technologies relies on
keeping up with our changing learners, and continuing to give them opportunities to experiment and
discover.
ACKNOWLEDGMENTS
I would like to express my gratitude to the three anonymous reviewers for their insightful suggestions that
led to significant improvements in this paper.
REFERENCES
Ally, M. (2009). (Ed.). Mobile learning: Transforming the delivery of education & training. Athabasca:
AU Press.
Browne, C., & Culligan, B. (2008). Combining technology and IRT testing to build student knowledge of
high frequency vocabulary. The JALT CALL Journal, 4(2), 316.
Chen, N.-S., Hsieh, S.-W., & Kinshuk. (2008). Effects of short-term memory and content representation
type on mobile language learning. Language Learning & Technology, 12(3), 93113. Retrieved
from http://llt.msu.edu/vol12num3/chenetal.pdf
Ducate, L., & Lomicka, L. (2009). Podcasting: An effective tool for honing language students
pronunciation? Language Learning & Technology, 13(3), 6686. Retrieved
from http://llt.msu.edu/vol13num3/ducatelomicka.pdf
Kennedy, C., & Levy, M. (2008). Litaliano al telefonino: Using SMS to support beginners language
learning. ReCALL, 20(3), 315350.
Kiernan, P., & Aizawa, K. (2004). Cell phones in task based learning. Are cell phones useful language
learning tools? ReCALL, 16(1), 7184.
Koole, M. (2009). A model for framing mobile learning. In M. Ally (Ed.), Mobile learning: Transforming
the delivery of education & training (pp. 2547). Athabasca: AU Press.
Kukulska-Hulme, A. (2005). Mobile usability and user experience. In A. Kukulska-Hulme & J. Traxler
(Eds.), Mobile learning: A handbook for educators and trainers (pp. 4556). London: Routledge.
Kukulska-Hulme, A., & Traxler, J. (2005). (Eds.). Mobile learning: A handbook for educators and
trainers. London: Routledge.
108
Glenn Stockwell
Levy, M., & Kennedy, C. (2005). Learning Italian via mobile SMS. In A. Kukulska-Hulme & J. Traxler
(Eds.), Mobile learning: A handbook for educators and trainers (pp. 7683). London: Routledge.
Ma, Q. (2004). Theoretical and design issues for a computer assisted vocabulary learning program:
WUFUN. Proceedings of the Eleventh International CALL Conference (pp. 241251). Antwerp:
University of Antwerp.
Prensky, M. (2001). Digital natives, digital immigrants. On the Horizon, 9(5), 16.
Rosell-Aguilar, F. (2007). Top of the podsIn search of a podcasting podagogy for language learning.
Computer Assisted Language Learning, 20(5), 471492.
Stockwell, G. (2007a). A review of technology choice for teaching language skills in the CALL literature.
ReCALL, 19(2), 105120.
Stockwell, G. (2007b). Vocabulary on the move: Investigating an intelligent mobile phone-based
vocabulary tutor. Computer Assisted Language Learning, 20(4), 365383.
Stockwell, G. (2008). Investigating learner preparedness for and usage patterns of mobile learning.
ReCALL, 20(3), 253270.
Taylor, R.P., & Gitsaki, C. (2003). Teaching WELL in a computerless classroom. Computer Assisted
Language Learning, 16(4), 275–294.
Thornton, P., & Houser, C. (2002). M-learning: Learning in transit. In P. Lewis (Ed.), The changing face
of CALL: A Japanese perspective (pp. 229–243). The Netherlands: Swets & Zeitlinger.
Thornton, P., & Houser, C. (2005). Using mobile phones in English education in Japan. Journal of
Computer Assisted Learning, 21(3), 217–228.
Wang, S., & Higgins, M. (2006). Limitations of mobile phone learning. The JALT CALL Journal, 2(1),
3–14.
1. motion
2. generator
3. awareness
4. fossil
3. Choose the appropriate English word for an English definition:
A remnant or trace of an organism of a past geologic age, such as a skeleton or leaf imprint,
embedded and preserved in the earth's crust.
1. flood
2. fossil
3. emission
4. disorder
4. Write a word in English for an English definition:
A remnant or trace of an organism of a past geologic age, such as a skeleton or leaf imprint,
embedded and preserved in the earth's crust: __________
5. Write the appropriate English word for an English sentence:
Scientists know about the existence of dinosaurs from __________.
CALL & technological hegemonies (including hegemonic implications of the Internet and Web,
commonly used Web 2.0 tools, and mobile technologies)
CALL & pedagogical hegemonies (including hegemonic implications of social constructivism and
associated interactive, collaborative, student-centred pedagogies; curriculum and course design; and
the design of open access materials and digital repositories)
CALL & educational hegemonies (including hegemonic educational and institutional policies,
expectations and norms)
CALL & social hegemonies (including the hegemonic implications of norms and practices of online
interaction)
CALL & inter/cultural hegemonies (including hegemonic implications of Western cultural norms
and Western approaches to tolerance, openness, relativism and the skills associated with intercultural
competence)
CALL & sociopolitical hegemonies (including the hegemonic implications of democratic structures in
education, and resistance to hegemonies)
Please send letter of intent and 250-word abstract by October 1, 2010 to llted@hawaii.edu
Publication timeline: