Table of contents

2. Loanwords in Iraqw, a Cushitic language of Tanzania. Maarten Mous and Martha Qorro
3. Loanwords in Gawwada, a Cushitic language of Ethiopia. Mauro Tosco
4. Loanwords in Hausa, a Chadic language in West Africa. Ari Awagana and H. Ekkehard Wolff, with Doris Löhr
5. Loanwords in Kanuri, a Saharan language. Doris Löhr and H. Ekkehard Wolff, with Ari Awagana
6. Loanwords in Tarifiyt, a Berber language of Morocco. Maarten Kossmann
7. Loanwords in Seychelles Creole. Susanne Michaelis, with Marcel Rosalie
8. Loanwords in Romanian. Kim Schulte
9. Loanwords in Selice Romani, an Indo-Aryan language of Slovakia. Viktor Elšík
10. Loanwords in Lower Sorbian. Hauke Bartels
11. Loanwords in Old High German. Roland Schuhmann
12. Loanwords in Dutch. Nicoline van der Sijs
13. Loanwords in British English. Anthony Grant
14. Loanwords in Kildin Saami, a Uralic language of northern Europe. Michael Rießler
15. Loanwords in Bezhta, a Nakh-Daghestanian language of the North Caucasus. Bernard Comrie and Madzhid Khalilov
16. Loanwords in Archi, a Nakh-Daghestanian language of the North Caucasus. Marina Chumakina
17. Loanwords in Manange, a Tibeto-Burman language of Nepal. Kristine A. Hildebrandt
18. Loanwords in Ket, a Yeniseian language of Siberia. Edward Vajda
19. Loanwords in Sakha (Yakut), a Turkic language of Siberia. Brigitte Pakendorf and Innokentij N. Novgorodov
20. Loanwords in Oroqen, a Tungusic language of China. Fengxiang Li and Lindsay J. Whaley
21. Loanwords in Japanese. Christopher K. Schmidt
22. Loanwords in Mandarin Chinese. Thekla Wiebusch and Uri Tadmor
23. Loanwords in Thai. Titima Suthiwan and Uri Tadmor
24. Loanwords in Vietnamese. Mark J. Alves
25. Loanwords in White Hmong. Martha Ratliff
26. Loanwords in Ceq Wong, an Austroasiatic language of Peninsular Malaysia. Nicole Kruspe
27. Loanwords in Indonesian. Uri Tadmor
28. Loanwords in Malagasy. Alexander Adelaar
29. Loanwords in Takia, an Oceanic language of Papua New Guinea. Malcolm Ross
30. Loanwords in Hawaiian. 'Ōiwi Parker Jones
31. Loanwords in Gurindji, a Pama-Nyungan language of Australia. Patrick McConvell
32. Loanwords in Yaqui, a Uto-Aztecan language of Mexico. Zarina Estrada Fernández
33. Loanwords in Zinacantán Tzotzil, a Mayan language of Mexico. Cecil H. Brown
34. Loanwords in Q'eqchi', a Mayan language of Guatemala. Søren Wichmann and Kerry Hull
35. Loanwords in Otomi, an Otomanguean language of Mexico. Ewald Hekking and Dik Bakker
36. Loanwords in Saramaccan, an English-based creole of Suriname. Jeff Good
37. Loanwords in Imbabura Quechua. Jorge Gómez Rendón and Willem Adelaar
38. Loanwords in Kali'na, a Cariban language of French Guiana. Odile Renault-Lescure
39. Loanwords in Hup, a Nadahup language of Amazonia. Patience Epps
40. Loanwords in Wichí, a Mataco-Mataguayan language of Argentina. Alejandra Vidal and Verónica Nercesian
41. Loanwords in Mapudungun, a language of Chile and Argentina. Lucía A.

Most of the authors were able to attend one or more of the ten meetings between 2003 and 2007, organized competently by Max Planck staff members; we thank in particular Julia Cissewski, Claudia Büchel, Peter Fröhlich and Claudia Schmidt.
We also had great help from a number of highly motivated and reliable student assistants, not only in checking and correcting the databases, but also in editing and even typesetting this volume. Thanks are due especially to Eva-Maria Schmortte, Birgit Jänen, Luise Dorenbusch, Jenny Seeg, Alex Jahraus, and Tyko Dirksmeyer. For the maps, we had invaluable help from Sandra Michaelis from Max Planck's multimedia department. But the most important person over the years has been our indefatigable database manager, Bradley Taylor, without whom this project would have had to remain much more modest in its goals and achievements. For the creation of the online version of the World Loanword Database, we are grateful to the Max Planck Digital Library, especially Robert Forkel.

Lindsay J. Whaley
Linguistics and Cognitive Science, Dartmouth College
6086 Reed Hall, Hanover, NH 03755, USA
E-mail: lindsay.whaley@dartmouth.edu
Homepage: http://www.dartmouth.edu/~linguist/faculty/whaley.html

Søren Wichmann
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6, 04103 Leipzig, Germany
E-mail: wichmann@eva.mpg.de
Homepage: http://email.eva.mpg.de/~wichmann/

Thekla Wiebusch
Centre de Recherches Linguistiques sur l'Asie Orientale (CRLAO), CNRS
54 Bd. Raspail, 75006 Paris, France
E-mail: thekla.wiebusch@hotmail.fr

H. Ekkehard Wolff
Institut für Afrikanistik, Universität Leipzig
Mailing address: Schleiblick 5, Weseby, 24354 Kosel, Germany
E-mail: wolff@uni-leipzig.de
Homepage: http://www.uni-leipzig.de/~afrika

Chapter I
The Loanword Typology project and the World Loanword Database
Martin Haspelmath and Uri Tadmor

1. Goals and collaborative design

How likely is it that a word with a given lexical meaning would be borrowed from one language into another? Before we set out on the Loanword Typology project, this question could be answered mostly on the basis of impressionistic observations, such as "body part terms are unlikely to be borrowed", or "terms for new artifacts are often borrowed".
In contrast, the research reported in this book is an empirical study of borrowability: the relative likelihood that words with particular meanings would be borrowed. In the study, we used the classical methods of linguistic typology:

(i) establishing a worldwide sample of languages (41 languages, see §2)
(ii) surveying the types of loanwords found in these languages, on the basis of a fixed list of lexical meanings (1,460 lexical meanings, see §3)
(iii) attempting generalizations across the languages of the sample (see chapter III on findings and results).

There are a variety of reasons why one would like to know, for each lexical meaning, what its chances of being borrowed are:

(a) In assessing genealogical relatedness between languages, it is important to separate inherited material from borrowed material. Loanwords point to historical contact between two languages (the presence of people with at least some knowledge of both languages at some stage), but not to genealogical relatedness (i.e. descent from a common ancestral language). But which words are the most likely to be inherited? Linguists often assume that there is a set of words that are highly stable, unlikely to be replaced by borrowings, meaning shift, or new formations (or to become obsolete without replacement), but this notion needs to be made more precise to enhance the validity of conclusions based on it.

(b) The likelihood of lexical borrowing depends on the type of contact situation.
A language of a population under the political control of another group may be likely to borrow administrative terms from the dominant group's language, and seafaring populations may contribute marine words to languages spoken inland. An invading population may borrow terms for local flora and fauna even if it is technologically and economically superior to the indigenous […]

Table 1: The LWT project languages and the contributors

Language | Affiliation | Main location(s) | Contributor(s)
Archi | Lezgic, Nakh-Daghestanian | Daghestan, Russian Federation | Marina Chumakina
Bezhta | Tsezic, Nakh-Daghestanian | Daghestan, Russian Federation | Bernard Comrie and Madzhid Khalilov
Ceq Wong | Aslian, Austro-Asiatic | West Malaysia | Nicole Kruspe
Dutch | Germanic, Indo-European | Netherlands | Nicoline van der Sijs
English | Germanic, Indo-European | Britain, USA, Canada, Australia | Anthony Grant
Gawwada | Cushitic, Afro-Asiatic | Ethiopia | Mauro Tosco
Gurindji | Pama-Nyungan | Australia | Patrick McConvell
Hausa | Chadic, Afro-Asiatic | Nigeria, Niger | Ari Awagana and H. Ekkehard Wolff, with Doris Löhr
Hawaiian | Polynesian, Austronesian | Hawai'i | 'Ōiwi Parker Jones
Hup | Nadahup | Brazil, Colombia | Patience Epps
Imbabura Quechua | Quechua | Ecuador | Jorge A. Gómez Rendón (and Willem Adelaar)
Indonesian | Malayic, Austronesian | Indonesia | Uri Tadmor
Iraqw | Cushitic, Afro-Asiatic | Tanzania | Maarten Mous and Martha Qorro
Japanese | Japanese-Ryukyuan | Japan | Christopher K. Schmidt
Kali'na | Cariban | Venezuela | Odile Renault-Lescure
Kanuri | Saharan | Niger, Nigeria | Doris Löhr and H. Ekkehard Wolff, with Ari Awagana
Ket | Yeniseian | Russia | Edward Vajda and Andrey Nefedov
Kildin Saami | Uralic | Russia | Michael Rießler
Lower Sorbian | Slavic, Indo-European | Germany | Hauke Bartels
Malagasy | Southeast Barito, Austronesian | Madagascar | Alexander Adelaar
Manange | Sino-Tibetan | Nepal | Kristine Hildebrandt
Mandarin Chinese | Sinitic, Sino-Tibetan | China | Thekla Wiebusch (and Uri Tadmor)
Mapudungun | (isolate) | Chile, Argentina | Lucía A. Golluscio, Fresia Mellico, and Adriana Fraguas
Old High German | Germanic, Indo-European | Northern Germany | Roland Schuhmann
Oroqen | Tungusic | China | Fengxiang Li and Lindsay J. Whaley
Otomi | Otomanguean | Mexico | Dik Bakker and Ewald Hekking
Q'eqchi' | Mayan | Guatemala, El Salvador, Belize | Søren Wichmann and Kerry Hull
Romanian | Romance, Indo-European | Romania | Kim Schulte
Sakha | Turkic | Siberia | Brigitte Pakendorf and Innokentij N. Novgorodov
Saramaccan | English-based creole | Suriname | Jeff Good
Selice Romani | Indo-Iranian, Indo-European | Slovakia | Viktor Elšík
Seychelles Creole | French-based creole | Seychelles | Susanne Michaelis, with Marcel Rosalie and Katrin Muhme
Swahili | Bantu, Niger-Congo | Tanzania, Kenya, Uganda, D. R. Congo | Thilo C. Schadeberg
Takia | Oceanic, Austronesian | Papua New Guinea | Malcolm Ross
Thai | Tai-Kadai | Thailand | Titima Suthiwan (and Uri Tadmor)
Tarifiyt Berber | Berber, Afro-Asiatic | Morocco | Maarten Kossmann
Vietnamese | Viet-Muong, Austro-Asiatic | Vietnam | Mark J. Alves
White Hmong | Hmong-Mien | Laos | Martha Ratliff
Yaqui | Uto-Aztecan | Mexico | Zarina Estrada Fernández
Wichí | Mataco-Mataguayan | Argentina, Bolivia | Alejandra Vidal and Verónica Nercesian
Zinacantán Tzotzil | Mayan | Mexico | Cecil H. Brown

3. The Loanword Typology meaning list

3.1. General features of the list

The LWT meaning list consists of 1,460 lexical meanings, which are listed in the appendix to this chapter. Most can be expected to have word counterparts in any language, at least in principle, e.g. 'head', 'to eat', and 'strong'. However, some other meanings are not expected to occur in many languages.
These include numerous meanings inherited from the lists on which our list was based (discussed later in this section), such as 'mead', 'awl', 'nit', 'battle-ax', 'netbag', 'men's house', 'mother-in-law of a man', and 'father-in-law of a woman'. A few region-specific meanings were also added by the LWT project based on suggestions by contributors, for example 'larch', 'manioc bread', 'tumpline', and 'snowshoe'. By asking the contributors to provide the counterparts of these meanings, we aimed to obtain comparable lexical samples from all project languages. Note that the list is a "meaning list", not a "word list". The items on the list are meanings that could be relevant in any language, not words of a particular language (in particular, they are not words of our working language English; see §3.3).

Of course, the comparability of the lexicons of different languages is necessarily limited: biogeographical and cultural variation entail different kinds of lexical meanings that occur in different languages. Amazonian languages do not have words for 'snowshoe' or 'mosque', and Siberian languages do not have words for 'toucan' or 'manioc'. None of the 41 individual language databases has counterparts for all 1,460 meanings, but only a handful of the subdatabases have fewer than 1,000 counterparts. In addition, even when talking about similar things, different languages divide the world in different ways. For rodents of the genus Mus, Indonesian uses the word tikus, but this word does not mean simply 'mouse', because it can equally be used for the genus Rattus (rat). Similarly, English normally makes no distinction between the genera Ciconia and Mycteria, calling both of them stork. Such lack of cross-linguistic semantic congruence is pervasive and will be discussed further below (§4).
The LWT meaning list is based on the meaning list of the Intercontinental Dictionary Series (IDS), a project founded by Mary Ritchie Key (1924-2003) and now headed by Bernard Comrie. Key modeled the IDS list, which consists of 1,310 meanings, after Carl Darling Buck's Dictionary of Selected Synonyms in the Principal Indo-European Languages (1949, University of Chicago Press), which contains over 1,200 meanings. The lexical meanings of Buck's list are naturally biased towards earlier periods and the geographical region of Europe and southwestern Asia. Key added many meanings to the list that are appropriate in other biogeographical and cultural settings, especially (native) South America. The LWT meaning list includes all 1,310 meanings on the IDS list as well as 150 additional meanings, which fall principally into three categories: (a) concepts important to geographical regions beyond the geographical and cultural biases of the IDS list; (b) meanings that appear on the Swadesh 207 list but not on the IDS list; and (c) other meanings deemed diagnostically useful, especially common meanings pertaining to modern life, as such terms are almost entirely missing from the IDS list ('radio', 'bus', 'hospital', and similar items). As mentioned above, the contributors were allowed to add further meanings to their individual language subdatabases beyond the meanings on the LWT list, but these were not counted for statistical purposes.

3.2. Semantic fields and semantic word classes

The 1,460 meanings are divided into 24 fields. Of these, 22 were semantic fields retained from Buck's (1949) list and Key's IDS list (slightly renamed in some cases), and two fields were added (23 and 24). These additional two fields are not, strictly speaking, semantic fields, but they were deemed important for our study. The 24 fields are given in Table 2. In some cases, the grouping of words is fairly obvious (e.g.
animal names in field 3, body parts in field 4), but in many other cases the grouping of the words is somewhat arbitrary, and alternative groupings are possible and might be preferred by other scholars. Thus, 'wheel' is in field 10 (Motion), but could also be put into field 9 (Basic actions and technology), and 'clever' is in field 16 (Emotions and values), but would equally fit into field 17 (Cognition). Nevertheless, these semantic fields are very useful for a first orientation, and they are used in the tables of all the language chapters. The field numbers are also used in the LWT meaning codes, following Buck's and Key's usage. For example, the world has the code 1.1, the cat has the code 3.62, and clever has code 16.84, indicating they belong to fields 1, 3, and 16, respectively.

In addition, we assigned each meaning to a semantic word class, as shown in Table 3. These are meant as very approximate categories, corresponding roughly to 'things and entities', 'actions and processes', 'properties', 'manner and location', and 'grammatical meanings'. The labels correspond to traditional part-of-speech labels, but this is purely for convenience. There is no expectation that these ontological categories would necessarily match the parts of speech in a particular language (although this is often the case). As already explained, the list is a meaning list, not a list of words with syntactic properties, so some meanings may well have counterparts in different languages that belong to different parts of speech (see §5.1). Note that the semantic word class "Function word" is broader than the field Miscellaneous function words, as the latter only contains grammatical meanings not already included in one of the other chapters.

Table 2: Semantic fields of the LWT meaning list

No. | Semantic field label | Number of meanings
1 | The physical world | 75
2 | Kinship | 85
3 | Animals | 116
4 | The body | 159
5 | Food and drink | 81
6 | Clothing and grooming | 59
7 | The house | 47
8 | Agriculture and vegetation | 74
9 | Basic actions and technology | 78
10 | Motion | 82
11 | Possession | 46
12 | Spatial relations | 75
13 | Quantity | 38
14 | Time | 57
15 | Sense perception | 49
16 | Emotions and values | 48
17 | Cognition | 51
18 | Speech and language | 41
19 | Social and political relations | 36
20 | Warfare and hunting | 40
21 | Law | 26
22 | Religion and belief | 26
23 | Modern world | 57
24 | Miscellaneous function words | 14
Total | | 1,460

Table 3: Semantic word classes of the LWT meaning list

Semantic word class | Number of meanings
"Nouns" | 905
"Verbs" | 334
"Adjectives" | 120
"Adverbs" | 4
"Content words" (subtotal) | 1,363
"Function words" | 97
Grand total | 1,460

3.3. Identifying the meanings

It is important to emphasize that, as mentioned earlier, the Loanword Typology meaning list should be thought of as a list of meanings that is designed to elicit words from the project languages. It is not a list of English words, though the meanings are most readily identified by human users through their English labels (for the computer, the LWT code is the primary identifier). Some LWT meanings are narrower than those of the English label; for instance, LWT 9.61, labeled to forge, is intended to refer to the action of making something from a piece of metal, not to the action of illegally copying something. And some LWT meanings are broader than those of corresponding English words; e.g. LWT 1.36, labeled the river or stream, is intended to refer to a flowing body of water of any size, a meaning for which English does not have a non-circumlocutory expression. Contributors were made aware of differences between LWT meanings and the meanings of the English labels by means of typical context sentences and/or meaning descriptions when deemed necessary.

The LWT meaning list thus consists of three pieces of information for each meaning (in addition to the semantic field and semantic word class that we just saw): the label, a meaning description, and a typical context.
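As a minimal sketch of how one entry of the meaning list could be represented, the following Python fragment stores the code, label, description, and typical context, and derives the semantic field number from the code. The class name `Meaning`, its attribute names, and the wording of the description string are assumptions made for illustration; they are not taken from the actual database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """One meaning-list entry (hypothetical structure, not the real schema)."""
    code: str                  # LWT code, e.g. "9.61"
    label: str                 # English label, e.g. "to forge"
    description: str = ""      # optional meaning description
    typical_context: str = ""  # optional typical context sentence

    @property
    def field_number(self) -> int:
        # The part of the code before the dot is the semantic field number.
        return int(self.code.split(".", 1)[0])


# Illustrative entries based on the two examples discussed in the text.
forge = Meaning(
    code="9.61",
    label="to forge",
    description="make something from a piece of metal",  # paraphrase
)
river = Meaning(code="1.36", label="the river or stream")

for meaning in (forge, river):
    print(meaning.code, meaning.label, "-> field", meaning.field_number)
```

Deriving the field from the code, rather than storing it separately, keeps the code as the single primary identifier, in line with its role for the computer described above.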
The LWT label consists of an English word or phrase and serves to summarize the intended meaning. In many cases, this summary is sufficient to give a clear idea of the meaning (e.g. the apple, the coffee, long). Labels of nouns contain the article the in order to make clear immediately that it is not the verb that is intended, and similarly verb labels contain the infinitival marker to (e.g. the fly, to fly, the bark, to smell, the light, to light). Where different meanings have the same English label, numbers in parentheses are used to differentiate the labels, so that all the labels are unique (e.g. to ask (1), i.e. 'inquire', to ask (2), i.e. 'request'). Where the major varieties of English have different words, both are used in the label, separated by a slash (e.g. sick/ill, the maize/corn, the autumn/fall).

The meaning description field contains additional information about the meaning provided by the editors in order to clarify or disambiguate certain items. […] editors, while others were optional. The kinds of information in §5.1-5.4 were obligatory in principle, though "no information" was always an option, also for the obligatory fields.

5.1. Word form in the project language

Contributors were asked to provide the counterparts in the relevant language in their standard citation forms, even if the citation form contained a grammatical morpheme such as a case affix, an article, or an infinitive marker. For nouns, this meant singular forms in almost all cases. However, if a noun occurred in a different number than the English label, this was not a problem. Thus, an LWT meaning expressed by an English plurale tantum such as oats could have a counterpart that was a singular mass noun, and a singular could be rendered by a plurale tantum; e.g. German only has the plural form Geschwister 'siblings', but this would be a suitable counterpart of LWT 2.456 (the sibling).
The counterpart words in the project languages were provided in the spelling or transcription/transliteration that is most commonly used by linguists for the language. When the language uses a non-Latin script, the form in the indigenous orthography could optionally be provided in a separate field. When there were two slightly different forms of the word, they could both be included in this field and separated by a comma, e.g. Indonesian mas, emas as counterparts of LWT 9.64 (the gold). But when a meaning has two quite different counterparts, they counted as two distinct words and had to be entered in separate records. Homonyms were distinguished by indices in parentheses (e.g. Indonesian pasang (1) 'high tide', pasang (2) 'pair').

The counterpart did not have to match the word class of the English word used to convey the LWT meaning (although it often did). As already mentioned, if the project language had a close semantic match that belonged to a different word class, it could constitute the counterpart. For example, LWT 5.14 (to be hungry) has an adjective counterpart in English (hungry), a verb counterpart in Gawwada (meaning 'be hungry'), and a noun counterpart in Swahili (njaa 'hunger').

The counterpart could be a single word, a compound, or a phrasal expression. Although the counterparts could be larger than words, we still refer to them as "words" for the sake of simplicity (a more precise term would be lexeme, in the sense of 'lexical entry'). Phrasal expressions were only to be given if they were fixed and conventionalized. Contributors were specifically asked not to provide descriptions or explanations of the meaning as counterparts. For example, for LWT 4.393 (the feather) a language may have had 'hair of bird' as the best equivalent, but if this was not a fixed expression, it could not be used as the language's counterpart, and the entry should have been left unfilled.
By contrast, to make love is a fixed expression in English, so it can be used as counterpart of LWT 4.67 (to have sex). Similarly, contributors were asked not to enter compound phrases of two or more hyponyms ("A and/or B"). For example, in the Indonesian list, LWT 2.94 (we) is left unfilled, because there is no single word that corresponds to this meaning, but rather two sub-counterparts that are already included in the LWT list: kita 'we [inclusive]' (corresponding to LWT 2.941) and kami 'we [exclusive]' (corresponding to LWT 2.942).

Only established loanwords that were felt by the contributor to be part of the language were to be provided, not nonce borrowings (instances of single-word code switching). This distinction was often hard to make, especially when the language had no monolingual speakers, but contributors were asked to try to make it as best they could.

5.2. Analyzability and gloss

In assessing the possible loanword status of a word, the first question was whether the word was analyzable (i.e. morphosyntactically complex) within the language. If this was the case, it was almost certain that it was created by speakers of the language rather than borrowed from some other language. Such words were not considered loanwords, even when they contained borrowed elements.

Contributors were asked to indicate whether the word was (1) unanalyzable (if the form could not be analyzed into two or more constituents); (2) semi-analyzable (if a constituent structure could be identified, but not all constituents had meanings, such as a "cranberry morph", or if the word was analyzable to linguists but not to lay speakers); (3) analyzable derived; (4) analyzable compound; (5) analyzable phrasal. For analyzable items, contributors were asked to give a morpheme-by-morpheme gloss, i.e. a hyphenation and a gloss in square brackets. For example, for Kanuri shidima 'witness', the field contains the following: "shidi-ma [testimony-owner.of]".
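The gloss format just illustrated is regular enough to be split mechanically. The sketch below is only an illustration under stated assumptions: the function name `parse_gloss` is hypothetical, and it assumes that the hyphenated form and the bracketed gloss always contain the same number of hyphen-separated parts, which real entries may not guarantee.

```python
from typing import List, Tuple


def parse_gloss(entry: str) -> List[Tuple[str, str]]:
    """Split 'form-with-hyphens [gloss-with-hyphens]' into (morph, gloss) pairs."""
    form, sep, rest = entry.partition(" [")
    if not sep or not rest.endswith("]"):
        raise ValueError("expected the pattern: form [gloss]")
    morphs = form.strip().split("-")
    glosses = rest[:-1].split("-")  # drop the closing bracket
    if len(morphs) != len(glosses):
        raise ValueError("form and gloss have different numbers of parts")
    return list(zip(morphs, glosses))


# The Kanuri example from the text.
pairs = parse_gloss("shidi-ma [testimony-owner.of]")
for morph, gloss in pairs:
    print(morph, "=", gloss)
```

Periods inside a gloss, as in owner.of, are left untouched, since they join several metalanguage elements that correspond to a single morpheme.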
For abbreviations of grammatical categories, contributors were referred to the Leipzig Glossing Rules.

5.3. Borrowed

Most importantly, of course, contributors were asked to indicate whether, to the best of their knowledge, the word was a loanword, i.e. had been borrowed from another language at some point in the language's history. Proto-languages were also considered stages of the same language, so that a word borrowed into Proto-Uralic, for example, would count as a loanword in Saami. Five degrees of certainty were distinguished: […]

5.4. Borrowed elements not constituting whole words

If an analyzable word was derived from a loanword, this could be noted in a special field ("contains a borrowed base"). However, even if the borrowed element constituted the main root, the word was not treated as a loanword, since it was created as a neologism in the recipient language and not in the donor language. An exception was made if the added morphemes were part of the borrowing process or part of the word's normal citation form; such words were treated as loanwords. For example, Ket pasatbet 'to rescue' contains the suffixal element -bet, but this element was required to integrate the loan verb (borrowed from Russian spasat') into the language, so this was not counted as a neologism.

5.5. Frequency

Frequency counts are important for the analysis of lexical borrowing, since it is generally assumed that lexical stability increases (and therefore borrowability decreases) with frequency. However, frequency information was not made obligatory in the LWT project because many languages do not have significant representative corpora on which frequency counts can be computed.
Contributors who had access to frequency information could enter the occurrence per million words in an optional field. Another field was provided for entering more impressionistic categories ("very common", "fairly common", or "not common") for the benefit of contributors who had strong intuitions about the relative frequency of words but did not have access to numeric frequency counts. For various reasons, very few contributors made use of the frequency fields, so this is one of the aspects of lexical borrowing that will be left for future research.

5.6. Register

Register is also of potential interest for lexical borrowing, and an optional field was provided for entering information about the word's register: "formal", "colloquial", or "general". For example, when giving the English counterparts for item LWT 1.343 (the cape), promontory would be marked as "formal" while peninsula would be marked "general".

6. Additional information for all loanwords

If a word was considered a (certain or probable) loanword, there were several further (obligatory or optional) data fields that the contributors filled in.

6.1. Source word and donor language

The first question we asked about borrowed items was their source, ideally their immediate source. For example, LWT 23.31 (the president) has the Indonesian counterpart presiden, which is ultimately derived from Latin praesidens. But Indonesian borrowed the word directly from Dutch president, so this was given as source word. However, sometimes the immediate source was not known, in which case an earlier known source word was given in a separate field. For example, the Bezhta counterpart of LWT 5.821 (the chili pepper) is tiabe. It appears to ultimately derive from Azerbaijani titi 'capsicum', but its immediate source word is unknown (the etymon does not seem to occur in Avar, the expected intermediate donor language).
In addition, it was sometimes clear from a word's form that it was a loanword, but no source word (either immediate or earlier) could be identified.

The source word was given in the spelling or transcription/transliteration that was most commonly used by linguists for the language. The donor language, if known, was also stated. If it was questionable, it could be marked as "uncertain". Sometimes there was more than one possible donor language, and the candidates could be narrowed down to a small set of languages, e.g. a family of closely related languages, or a set of two or three languages. Therefore family names like "Mongolic" were also acceptable as donor languages, and it was also possible to give several donor language names, each marked "uncertain". The meaning of the source word was always provided, even if it was identical to the meaning of the loanword in the recipient language.

6.2. Loanwords vis-à-vis the recipient language

When a word is borrowed, it has an effect on the lexicon of the recipient language. It may replace an earlier word of roughly the same meaning, or simply be added to the lexicon where no earlier word with that meaning existed, or it may coexist with an earlier word of roughly the same meaning. Such information, as far as it was known, was provided by the contributors (with the choices "replacement", "insertion", and "coexistence") for all loanwords.

Borrowing a word often entails a certain modification of the source word, required for the integration of the word into the recipient language. Contributors were not required to measure the degree of phonological and morphological integration of loanwords in a precise way. Instead, they were asked to rank the loanword impressionistically as "highly integrated", "intermediate", or "unintegrated". As a rough guideline, unintegrated loanwords were those that kept significant phonological and/or morphological peculiarities of the donor language and were therefore recognizable as loanwords also to speakers with no training in linguistics. By contrast, highly integrated loanwords were those that had no structural properties that betrayed their foreign origin. Loanwords that had some synchronic properties of the foreign language were marked "intermediate".

The environmental salience of borrowed meanings was also noted. By this we mean the degree to which a word's meaning is relevant to the speakers in their environment. Three values could be chosen: "present in pre-contact environment" (for example, there were mountains in England even before the word mountain was borrowed from French); "present only since contact" (speakers of many South American languages borrowed the word for 'horse' from the Spaniards who introduced it to their environment); or "not present" (snow did not exist in Thailand either before or after the introduction of the Sanskrit loanword hima 'snow', but the word itself is known and understood by speakers of Thai). By "contact", we mean the first contact between speakers of the recipient language and the donor language. This contact could have been with speakers of the donor language, but it could also have been with texts in the donor language.

6.3. Contact situation

One of the goals of the LWT project was to make inferences on possible linguistic outcomes of different contact situations.
Contributors were asked to provide a name for each contact situation that has led to lexical borrowing in the language on which they were working. There was not always a one-to-one relationship between the number of donor languages and the number of contact situations. One donor language could be involved in more than one contact situation, and conversely more than one language could be involved in the same contact situation. For instance, English dish was borrowed from Latin discus in pre-Old English times, whereas English disc was borrowed from the same Latin word in the 17th century. In this case, we need to distinguish between two contact situations: "Latin to West Germanic" and "Latin to English". On the other hand, for the borrowing of boomerang and kangaroo, we can assume basically the same contact situation ("Australian Aboriginal contact"), even if the two terms are from two different donor languages. The same situation obtains with Javanese and Sundanese loanwords in Indonesian. Words were borrowed from both languages into Jakarta Indonesian (including when it was still called "Malay") when speakers of both languages poured into the city at the same time. This constituted a single contact situation. The various contact situations are explained in some detail in the individual language chapters.

7. The database template for data entry by the contributors

For the purposes of data entry, a database template was designed by Bradley Taylor in FileMaker Pro, a cross-platform relational database application. This section briefly describes the template to give the reader a concrete idea of the contributors' task. The look and feel of the presentation of the data in the online World Loanword Database is different, but the basic database design is of course identical.
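To make the kinds of information collected for each loanword more concrete, here is a minimal Python sketch of a loanword record. It is not the FileMaker template: the class and attribute names are hypothetical, only the value labels in the enumerations come from the text above, and the form disc for the 17th-century borrowing is used as an assumption. The helper function shows that one donor language can figure in several contact situations.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Optional, Set


class Effect(Enum):
    REPLACEMENT = "replacement"
    INSERTION = "insertion"
    COEXISTENCE = "coexistence"


class Integration(Enum):
    HIGHLY_INTEGRATED = "highly integrated"
    INTERMEDIATE = "intermediate"
    UNINTEGRATED = "unintegrated"


class Salience(Enum):
    PRE_CONTACT = "present in pre-contact environment"
    SINCE_CONTACT = "present only since contact"
    NOT_PRESENT = "not present"


@dataclass
class Loanword:
    """Hypothetical record layout; optional fields default to 'no information'."""
    word: str
    source_word: str
    donor_language: str
    contact_situation: str
    effect: Optional[Effect] = None
    integration: Optional[Integration] = None
    salience: Optional[Salience] = None


def situations_by_donor(words: Iterable[Loanword]) -> Dict[str, Set[str]]:
    """Collect the set of contact situations in which each donor language occurs."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for w in words:
        result[w.donor_language].add(w.contact_situation)
    return dict(result)


# The two English borrowings of Latin discus discussed in section 6.3.
words = [
    Loanword("dish", "discus", "Latin", "Latin to West Germanic"),
    Loanword("disc", "discus", "Latin", "Latin to English"),
]
print(situations_by_donor(words))
```

Keeping the contact situation as a field of its own, separate from the donor language, is what allows the many-to-many relationship between the two to be recorded.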
[...] class, along with percentage figures, could be generated with the help of special tools. Another tool generated lists of all probable and clear loanwords in each project language, which were used for preparing the loanword appendices accompanying each case study chapter.

Finally, ten custom fields were provided which contributors could name and use as they saw fit. Most were used for private organizational and editorial purposes and were not included in the published World Loanword Database. Others, however, contained important information such as reconstructions and bibliographical references, and were therefore retained.

8. Tables showing numbers of loanwords

Each of the language chapters contains at least two tables with the most important quantitative information about the loanwords in the language. These two tables only take into account level-4 loanwords ("Clearly borrowed") and level-3 loanwords ("Probably borrowed") (see §5.3 for these levels). One table gives the breakdown of loanwords by donor language and semantic word classes (semantic nouns, verbs, adjectives, adverbs, plus function words), and another table gives it by donor languages and semantic fields (the 24 fields of Table 2). When there was a large number of donor languages (with some donor languages contributing only very few loanwords), these were grouped into donor language groups, shown in the columns instead of donor languages. The standard tables in the chapters show percentages rather than absolute numbers, in order to make the figures more comparable.
It has to be borne in mind, however, that the absolute number of loanwords is quite different across semantic fields: 20% of loanwords translates to many more in the field The body (159 meanings) than in the field Religion and belief (26 meanings).

Showing breakdowns of loanwords by donor language and semantic fields/word classes is not straightforward, because loanwords are not always uniquely associated with these kinds of information. While 94.9% of all loanwords are associated with a single donor language, a minority of 3.5% are associated with two or more donor languages (the remainder have no known donor language). When a loanword is associated with two donor languages, it is counted half (0.5) for each of the two languages for the purposes of the table, and when it is associated with three languages, it is counted one third (0.33). Likewise, loanwords are not uniquely associated with semantic fields and word classes: these are properties of LWT meanings, not of words in the database. Thus, a word may correspond to two meanings that are in different semantic fields, like Japanese niku, which means 'meat' (Food and drink) or 'flesh' (The body). Or a word may correspond to two meanings that are in different semantic word classes, like Zinacantán Tzotzil buro, which means 'donkey' (Noun) or 'stupid' (Adjective). Again, in such cases the word counts only half (and one third when it is in three different categories, and so on). Non-unique association with semantic fields is not common, with only 3.9% of all words belonging to two or more fields, and non-unique association with semantic word classes is even rarer (only 1.0% of all words have this property).
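
The counting rule just described (a word linked to two donor languages counts one half for each, a word linked to three counts one third, and so on) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the project's actual tooling: the text does not say how the chapter tables were computed, and the function names, word labels, and toy data below are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def fractional_counts(loanwords):
    """Split each loanword's weight of 1 equally among its categories.

    `loanwords` is an iterable of (word, categories) pairs, where
    `categories` lists the donor languages (or semantic fields, or
    word classes) the word is associated with.
    """
    totals = defaultdict(Fraction)
    for _word, categories in loanwords:
        unique = sorted(set(categories))
        if not unique:
            continue  # no known donor language: adds to no column
        share = Fraction(1, len(unique))
        for category in unique:
            totals[category] += share
    return dict(totals)


def as_percentages(totals):
    """Convert fractional counts to percentages of the column total."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: float(100 * value / grand_total) for key, value in totals.items()}


# Hypothetical toy data, not taken from the database.
sample = [
    ("word_a", ["French"]),
    ("word_b", ["French", "Italian"]),
    ("word_c", ["French", "Italian", "Latin"]),
    ("word_d", []),
]

counts = fractional_counts(sample)       # French: 1 + 1/2 + 1/3 = 11/6
percentages = as_percentages(counts)
for language, value in sorted(percentages.items()):
    print(f"{language}: {value:.1f}%")
```

The sketch uses exact fractions rather than the rounded value 0.33, so that the weights of the three assignable words still sum to exactly 3; whether the published tables rounded at this step is not stated in the text.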
This method of counting makes the figures a little more abstract, but in this way we do not give undue weight to words that can be assigned to multiple semantic categories, and to words that cannot be uniquely associated with a single donor language. In assigning loanwords to donor languages for the percentage tables, we did not distinguish between certain donor languages and uncertain donor languages.

9. History and future of the LWT project

The Loanword Typology project was conceived in 2003 and officially launched in 2004. Contributors on various languages were added until 2006. Between 2003 and 2007, several workshops took place at which the issues arising from this project were discussed and the contributors presented progress reports and preliminary results. In 2008-2009 the database and case studies were edited for publication.

Over these years, the original design of the project was changed and broadened considerably. Initially the amount of information requested from contributors was limited, and they were allowed to submit it in any format, including simple text files. As the project progressed, and especially following workshops, more fields of information were added to the list, and a decision was taken to design a custom template in a commonly used database application. The template itself underwent several major revisions and many minor ones, resulting in the final format, which was rather complex but also user-friendly (§7). To the editors' delight, contributors took all these changes (which often entailed much additional work on their part) in stride, and the vast majority of those who volunteered to contribute a database and a book chapter did complete them.

The publication of this book, while constituting an important milestone for the Loanword Typology project, is by no means its end. Scholars are encouraged to use the case studies in the book as a basis for general and comparative research on lexical borrowing.
Moreover, with the launching of the online World Loanword Database (URL: http://wold.livingsources.org/), the project has been given a new life. The plan is to provide contributors with the possibility of updating their databases at regular intervals. It is also envisaged that databases on additional languages would be added to the World Loanword Database, which will gradually fill in unintended gaps in the language sample and give the database a stronger statistical foundation.

Appendix: The Loanword Typology meaning list

1 The physical world

world; land; soil; dust; mud; sand; cliff or precipice; plain; valley; island; mainland; shore; rough (2); foam; lake; bay; lagoon; cape; tide; low tide; high tide; whirlpool; spring or well; swamp; woods or forest; wood; stone or rock; earthquake; sky; lightning; thunder; bolt of lightning; rainbow; light; darkness; shade or shadow; wind; cloud; fog; arctic lights; weather; fire; ash; embers; to burn (1); to burn (2); to light; to extinguish; match; firewood; charcoal

2 Kinship

2.1 person; 2.21 man; 2.22 woman; 2.23 male (1); 2.24 female (1); 2.25 boy; 2.251 young man; 2.26 girl; 2.261 young woman; 2.27 child (1); 2.28 baby; 2.31 husband; 2.32 wife; 2.33 to marry; 2.34 wedding; 2.341 divorce; 2.35 father; 2.36 mother; 2.37 parents; 2.38 married man; 2.39 married woman; 2.41 son; 2.42 daughter; 2.43 child (2); 2.44 brother; 2.444 older brother; 2.445 younger brother; 2.45 sister; 2.454 older sister; 2.455 younger sister; 2.456 sibling; 2.4561 older sibling; 2.4562 younger sibling; 2.458 twins; 2.46 grandfather; 2.461 old man; 2.47 grandmother
old woman; grandparents; grandson; granddaughter; grandchild; uncle; mother's brother; father's brother; aunt; mother's sister; father's sister; sibling's child; father-in-law (of a man); father-in-law (of a woman); mother-in-law (of a man); mother-in-law (of a woman); parents-in-law; son-in-law (of a man); son-in-law (of a woman); daughter-in-law (of a man); daughter-in-law (of a woman); child-in-law; siblings-in-law; stepfather; stepmother; stepson; stepdaughter; orphan; widow; widower; family; I; you (singular); he/she/it; he; she; we (inclusive); we (exclusive); you (plural); they

3 Animals

animal; male (2); female (2); livestock; pasture; herdsman; stable or stall; cattle; bull; calf; sheep; he-goat; kid; horse; stallion; foal or colt; donkey; mule; fowl; cock/rooster; hen; chicken; goose; duck; bird; seagull; heron; eagle; hawk; vulture; bat; parrot; dove; owl; dog; opossum; fish; fin; porpoise or dolphin; whale; freshwater eel; wolf; lion; bear; fox; deer; monkey; elephant; camel
head louse; body louse; centipede; scorpion; cockroach; spider; spider web; bee; beeswax; beehive; wasp; fly; sandfly or midge or gnat; mosquito; prawns or shrimp; tick; snake; hare; quail; reindeer/caribou; elk/moose; beaver; kangaroo; jaguar; chameleon; buffalo; butterfly; grasshopper; snail; frog; lizard; crocodile or alligator; turtle; tapir

4 The body

body; skin or hide; flesh; hair; beard; body hair; pubic hair; dandruff; blood; vein or artery; bone; rib; horn; back; spine; head; temples; skull; brain; face; forehead; jaw; cheek; chin; eye; eyebrow; eyelid; eyelash; nostril; nasal mucus; mouth; beak; lip; tongue; tooth; gums; molar tooth; neck; nape of neck; throat; shoulder; shoulderblade; collarbone; hand; palm of hand; finger; thumb; fingernail; claw; leg; thigh; calf of leg; knee

men's house

[...] Russian rezjume; and the English word weekend is assigned the default masculine gender in French (le weekend).

Loanword adaptation is sometimes indispensable for the word to be usable in the recipient language. In particular, languages with gender and inflection classes need to assign each word to a gender and inflection class, so that it can occur in syntactic patterns which require gender agreement or certain inflected forms. Similarly, loanwords from Arabic have to be adapted orthographically in English because otherwise they would not be readable.

However, in many cases the degree of adaptation varies, depending on the age of a loanword, knowledge of the donor language by recipient language speakers, and their attitude toward the donor language. If the donor language is well-known and/or the loanword is recent, recipient-language speakers may choose not to
adapt the word in pronunciation, and they may borrow certain inflected forms from the donor language. In this way, English borrowed plural forms of words from Greek and Latin (phenomenon/phenomena, fungus/fungi, crisis/crises), and German even borrowed a few case forms (e.g. the genitive in das Leben Jesu 'the life of Jesus'). And orthographic adaptation is not necessary to the extent that readers are familiar with the donor language's writing system (thus, in Japanese and Russian, English words are not always orthographically adapted, because readers can be expected to be familiar with the Latin script).

Complete adaptation of non-fitting loanwords may take a very long time, and frequently at least a linguist who is familiar with the language's usual phonotactic patterns will recognize a word as a loanword simply by its unusual shape (see also §6).

Other equivalent terms are accommodation, assimilation and nativization.

II. Lexical borrowing: Concepts and issues

Loanwords that are not adapted to the recipient language's system are typically recognizable as loanwords, and they are sometimes called foreignisms (German traditionally makes a distinction between Fremdwörter 'foreignisms' and Lehnwörter 'adapted/integrated/established loanwords', von Polenz 1967, Krier 1980). However, recognition of a word as a borrowing by speakers is a complex matter that depends on many different factors, and adaptation is only one of them. Another is mere novelty: If a word entered the language just recently, many older speakers will remember an earlier stage of the language and will thus be aware of the word's young age. Innovating speakers may face criticism by older speakers for using a loanword, and this contributes to the general awareness of the degree to which a word is an accepted and established part of the language.
The dimension along which Fremdwörter and Lehnwörter differ is thus not identical to the degree of adaptation, and we may choose the term degree of integration for it, to keep the two dimensions separate. (However, in practice linguists do not distinguish adaptation and integration systematically along these lines, and the authors of this book generally use integration for 'adaptation'.)

The notion of foreignism is evidently close to that of a single-word switch discussed in the previous section. We might say that single-word switches are even less integrated than foreignisms, to the point of not being (clear) members of the language's lexicon. Integration would thus be the degree to which a word is felt to be a full member of the recipient language system.

If a large number of loanwords come from a single donor language, then there is less need for adaptation, and instead the donor language patterns will be imported along with the words. Thus, Japanese borrowed many Chinese words that ended up with long vowels and diphthongs, so that now these phonological patterns are integral parts of the Japanese sound system. However, Sino-Japanese words still form a separate stratum in contemporary Japanese, with grammatical behavior that differs from native Japanese words, and speakers are aware of the distinction (cf. Schmidt, this volume). Similarly, German borrowed the plural suffix -s along with words from Low German and English, and now this suffix has become an integral part of the language which is also extended to non-loanwords.

The precise ways in which the adaptation process happens are often complex and a matter of ongoing debate. In phonological adaptation, the respective roles of phonetic constraints and phonological patterns are contentious (e.g. Peperkamp 2005, Yip 2006). In gender assignment to loanwords, a multitude of factors seem to play a role (e.g. Stolz 2009). The role of morphological adaptation in verb borrowing is explored by Wohlgemuth (2009: ch.
5-7). In this volume, loanword adaptation is not the focus of the authors' interests, but most of the language chapters contain a section on adaptation (generally called "Integration of loanwords").

6. Recognizing loanwords

Linguists identify words as loanwords if they have a shape and meaning that is very similar to the shape and meaning of a word from another language from which it [...]

[...] opposite. In this case, we simply do not know whether the Burushaski word or the Sanskrit word was the source of the borrowing.

However, there are a number of criteria available that often give us a clear indication of the borrowing direction. First, if the word is morphologically analyzable in one language but unanalyzable in another one, then it must come from the first language. For instance, German Grenze 'border' must have been borrowed from Polish granica 'border' rather than the other way round, because -ica is a well-recognized suffix in Polish, and the stem gran- occurs elsewhere, whereas German Grenze is not analyzable in this way. Similarly, Sanskrit mātaṅga- 'elephant' must come from a Munda language, because the element -toŋ means 'hand' in Munda, but has no meaning in Sanskrit (Burrow 1946: 5). Second, phonological criteria are often available: If a word shows signs of phonological integration in language A but not in language B, it must come from language B. Third, if the word is attested in a sister language of language B that cannot have been under the influence of language A, it must come from language B. Thus, Sanskrit jemati 'eat' must come from Munda (e.g. Kurku jome 'eat'), because the root is also attested in Mon-Khmer languages which were not under Indic influence to the same extent as Munda languages (Burrow 1946: 5). Fourth, the meaning often helps: Sanskrit nakra- 'crocodile' is likely to be a loanword from Dravidian (e.g.
Kannada negar), because Indo-Aryan speakers coming from northern India would not have brought a word for crocodile with them (Burrow 1946: 9).

However, these criteria do not always give clear results, especially if the words are very old, and if they appear in languages from a number of different families in a particular area. Such words are sometimes called Wanderwörter, and Awagana & Wolff and Löhr & Wolff (in this volume, in their chapters on Hausa and Kanuri) call the phenomenon "areal roots".

Even when a loanword is not very old, there may be several different possible donor languages, and it may not be decidable which language the word was borrowed from. This happens, in particular, when several related languages are donor candidates, as in the case of Romance influence on Germanic. The Dutch word pijp 'pipe' must have been borrowed from a Romance language, but whether it was French (pipe) or Italian (pipa) is unclear (van der Sijs, Dutch subdatabase). Thus, in the World Loanword Database, quite a few donor languages are in fact "donor families". In other cases, several different donor languages are given as alternatives, so the relationship between words and donor languages is occasionally one-to-many. Again, sometimes subtle phonological criteria are available for distinguishing between different donor languages. Thus, Samoan tapaʔa 'tobacco' was not borrowed directly from English, but via Tongan tapaka (because Samoan ʔ regularly corresponds to Tongan k; Mosel 2004: 219).

A cover term for languages and families is languoid, so we sometimes talk about "donor languoids" (= donor languages or donor families).

Martin Haspelmath

7. Why do languages borrow words?

Explaining why languages change is generally very difficult, and explaining why languages borrow words is no exception. In fact, it is probably more difficult to explain lexical borrowing than most people think.
This section will thus limit itself to raising and discussing a number of issues, rather than propose or endorse specific explanations.

A simple dichotomy divides loanwords into cultural borrowings, which designate a new concept coming from outside, and core borrowings, which duplicate meanings for which a native word already exists (Myers-Scotton 2002: 41, Myers-Scotton 2006: §8.3). For example, Imbabura Quechua borrowed arrusa 'rice', rij 'clock', and simana 'week' from Spanish (Gómez Rendón, subdatabase of the World Loanword Database), all referring to cultural items that did not exist in the Americas before the European invasions. On the other hand, the Austroasiatic language Ceq Wong borrowed bayuy 'shadow', havol 'to cough', and dalam 'deep' from Malay (Kruspe, subdatabase of the World Loanword Database), all referring to concepts that must have existed before the Ceq Wong came into contact with Malays.

7.1. Cultural borrowings

At first glance, explaining cultural loans is straightforward, and such loans have also been called "loanwords by necessity". However, there is nothing necessary about a borrowing process. All languages have sufficient creative resources to make up new words for new concepts. As Brown (1999) documented in great detail, many North American languages do not use loanwords for introduced concepts like 'rice', 'clock', and 'week', but instead make use of their own resources. If a new concept becomes very frequent and the newly created expression becomes too cumbersome, there are always ways of shortening the expression. For example, Witkowski & Brown (1983: 571) report that the word for 'sheep' in Tenejapa Tzeltal (in Chiapas, Mexico) was originally tunim čih [cotton deer], but that as sheep became more important to the people in highland Chiapas, the modifier tunim was simply omitted, so that čih now means simply 'sheep' (to designate a deer, the modifier teʔtikil
'wild' has to be added). This process is quite similar to simple semantic change or extension, another frequently used mechanism for creating words for new concepts. For example, the words volume, mouse, menu, memory, and bookmark have taken on rather new meanings in computer technology, and English has no need for any borrowing.

15 Tadmor (2007) proposes the following explanation for the borrowing of basic words in this and similar cases: Speakers tried to assimilate to the strongly dominant Malay people, but had very little access to the Malay language, so they borrowed what they could, the basic vocabulary that they knew. Thus we get the unusual result that more basic than non-basic vocabulary is borrowed in some languages.

Thus, unless there are significant purist attitudes among the (influential) speakers, new concepts adopted from another culture are the more likely to be expressed by loanwords, the more widely the donor language is known. If only very few people speak the donor language, native neologisms and meaning shifts are more likely to be used for the new concepts. In a very thorough comparative study, Brown (1999) shows that the North American languages whose primary European contact language was English borrowed far fewer words than languages whose primary contact language was Spanish. He attributes this to the fact that the indigenous populations had more access to Spanish (e.g. through missionary schools) than to English during the initial period of European contact.

7.2. Core borrowings

Explaining core borrowings (loanwords that duplicate or replace existing native words) is more difficult. Why should speakers use a word from another language if they have a perfectly good word for the same concept in their own language? Here it seems that all we can say is that speakers adopt such new words in order to be associated with the prestige of the donor language.
Like "puristic attitude", "prestige" is a factor that is very difficult to measure independently, and a danger of circularity exists. However, it seems to me undeniable that prestige is a factor with paramount importance for language change, going far beyond our current topic of loanwords. The way we talk (or write) is not only determined by the ideas we want to get across, but also by the impression we want to convey to others, and by the kind of social identity that we want to be associated with. Other terms such as "cultural pressure" (Thomason & Kaufman 1988: 77) or "loss of vitality (of the recipient language)" (Myers-Scotton 2006: 215) are often found, but these are even more vague and intangible than "prestige".

It is perhaps easiest to understand the adoption of words for already existing concepts in a situation of widespread bilingualism, as in Selice Romani (speakers are bilingual in Hungarian, Elšík this volume) or Tarifiyt Berber (speakers are bilingual in Moroccan Arabic, Kossmann this volume). When (almost) everyone also understands the other language, it does not really matter which words one uses: one will be understood anyway. More surprising is the borrowing of basic words like 'star' and 'turn around' by Ceq Wong (from Malay, see Kruspe this volume), even though bilingualism has not been common until quite recently. See note 15 for a possible explanation of this case.

This term is potentially misleading because it suggests that core borrowings concern core vocabulary only. It is retained here for lack of a better alternative, and because it was made prominent by Myers-Scotton (2002, 2006, and elsewhere).

While the distinction between cultural and core borrowings is useful, it is by no means always clear how to classify a loanword. If all languages had the same lexical meanings that have to be expressed by words, this would be straightforward, but of
course lexical meanings do not have to fit into predefined slots. For example, one might think that the Sakha word for 'roof', risa (from Russian kryša) must be a core borrowing, because the Sakha had roofs before the Russians arrived in Yakutia. However, as Pakendorf & Novgorodov note in the Sakha subdatabase: "The traditional Sakha winter-house had a covering of earth and cow-dung like the walls, not a separate roof like the modern Russian-style houses." So although the Russians would have called the Sakha-style roof kryša, the Sakha may well have decided that the Russian-style roof was a different kind of thing, deserving a special word (thus a cultural borrowing). Another example is the word ricwaim 'weather' in Manange, borrowed from Nepali (in Hildebrandt's subdatabase). Of course Manange speakers talked about the weather before Nepali contact, but they seem to have had no general word for weather. The 'weather' word is new to the language, but we can hardly say that the Manange learned a new cultural concept from the Nepali; this word is thus not easily classifiable as a core or cultural borrowing.

In the World Loanword Database, we categorized the effect of a loanword on the lexical stock of the recipient language as follows: insertion (the word is inserted into the vocabulary as a completely new item), replacement (the word may replace an earlier word with the same meaning that falls out of use, or changes its meaning), or coexistence (the word may coexist with a native word with the same meaning). For each loanword, we asked the contributors to specify the effect in these terms. Obviously, insertion refers to cultural borrowings, while replacement and coexistence refer to core borrowings. Our contributors were often unsure how to fill in these database fields, because the cultural/core distinction is somewhat problematic, as we just saw. Nevertheless, the information from these fields may prove useful.
The distribution of these three effect types in our database is as follows:

effect            number of (clear) loanwords
insertion         4823
replacement       1667
coexistence       2542
no information    3443

The lack of clarity about what a new concept is also means that information about this is not easy to get. Nevertheless, the World Loanword Database has a field ("Environmental salience") that indicates for loanwords whether the phenomenon was present before the contact or not. The overall result is (for clearly borrowed words):

phenomenon present only since contact: [...]
phenomenon present in pre-contact environment: 5524
phenomenon not present: 240
no information/not applicable: 2140

These figures seem to show that a very large (perhaps surprisingly large) part of the loanwords are core borrowings.

7.3. Therapeutic borrowing

Borrowing of new words along with new concepts (cultural borrowing) and borrowing for reasons of prestige (core borrowing) are the two most important reasons for borrowing, but borrowing has also been said to occur for therapeutic reasons, when the original word became unavailable. Two subcases of this are:

(i) Borrowing due to word taboo: In some cultures, there are strict word taboo rules, e.g. rules that prohibit a certain word that occurs in a deceased person's name, or a word that occurs in the name of a taboo relative (e.g. in Australian languages, Dixon 2002: 27, 43). In such cases, a language may acquire large parts of another language's basic lexicon, so that its genealogical position is recognizable only from its grammatical morphemes (Comrie 2000).

(ii) Borrowing for reasons of homonymy avoidance (cf. Rédei 1970: 11): If a word becomes too similar to another word due to sound change, the homonymy clash might be avoided by borrowing. Thus, it has been suggested that homonymy of earlier English bread (from Old English brǣde) 'roast meat' and bread (from Old English bread) 'morsel, bread'
led to the replacement of the first by a French loan (roast, from Old French rost) (cf. Burnley 1992: 493). However, English borrowed many other words from French, so whether the homonymy was a major reason for the borrowing here, and whether it is ever an important reason, is questionable (cf. also Weinreich's 1953: 58 cautionary remarks).

7.4. Adoption vs. imposition

Finally, we should consider the distinction between adoption and imposition that was briefly mentioned in §2 (Van Coetsem 1988, Guy 1990, Winford 2005). For borrowed structural patterns, this distinction is very important: Some borrowed phonological and syntactic patterns are due to native speakers borrowing (= adopting) features from another (dominant) language into their own language, and others are due to non-native speakers unintentionally retaining (= imposing) features of their native language on a language to which they are shifting (thus, imposition is called "interference through shift" by Thomason & Kaufman 1988). Imposed patterns survive only if a large number of speakers acquire a new language and shift to it. Thus, features of Indian languages survive in Indian English, but not in British English, where the number of speakers from India is not large enough to have an impact on the general language. Borrowing by imposition has also been called substrate or superstrate influence.

It is well-known that in imposition (or substrate/superstrate) situations, borrowing primarily concerns the phonology and the syntax, whereas in adoption (or adstrate) situations, the borrowing affects the lexicon first, before it extends to other domains of language structure. This is understandable, because second-language speakers cannot avoid phonological and syntactic interference from [...]

References

Aikhenvald, Alexandra. 2002. Language contact in Amazonia. Oxford: Oxford University Press.
Boumans, Louis & Caubet, Dominique. 2000.
Modelling intrasentential codeswitching: A comparative study of Algerian/French in Algeria and Moroccan/Dutch in the Netherlands. In Owens, Jonathan (ed.), Arabic as a minority language, 113-180. Berlin: Mouton de Gruyter.
Brown, Cecil H. 1999. Lexical acculturation in Native American languages. New York: Oxford University Press.
Buck, Carl Darling. 1949. A dictionary of selected synonyms in the principal Indo-European languages. Chicago: The University of Chicago Press.
Burnley, David. 1992. Lexis and semantics. In Blake, Norman (ed.), The Cambridge history of the English language, Vol. 2: 1066-1476, 409-499. Cambridge: Cambridge University Press.
Burrow, Thomas. 1946. Loanwords in Sanskrit. Transactions of the Philological Society 1946:1-30.
Clyne, Michael. 2003. Dynamics of language contact. Cambridge: Cambridge University Press.
Comrie, Bernard. 2000. Language contact, lexical borrowing, and semantic fields. In Gilbers, Dicky & Nerbonne, John & Schaeken, Jos (eds.), Languages in contact (Studies in Slavic and General Linguistics 28), 73-86. Amsterdam: Rodopi.
Croft, William. 2000. Explaining language change: An evolutionary approach. London: Longman.
Deroy, Louis. 1956. L'emprunt linguistique. (Bibliothèque de la Faculté de Philosophie et Lettres de l'Université de Liège 141). Paris.
Dixon, R. M. W. 2002. Australian languages. Cambridge: Cambridge University Press.
Grosjean, François. 1982. Life with two languages: An introduction to bilingualism. Cambridge, MA: Harvard University Press.
Guy, Gregory. 1990. The sociolinguistic types of language change. Diachronica 7:47-67.
Haspelmath, Martin. 2008. Loanword typology: Steps toward a systematic cross-linguistic study of lexical borrowability. In Stolz, Thomas & Bakker, Dik & Salas Palomo, Rosa (eds.), Aspects of language contact: New theoretical, methodological and empirical findings with special focus on Romancisation processes, 43-62. Berlin: Mouton de Gruyter.
Haugen, Einar. 1950. The analysis of linguistic borrowing.
Language 26:210-231.
Hock, Hans Henrich & Joseph, Brian D. 1996. Language history, language change, and language relationship. Berlin: Mouton de Gruyter.
Höfler, Manfred. 1981. Für eine Ausgliederung der Kategorie 'Lehnschöpfung' aus dem Bereich sprachlicher Entlehnung. In Pöckl, Wolfgang (ed.), Europäische Mehrsprachigkeit: Festschrift zum 70. Geburtstag von Mario Wandruszka, 149-153. Tübingen: Niemeyer.
Johanson, Lars. 2002. Structural factors in Turkic language contacts. London: Curzon.
Krier, Fernande. 1980. Lehnwort und Fremdwort im Maltesischen. Folia Linguistica 14:179-184.
Lehmann, Winfred P. 1962. Historical linguistics: An introduction. New York: Holt, Rinehart & Winston.
Matras, Yaron & Sakel, Jeanette. 2007. Grammatical borrowing in cross-linguistic perspective. Berlin: Mouton de Gruyter.
Mosel, Ulrike. 2004. Borrowing in Samoan. In Tent, Jan & Geraghty, Paul (eds.), Borrowing: A Pacific perspective, 215-232. Canberra: Pacific Linguistics, ANU.
Møller, Christen. 1933. Zur Methodik der Fremdwortkunde. København.
Muysken, Pieter. 2000. Bilingual speech. Cambridge: Cambridge University Press.
Myers-Scotton, Carol. 1993. Duelling languages: Grammatical structure in codeswitching. Oxford: Clarendon.
Myers-Scotton, Carol. 2002. Contact linguistics: Bilingual encounters and grammatical outcomes. Oxford: Oxford University Press.
Myers-Scotton, Carol. 2006. Multiple voices: An introduction to bilingualism. Malden, MA: Blackwell.
Peperkamp, Sharon. 2005. A psycholinguistic theory of loanword adaptations. In Ettlinger, M. & Fleischer, N. & Park-Doob, M. (eds.), Proceedings of the 30th Annual Meeting of the Berkeley Linguistics Society, 341-352. Berkeley, CA: The Society.
Poplack, Shana & Sankoff, David. 1984. Borrowing: The synchrony of integration. Linguistics 22:98-136.
Poplack, Shana & Sankoff, David & Miller, Christopher. 1988. The social correlates and linguistic processes of lexical borrowing and assimilation. Linguistics 26:47-104.
Rédei, Károly. 1970. Die syrjänischen Lehnwörter im Wogulischen.
The Hague: Mouton Ross, Malcolm, 1991. Refining Guy's sociolinguistic types of language change. Diachronlca BDL Poplack, Shana and Vansiarajan, a Tamil, Language vartzion and change Song, Jae Jung, 2008, The Korean language strectart, wse and vont, Londons Roald ge. Stewart, Thomas W, Jr- 2004, Lexical imposition: Okt Norse vocabulary in Scottish Gaelic, Diacbrontea 1Q)398-120. Seals, Christel. 2009. A different hind of gender problem: Maltese loan-svord gender froma a ‘ypologieal perspective. In Comrie, Bemard 8 Fabs, Ray & Hume, Eluaberh & Mifsod, Manwel & Stok, ‘Thomas & Vanhove, Martine (cds), Introducing Maltese Linguistics, 321-358, Amsterdam Ben smn Swadesh, Morris, 1953, Towards greater accuracy in lexicostaistic dating, Ling 2121-137, Chapter IIL Loanwords in che world’s languages: Findings and results Uri Tadmor The Lounword Typology (LWT) project has had severe! rangible results, including the ease studies in this volume (chapters I~d1) and the online World Loanword Database (WOLD, at hiep://nold livingsources.org), Together, they have mae it possible to conduct comparative investigations into vartous aspects of lesical bor rowing. This chapter presents some of the findings of the LWT project. Another result of the project, the Leipzig-Jakarta List of basic vocabulary, is also presented in this chapter (68). 1. Lexical borrowing across languages Lexical borrowing rates vary greatly among languages, but it is important to bear in mind thar the rates also eflect varying degrees of knowledge about each language. Some languages have a long written history and have been thoroughly studied for centuries; others were only recently documented, and very litle is known about theie histories. A low borrowing rate may thus indieate that a language has adopted few loanwords during its history, but it could also mean that linguists haxe not yet identified some of its loanwords. Moreover, aot all Languages in the sample are of the same age. 
Most are contemporaries, but Old High German records a much earlier stage of development, while the two creole languages (Saramaccan and Seychelles Creole) are only a few centuries old and have therefore not had much time to borrow words. Despite the difficulties that such discrepancies present, it is still possible to draw some basic conclusions.

The first point is obvious but nevertheless important to make: lexical borrowing is universal. No language in the sample, and probably no language in the world, is entirely devoid of loanwords. The average borrowing rate, at 24.2%, is substantial and higher than expected. Admittedly, there is a bias in the sample towards languages with many loanwords, because specialists on languages with few loanwords were less interested in joining the project. But it is clear that lexical borrowing is a very pervasive phenomenon.

What makes a language particularly amenable to lexical borrowing? Figures for the total numbers of words and (certain or probable) loanwords in the LWT project languages are presented in Table 1. Looking at the ten languages with the highest borrowing rates, it is clear that there is no one answer, as these languages exhibit very different typological and sociolinguistic types. The same can be said for the ten languages with the lowest borrowing rates. Whatever generalizations are formulated, counter-examples can probably be found not only among the world's thousands of languages, but even among the 41 languages in the sample. For example, one may postulate that Seychelles Creole has a low borrowing rate (10.7%) because it is a new language that has only come into being in recent history, and has therefore not had time to borrow many words. But the other creole in the sample, Saramaccan, has the sixth highest borrowing rate (38.3%).
So the most useful explanations appear to be language-specific rather than general. (Moreover, things like typological classification and sociolinguistic circumstances are not constant; they may, and often do, change during a language's history.) With regard to the discrepancy between the borrowing rates of the two creoles, the explanation is found in the fact that Saramaccan has undergone partial relexification by Portuguese words (Good, this volume). Words of Portuguese origin are considered to be loanwords, and account for the high borrowing rate. Seychelles Creole did not undergo relexification (Michaelis, this volume), so the explanation that it has not had time to borrow many words still holds. With time, its borrowing rate will surely rise to resemble that of non-creole languages.

Table 1: Lexical borrowing rates in LWT project languages
(The columns "Total words" and "Loanwords" are not legible in this copy and are omitted; "n.l." marks a value that is not legible.)

Borrowing type         Language             Loanwords as % of total
Very high borrowers    Selice Romani        n.l.
                       Tarifiyt Berber      51.7%
High borrowers         Gurindji             n.l.
                       Romanian             41.8%
                       English              41.0%
                       Saramaccan           38.3%
                       Ceq Wong             37.0%
                       Japanese             34.9%
                       Indonesian           34.0%
                       Bezhta               31.8%
                       Kildin Saami         30.5%
                       Imbabura Quechua     30.2%
                       Archi                29.5%
                       Sakha                29.0%
                       Vietnamese           n.l.
                       Swahili              27.8%
                       Yaqui                26.5%
                       Thai                 26.1%
                       Takia                25.9%
Average borrowers      (first rows n.l.), Mapudungun, White Hmong, Malagasy, Zinacantán Tzotzil, Wichí, Q'eqchi', Iraqw, Kali'na, Hawaiian, Oroqen, Hup, Gawwada, Seychelles Creole, Otomi (rates n.l.)
Low borrowers          Ket, Manange, Old High German, Mandarin Chinese (rates n.l.)

The languages in Table 1 can be roughly divided into four categories: very high borrowers, with a borrowing rate of over 50%; high borrowers, with a borrowing rate of 25–50%; average borrowers, with a borrowing rate of 10–25%; and low borrowers, with a borrowing rate of under 10%.
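The four-way division just described amounts to a simple threshold rule, which the following Python sketch illustrates. The function name `borrower_category` is hypothetical, and the treatment of rates that fall exactly on a boundary (25% and 50% counted as "high", 10% as "average") is an assumption of this sketch, since the text does not say which category a boundary value belongs to.

```python
def borrower_category(rate_percent: float) -> str:
    """Map a borrowing rate (in percent) to one of the four borrower types.

    Thresholds follow the text: over 50%, 25-50%, 10-25%, under 10%.
    Boundary handling is an assumption made for this sketch.
    """
    if not 0 <= rate_percent <= 100:
        raise ValueError("a borrowing rate must lie between 0 and 100 percent")
    if rate_percent > 50:
        return "very high borrower"
    if rate_percent >= 25:
        return "high borrower"
    if rate_percent >= 10:
        return "average borrower"
    return "low borrower"


# Rates quoted in the chapter text for the two creoles.
examples = {"Saramaccan": 38.3, "Seychelles Creole": 10.7}
for language, rate in examples.items():
    print(f"{language}: {rate}% -> {borrower_category(rate)}")
```

Under these thresholds the two creoles discussed above land in different categories, Saramaccan among the high borrowers and Seychelles Creole just inside the average range.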
The thresholds for these categories were set at the lower end of what would appear warranted by the figures, to compensate for the bias in the sample towards relatively high borrowers. This makes the categories applicable for classifying other languages as well. A category "very low borrowers" is not proposed because, as already mentioned, a low borrowing rate may reflect lack of knowledge as well as paucity of actual borrowing.

It would be impossible to discuss the circumstances behind each language's borrowing rate within this chapter; readers are referred to the case studies (Chapters 1–41). The discussion will therefore be confined to just two languages: Selice Romani, the highest borrower, and Mandarin Chinese, the lowest borrower. The sociolinguistic and other determinants which have brought about their extreme rates of borrowing will also be applicable, to some extent, to other languages.

Selice Romani is a dialect spoken by about 1,350 people in a village in southwestern Slovakia (Elšík, this volume). Mandarin Chinese is spoken by almost a billion people in China and beyond (Wiebusch & Tadmor, this volume). Romani has always been spoken as a minority language in linguistic situations dominated by other languages, since its ancestors left India around the 8th or 9th century CE.

[...]

2. Loanwords and semantic word classes

To examine the relationship between borrowing and word class membership, items on the LWT list were classified into one of the following categories: "nouns", "verbs", "adjectives", "adverbs" and "function words". It is important to remember, however, that the list consists of lexical meanings rather than of words. (For practical reasons the meanings are expressed in English, but in principle they could be expressed in any other language.)
Since meanings have no syntactic properties, these designations should not be construed to refer to syntactic categories. Thus "nouns" is used as convenient shorthand for "meanings of words denoting things or entities", "verbs" as shorthand for "meanings of words denoting events or actions", and so forth. Since the word class of the LWT meaning and that of the counterpart word in the project language did not always coincide, findings regarding word classes must be interpreted with some caution.

2.1. Borrowed content words vs. function words

A generalization often made in the literature, for which there is now strong empirical evidence, is that content words are more borrowable than function words. Not only do the total figures indicate this (Table 3.1), but individually too most languages in the sample have a higher proportion, often much higher, of borrowed content words as compared to function words (Table 3.2). Three languages, however, buck this trend. In White Hmong, 22.4% of the function words are loanwords compared to 21.1% of the content words, a slightly lower proportion. The results for Hup are much more robust: only 11.1% of content words are loanwords compared to 16.6% of all function words. Wichí exhibits similar proportions: 15.5% of content words are loanwords compared to 21.5% of function words. It is interesting to contrast the situation in Wichí with that of Imbabura Quechua, where 32.5% of content words are loanwords compared to only 2.3% of function words. Both languages borrowed predominantly from Spanish under broadly similar sociolinguistic circumstances, and it is not clear what brought about this great difference in borrowing behavior.
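The comparison in this section rests on one ratio per language: the share of loanwords among content words divided by the share among function words. A minimal Python sketch of that calculation follows, using only the four pairs of percentages quoted in the text above. The names `content_to_function_ratio` and `rates` are illustrative and not part of any LWT tooling.

```python
from typing import Optional


def content_to_function_ratio(content_pct: float, function_pct: float) -> Optional[float]:
    """Ratio of the loanword share among content words to that among function words.

    Returns None when no function words are borrowed, because the ratio
    is then undefined.
    """
    if function_pct == 0:
        return None
    return content_pct / function_pct


# (loan content words %, loan function words %) as quoted in the text.
rates = {
    "White Hmong": (21.1, 22.4),
    "Hup": (11.1, 16.6),
    "Wichí": (15.5, 21.5),
    "Imbabura Quechua": (32.5, 2.3),
}

ratios = {lang: content_to_function_ratio(c, f) for lang, (c, f) in rates.items()}

# A ratio below 1 marks a language that borrows function words more readily.
bucking = [lang for lang, r in ratios.items() if r is not None and r < 1]

for lang, r in ratios.items():
    print(f"{lang}: {r:.1f}")
print("Languages bucking the trend:", ", ".join(bucking))
```

The ratio makes the contrast drawn in the text easy to see: Wichí and Imbabura Quechua both borrowed mainly from Spanish, yet their ratios differ by more than an order of magnitude.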
Table 3.1: Borrowed content words and function words: total figures
(Columns: Category; All words; Loanwords; Loanwords as % of total. Rows: Content words; Function words; Total (all words). The figures are not legible in this copy.)

Table 3.2: Borrowed content words and function words by project language
(Columns: Language; Loanwords as % of all content words; Loanwords as % of all function words; Loan content words to loan function words ratio. The rows are ordered by descending ratio, from Imbabura Quechua at the top to White Hmong, Hup and Wichí, followed by Mandarin Chinese and Old High German. The row values are not reliably legible in this copy.)

[...]

Table 5.1: Loan nouns and loan verbs by project language
(Columns: Language; Loan nouns as % of all nouns; Loan verbs as % of all verbs; Loan noun to loan verb ratio. The rows are ordered by descending ratio, from Zinacantán Tzotzil at the top to Saramaccan and Mandarin Chinese at the bottom. The row values are not reliably legible in this copy.)

The borrowing of verbs as opposed to nouns is one area where structural constraints may play a significant role. As discussed above, the more isolating the recipient language, the less morphosyntactic adaptation is necessary for borrowing verbs as such; conversely, the more synthetic the language, the more adaptation is required. It is therefore much easier to borrow verbs into isolating languages than it is into synthetic languages. For example, Thai (Suthiwan & Tadmor, this volume) has borrowed many English verbs without any morphosyntactic modification, e.g. care (as khɛɛ) and cheer (as chia). The fact that these loanwords function (and have always functioned) as verbs in Thai is demonstrated, among other things, by their taking of the verbal negator mây.

Compared to Thai, Hebrew is highly synthetic, especially in its verbal paradigms, where one verbal root can take hundreds of forms. Moreover, in Modern Hebrew nouns do not have to belong to a particular noun class, while verbs can only be conjugated within one of seven verb classes. In other words, a noun can be borrowed without any morphosyntactic modification, but not a verb. In order to borrow a verb, a consonantal root must first be derived, and then it has to fit into one of the existing verbal classes. It is then used in conjunction with a large number of complex discontinuous morphemes.
To take a recent example, the English verb to chat has been borrowed into Hebrew with the restricted yet commonly used meaning 'to chat online'. First, a three-consonant root had to be derived; since chat only has two consonants, the second consonant was reduplicated, resulting in the root ch-t-t. Second, a suitable verb class had to be chosen (in this case the class known as gad, though this is by no means an obvious choice). Roots of Hebrew verbs cannot occur independently, so ch-t-t must be used in conjunction with various discontinuous morphemes, resulting in forms like lechotet 'to chat', chotátnu 'we chatted', and techoteti 'you (FEM.SG) will chat'. Nouns, on the other hand, can be borrowed without any morphosyntactic modification, as they only occur in two forms, singular and plural (the dual is rarely used with loanwords). The plural is expressed by two simple suffixes: -ot (for loanwords ending in -a, treated as feminine) and -im (for all other loanwords, treated as masculine). Thus even a long noun such as encyclopedia is easily borrowed as entsiklopédya, with the equally easily derived plural entsiklopédyot, while endocrinologist is borrowed as endokrinológ with the plural endokrinológim. On the other hand, a verb like reanalyze would be very difficult to borrow as such into Hebrew, because it is unclear how to derive a consonantal root from it. It would either be borrowed as a noun (reanaliza 'reanalysis') and the verb would be derived periphrastically (laasót reanaliza, lit. 'to do a reanalysis'), or it would be calqued (lenatéax mexadásh, lit. 'to dissect anew').

These examples suffice to demonstrate why it is relatively difficult to borrow verbs as such into synthetic languages, but quite easy to do so into isolating languages. However, whether speakers of a particular language actually do borrow verbs depends on social rather than linguistic factors.
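The two Hebrew adaptation rules described above (root derivation for verbs, suffix choice for noun plurals) are regular enough to be written down as code. The Python sketch below only restates those two rules as given in the text: the function names are hypothetical, romanized forms stand in for Hebrew script, and the choice of verb class, which the text calls far from obvious, is not modelled at all.

```python
from typing import List


def derive_root(consonants: List[str]) -> List[str]:
    """Extend a two-consonant stem to a three-consonant root by
    reduplicating the second consonant, as described for 'chat'.

    Stems that already have three consonants are returned unchanged;
    other cases are not covered by the text and raise an error.
    """
    if len(consonants) == 3:
        return list(consonants)
    if len(consonants) == 2:
        return [consonants[0], consonants[1], consonants[1]]
    raise ValueError("only two- and three-consonant stems are handled in this sketch")


def loan_noun_plural(noun: str) -> str:
    """Plural of a borrowed noun: -ot replaces a final -a (treated as
    feminine); -im is added to all other loanwords (treated as masculine)."""
    if noun.endswith("a"):
        return noun[:-1] + "ot"
    return noun + "im"


print("-".join(derive_root(["ch", "t"])))
print(loan_noun_plural("entsiklopédya"))
print(loan_noun_plural("endokrinológ"))
```

The asymmetry the text points to is visible in the code itself: the noun rule is a single suffix choice, whereas the verb rule fails outright for a stem such as reanalyze, where no three-consonant root suggests itself.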
None of the Mandarin verbs in the sample are borrowed, even though Mandarin is a highly isolating language (see Wiebusch & Tadmor, this volume). On the other hand, Berber (Kossmann, this volume) has borrowed a large number of verbs despite being highly synthetic, because it has been under heavy pressure from Arabic for a long time.

3. Loanwords and semantic fields

Words belonging to different semantic fields display wildly varying borrowing rates. However, different languages display a remarkable degree of consistency with regard to which fields are more or less affected by borrowing. While there are certainly cross-linguistic differences, most languages tend to borrow more words into similar fields, and the same fields turn up again and again as the ones most resistant to borrowing. A list of all semantic fields in the LWT meaning list, along with the borrowing rate for each one, can be found in Table 6.

Table 6: Borrowing by semantic field
("n.l." marks a value that is not legible in this copy.)

Semantic field                    Loanwords as % of total
Religion and belief               n.l.
Clothing and grooming             n.l.
The house                         37.2%
Law                               34.3%
Social and political relations    31.0%
Agriculture and vegetation        30.0%
Food and drink                    29.3%
Warfare and hunting               27.9%
Possession                        27.1%
Animals                           25.5%
Cognition                         24.2%
Basic actions and technology      23.8%
Time                              23.2%
Speech and language               22.3%
Quantity                          20.5%
Emotions and values               19.9%
The physical world                19.4%
Motion                            17.3%
Kinship                           15.0%
The body                          14.2%
Spatial relations                 14.0%
Sense perception                  11.0%
All words                         24.2%

The semantic fields most affected by borrowing are Religion and belief, Clothing and grooming, and The house. These semantic fields correspond to domains which have typically been most affected by intercultural influences. Examining the distribution and history of the world's major religions reveals why religious terminology constitutes the most borrowable part of the lexicon.¹
The world's largest religions by far are Christianity and Islam. Both came into be [...]

¹ Technical vocabulary would probably show an even higher borrowing rate, but for various reasons it was not included in the project.

[...] Each degree was then assigned a numerical score² by the editors, which enabled us to compute the average "unborrowed score" of each meaning. All the items on the LWT list were then ranked by the unborrowed score in descending order. The 100 most borrowing-resistant items, as determined by this method, are listed in Table 7. Only five meanings have an unborrowed score of 1, meaning they have no counterpart in any language that is probably or clearly a loanword.

The least-borrowed items on this list contain surprisingly few of the meanings traditionally associated with the notion of "basic vocabulary", such as body parts and important natural phenomena. Far more numerous, especially in the highest rankings, are function words and deictics, especially ones related to spatial organization: in, at, behind, above, under, outside, in front of, this, that, here, there, up, down. There are also several time deictics (today, yesterday, the day before yesterday, the day after tomorrow, but interestingly not tomorrow), as well as various person deictics (pronouns): I, you (both singular and plural), he/she/it, we (both inclusive and exclusive), and others. Interestingly, all the interrogatives among the 1,460 items on the LWT meaning list are ranked in the top 100 least-borrowed items: what, who, which, when, where, how, why, and how much.

However, this unweighted list is problematic as a list of borrowing-resistant meanings, for a number of reasons.

(i) Some lexical meanings are not represented by fixed lexical expressions in many or most languages (and have to be expressed by descriptive phrases).
Quite a few languages do not have counterparts for meanings such as 'day after tomorrow', 'younger sister', or 'married woman'. Such items do not constitute good evidence for low borrowability because of their poor "representation" in the combined database; the data are insufficient to determine whether they are borrowable or not.

(ii) Quite a few meanings are represented in many languages, but not by simple monomorphemic words. Rather, they comprise analyzable expressions such as complex words, compounds, and phrasal expressions. Such analyzable expressions are almost by definition created in the recipient language and hence could not normally count as loanwords. Therefore they are not relevant for studying borrowability.

(iii) Another important factor that does not figure in Table 7 is age. The longer a word exists in a language, the greater the opportunity it has to be replaced by a loanword. If a word has existed in a language for thousands of years without being replaced by a loanword, this clearly indicates high resistance to borrowing. On the other hand, if a word has only existed for a few years, it is not possible to tell whether it is borrowing-resistant: given sufficient time, it might be replaced by a loanword. Therefore, old words constitute much more reliable evidence for resistance to borrowing than new words. For each word, contributors were asked to note the earliest date to which each word could be attested or reconstructed.

² The assigned scores were as follows: No evidence for borrowing, 1.00; Very little evidence for borrowing, 0.75; Perhaps borrowed, 0.50; Probably borrowed, 0.25; and Clearly borrowed, 0.00.

Table 7: The 100 most borrowing-resistant items on the LWT meaning list
(Columns: Rank; Label; Unborrowed score. The entries are not reliably legible in this copy.)

Three additional factors [...]

In order to take into account these factors of representation in the database, analyzability/simplicity, and age, scores were computed for each of the factors, with values between 1 and 0 (as for the unborrowed score).

[...]

meat + flesh → meat/flesh
arm + hand → arm/hand
leg + foot → leg/foot
head louse + body louse → louse
to do + to make → to do/make

A few labels were also slightly edited for brevity, clarity, and consistency. Finally, the top 100 items were taken to produce the new basic vocabulary list. It is named the Leipzig-Jakarta list, after the locations where it was conceived and created.

Table 8: The Leipzig-Jakarta list of basic vocabulary
(Columns: Rank; Word meaning; Unborrowed; Age; Simplicity; Representation; Composite. The 100 rows are not reliably legible in this copy.)

The most important categories of meanings on the Leipzig-Jakarta list are described below.

Body parts constitute the most prominent group. Items from the semantic field The body make up only about a tenth of all the items on the 1,460-item LWT meaning list, but fully a quarter of the items on the Leipzig-Jakarta list of basic vocabulary. Most items represent external organs expected to be known to any normal speaker in any society: mouth, ear, nose, eye, arm/hand, leg/foot, and many others.

Universally present natural phenomena which are of importance to humans are very heavily over-represented on the list in comparison to their overall distribution. They include water, fire, stone/rock, rain, night, star, wind, and others.

[...]

It is interesting to see how the Leipzig-Jakarta list compares with the Swadesh list. In fact, there is a fair degree of correlation: 62 items on the lists overlap (Table 9.1). This means that a total of 38 items on the Leipzig-Jakarta list do not appear on the Swadesh list (Table 9.2) and vice versa (Table 9.3). Swadesh's intuitions thus appear to have been not far off the mark, although a 38% difference is substantial and can lead to rather different lexicostatistical and other results. Moreover, our findings indicated that quite a few items on the Swadesh list are not very basic. For example, round is ranked 371 on our list, and person is ranked 526.
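The overlap figures given here are plain set arithmetic on two lists of 100 items each (the size of the Swadesh list is implied by the "vice versa" in the text). The Python sketch below checks that arithmetic and runs the same comparison on a small sample. The sample sets hold only a handful of items that appear in Tables 9.1–9.3 and are not the full lists, and all names in the code are illustrative.

```python
LIST_SIZE = 100      # items on each list (assumed for the Swadesh list)
SHARED_ITEMS = 62    # items appearing on both lists, per the text

only_on_one_list = LIST_SIZE - SHARED_ITEMS
difference_pct = 100 * only_on_one_list / LIST_SIZE
print(f"{only_on_one_list} items per list are not shared ({difference_pct:.0f}%)")

# Small illustrative samples drawn from Tables 9.1-9.3 (NOT the complete lists).
leipzig_jakarta_sample = {
    "fish", "water", "tooth", "night", "dog",               # shared items
    "to laugh", "rope", "navel", "yesterday", "heavy",      # Leipzig-Jakarta only
}
swadesh_sample = {
    "fish", "water", "tooth", "night", "dog",               # shared items
    "sun", "woman", "heart", "feather", "person",           # Swadesh only
}

shared = leipzig_jakarta_sample & swadesh_sample
leipzig_jakarta_only = leipzig_jakarta_sample - swadesh_sample
swadesh_only = swadesh_sample - leipzig_jakarta_sample

print("shared:", sorted(shared))
print("Leipzig-Jakarta only:", sorted(leipzig_jakarta_only))
print("Swadesh only:", sorted(swadesh_only))
```

With the full lists in place of the samples, the same three set operations would yield the contents of Tables 9.1, 9.2 and 9.3 respectively.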
It is probably not a coincidence that both these terms are represented by loanwords in English. At any rate, the major advantage of the Leipzig-Jakarta list is that it has a strong empirical foundation and is thus a more reliable tool for scientific purposes.

Table 9.1: Items shared by the Swadesh list and the Leipzig-Jakarta list
(Columns: Leipzig-Jakarta list; Swadesh list. Of the 62 pairs, those legible in this copy include ash, big, bird, blood, bone, dog, fish, louse, name, new, night, nose, tongue, tooth and water.)

Table 9.2: Items on the Leipzig-Jakarta list but not on the Swadesh list
(Of the 38 items, those legible in this copy include to do/make, to hide, back, shade/shadow, wind, to blow, to laugh, wing, navel, child (kin term), yesterday, to crush/grind, 3SG pronoun, rope, to cry/weep and heavy.)

Table 9.3: Items on the Swadesh list but not on the Leipzig-Jakarta list
(Of the 38 items, those legible in this copy include all, bark, lie, seed, belly, full, sit, we, white, cold, green, sun, woman, head, heart, path, feather and person.)

6. Conclusion

This chapter presented some of the findings and results of the LWT project. Part of its aim has been to provide an empirical foundation to long-held beliefs, such as that content words are more borrowable than function words and that nouns are more borrowable than verbs.
Less expected, but not less important, has been the realization that borrowability in itself has limited (though interesting) applications. It becomes much more meaningful when used in conjunction with other variables, such as universality, stability, and simplicity. Thus used, it can be a useful aid for diachronic as well as synchronic description and analysis of languages.

An interesting issue which could not be explored due to insufficient data input was the correlation between frequency and borrowability. It seems logical that frequently used words would also be highly resistant to borrowing, because more time and effort would be needed for the borrowing to become established. Similarly, it is possible that small speech communities are more amenable to borrowing (and to language change in general) than large speech communities, because innovations could spread among the entire community more readily. Evidence from the LWT project, however, is inconclusive. These topics are left for future research.

Finally, it should be emphasized that the publication of this book is by no means the end of the Loanword Typology project. The World Loanword Database (WOLD) will be online for years to come, and will hopefully serve as a resource [...]

Chapter 1

Loanwords in Swahili*

Thilo C. Schadeberg

1. The language and its speakers

Swahili is spoken by approximately 75 million people in eastern and central Africa. The great majority of speakers live in Tanzania, Kenya and Uganda, where a standardized variety of Swahili has the status of national language. Coastal dialects and lingua franca varieties used outside Tanzania may deviate considerably from East African Standard Swahili, which is the variety documented in the LWT database.

Standard Swahili¹ is based on Kiunguja, the dialect of Zanzibar City.
"However, whereas Kiunguja has retained its distinctiveness as a dialect, standard Swahili has continued to expand and to market itself as a radically modernized version of Kiunguja" (Mkude 2005: 2). There may be only two or three million speakers of Kiunguja and other coastal varieties of Swahili (such as Kimvita from Mombasa), but the number of people for whom Standard Swahili is the first language (in a chronological sense) or the primary language (in terms of competence, importance, and usage frequency) is many times larger and rapidly growing in the urban centers.

Standard Swahili fills a wide range of functions. It is spoken at home, in the market and in shops, at work, in school, at religious and political meetings and in parliament, and it is the normal daily language of radio and television broadcasts. Swahili newspapers, journals and books of all kinds (fiction and non-fiction) have been published for many decades. Swahili is the language of traditional but still popular poetry and music, as well as of all modern genres of pop and rap.

In Tanzania, where Swahili has the status of official language on a par with English, the progress of standardization is monitored by the National Swahili Council known as BAKITA (Baraza la Kiswahili la Taifa). BAKITA cooperates with similar institutions in Kenya and Uganda. The East African Community and the African Union have recognized Swahili as one of their working languages. Swahili is scientifically documented and analyzed at academic institutes in Dar es Salaam and elsewhere in East Africa. It is also studied and taught worldwide at numerous universities in Europe, North America and East Asia. Several of these countries provide Swahili broadcasts for listeners in East Africa.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as follows: Schadeberg, Thilo C. 2009. Swahili vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1625 entries. <http://wold.livingsources.org/vocabulary/…>

¹ The external and internal genealogical classification of Swahili is dealt with at the start of §3.

2. Sources of data

Swahili is a well-documented language, particularly with regard to its lexicon, and loanwords have received much attention in the literature. The Swahili database is based on the perusal of dictionaries and loanword studies.² The documentation of Swahili lexicon (and grammar) started in the 19th century with the works of Ludwig Krapf and Edward Steere. The most complete dictionary representing pre-standard Swahili is Sacleux (1939). The same year also saw the publication of the first "Standard Swahili" dictionary by Johnson (1939a, 1939b). All these dictionaries attempted to mark loanwords, with varying degrees of precision in identifying the donor language and the particular source word. Two important book-length studies of Swahili loanwords are Krumm (1940) and Lodhi (2000). Zawawi (1979) has a strong bias towards postulating loanwords from Arabic, which makes her claim some fanciful etymologies for Swahili words that have undisputed Bantu origins. Geider (1995) provides a useful introduction to the study of loanwords (and neologisms) with annotated bibliographical references. The monumental monograph by Nurse & Hinnebusch (1993) on the linguistic history of Swahili is probably unique for a language that has only been written for a relatively short time.

3. History and contact situations

Swahili is a Bantu language. Its closest relatives are the Sabaki languages, i.e. Elwana and Pokomo spoken along the Tana River in Kenya, the Mijikenda varieties (e.g. Giryama) spoken in the immediate hinterland of Swahili towns and settlements along the Kenyan coast, and Comorian spoken on the Comoro Islands. A partial genealogical tree of Swahili is given in Figure 1 (based on Nurse & Hinnebusch (1993), chapter 1; see also their Map 1, p.
40).

Dictionaries not mentioned in the main text: (Swahili:) Höftmann with Mhando (1963), Höftmann with Herms (1979), Legère (1990), Sacleux (1948), TUKI (1981, 1986, 2001), Velten (1910, 1933); (Arabic:) Dozy (1881), Grosset-Grange (1993), Kazimirski (1846–1860), Kiekeby (2000), Lane (1863–1893); (Hindi:) Platts (1977), Wagenaar (1993); (Malay:) Wilkinson (1901, 1932); (Persian:) Steingass (1892); (other languages:) Kisbey (1906), Worms (1898). Loanword studies and other specialized studies not mentioned in the main text: Baldi (1988), Batibo (1996), Broomfield (1931), Chuwa (1988), Eastman (1991), Gower (1952), Gromova (2000), Heles (2001), Knappert (1972–1973, 1983, 1989), Krumm (1952), Lafon (1983), Legère (1987), Lodhi (1994), Maganga & Schadeberg (1992), Laccha (1959), McCall (1969), Nurse (198), Pasch & Strauch (1998), Ricks (1983), Teubner (1974), Tucker (1946–1947), Whiteley (1967), Zawawi (1978).

Figure 1: From Proto-Bantu to Swahili

Nurse & Hinnebusch (1993: 23) provide the following time frame: “An approximate date around or slightly later than 1 A.D. would seem reasonable for PNEC [proto-Northeast Coast], perhaps five hundred years later for PSA [proto-Sabaki], shortly after that for PSW [proto-Swahili]”. The split of Sabaki into distinct societies, and the subsequent spreading out of Swahili and Comorian along the coastline and to the offshore islands of East Africa, from Somalia in the north to Mozambique in the south, appears to have been a development completed by about 800 CE.

The Swahili people never formed a single geopolitical unit but stayed organized around their cities. These cities formed a network, competing with each other and sometimes even waging war against each other. The linguistic diversity was considerable, but contact was never lost and, as economic and political power shifted from one city to another, so shifted the currents of linguistic influence.
Such intra-Swahili borrowings have not been identified in the present study. Standard Swahili is based on the dialect of Zanzibar town, which gained prominence in the 19th century under Omani rule. We assume that Kiunguja was formed in this period, and for the purposes of the present study I consider all contributing Swahili dialects as its ancestors, concentrating on lexical borrowings from other languages.

3.1. Contact situations

Throughout its history, Swahili has been a contact language par excellence, and this common history of external contacts is important for the identity of Swahili-speaking peoples. Table 2 is an attempt to identify the contact situations in which the adoption of numerous loanwords occurred. Each situation is identified by a number followed by an estimated time period, a label characterizing the contact situation, and a list of the relevant donor languages. Not all donor languages are documented within the database. (Languages named as donors in the literature but which have probably exerted their influence through some other language are put in brackets.)

3.1.2. Contact situation 2: Hinterland neighborhood (800–2000)

The coastal settlements of the Swahili were not isolated from their immediate hinterland. The neighboring farming communities were mostly small and not politically organized on a large scale. Their relations with Swahili towns and villages appear on the whole to have been peaceful and mutually profitable. We may assume that immigrants from the hinterland were constantly assimilated into Swahili communities.

Identifying loans from neighboring Bantu languages is not easy because they are, more often than not, indistinguishable from cognates. It is also difficult to identify the exact donor language because of their resemblance to each other.
Sacleux (1939) is the major source indicating loans from such languages, often naming several languages as possible donors, e.g., mgono ‘fishtrap’ is marked as a loan from Zaramo, Zigua, Bondei, or Nyika. I have tried to identify at least one possible source form, which was often Sambaa because of its superior lexical documentation (LangHeinrich 1921; Besha 1993).

3.1.3. Contact situation 3: Indian Ocean trading network (800–1920)

Swahili came into being in the context of a large Indian Ocean trading network, which appears to have been dominated, linguistically speaking, by seafarers speaking Arabic. Other participants in this network were speakers of Hindi-related languages from the Indian subcontinent, possibly Persian, and, appearing more sporadically on the East African coast, Malagasy, Malay, and even Chinese.

Shipping on the Indian Ocean depended on the monsoon winds, the kusi blowing from the south from April to September, and the northerly kaskazi from November to March. A ship could sail from the mouth of the Indus River to the northern Swahili coast, do its trading and return home when the wind had turned around, all within one year; it would be more difficult to do this from or to one of the more southerly Swahili harbors. We therefore assume that there was intensive trading between the Swahili towns along the coastline and on the islands; the northern towns functioned also as entrepôts.
This is how Swahili varieties (and Comorian), in spite of considerable linguistic differences, kept a certain unity and often adopted the same loanwords.

Arabic was the dominant language of the Indian Ocean trade, and Arabs were no doubt frequent visitors as well as permanent residents in East Africa. We can only guess at the kind of Arabic that was spoken on the dhows sailing to and from East Africa, but we have given the label “Indian Ocean Arabic” to words having a particular affinity with items that we could trace to varieties of Arabic spoken on the southern coast of the Arabian Peninsula and on the shores of the Gulf and that