FORENSIC LINGUISTICS: THE APPLICATION OF LANGUAGE

DESCRIPTION IN LEGAL CONTEXTS

Malcolm Coulthard
Éditions de la Maison des sciences de l'homme | « Langage et société »

2010/2, no. 132 | pages 15 to 33


ISSN 0181-4095
ISBN 9782735113170
DOI 10.3917/ls.132.0015
Article available online at:
https://www.cairn.info/revue-langage-et-societe-2010-2-page-15.htm

Electronic distribution by Cairn.info for Éditions de la Maison des sciences de l'homme.

© Éditions de la Maison des sciences de l'homme. All rights reserved for all countries.

Reproduction or representation of this article, in particular by photocopying, is authorised only within
the limits of the general terms of use of the site or, where applicable, of the general terms of the
licence subscribed to by your institution. Any other reproduction or representation, in whole or in part,
in any form and by any means whatsoever, is prohibited without the prior written consent of the
publisher, except in the cases provided for by the legislation in force in France. Storage of the article
in a database is also prohibited.


Forensic Linguistics:
the application of language description in legal contexts

Malcolm Coulthard
Professor of Forensic Linguistics, Aston University, England
r.m.coulthard@aston.ac.uk

Introduction
It is now over forty years since Jan Svartvik published The Evans Statements:
A Case For Forensic Linguistics in which he demonstrated that incriminating
parts of a set of four linked statements – purportedly dictated to police officers
by one Timothy Evans and which incriminated him in the murder of
his wife and baby daughter – had a grammatical style measurably different
from that of uncontested parts of the same statements. It was later discovered,
after Evans had been convicted and executed for the double murder,
that both victims had actually been murdered by Evans’ landlord, John
Christie. Svartvik’s analysis marked the birth of a new discipline, Forensic
Linguistics. Little more happened for a quarter of a century, with the notable
exception of expert witness work by Roger Shuy in the United States (1993,
1998, 2002, 2005, 2008), but during the past fifteen years there has been
a rapid growth in the frequency with which legal professionals and courts
in a number of countries have called upon the expertise of linguists.
Forensic Linguistics has now come of age as a discipline. It has its
own professional association, The International Association of Forensic
Linguists, founded in 1993; a biennial international conference, which has
been held in Australia (twice), England, Germany, Holland, Malta, the USA
(twice) and Wales; and its own journal, The International Journal of Speech,
Language and the Law, founded in 1994 and shared with its sister organisation the

© Langage et société n° 132 – juin 2010


International Association for Forensic Phonetics and Acoustics. There
are three major introductory textbooks – Coulthard and Johnson (2007),
Gibbons (2003) and Olsson (2nd ed. 2008) – plus a growing number of
specialist monographs: Cotterill (2003), Eades (2008), Heffer (2005),
Heydon (2005) and Rock (2007), as well as a recently published Handbook
(Coulthard and Johnson 2010) and a second (Solan and Tiersma) already
in preparation. Modules in Forensic Linguistics, Language as Evidence
and Language and the Law are taught to undergraduate and masters level
students in a rapidly increasing number of universities worldwide while,
at the time of writing, there are three specialist Masters degrees at the
universities of Aston, Cardiff and Pompeu Fabra, in Barcelona, as well as
a growing number of Applied Linguistics and English Language Masters
which allow specialisation in forensic linguistics.
As will already be evident from the authors mentioned above, most
of the work so far has been done on English and in English speaking
countries which have adversarial legal systems, although there are
three notable European exceptions: Germany, Holland and Spain.
Germany benefits from having a state-funded group of linguists in
the Bundeskriminalamt and in Spain there is the increasingly important
Forensic Linguistics Lab based in the Institut Universitari de Lingüística
Aplicada at Pompeu Fabra University, a university happily named
after the famous Catalan linguist.
So far, though, there has been little interest in the area, let alone expert
witness activity, within France. This special issue of Langage et Société is a
significant step by Dr. Dominique Lagorgette towards first establishing a
group of forensic linguists in France and then, it is hoped, stimulating
research and introducing academic courses for students.

What do forensic linguists do?


Forensic linguistics can usefully be divided into three distinct areas of
investigation:
a) the language of written legal texts: here linguists are interested in both
the arcane vocabulary, complicated grammar and infrequent punctuation
which typify many legal texts and the consequent problems lay readers
have with these texts (see Tiersma 1999; Stygall 2010);
b) the spoken language of the legal process: here linguists examine the
nature of police interviews with suspects, the specialised rules which govern
interaction in courts of law, the problems created for vulnerable witnesses
and the difficulties experienced by those who do not speak the language of
the court (see Haworth 2010; Heffer 2005; Aldridge 2010; Hale 2010);
c) the linguist as expert witness: here linguists express opinions on the
confusability of rival trademarks, on the authorship of documents, on the
meaning of words and expressions and on the place of origin of asylum
seekers, to name a few (see Shuy 2002; Coulthard and Johnson 2007,
chapter 6; Eades 2010).
As is evident, forensic linguistics covers a very wide area and so for the
rest of this article I will leave on one side all work where linguists are simply
describing aspects of written and spoken legal language and will concentrate
on that subset of research and report writing where the linguist is trying to use
description to act upon and possibly change the world. Even then I only
have space for a few examples.

The written language of the law: legal-lay communication


The right to silence
In English speaking jurisdictions those accused of a crime not only have
the right to remain silent but also the right to be advised of their right
to remain silent. This right is embodied in a Police Caution in the UK and in
Australia and in the eponymous Miranda Warnings in the USA. However,
these texts, which were written to be read aloud or recited by police officers,
are, to say the least, not without their communicative problems. In
Australia, Gibbons (2001) reports having worked successfully with the
New South Wales police to improve the intelligibility of their Caution; by
contrast, in the UK, although Cotterill (2000) and Rock (2007) have both
published highly critical analyses of the version of the Caution as used in
England and Wales, this has so far been to no avail. A similar but even
worse situation is shown to exist in the USA where Shuy (1997) was one of
many to show the communicative problems and latterly Ainsworth (2010)
has demonstrated how the Supreme Court has worked actively to make it
progressively harder for those who have understood the communication
of their right to silence to actually claim it.

Pattern Jury Instructions


In jury trials the judge needs to instruct the jury about those specific aspects
of the law necessary for them to reach a decision on the basis of the evidence
presented to them. In the USA this instruction is typically achieved
through the judge reading out standardised prepared texts, called Pattern
Jury Instructions. The instructions, while legally secure, are often at best
opaque and at worst incomprehensible to the target audience; indeed
there are claims that some men in the USA have been wrongly sentenced
to death because the jury did not understand the relevant instruction
(Dumas 2000). See for example below the Pattern definition of reasonable
doubt which jurors need to understand in order to convict or acquit in a
criminal trial:
Reasonable doubt is that doubt engendered by an investigation of all the
proof in the case and an inability, after such investigation, to let the mind
rest easily as to the certainty of guilt. Reasonable doubt does not mean
a captious, possible or imaginary doubt. Absolute certainty of guilt is
not demanded by the law to convict of any criminal charge, but moral
certainty is required, and this certainty is required as to every proposition
of proof requisite to constitute the offense. (Tennessee Pattern Jury
Instructions - Criminal, 4th ed. 1995, 7:14)

Following the perceived debacle of the acquittal in the OJ Simpson
murder trial, California decided to revise its Pattern Jury Instructions on
Plain English principles and invited a distinguished Forensic Linguist
and Law Professor, Peter Tiersma, to join the committee. A revised set
of instructions was published in 2005, including this new definition of
reasonable doubt:
Proof beyond a reasonable doubt is proof that leaves you with an abiding
conviction that the charge is true. The evidence need not eliminate all possible
doubt because everything in life is open to some possible or imaginary
doubt. (California Plain Language Rewrite, 2005)

The Spoken Language of the Legal Process


Working with an Interpreter
There has been a great deal of valuable research into the problems of
interpreted interaction in courtrooms (Berk-Seligson 2002, Hale 2010),
although much less outside the courtroom (see Kredens and Morris
2010). The major finding seems to be that problems derive mainly
from the lack of sufficiently well-trained interpreters, a situation compounded
not infrequently by the lack of training for police and legal
professionals on how to work successfully with an interpreter. Sadly, it
appears that the barriers to professionalising interpreting are only partly
academic – poor remuneration works against attracting the best into the
field. However, insofar as they are academic, forensic linguists need to
involve themselves much more actively in training both the interpreters
and the legal professionals, with the latter group being the more difficult
to access. In an ideal world all police and legal professionals who work
in jurisdictions with a large proportion of non-native speakers should
have as an integral part of their training a short course on working with
an interpreter.
Vulnerable witnesses
A related communicative problem area is the difficulties experienced by
vulnerable witnesses in general and by children in particular. Here, as
Aldridge (2010) reports, there have been research-led changes in the ways
in which evidence is elicited. Innovations include the provision of screens
for certain witnesses and allowing some children’s evidence to be video-
recorded in advance of the trial and for them to give the rest of their evidence
by video link from outside the courtroom. Nevertheless, Aldridge
reports that sadly, in the main, these changes, while ameliorating, have not
substantially improved the situation and continuing investigation suggests
that more radical measures are needed:
It seems that the adversarial system cannot easily offer justice for vulnerable
witnesses and we must now turn our attention to research the popular
contention that inquisitorial style criminal proceedings hold inherent
advantages for vulnerable witnesses (2010: 313).

Comparative studies
One important area in which there is currently a major gap is the comparative
study of jury trials in adversarial and investigative jurisdictions. Heffer
(2005) provides a good account of the English system which could usefully
be used for a detailed comparative study with the French system. Three
significant differences between the two systems stand out. Firstly, the accused’s
right to silence from the moment of arrest (there is no need to give any
evidence in one’s own defence) has long been embodied in the
Anglo-American system. In Brazil the accused has only had this right since
2003, since when s/he has also had the right to a defence lawyer (Andrade
2010). Secondly, what counts as evidence differs: in the adversarial system the
jury only considers what was said in court; in the investigative system
written reports of previous interviews can be referred to as evidence. Thirdly,
in the adversarial system the jury deliberate and produce their collective
decision in isolation, chaired by one of themselves; in some investigative
systems the judge accompanies the jury into the jury room and elicits the
decision individually by a series of yes/no questions. Detailed comparative
description will be required if Aldridge’s proposal above to move to an
investigative system for vulnerable witnesses is to succeed.

The Linguist as Expert Witness


The role and position of the expert witness is rarely simple. In countries
with an investigative legal system, like most of Europe, the expert is usually
appointed by the court; in those operating an adversarial system experts are
typically contracted and paid by one of the disputing parties. Not unnaturally,
experts in the latter system are likely to feel some kind of loyalty,
consciously or unconsciously, to ‘their’ side, as Solan (2010) eloquently
points out. This has led to some experts shredding earlier versions of reports
and working notes which might give lawyers on the opposite side the basis
for destructive cross-examination. In the UK the courts have now moved
to emphasise that the role of the expert in the adversarial system should be
no different from that in the investigative system and experts must now
overtly acknowledge that they know that their overriding duty is to the
court and not to the party calling them to testify. They are also enjoined
to indicate in their reports any analytical findings which do not support
their conclusions as well as those which do.
In this article I do not have space to talk, even briefly, about forensic
phonetics, but anyone interested in reading about this topic can get a
general overview from chapter 7 of Coulthard and Johnson (2007) and a
much more detailed and technical introduction from Hollien (2002) and
Rose (2002).
It will come as no surprise that there are no specific forensic linguistic
tools and that the best training for a forensic linguist is a course in descriptive
and applied linguistics. However, each case will normally require a
different selection from the basic descriptive linguist’s toolbox. Chapter 6
of Coulthard and Johnson (2007) has examples of forensic analyses focusing
on morphological meaning, syntactic complexity, lexico-grammatical
ambiguity, lexical meaning, pragmatic meaning, speech-to-writing transformation,
narrative analysis and features of non-native language usage.

Trademarks
Roger Shuy (2002: 95-109) reports his contribution to the case of
McDonald’s Corporation v Quality Inns International, Inc, which revolved
around whether McDonald’s could claim ownership not simply of the
name McDonald’s but also of the initial morpheme ‘Mc’ and thereby
prevent its use in other trademarks. The case began in 1987 when Quality
Inns announced that they were going to create a chain of basic hotels and
call them McSleep. McDonald’s decided to challenge the McSleep mark,
claiming it was a deliberate attempt to draw on the goodwill and reputation
of the McDonald’s brand.
In supporting their case McDonald’s pointed out that they had deliberately
set out, in one advertising campaign, to create a ‘McLanguage’
with Ronald McDonald teaching children how to ‘Mc-ise’ the standard
vocabulary of generic words to create ‘McFries’, ‘McFish’, ‘McShakes’ and
even ‘McBest’. Fanciful as this linguistic imperialism might seem to linguists
and even to ordinary users of the language, particularly to those of
Scottish or Irish descent who would seem to be in danger of losing their
right to use their own names as trademarks, the lawyers took the claim very
seriously. Quality Inns’ lawyers asked Shuy to help with two linguistic
arguments: firstly, that the morpheme ‘Mc’ was in common use productively, in
contexts where it was not seen to be linked in any way to McDonald’s and
secondly, that such examples showed that the prefix, originally a patronymic
and equivalent in meaning to the morpheme son in Johnson, had become
generic and thus now had a meaning of its own, which was recognisably
distinct from both of the other major meanings, ‘son of’ and ‘associated
with the McDonald’s company’.
Shuy used a corpus linguistics approach and searched to find instances
of what one might call ‘Mcmorphemes’. Among the 56 examples he found
were general terms like McArt, McCinema, McSurgery and McPrisons,
as well as items already being used commercially such as the McThrift
Motor Inn, a budget motel with a Scottish motif, and McTek, a computer
discount store which specialised in Apple Mac computer products. On
the basis of such examples, Shuy argued that the prefix had become, in
the language at large, an independent lexical item with its own meaning
of ‘basic, convenient, inexpensive and standardized’. Rather than resort to
corpus evidence themselves, McDonald’s hired market researchers to access
the public’s perception of the prefix directly, through interview
and questionnaire. Their experts reported, unsurprisingly, that their tests
confirmed that consumers did indeed associate the prefix with McDonald’s,
as well as with reliability, speed, convenience and cheapness. Faced with
this conflicting evidence, the judge ruled in favour of McDonald’s, thereby
giving them massive control over the use of the morpheme ‘Mc’.
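
A corpus search of the kind Shuy describes can be sketched in a few lines of Python. This is only a minimal illustration and not Shuy's actual procedure, which is not detailed here: the regular expression, the `KNOWN_NAMES` exclusion list and the sample sentence are all assumptions introduced for the example.

```python
import re
from collections import Counter

# Hypothetical exclusion list: ordinary surnames and the brand name itself,
# which should not be counted as productive uses of the prefix.
KNOWN_NAMES = {"McDonald", "McCarthy", "McGregor"}

# "Mc" followed directly by a capitalised word, e.g. McArt, McSurgery.
MC_PATTERN = re.compile(r"\bMc[A-Z][a-z]+\b")


def find_mc_coinages(text, known_names=KNOWN_NAMES):
    """Count Mc-prefixed forms in a text, ignoring listed proper names."""
    hits = MC_PATTERN.findall(text)
    return Counter(h for h in hits if h not in known_names)


# Invented sample sentence, used only to show the function at work.
sample = (
    "Critics complain about McArt and McCinema, while McDonald's sells "
    "burgers. McArt is everywhere, says Professor McCarthy."
)

coinages = find_mc_coinages(sample)
print(coinages)
```

In practice the exclusion list would need to be far larger, and every hit would still have to be read in context to decide whether the prefix carries the generic meaning Shuy proposed.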

Investigative analysis of texts


Sometimes the police have no clues to the identity of a criminal and a
linguist may be asked to identify any sociolinguistic features which could
provide information about an author. Grant, in Coulthard, Grant and
Kredens (forthcoming) reproduces an extract about the death of Diana,
Princess of Wales, taken from an anonymous letter sent on headed notepaper
and purporting to come from a UK police force:
She was an innocent girl who tried to do her best in a world governed by
old cruel farts. […] Sir, the whole system stinks. Sometimes, I am ashamed
to be white by the things others are allowed to say and do. Why did many
stand by and allow Diana to be killed? Surely, this cannot be right? Many
in this force are gutted by the things we have come to know and are told to
keep quiet about. Sir, it is time to bring these shameful things out into the
open. Please. Don’t let our country go down the pan just to protect the
interests of a few bad-minded people. (highlighting in italic added to help
the reader).
There are a number of sociolinguistic clues in this letter which can help
provide a profile for the writer. The most notable are
the dialect items ‘innocent girl’, ‘gutted’, ‘these shameful things’ and ‘bad-minded
people’. A simple Internet search shows that, although the writer claims to
be a white policeman, ‘bad-minded people’ is a phrase common in Jamaican
English. On its own this can only be taken as evidence that the writer has
had contact with Jamaican English, but in this case the author was found
to be a black British man of Jamaican origin who had never served in the
police.

Theoretical basis of authorship assignment


Most linguists approach the problem of questioned authorship from the
theoretical position that every native speaker has their own distinct and
individual version of the language they speak and write, their own idiolect,
and the assumption that this idiolect will manifest itself through distinctive
and idiosyncratic choices in texts. Every speaker has a very large active
vocabulary built up over many years, which will differ from the vocabularies
which others have similarly built up, not only in terms of actual
items but also in preferences for selecting certain items rather than others.
Thus, whereas in principle any speaker/writer can use any word at any
time, speakers in fact tend to make typical and individuating co-selections
of preferred words (Hoey 2005). This implies that it should be possible
to devise a method of linguistic fingerprinting – in other words that the
linguistic ‘impressions’ created by a given speaker/writer should be usable,
just like a signature, to identify them. In reality, the concept of the linguistic
fingerprint is an unhelpful, if not actually misleading metaphor, at least
when used in the context of forensic investigations of authorship, because
it leads us to imagine the creation of massive databanks consisting of representative
linguistic samples (or summary analyses) of millions of idiolects,
against which a given text could be matched and tested. In fact such an
enterprise is, and for the foreseeable future will continue to be, impractical if
not impossible. The value of the physical fingerprint is that every sample is
both identical and exhaustive, that is, it contains all the necessary information
for identification of an individual, whereas, by contrast, any linguistic
sample, even a very large one, provides only very partial information about
its creator’s idiolect. This situation is compounded by the fact that many
of the texts which the forensic linguist is asked to examine are very short
indeed – most suicide notes and threatening letters, for example, are well
under 200 words long and many consist of fewer than 100 words.
Nevertheless, the situation is not as bad as it might at first seem, because
such texts are usually accompanied by information or clues which massively
restrict the number of possible authors. Thus, the task of the linguistic
detective is never one of identifying an author from millions of candidates
on the basis of the linguistic evidence alone, but rather of selecting (or of
course deselecting) one author from a very small number of candidates,
usually fewer than a dozen and in many cases only two.
An early and persuasive example of the forensic significance of idiolectal
co-selection was the Unabomber case. Between 1978 and 1995, someone
living in the United States, who referred to himself as FC, sent a series of
bombs, on average once a year, through the post. At first there seemed to be
no pattern, but after several years the FBI noticed that the victims seemed to
be people working in Universities and Airlines and so named the unknown
individual the Unabomber. In 1995 six national publications received a
35,000-word manuscript, entitled Industrial Society and its Future, from someone
claiming to be the Unabomber, along with an offer to stop sending bombs
if the manuscript were published.
In August 1995, the Washington Post published the manuscript as a
supplement and three months later a man contacted the FBI with the
observation that the document sounded as if it had been written by his
brother, whom he had not seen for some ten years. He cited in particular
the use of the phrase “cool-headed logician” as being his brother’s terminology,
or in our terms an idiolectal preference, which he had noticed and
remembered. The FBI traced and arrested the brother, who was living in
a wooden cabin in Montana. They found a series of documents there and
performed a linguistic analysis – one of the documents was a 300-word
newspaper article on the same topic, which he had written a decade earlier.
The FBI analysts claimed major linguistic similarities between the 35,000-word
and the 300-word documents: they shared a series of lexical and grammatical
words and fixed phrases, which, the FBI argued, provided linguistic
evidence of common authorship.
The defence contracted a linguist, who counter-argued that one could
attach no significance to these shared items because anyone can use any
word at any time and therefore shared vocabulary can have no diagnostic
significance. The linguist singled out twelve words and phrases for particular
criticism, on the grounds that they were items that could be expected
to occur in any text that was arguing a case:
at any rate; clearly; gotten; in practice; moreover; more or less; on the other
hand; presumably; propaganda; thereabouts; and words derived from the
roots or ‘lemmas’ argu* and propos*.
The FBI searched the internet, which in those days was a fraction of
the size it is today, but even so they discovered some 3 million documents
which included one or more of the twelve items. However, when they
narrowed the search to documents which included instances of all twelve
items, they found a mere 69 and, on closer inspection, every single one
of these documents proved to be an internet version of the 35,000-word
manifesto. This was a massive rejection of the defence expert’s view of
text creation as purely open choice, as well as a powerful example of the
idiolectal habit of co-selection and an illustration of the consequent forensic
possibilities that idiolectal co-selection affords for authorship attribution.
For an accessible version of events, from a literary-linguistic scholar who
wrote a report on the language of the manuscript, see Foster (2001). The
full text of the Unabomber manuscript is available at:
http://eserver.org/cyber/unabom.txt/.
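
The contrast between documents containing one or more of the twelve items and documents containing all twelve can be illustrated with a short Python sketch. It does not reproduce the FBI's search, whose details are not given here. Only the list of twelve items comes from the defence linguist's list quoted above; the function names and the three toy documents are invented for illustration.

```python
import re

# The twelve items singled out by the defence linguist. The last two are
# roots, matched here as prefixes (argue, arguing, proposal, ...).
MARKERS = [
    r"at any rate", r"clearly", r"gotten", r"in practice", r"moreover",
    r"more or less", r"on the other hand", r"presumably", r"propaganda",
    r"thereabouts", r"argu\w*", r"propos\w*",
]
PATTERNS = [re.compile(r"\b" + m + r"\b", re.IGNORECASE) for m in MARKERS]


def markers_present(text):
    """Return how many of the twelve items occur at least once in the text."""
    return sum(1 for p in PATTERNS if p.search(text))


def co_selection_counts(documents):
    """Return (documents with any item, documents with all twelve items)."""
    any_hits = sum(1 for d in documents if markers_present(d) >= 1)
    all_hits = sum(1 for d in documents if markers_present(d) == len(PATTERNS))
    return any_hits, all_hits


# Invented toy documents, not real search results.
toy_documents = [
    "At any rate, it is clearly the case that things have gotten worse. "
    "In practice, moreover, the propaganda is more or less constant. "
    "On the other hand, one might presumably argue, in 1950 or thereabouts, "
    "that the proposal was sound.",
    "Clearly the weather has gotten colder.",
    "The cat sat on the mat.",
]

print(co_selection_counts(toy_documents))
```

The point of the sketch is the gap between the two counts: each item is common on its own, so many documents pass the first test, while very few pass the second.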

Mistakes and errors


It is a basic tenet of applied linguists that not only is language rule-governed,
but that so also is its production in written or spoken form, although,
of course, any spoken or written text may display items which break the
rules of the standard language. We can divide such rule-breaking into two
categories: ‘performance’ mistakes, where the speaker/writer knows that
s/he has broken a rule, and ‘competence’ errors, where the speaker/writer is
working with a set of rules which, though non-standard, are rules which
s/he nevertheless follows consistently. In the short texts that are typically
the focus of the forensic linguist, it is usually only possible to focus on the
grammatical and orthographic rule-breaking, because in order to examine
characteristic vocabulary choices which are also idiolectal one needs much
more textual data than is typically available.
The most difficult author-identification cases are those involving anonymous
letters, because there is usually a fairly large number of potential
authors and only a small amount of written text to analyse. For this reason,
success is in the main limited to those cases which involve semi-literate
and/or non-native authors, who necessarily provide a comparatively large
number of idiolectal mistakes and errors in a comparatively small amount
of text. Obviously all intending anonymous letter writers should use a
word-processor spell-checker and also the style-improver options in order
to homogenise and thereby disguise their style.
Below I reproduce a few short extracts from a typed anonymous letter,
which the addressee-company suspected was written by one of its own
employees. I have highlighted the words which contain non-standard
features in italic (there are many more instances of these particular
phenomena in the rest of the letter):
… I hope you appreciate that i am enable to give my true idenitity as
_ this wolud ultimately jeopardize my position…
… l would like to high light my greatest concern…
… have so far deened it unnecessary to investegate these issus…

There are several interesting non-standard features immediately apparent,
although one of the problems of dealing with typed text is that errors
and mistakes may be confused and compounded: one may not know,
for any given item, particularly if it only occurs once, whether the ‘wrong’
form is the product of a mis-typing or a non-standard rule. For instance, if
a (British English) text includes the word ‘color’, is this a typing mistake or
a spelling error, or even worse the result of the computer user being unable
to change the spell-checker to British English?
In examining the non-standard items in the extracts above we note, firstly,
that the writer is an inexperienced typist: the first person pronoun “I” appears
also as “i” and as the very unusual numeral “l”; secondly, some of the words
have metathesized (reversed) letters and others additional or omitted letters;
thirdly, the writer has serious problems when spelling words containing
unstressed vowels - thus we have the following spellings “enable” = “unable”,
“investegate” = “investigate” and elsewhere “except” = “accept”; fourthly, the
writer is unsure about when to write certain sequences of morphemes as a
single word and when as two separate words - thus “high light” and “with
out”. In addition, but not exemplified here, there are homonym problems,
“weather” appears for “whether” and “there” for “their”. Finally, the writer
has some grammatical problems: the frequent omission of markers of past
tense and of the 3rd person singular present tense and even of articles -
“have now (a) firm intention”. Collectively these mistakes and errors are
idiosyncratic and idiolectally distinctive and proved to be instanced in the
authenticated letters of only one of the eight employees who had access to
the information contained in the threatening letter. He turned out to be
the employee already suspected by the company.
The investigation of the authorship of text messages is a new area of
study. In a growing number of murder cases the mobile or cell phones
of people later found dead have sent messages after the time at which
the police suspect they had already died. Text messaging is a very
interesting linguistic phenomenon because there is a great deal of freedom
in encoding; the abbreviations are not yet fixed, and so even small
samples of usage can be distinctive and allow a linguist to express an
opinion on the probability that the dead person or the accused sent
one or more of the suspect messages. In my latest case, suspect text
messages included the items “I will”, “yes”, “come” and “home” when,
in frequent messages sent in the previous three days, the owner of the
cellphone preferred the forms “ill”, “ya”, “com”, and “hme”
(<http://www.royalgazette.com/rg/Article/article.jsp?sectionId=60&articleId=7d987ab3003000c>).
It is usually virtually impossible to get access to the full data in text
messaging cases, but see
<http://www.thetext.co.uk/victoria_couchman_case_texts.htm> for all the
messages in a very recent case.
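The comparison of preferred forms described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the variant pairs are the four quoted in the text, the function names (`variant_counts`, `preferred_forms`) are hypothetical, and the sample messages are invented, not data from any case.

```python
import re
from collections import Counter

# Competing spellings of the same item, taken from the forms quoted in the text.
VARIANTS = {
    "I will": ["i will", "ill"],
    "yes": ["yes", "ya"],
    "come": ["come", "com"],
    "home": ["home", "hme"],
}

def variant_counts(messages):
    """Count how often each competing form occurs in a list of messages."""
    counts = Counter()
    for message in messages:
        text = message.lower()
        for forms in VARIANTS.values():
            for form in forms:
                # Whole-word match, so "com" is not counted inside "come"
                # and "ill" is not counted inside "will".
                pattern = r"\b" + re.escape(form) + r"\b"
                counts[form] += len(re.findall(pattern, text))
    return counts

def preferred_forms(messages):
    """Return, for each item, the form used most often (None if never used)."""
    counts = variant_counts(messages)
    result = {}
    for label, forms in VARIANTS.items():
        best = max(forms, key=lambda form: counts[form])
        result[label] = best if counts[best] > 0 else None
    return result

# Invented messages for illustration only.
known = ["ill com hme at 6", "ya ill b there", "com hme soon"]
suspect = ["I will come home later", "yes I will"]

print(preferred_forms(known))
print(preferred_forms(suspect))
```

A count of this kind only describes usage; it does not by itself yield the probability statement a linguist would offer, and a form such as “ill” is ambiguous with the ordinary word, so each hit would still need to be read in context.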

Plagiarism detection
In most plagiarism cases that involve students, there is little doubt about
guilt, as these two examples of essay openings from Johnson (1997: 214)
demonstrate – all items which student B ‘shares’ with student A are
highlighted in italic:
A. It is essential for all teachers to understand the history of Britain as a
multi-racial, multi-cultural nation. Teachers, like anyone else, can be
influenced by age old myths and beliefs. However, it is only by having an
understanding of the past that we can begin to comprehend the present.
B. In order for teachers to competently acknowledge the ethnic minority, it
is essential to understand the history of Britain as a multi-racial, multi-cultural
nation. Teachers are prone to believe popular myths and beliefs; however, it
is only by understanding and appreciating past theories that we can begin to
anticipate the present.
Even these short extracts provide enough evidence of shared items to
question the originality of at least one of the essays. When this level of
sharing is also instanced in other parts of the same texts there is no room
for doubt or dispute. The case of essay C, however, is not as clear-cut (items
which C shares with one or both of essays A and B are highlighted):
C. It is very important for us as educators to realise that Britain as a nation
has become both multi-racial and multi-cultural. Clearly it is vital for teachers
and associate teachers to ensure that popular myths and stereotypes
held by the wider community do not influence their teaching. By examining
British history this will assist our understanding and in that way be better
equipped to deal with the present and the future.
Even though there is still quite a lot of shared lexical material here, it is
evident that the longest identical sequences are a mere three running words.
Even so, one would still want to categorise this degree of lexical overlap,
if instanced in other parts of the text, as unacknowledged, though more
sophisticated, borrowing and therefore as plagiarism, even if it doesn’t fit
easily within the observation on the University of Birmingham’s website
that ‘Typically, substantial passages are “lifted” ….’.
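The length of the longest identical sequence of running words, the measure referred to above, can be computed mechanically. The following Python sketch is one possible way of doing so, assuming a simple lower-casing tokeniser and using `difflib.SequenceMatcher` from the standard library; the function names are hypothetical and the two inputs are the opening sentences of essays A and B quoted earlier.

```python
import re
from difflib import SequenceMatcher

def tokens(text):
    """Lower-case word tokens; internal hyphens are kept (e.g. multi-racial)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def longest_shared_run(text_a, text_b):
    """Return the longest sequence of running words found in both texts."""
    a, b = tokens(text_a), tokens(text_b)
    # autojunk=False stops difflib from discarding very frequent words.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a : match.a + match.size]

essay_a = ("It is essential for all teachers to understand the history of "
           "Britain as a multi-racial, multi-cultural nation.")
essay_b = ("In order for teachers to competently acknowledge the ethnic "
           "minority, it is essential to understand the history of Britain "
           "as a multi-racial, multi-cultural nation.")

run = longest_shared_run(essay_a, essay_b)
print(len(run), " ".join(run))
```

As the discussion of essay C shows, a short longest run does not rule out borrowing, which is why sequence length alone is a weak diagnostic.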
I will not discuss here the important question of whether a significant
proportion of student written texts, which technically fall within the textual
definition of plagiarism, are not the results of deliberate attempts to deceive
at all, but rather a consequence of what is coming to be known as ‘patchwriting’,
that is genuine but flawed attempts by students, who have somehow
failed to acquire the academic rules for acknowledging textual borrowing,
to incorporate the work of others into their own texts (see Pecorari 2008).
Johnson’s solution to the detection of this kind of student plagiarism or
collusion was to move away altogether from using strings or sequences of
items as diagnostic features and to focus instead on the percentage of shared
individual lexical types and tokens as a better measure of derivativeness. An
automated version of this analytic method, produced by Woolls (2002), is
now available as the computer program Copycatch Gold. Intensive testing
has shown that this measure of lexical overlap successfully separates those
essays which share common vocabulary simply because they are written
on the same topic from those which share much more vocabulary because
one or more of them is derivative (see Woolls and Coulthard 1998). For
example, in Johnson’s study, whereas essays A, B and C shared 72 different
lexical types in their first 500 words, a set of three other essays from the
same batch, whose authors had not colluded, shared only 13 lexical types,
most of which were central to the topic under discussion. Further work
(Woolls 2003) has shown that the most significant evidence is not the
mere quantity of shared lexis, but rather the fact that, in the case of some
shared items, both texts have not only selected them but also used them
only once. Such ‘once-only’ items, or hapaxes as corpus linguists call them,
are, by definition, not central to the main concern of the text; otherwise
they would have been used more frequently. The chances of two writers
independently choosing several of the same words for single use are so
remote as to be discountable.
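The two measures just described, shared lexical types and shared once-only items, can be sketched in a few lines of Python. This is not the Copycatch Gold implementation, whose internals are not described here; the function names, the small list of function words and the choice of the smaller text as the denominator for the percentage are all assumptions made for illustration.

```python
import re
from collections import Counter

# A deliberately small, hypothetical list of function words to exclude;
# a real study would use a fuller list.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "is", "it",
    "for", "as", "by", "that", "we", "can", "be",
}

def lexical_counts(text):
    """Frequency of each lexical (non-function) word in the text."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return Counter(w for w in words if w not in FUNCTION_WORDS)

def overlap_profile(text_a, text_b):
    """Shared lexical types and shared hapaxes for a pair of texts."""
    counts_a, counts_b = lexical_counts(text_a), lexical_counts(text_b)
    shared_types = set(counts_a) & set(counts_b)
    # Hapaxes: words used exactly once within a text.
    hapaxes_a = {w for w, n in counts_a.items() if n == 1}
    hapaxes_b = {w for w, n in counts_b.items() if n == 1}
    smaller = min(len(counts_a), len(counts_b))
    return {
        "shared_types": len(shared_types),
        # Assumption: percentage is taken relative to the smaller vocabulary.
        "shared_type_pct": 100 * len(shared_types) / smaller if smaller else 0.0,
        "shared_hapaxes": sorted(hapaxes_a & hapaxes_b),
    }

# Invented toy texts for illustration only.
profile = overlap_profile(
    "the cat sat on the mat and the dog barked",
    "a dog sat near the mat while the cat slept",
)
print(profile)
```

On real essays the figures only become meaningful when set against pairs known to be independent, such as the uncolluding essays from the same batch mentioned above.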
A version of Woolls’ Copycatch is now being used by UCAS, the
agency through which all English students apply for university places. As
part of their application every student has to write a personal statement
and some statements are, sadly, not sufficiently personal. In 2007 an
investigation found that 234 of the personal statements submitted by
applicants for medical degree courses reported a dramatic incident
involving “burning a hole in pyjamas”, which was based on a model statement
on a website designed to help applicants. In 2008 one girl submitted a
personal statement to UCAS with the following opening - the items in
italic are her contribution, the rest of the text is borrowed directly from
the website.
Ever since I burnt holes in my dress after experimenting with my brother’s
chemistry set when I was 10, I have always been passionate about the
sciences. Following several visits to the local hospital during my teenage
years as a result of minor accidents, the idea of a career that would help people
always made physiotherapy a natural choice.

Plagiarism in translation
There is a long tradition of people translating texts into other languages
without acknowledgement, which started well before there was a concept
of the ownership of ideas and the textualisations of these ideas and before
plagiarism came to be seen as an academic sin. It is obviously more difficult
to demonstrate plagiarism through translation than same-language
plagiarism, although one looks first for the evidence of shared content and the
very similar sequencing of the content typical of same-language plagiarism.
Of more interest to linguists are cases where we have not one but two or
more translations of the same text for comparison purposes. One would
naturally expect more similarity between two translated texts than between
two original texts on the same topic, because the translations are
necessarily constrained by the wording of the original. Thus, for example, one
would expect translations of the same text to have more shared hapaxes
and even more shared phrases, and consequently one would expect it to be
more difficult to demonstrate plagiarism.
Turell (2004) discusses the case of one Spanish translator of
Shakespeare’s Julius Caesar accusing the author of a later translation of
plagiarism and outlines the linguistic strategies she used to demonstrate
it. She was fortunate in that there were also two earlier published
translations in addition to the supposed plagiarised text and the one from
which it was said to have been plagiarised. She could thus compare all
four translations using Copycatch Gold, each with every other one, that is
a total of 6 pair comparisons, and then work out an expected baseline for
shared vocabulary, for shared hapax words and for shared hapax phrases.
Table 1 below is a summary of her findings.

Table 1: Comparisons between 4 translated texts
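The arithmetic of the design, four texts giving six pair comparisons from which a baseline is derived, can be sketched as follows. This is an illustrative Python sketch only: the labels T1 to T4, the toy strings and the use of the mean of the undisputed pairs as the baseline are assumptions, and the scoring function is a simple shared-hapax count, not the Copycatch Gold measure.

```python
import re
from collections import Counter
from itertools import combinations
from statistics import mean

def hapaxes(text):
    """Words used exactly once in the text (letters only, any alphabet)."""
    counts = Counter(re.findall(r"[^\W\d_]+", text.lower()))
    return {word for word, n in counts.items() if n == 1}

def pairwise_scores(texts):
    """Score every unordered pair; n texts give n * (n - 1) / 2 pairs."""
    return {
        (x, y): len(hapaxes(texts[x]) & hapaxes(texts[y]))
        for x, y in combinations(sorted(texts), 2)
    }

def baseline_excluding(scores, disputed_pair):
    """Mean score over all pairs except the disputed one."""
    others = [s for pair, s in scores.items() if set(pair) != set(disputed_pair)]
    return mean(others)

# Invented placeholder texts; the labels T1..T4 are hypothetical.
texts = {
    "T1": "alpha beta gamma delta",
    "T2": "alpha beta epsilon zeta",
    "T3": "alpha beta gamma delta eta",
    "T4": "alpha theta iota kappa",
}

scores = pairwise_scores(texts)
baseline = baseline_excluding(scores, ("T1", "T3"))
print(len(scores), scores[("T1", "T3")], baseline)
```

The point of the baseline is that translations of the same original are expected to overlap; only a disputed pair that stands well above the other pairs calls for explanation.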


A plagiarised dictionary
If plagiarism in student writing or in translations of literary texts raises
interesting questions, matters get even more complicated in the case of
lexicography. Surprisingly, plagiarism in dictionary-making is rarely
discussed in the literature (though see Williams 1992). Single words, with
the exception of trademarked ones, are not protected by copyright. Also,
crucially, defining the same lexical items for different dictionaries is bound
to involve similar word choices and all lexicographers consult previous
dictionaries, so the essential question that arises is the degree of creativity
possible within the editorial conventions and typographical constraints
characteristic of dictionary entries. In the case of bilingual dictionaries,
plagiarism of a significant proportion of entries cannot be detected, let alone
demonstrated – for example, the majority of the names for plants, animals,
and geographical locations have only one equivalent in the target language.
All this is inevitably exploited by some dictionary-makers. Burchfield
(1984) discusses the case of the Australian Macquarie Dictionary (1981),
which he found to be based on the Hamlyn Encyclopedic World Dictionary
(1971), which in turn he traced back to The American College Dictionary
(1947). He evaluated the amount of material shared by all three dictionaries
at about 93 per cent and commented that “the exact wording and ordering
of senses has been carried over, and deemed appropriate, from an American
dictionary of 1947 to a British one of 1971 and then to an Australian one
of 1981” (1984:153).

Asylum seekers
As Eades (2010) describes, many governments use language analysis as an
aid in the determination of the origin and therefore the genuineness of
asylum seekers. Forensic linguists in several countries, concerned about the
unreliability of some of the companies involved, tried to alert governments
to problems in the methodology, and in 2004 nineteen linguists and
phoneticians produced a set of ‘Guidelines for the Use of Language Analysis
in Relation to Questions of National Origin in Refugee Cases’
(<http://www.lagb.org.uk/language-origin-refugees.pdf>). However, problems
continue. Some companies are not following the guidelines, and forensic
phonetician Helen Fraser suggests that what is needed is a major research
program which would
investigate people’s actual abilities in recognising, discriminating and
identifying accents under various sociolinguistic conditions; (b) collaboration
between LADO [Language Analysis for Determining Origin] agencies and
linguists to develop analysis and testing procedures; and (c) a system of
accreditation by an independent, international authority for the agencies
that carry out LADO (2009).
Obviously such a program could easily be expanded to include linguistic
issues as well, and the innovative work reported above on detecting first
languages would be very relevant.

Concluding remarks
In the space available I have tried to give a flavour of the broad range of
work undertaken by Forensic Linguists. So far there is comparatively little
work in France and on the French language. I hope this special issue of the
journal will serve to generate more interest in the area in the short term and
in the longer term a significant body of research findings.

Bibliography

Ainsworth J. (2010), “Miranda Rights: Curtailing coercion in police
interrogation: the failed promise of Miranda v. Arizona”, in Coulthard M.
and Johnson A. (eds), p.111-25.
Aldridge M. (2010), “Vulnerable witnesses in the Criminal Justice System”,
in Coulthard M. and Johnson A. (eds), p.296-314.
Andrade D. (2010), O Uso de Referenctes Pessoal e de lugar e o Uso de
Formulações em Interrogatórios na Corte, unpublished master’s dissertation,
UNISINOS, São Leopoldo, Rio Grande do Sul, Brazil.
Berk-Seligson S. (2002), The Bilingual Courtroom: court interpreters in the
judicial process, Chicago, University of Chicago Press.
Burchfield R. (1992), Unlocking the English Language, New York, Hill and
Wang.
Cotterill J. (2000), “Reading the rights: a cautionary tale of comprehension
and comprehensibility”, Forensic Linguistics, 7, p.4-25.
— (2003), Language and Power in Court: A Linguistic Analysis of the OJ
Simpson Trial, Basingstoke/New York, Palgrave Macmillan.
Coulthard M., Grant T. & Kredens K. (in press), “Forensic Linguistics”, in
Wodak R., Johnstone B. & Kerswill P. (eds), Sage Handbook of
Sociolinguistics, London, Sage.
Coulthard M. & Johnson A. (2007), An Introduction to Forensic Linguistics:
Language in Evidence, London/New York, Routledge.
Coulthard M. & Johnson A. (eds) (2010), The Routledge Handbook of Forensic
Linguistics, London, Routledge.
Coulthard M., Johnson A., Kredens K. & Woolls D. (2010), “Plagiarism”,
in Coulthard M. & Johnson A. (eds), p.523-538.
Dumas B. (2000), “US pattern jury instructions: problems and proposals”,
Forensic Linguistics, 7, 1, p.49-71.
Eades D. (2008), Courtroom Talk and Neocolonial Control, Berlin/New
York, Mouton de Gruyter.
— (2010), “Language analysis and asylum cases”, in Coulthard, M. and
Johnson, A. (eds), p.311-322.
Foster D. (2001), Author Unknown: on the Trail of Anonymous, London,
Macmillan.
Fraser H. (2009), “The role of ‘educated native speakers’ in providing
language analysis for the determination of the origin of asylum seekers”,
International Journal of Speech, Language and the Law, 16, 1, p.113–138.
Gibbons J. (2001), “Revising the language of New South Wales police
procedures: applied linguistics in action”, Applied Linguistics, 22, 4,
p.439–469.
— (2003), Forensic Linguistics: An Introduction to Language in the Justice
System, Oxford, Blackwell.
Hale S. (2010), “Court interpreting: the need to raise the bar”, in Coulthard M.
& Johnson A. (eds), p.440-454.
Haworth K. (2010), “Police interviews as evidence”, in Coulthard M. &
Johnson A. (eds), p.169-181.
Heffer C. (2005), The Language of Jury Trial: A Corpus-Aided Analysis of
Legal-Lay Discourse, Basingstoke/New York, Palgrave Macmillan.
Heydon G. (2005), The Language of Police Interviewing: A Critical Analysis,
Basingstoke, Palgrave Macmillan.
Hoey M. (2005), Lexical priming: A new theory of words and language,
London, Routledge.
Hollien H. (2002), Forensic Voice Identification, London, Academic Press.
Kredens K. & Morris R. (2010), “‘A shattered mirror?’ Interpreting in
legal contexts outside the courtroom”, in Coulthard M. & Johnson A.
(eds), p.455-472.
Olsson J. (2008), Forensic Linguistics, (2nd ed.), New York, Continuum.
Pecorari D. (2008), Academic Writing and Plagiarism, London, Continuum.
Rock F. (2007), Communicating Rights: The Language of Arrest and
Detention, Basingstoke, London, Palgrave Macmillan.
Rose P. (2002), Forensic Speaker Identification, London, Taylor and Francis.
Solan L. & Tiersma P. (in preparation), A Handbook of Language and the
Law, Oxford, Oxford University Press.
Shuy R. (1993), Language Crimes: the Use and Abuse of Language Evidence
in the Courtroom, Cambridge MA, Blackwell.
Shuy R. (1997), “Ten unanswered language questions about Miranda”,
Forensic Linguistics, 4, 2, p.175–195.
— (1998), The Language of Confession, Interrogation and Deception,
London, Sage.
— (2002), Linguistic Battles in Trademark Disputes, Basingstoke/New
York, Palgrave Macmillan.
— (2005), Creating Language Crimes, New York, Oxford University
Press.
— (2006), Linguistics in the Courtroom: A Practical Guide, Oxford,
Oxford University Press.
— (2008), Fighting over Words: Language and Civil Law Cases, Oxford,
Oxford University Press.
Solan L. (2010), “The expert linguist meets the adversarial system”, in
Coulthard M. & Johnson A. (eds), p.395-407.
Stygall G. (2010), “Complex documents/average and not-so-average
readers”, in Coulthard M. & Johnson A. (eds), p.51-64.
Svartvik J. (1968), The Evans Statements: A Case for Forensic Linguistics,
Göteborg, University of Gothenburg Press.
Tiersma P. (1999), Legal Language, Chicago, University of Chicago Press.
Turell T. (2004), “Textual kidnapping revisited: the case of plagiarism in
literary translation”, International Journal of Speech, Language and the
Law, 11, 1, p.1-26.
Williams J. (1992), “The question of plagiarism and breach of copyright in
the dictionary-making process (with particular reference to the UK)”,
in Tommola H., Varantola K., Salmi-Tolonen T. & Schopp J. (eds.),
Euralex ‘92 Proceedings, Tampere, Department of Translation Studies,
University of Tampere, p.561–570.
Woolls D. (2002), Copycatch Gold ; for more details see www.copycatch-
gold.com.
— (2003), “Better Tools for the Trade and how to Use them”, Internatio-
nal Journal of Speech, Language and Law, 10, 1, p.102-112.
Woolls D. & Coulthard M. (1998), “Tools for the Trade”, Forensic
Linguistics: The International Journal of Speech, Language and Law,
5, 1, p.33-57.
