
Language Studies

Language Studies:
Stretching the Boundaries

Edited by

Andrew Littlejohn and Sandhya Rao Mehta

Language Studies: Stretching the Boundaries,


Edited by Andrew Littlejohn and Sandhya Rao Mehta
This book first published 2012
Cambridge Scholars Publishing
12 Back Chapman Street, Newcastle upon Tyne, NE6 2XX, UK

British Library Cataloguing in Publication Data


A catalogue record for this book is available from the British Library

Copyright 2012 by Andrew Littlejohn and Sandhya Rao Mehta and contributors
All rights for this book reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or
otherwise, without the prior permission of the copyright owner.
ISBN (10): 1-4438-3972-8, ISBN (13): 978-1-4438-3972-3

TABLE OF CONTENTS

List of Pictures
List of Figures
List of Tables
Introduction
Andrew Littlejohn
Section I: Concepts Considered
Chapter One
Who is Stretching Whose Boundaries? English Language Studies
in the New Millennium
Sandhya Rao Mehta
Chapter Two
Language and Group Identity: Some Social Psychological Considerations
Itesh Sachdev
Chapter Three
Procedures for Translating Culturally Specific Items
James Dickins
Chapter Four
Proverb Translation: Fluency or Hegemony? An Argument for Semantic
Translation
Abdul Gabbar Al-Sharafi
Chapter Five
Dialogue Systems: Stretching the Boundaries of Pragmatics and Discourse
Analysis
Radhika Mamidi
Chapter Six
The Role of Forensic Linguistics in Crime Investigation
Anna Danielewicz-Betz
Chapter Seven
University English Studies in Multilingual Contexts:
What are the Prospects?
James A. Moody
Section II: Languages Considered
Chapter Eight
How English Grammar has been Changing
Geoffrey Leech
Chapter Nine
Digging for New Meanings: Uncovering a Postcolonial Beowulf
Jonathan Wilcox
Chapter Ten
"These words are not mine. No, nor mine now."
Poetic Language Relocated
Sixta Quassdorf
Chapter Eleven
Stretching the Boundaries of English: Translation and Degrees of
Incorporation of Anglicisms
Paola Gaudio
Chapter Twelve
The Arab Body Metaphor in Contemporary Arabic Discourse:
An Exploratory Study
Abdullah al Harrasi
Chapter Thirteen
Students as Authors: Textual Intervention in Children's Literature
Rosalind Buckton-Tucker
Contributors
Index

CHAPTER SIX
THE ROLE OF FORENSIC LINGUISTICS
IN CRIME INVESTIGATION
ANNA DANIELEWICZ-BETZ

Abstract
This paper considers the extent to which forensic linguistics can be
considered a science, and outlines some ways in which it is useful in legal
proceedings, including voice identification, the interpretation of police-suspect
interaction, verification of police reports (including the illegal
practice of verballing) and cross-cultural insights into speech patterns in a
courtroom context. The paper provides a closer examination of one
particular area, that of authorship attribution, particularly in SMS
messages, and concludes by raising some ongoing controversies in forensic
linguistics and by discussing future prospects.

Keywords: forensic linguistics, authorship detection, authorship attribution,
voice identification, forensic text types

Forensic Linguistics: An Introduction


Forensic linguistics, as an emerging sub-discipline of forensic science,
is an interdisciplinary field of applied/descriptive linguistics which
comprises the study, analysis and measurement of language in the context
of crime, judicial procedures or disputes in law. The interface between
language, crime and the law can be detected, for instance, in the analysis
of courtroom discourse, courtroom interpreting and translating, the
readability/comprehensibility of legal documents, the comprehensibility of
the police caution issued to suspects, and authorship attribution.
Although it is, at present, far from being as accurate as DNA testing,
forensic linguistics uses the expertise of descriptive and applied linguists
in the unravelling of legal puzzles, so to say. Informed use of forensic
linguistics requires familiarity with the broader application of linguistics
as a social science, including phonetics and phonology, morphology,
syntax, and semantics, discourse analysis, pragmatics, psycholinguistics,
neurolinguistics, sociolinguistics, dialectology, computational linguistics,
and corpus linguistics.
The forensic linguist applies linguistic knowledge and techniques to
the language implicated in legal cases or proceedings and private disputes
between parties which may result in legal action. In this paper, I wish to
first consider the extent to which forensic linguistics can be considered a
science. I will then provide an overview of some of the areas in which
forensic linguistics has a significant role to play (including voice
identification, interpretation of police-suspect interaction, verification of
police reports and cross-cultural insights into speech patterns) before
turning to a closer examination of one particular area, that of authorship
attribution. I will conclude by raising some ongoing controversies in
forensic linguistics and discussing future prospects.

Is This A Science?
The primary difference between forensic and non-forensic methods in
linguistics is the scientific approach. In forensic linguistics, the scientific
method requires hypothesis testing and a litigation-independent testing of
the method for its accuracy. These tests are performed with robust controls
regarding data quantity, data sources, and analytical objectivity.
Restrictions in applying linguistic expertise in the context of law are
due to varying degrees of acceptability in the courtroom, varying degrees
of reliability related to shortcomings such as the brevity of documents,
small data samples, general characteristics of language (for example,
generic language features of suspects), and the intrinsic nature of language
as something in constant change. The quality of evidence from this
emerging field also depends considerably on the experience and
knowledge of individual linguists involved in a given case. Courts in many
countries admit forensic evidence but have differing criteria. In the United
States, for example, the so-called Daubert standard, a rule of evidence
regarding the admissibility of expert witnesses' testimony in federal legal
proceedings, states that evidence based on innovative or unusual scientific
knowledge may only be admitted after it has been established that it is
reliable and scientifically valid. The Daubert test is based on peer review,
error rates, testing, and acceptability in the relevant scientific community.

Is there a linguistic equivalent of an individual fingerprint, a
"linguistic fingerprint"? This is indeed an attractive notion, which would
certainly give forensic linguistics a more secure status as a science.
However, although it is often claimed that each human being uses
language differently and that this difference can be observed as easily and
as surely as a fingerprint, it is, in reality, impossible to compile a
collection of markers which would stamp a particular speaker/writer as
unique. For the present, therefore, the notion of linguistic fingerprint
appears essentially flawed and there is little hard evidence to support it.
Accordingly, it is better to focus on the distinctive style of a given person,
as detected in a set of known and suspected texts within an inquiry. This is
something which I will take up further in my section on authorship
attribution. Before doing this, however, it is useful to see some of the ways
in which forensic linguistics can be of use.

Forensic Linguistics: Some Areas of Application


Forensic Phonetics
Phonetic techniques are primarily used in the analysis of the voice as
applied in criminal investigation. This comprises technical voice
comparisons, lay voice recognition, transcription of spoken language,
speech signal enhancement, and the authentication of recordings. Forensic
phoneticians conduct speaker identifications, resolve disputed content
recordings, and transcribe spoken texts. They are also involved in the
setting up of so-called voice line-ups or parades in which not eye- but
ear-witnesses are asked to take part in order to identify a suspect. The typical
questions asked in this context are: Was the anonymous caller the same
person as the known speaker? Are the two samples from the same
dialect/accent? Is the pronunciation of phonemes similar across the known
and questioned voices?
The fundamental problem with voice line-ups, however, is that our memory
for voices generally fades rapidly in comparison to our memory for faces,
even though, in a threatening situation, we may be capable of storing more
features. Voice identification, therefore, needs
to be conducted without delay and treated with extreme caution.
For the forensic record, spoken texts (be they interviews, oral statements,
or interrogations) have to be transcribed into written form, which often
causes problems, as some information might go missing or there may be
inaccurate relay of the nuances of the oral text (partly due to lack of
contextual information and paralinguistic features). In addition, written
discourse differs considerably in mode of expression from spoken
discourse which is strongly context-dependent, as discussed below.

Language in Authority and Power Relations


In the United States, the Supreme Court in Miranda v. Arizona (1966)
set down the requirement that, prior to the arrest or interrogation of a
suspect in a crime, that person must be told that they have the right to
remain silent, the right to legal counsel, and the right to be told that
anything they say can be used in court against them. Instances of the
application of this requirement serve well as an illustration of how speech
acts performed by police officers may lead to the apparent consensual
nature of searches, how questioning can be interpreted as coercive, and
how the relationship between authority figures and a suspect/defendant is
asymmetric. Consider the following examples, discussed in Solan and
Tiersma (2005, pp. 35ff) which on the semantic level cannot be interpreted
as directives, yet pragmatically speaking, given the authoritarian context,
appear precisely as that:
Does the trunk open?
You don't mind if we look in your trunk, do you?
Why don't you put your hands behind your back, all right?

The level of coerciveness increases in requests such as:


Would you mind if I took a look around here?
Well, then, you don't mind if I look around in the car, do you, or would
you?

The police usually lack the authority to make promises such as "We'll
go easy on you if you confess", yet this is implied in their requests to
comply. The problem is, as Solan and Tiersma (2005, p. 38) point out, that
people who are stopped by the police tend to interpret ostensible requests
as commands or orders, yet, in contrast, their own indirect wishes to get a
lawyer often go unnoticed (for example, "Maybe I should talk to a
lawyer"). This problem is further exacerbated due to problems related to
the comprehensibility of the Miranda warning and other police language
for many suspects, including defendants who may be (semi-)illiterate,
speakers of another language, or too young or mentally-challenged to
understand their rights to remain silent and seek legal advice.
In any case, the asymmetric nature of the relationship between
authority figures (the police) and the defendant (who may be disadvantaged
in some way) can result in a text (such as a record of interview, video or
audio recording or written statement) which is considerably at variance
with what the suspect would have said had he/she been given the
opportunity to make a statement in a non-coercive or less threatening
environment. This leads to the conclusion that despite the necessity of
strong contextual reliance in the interpretation of speech acts, courts may
habitually use out-of-context inferences and entailments to reach
decisions.

Discrepancies in Police Reports


When establishing the accuracy of police reports and alleged suspect
statements one has to consider the relationship between the documents
exhibited and the events they purport to describe. What is the time frame?
When were the incident notes taken? Is there a chronology and accuracy in
recalling the events? Too many common features between the statement
and the incident notes, coupled with chronological inconsistency and
frequent use of characteristically written rather than spoken discourse, may
raise suspicion as to the authenticity of the police record of an interview or a
statement. For this reason, videotaping, recommended by Solan and
Tiersma (2005), has been the law for many years in the UK and Australia,
yet in the US it is required in only a few states.
Police officers typically use so-called "police speak", which is
relatively easy to detect. It is characterised by efficient and compact set
phrases, dense wording in an impersonal, official style, with precise
renditions of time, place and sequence, as well as precise descriptions of
objects, such as weapons. A very revealing expression, otherwise
uncommonly used, is "I then" + verb, as in "I then threw the weapon into the
river". The alteration by the police of a defendant's utterances, such that
they include damaging remarks, is referred to as "verballing". This illegal
practice may be done, for instance, in order to match a defendant to a
certain racial profile. Racial profiling refers to the use of an individual's
race or ethnicity by law enforcement personnel as a key factor in deciding
whether to engage in enforcement, e.g., make a traffic stop or arrest. (For
further, detailed discussion of the language of interrogation and
statements, see Olsson (2009, pp. 100ff).)

Cross-Cultural and Cross-Linguistic Differences in Testimony


Linguists, and sociolinguists in particular, study differences in varieties
and dialects within a given language, and across cultures and languages.
Unfortunately, this cross-cultural linguistic research may not be taken into
account by law enforcement authorities passing crucial judgements related
to someone's guilt or innocence. In this relation, Eades (2008), for
example, examines the social consequences of courtroom talk through
detailed investigation of the cross-examination of three Australian
Aboriginal boys in the case against six police officers charged with their
abduction. In her study of Australian courtroom discourse, she discovered
that yes/no questions are not considered coercive in Australian Aboriginal
interactions, but rather are understood as an invitation to explain or
elaborate. Further, the difference in cultural meaning attached to silence
can also impact judgements in the courtroom: whereas silences longer than
a few seconds are hardly tolerated in Western English-speaking societies,
Eades' courtroom data report Aboriginal silences of up to 23
seconds.
Tag questions can also be a source of misunderstanding in testimony to
be interpreted. Whereas negative tag questions in English require a
negative answer to deny an accusation (e.g., "You took the money, didn't
you?" "No, I didn't."), tag questions in many other languages, including
Spanish and some Asian languages, can be answered either negatively or
affirmatively with little or no alteration in meaning. Another example
refers to the incorrect interpretation of auxiliaries in the testimony of Rosa
Lopez during the highly publicised trial of O. J. Simpson (a former American
football star and sports announcer, accused of the murder of his ex-wife
and her friend). The interpretation contributed to a more coercive-sounding
cross-examination in Spanish than in the original English. As
one can appreciate, ultimately, a person may be deemed guilty due to
cross-cultural differences in utterance interpretation as expressed, for
example, in syntax, prosody or even non-verbal signals involved in
producing a statement.
In the next part of the paper, I would like to focus on one area where
forensic linguistics is particularly relevant: that of authorship attribution.

Authorship Attribution
Authorship attribution is the science of inferring characteristics of the
author from the characteristics of documents produced by that author. The
key task is to establish who said or wrote something which is to be used as
evidence. Attribution is facilitated by measuring average word length,
average number of syllables per word, article/determiner frequency, and
type-token ratio (a measure of lexical variety). Furthermore, punctuation,
in terms of overall density and syntactic boundaries, and the measurement of
unique words in a text contribute to solving the task. Both Chaski (1997,
2001) and Kredens (2000) stress the importance of taking the relative
frequency of various syntactic markers into consideration. Generally
speaking, it is easier to eliminate someone as the author than pinpoint
someone with certainty.
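
As a rough illustration of the measures just listed, the following Python sketch computes average word length, type-token ratio, article frequency, the number of words used only once, and punctuation density for a single text. The function name, the tokenisation rule and the article list are assumptions made for this example only; they are not drawn from Chaski or Kredens, and syllable counting is omitted.

```python
import re
import string
from collections import Counter

ARTICLES = {"a", "an", "the"}  # assumed article list for English


def stylometric_features(text):
    """Return a few simple stylometric measures for one text (illustrative only)."""
    # Assumed tokenisation: runs of letters, digits and apostrophes, lowercased.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        raise ValueError("text contains no word tokens")
    counts = Counter(words)
    n_tokens = len(words)
    return {
        # Mean number of characters per word token.
        "avg_word_length": sum(len(w) for w in words) / n_tokens,
        # Distinct word forms divided by total word tokens (lexical variety).
        "type_token_ratio": len(counts) / n_tokens,
        # Share of tokens that are articles.
        "article_frequency": sum(counts[a] for a in ARTICLES) / n_tokens,
        # Words that occur exactly once in the text.
        "unique_word_count": sum(1 for c in counts.values() if c == 1),
        # Punctuation characters per character of text.
        "punctuation_density": sum(ch in string.punctuation for ch in text) / len(text),
    }
```

In practice such values would be compared across known and questioned texts; a single value on its own says little about authorship.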

Forensic Text Types


A forensic text is any kind of text, a written document or an audio or
video recording, which is the subject of police investigation or of criminal
procedure. The investigative linguist may be called upon to analyse a
variety of documents. The text types may include emergency calls, ransom
demands and other threats, such as hate mail, aimed at victimising others.
In this case the genuine or false nature of the call has to be determined to
detect or eliminate a hoax, for example. The same differentiation applies
to suicide notes or letters. Last statements, on the other hand, may throw
some light on the guilt or innocence of a convicted person, if a death row
inmate decides to utter their last words:
Well, I don't have anything to say. I am just sorry about what I did to Mr.
Peters. That's all.

Death row statements may either confirm (explicitly or implicitly) the
commission of a crime, or deny it, leaving an impression of innocence
behind. They may also denounce witnesses as dishonest or criticise law
enforcement as corrupt.

Text Message Analysis


Text messages (or SMS) may be analysed for authorship attribution in
cases of crimes where, for example, the perpetrator is suspected of sending
text messages from the victim's phone, purporting to be written by the
victim. In this case, the forensic linguist attempts to determine the
consistently used stylistic features. Statistical analysis of a specialised
language database of thousands of text messages from a corpus sampler
may facilitate such analysis. The key question here is how to determine the
point at which a style change within the texts became evident (the so-called
cut-off point). This has to be accompanied by compilation of a
sociolinguistic profile of the purported author in terms of gender, age,
origin, as well as social, educational, and professional background. It is
also important to link the messages at hand by means of cohesive and
coherent devices to specify the order in which they were sent. Consistent
or inconsistent dialectal features may include, for example, the use of
pronouns (my/myself v me/meself). Crucial stylistic features include
formation of clusters of words (e.g., want2go) and their average length
and character (phrases/clauses v single words). Length of texts and word
length average, punctuation, spacing, etc. play an important role as well.
One should also consider individual words and phrases that can be
written in more than one way (e.g., "av", "hav" and "ave" for "have"), as
well as alternative lexical choices: morphological, alphanumeric, letter
replacive, orthographic (homophonic and punctuation-related, lower/upper
case), or orthographic/phonic reduction, as in:
4u2 fone

gr8t!

r u goin?

However, one should bear in mind that a person's style of writing or
texting is not always consistent and it may change, for example, due to
changes in life circumstances, the text type, or addressee relationship.
Moreover, a language feature which occurs in a small sample cannot be
treated as a constant for variation in larger samples. In addition, mobile
phone texts sometimes use mixed styles (cf. Olsson 2009: 57ff). On
numerous occasions thorough linguistic analysis of the SMS messages
sent from a victim's phone has led to the capture of the perpetrator due to
certain idiosyncratic features, such as spacing, non-contraction of positive
verbs, using "I'm"/"Im", or owing to inconsistencies in texting styles (e.g.,
"cu" vs "cya", "my" vs "me", "I'm not" vs "ain't"). (See Amos, 2008, for
an interesting account in this regard.)
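
To make the idea of consistently used variants more concrete, the sketch below counts which spelling of a given form each set of messages prefers. The variant table, the function names and the tokenisation rule are hypothetical choices for illustration; a real analysis would draw its candidate variants from the messages under investigation and from a reference corpus.

```python
import re
from collections import Counter

# Hypothetical variant table: each standard form maps to spellings a texter might use.
VARIANTS = {
    "have": {"have", "hav", "av", "ave"},
    "see you": {"cu", "cya"},
}


def tokenize(message):
    """Lowercase word-like tokens; apostrophes are kept so "i'm" and "im" differ."""
    return re.findall(r"[a-z0-9']+", message.lower())


def variant_profile(messages):
    """Count, for each standard form, how often each variant spelling occurs."""
    counts = {form: Counter() for form in VARIANTS}
    for message in messages:
        for token in tokenize(message):
            for form, spellings in VARIANTS.items():
                if token in spellings:
                    counts[form][token] += 1
    return counts
```

Comparing the profile of messages known to come from the victim with the profile of the disputed messages may point to a change of texter, although, as noted above, a feature observed in a small sample cannot be treated as a constant.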

Variation in Author Texts


There are two types of author variation: within and across texts. The
former, so-called intra-author variation, refers to the ways in which one
author's texts differ from each other. This may include variation in
vocabulary, depending on genre, text type, fiction v non-fiction, private v
public texts. However, one has to take into consideration such factors as
time lapse between two communications, possible disguise, change in
personal circumstances (e.g., language of trauma), cultural changes that
may influence, for example, the texting language, etc. Moreover, all
authors exhibit variation in genre, text type, and the like, and that variation
in short texts can be extreme. Inter-author variation deals with the ways in
which different authors vary from each other due to widely different social
backgrounds, levels of education, geographical origin, different types and
levels of occupation/profession, and so on. There exists also the short text
stability problem: in short text analysis we usually find high intra-author
variation and low inter-author variation if the texts are of the same type.

Scientific Methods of Authorship Detection


Authorship methods which focus on linguistic characteristics currently
have accuracy rates ranging from 72% to 95%, within the computational
paradigm. Chaski (2005) presents a computational, stylometric method
which has obtained 95% accuracy and has been successfully used in
investigating and adjudicating several crimes involving digital evidence.
Computer crime investigations, where it is crucial to determine who
actually pressed the key on the keyboard, range from homicide to identity
theft and many types of financial crimes.
Evidence in these cases can be collected using several methods, such
as biometric analysis of the computer user, qualitative analysis of any
idiosyncrasies in the language in questioned and known documents, or
quantitative, computational stylometric analysis. Naturally, the higher the
rate of accuracy, the better, but questions related to the likelihood of the
contested documents belonging to another suspect have to be answered as
well.
Chaski and Chemylinski (2005a) have developed a method for
decomposing the data into smaller chunks so that a larger set of variables
can be used for the discriminating analysis. Chaski and Chemylinski
(2005b) also obtained similar results using these variables with logistic
regression, which is part of a category of statistical models called generalized
linear models. Logistic regression allows one to predict a discrete
outcome, such as group membership, from a set of variables that may be
continuous, discrete, dichotomous, or a mix of any of these.
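
The general idea of predicting group membership from mixed variables can be sketched in a few lines of Python. This is a minimal, generic logistic regression fitted by gradient descent. It is not the procedure used by Chaski and Chemylinski, and the two features (mean word length, a continuous variable, and the presence of digit substitutions such as "gr8", a dichotomous one) as well as all data values are invented for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit intercept and weights by batch gradient descent on the log-loss."""
    # Centre each feature on its mean so that gradient descent behaves well.
    means = [sum(col) / len(rows) for col in zip(*rows)]
    centred = [[v - m for v, m in zip(row, means)] for row in rows]
    weights = [0.0] * (len(means) + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(centred, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            grad[0] += p - y
            for j, v in enumerate(x):
                grad[j + 1] += (p - y) * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return means, weights


def predict_proba(model, x):
    """Estimated probability that a text with features x belongs to group 1."""
    means, weights = model
    z = weights[0] + sum(w * (v - m) for w, v, m in zip(weights[1:], x, means))
    return sigmoid(z)


# Invented training data: [mean word length, uses digit substitution (0/1)].
ROWS = [[3.1, 1], [3.4, 1], [3.0, 1], [3.6, 0],   # texts by hypothetical author 0
        [4.8, 0], [5.1, 0], [4.6, 0], [4.4, 1]]   # texts by hypothetical author 1
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]
MODEL = fit_logistic(ROWS, LABELS)
```

Calling `predict_proba(MODEL, [3.2, 1])` then gives the estimated probability that a questioned text with those feature values belongs to the second author. With so few invented observations the figure only shows the mechanics and has no evidential meaning.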
Stamatatos (2009), on the other hand, presents recent advances in
automated approaches to attributing authorship, examining their
characteristics for both text representation and text classification. The
focus is placed on computational requirements and settings rather than on
linguistic or literary issues. He also discusses evaluation methodologies
and criteria for authorship attribution studies.
An important question is how to discriminate between the three basic
factors: authorship, genre, and topic. Are there specific stylometric
features that can capture only stylistic, and specifically authorial,
information? The application of stylometric features to topic-identification
tasks has revealed the potential of these features to indicate content
information as well (cf. Clement and Sharp 2003; Mikros and Argiri,
2007). It seems that low-level features like character N-grams
(subsequences of n items from a given sequence, for example, phonemes,
syllables, letters, or words) can successfully be applied in stylistic text
analysis (cf. Keselj et al. 2003; Stamatatos 2006; Grieve 2007). A crucial
need is, however, to increase the available benchmark corpora so that they
cover many natural languages and text domains. It is also very important
for the evaluation corpora to offer control over genre, topic and
demographic criteria.

SMS Authorship Attribution


In the face of the increasing amount of digital evidence available on
cellular phones and, consequently, the necessity to detect SMS (text)
authors in criminal prosecution cases, Mohan, Baggili and Rogers (2010)
propose an N-grams based approach for determining the authorship of text
messages. The method shows encouraging results in identification of
authors. Tokens are generated by moving a sliding window across a corpus
of text; the size of the window corresponds to the size of the token (N),
and the window is displaced in stages, each stage corresponding to either a
word or a character.
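
The sliding-window idea can be sketched as follows. The code generates character and word N-grams and then compares a questioned message with each candidate author's known messages. The comparison step (cosine similarity between N-gram counts) is an assumption chosen for simplicity and should not be read as the classifier used by Mohan, Baggili and Rogers; the function names and sample messages are likewise invented.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Slide a window of n characters across the text, one character per step."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Slide a window of n words across the text, one word per step."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def profile(messages, n=3):
    """Count character N-grams over a set of messages."""
    counts = Counter()
    for message in messages:
        counts.update(char_ngrams(message, n))
    return counts


def cosine(a, b):
    """Cosine similarity between two N-gram count profiles."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def attribute(questioned, known, n=3):
    """Return the best-matching author and all similarity scores."""
    q = profile([questioned], n)
    scores = {author: cosine(q, profile(msgs, n)) for author, msgs in known.items()}
    return max(scores, key=scores.get), scores
```

For example, with known messages `{"A": ["r u goin 2nite", "c u l8r", "gr8 c u then"], "B": ["Are you going tonight?", "See you later.", "Great, see you then."]}`, the questioned message `"r u comin l8r"` shares several character trigrams with author A's messages and none with author B's.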
Since SMS messages are normally very brief and lack many syntactic
features, in the forensic analysis of these messages there is a need for high
processing speed because, frequently, someone's life may be at stake. An
N-gram approach for an SMS corpus seems to find application under such
conditions and is said to predict the author with an accuracy of 65-72%
when the samples of SMS messages are small and the number of possible
authors is comparatively large.

Forensic Linguistic Controversies


In the final section of this paper, I will turn to some of the
controversies remaining for forensic linguistics, and the future prospects
for the science.

Speaker Identification
One of the controversies discussed in, for example, Hollien (2001), is
the disagreement in the so-called scientific community on the degree of
accuracy with which examiners can identify speakers under all conditions.
Surprisingly, many suspects will voluntarily give a sample of their voice
for comparison purposes. Vocal disguises, however, can be very difficult
for the examiner to deal with and the probability of determination is lower
than with normal voice samples. To prevent problems, investigators need
to request that the court order specify in detail that the suspect give a
sample of his or her voice, repeating the phrases of the questioned call, in
a natural conversational voice (or in a similar disguise, if that is the case)
and that such sample shall be given at least three times and to the
reasonable satisfaction of the investigator. Voice specimens obtained with
such specific instructions are usually very satisfactory for comparison
purposes. There is presently, however, no universal standard for the
number of words required for identification. It varies from a minimum
of 10 for some agencies to 20 for others.
According to Hollien (ibid.), spectrographic voice identification
assumes that intra-speaker variability (as discussed above) is discernible
from inter-speaker variability (differences in the same utterance by
different speakers); however, that assumption is not adequately supported
by scientific theory and data. Viewpoints on actual error rates are presently
based only on various professional judgements and fragmentary
experimental results rather than on objective data representative of
results in forensic applications.

Testimony
Controversies also arise in relation to witness/police testimony. All the
cases of second-hand verbal (apparently verbatim) material (cf. "I don't
know exactly what he said, but I know he said he did it" in Solan and
Tiersma, 2005: 98) can be considered unreliable since, as discussed below,
human memory is incapable of retaining the exact wording even after a
couple of seconds, not to speak of months or years. Moreover, reproduced
utterances may be presented in isolation, lacking the original paralinguistic
and situational (pragmatic) context. There also remains a great deal of
research to be done to increase our insight into the effect of estimator
variables on speaker identification by ear witnesses. For the time being,
such identification should be treated with considerable caution.
Scientific criteria for court admissibility of testimony still pose a
problem as they differ from country to country and from state to state (as
in the case of the US). Required qualifications of examiners and presenters
of forensic linguistic material (so-called forensic experts) have not yet
been clearly specified, either.


Impressionistic Likelihood and Veracity of Statements


As already mentioned above, one may question the admissibility of
witnesses' oral evidence and statements, as well as judges' decisions based
on impressionistic linguistic witness evidence (e.g., reliability of memory,
statements deprived of context and pragmatic implications, etc.).
Veracity refers to truthfulness of a spoken or written testimony. When
defendants feel challenged in this respect, they may suddenly become
conscious of their pronunciation (or hyper-correct, in sociolinguistic
terms). Despite the fact that some witnesses claim that they can remember
exact words of a defendant months or even years later, it is doubtful if this
is ever accurate. This seems even less likely, when more than one person,
for example, a number of police officers, quote a suspect verbatim after a
considerable time lapse. Hence a question arises: how long, in reality, can
one remember what someone else has said, word for word? As
Clifford and Scott (1978) state, the upper limit for short-term memory is
7-9 items, beyond which meaning may be retained but not the actual
wording. Moreover, an average recall level is about 30-40% already after a
few seconds. In addition, the usage of generic language or an incongruous
register when a specific register is normally used leads the forensic
linguist to raise doubts about the genuineness of a given statement.

Can Forensic Linguistics Establish Guilt or Innocence?


By meeting scientific forensic criteria and presenting convincing
linguistic evidence in court, forensic linguists can certainly contribute to
establishing someone's innocence. They can also prompt an admission of
guilt. Forensic linguists may be asked to examine recorded police
interrogations to determine whether a person knowingly admitted guilt, was
interrogated fairly, and understood the conversation conducted throughout
the interview. Since recorded interviews can be admitted in court as
evidence, dialogue analysis may be carried out to (dis)prove guilt and to
identify inconsistencies in the interviewing process, which may render the
recordings inadmissible in court. The defence can therefore show that the
recorded language does not necessarily indicate the defendant's guilt.

What Is a Reliable Sample?


Author identification is a very interesting and potentially useful area in
determining guilt, but it is restricted by the fact that documents in a
forensic setting (ransom notes, blackmail letters, etc.) are usually much
too short to allow a reliable identification. Moreover, which linguistic
features are reliable indicators of authorship, and how reliable those
features are, remains to be discovered. As Tiersma (ibid.) points out,
research is ongoing, and the availability of large corpora of speech and
writing samples suggests that the field may advance in the future (although
the typically small size of the documents in most criminal cases will
always be a problem).
It is therefore crucial for attribution methods to be robust and
applicable to a limited number of short texts. However, several important
questions remain open in relation to authorship attribution, the most
important being the required text length. Although various studies have
reported promising results with short texts (of fewer than 1,000 words;
cf. Sanderson and Guenter, 2006; Hirst and Feiguina, 2007), it has not yet
been possible to define a text-length threshold for reliable authorship
attribution.
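To make the text-length problem concrete, the following is a minimal Python sketch of one common family of attribution techniques, character n-gram profiling, in the general spirit of the n-gram approaches listed in the bibliography. It is an illustration under stated assumptions, not the method of any study cited here: the function names, the choice of trigrams and the use of cosine similarity are all hypothetical choices made for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple
import math


def char_ngram_profile(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lower-cased, whitespace-normalised text."""
    normalised = " ".join(text.lower().split())
    return Counter(normalised[i:i + n] for i in range(len(normalised) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two n-gram count vectors; 0.0 if either profile is empty."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(questioned: str, known: Dict[str, str], n: int = 3) -> List[Tuple[str, float]]:
    """Rank candidate authors by how closely their known writing matches the questioned text."""
    q_profile = char_ngram_profile(questioned, n)
    scores = [
        (author, cosine_similarity(q_profile, char_ngram_profile(sample, n)))
        for author, sample in known.items()
    ]
    # Highest similarity first; a high score is a lead, not an identification.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

With a questioned text of only a few dozen characters, the profile contains very few distinct n-grams, so a single word choice can change the ranking. This is one way to see why no reliable text-length threshold has yet been defined, and why the output of such a comparison should be read as a degree of similarity rather than as proof of authorship.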
In the final section of this paper, I want to turn to some of the future
challenges for forensic linguistics and possible ways towards scientific
legitimisation of the discipline.

Future Prospects of Forensic Linguistics


Will forensic linguistics ever become an established discipline, on a
par with scientific forensic methods of providing criminal evidence? From
the perspective of its international development, the discipline faces the
following challenges before such a status can be achieved:

• the integrated study of forensic linguistics/language and the law across
  different judicial systems and geographical boundaries;
• the development of replicable methods of analysis to be used in expert
  witness evidence in order to ensure internal and external validity in
  research;
• extensive detailing of codes of good practice and conduct;
• cooperation of the International Association of Forensic Linguistics
  (IAFL) with other associations and societies of forensic sciences;
• certification of forensic linguistics as a scientific discipline, i.e.,
  universal acceptance of linguistic evidence alongside other forensic
  evidence (e.g. as fulfilling the Daubert standard in the USA).


It seems that the future of forensic linguistics lies with corpus-driven
approaches (cf. Kniffka 2007). The forensic linguistic community also
needs to bring together relevant scholars and linguistics experts of
non-English-speaking backgrounds with those of English-speaking
backgrounds. Kniffka (ibid.) implies that English-language work on forensic
linguistics has not always been aware of work published in German, or in
other languages for that matter. Kniffka claims that forensic linguistic
work in Germany was already well advanced when it was only just
beginning in English-speaking contexts.

Summary and Conclusion


The present paper has offered a brief overview of the interdisciplinary
field of forensic linguistics and illustrated some of its applications, such as
pragmatic analysis and various scientific methods of authorship
attribution, serving the law and law enforcement. The difficult role of
linguists in court testimony is discussed by, for example, Solan and
Tiersma (2005), who state that linguistic evidence, although not always
admitted in court in the end, may be helpful to law enforcement in
investigating a crime or to lawyers preparing for trial. As a matter of
caution, one may add that testifying linguists should not state
conclusions that go beyond the evidence presented. No matter how strongly
a linguist is convinced that the defendant is innocent, he or she should
restrict the opinion offered to stating the degree of probability of, for
example, a confession having been verballed by police officers. Moreover,
forensic linguists need to stay impartial at all times, as they serve the
law in the role of experts and cannot under any circumstances side with
the defence or the prosecution.
Although linguistic expertise has frequently been compared favourably to
fingerprint or DNA evidence, the current state of the art in practices such
as voice identification and authorship attribution has not yet reached the
same level of reliability. At most, linguistic expertise helps to eliminate
a suspect as the perpetrator, but it is not in a position to identify one
with certainty (cf. Solan and Tiersma, 2005: 242). Yet advances in
technology and science, as demonstrated above, allow experts to compare
documents and voice recordings more quickly and more easily than before.
Computer assistance, such as the Federal Bureau of Investigation's
Communication Threat Assessment Database (CTAD), makes it possible to break
forensic linguistic data into numerous categories and to make rapid
assessments. These developments promise continued expansion of the role of
forensic linguistics.


Bibliography
Amos, O. 2008. The text trap. The Northern Echo. Retrieved January 5,
2012 from
http://www.thenorthernecho.co.uk/features/leader/2076811.the_text_trap/
Chaski, C.E. 2005. Empirical evaluations of language-based author
identification techniques. International Journal of Speech, Language
and the Law, 8 (1), pp. 1-65.
Chaski, C.E. 2005. Who's at the keyboard? Authorship attribution in digital
evidence investigations. International Journal of Digital Evidence 4
(1), pp. 1-13.
Chaski, C. E., and H. J. Chmelynski. 2005a (pending publication). Testing
twenty variables for author attribution by discriminant function
analysis.
Chaski, C. E., and H. J. Chmelynski. 2005b (pending publication). Testing
twenty variables for author attribution by logistic regression.
Clement, R., and D. Sharp. 2003. N-gram and Bayesian classification of
documents for topic and authorship. Literary and Linguistic
Computing, 18 (4), 423-447.
Clifford, B.R. 2009. The role of the expert witness. In G. Davies, R. Bull
and C. Hollin (eds.). Forensic Psychology. New York: Wiley.
Eades, D. 2008. Courtroom talk and neocolonial control. Berlin and New
York: Mouton de Gruyter.
Eades, D. 2000. "I don't think it's an answer to the question": Silencing
Aboriginal witnesses in court. Language in Society, 2000 (29), pp. 161-195.
Grieve, J. 2007. Quantitative authorship attribution: An evaluation of
techniques. Literary and Linguistic Computing, 22 (3), pp. 251-270.
Hirst, G. and O. Feiguina. 2007. Bigrams of syntactic labels for authorship
discrimination of short texts. Literary and Linguistic Computing, 22
(4), pp. 405-417.
Hollien, H. 2001. Forensic Voice Identification. London: Academic Press.
Keselj, V., F. Peng, N. Cercone, and C. Thomas. 2003. N-gram-based
author profiles for authorship attribution. Proceedings of the Pacific
Association for Computational Linguistics, pp. 255-264.
Kniffka, H. 2007. Working in Language and Law: A German Perspective,
Basingstoke: Palgrave Macmillan.
Kredens, K. 2000. Forensic linguistics and the status of linguistic evidence
in the legal setting. Unpublished Ph.D. dissertation. University of
Łódź.
Leech, G. 1983. Principles of Pragmatics. London: Longman.

Mikros, G. and E. Argiri. 2007. Investigating topic influence in authorship
attribution. Proceedings of the International Workshop on Plagiarism
Analysis, Authorship Identification, and Near-Duplicate Detection, pp.
29-35.
Olsson, J. 2009. Word Crime: Solving Crime through Forensic Linguistics.
New York and London: Continuum International Publishing Group.
Olsson, J. 2008. Forensic Linguistics. New York and London: Continuum
International Publishing Group.
Sanderson, C. and S. Guenter. 2006. Short text authorship attribution via
sequence kernels, Markov chains and author unmasking: An
investigation. Proceedings of the International Conference on
Empirical Methods in Natural Language Engineering, pp. 482-491.
Morristown, NJ: Association for Computational Linguistics.
Solan, L.M. and P.M. Tiersma. 2005. Speaking of Crime: The Language of
Criminal Justice. Chicago and London: The University of Chicago Press.
Stamatatos, E. 2009. A survey of modern authorship attribution methods.
Journal of the American Society for Information Science and
Technology, 60 (3), pp. 538-556.
Stamatatos, E. 2006. Ensemble-based author identification using character
n-grams. Proceedings of the 3rd International Workshop on Text-Based
Information Retrieval (TIR'06), pp. 41-46.
Svartvik, J. 1968. The Evans Statements: A Case for Forensic Linguistics.
Gothenburg Studies in English, 20.
Tiersma, P. M. 2009. What is language and law? And does anyone care?
In F. Lorz, A. and D. Stein, (eds.) Law and Language: Theory and
Society. Loyola-LA Legal Studies Paper No. 2009-11.
