Lauren Porter
Introduction
The aim of this research is to use corpus analysis to identify different linguistic
cultural representations of the concept of love, as expressed using the word "love," in
British and American literature. Using both quantitative and qualitative analysis, this
research examines the different representations of love in F. Scott Fitzgerald's
The Great Gatsby (published in 1925) and Jane Austen's Pride and Prejudice (published
in 1813). While universal ideas and realities (such as the abstract representation and
actual physical expression of love) exist across all cultures, cultures express these
universal ideas in different ways. That said, this research aims to answer the question,
linguistically represent love differently? If so, how? In more general terms, this study
aims to elicit linguistic cultural differences that exist with shared abstract ideas, as
These two texts were chosen for corpus analysis because both are literary classics
that are representative of the time period in which they were written and are
still widely read today. Additionally, both texts share the themes of
love, class, and courtship, which allows for thematic overlap between the two texts. In
order to understand the results of the corpus analysis, and the quantitative and qualitative
findings from it, it is important to understand the background and context of each
novel.
F. Scott Fitzgerald's The Great Gatsby was published in 1925 in America. The
book follows Nick Carraway as he moves East to Long Island to work in the bonds
business. There, he befriends Jay Gatsby, who is in love with Nick's cousin, Daisy. The
CORPUS ANALYSIS OF LOVE IN BRITISH & AMERICAN LIT. 3
novel follows Nick as he participates in the New York social scene, and we learn of the
historical context, the book is set in 1920s America (the Roaring 20s), which was a time
of prosperity, wealth, jazz, extravagance, and prohibition before the Great Depression.
Many people valued wealth and affluence, and because of the economic boom,
Jane Austen's Pride and Prejudice was published in 1813 in England. The novel
follows the Bennets, who have five daughters. The book follows the courtship of the
daughters by various suitors, but mainly follows two of the sisters, Jane and Elizabeth.
Elizabeth meets Mr. Darcy, whom she finds arrogant, but after a series of twists and turns
in the plot (and more proposals), she accepts his proposal and marries him.
In terms of historical context, this book is set in 19th-century England and was
written during the Romantic period, when literature was marked by an emphasis on
emotion, individualism, and nature. Some of these themes and feelings were a reaction to
the Industrial Revolution and the modernization of England that was occurring during that
time.
Considerations
While this study aims to examine cultural representations of love in both literary
works, the variable of culture was not isolated. Culture is a variable because
one book is American and one is British; however, there are two other significant
variables that need to be considered and that could have affected the results. The first is the
date of publication, as the books were written in two different centuries. As a result, corpus
analysis results may represent not only cultural differences, but also time-period
differences. A more accurate comparison would use two novels from the same
time period. The second important consideration is that one novel was written by a man
and the other by a woman. It is possible that gender differences also contributed to
variable may have affected the results. In order to isolate the variable of culture most
effectively, it would be pertinent to choose novels written in the same period as well as by
Literature Review
Stubbs (2001) provides a good introduction to corpus analysis: what it is, and how
it can be useful in understanding language. Stubbs (2001) describes that our interpretation
read or heard in the past. Stubbs says, "This means that individual texts are interpreted
against an intertextual background of norms of language use. These norms, which are
analysis of large corpora" (p. 304). Stubbs (2001) proceeds by describing how
comparisons using corpora can help to understand text cohesion, intertextual relations,
and the extent to which linguistic competence includes knowledge of norms of language
use (collocations). In regard to the current study, Stubbs's information is helpful mostly in
terms of intertextuality. While the corpus of these two novels was not compared to a more
general corpus, the novels were compared to each other, and corpus analysis allowed for the comparison
(intertextually) of not only collocations, but many other linguistic features, using the
Corpus analysis has many applications, and was used by Baker et al. (2008), in
news articles. The study used collocations and concordance analysis to identify
representative texts to carry out qualitative analysis on the topic as well. This article is
helpful for this study because the researchers combined quantitative and qualitative
analyses, as this study does. The studies differ in that the present one does not
power, ideology, and domination. Instead, the current study focuses on the cultural
representation of the abstract concept of love, but is not concerned with power or
ideological relations.
The Baker et al. (2008) study is helpful for the current one because one of the
research questions was, "what attitudes towards RASIM emerge from the body of UK
newspapers seen as a whole?" (Baker et al., 2008, p. 276), which indicates that
quantitative data was used to make qualitative inferences, similar to the current study.
The research also aptly notes that while corpus linguistic methods allow for a
reasonably high level of objectivity, subjective researcher
input is typically involved in every stage of the analysis (Baker et al., 2008, p. 277). This
is important to note because, during the transition from quantitative to qualitative analysis
(or the qualitative interpretation of quantitative data), the researcher exercises
subjectivity in determining what the quantitative data might mean qualitatively. Like the
present study, Baker et al. (2008) supplemented collocation findings with concordances,
which allow the analyst to see the words in a larger context. As Baker et
al. say, "Concordance analysis affords the examination of language features in co-text,
while taking into account the context that the analyst is aware of and can infer from the
context" (2008, p. 279). Viewing concordances within larger contexts was important for
this study and, as Baker et al. wrote, helped with making inferences. Once again, it
should be noted that inferences are subjective by nature, and thus subjectivity is a part of
this study.
A shortcoming of the Baker et al. (2008) study was that while the corpus
linguistic analysis used the whole corpus, time and money constraints meant that the critical
discourse analysis could not. Instead, the researchers had to
choose a sample of texts from the corpus to use for the critical discourse analysis.
However, given that the current study is not concerned with critical discourse analysis,
Corpus analysis is a large area of applied linguistics and, in addition to being used
alongside critical discourse analysis, has previously been used to analyze literature as
well. Fischer-Starcke (2010) has used corpus linguistics in literary analysis, specifically
with Jane Austen and her contemporaries. Fischer-Starcke's (2010) book
provides an introduction to corpus analysis and shows its application in the corpus
analysis of literary texts, specifically of Austen's novel Northanger Abbey, corpora of her
other novels, and corpora of texts by Austen's contemporaries. The analysis focuses
on the impact of quantitative keywords, phraseological units, and frequent words as they
helpful because it is another example that demonstrates corpus linguistics' wide range of
applicability. Additionally, it demonstrates that there is interest in the work of Jane
Austen outside of this study, which indicates that this is an area where, though research has
begun, more can be added. Similar to Baker et al. (2008), Fischer-Starcke (2010) also
addresses issues regarding subjectivity and objectivity in corpus analysis, which supports
this study because one consideration of the current research is the fallibility of results to which
subjectivity can contribute. Unlike Fischer-Starcke (2010), this research does not focus
on literary meanings or structural organization. Instead, the current study does quite the
opposite, as it aims to focus on cultural meanings that existed outside of the text.
The study used corpus analysis to reveal meaning in fiction, whereas previous studies
targeted non-fiction work. Fischer-Starcke's (2009) study used keywords and frequent
phrases in Pride and Prejudice to reveal literary meanings that were not apparent with
this study is that it provides evidence for the potential of corpus analysis in literature. This
is important to the current study, which applies corpus analysis to two pieces of
fiction, because it lends credibility to the use of corpus analysis with fiction.
However, unlike Fischer-Starcke (2009), the current study is not concerned with literary
meaning, and it compares two works of fiction instead of focusing on one.
This study contributes to the field of corpus analysis of fiction by carrying out analysis
on two pieces of fiction. The research is new because the corpus analysis
is not intended to be used for literary analysis. Instead, this work bridges
the fields of literature and sociolinguistics via the use of corpus analysis.
Method
This study uses corpus data collected from Jane Austen's Pride and Prejudice and
F. Scott Fitzgerald's The Great Gatsby in order to identify cultural representations of love
via the use of the word "love" in both works. In order to achieve this, both quantitative
Using the corpus software AntConc, the total frequency of the lexical item "love" was
determined for each text. Then, the software was used to
determine n-grams of "love" within each text. Afterward, the software was used to locate
collocations of "love," with a minimum frequency of 2 in the novel and a
range of 3L-3R (three words to the left and three to the right), in both novels. Finally,
each instance of "love" was located in its
concordance line, and larger context, for both texts. This is where the data was
analyzed qualitatively, because this is where the lexical item "love" was able to be
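
The frequency, n-gram, and collocation steps described above can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not a reproduction of AntConc: the tokenizer, the function names (`tokenize`, `node_frequency`, `right_bigrams`, `collocates`), and the sample sentence are all hypothetical, and AntConc's own tokenization and counting settings may produce different numbers.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes).

    Assumption: this simple rule is NOT AntConc's tokenizer.
    """
    return re.findall(r"[a-z']+", text.lower())


def node_frequency(tokens, node="love"):
    """Count how many tokens are exactly the node word."""
    return sum(1 for tok in tokens if tok == node)


def right_bigrams(tokens, node="love"):
    """Count two-word clusters that begin with the node word."""
    return Counter(
        (tokens[i], tokens[i + 1])
        for i in range(len(tokens) - 1)
        if tokens[i] == node
    )


def collocates(tokens, node="love", span=3, min_freq=2):
    """Count words within `span` tokens to the left and right of the node.

    Returns {collocate: (total, left, right)} for collocates whose total
    count is at least `min_freq`.
    """
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        left.update(tokens[max(0, i - span):i])
        right.update(tokens[i + 1:i + 1 + span])
    total = left + right
    return {
        word: (total[word], left[word], right[word])
        for word in total
        if total[word] >= min_freq
    }


if __name__ == "__main__":
    # Hypothetical sample text, not taken from either novel.
    sample = "I love you. You love me, and I am in love with love."
    tokens = tokenize(sample)
    print(node_frequency(tokens))
    print(right_bigrams(tokens).most_common())
    print(collocates(tokens))
```

Keeping the left and right counts separate mirrors the three numeric columns reported for each collocate in the collocation tables, where the first number is the sum of the other two.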
This qualitative analysis was coded based on the quantitative data, which was derived
from the collocations and concordance lines of "love." The data was coded based on the
collocates' parts of speech. The parts of speech were coded in any position within the
collocation. In other words, it did not matter where in the range of the collocation the
frequent collocate occurred; it was analyzed based on its frequency. For ease of coding,
these parts of speech were put into three categories: 1) article or preposition 2) pronoun,
lines were analyzed in relation to the larger context of the paragraph in which they
occurred. This allowed for a qualitative interpretation of the lexical item in context of the
literature, and then this information was used in order to make inferences about the
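
Reading each instance of the word in its surrounding context can be illustrated with a small keyword-in-context (KWIC) routine. This is a hedged sketch rather than the tool used in the study: the function name `kwic`, the character-based `width` parameter, and the sample sentence are assumptions made for illustration only.

```python
import re


def kwic(text, node="love", width=40):
    """Return (left context, match, right context) for each whole-word match.

    `width` is the number of characters kept on each side of the match.
    """
    pattern = re.compile(rf"\b{re.escape(node)}\b", flags=re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines


if __name__ == "__main__":
    # Hypothetical sample text, not a quotation from either novel.
    sample = "She was in love with him. Love, he said, is blind."
    for left, word, right in kwic(sample, width=15):
        print(f"{left:>15} [{word}] {right}")
```

Because the pattern uses word boundaries, forms such as "loved" or "lovely" are not matched, which keeps the output limited to the single word form under study.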
For this project, both texts were located first as PDFs on the web and then
Results
Tables 1-5 (below) display the results from the corpus analysis for frequency,
n-grams, and collocations. Tables 6-7 (below) display the quantitative data that was coded
and then used for qualitative analysis. For preservation of space and ease of reading, the
Table 1 (below) shows the total frequency of "love" in both works, as well as the
normed frequency of "love," which allows for a comparison of the frequency of use in
each text.
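
A normed frequency rescales a raw count to a common text length so that texts of different sizes can be compared. The sketch below shows the arithmetic only. The function name `normed_frequency`, the base of 10,000 words, and all of the counts are hypothetical, since the norming base and the values of Table 1 are not given here.

```python
def normed_frequency(raw_count, total_tokens, base=10_000):
    """Return occurrences per `base` tokens.

    Assumption: a base of 10,000 is used for illustration only.
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return raw_count / total_tokens * base


if __name__ == "__main__":
    # Hypothetical counts, NOT the figures from Table 1.
    text_a = normed_frequency(raw_count=30, total_tokens=60_000)
    text_b = normed_frequency(raw_count=90, total_tokens=120_000)
    print(text_a, text_b, text_b / text_a)
```

Dividing one normed frequency by the other gives the kind of "times more frequent" ratio used to compare the two novels.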
Table 1.
Frequency of love
Table 2
N-Gram Frequency
love with 4
love to 3
love you 3
love belongs 1
love daisy 1
love every 1
love her 1
love him 1
love himpossibly 1
love it 1
love nest 1
love new 1
love through 1
love, but 1
love, nick 1
love, nor 1
love, of 1
Table 3 (below) presents the n-grams/clusters of "love" in Pride and Prejudice.
Table 3.
N-Gram Frequency
love with 17
love to 5
love him 4
love of 4
love, and 4
love in 3
love, i 3
love and 2
love as 2
love before 2
love her 2
love me 2
love; and 2
love; for 2
love a 1
love by 1
love can 1
love each 1
love for 1
love it 1
love merely 1
love mr 1
love must 1
love now 1
love or 1
love which 1
love without 1
love you 1
love!" "i 1
love' is 1
love, ardent 1
love, flirtation 1
love, from 1
love, has 1
love, it 1
love, rather 1
love, ring 1
love, should 1
love, tell 1
love, their 1
love, though 1
love," said 1
love. as 1
love. of 1
love. wherever 1
love." "it 1
love." "was 1
love; but 1
love? is 1
love?" "i 1
love?" "oh 1
Table 4 (below) shows the collocations with "love" in The Great Gatsby, with a
Table 4.
Collocate Frequency Left Right
i 13 10 3
you 9 3 6
in 6 6 0
and 6 3 3
to 5 1 4
with 4 0 4
of 4 2 2
me 4 0 4
t 3 2 1
she 3 1 2
your 2 1 1
wife 2 1 1
too 2 0 2
their 2 2 0
the 2 1 1
more 2 1 1
it 2 1 1
her 2 0 2
had 2 2 0
gatsby 2 1 1
but 2 0 2
all 2 0 2
Table 5 (below) shows the collocations with "love" in Pride and Prejudice, with a
Table 5.
Collocate Frequency Left Right
in 42 36 6
of 24 14 10
to 19 10 9
i 18 8 10
with 17 0 17
you 15 4 11
and 15 2 13
her 14 5 9
my 11 9 2
much 10 9 1
the 9 4 5
be 8 5 3
as 8 2 6
him 7 1 6
for 7 3 4
but 7 5 2
very 6 5 1
that 6 4 2
not 6 4 2
is 6 2 4
so 5 3 2
it 5 1 4
his 5 5 0
he 5 1 4
from 5 2 3
been 5 3 2
all 5 3 2
a 5 1 4
was 4 2 2
mr 4 1 3
me 4 0 4
love 4 2 2
violently 3 3 0
they 3 2 1
s 3 2 1
really 3 3 0
must 3 1 2
if 3 2 1
friend 3 0 3
fall 3 3 0
darcy 3 0 3
world 2 1 1
well 2 0 2
though 2 0 2
there 2 0 2
their 2 1 1
than 2 0 2
still 2 1 1
should 2 1 1
other 2 0 2
or 2 0 2
one 2 0 2
object 2 1 1
now 2 1 1
nothing 2 2 0
no 2 1 1
may 2 1 1
make 2 2 0
lydia 2 2 0
herself 2 2 0
have 2 1 1
half 2 1 1
had 2 0 2
falling 2 2 0
everything 2 1 1
each 2 0 2
can 2 1 1
by 2 0 2
better 2 0 2
being 2 2 0
before 2 0 2
at 2 1 1
ardent 2 1 1
The qualitative analysis was conducted using the data from each contextual
instance of "love" in both texts. The coding of the concordances, which was used for the
qualitative analysis, is presented in Tables 6 and 7 (below) for each text. Tables 6 and 7
represent the negotiated data based on two raters. Two raters were used for the coding of
this data to account for inter-rater reliability. The first rater is a candidate for a Master's in
English, and the second rater has a Master's in Microbiology, but is well-read and well-
speech, or because certain words can be classified into different parts of speech based on
usage.
Table 6.
Frequency 33 39 6 0
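
The coded frequencies can be turned into percentages with simple arithmetic. The sketch below assumes, without confirmation from the table itself, that the four values in the row above correspond in order to (1) articles or prepositions, (2) pronouns, nouns, and possessives, (3) adverbs, adjectives, or verbs, and (4) an unlabelled fourth column. The category names and the function `category_shares` are hypothetical.

```python
def category_shares(counts):
    """Convert raw category counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no coded collocates to summarize")
    return {name: 100 * value / total for name, value in counts.items()}


if __name__ == "__main__":
    # Assumed mapping of the frequency row to coding categories.
    table_6 = {
        "article_or_preposition": 33,
        "pronoun_noun_possessive": 39,
        "adverb_adjective_verb": 6,
        "fourth_column": 0,
    }
    for name, share in category_shares(table_6).items():
        print(f"{name}: {share:.1f}%")
```

Under that assumed mapping, the third category is 6 of 78 coded collocates, or about 7.7%, which rounds to the 8% figure cited for The Great Gatsby in the Discussion.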
Table 7.
Discussion
research has indicated. That said, this discussion is a combination of objective data and
Pride and Prejudice uses "love" approximately 1.5 times more frequently than
The Great Gatsby (see Table 1). In general, this suggests that Jane Austen's work has love
as more of a central theme than The Great Gatsby. While this could be attributed to the
author's gender or the time period in which the novel was published, culturally speaking, the
British representation of love with the use of "love" is more frequent than the American
representation of love. This finding is not too surprising, as Pride and Prejudice is
centered on the courtship and engagements of multiple characters, while The Great
Gatsby approaches love in a different way. The Great Gatsby examines love more in
regard to people's aspirations: many of them aspired to money and wealth, and prioritized
In terms of the collocations, it can be seen that in The Great Gatsby, "love"
collocates most frequently with articles and prepositions, and with pronouns, nouns, and
possessives. In fact, only 8% of the collocates in The Great Gatsby are adverbs,
adjectives, or verbs. This could mean that love is expressed with less vigor and emotion
in The Great Gatsby (and consequently in American culture) than in British culture.
Instead, love is seen as something directly attached to someone (i.e. my love, your
Pride and Prejudice displays a more even range of collocates, and a percentage
that is twice as high (16%) for collocates that were coded as adverbs, verbs, or adjectives.
This can be taken to mean that there is more emotion, passion, and description in the love
that is represented in the novel (and consequently in British culture). Examples include
"ardent," "violently," and "very." These are engaging and inspiring words, which would
Overall, as previously mentioned, there are many considerations with this study in
regard to the novels' abilities to represent a cultural expression of love. While time period
and author's gender need to be factored in, so does the fact that literature cannot claim to
be representative of a culture as a whole. However, these novels are classics, and because
they have maintained popularity, they are one window through which to view both
American and British cultural representations of love. If nothing else, this work
demonstrates how different works of literature (and different cultures) can represent the
same idea in very different ways, and how corpus analysis can be used as a tool to
References
Doi: 10.1177/0957926508088962.
Fischer-Starcke, B. (2010). Corpus linguistics in literary analysis: Jane Austen and her
contemporaries. Continuum.
Fischer-Starcke, B. (2009). Keywords and frequent phrases of Jane Austen's Pride and
523.
Fitzgerald, F. S. (1925). The great Gatsby. New York: Simon & Schuster.
Stubbs, M. (2001). Computer-assisted text and corpus analysis: Lexical cohesion and
communicative competence. In D.S. Editor , D.T. Editor, & H.H. Editor. The