
PROJECT FOR THE SOCIOLINGUISTIC STUDY OF SPANISH

FROM SPAIN AND AMERICA (PRESEEA)


A CORPUS WITH A GRAMMAR AND DISCOURSE BIAS

Francisco Moreno-Fernández
University of Alcalá – Instituto Cervantes. Spain

1. INTRODUCTION

The very ambitious project of gathering a sociolinguistic corpus of spoken Spanish already
exists: it is called the "Project for the Sociolinguistic Study of Spanish from Spain and
America" (in Spanish, Proyecto para el Estudio Sociolingüístico del Español de España
y de América).1 The acronym PRESEEA (Sp. presea 'jewel; present') tries to express
the project's general goal: to become something valuable for future knowledge of the
Spanish language and useful to the people concerned with its study.
The goal is to coordinate sociolinguistic researchers from Spain and Hispanic
America in order to make comparisons between different studies and materials possible,
as well as a basic exchange of information.
The project's basis is collaboration: one offers one's own information in order to receive
information from other researchers. To receive materials gathered in other areas, a team
must collect spoken language materials from its own community according to the same
previously determined methodology. This sociolinguistic project requires coordination,
which is based at the University of Alcalá (Spain).2
Universities and institutions contributing information and spoken language materials
constitute the associate centers.
Materials provided by the associate centers, following a general guideline,
constitute the PRESEEA Corpus. In order to create a spoken language corpus with
sufficient guarantees and rich in linguistic information (phonetics, grammar
and discourse), the following tasks must be addressed:

1. A basic sociolinguistic methodology. Associate centers are committed to
collecting sociolinguistic materials according to that methodology; only in this way can
the gathering of homogeneous, and therefore comparable, samples be guaranteed.
2. Editing and publication of materials. All linguistic materials gathered by
associate centers are edited in a previously determined way.

1
In April 1993, during the 10th International Conference organized by the Latin American Association of
Linguistics and Philology (ALFAL), its "Commission of Sociolinguistics" met and decided to start a
research project for the sociolinguistic study of the cities of Ibero-America and the Iberian
Peninsula. The project includes three activities: 1. To build a sociolinguistic database for Latin
America and the Iberian Peninsula (in Spanish and Portuguese). 2. To create a Spanish Language
Sociolinguistic Corpus (PRESEEA). 3. To create a Portuguese Language Sociolinguistic Corpus
(PRESOPO). The first phase of the project will finish in 2010.
2
The PRESEEA coordinator assumes the following commitments for the Spanish sociolinguistic corpus:
1. To establish contact with centers interested in participating in PRESEEA.
2. To distribute information on the basic sociolinguistic methodology to be followed by the associate centers.
3. To provide technical and methodological assistance.
4. To develop the necessary instruments for communication between the project's researchers.
The following pages introduce the methodological guidelines for gathering and editing the
PRESEEA sociolinguistic Corpus and explain why the project has a grammar and
discourse bias.

2. BACKGROUND ISSUES

Knowledge of the Spanish language and of its use in Spain, on the American continent,
or in the Spanish-speaking territories of Africa and Asia has grown to an unprecedented
extent during the last half century, and especially during the last twenty years. In
1964 Lope Blanch stated:

No creo que resulte excesivamente exagerado afirmar que el "español de
América" sigue siendo un ilustre desconocido. Ya el mismo nombre, de
aplicación global, con que se designa indiscriminadamente a tantas y tan
diferentes modalidades del habla hispánica — español de América — es buena
prueba del estado de ignorancia relativa en que nos encontramos.

I do not think it an excessive exaggeration to state that the "Spanish of America" remains
an illustrious unknown. The very name, applied globally to designate indiscriminately so
many and such different varieties of Hispanic speech (Spanish of America), is good
proof of the state of relative ignorance in which we find ourselves.

But the complaints of several generations of linguists about the lack of first-hand
information on America, Guinea, the Philippines, or the urban speech
of Spain have subsided. Some of those linguists decided to move from complaint to
action, from library shelves to questionnaires, recordings and documents. Thanks
to their work, Hispanic Geolinguistics and Sociolinguistics have advanced
considerably in recent years.
In the field of Hispanic linguistic geography, the publication of atlases has been
continuous throughout the second half of the 20th century, in spite of the enormous
deficiencies remaining from the first half of the century. In Spain, after the first volume of
the Atlas Lingüístico y Etnográfico de Andalucía, some important works appeared: the
first volume of the Atlas Lingüístico de la Península Ibérica, the Atlas Lingüístico y
Etnográfico de las Islas Canarias, the Atlas Lingüístico y Etnográfico de Aragón, Navarra y
Rioja, the Léxico de los marineros peninsulares, the Atlas Lingüístico y Etnográfico de
Cantabria, and the Atlas Lingüístico y Etnográfico de Castilla-León, all of them by Manuel
Alvar, and the on-line version of the Atlas Lingüístico (y etnográfico) de Castilla-La
Mancha, by García-Mouton and Moreno-Fernández. Contributions to the Atlas
Linguarum Europae, the Atlas Lingüístico del Mediterráneo, and the Atlas Linguistique Roman
must be added to those works.
In Hispanic America, numerous geolinguistic projects have been conducted. To
Navarro's early map collection (El español en Puerto Rico) must be added the
only volume of the Atlas Lingüístico y Etnográfico del Sur de Chile, the complete Atlas
Lingüístico y Etnográfico de Colombia and the Atlas Lingüístico de México, in addition to
other works in progress, such as the Atlas Diatópico y Diastrático de Uruguay. But without doubt
the project that makes it possible, for the first time, to obtain a general linguistic picture of all
Spanish-speaking America is the Atlas Lingüístico de Hispanoamérica, which so far
comprises the works El español en el Sur de los Estados Unidos, El español en la
República Dominicana, El español en Venezuela, and El español en Paraguay, all of them
by Manuel Alvar, and other forthcoming volumes.
The sociolinguistic research carried out in the last thirty years has taken the concept of
"Sociolinguistics" in its widest sense and embraced all work concerned with the
relationship between language and society. Hispanic Sociolinguistics includes the
following lines of research:

a) Social Dialectology. The earliest line of study, it developed as an extension of
dialectological studies and enjoys a long tradition and great prestige within the
Hispanic world. In outline, these studies take diverse social factors into account and
work with data collection techniques similar to those used in Geolinguistics. Within this
scope, the works by Manuel Alvar on the speech of Las Palmas de Gran Canaria are
included, as well as studies handling sociolinguistic material from regional
atlases. Research conducted by José Pedro Rona may also be included in this
category. Both authors contributed significantly to the later development of
Hispanic Sociolinguistics.
The project for the study of the literate norm of the main cities of Ibero-America
and the Iberian Peninsula (PILEI project), promoted by Lope Blanch in the mid-1960s,
is a foremost expression of Social Dialectology. This great project has led
to the publication of materials of educated speech from several cities: Madrid,
Seville, Lima, Santiago de Chile and San Juan de Puerto Rico, among others. As
a part of the project, a questionnaire was also used in the surveys to obtain lexical
information from literate urban speakers. It must be considered, however, that, in
spite of the importance of gathering a vast quantity of comparable materials and
of the works derived from them, the data offered by this magnificent project apply only
to literate, careful and educated uses and have, therefore, a very limited
sociolinguistic value.

b) Sociology of Language. Studies by Fishman and Gumperz, as well as Charles
Ferguson's ideas, have had an important effect in the Hispanic world,
specifically regarding the study of bilingual situations. In Hispanic America, the
Sociology of language has primarily focused on the relations between Spanish
and indigenous languages. Granting that any selection is unfair, the studies by
Yolanda Lastra on Mexican bilingual situations and by A. Escobar on
bilingualism in the Andean regions deserve a special mention. We do not comment
here on the many publications on the Sociology of language in general, and on
bilingualism in particular, that deal with Spanish in the United States.

c) Variationist Sociolinguistics. This research field follows the sociolinguistic
guidelines set by William Labov and his collaborators in the USA and
Canada. Notables such as Humberto López-Morales, Carmen Silva-Corvalán,
and Beatriz Lavandera are among the Hispanic pioneers in this field. The list of
researchers has grown considerably, with those focused on Caribbean
Spanish (Orlando Alba), on Argentinean and Uruguayan Spanish (for example,
Adolfo Elizaincín), and on Peruvian Spanish (Rocío Caravedo). Most
variationist works deal with phonetic features, although grammatical analyses have
been carried out as well, especially on inflectional morphology.

In general terms, Hispanic Sociolinguistics has arisen from schools of Dialectology
and retains the culture of Social Dialectology. Later, the influence of
North American research shaped the methodology of many specialists, causing
them to vacillate between Variationist Sociolinguistics and Sociology of
language.

Everything reviewed up to now, specifically regarding Geolinguistics and
Sociolinguistics, leads to the following conclusions:

1. An abundance of studies and research surfaced in the second half of
the twentieth century. For the first time, researchers have a serious and
sufficiently wide knowledge of the reality of Spanish in Spain and
America.

2. Completed research projects and those in progress constitute a set of
antecedents that allow us to plan a new line of research on solid
foundations. The experience needed to start other necessary projects
with sufficient guarantees has already been gained. It is true that not all
areas in the Spanish-speaking world have the resources or researchers to
support such projects; yet the means are available in other areas where
studies are conducted.

3. Sociolinguistic research thus far has paid attention primarily to the
phonetic aspects of the Spanish language, and secondarily to questions
relating to grammar or lexicon. Aspects relating to pragmatics and
discourse have only been analysed when data had been collected
specifically for that purpose. No previous discourse analysis based on
general spoken language corpora exists.

4. The PILEI project, coordinated by Lope Blanch, is the clearest and
closest antecedent of the project introduced here, although the differences
are obvious: PRESEEA does not limit data collection and analysis to
literate uses; it tries to know Spanish urban varieties in the most
profound way possible. The complete sociolinguistic study of Spanish
from Spain and America must be a step forward for research in
Linguistics.

Hispanic America is the most urbanized among the less developed regions in the world.
Practically three quarters of the population live in cities, and it is expected that by
2025 approximately 85% of the Hispanic American population will be urban. It follows
that a detailed knowledge of the Spanish of America requires the sociolinguistic
study of its urban communities.

3. METHODOLOGICAL ISSUES

3.1 Speech communities

PRESEEA follows theoretical principles based mostly on variationist sociolinguistics,
which is our theoretical point of departure (Moreno-Fernández 1998). The
project's guidelines are very general; in other words, they are flexible enough to be
applied in any Hispanic urban community that meets certain conditions.
The requirements for including an urban community in PRESEEA's scope
are not very restrictive: it should be a Spanish-speaking urban community,
monolingual or bilingual, with a population (or a part of it) traditionally settled in the
place and with a certain sociological diversity.
The reasons for these minimum requirements are easy to explain. First of all, the
project's general goal is to achieve a synchronic sociolinguistic Corpus of the Spanish
language; PRESEEA's speech communities may be monolingual in Spanish or
bilingual, although in the latter case Spanish must be a language in frequent use in the
community, and the bilingual speakers should be able to use Spanish in conditions
functionally similar to monolingual use. Obviously, a suitable and complete study of
bilingual cities requires the consideration of many other elements that are not discussed
further here: linguistic attitudes towards each language, sociostylistic
distribution, and the social functions of the languages. In these cases, the researchers directly
responsible for the study of each bilingual city will determine how the common
methodological criteria can be combined with other specific criteria depending on
the city's particular profile.
It is also advisable to work with communities that have a well-established population, to
ensure that an awareness of speech community exists, with a well-known sociostylistic
configuration recognizable by the speakers themselves. In addition, in order to be
confident that the research effort produces results, it is proposed to work with
communities offering internal variety and sociological richness.
PRESEEA's methodological guidelines are applied in speech communities associated
with specific cities. It is possible that, in many cases, a speech community exceeds the
limits of a given city, but it is realistic to work on entities delimited with a certain
objectivity. PRESEEA's data collection can be done in any speech community meeting
the conditions already stated, although it is important not to neglect the study of the
biggest cities of Spain and Hispanic America. Table 1 lists the number of inhabitants of
the most populated cities in each Hispanic country.

______________________________________________________________________
City (country)                          Year    Inhabitants
______________________________________________________________________
Buenos Aires (Ar)                       1991    2,965,403 (urban conglomerate: 10,911,403)
Córdoba (Ar)                            1991    1,179,067
La Paz (Bo)                             1992    711,036
Santa Cruz de la Sierra (Bo)                    694,616
Santa Fe de Bogotá (Co)                 1993    5,726,957
Cali (Co)                               1993    1,783,546
San José (CR)                           1992    302,574
Alajuela (CR)                           1991    158,276
La Habana (Cu)                          1989    2,077,938
Santiago de Cuba (Cu)                   1990    405,354
Santiago (Ch)                           1992    5,180,757
Concepción (Ch)                         1992    330,448
Guayaquil (Ec)                          1990    1,508,844
Quito (Ec)                              1990    1,100,847
San Salvador (ES)                       1992    422,570
Santa Ana (ES)                          1992    202,337
Madrid (Es)                             1991    3,084,673
Barcelona (Es)                          1991    1,681,132
Ciudad de Guatemala (Gu)                1995    1,167,495
Escuintla (Gu)                          1995    123,048
Tegucigalpa (Ho)                        1989    608,100
San Pedro Sula (Ho)                     1988    321,197
México D.F. (Mé)                        1990    8,235,744 (Ciudad de México: 18,747,400)
Guadalajara (Mé)                        1990    2,178,000
Managua (Ni)                            1985    682,111
León (Ni)                               1985    100,982
Ciudad de Panamá (Pan)                  1990    584,803
San Miguelito (Pan)                     1990    243,025
Asunción (Par)                          1992    502,426
Ciudad del Este (Par)                   1992    133,893
Lima (Pe)                               1993    6,434,328 (includes Callao)
Arequipa (Pe)                           1993    633,428
San Juan (PR)                           1990    437,745
Bayamón (PR)                            1990    220,262
Santo Domingo (RD)                      1989    2,200,000
Santiago de los Caballeros (RD)         1989    467,000
Montevideo (Ur)                         1985    1,311,976
Salto (Ur)                              1985    80,823
Caracas (Ve)                            1990    1,822,465 (metropolitan area: 2,784,042)
Maracaibo (Ve)                          1990    1,363,873
______________________________________________________________________
Table 1.- Most populated cities and number of inhabitants. Source: Almanaque mundial
1996, Editorial Televisa, 1995.

The list in Table 1 is just a general frame of reference for the type of city that fits the
PRESEEA guidelines. It does not mean that smaller communities cannot be
studied with this sociolinguistic methodology. So far (January 2004), the Hispanic
communities incorporated into PRESEEA to contribute spoken language samples are the
following (Table 2):
______________________________________________________________________
Argentina
    Neuquén (Patagonia)
Colombia
    Barranquilla (Caribbean coast)
    Bogotá
Guatemala
    Guatemala
México
    Culiacán (Sinaloa)
    México DF
Puerto Rico
    San Juan de Puerto Rico
Spain
    Alcalá de Henares (Madrid)
    Cádiz (Andalucía)
    Las Palmas (Canary Islands)
    Lérida (Catalunya)
    Madrid
    Málaga (Andalucía)
    Valencia
    Zaragoza (Aragón)
Venezuela
    Caracas
______________________________________________________________________
Table 2.- Hispanic cities incorporated into PRESEEA (Jan. 2004). Visit
<http://www.linguas.net/preseea>

Urban centers are of special importance in South America
because, for instance, the urbanization rate in Venezuela is 91%, in Uruguay 89%, and
in Buenos Aires 86%. The population of large cities keeps growing rapidly
and concentrates most of each country's population. Such areas are home to a highly
diverse population with various levels of integration into urban life. This is why
problems arise in identifying the cities' typical features and distinguishing them from
more anecdotal or circumstantial aspects, even though the latter could become identifying
signs in the future. The task is to define a proper concept of the “bogotano” speaker,
“caraqueño” speaker, “madrileño” speaker, or “sanjuanero” speaker. In this respect,
although PRESEEA clearly shows a preference for typically urban spoken Spanish, the
researchers responsible for each team decide what is essential and what is circumstantial
or accessory in each community. They decide which is the absolute and which the
relative universe to handle in the surveys, in keeping with the common methodological
prescriptions, and they make all kinds of decisions on the more open aspects of
the methodology. As a general reference, it is proposed that speakers should have
been born in the city, have arrived there before they were ten years old, or have lived
there for more than twenty years, as long as their linguistic origin is not noticeably
different.
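
The general reference for selecting speakers can be written as a simple predicate. The following is only a minimal sketch in Python: the function name, its parameters, and the boolean flag for linguistic origin are hypothetical, and it assumes that the proviso on linguistic origin applies to every speaker not born in the city, a reading that each team may prefer to adjust.

```python
from typing import Optional


def is_eligible(born_in_city: bool,
                age_on_arrival: Optional[int] = None,
                years_in_city: int = 0,
                similar_linguistic_origin: bool = True) -> bool:
    """Return True if a speaker meets the general residence reference."""
    if born_in_city:
        return True
    # Assumption: the proviso on linguistic origin applies to all
    # speakers who were not born in the city.
    if not similar_linguistic_origin:
        return False
    arrived_as_child = age_on_arrival is not None and age_on_arrival < 10
    long_term_resident = years_in_city > 20
    return arrived_as_child or long_term_resident
```

Local teams remain free to replace these thresholds with their own criteria; the sketch only fixes the default reading of the proposal.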

3.2 Sampling

Samples must be representative of the universe on which the sociolinguistic study
focuses. Since the communities in which the PRESEEA methodology is potentially applicable
are vast and diverse, a prototype sample, as a primary target, has to allow speech
to be collected using common sociological and stylistic parameters.
This implies working with a single type of relative universe shared by all the
Hispanic speech communities, which guarantees that spoken language materials
can be compared. Local researchers are free to extend the profile of this
relative universe according to their particular interests and to work with wider relative
universes.
The general proposal is to prepare quota samples with uniform allocation. This
consists of dividing the relative universe into subpopulations, strata or quotas, defined
by certain social variables, and assigning an equal number of informants to
each quota. This system is preferred to random or
probabilistic sampling because quotas allow easier statistical comparison among themselves
and between different samples. In addition, researchers commit themselves to looking for
speakers beyond their own circles of influence and beyond those who are merely
convenient to interview.
It is reasonable to create quotas for three social variables: gender, age, and educational
level. In a post-stratification stage, other factors are taken into account, such as
profession, income, or housing conditions. In a provisional and
experimental way, the possibility of handling the "way of life" variable deserves some
consideration. All information about these secondary variables is collected
through a questionnaire completed for each speaker. For the “way of life” variable, information
is collected through the interviews in addition to the questionnaires. By crossing data from
those variables, it is possible to work with another post-stratification variable:
socio-cultural level.
The prototype sample is proposed in the following table.

_________________________________________________________________
                 Generation 1    Generation 2    Generation 3
                 M       W       M       W       M       W
_________________________________________________________________
Educ. level 1    11M     11W     12M     12W     13M     13W
_________________________________________________________________
Educ. level 2    21M     21W     22M     22W     23M     23W
_________________________________________________________________
Educ. level 3    31M     31W     32M     32W     33M     33W
_________________________________________________________________
Table 3.- Prototype sample by quotas. (M: men; W: women)
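
The cell codes in Table 3 combine educational level (first digit), generation (second digit), and gender (letter). As an illustration only, the grid can be generated programmatically; the Python names below (`quota_cells`, `uniform_allocation`) are hypothetical and not part of the PRESEEA guidelines.

```python
from itertools import product

EDUCATION_LEVELS = (1, 2, 3)
GENERATIONS = (1, 2, 3)
GENDERS = ("M", "W")  # M: men; W: women


def quota_cells():
    """Return the Table 3 cell codes, row by row (education, generation, gender)."""
    return [f"{edu}{gen}{gender}"
            for edu, gen, gender in product(EDUCATION_LEVELS, GENERATIONS, GENDERS)]


def uniform_allocation(speakers_per_cell):
    """Assign the same number of informants to every quota cell."""
    return {cell: speakers_per_cell for cell in quota_cells()}


cells = quota_cells()
print(cells[:6])   # ['11M', '11W', '12M', '12W', '13M', '13W']
print(len(cells))  # 18
```

Keeping the allocation as a mapping from cell code to a quota makes it straightforward to check, during fieldwork, which cells still lack informants.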

As regards sample size, it is reasonable to take four speakers for each cell in
Table 3, given that uniform allocation is proposed. The sample then consists of
72 informants, which represents a proportion of 1/25000 for a city of around two million
inhabitants and is considerably more generous for cities with a smaller population.3
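
The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical; the only inputs taken from the text are the 18 cells of Table 3 and the number of speakers per cell. Note that 72 informants correspond to exactly 1/25000 for a population of 1,800,000, which the text rounds to "around two million".

```python
CELLS = 3 * 3 * 2  # educational levels x generations x genders (Table 3)


def sample_size(speakers_per_cell: int = 4) -> int:
    """Total number of informants under uniform allocation."""
    return CELLS * speakers_per_cell


def inhabitants_per_informant(population: int, speakers_per_cell: int = 4) -> float:
    """How many inhabitants each informant stands for in a given city."""
    return population / sample_size(speakers_per_cell)


print(sample_size())                                # 72
print(round(inhabitants_per_informant(1_800_000)))  # 25000
print(round(inhabitants_per_informant(2_000_000)))  # 27778
```

Changing `speakers_per_cell` to 3, 5, or 6 gives samples of 54, 90, and 108 informants respectively.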
As said before, the social variables used to divide the universe are gender, age, and
educational level. All of them allow quantitative sociolinguistic processing
(Moreno-Fernández 1998). Regarding the convenience and interest of working with the age
variable, there is nothing to add to the arguments given in the
sociolinguistic literature: it is simply an essential variable in any work in this field. It
is proposed to distinguish three generations: 20 to 34 years; 35 to 54 years; and 55
years and older. In this respect, it is important to take into account that life expectancy
in Hispanic America ranges approximately between 60 and 75 years.
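
The three age bands translate directly into a classification rule. The sketch below is illustrative: the function name is hypothetical, and returning `None` for speakers under 20 is an assumption about how to treat ages outside the proposed bands.

```python
from typing import Optional


def generation(age: int) -> Optional[int]:
    """Map an age in years to generation 1, 2 or 3 (None if under 20)."""
    if age < 20:
        # Assumption: speakers below the first band are left unclassified.
        return None
    if age <= 34:
        return 1
    if age <= 54:
        return 2
    return 3
```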
As regards the inclusion of "gender" and "educational level" in the samples, it should be
remembered that very few sociolinguistic studies have been conducted without them,
although gender is a factor of little explanatory capacity in a good number of analyses.
In order to facilitate comparisons with the results of dozens of studies, it seems suitable
to maintain gender as a stratification factor.4 In turn, post-stratification variables
allow comparisons with the results of previous research and are useful as reference
points. Variants (or factors) are proposed for the following post-stratification variables
(or factor groups): income,5 housing conditions,6 and profession.7

3
Only half a dozen of the cities included in Table 1 would fall below the level of representativeness
usually considered canonical (0.025), although it is true that they tend to be the cities with the greatest
sociolinguistic prestige and socio-economic weight (Buenos Aires, Lima, Madrid, México, Santa Fe de
Bogotá and Santiago de Chile). In those cases, it is up to the local researchers to decide whether to collect
materials through partial studies (the most representative districts, the most populated neighbourhoods, ...)
or to increase the number of speakers per quota to five (90 informants) or six (108). Similarly, for urban
communities with fewer than 500,000 inhabitants, the number of speakers per quota could be
reduced to three, so that the sample would consist of 54 informants (1/9250).
4
The variants distinguished in the variable "educational level" are the following: 1. Illiterate, without
studies, or primary education (up to 10-11 years old approx.); approximately 5 years of schooling. 2.
Secondary education (up to 16-18 years old approx.); approximately 10-12 years of schooling. 3. Higher
education (university, college) (up to 21-22 years old approx.); approximately 15 years of schooling.
5
It is recommended to distinguish five categories, with exclusively local validity.
6
1, House with sanitary and access limitations; 2, Modest house or apartment; 3, Elegant and spacious
house or apartment, with many amenities.
7
1, Travelling pedlars/hawkers and salesmen, unskilled urban workers, farmers, domestic service,
unskilled services; 2, Small retailers, secretaries and clerks, skilled workers, craftsmen,
mechanics, salesmen in stores, collectors, technical assistants, policemen and guards, soldiers; 3,
The "way of life" variable was introduced by Højrup and developed by James Milroy;
it allows small-scale social networks to be linked to other structures or social
groups.8 As for the usefulness of the concept of "way of life" in a project
like PRESEEA, it is important to bear in mind that the three typical ways of life of the Western
world are common or regular enough to be found in practically all Hispanic
speech communities. These ways of life capture some basic features of the socio-cultural and
socio-economic levels handled in other studies, and they can avoid several serious
problems, such as the virtual non-existence of a "middle class". The hope, therefore, is that
they will prove to be explanatory variables for linguistic behavior. A further argument in favor of
the "way of life" variable is that each research group is able
to include other “ways of life” not found in other communities but perhaps
indispensable in the study of certain societies. However, it is important that each feature
of the “ways of life”, common and specific, and the socio-cultural guidelines
associated with them, be described in a complete and detailed way. The information needed
to assign a speaker to one way of life or another can be collected using personal data
sheets and information gathered during the interviews.
Regarding the analysis of the gathered materials, researchers can use as explanatory
variables either the independent post-stratification variables or these same variables
combined into socio-cultural or socio-economic levels. It is not advisable to treat a set of
these as independent variables, because overlaps would be inevitable and the
quantitative analyses would be affected.
To conclude this chapter, it is important to highlight that all these
methodological criteria and norms are a minimum that seeks equivalence or comparability
between materials coming from different research centers. They are the bases for a

University professionals, teachers in secondary and primary education, small industrialists and producers,
middle managers, technicians, supervisors; 4, Self-employed university professionals, public and
private sector managers, commissioned military officers, medium industrialists and producers, college
students; 5, Senior officials of the legislative, executive, and judicial authorities, high-ranking officers of the
Army, great private industrialists, great landowners, executive directors of the public and private sectors.
8
The way of life follows a model in which several ethnic groups or classes are represented as elements
that are internally structured and related to other groups. In this model, linguistic behaviour depends more on the
determining power of networks and structures than on the attributes perceived as typical of certain social
groups. In addition, there are networks with the capacity to impose their sociolinguistic patterns on weaker ones.
Priority is given to types of job and family activities, and to the speaker's relations with other members of
the group, over certain characteristics or qualifying attributes. Groups are considered a consequence of the
fundamental structures of society, which divide the population into substantially different ways of life. The ways
of life proposed by Højrup and Milroy, and proposed by us for use within our project in an
experimental, provisional, and absolutely voluntary way, are the following:
Way of life 1.- Primary units of production (agriculture, fishing, small services). Cooperative relationship
among workmates. Family involved in production. Self-employment. Little free time: the more one
works, the more one earns. Close-knit social networks.
Way of life 2.- Work in a production system that is not controlled by the workers. One works to earn a wage
and enjoy periods of free time. Labour relations separated from the family sphere. Some job mobility.
Close-knit networks of solidarity with companions and neighbours.
Way of life 3.- Qualified profession, able to control production and to direct the work of other
people. Vacation time dedicated to work. One works to rise in the hierarchy and to acquire more
power. Competitive attitude towards colleagues.
The ideological features that would characterize these ways of life would be "the family" for way 1,
"leisure" for way 2, and "work" for way 3. It must be borne in mind, however, that the concept of "way of life"
is fundamentally structural; the characteristic profile of a group is determined in contrast with those of
the other ways. Moreover, the relations between the three cultural ways of life and the practices
associated with them need not be exactly the same in all countries, which is why, in a
contrastive study, it is important to describe them in detail.

common use, but nonexclusive; the aim is that local researchers feel free to go beyond
the methodological requirements: nothing prevents them from administering linguistic attitude
questionnaires or other tests of different types. Increasing the number of speakers per
quota is also possible, as is making recordings in different contexts or situations.
It is also legitimate to analyze speakers between 14 and 19 years old, to work with ways of
life other than those previously described, or to include post-stratification variables in addition
to the planned ones. There is no doubt, however, about the significance of materials
that follow the common PRESEEA methodological guidelines.

4. DATA COLLECTION: LOOKING FOR GRAMMAR AND DISCOURSE MATERIALS

PRESEEA materials are collected through recorded interviews. Researchers
hold conversations with speakers who have prearranged characteristics, in appropriate
contexts within each speech community. In order to obtain a minimum of stylistic
uniformity, to make comparison possible, and to preserve the usefulness of materials
coming from different speech communities, it is recommended that data collection be
carried out in an accessible place, perhaps one belonging to an institution with some level of
familiarity for the townspeople: for instance, a local office,
a school classroom, or a room in a cultural or leisure center. Otherwise, there is a risk of
obtaining interviews from heterogeneous places, with different kinds of listeners, and
recordings with high levels of noise, which make it enormously difficult to arrive at an
appropriate stylistic interpretation of the speech materials. It is true that travelling to
"official" or distant places, away from family or workplace surroundings,
can lead to a loss or reduction of spontaneity in speech, but what is lost in this respect can
be gained in recording quality and stylistic homogeneity. In any case, it is quite
likely that no sociolinguistic interview can be considered truly spontaneous.
Considering all these circumstances, the people in charge of each research group have to
assess whether these kinds of urban settings are the most suitable places for interviews,
or whether the speakers' homes or workplaces are appropriate for the project's aims.
PRESEEA Corpus materials are gathered through interviews recorded with a tape recorder
kept in plain sight. Recordings are made with the best-quality equipment available in each
case (open-reel tape, cassette recorder, DAT, etc.). Conversations with each
informant last a minimum of 45 minutes, although 90 minutes are recommended.
Whenever possible, researchers belong to the same speech community as the speakers,
and they try not to interrupt them during the interview. After the conversation
is recorded, the researcher verifies the quality of the tape and completes a questionnaire
gathering the speaker’s personal data relevant to the post-stratification variables.
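The recording requirements above lend themselves to a simple per-interview check. The following Python sketch is only an illustration: the `InterviewRecord` fields and the `review` helper are hypothetical names, not part of the PRESEEA methodology, and only the 45- and 90-minute figures come from the text.

```python
from dataclasses import dataclass

MIN_MINUTES = 45          # minimum length stated in the text
RECOMMENDED_MINUTES = 90  # recommended length stated in the text


@dataclass
class InterviewRecord:
    """Hypothetical bookkeeping record for one recorded interview."""
    speaker_id: str                # illustrative identifier
    duration_minutes: float
    tape_quality_checked: bool     # researcher verified the tape afterwards
    questionnaire_completed: bool  # post-stratification data were collected


def review(record: InterviewRecord) -> list[str]:
    """Return a list of issues; an empty list means nothing was flagged."""
    issues = []
    if record.duration_minutes < MIN_MINUTES:
        issues.append("shorter than the 45-minute minimum")
    elif record.duration_minutes < RECOMMENDED_MINUTES:
        issues.append("below the recommended 90 minutes")
    if not record.tape_quality_checked:
        issues.append("tape quality not verified")
    if not record.questionnaire_completed:
        issues.append("post-stratification questionnaire missing")
    return issues
```

A research group could run such a check once per informant before an interview is accepted into the local corpus; how strictly to treat each issue is left to the group.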
One of the most important goals of the PRESEEA project is to gather spoken language that is
useful enough for grammar and discourse analysis. The idea is to
offer a Corpus with material for linguistic and sociolinguistic analysis beyond
phonetics and morphology. Nevertheless, the difficulties of obtaining variants
at the syntactic level are well known, not to mention discourse variants.
“Variants” are understood as different ways of saying the same thing.
With regard to the analysis of grammatical variation, it is important to keep in mind which
variables can properly be studied from a general Corpus of spoken language. Those
variables should be correlated with certain linguistic and social independent variables. In
spite of the problems this involves, a list of morphosyntactic variables has been
drawn up9 (Moreno, Cestero, Molina & Paredes):

1. Presence and position of the subject.
2. Order of arguments.
3. Address forms, with special attention to the use of tú/usted and vosotros/ustedes.
4. Verbal uses and values, with special attention to indicative-subjunctive uses.
5. Semi-auxiliary verbs of epistemic, dynamic, and deontic modality: haber de/que,
tener que, deber (de), poder, ser capaz de.
6. Variation in reflexive constructions (agreement and frequency).
7. Uses of ser and estar.
8. Uses of haber and estar.
9. Direct and indirect speech.
10. Pleonastic deictics.
11. “Leísmo”, “laísmo”, “loísmo”.
12. Clitic doubling.
13. Clitic position.
14. Position of direct and indirect objects.
15. Verbal agreement in impersonal uses.
16. Impersonality (markers se, tú).
17. Uses of haber and hacer (hay/hace niebla...).
18. Use of periphrases.

Different strategies can be used during the interview to elicit these kinds of variables.
Some of them appear naturally in the course of conversation, so no special type of
question is needed. Moreno, Cestero, Molina and Paredes explain that
variables 1, 2, 6, 9-14, and 18 fall into this group. Other variables, however, appear
more readily in certain discourse types, given that their use can be
pragmatically conditioned. That is why the PRESEEA methodology proposes
working through a list of themes or thematic modules over the 45-90 minutes of conversation.
Interviews are structured around the following thematic modules:

1. Greetings
2. Weather
3. Place where one lives
4. Family and friendship
5. Customs
6. Danger of death
7. Important life anecdotes
8. Desire for economic improvement
9. End

All this spoken language is expected to be sufficient to allow grammatical and
discourse analysis with regard to important sociolinguistic factors. During the interview, the
order of the modules can vary according to the circumstances of the conversation. Some
guidelines for handling the modules are set out next. The aim is for the set of materials
from different communities to offer balanced samples for each module.
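The aim of balanced samples per module can be monitored with a simple tally. The sketch below is a hedged illustration in Python: the module labels paraphrase the list above, while the function names, the data layout (interview identifier mapped to the modules actually covered), and the 0.8 threshold are assumptions introduced here and are not prescribed by PRESEEA.

```python
from collections import Counter

# Labels paraphrasing the nine thematic modules listed in the text.
MODULES = [
    "greetings", "weather", "place where one lives", "family and friendship",
    "customs", "danger of death", "important anecdotes",
    "economic improvement", "end",
]


def module_coverage(interviews: dict[str, list[str]]) -> Counter:
    """Count in how many interviews each module was actually covered."""
    counts = Counter({module: 0 for module in MODULES})
    for covered in interviews.values():
        for module in set(covered):  # a module counts once per interview
            if module not in counts:
                raise ValueError(f"unknown module: {module!r}")
            counts[module] += 1
    return counts


def underrepresented(interviews: dict[str, list[str]],
                     min_share: float = 0.8) -> list[str]:
    """List modules covered in fewer than min_share of the interviews.

    The 0.8 default is an arbitrary illustrative threshold.
    """
    total = len(interviews)
    counts = module_coverage(interviews)
    return [m for m in MODULES if total and counts[m] / total < min_share]
```

Because the order of modules may vary from one conversation to another, the tally ignores order and only records whether a module was reached at all.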
9 See also the works by C. Silva-Corvalán (1994: 399-415) and P. Martín-Butragueño (1994: 29-75).
______________________________________________________________________
1. GREETINGS
Introduction
How do you want to be addressed? (Sp. tú – usted – vos)
The truth is that addressing people is sometimes a problem; you never know how to address certain
people. For example, how do you address your friends? What if they are elderly? What about
a young person you do not know? How do you address an older person, man or woman, for example to ask a
question when you meet him/her in the street? What about your doctor? And what about neighbors you do
not know very well? How do you prefer to be addressed? What do you think when a younger person
addresses you using tú? It is a problem; I sometimes do not know what to do. (Ask whether they try to
avoid the asymmetric address system. Ask in what situations each form seems better to him/her.)
How are you doing? Are you excited?
These days we are all a little strange; I believe it is caused by the weather.
True?

2. WEATHER
Today it is cold/hot!
I do not like summer/winter. What do you prefer?
This year the cold/heat/rain is worse than last year, right?
I think the weather is changing, at least in this area. What do you think? Why is that so?
Do you remember the weather last year during this season? (rain, cold, wind, storms). What about last
winter/summer?
They say that the Earth’s climate is changing. What do you think will happen if there is less rain and
the drought continues for several years?

3. PLACE WHERE ONE LIVES
Where do you live? What is your house like? Could you describe it? Which part of your home do you like
most? Why? Has it always been like this, or have you made some changes? What was it like before?
Have you been living there for a long time? Where did you live before? (Description of the place)
Do you like living here? Where would you like to live? Why?
How would you like living in a/another big city?
Do you know many people nearby? How is your relationship with them? What would you like your
neighbours to be like? What kind of relationship would you like to have with them? What do you do to
maintain or improve that relationship?
I suppose this neighbourhood/area has changed a lot since you came to live here. Mine has too: the people,
the buildings. I remember when I was a kid ... How has your neighbourhood changed? What does it have
now that did not exist before? Which do you like more, now or before? What do you remember from your
childhood?
If your neighbourhood has changed, the city has changed much more. Do you remember what it was like some
years ago? What differences do you see now? Do you think those changes are good? Why? How
would you have preferred it? What do you think it will be like a few years from now?
Now there are more places for fun and entertainment, right? What do you do when you go out to have
fun? How is your city when it comes to fun? What would you like it to be like? What would you do if you
could organize things in your city for people to have fun?
And people of other ages? Your parents, your children..., what do they usually do when they go out on
weekends? Nevertheless, people do not go out very much at night for fear that something unpleasant
might happen (hold-ups, assaults, fights...). How do you see crime in the city? And what about your
neighbourhood? How do you think it could be ended?
Have you ever heard about a crime in your neighbourhood/city? What happened?

4. FAMILY AND FRIENDSHIP
Are you married? How did you meet your husband/wife? Do you have/wish to have children? Tell me.
What about that experience? (For women) What do you think you would have done if your life or your
child’s life had been in danger? (For men) What would your advice be for your wife’s/sister’s pregnancy?
What do you think about problems like anorexia or euthanasia?
What is your son/husband/father... like? (Physical description)
What would you like them to be like?
Who is your best friend? What is a friend for you? What should a person be like, or what is a person
supposed to be like, to become a good friend of yours? What is the difference between childhood friends
and adult friends? What about your friends from childhood and youth?
What do you do for a living?
What would you like to work in? What would you like to study?
(If the speaker already has a profession) Is that what you really wanted to do? Why did you not do
something else? How do you imagine your life if you had been/done... What do you usually do on a normal
day? (Description of a normal day from waking up until going to bed)
Are you satisfied with your lifestyle? Why?
What will your family/husband/children/parents be doing right now?

5. CUSTOMS
Now that the holidays are coming, what do you usually do at Christmas/in summer? What will you do on
your next vacation?
The other important holidays are Christmas/summer, right? They are special because of the food and
the family gatherings. What is the typical Christmas food in your region? How is it made? Do you know
how to cook? Do you think it is necessary to know how to cook? What is the typical food in your
region/city/country? How is it made? Why do you use those ingredients? When are they added?
Do you think Christmas is just/mainly a religious celebration?
What plans do you have for the next holidays? If you could choose, what would you like to do?

6. DANGER OF DEATH
Travelling is always a little scary, because of the accidents and the bad news you usually hear. Right?
Have you ever been in danger of death? What happened? (physical description of the characters and
places in the narrative) What would have happened if... And if... What would you do if you were in a
similar situation again? What did the people involved say?

7. IMPORTANT LIFE ANECDOTES
Tell me about some of the most important or unusual things that have happened to you: a robbery, a prize,
a special trip...
What do you think should be done in a situation like that? What would you do if that happened to you
again, or if you had the opportunity to go back to the past?

8. DESIRE FOR ECONOMIC IMPROVEMENT
Do you play the lottery or any other game? Why?
I suppose you would like to win a big prize, right? What would you do if you won 1 million
dollars?
What do you think when you hear that somebody has won a large amount of money?

9. END
Finally... I believe all of us are lucky in some way. Right? Let’s enjoy the good things in life.
I need to buy a newspaper. Do you know where there is a newsstand? Could you explain to me how to get
there?
Good, it was a pleasure to talk to you. Thank you very much. I hope we can do it again. Right?
______________________________________________________________________

Working through these thematic modules in the course of the conversations is expected to yield
materials useful enough for grammar and discourse analysis.
From the discourse point of view, the modules are designed to elicit different types of texts,
such as descriptions, arguments, narratives, and evaluative commentaries, besides
conversation itself and many different speech acts. Some of the most important relate
to address forms (nouns and pronouns), greetings, farewells, and tag questions, for
instance.
From the syntactic point of view, different modules may facilitate the elicitation of
different grammatical variables. As explained by Moreno, Cestero, Molina and Paredes,
PRESEEA interviews can collect data about verbal uses and values. All modules
include questions about hypothetical situations, which are very useful for studying
indicative-subjunctive/conditional variation; about past events (present/past tense variation);
about forthcoming events (present/future tense variation); and with temporal, causal, or purpose
references (clauses introduced by cuando, para que). At the same time, they can elicit
verbal uses like haber de/que, tener que, deber (de), poder, ser capaz de, in order to study their
meaning or their possible tense and aspect marking. The uses of ser and estar, elicited in
the Corpus by means of physical and personality descriptions, as well as the variation between
haber and estar (descriptions of the inside of places), are interesting points of the Spanish
language. Other aspects are also especially noteworthy, such as verbal agreement in
impersonal uses, elicited while opinions are given; the use of impersonality by means
of se or tú, which appears frequently in the customs module or when a recipe is
explained; or the use of haber and hacer (hay/hace niebla...) when talking about the weather.
All these considerations are taken into account while interviews are in progress. That is
why PRESEEA is said to be a Corpus with a grammar and discourse bias,
without neglecting other linguistic levels or sociolinguistic aspects.
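The correspondence between modules and grammatical variables described in this section can be written down as a small lookup table. The Python sketch below is one possible reading of the text, not an official PRESEEA mapping: the names `SPONTANEOUS`, `ELICITED_BY`, and `variables_at_risk` are hypothetical, and the assignment of descriptions of people to module 4 and of the home to module 3 is an assumption based on the sample questions.

```python
# Variables that, according to the text, tend to surface without special
# prompting (numbers refer to the list of 18 morphosyntactic variables).
SPONTANEOUS = {1, 2, 6, 9, 10, 11, 12, 13, 14, 18}

# Variable number -> modules said (or assumed here) to favour it.
ELICITED_BY = {
    3: {1},   # address forms <- greetings module
    7: {4},   # ser/estar <- descriptions of people (assumed: module 4)
    8: {3},   # haber/estar <- description of the home (assumed: module 3)
    16: {5},  # impersonal se/tú <- customs module, recipes
    17: {2},  # haber/hacer <- weather module
}


def variables_at_risk(covered_modules: set[int]) -> list[int]:
    """Return variables whose favouring modules were all skipped."""
    return sorted(
        variable
        for variable, modules in ELICITED_BY.items()
        if not (modules & covered_modules)
    )
```

Variables 4 and 5 are left out of the table because the text says that all modules include questions suited to them, and variable 15 because it is tied to opinions rather than to a specific module.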

5. CONCLUSION

Once the recordings are made, the materials are transcribed and
stored.10 As for the system used to transcribe the recorded materials, it
seems logical to propose an international system, accepted in industry
and used in a considerable number of countries. It is proposed that PRESEEA follow
the TEI (Text Encoding Initiative) international conventions.11 This transcription system
is based on the Standard Generalized Markup Language (SGML).12
Although many textual labels could be used to mark up spoken language materials,
it is proposed to use only some of them. Information is distributed and
PRESEEA groups communicate through the project’s website
(HTTP://www.linguas.net/preseea) and mailing list (preseea@colmex.mx).
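Footnote 12 states that each transcription is preceded by identification labels and enclosed between the labels <texto> and </texto>. A minimal Python sketch of that envelope follows; the function names and the placeholder header are hypothetical, and the actual PRESEEA identification labels are not reproduced here because the text does not list them.

```python
def wrap_transcription(header: str, transcription: str) -> str:
    """Place the identification header first, then the text inside <texto>."""
    return f"{header.strip()}\n<texto>\n{transcription.strip()}\n</texto>\n"


def has_texto_envelope(document: str) -> bool:
    """Check that <texto> opens before </texto> and nothing follows the close."""
    start = document.find("<texto>")
    end = document.rfind("</texto>")
    if start == -1 or end <= start:
        return False
    return document[end + len("</texto>"):].strip() == ""


# Placeholder header: an SGML comment standing in for the real labels.
example = wrap_transcription(
    "<!-- identification labels go here -->",
    "E: buenos días\nI: buenos días",
)
```

The checker only verifies the outer envelope; validating the full set of labels would require the project's own label inventory.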
At the end of these pages, it is important to insist on one of the hallmarks of
PRESEEA: associate researchers can feel completely free to extend their objectives and
their study techniques, as long as the common guidelines are respected. Freedom is
absolute in the analysis and interpretation of the linguistic materials. Through the
guidelines presented here it is possible to address all the sociolinguistic tasks summarized
by Carmen Silva-Corvalán in 1992: description of phonetic, morphological and
syntactic variation in Spanish; study of linguistic changes in progress; description of
pragmatic elements and different discourse styles of Spanish.
The main goal of PRESEEA is to gather comparable materials, well transcribed, well
identified by their origin, in the best technical conditions and in the most effective way.
Beyond that, PRESEEA wants to be a way to strengthen relations between linguists
from both sides of the Atlantic in search of a better knowledge of the
Spanish language.

10 To make transcription as fast and as easy to correct as possible, it is recommended
to transcribe in ordinary spelling, using a word processor (the applications “Word”® or “Wordperfect”® are
recommended). Playing back the tape through a pedal-controlled dictaphone is advisable.
Transcriptions must be saved in ASCII (Texto DOS) and corrected by a minimum of two people.
11 TEI was born at a conference organized by the Association for Computers and the Humanities held in
Poughkeepsie, New York, in 1987. It is the biggest international project for this purpose, and it is
sponsored by that association, as well as by the Association for Computational Linguistics, the Association
for Literary and Linguistic Computing, the National Endowment for the Humanities (United States), DG XIII
of the European Union Commission, the Canadian Social Science and Humanities Research Council, and the
Andrew W. Mellon Foundation. The TEI’s goal is to develop and spread a well-defined
format in order to facilitate text interchange between researchers interested in natural
language processing.
12 It basically consists of a series of marks or labels of the < > type reflecting the diverse features of the
transcribed texts. Each transcribed text must be preceded by a series of identification labels. The heading
labels are of three types. In the first place come those for the general identification of the electronic text.
As for the transcription, labels indicating the beginning of a structural element are inserted within <
>; labels marking the end of that element are inserted within </ >. The transcription itself must be
immediately preceded by the label <texto>. At the end the label </texto> appears.

REFERENCES

ALBA, O. (1990) Variación fonética y diversidad en el español dominicano de Santiago. Santiago:
PUCMM.
ALINEI, M. e. a. (1983) Atlas Linguarum Europae, 1, Assen, Van Gorcum.
ALVAR, M. (1952) Atlas Lingüístico y Etnográfico de Andalucía. Cuestionario, Granada,
ALVAR, M. (1961-1973) Atlas Lingüístico y Etnográfico de Andalucía, Granada, C.S.I.C.
ALVAR, M. (1963) Atlas Lingüístico y Etnográfico de Aragón. Cuestionario, Sevilla, C.S.I.C.
ALVAR, M. (1972) Niveles socio-culturales en el habla de Las Palmas de Gran Canaria. Las Palmas,
Excmo. Cabildo Insular.
ALVAR, M. (1974) Atlas Lingüístico de España y Portugal. Cuestionario, Madrid, C.S.I.C.
ALVAR, M. (1975-1978) Atlas Lingüístico y Etnográfico de las Islas Canarias, Las Palmas, Excmo.
Cabildo Insular de Gran Canaria.
ALVAR, M. (1979-1983) Atlas Lingüístico y Etnográfico de Aragón, Navarra y Rioja, Madrid, Inst.
Fernando el Católico, C.S.I.C., La Muralla.
ALVAR, M. (1984) "Proyecto de un atlas lingüístico de Hispanoamérica", Cuadernos
Hispanoamericanos, 409, pp. 53-68.
ALVAR, M. (1986-1989) Léxico de los marineros peninsulares, Madrid, Arco/Libros.
ALVAR, M. (1995) Atlas Lingüístico y Etnográfico de Cantabria, Madrid, Arco/Libros.
ALVAR, M. (2000) El español en el Sur de Estados Unidos. Estudios, encuestas, textos, Alcalá de
Henares, Universidad de Alcalá-La Goleta.
ALVAR, M. (2000) El español en la República Dominicana. Estudios, encuestas, textos, Alcalá de
Henares, Universidad de Alcalá-La Goleta.
ALVAR, M. (2001) El español en Venezuela. Estudios, mapas, textos, Alcalá de Henares, Universidad de
Alcalá-La Goleta-AECI.
ALVAR, M. (2001) El español en Paraguay. Estudios, encuestas, textos, Alcalá de Henares, Universidad
de Alcalá-La Goleta- AECI.
ALVAR, M. and A. QUILIS (1984) Atlas Lingüístico de Hispanoamérica. Cuestionario, Madrid,
Instituto de Cooperación Iberoamericana.
ANDER-EGG, E. (1987) Técnicas de investigación social, México, Ateneo.
ARAYA, G., C. WAGNER, C. CONTRERAS and M. BERNALES (1973) Atlas lingüístico-etnográfico
del Sur de Chile, I, Valdivia, Universidad Austral de Chile Andrés Bello.
Atlas Lingüístico y Etnográfico de Colombia (1982) Bogotá, Instituto Caro y Cuervo.
Atlas Linguistique Roman (ALiR) (1996), Volumen I Présentation; Atlas Linguistique Roman (ALiR),
Volumen I Cartes; Atlas Linguistique Roman (ALiR), Volumen I Commentaires, Istituto
Poligrafico e Zecca Dello Stato, Roma.
CARAVEDO, R. (1991) Sociolingüística del español de Lima, Lima, Pontificia Universidad Católica.
ELIZAINCÍN, A (1979) "Métodos en sociodialectología". Estudios filológicos. 14: 45-58.
ESCOBAR, A. (1978) Variaciones sociolingüísticas del castellano en el Perú. Lima.
FERGUSON, Ch. (1959) "Diglossia", Word, 15: 325-340.
FISHMAN, J. (1979) Sociología del lenguaje, Madrid, Cátedra.
GARCÍA FERRANDO, M. (1994) Socioestadística. Introducción a la estadística en sociología, Madrid,
Alianza Universidad.
GARCÍA MOUTON, P. and F. MORENO FERNÁNDEZ (2003) Atlas Lingüístico y etnográfico de
Castilla-La Mancha. Madrid. On-line version <http://www.uah.es/otrosweb/alecman>
GUMPERZ, J. and D. HYMES (1972) Directions in Sociolinguistics, New York, Holt, Rinehart and
Winston.
IDE, N. and C.M. SPERBERG-MCQUEEN (1995) “The TEI: History, Goals, and Future”, Computers and
the Humanities, 29, 1, pp. 5-15.
LABOV, W. (1994) Principles of Linguistic Change I. Internal Factors, Oxford, Blackwell.
LABOV, W. (2001) Principles of Linguistic Change II. Social Factors, Oxford, Blackwell.
LASTRA, Y. (1992) Sociolingüística para hispanoamericanos. Una introducción. México, El Colegio de
México.
LAVANDERA, B. (1984) Variación y significado, Buenos Aires, Hachette.

LOPE BLANCH, J. (ed.) (1977) Estudios sobre el español hablado en las principales ciudades de
América. México, UNAM.
LOPE BLANCH, J. (1986) El estudio del español hablado culto. Historia de un proyecto. México,
UNAM.
LOPE BLANCH, J. (1990-2000) Atlas Lingüístico de México, México, El Colegio de México.
LÓPEZ MORALES, H. (1983) Estratificación social del español de San Juan de Puerto Rico. México,
UNAM.
LÓPEZ MORALES, H. (1993) Sociolingüística, 2ª. ed., Madrid, Gredos.
MARTÍN BUTRAGUEÑO, P. (1994) “Hacia una tipología de la variación gramatical”, Nueva Revista de
Filología Hispánica, 41, 1, pp. 29-75.
MILROY, J. (1992), Linguistic Variation and Change, Oxford, Blackwell.
MONTES GIRALDO, J. J. (1995) Dialectología general e hispanoamericana. Orientación teórica,
metodológica y bibliográfica, 3ª ed., Bogotá, Instituto Caro y Cuervo.
MORENO FERNÁNDEZ, F. (1996a) “Metodología del ‘Proyecto para el estudio sociolingüístico del
Español de España y de América’ (PRESEEA)”, Lingüística, 8 pp. 257-287.
MORENO FERNÁNDEZ, F. (1997) Trabajos de sociolingüística hispánica, Alcalá de Henares, Universidad
de Alcalá.
MORENO FERNÁNDEZ, F. (1998) Principios de Sociolingüística y Sociología del lenguaje, Barcelona,
Ariel.
MORENO FERNÁNDEZ, F., A.M. CESTERO, I. MOLINA, and F. PAREDES (2000), “La sociolingüística
de Alcalá de Henares en el «Proyecto para el Estudio Sociolingüístico del Español de España y
América» (PRESEEA)”, Oralia, 3: 149-168.
MORENO FERNÁNDEZ, F., A.M. CESTERO, I. MOLINA, and F. PAREDES (2001) La lengua hablada
en Alcalá de Henares. Corpus PRESEEA – Alcalá. Hablantes de instrucción superior, Alcalá de
Henares, Universidad de Alcalá.
NAVARRO TOMAS, T. (1948) El español en Puerto Rico. Contribución a la geografía lingüística
hispanoamericana, Río Piedras, Universidad de Puerto Rico.
NAVARRO TOMAS, T. (1962) Atlas Lingüístico de la Península Ibérica, I, Madrid, C.S.I.C.
PINO MORENO, M. and M. SÁNCHEZ SÁNCHEZ (1998) “El subcorpus oral del banco de datos CREA –
CORDE (Real Academia Española): procedimientos de transcripción y codificación”, Oralia, 2, pp.
83 – 138.
ROMAINE, S. (1980) “What is a speech community?”, Belfast Working Papers in Language and
Linguistics, 4, 3, pp. 41-59.
ROMAINE, S. (1984) The language of Children and Adolescents: The Acquisition of Communicative
Competence, Oxford, Blackwell.
RONA, J.P. (1958) Aspectos metodológicos de la dialectología hispanoamericana. Montevideo.
SILVA-CORVALÁN, C. (1994) “Direcciones en los estudios sociolingüísticos de la lengua española”, Actas
del Congreso de la Lengua Española, Sevilla, Instituto Cervantes, pp. 399-415.
SILVA-CORVALÁN, C. (1997) “Variación sintáctica en el discurso oral: problemas metodológicos”, en F.
Moreno Fernández, Trabajos de sociolingüística hispánica, Alcalá de Henares, Universidad de
Alcalá, pp. 115 – 135.
THUN, H. (dir.) (1998) Atlas Lingüístico Diatópico y Diastrático del Uruguay (ADDU), Tomo I:
Consonantismo y vocalismo del español, Kiel, Westensee-Verl.
VAN HERWIJNEN, E. (1994) Practical SGML, Boston, Kluwer.
WEIJNEN, A. (1976) Atlas Linguarum Europae. Introducción, Madrid, Comisión Española del ALE.
