Francisco Moreno-Fernández
University of Alcalá – Instituto Cervantes. Spain
1. INTRODUCTION
The very ambitious aim of gathering a sociolinguistic Corpus of spoken Spanish has already
taken shape as the "Project for the Sociolinguistic Study of Spanish from Spain and
America" (in Spanish, Proyecto para el Estudio Sociolingüístico del Español de España
y de América).1 The acronym PRESEEA (Sp. presea ‘jewel; present’) is meant to express
the project’s general goal: to become something valuable for future knowledge of the
Spanish language and useful for the people concerned with its study.
The goal is to coordinate sociolinguistic researchers from Spain and Hispanic
America so as to make possible comparisons between different studies and materials,
as well as a basic exchange of information.
The project’s basis is collaboration: each researcher offers his or her own information in
order to receive information from other researchers. Spoken language materials must be
collected from a community according to a previously determined methodology in order to
receive materials gathered in other areas with the same method. This sociolinguistic
project requires coordination, which is based at the University of Alcalá (Spain).2
Universities and institutions contributing information and spoken language materials
constitute associate centers.
Materials provided by the associate centers, following a general guideline, would
constitute the PRESEEA Corpus. In order to create a spoken language Corpus with
sufficient guarantees and rich in linguistic information (phonetics, grammar,
and discourse), it would be necessary to attend to the following tasks:
1
In April 1993, during the 10th International Conference organized by the Latin American Association of
Linguistics and Philology (ALFAL), a meeting of its “Commission of Sociolinguistics” was held. It was
decided to start a research project for the sociolinguistic study of the cities of Ibero-America and the
Iberian Peninsula. The ALFAL “Commission of Sociolinguistics” decided to launch a project comprising
three activities: (1) to build a sociolinguistic database for Latin America and the Iberian Peninsula (in
Spanish and Portuguese); (2) to create a Spanish Language Sociolinguistic Corpus (PRESEEA); (3) to
create a Portuguese Language Sociolinguistic Corpus (PRESOPO). The first phase of the project will
finish in 2010.
2
The PRESEEA coordinator assumes the following commitments for the Spanish sociolinguistic corpus:
1. To establish contact with centers interested in participating in PRESEEA.
2. To distribute information on the basic sociolinguistic methodology to be followed by the associate centers.
3. To provide technical and methodological assistance.
4. To develop the necessary instruments for communication between the project’s researchers.
The following pages introduce the methodological guidelines for gathering and editing the
PRESEEA sociolinguistic Corpus and explain why the project has a grammar and
discourse bias.
2. BACKGROUND ISSUES
Knowledge of the Spanish language and of its use in Spain, on the American continent,
and in the Spanish-speaking territories of Africa and Asia has grown to an unprecedented
extent during the last half century, and especially during the last twenty years. In
1964 Lope Blanch stated:
I do not believe it is an exaggeration to affirm that the "Spanish of America" continues
to be an illustrious stranger. Even the name itself, of global application -- Spanish of
America -- used to designate indiscriminately so many and such different Hispanic speech
modalities, is a good demonstration of the state of relative ignorance in which we were.
But the complaints of several generations of linguists about the lack of first-hand
information on America, Guinea, and the Philippines, or on the urban speech
of Spain, have subsided. Some of those linguists decided to move from complaint to
action, from library shelves to questionnaires, recordings, and documents. Thanks
to this work, Hispanic Geolinguistics and Sociolinguistics have made considerable
progress in recent years.
In the field of Hispanic linguistic geography, the publication of atlases was
continuous throughout the second half of the 20th century, in spite of the enormous
gaps left by the first half of the century. In Spain, after the first volume of
the Atlas Lingüístico y Etnográfico de Andalucía, some important works appeared: the
first volume of the Atlas Lingüístico de la Península Ibérica, the Atlas Lingüístico y
Etnográfico de las Islas Canarias, the Atlas Lingüístico y Etnográfico de Aragón, Navarra y
Rioja, the Léxico de los marineros peninsulares, the Atlas Lingüístico y Etnográfico de
Cantabria, and the Atlas Lingüístico y Etnográfico de Castilla-León, all of them by Manuel
Alvar, and the on-line version of the Atlas Lingüístico (y etnográfico) de Castilla-La
Mancha, by García-Mouton and Moreno-Fernández. Contributions to the Atlas
Linguarum Europae, the Atlas Lingüístico del Mediterráneo, and the Atlas Linguistique Roman
must be added to those works.
In Hispanic America, numerous geolinguistic projects have been conducted: to
Navarro’s early map collection (El español en Puerto Rico) must be added the
single volume of the Atlas Lingüístico y Etnográfico del Sur de Chile, the complete Atlas
Lingüístico y Etnográfico de Colombia, and the Atlas Lingüístico de México, in addition to
other works in progress, such as the Atlas Lingüístico Diatópico y Diastrático del Uruguay.
But without doubt the project that makes it possible, for the first time, to obtain a general
linguistic picture of all of Spanish-speaking America is the Atlas Lingüístico de
Hispanoamérica, which so far comprises the works El español en el Sur de los Estados
Unidos, El español en la República Dominicana, El español en Venezuela, and El español
en Paraguay, all of them by Manuel Alvar, with other volumes forthcoming.
The sociolinguistic research carried out over the last thirty years has taken the concept of
“Sociolinguistics” in its widest sense, embracing all work concerned with the
relationship between language and society. Hispanic Sociolinguistics would include the
following lines of research:
them to vacillate between Variationist Sociolinguistics and Sociology of
language.
Hispanic America is the most urbanized of the less developed regions in the world.
Practically three quarters of the population live in cities, and it is expected that by
2025 approximately 85% of the Hispanic American population will be urban. This implies
that a detailed knowledge of the Spanish of America requires the sociolinguistic
study of its urban communities.
should not be very exclusive: it should be a Spanish-speaking urban community --
monolingual or bilingual -- with a population, or a part of it, traditionally settled in the
place and with a certain sociological diversity.
The reasons for these minimum requirements are easy to explain. First of all, the
project’s general goal is to achieve a synchronic sociolinguistic Corpus of the Spanish
language; PRESEEA’s speech communities may be monolingual in Spanish or
bilingual, although in the latter case Spanish must be a frequently used language in the
community, and bilingual speakers should be able to use Spanish in conditions
functionally similar to monolingual use. Obviously, a suitable and complete study of
bilingual cities requires consideration of many other elements not dealt with
further here: linguistic attitudes towards each language, sociostylistic
distribution, and the social functions of the languages. In these cases, the researchers directly
responsible for the study of each bilingual city will determine how the common
methodological criteria can be combined with other specific criteria depending on
each city’s particular profile.
It is also advisable to work with communities with a well-established population, to
ensure that there is an awareness of speech community, with a well-known sociostylistic
configuration recognizable by its own speakers. In addition, in order to be
confident that the research effort produces results, it is proposed to work with
communities offering internal variety and sociological richness.
PRESEEA’s methodological guidelines are applied to speech communities associated
with specific cities. In many cases a speech community may extend beyond the
limits of a given city, but it is realistic to work on entities delimited with a certain
objectivity. PRESEEA’s data collection can be carried out in any speech community meeting
the conditions already stated, although it is important not to neglect the study of the
biggest cities of Spain and Hispanic America. Table 1 lists the number of inhabitants of
the most populated cities in each Hispanic country.
______________________________________________________________________
Buenos Aires (Ar) (1991) 2,965,403 (urban conglomerate: 10,911,403)
Córdoba (Ar) (1991) 1,179,067
La Paz (Bo) (1992) 711,036
Santa Cruz de la Sierra (Bo) 694,616
Santa Fe de Bogotá (Co) (1993) 5,726,957
Cali (Co) (1993) 1,783,546
San José (CR) (1992) 302,574
Alajuela (CR) (1991) 158,276
La Habana (Cu) (1989) 2,077,938
Santiago de Cuba (Cu) (1990) 405,354
Santiago (Ch) (1992) 5,180,757
Concepción (Ch) (1992) 330,448
Guayaquil (Ec) (1990) 1,508,844
Quito (Ec) (1990) 1,100,847
San Salvador (ES) (1992) 422,570
Santa Ana (ES) (1992) 202,337
Madrid (Es) (1991) 3,084,673
Barcelona (Es) (1991) 1,681,132
Ciudad de Guatemala (Gu) (1995) 1,167,495
Escuintla (Gu) (1995) 123,048
Tegucigalpa (Ho) (1989) 608,100
San Pedro Sula (Ho) (1988) 321,197
México D.F. (Mé) (1990) 8,235,744 (Ciudad de México: 18,747,400)
Guadalajara (Mé) (1990) 2,178,000
Managua (Ni) (1985) 682,111
León (Ni) (1985) 100,982
Ciudad de Panamá (Pan) (1990) 584,803
San Miguelito (Pan) (1990) 243,025
Asunción (Par) (1992) 502,426
Ciudad del Este (Par) (1992) 133,893
Lima (Pe) (1993) 6,434,328 (including Callao)
Arequipa (Pe) (1993) 633,428
San Juan (PR) (1990) 437,745
Bayamón (PR) (1990) 220,262
Santo Domingo (RD) (1989) 2,200,000
Santiago de los Caballeros (RD) (1989) 467,000
Montevideo (Ur) (1985) 1,311,976
Salto (Ur) (1985) 80,823
Caracas (Ve) (1990) 1,822,465 (metropolitan area: 2,784,042)
Maracaibo (Ve) (1990) 1,363,873
Table 1.- Most populated cities and number of inhabitants. Source: Almanaque mundial
1996, Editorial Televisa, 1995.
The list in Table 1 is just a general frame of reference for the type of city fitting the
PRESEEA guidelines. It does not mean that smaller communities cannot be
studied with this sociolinguistic methodology. So far (January 2004), the Hispanic
communities incorporated into PRESEEA to contribute spoken language samples are the
following (Table 2):
______________________________________________________________________
Argentina
    Neuquén (Patagonia)
Colombia
    Barranquilla (Caribbean coast)
    Bogotá
Guatemala
    Guatemala
México
    Culiacán (Sinaloa)
    México DF
Puerto Rico
    San Juan de Puerto Rico
Spain
    Alcalá de Henares (Madrid)
    Cádiz (Andalucía)
    Las Palmas (Canary Islands)
    Lérida (Catalunya)
    Madrid
    Málaga (Andalucía)
    Valencia
    Zaragoza (Aragón)
Venezuela
    Caracas
_____________________________________________________________________
Table 2.- Hispanic cities incorporated into PRESEEA (Jan. 2004). See
<http://www.linguas.net/preseea>
3.2- Sampling
would be collected through questionnaires in addition to interviews. By crossing data from
those variables, it would be possible to work with another post-stratification variable:
socio-cultural level.
The prototype sample is proposed in the following table.
____________________________________________________________
                 Generation 1   Generation 2   Generation 3
                 M      W       M      W       M      W
____________________________________________________________
Educ. level 1    11M    11W     12M    12W     13M    13W
____________________________________________________________
Educ. level 2    21M    21W     22M    22W     23M    23W
____________________________________________________________
Educ. level 3    31M    31W     32M    32W     33M    33W
____________________________________________________________
Table 3.- Prototype sample by quotas. (M: men; W: women)
As for sample size, it is reasonable to allot four speakers to each cell in
Table 3, since uniform allocation is proposed. With 18 cells, the sample would therefore
comprise 72 informants, which represents a proportion of about 1/25000 for a city of around
two million inhabitants and is more than adequate for cities with a smaller population.3
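The arithmetic behind the quota sample can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the reading of the cell labels (first digit for educational level, second digit for generation, then M or W) is inferred from the layout of Table 3.

```python
from itertools import product

# Stratification variables of the prototype sample (Table 3).
EDUCATION_LEVELS = (1, 2, 3)
GENERATIONS = (1, 2, 3)
GENDERS = ("M", "W")  # M: men; W: women


def quota_cells():
    """Return the cell labels row by row, e.g. '11M', '11W', '12M', ..."""
    return [
        f"{edu}{gen}{sex}"
        for edu, gen, sex in product(EDUCATION_LEVELS, GENERATIONS, GENDERS)
    ]


def sample_size(speakers_per_cell=4):
    """Total informants under uniform allocation (same number in every cell)."""
    return len(quota_cells()) * speakers_per_cell


def inhabitants_per_informant(population, speakers_per_cell=4):
    """How many inhabitants each informant stands for in a city of this size."""
    return population / sample_size(speakers_per_cell)


if __name__ == "__main__":
    print(quota_cells())
    for n in (3, 4, 5, 6):
        print(n, "speakers per cell ->", sample_size(n), "informants")
    print(round(inhabitants_per_informant(2_000_000)), "inhabitants per informant")
```

With four speakers per cell the sketch gives the 72 informants mentioned above; three, five, and six speakers per cell give 54, 90, and 108.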
As said before, the social variables used to divide the universe are gender, age, and
educational level. All of them allow quantitative sociolinguistic processing
(Moreno-Fernández 1998). Regarding the convenience and interest of working with the age
variable, there is nothing to add to the arguments given in the
sociolinguistic literature -- it is simply an essential variable in any work in this field. It
is proposed to distinguish three generations: 20 to 34 years; 35 to 54 years; and 55
years and older. In this respect, it is important to take into account that life expectancy
in Hispanic America ranges approximately between 60 and 75 years.
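As a minimal sketch, the three age bands can be turned into a small classifier that also builds the quota label of a speaker. The helper names are hypothetical, and returning `None` for speakers under 20 is an assumption made here to signal that they fall outside the three proposed generations.

```python
def generation(age):
    """Map an age in years to generation 1, 2, or 3; None if under 20."""
    if age < 20:
        return None  # outside the three proposed generations
    if age <= 34:
        return 1
    if age <= 54:
        return 2
    return 3


def cell_label(education_level, age, gender):
    """Build a quota label such as '22W' (education level, generation, gender)."""
    gen = generation(age)
    if gen is None:
        raise ValueError("speaker is younger than the first generation (20-34)")
    if education_level not in (1, 2, 3):
        raise ValueError("education level must be 1, 2, or 3")
    if gender not in ("M", "W"):
        raise ValueError("gender must be 'M' or 'W'")
    return f"{education_level}{gen}{gender}"


if __name__ == "__main__":
    print(cell_label(2, 40, "W"))
```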
As regards the inclusion of "gender" and "educational level" in the samples, it should
be remembered that very few sociolinguistic studies have been conducted without them,
although gender is a factor of little explanatory power in a good number of analyses.
In order to facilitate comparisons with the results of dozens of studies, it seems suitable
to maintain gender as a stratification factor.4 In turn, post-stratification variables
allow comparisons with the results of previous research and are useful as reference
points. The post-stratification variables (or factor groups) proposed are
income,5 housing conditions,6 and profession.7
3
Only half a dozen of the cities included in Table 1 would fall below the level of representativeness
usually considered canonical (0.025), although it is true that they tend to be the cities with the greatest
sociolinguistic prestige and socio-economic weight (Buenos Aires, Lima, Madrid, México, Santa Fe de
Bogotá and Santiago de Chile). The judgment of local researchers is decisive when it comes to collecting
materials through partial studies (the most representative districts, the most populated neighbourhoods, ...)
or increasing the number of speakers per sample quota to five (90 informants) or six (108). Similarly, for
urban communities with fewer than 500,000 inhabitants, the number of speakers per quota could be
reduced to three, so that the sample would consist of 54 informants (1/9250).
4
The variants distinguished in the variable "educational level" are the following: 1. Illiterate, no
schooling; primary education (up to approx. 10-11 years of age); approximately 5 years of schooling. 2.
Secondary education (up to approx. 16-18 years of age); approximately 10-12 years of schooling. 3. Higher
education (university, college) (up to approx. 21-22 years of age); approximately 15 years of schooling.
5
It is recommended to distinguish five categories, with exclusively local validity.
6
1, House with sanitary and access limitations; 2, Modest house or apartment; 3, Elegant and spacious
house or apartment, with many amenities.
7
1, Travelling pedlars/hawkers and salesmen, unskilled urban workers, farmers, domestic service,
unskilled services; 2, Small retailers, secretaries and clerks, skilled workers, craftsmen,
mechanics, shop assistants, collectors, technical assistants, policemen and guards, soldiers; 3,
University professionals, secondary and primary school teachers, small industrialists and producers,
middle managers, technicians, supervisors; 4, Self-employed university professionals, public and
private sector managers, commissioned military officers, medium industrialists and producers, college
students; 5, Senior officials of the legislative, executive, and judicial branches, high-ranking army
officers, large private industrialists, large landowners, executive directors in the public and private sectors.
The "way of life" variable was introduced by Højrup and developed by James Milroy,
and it allows small-scale social networks to be linked to other structures or social
groups.8 In considering the usefulness that the concept of "way of life" may have in a project
like PRESEEA, it is important to bear in mind that the three typical ways of life of the Western
world are sufficiently common or regular to be found in practically all Hispanic
speech communities. These ways of life bring together some basic features of the socio-cultural and
socio-economic levels handled in other studies, and they can avoid several serious
problems, such as the virtual non-existence of a "middle class". The hope, therefore, is that
they will work as explanatory variables for linguistic behavior. A further argument in favor of
the "way of life" variable is that each research group is able
to include other "ways of life" not found in other communities but perhaps
indispensable in the study of certain societies. However, it is important that each feature
of the ways of life, common and specific, and the socio-cultural patterns
associated with them, be described completely and in detail. The information needed
to assign a speaker to one way of life or another can be collected using personal data
sheets and information gathered during the interviews.
Regarding the analysis of the gathered materials, researchers can handle as explanatory
variables either the independent post-stratification variables or these same variables
combined into socio-cultural or socio-economic levels. It is not advisable to treat a set of
these as independent variables, because overlaps would be inevitable and the
quantitative analyses would be affected.
To conclude this chapter, it is important to highlight that all these
methodological criteria and norms are a minimum, intended to ensure equivalence or comparability
between materials from different research centers. They are a basis for
common use, but not an exclusive one; the aim is for local researchers to feel free to go beyond
the methodological requirements: nothing prevents them from carrying out linguistic attitude
questionnaires or other tests of different types. It is also possible to increase the number of
speakers per quota, as well as to make recordings in different contexts or situations.
It is equally legitimate to analyze speakers between 14 and 19 years old, to work with ways of
life other than those previously described, or to include post-stratification variables in addition
to those foreseen. There is no doubt, however, about the significance of materials
that follow the common methodological guidelines of PRESEEA.
8
The way of life corresponds to a model in which several ethnic groups or classes are represented as
internally structured elements related to other groups. In this model, linguistic behaviour responds more
to the determining power of networks and structures than to the attributes perceived as typical of certain
social groups. Moreover, there are networks with the capacity to impose their sociolinguistic patterns on
weaker ones. Priority is given to types of job and family activities, and to the speaker’s relations with
other members of the group, over particular characteristics or qualifying attributes. Groups are considered
a consequence of fundamental structures of society, which divide the population into substantially
different ways of life. The ways of life proposed by Højrup and Milroy - and proposed by us for use within
our project in an experimental, provisional, and entirely voluntary way - are the following:
Way of life 1.- Primary units of production (agriculture, fishing, small services). Cooperative relationships
among workmates. Family involved in production. Self-employment. Little free time: the more one
works, the more one earns. Close-knit social networks.
Way of life 2.- Work in a production system that is not controlled by the workers. One works to earn a wage
and enjoy periods of free time. Labour relations separate from the family sphere. Some job mobility.
Close-knit networks of solidarity with workmates and neighbours.
Way of life 3.- Qualified profession, able to control production and direct the work of other
people. Vacation time devoted to work. One works to rise in the hierarchy and acquire more
power. Competitive attitude towards colleagues.
The ideological features characterizing these ways of life would be "family" for way 1,
"leisure" for way 2, and "work" for way 3. It must be borne in mind, however, that the concept of "way of
life" is fundamentally structural; the characteristic profile of a group is determined in contrast with those
of the other ways. On the other hand, the relations between the three cultural ways of life and the practices
associated with them need not be exactly the same in all countries, which is why, in a
contrastive study, it is important to describe them in detail.
spite of the problems related to this end, a list of morpho-syntactic variables has been
drawn up9 (Moreno, Cestero, Molina & Paredes):
Different strategies can be used during the interview to elicit these kinds of variables.
Some of them appear naturally in the course of the conversation, so no special type of
question is needed. Moreno, Cestero, Molina and Paredes explain that the variables
numbered 1, 2, 6, 9-14, and 18 belong to this group. Other variables, however, require a
certain type of discourse in order to appear more easily, given that their use can be
pragmatically conditioned. That is why the PRESEEA methodology proposes to
work through a list of themes or thematic modules during the 45-90 minutes of conversation.
Interviews are structured around the following thematic modules:
1. Greetings
2. Weather
3. Place where one lives
4. Family and friendship
5. Customs
6. Danger of death
7. Important life anecdotes
8. Desire for economic improvement
9. Final
The spoken language gathered in this way is expected to be sufficient for grammatical and
discourse analysis in relation to the main sociolinguistic factors. During the interview, the
order of the modules can vary according to the circumstances of the conversation. Some
directions for dealing with the modules are set out below. The aim is for the set of materials
from different communities to offer balanced samples for each module.
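One way a research team might keep track of module coverage is sketched below. It is a hypothetical helper, not part of the PRESEEA guidelines: the module names paraphrase the list above, the order in which modules were covered is deliberately ignored, and the 45-90 minute range is taken from the text.

```python
# Thematic modules of the interview (names paraphrased from the list above).
MODULES = (
    "Greetings",
    "Weather",
    "Place where one lives",
    "Family and friendship",
    "Customs",
    "Danger of death",
    "Important life anecdotes",
    "Desire for economic improvement",
    "Final",
)


def missing_modules(covered):
    """Return the modules not yet covered, whatever order the interview followed."""
    seen = set(covered)
    return [module for module in MODULES if module not in seen]


def duration_ok(minutes):
    """True when the conversation falls within the proposed 45-90 minutes."""
    return 45 <= minutes <= 90


if __name__ == "__main__":
    covered = ["Weather", "Greetings", "Customs"]
    print(missing_modules(covered))
    print(duration_ok(60))
```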
9
See also the works by C. Silva-Corvalán (1994: 399-415) and P. Martín-Butragueño (1994: 29-75).
______________________________________________________________________
1. GREETINGS
Introduction
How do you want to be addressed? (Sp. tú – usted – vos)
The truth is that addressing people is sometimes a problem; you never know how to address certain
people. For example, how do you address your friends? What if they are older people? What about
a young person you do not know? How do you address an older person, man or woman, for example to ask a
question when you meet him/her in the street? What about your doctor? And what about neighbors you
do not know very well? How do you prefer to be addressed? What do you think when a younger person
addresses you using tú? It is a problem; I sometimes do not know what to do. (Ask in case they try to
avoid the asymmetric address system. Ask in what situations it seems better to him/her.)
How are you doing? Are you excited?
These days we are all a little (what is the word?) strange; I believe it is caused by the weather.
Right?
2. WEATHER
Today it is cold/hot!
I do not like summer/winter. What do you prefer?
This year the cold/heat/rain is worse than last year, right?
I think the weather is changing, at least in this area. What do you think? Why is that so?
Do you remember last year’s weather during this season? (rain, cold, wind, storms). What about last
winter/summer?
They say that the Earth’s climate is changing. What do you think will happen if there is less rain and
drought continues for several years?
(If the speaker already has a profession) Is that what you really wanted to do? Why did you not do
something else? How do you imagine your life if you had been/done... What do you usually do on a normal
day? (Description of a normal day from waking up until going to bed)
Are you satisfied with your lifestyle? Why?
What will your family/husband/children/parents be doing right now?
5. CUSTOMS
Now the holidays are coming; what do you usually do at Christmas/in summer? What will you do on your
next vacation?
The other important vacation is Christmas/summer. Right? It is special because of the food and
the family gatherings. What is the typical Christmas food in your region? How is it made? Do you know
how to cook? Do you think it is necessary to know how to cook? What is the typical food in your
region/city/country? How is it made? Why do you use those ingredients? When are they added?
Do you think Christmas is just/mainly a religious celebration?
What plans do you have for the next holidays? If you could choose, what would you like to do?
6. DANGER OF DEATH
Travel is always a little bit scary, because of the accidents and the bad news you usually hear. Right?
Have you ever been in danger of death? What happened? (physical description of the characters and
places in the narrative) What would have happened if... And if... What would you do if you were in a
similar situation again? What did some of the people involved say?
9. END
Finally... I believe all of us are lucky in a certain way. Right? Let’s enjoy the good things in life.
I need to buy a newspaper. Do you know where there is a newsstand? Could you explain to me how to get
there?
Good, it was a pleasure to talk to you. Thank you very much. I hope we can do it again. Right?
______________________________________________________________________
Handling these thematic modules during the conversations is expected to yield
materials useful enough for grammar and discourse analysis.
From the discourse point of view, the modules are designed to elicit different types of texts,
such as descriptions, arguments, narratives, and evaluative commentaries, besides
conversation itself and many different speech acts. Some of the most important relate
to address forms (nouns and pronouns), greetings, farewells, and tag questions, for
instance.
From the syntactic point of view, different modules may facilitate the elicitation of
different grammatical variables. As explained by Moreno, Cestero, Molina and Paredes,
PRESEEA interviews can collect data about verbal uses and values. All modules
include questions regarding hypothetical situations, which are very useful for studying
indicative-subjunctive/conditional variation; past events (present/past tense variation);
forthcoming events (present/future tense variation); and temporal, causal, or purpose
references (clauses introduced by cuando, para que). At the same time, they can elicit
verbal uses like haber de/que, tener que, deber (de), poder, ser capaz de, in order to study their
meaning or their possible tense and aspect marks. The uses of ser and estar elicited in
the Corpus by means of physical and personality descriptions, as well as the variation of
haber and estar (description of the interior of places), are interesting points of the Spanish
language. Other aspects are also especially noteworthy, like verbal agreement in
impersonal uses, elicited while opinions are given; the use of impersonality (by means
of se or tú, appearing frequently within the customs module or when a recipe is
explained); or the use of haber and hacer (hay/hace niebla...) when talking about the weather.
All these considerations are taken into account while interviews are in progress. That is
why PRESEEA is said to be a Corpus with a grammar and discourse bias,
without abandoning the interest in other linguistic levels or sociolinguistic aspects.
5. CONCLUSION
Once the recordings are made, the materials are transcribed and
stored.10 As for the system used to transcribe the recorded materials, it
seems logical to propose an international system, accepted in industry
and used in a considerable number of countries. It is proposed that PRESEEA follow
the TEI (Text Encoding Initiative) international conventions.11 This transcription system
is based on the "Standard Generalized Markup Language" (SGML).12
Although the textual labels that could be used to mark up spoken language materials are
many, it is proposed to use only some of them. Information is distributed and
PRESEEA groups communicate through the project’s website
(http://www.linguas.net/preseea) and mailing list (preseea@colmex.mx).
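The labelling convention described in the note on SGML (identification labels first, then the transcription enclosed between <texto> and </texto>) can be illustrated with a minimal Python sketch. Only the <texto> label comes from the text; the function names are hypothetical, and the header labels are passed in as opaque lines because their names are not specified here.

```python
OPEN_LABEL = "<texto>"    # label that immediately precedes the transcription
CLOSE_LABEL = "</texto>"  # label that closes the transcription


def wrap_transcription(transcription, header_lines=()):
    """Return the header lines followed by the transcription inside <texto> labels."""
    parts = list(header_lines)
    parts.append(OPEN_LABEL)
    parts.append(transcription.strip())
    parts.append(CLOSE_LABEL)
    return "\n".join(parts)


def is_wrapped(document):
    """Check that <texto> opens the transcription and </texto> is the last line."""
    lines = [line.strip() for line in document.strip().splitlines()]
    if OPEN_LABEL not in lines:
        return False
    return lines[-1] == CLOSE_LABEL and lines.index(OPEN_LABEL) < len(lines) - 1


if __name__ == "__main__":
    # "<hypothetical-header>" is a placeholder, not a real PRESEEA or TEI label.
    print(wrap_transcription("buenos dias", ["<hypothetical-header>"]))
```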
At the end of these pages, it is important to insist on one of PRESEEA’s hallmarks:
associate researchers can feel completely free to extend their objectives and
their study techniques, as long as the common guidelines are respected. The freedom is
absolute as regards the analysis and interpretation of the linguistic materials. Through the
guidelines presented here it is possible to address all the sociolinguistic tasks summarized
by Carmen Silva-Corvalán in 1992: description of phonetic, morphological, and
syntactic variation in Spanish; study of linguistic changes in progress; description of
pragmatic elements and different discursive styles of Spanish.
The main goal of PRESEEA is to gather comparable materials, well transcribed, well
identified by their origin, in the best technical conditions and in the most effective way.
Beyond that, PRESEEA aims to be a way to strengthen relations between linguists
from both sides of the Atlantic seeking a better knowledge of the
Spanish language.
10
In order to obtain the fastest transcription possible and the easiest to correct, it is recommended to
transcribe in ordinary spelling, using a word processor (the applications “Word”® or “WordPerfect”® are
recommended). Playing back the tape through a pedal-controlled dictaphone is advisable.
Transcriptions must be saved in ASCII (Texto DOS) and corrected by a minimum of two people.
11
TEI was born at a conference organized by the Association for Computers and the Humanities held in
Poughkeepsie, New York, in 1987. It is the biggest international project of its kind and is sponsored by
that association, as well as by the Association for Computational Linguistics, the Association for Literary
and Linguistic Computing, the National Endowment for the Humanities (United States), DG XIII of the
European Union Commission, the Canadian Social Science and Humanities Research Council, and the
Andrew W. Mellon Foundation. The TEI’s goal is to develop and disseminate a well-defined
format in order to facilitate the interchange of texts between researchers interested in natural
language processing.
12
It basically consists of a series of marks or labels of the < > type reflecting the various features of the
transcribed texts. Each transcribed text must be preceded by a series of identification labels. The heading
labels are of three types. First come those for the general identification of the electronic text.
As for the transcription, labels indicating the beginning of a structural element are inserted within <
>; labels marking the end of that element are inserted within </ >. The transcription itself must be
immediately preceded by the label <texto>. At the end the label </texto> appears.
REFERENCES
LOPE BLANCH, J. (ed.) (1977) Estudios sobre el español hablado en las principales ciudades de
América. México, UNAM.
LOPE BLANCH, J. (1986) El estudio del español hablado culto. Historia de un proyecto. México,
UNAM.
LOPE BLANCH, J. (1990-2000) Atlas Lingüístico de México, México, El Colegio de México.
LÓPEZ MORALES, H. (1983) Estratificación social del español de San Juan de Puerto Rico. México,
UNAM.
LÓPEZ MORALES, H. (1993) Sociolingüística, 2ª. ed., Madrid, Gredos.
MARTÍN BUTRAGUEÑO, P. (1994) “Hacia una tipología de la variación gramatical”, Nueva Revista de
Filología Hispánica, 41, 1, pp. 29-75.
MILROY, J. (1992), Linguistic Variation and Change, Oxford, Blackwell.
MONTES GIRALDO, J. J. (1995) Dialectología general e hispanoamericana. Orientación teórica,
metodológica y bibliográfica, 3ª ed., Bogotá, Instituto Caro y Cuervo.
MORENO FERNÁNDEZ, F. (1996a) “Metodología del ‘Proyecto para el estudio sociolingüístico del
Español de España y de América’ (PRESEEA)”, Lingüística, 8 pp. 257-287.
MORENO FERNÁNDEZ, F. (1997) Trabajos de sociolingüística hispánica, Alcalá de Henares, Universidad
de Alcalá.
MORENO FERNÁNDEZ, F. (1998) Principios de Sociolingüística y Sociología del lenguaje, Barcelona,
Ariel.
MORENO FERNÁNDEZ, F., A.M. CESTERO, I. MOLINA, and F. PAREDES (2000), “La sociolingüística
de Alcalá de Henares en el «Proyecto para el Estudio Sociolingüístico del Español de España y
América» (PRESEEA)”, Oralia, 3: 149-168.
MORENO FERNÁNDEZ, F., A.M. CESTERO, I. MOLINA, and F. PAREDES (2001) La lengua hablada
en Alcalá de Henares. Corpus PRESEEA – Alcalá. Hablantes de instrucción superior, Alcalá de
Henares, Universidad de Alcalá.
NAVARRO TOMÁS, T. (1948) El español en Puerto Rico. Contribución a la geografía lingüística
hispanoamericana, Río Piedras, Universidad de Puerto Rico.
NAVARRO TOMÁS, T. (1962) Atlas Lingüístico de la Península Ibérica, I, Madrid, C.S.I.C.
PINO MORENO, M. y M. SÁNCHEZ SÁNCHEZ (1998) “El subcorpus oral del banco de datos CREA –
CORDE (Real Academia Española): procedimientos de transcripción y codificación”, Oralia, 2, pp.
83 – 138.
ROMAINE, S. (1980) “What is a speech community?”, Belfast Working Papers in Language and
Linguistics, 4, 3, pp. 41-59.
ROMAINE, S. (1984) The language of Children and Adolescents: The Acquisition of Communicative
Competence, Oxford, Blackwell.
RONA, J.P. (1958) Aspectos metodológicos de la dialectología hispanoamericana. Montevideo.
SILVA-CORVALÁN, C. (1994) “Direcciones en los estudios sociolingüísticos de la lengua española”, Actas
del Congreso de la Lengua Española, Sevilla, Instituto Cervantes, pp. 399-415.
SILVA-CORVALÁN, C. (1997) “Variación sintáctica en el discurso oral: problemas metodológicos”, en F.
Moreno Fernández, Trabajos de sociolingüística hispánica, Alcalá de Henares, Universidad de
Alcalá, pp. 115 – 135.
THUN, H. (dir.) (1998) Atlas Lingüístico Diatópico y Diastrático del Uruguay (ADDU), Tomo I:
Consonantismo y vocalismo del español, Kiel, Westensee-Verl.
VAN HERWIJNEN, E. (1994) Practical SGML, Boston, Kluwer.
WEIJNEN, A. (1976) Atlas Linguarum Europae. Introducción, Madrid, Comisión Española del ALE.