
LIBRARIANSHIP STUDIES & INFORMATION TECHNOLOGY

Problems of Natural Language in Indexing

August 10, 2018


Derived indexing is based on the natural language of the documents themselves, which can be
problematic in the subject indexing process. These problems prompted a move towards assigned
indexing. They can be grouped under two heads:

Problems inherent in the language

Problems pertaining to relationships

Contents

Problems of Natural Language in Indexing

Problems inherent in the language

Synonyms

Homographs

Use of Plural-Singular Forms

Multi-Worded Concept

Complex Subject

Problems Pertaining to Relationships

Semantic Relationships

Syntax

Problems inherent in the language


Specific problems encountered in this connection are: a) synonyms, b) homographs, c) singular
and plural forms, d) multi-worded concepts, and e) complex subjects.

a) Synonyms are terms with the same or similar meanings. Such terms are present in every
subject; near-synonyms are the most common. True synonyms, which mean exactly the same thing
and are used in precisely the same context, are rather unusual. Some situations in which
synonyms arise are:

i) Some subjects have one stem and several derivatives, for example computing, computed,
computation. Sometimes it is acceptable to treat such words as equivalent to one another, and
at other times it is important to differentiate between them.

ii) Some subjects have both common and technical names, for example ‘Salt’ and ‘Sodium
Chloride’. Both must be recognized in subject indexing so that the index reflects the form
appropriate to the clientele for whom it is meant.

iii) Changing patterns of usage also present a problem. The indexer should try to keep pace
with changes in normal usage, e.g. from ‘Wireless’ to ‘Radio’ to ‘Transistor’.

iv) Some concepts are named or spelled differently in different varieties of one language.
American and British English are examples of such differences in usage, e.g. elevator
(American) and lift (British), catalog (American) and catalogue (British).

v) Near-synonyms are two or more words with nearly the same meaning, e.g. salary, wage,
income.
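One common way to handle synonyms in an index is to map every variant to a single preferred term before filing entries. The sketch below is only illustrative: the mapping `PREFERRED_TERM`, the choice of which form is preferred, and the document ids are all assumptions rather than anything prescribed by the article.

```python
# Hypothetical synonym table: every variant points to one preferred term.
# Which form is "preferred" is an illustrative choice; in practice it would
# depend on the clientele the index serves.
PREFERRED_TERM = {
    "salt": "sodium chloride",
    "sodium chloride": "sodium chloride",
    "wireless": "radio",
    "radio": "radio",
    "lift": "elevator",
    "elevator": "elevator",
    "catalog": "catalogue",
    "catalogue": "catalogue",
}


def normalize_term(term: str) -> str:
    """Return the preferred index term, or the cleaned term if none is recorded."""
    key = " ".join(term.lower().split())
    return PREFERRED_TERM.get(key, key)


def build_index(documents: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each preferred term to the set of document ids indexed under it."""
    index: dict[str, set[str]] = {}
    for doc_id, terms in documents.items():
        for term in terms:
            index.setdefault(normalize_term(term), set()).add(doc_id)
    return index


# Made-up documents whose authors used different words for the same concepts.
docs = {
    "d1": ["Salt", "Lift"],
    "d2": ["Sodium Chloride"],
    "d3": ["Elevator", "Catalog"],
}
index = build_index(docs)
print(sorted(index["sodium chloride"]))  # d1 and d2 file under one heading
```

Near-synonyms such as salary, wage, and income are deliberately left out of the table, since merging them is a judgment the indexer has to make case by case.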

b) Homographs are words that have the same spelling but different meanings. In normal
language usage, the meaning of a homograph is established by the context in which the term
is used. Examples are Pitch (Cricket) and Pitch (Music); Tank (Military Vehicles) and Tank
(Water tank); Bear (to carry) and Bear (animal).
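The parenthetical qualifiers in these examples suggest one way an index can keep homographs apart: file each sense under a (term, qualifier) pair. The following is a minimal sketch under that assumption; `QUALIFIED_INDEX`, the qualifier wording, and the document ids are hypothetical.

```python
# Hypothetical index in which headings are (term, qualifier) pairs, so that
# homographs file under separate entries. Document ids are made up.
QUALIFIED_INDEX = {
    ("pitch", "cricket"): ["doc-01"],
    ("pitch", "music"): ["doc-02", "doc-03"],
    ("tank", "military vehicles"): ["doc-04"],
    ("tank", "water storage"): ["doc-05"],
}


def senses(term: str) -> list[str]:
    """List the display headings available for a bare, unqualified term."""
    key = term.lower()
    return sorted(
        f"{t.capitalize()} ({q.capitalize()})"
        for (t, q) in QUALIFIED_INDEX
        if t == key
    )


def lookup(term: str, qualifier: str) -> list[str]:
    """Return the documents filed under one specific sense of a homograph."""
    return QUALIFIED_INDEX.get((term.lower(), qualifier.lower()), [])


print(senses("Pitch"))           # every sense the user could mean
print(lookup("Pitch", "Music"))  # documents for the musical sense only
```

A search on the bare term returns the available senses instead of a mixed result list, which leaves the choice of context to the user.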

c) Use of Plural and Singular Forms: Generally, the plural and singular forms of the same noun
are regarded as equivalent, but there are some situations in which it is necessary to treat
them as distinct. A related problem arises when the same concept appears as both a noun and an
adjective, e.g. heat and hot.

d) Multi-Worded Concept: Some subjects cannot adequately be described by one word, and
require two or more words to specify them fully. Examples are: Information Retrieval,
Underwater Colour Photography, Algebraic Topology, etc. In such cases, no matter which word
of the term is used as the main approach point in the index, the user might choose to seek the
subject under the second or third word of the multi-word term first.

e) Complex Subject: Complex subjects contain more than one unit concept in them and a number
of terms may be used to fully describe these subjects. Each of these concepts might form a
potential search key in the index. E.g. ‘History of Science’.
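For multi-worded concepts and complex subjects alike, one way to offer every significant word as an approach point is to generate rotated entries, each with a different word leading. The sketch below assumes a small stop list (`STOP_WORDS`) and a comma-inversion display style; both are illustrative choices, not the article's method.

```python
# Assumption: connectives such as "of" are not useful access points.
STOP_WORDS = {"of", "the", "in", "and", "for"}


def rotated_entries(heading: str) -> list[str]:
    """Produce one entry per significant word, rotating that word to the front."""
    words = heading.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in STOP_WORDS:
            continue
        if i == 0:
            entries.append(heading)
        else:
            # Words from position i lead; the words before them follow a comma.
            entries.append(" ".join(words[i:]) + ", " + " ".join(words[:i]))
    return entries


for heading in ("Underwater Colour Photography", "History of Science"):
    for entry in rotated_entries(heading):
        print(entry)
```

With this scheme a user who looks under ‘Photography’ or ‘Science’ first still finds the subject, at the cost of a larger index.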

Problems Pertaining to Relationships

There are two main categories of relationships between subjects. These are known as syntactic
relationships and semantic relationships. A syntactic relationship exists between terms in a
statement: the terms are sequenced and interrelated so that the statement becomes meaningful.
A semantic relationship exists between terms that are defined in a vocabulary as having
meanings that are in some way related.

The statements “They are eating” and “They eating are” show syntactic rules in operation.
Although both statements contain the same words, according to the rules of syntax only one of
them is correct and meaningful.

Examples of semantic relationships appear among terms such as heating, electric heating, plasma
heating, heat, and temperature. The meanings of these terms do not completely overlap with one
another. Nevertheless, we recognize that they are related in some manner.

Semantic Relationships: Relationships between Meanings

An aspect of meaning that has particular relevance for indexing, because of its bearing on
vocabulary control, is the relationship between meanings, and therefore between the
denotations of the words used to represent them. Foskett (1996) noted three categories of
relationships:

Equivalent

Hierarchical

Affinitive/Associative

Two expressions are equivalent when they denote the same referent. Synonyms are an obvious
case of equivalence, as are variations in spelling and word form, acronyms, and
abbreviations. Examples of such equivalences are ‘I.D. Cards’ and ‘Identification Cards’, and
‘SDI’ and ‘Selective Dissemination of Information’. Note that there are degrees of
equivalence: some words overlap the meaning of others but do not mean the same thing, such as
‘animals’ and ‘zoology’.

There are two hierarchical relationships: genus-species and whole-part. For instance, ‘Homo
sapiens’ is a species of the genus ‘Homo’. All members of ‘Homo sapiens’ belong to the genus
‘Homo’, but only some members of the genus ‘Homo’ belong to the species ‘Homo sapiens’. An
example of the whole-part relationship is a camera and its lens.

An example of a group of words that bear affinitive relationships is teaching, learning,
education, training, and teachers. The difficulty that arises with these kinds of
relationships is that, unlike equivalence and hierarchical relationships, affinitive
relationships are dependent on context. Education may imply training, and training is
frequently a part of education, but it cannot be assumed that training necessarily implies
education. Decisions about affinitive relationships cannot easily be built into the indexing
language or into the system of which the language is a part; they must be made on an
individual basis.
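The three categories of relationship can be pictured as three small lookup tables in a thesaurus. The sketch below uses the labels USE (equivalence), BT (broader term), and RT (related term), which are conventional in thesaurus construction; the data structure, function names, and the decision to hang the related terms on ‘education’ are assumptions made for illustration.

```python
# Toy thesaurus built from the examples in the text. The layout is hypothetical.
USE = {  # equivalence: non-preferred form -> preferred term
    "sdi": "selective dissemination of information",
    "i.d. cards": "identification cards",
}
BT = {  # hierarchical: term -> broader term (genus-species or whole-part)
    "homo sapiens": "homo",
    "camera lens": "camera",
}
RT = {  # affinitive/associative: term -> related terms, offered as suggestions
    "education": {"teaching", "learning", "training", "teachers"},
}


def preferred(term: str) -> str:
    """Resolve an equivalent form to its preferred term."""
    key = term.lower()
    return USE.get(key, key)


def broader_terms(term: str) -> list[str]:
    """Walk up the hierarchy from a term to its top-most broader term."""
    chain = []
    current = preferred(term)
    while current in BT:
        current = BT[current]
        chain.append(current)
    return chain


def related_terms(term: str) -> set[str]:
    """Return related terms in either direction; relevance is left to the indexer."""
    key = preferred(term)
    found = set(RT.get(key, set()))
    found.update(t for t, others in RT.items() if key in others)
    return found


print(preferred("SDI"))
print(broader_terms("Homo sapiens"))
print(sorted(related_terms("training")))
```

Equivalence and hierarchy can be applied mechanically, whereas the related-term table only proposes candidates, which reflects the point that affinitive decisions depend on context.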

Syntax

A statement consists of elements from the vocabulary of a language joined together in such a
way that it has more meaning than a simple list of the same elements. The additional meaning is
given to the statement by its structure, or syntax because it shows the relationships between the
elements.

Example 1:

i) Encyclopaedia of Bibliography

ii) Bibliography of Encyclopaedia

Example 2:

i) The lady is smiling

ii) The smiling is lady

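The above examples show the importance of syntax: a statement conveys its intended meaning
only when the terms are written in a particular order. In Example 1 both orders are
meaningful but name different subjects; in Example 2 only the first order is meaningful.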
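The effect of term order can be shown by comparing two ways of representing a heading. This is a minimal sketch: the function names are hypothetical, and the comparison only illustrates that an order-free representation loses the distinction made in Example 1.

```python
def bag_of_terms(heading: str) -> frozenset[str]:
    """Order-free representation: just the set of words in the heading."""
    return frozenset(heading.lower().split())


def ordered_terms(heading: str) -> tuple[str, ...]:
    """Order-preserving representation: the words in their original sequence."""
    return tuple(heading.lower().split())


a = "Encyclopaedia of Bibliography"
b = "Bibliography of Encyclopaedia"

# Without order, the two subjects collapse into the same set of words.
print(bag_of_terms(a) == bag_of_terms(b))
# With order preserved, the two subjects remain distinct.
print(ordered_terms(a) == ordered_terms(b))
```

An index that records only which words occur, and not how they are arranged, cannot tell these two subjects apart.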

REFERENCES

Information Access Through The Subject: An Annotated Bibliography / by Salman Haider. -
Online: OpenThesis, 2015. (408 pages; 23 cm.)

ARTICLE HISTORY
Last Updated: 2018-05-24

Written 2017-03-11


https://www.librarianshipstudies.com/2017/03/problems-of-natural-language-in-indexing.html
