Information Retrieval
Presented by:
Namita Singh
B.Tech 3rd year CS
GLA University
What is CLIR?
Cross-language information retrieval (CLIR) is a
subfield of information retrieval dealing with
retrieving information written in a language
different from the language of the user's query. For
example, a user may pose their query in English
but retrieve relevant documents written in French.
Multilingual Collections
There are 6,703 languages listed in the Ethnologue
Digital libraries
OCLC (Online Computer Library Center) serves more
than 17,000 libraries in 52 countries and contains over
30 million bibliographic records, with over 500 million
ownership records attached, in more than 370 languages
[Figure: pie charts of language shares, one labeled 2005. Categories: English, Spanish, Japanese, German, French, Italian, Chinese, Dutch, Scandinavian, Korean, Portuguese, Other]
Importance of CLIR
CLIR research is becoming more and more
important for global information exchange and
knowledge sharing.
National Security
Foreign Patent Information Access
Medical Information Access for Patients
CLIR is Multidisciplinary
CLIR involves researchers from the
following fields: information retrieval, natural
language processing, machine translation and
summarization, speech processing, document
image understanding, and human-computer
interaction.
Approaches to CLIR
Design Decisions
What to index?
Controlled Vocabulary
Dictionary-based
Document Translation
Free Text
Corpus-based
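As an illustration of the dictionary-based approach listed above, here is a minimal sketch of query translation: an English query is mapped word by word into French and then matched against French documents. The dictionary `EN_FR`, the documents in `DOCS`, and the helper names `translate_query` and `rank` are all hypothetical, made up for this example. The scoring is plain term counting, chosen for brevity and not taken from the slides.

```python
from collections import Counter

# Hypothetical toy English-to-French dictionary (illustrative only).
EN_FR = {
    "cat": ["chat"],
    "house": ["maison"],
    "black": ["noir", "noire"],
}

# Hypothetical French document collection.
DOCS = {
    "d1": "le chat noir dort dans la maison",
    "d2": "la voiture rouge roule vite",
    "d3": "une maison blanche",
}

def translate_query(query, dictionary):
    """Replace each query word with all of its dictionary translations."""
    terms = []
    for word in query.lower().split():
        # Words missing from the dictionary are kept unchanged,
        # which helps with proper names.
        terms.extend(dictionary.get(word, [word]))
    return terms

def rank(query, docs, dictionary):
    """Score each document by counting occurrences of translated terms."""
    terms = translate_query(query, dictionary)
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        score = sum(counts[t] for t in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

results = rank("black cat house", DOCS, EN_FR)
print(results)
```

Keeping every translation of an ambiguous word (here both "noir" and "noire") is the simplest choice. A real system would typically weight or disambiguate the alternatives.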