
Using cultural knowledge to assist communication between people with different cultural background


Bruno A. Sugiyama¹, Junia C. Anacleto¹, Sidney Fels² and Helena M. Caseli¹
¹ Department of Computer Science, Federal University of São Carlos, São Carlos – SP, Brazil
² Department of Electrical and Computer Engineering, University of British Columbia, Vancouver – BC, Canada
+55 (16) 3351-8615
{bruno_sugiyama, junia}@dc.ufscar.br, ssfels@ece.ubc.ca, helenacaseli@dc.ufscar.br

ABSTRACT
We present a computational application to facilitate text chat-based communication between people with different cultural and language backgrounds. We focus on end-to-end communication between people with rudimentary or intermediate knowledge of the second language, using computer support rather than a simple connection with automated computer translation. Through a user-centered design process involving three increasingly high-fidelity prototypes, we created a system that allows users who speak different languages to exchange text messages. Each message starts from an automated translation, which is usually only partial and normally contains words that are not translated well. These poorly translated words are then searched for in a common sense knowledge base for the sender's culture, which contains meanings gleaned from a large open source initiative to collect common sense knowledge. Using these additional concepts and words coupled to a translator, the user can select from a list of translations those that are better suited to the intention of the message. We illustrate the usefulness of our approach empirically, showing that users find the augmented translated messages culturally sensitive and that these messages provide better communication experiences than messages without the augmentation. Our study used messaging between Portuguese (Brazilian) and English speakers.

Categories and Subject Descriptors
D.2.2 [Software Engineering]: Design Tools and Techniques - evolutionary prototyping, user interfaces; H.1.2 [Information Systems]: User/Machine Systems - human factors, human information processing; H.5.2 [Information Interfaces and Presentation]: User Interfaces - graphical user interface, natural language, prototyping, screen design, user-centered design.

General Terms
Design, Human Factors, Experimentation, Languages.

Keywords
Cultural Translation, Communication mediated by technology, Human Computer Interaction, Natural Language Processing

1. INTRODUCTION
Currently, text messaging is a common form of communication between people, whether on a computer or a cell phone, due to the growth of the internet [13] and the ubiquitous nature of communication technologies. Due to globalization [9], these text messages are frequently exchanged between people who speak different languages and have different cultural backgrounds. Since the messages are exchanged as text, automatic language translation systems may be helpful. However, communication between users who speak different languages can be affected by the linguistic and cultural differences of each participant, which makes automated translation difficult. By culture we understand “that complex whole which includes knowledge, belief, art, morals, law, custom, and any other capabilities and habits acquired by man as a member of society” [17].
In order to deal with the users’ culture, this work presents a chat application that assists users in creating messages in another language and is intended as an augmentation of an automatic text translation tool. Typically, cultural idioms, slang and similar phrases cause trouble for automatic translation, which is where our system provides suggestions based on the user’s cultural background. We find these suggestions by adapting a common sense knowledge database [1] (DB) from each user’s background (i.e., a Brazilian DB for Brazilian text and a US DB for American text) to create a culturally contextualized application. From a user’s perspective, culturally based words not found in the automated translator tool appear in a text box with alternative suggestions from our system, allowing the user to select the meaning they intend rather than the usually more literal meaning that comes from the automatic translator. In this fashion, the communication between two culturally separate groups can be improved.
This paper is organized as follows: section 2 presents works that complement and are related to ours; section 3 describes the methodology adopted and the development process of our application; section 4 describes an experiment that illustrates the use of commonsense knowledge in the translation process; section 5 presents the results of the experiment described in section 4; finally, section 6 presents the conclusion and some future work concerning the application.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGDOC 2010, September 27–29, 2010, S.Carlos, SP, Brazil.
Copyright 2010 ACM 978-1-4503-0403-0…$5.00.

2. RELATED WORK
2.1 Commonsense Knowledge
This work is related to a range of projects ([1], [2], [10], for example) that are involved in the applicability of a kind of cultural knowledge: common sense. Common sense can be defined as the knowledge of everyday things based on the life experience or beliefs of a group, considering time, space and social aspects. Examples of common sense knowledge are “a lemon is sour”, “when you receive a gift you may be happy” or “a pineapple is a kind of fruit”.
In order to use this kind of knowledge in computational applications, making them more familiar to the user’s context, this work adopted the commonsense knowledge databases of two projects: Open Mind Common Sense (OMCS) and Open Mind Common Sense in Brazil (OMCS-Br). OMCS has existed since 1999, while the OMCS-Br Project was created in 2005. Both of them collect common sense knowledge through their websites.
Figure 1 shows the architecture of the OMCS-Br Project. Because the two projects have similar ways to collect and use this kind of knowledge, we will describe only the Brazilian Project.
In the Brazilian website (Figure 1 - I), this knowledge is collected by a fill-in-the-gap mechanism: semi-structured sentences (templates) with gaps to be completed by people. For example:
A breezy is also known as _____.
A milkshake is made of ________.
Soccer is a kind of ______.

Figure 1 Architecture of OMCS-Br

The templates (written in English in this paper just for clarification) are made of three parts: a dynamic part, a static part and a blank part (the horizontal line). The dynamic part, represented by the bold words, is filled automatically by the computer; the static part is a fixed query structure; the blank part is where the user writes what she/he thinks.
The complete sentence is stored in a database (Figure 1 - II) and is then processed by some NLP mechanisms (a lemmatizer, a PoS tagger, etc.) that break it into interconnected concepts (Figure 1 - III). The link between two concepts is tagged by a Minsky’s relation. Minsky [11] defined that human knowledge can be mapped through 20 relations such as defined-as, is-a, part-of, made-of, property-of and others. These relations give more semantics to the link between two concepts. The set of concepts and relations forms a semantic network of concepts called ConceptNet (Figure 1 - IV). Some examples of facts from ConceptNet are presented below:
defined-as(breezy, girlfriend)
made-of(milkshake, ice cream)
is-a(soccer, sport)
Computer applications work with ConceptNet through an API that provides some functions to manipulate its data.
The OMCS-Br Project has more than 280,000 sentences stored in its database and just over 1,800 contributors, while the OMCS English site currently has over a million sentences from over 15,000 contributors.

2.2 Machine Translation
Other related projects are those that work with machine translation of texts. Machine Translation (MT) is one of the most important subfields of NLP, and the phrase-based Statistical Machine Translation (SMT) approach is considered the state of the art according to the automatic measures BLEU [15] and NIST [3]. The translation and language models used in SMT are built from a parallel training corpus (a set of source sentences and their translations into the target language) by means of IBM models [7], which calculate the probability of a given source word (or sequence of words) being translated to a target word (or sequence of words).
In the experiment that tests our hypothesis in this article, we decided to use Google Translate¹ because it performed well in some preliminary runs, but any other machine translation system could be chosen. We also performed experiments with the open-source SMT Moses Toolkit² [5]. The corpus available for our research [4] contained articles from the online version of the Brazilian scientific magazine Pesquisa FAPESP³ written in Brazilian Portuguese (original) and English (version). It contains approximately 500,000 words in each language, which is considerably less than the training set used by Google.
¹ http://translate.google.com.br/
² http://www.statmt.org/moses/
³ http://revistapesquisa.fapesp.br
Our proposal is to create a chat tool that is “MT-independent”, i.e., is not restricted to a specific MT system. The aim of the project is to highlight the power of common sense. Thus, in order to achieve this goal, we have to choose an MT system that provides a good translation in the chat topics. Our work will deal with Portuguese (Brazil) to English translations and vice-versa.

3. WORK DESCRIPTION
3.1 Methodology
The development of our chat application has followed the User-Centered Design (UCD) approach described in [2]. We adopted that approach because we are interested in the
communication between different users. Thus, we need to understand the user’s behavior to develop a successful tool.
The life cycle of the project consists of iterative and interrelated stages. For each stage, we need to define (1) the goal of the stage, (2) the questions raised in that step, (3) the resources that helped to answer the questions, (4) the answers for the questions and (5) the stakeholders who participated in this iteration. An example of a stage is described as follows: (1) modeling the user mental model; (2) How is the interaction among users? Which mental model to choose: sender mental model or receiver mental model?; (3) block diagrams and paper prototyping that show an interaction between two users; (4) The sender will work on the creation of a message with the help of a machine translation and a cultural knowledge base; the system will be designed guided by the sender’s mental model; (5) the stakeholders of that stage are the supervisor of the project, the co-supervisor and a foreign professor.
Following this approach, the initial stages of the project consist of studies about concepts from the HCI and NLP areas. HCI will provide mechanisms to guide the project development focusing on end-user needs. NLP will provide methods and techniques to translate texts and data, i.e., convert a text in the source language to the target language using MT. From these fields, this project adopts two resources: common sense knowledge to deal with the cultural specificities of the chat participants and statistical machine translation to provide a good translation.

3.2 Using the cultural knowledge
For this work we are interested in two of the twenty Minsky’s relations: defined-as and is-a. The relation “defined-as” connects concepts that have the same meaning, i.e., synonym words. The relation “is-a” represents a hierarchy between the concepts [10]. For instance, is-a(soccer, sport) represents that the concept “sport” is more abstract or generic than the concept “soccer”. Using the ConceptNet, an application can provide, for instance, a synonym list for slang words with a simple algorithm described below.
Given a sentence, the application divides it into phrases. For each phrase, a search through the ConceptNet is performed in order to identify if the phrase is connected with the concept “slang” by the relation “is-a”. This means that the word or expression can be slang in the sentence. For each term found in the earlier step, another search through the ConceptNet is performed in order to find other related terms linked by the relation “defined-as”. These related words are synonyms of the slang found in the given sentence.
For example, given the sentence “My breezy is very kind to my mom”, the application can identify through the ConceptNet that the word “breezy” can be slang, i.e., the fact “is-a(breezy, slang)” exists in the semantic network. The next step is to find the synonym words, selecting concepts that are linked by the relation defined-as, for instance, “defined-as(breezy, girlfriend)”. We need to collect all X in “defined-as(X, Y)” and “defined-as(Y, X)” where Y=“breezy”, thus we have our list of synonyms of the word “breezy”.

3.3 Prototyping
Because the chat application of this project is not only a tool to connect users but one to assist the translation process, we need to understand how the interaction between the user and the two resources described above would take place. To achieve this goal we have built some prototypes, passing through three fidelity levels of prototyping: low, medium and high.
In order to build applications that have higher chances of being successful from the end-user’s point of view we need to be aware of the end-user’s mental model [6]. Paper prototyping [16] permits designers to draw an interface closer to what users expect. It does not require any technological resources and it is very fast to design. This kind of low-fidelity prototype is recommended in early project stages due to the low cost and flexibility in design changes.
We used paper prototyping in two stages of the project. The two paper prototypes are shown in Figure 2. In the early prototype (Figure 2 - item I) the user that wants to send a message (sender) writes a text in her/his native language. She/he sends it to the receiver. The message is translated by an SMT system and is displayed to the receiver. The system identifies and highlights the words or expressions that could not be translated by the SMT system. The receiver then clicks on the highlighted words and concepts related to that word are displayed to her/him so that she/he can comprehend the message. This prototype maps the mental model of the receiver.

Figure 2 Paper prototype

This paper prototype was presented to the stakeholders. An NLP expert was surprised by this kind of prototype. One problem with that model occurs when the message contains a lot of words that could not be translated. The receiver might get lost with so many terms in that message. Two HCI experts suggested that the system needs a resource where the user can participate in the translation process.

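The slang lookup described in section 3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fact store `FACTS` is a toy in-memory set of triples standing in for ConceptNet (the real ConceptNet API is not reproduced here), the function names are hypothetical, and splitting a sentence into phrases is approximated by enumerating word n-grams, since the text does not specify how phrases are delimited.

```python
# Toy stand-in for ConceptNet: (relation, concept_1, concept_2) triples.
# The facts below are the examples quoted in the paper, plus is-a(breezy, slang).
FACTS = {
    ("is-a", "breezy", "slang"),
    ("defined-as", "breezy", "girlfriend"),
    ("is-a", "soccer", "sport"),
    ("made-of", "milkshake", "ice cream"),
}


def is_slang(term, facts=FACTS):
    """True when the fact is-a(term, slang) exists in the network."""
    return ("is-a", term, "slang") in facts


def synonyms(term, facts=FACTS):
    """Collect all X in defined-as(X, term) and defined-as(term, X)."""
    found = set()
    for relation, left, right in facts:
        if relation != "defined-as":
            continue
        if left == term:
            found.add(right)
        elif right == term:
            found.add(left)
    return sorted(found)


def candidate_phrases(sentence, max_words=3):
    """Assumed phrase splitter: every n-gram of up to max_words words."""
    words = [w.strip(".,!?;:\"'").lower() for w in sentence.split()]
    words = [w for w in words if w]
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            yield " ".join(words[start:start + size])


def slang_synonyms(sentence, facts=FACTS):
    """Map each slang phrase found in the sentence to its synonym list."""
    result = {}
    for phrase in candidate_phrases(sentence):
        if is_slang(phrase, facts):
            result[phrase] = synonyms(phrase, facts)
    return result


print(slang_synonyms("My breezy is very kind to my mom"))
# {'breezy': ['girlfriend']}
```

Checking both argument positions of defined-as mirrors the rule in the text that synonyms are collected from “defined-as(X, Y)” and “defined-as(Y, X)”.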
In another stage, we created the prototype shown in Figure 2 - item II. The application was named 2-Chat and a new mental model was mapped: the sender’s mental model. This version puts the user and machine together in order to create a message in the target language. The interaction model of this prototype is closer to the one presented in our high-fidelity prototype that will be described in the next section.
In order to validate our proposal we evolved our prototype to a medium-fidelity one [12] (also called mid-fidelity prototype [8]) to be presented to a committee of teachers. This prototype is shown in Figure 3. This kind of prototype improved the look-and-feel of the application and still kept some characteristics of the low-fidelity prototype: it was fast to build and easy to change. The prototype was built with the help of Balsamiq Mockup⁴ and Microsoft Office PowerPoint 2007⁵. The name of the application was changed to Culture-to-Chat (C2C).
⁴ http://www.balsamiq.com/builds/mockups-web-demo/
⁵ http://office.microsoft.com/pt-br/downloads/CD010200683.aspx

Figure 3 Medium-fidelity prototype of the chat

After the teachers’ approval, in the next stage of the project, we needed to investigate the applicability of the common sense and SMT tools. It is very difficult to connect preexistent computational resources with the medium-fidelity prototype. Therefore, we built a high-level prototype in Java (J2SE) that interacts with the ConceptNet API and the SMT Moses Toolkit. For the time being, this prototype does not support communication between different users but can be used to test some functional features. This prototype can be used to perform some proof-tests in order to analyze the content of the two resources. In the next section, we will describe the interface and the interaction process between the application and users.

3.4 Culture-to-Chat
The interface of the high-fidelity prototype of C2C is shown in Figure 4. In this figure, item I represents the area where all the history of the conversation will be displayed. Item II is where the original message will be written in the source language and, once finished, the button “Translate” (item III) has to be pushed and the message will be processed by the two tools previously described. The cultural knowledge output will be displayed in item IV (Cultural Translator) and the translated message will be shown in item V (Machine Translator). Item VI identifies the button “Send” that will send the translated message to its destination. In order to illustrate this process, we describe below an example of the process of creating a message in the communication between a Brazilian and an American.
When the interaction starts, the American user sees the interface in Figure 4. Then, he receives a message from the Brazilian asking something about an actress of a movie (displayed in Figure 5 - I). Immediately after this, he writes (in Figure 5 - II): “That girl is not a minger, but she has a stranger appearance” and pushes the “Translate” button (in Figure 5 - III). At that moment, the message is sent to the Machine Translator (to translate from English to Portuguese) and also to the Cultural Translator (to look for common sense knowledge related to the words without translation). Supposing that the Machine Translator is not able to provide the translation for the English word “minger”, the generated Portuguese sentence cannot be fully understood (shown in Figure 5 - V). The Cultural Translator can help the American user by providing some information about the term “minger”, using the algorithm described in section 3.2. For example, in Figure 5 - IV, the Cultural Translator would show that “minger” is defined as “a ugly person” or “something not attractive” (followed by their translations to Portuguese: definido como pessoa feia, definido como sem atrativos). Thus, the American can edit the translated message (in Figure 5 - V), for instance, exchanging the word “minger” with the Portuguese expression “pessoa feia” and, finally, send it (Figure 5 - VI).

Figure 4 Interface of the high-fidelity prototype

Thus, the Brazilian will receive an understandable Portuguese message and will be able to write back following a similar process. In this similar process the roles of the users are inverted, so the Brazilian user turns into the sender and the American user turns into the receiver. In this process the new sender (Brazilian user) will interact with the commonsense knowledge from the OMCS-Br semantic network instead of the OMCS one.

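The message-creation walkthrough above (translate, look up cultural knowledge for untranslated words, let the sender edit, send) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `CULTURAL_DB`, `untranslated_terms`, `suggest` and `apply_choice` are hypothetical names rather than parts of the C2C prototype, the sender's message is shortened, the machine translation is a hard-coded string rather than real MT output, and an "untranslated" word is detected with a simple heuristic (it reappears unchanged in the translation).

```python
import re

# Hypothetical stand-in for the common sense knowledge base: each term maps to
# (definition in the sender's language, translation of that definition).
# The two entries are the ones quoted in the paper for "minger".
CULTURAL_DB = {
    "minger": [
        ("a ugly person", "pessoa feia"),
        ("something not attractive", "sem atrativos"),
    ],
}


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,!?;:\"'").lower() for w in text.split())
    return [w for w in words if w]


def untranslated_terms(source, translated):
    """Assumed heuristic: a source word that reappears unchanged in the
    machine translation was probably not translated."""
    target_words = set(tokenize(translated))
    terms = []
    for word in tokenize(source):
        if word in target_words and word not in terms:
            terms.append(word)
    return terms


def suggest(source, translated, db=CULTURAL_DB):
    """Cultural Translator step: suggestions for each untranslated term."""
    return {t: db[t] for t in untranslated_terms(source, translated) if t in db}


def apply_choice(translated, term, replacement):
    """Sender's edit: swap the untranslated term for the chosen expression."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.sub(pattern, replacement, translated, flags=re.IGNORECASE)


source = "That girl is not a minger"
mt_output = "Aquela garota não é uma minger"  # hypothetical MT output, not from the paper

suggestions = suggest(source, mt_output)
chosen = suggestions["minger"][0][1]  # the sender picks "pessoa feia"
final_message = apply_choice(mt_output, "minger", chosen)
print(final_message)
# Aquela garota não é uma pessoa feia
```

Keeping the choice in the sender's hands, instead of substituting automatically, follows the sender's mental model adopted in section 3.3.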
Figure 5 Interface while creating a message

The flow of information in the sending process is shown in Figure 6. Arrow I represents the sending of the message in the native language. The system identifies some words or expressions and uses the common sense semantic network to provide related concepts. These concepts are translated (arrow II) and all of them (the concepts and their translations) are returned to the user (arrow III). Finally, the user can change and edit his message in the foreign language and send it to the receiver (arrow IV).

Figure 6 Flow of data in the sending process

We believe that this application has three goals but our current research will focus on the first one. The first goal is to connect and facilitate the communication between two people that speak different languages and thus belong to different cultures. Cultural Translator and Machine Translator promote this facilitation because they help the sender to write in a way that is more understandable to the receiver. The second goal is the exposure of the user to a target language learning process. Cultural Translator collaborates with that task acting like a phrase book, expanding words and amplifying the vocabulary by translating them. The third goal is to expose the sender to a reflection on his own language. Cultural Translator helps by providing alternatives to reformulate her/his original message, exchanging words or expressions to better translate it.
Our study will be instantiated in the use of slang words. This kind of word is very common in chats and informal speech. We classify the use of slang in four types:
(1) Slang that does not have a translation. Example: baranga (a Portuguese word which means a woman that is not attractive).
(2) Slang word in the source language that has a translation but it is not slang in the target language. Example: café com leite (coffee with milk, which in Portuguese means a naïve person).
(3) Slang word in the source language that has a translation, the translated word is slang in the target language but with another meaning. Example: galinha (chicken, which in Portuguese means a person who dates many girls and in English means a coward).
(4) Slang that has a translation with the same meaning in both languages. Example: Meu Deus! (My God!, to express surprise).
The next steps of the study will focus on the impact of these kinds of words on the communication of people with different cultural backgrounds and how this can contribute to the learning process of the users. In the next section we present an experiment involving slang words, common sense and machine translation.

4. FIRST EXPERIMENT
In order to verify whether the use of translation resources can help users to create messages in a foreign language, we conducted the experiment described below. We want to verify whether the commonsense knowledge present in ConceptNet is useful during the translation process. The hypothesis of this experiment is that the students/users will adopt some suggestions from the ConceptNet in order to create or translate a sentence in the target language.
We selected five Brazilian students with basic and intermediate levels of English. The task that they were asked to perform was: “With some given sentences in Brazilian Portuguese language, provide their translations, which will be sent to an American person”. A preliminary automatic translation of each sentence was given to them as a suggestion. Cards containing a list of synonyms for some words were also available. We emphasized that these two resources were suggestions and the students could either accept them or not.
The experiment was applied in individual sessions so that none of the students were aware of what the others answered. The students were expected to write on a sheet of paper the entire translated sentences that they considered to be understandable by a foreign person. We could neither answer questions related to the translation process nor comment on whether the translation was correct.
We chose five Brazilian Portuguese sentences that illustrated the use of slang words. Four of the five sentences represented the four types of slang use mentioned in section 3.4. The last sentence did not contain slang words, but contained words that did not have a translation (they were specific objects from Brazilian culture).

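The four slang types of section 3.4, which the test sentences are meant to cover, amount to a small decision procedure. The sketch below is one possible reading, not the authors' implementation: the enum names, the `classify` function and the order in which the three yes/no questions are asked are assumptions introduced here.

```python
from enum import Enum


class SlangType(Enum):
    NO_TRANSLATION = 1                   # type (1), e.g. "baranga"
    TRANSLATION_NOT_SLANG = 2            # type (2), e.g. "café com leite"
    TRANSLATION_SLANG_OTHER_MEANING = 3  # type (3), e.g. "galinha"
    TRANSLATION_SAME_MEANING = 4         # type (4), e.g. "Meu Deus!"


def classify(has_translation, translation_is_slang=False, same_meaning=False):
    """Assumed decision order: translation exists? same meaning? slang in target?"""
    if not has_translation:
        return SlangType.NO_TRANSLATION
    if same_meaning:
        return SlangType.TRANSLATION_SAME_MEANING
    if translation_is_slang:
        return SlangType.TRANSLATION_SLANG_OTHER_MEANING
    return SlangType.TRANSLATION_NOT_SLANG


# The paper's own examples, encoded with the three yes/no answers.
examples = {
    "baranga": classify(has_translation=False),
    "café com leite": classify(True, translation_is_slang=False),
    "galinha": classify(True, translation_is_slang=True, same_meaning=False),
    "Meu Deus!": classify(True, translation_is_slang=True, same_meaning=True),
}
for term, slang_type in examples.items():
    print(term, "->", slang_type.value)
```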
We emphasize that the content of the sentences can be very offensive because of the meaning of the slang words. We do not necessarily share the ideas expressed by the sentences. The choice of the slang words used in the sentences was based purely on cultural issues without regarding moral values. They are part of the informal vocabulary of Brazil found in blogs and internet chats and they are good examples of cultural translation.
The translated words and sentences presented in this experiment were provided by Google Translate, on May 26th, 2010. The original sentences in Brazilian Portuguese and their automatic translations in English are presented in Table 1.

Table 1. Original sentences and their translation
Number | Original Sentence / Translated Sentence
1 | Aquela maria-chuteira é um tribufu. / That maria-boot is a tribufu
2 | Meu vizinho tem uma namorada que parece um cão chupando manga. / My neighbor has a girlfriend that looks like a dog sucking mango
3 | Jogadores de futebol são muito galinhas e ficam com muitas meninas / Football players are too many chick and stay with girls
4 | O casal passou a lua de mel em Nova Iorque / The couple spent their honey moon in New York
5 | Os brasileiros gostam muito de comer feijoada e beber caipirinha. / The Brazilians are very fond of eating feijoada and caipirinha drink.

For each sentence we performed a search in the OMCS-Br semantic network looking for words or expressions that were considered slang words, using the algorithm described in section 3.2. The result of that algorithm is shown in Table 2. For each expression in Table 2, we made cards containing a list of related words linked through the Minsky’s relation “Defined As”. These cards also contained synonyms provided by the Microsoft Office Word 2007 Synonym Dictionary⁶. All synonyms were listed in the same card, therefore the students were not aware of the different sources the words might have. An example of a card that described the word “tribufu” (sentence 1) is shown in Table 3.
⁶ http://office.microsoft.com/pt-br/word/default.aspx

Table 2. Slang words of the original sentences
Sentence | Words or expressions
1 | maria-chuteira, tribufu
2 | cão chupando manga
3 | galinha, ficar
4 | lua de mel
5 | feijoada, caipirinha

Table 3. Example of card with common sense knowledge
Tribufu
algo muito feio | something very ugly
mulher fora do padrão de beleza | woman outside the standard of beauty
baranga | slapper
mulher sem beleza | woman without beauty
horrorosa | horrible

It is interesting to observe that some slang words have more than one meaning and that the semantic network returned all meanings. For example, the word “galinha” (chicken) is related to the concepts “small bird” and “womanizer”. The user can choose the appropriate meaning that fits her/his translation. The results of the experiment are listed in the next section.

5. RESULTS
We asked a non-native English teacher to grade all the translated sentences. The variables that he considered in the rating were grammar and comprehension. The grades vary from 1 to 5, where grade 1 means a very poor translation and 5 a very good one.
In the first sentence, three of five students changed the expression “maria-chuteira” to one of the suggestions on the card. Another student changed the word to another of her/his knowledge. The last student left the term unchanged. Another word edited was “tribufu”, for which all students adopted the same synonym (horrible) presented in Table 3. The students’ translated sentences and their grades can be seen in Table 4.

Table 4. Translations of the first sentence
Author | Translated sentence | Grade
Google Translate | That maria-boot is a tribufu | 1
Student 1 | That woman interested in football player is a horrible | 2
Student 2 | That woman who likes football player is a horrible | 3
Student 3 | That woman interested in soccer player is horrible | 4
Student 4 | That soccer-player groupie is horrible | 5
Student 5 | That maria-boot is a horrible | 2

It is important to note that students 1, 2 and 5 made some grammatical errors when creating the translated sentence. In the first sentence, for instance, “…is a horrible” is not grammatically correct. We observed that the sentences written by the students had a higher score than the one translated by Google.
The translations of the second sentence are shown in Table 5. The expression “cão chupando manga” (dog sucking mango) was edited by all the students even though it has a literal translation.

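The observation that the students' versions of the first sentence scored higher than the machine translation can be checked directly against the grades in Table 4. The short Python sketch below only restates those published grades; the variable names are illustrative.

```python
from statistics import mean

# Grades from Table 4 (first sentence), on the 1 to 5 scale used by the teacher.
baseline_grade = 1  # unedited Google Translate output
student_grades = {1: 2, 2: 3, 3: 4, 4: 5, 5: 2}

average = mean(student_grades.values())
above_baseline = [s for s, g in student_grades.items() if g > baseline_grade]

print(f"mean student grade: {average:.1f} (baseline: {baseline_grade})")
print(f"students above baseline: {len(above_baseline)} of {len(student_grades)}")
# mean student grade: 3.2 (baseline: 1)
# students above baseline: 5 of 5
```

With only five participants and a single grader, these numbers describe this sample and are not a statistical test.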
These results showed us that some sentences were not well constructed and still had problems in comprehension.

Table 5. Translations of the second sentence
Author | Translated sentence | Grade
Google Translate | My neighbor has a girlfriend that looks like a dog sucking mango | 2
Student 1 | My neighbor has a girlfriend that is very ugly person | 5
Student 2 | My neighbor has a girlfriend that looks like very ugly | 2
Student 3 | My neighbor has a girlfriend that is very ugly | 5
Student 4 | My neighbor has a girlfriend that looks very ugly | 4
Student 5 | My neighbor has a girlfriend that looks like very ugly | 2

The third sentence was the most difficult to translate. The results, presented in Table 6, showed some interesting issues. One of them is the use of the words “soccer” and “football”. Only two students (3, 4) noted the difference between these words while the others kept the translation provided by the machine translation. Student 2 created a new sentence with the same meaning but without using any suggested resource. Some examples of expressions suggested by ConceptNet for the word “galinha” (chick) that were used in the translation process were “womanizer”, “flirt” and “surface dating”. The Microsoft Word Dictionary provided synonyms for the word “ficar” (stay) but none of them were used in this case.

Table 6. Translations of the third sentence
Author | Translated sentence | Grade
Google Translate | Football players are too many chick and stay with girls | 1
Student 1 | Football players are many womanizer and surface dating with many girls | 4
Student 2 | Football players like to have a lot of woman | 3
Student 3 | Soccer players are womanizer and date many girls | 5
Student 4 | Football/soccer players are very flirt and stay with several girls | 2
Student 5 | Football players are too flirt and surface dating with girls | 4

The fourth sentence (O casal passou a lua de mel em Nova Iorque) received 5 points. All the students wrote the following translation: “The couple spent their honeymoon in New York” (the same provided by Google Translate). Students 1 and 5 had some questions about the expression “lua de mel” (honey moon) because they perceived that the literal meaning of this expression can cause confusion in another culture. The semantic network of OMCS-Br and Microsoft Word Dictionary did not provide (at
which did not have problems in translation. In the end, the students did not modify the translated message.
In sentence five, we did not choose slang words to compose the original sentence. Instead, we chose words that designated cultural things and could not be translated, for instance, kinds of dishes. This was a test to expand our experiment and observe the impact of commonsense knowledge on other kinds of words.
The semantic network could provide only one term related to the word “feijoada”, saying that it is a food: “defined-as(feijoada, food)”. The word “caipirinha” has two meanings: one is used to designate a hick while the other meaning is the name of an alcoholic drink in Brazil. At the moment of the experiment the semantic network of OMCS-Br and Microsoft Word Dictionary only provided terms related to the first sense (hick). The translated sentences are shown in Table 7.

Table 7. Results of the fifth sentence
Author | Translated sentence | Grade
Google Translate | The Brazilians are very fond of eating feijoada and caipirinha drink. | 3
Student 1 | The Brazilians like very of eating “feijoada” (traditional food) and drink “caipirinha” (traditional drink) | 4
Student 2 | The Brazilians like to eat food made of beans and pork named “feijoada”. They like to drink “caipirinha” that is made of lemon, alcohol (cachaça), sugar and ice. | 5
Student 3 | The Brazilians like very much eating “feijoada” (a typical food made of beans and pork) and drinking caipirinha (alcoholic drinking made of lemon, ice and sugar) | 5
Student 4 | The Brazilians like too much a mix of beans and pork meat and drink caipirinha | 4
Student 5 | The Brazilians are very fond of eating food tipic (feijoada) and caipirinha drink | 4

Analyzing the content of the students’ sentences, we noted that the commonsense knowledge was helpful in some situations. This experiment showed us that the students analyzed what was suggested and perceived that some automatic translations needed a reformulation. In our examples the commonsense knowledge was more useful than the Microsoft Word Synonyms Dictionary. This happened because the sentences contained words whose meaning depends on cultural factors (for instance, slang words). These words were chosen due to the fact that they are very common in a chat or conversation between Brazilian people.

6. CONCLUSION
This work presented a prototype of an application called Culture-to-Chat, or C2C, aiming at promoting more comprehensible communication between users with different cultural backgrounds. Specifically, the application is to support the chat between people with two different language backgrounds: Portuguese and English
that moment) any suggestion about the expression “honeymoon”. focused on end-to-end communication between people with
They only suggested terms related to the world “passar” (spend) rudimentary and intermediary knowledge of the second language

using computer support rather than a simple connection with automated computer translation. We propose a methodology that combines machine translation with a cultural knowledge base: after a first automatic translation, additional concepts and words are presented to the user. The user can select these concepts and words from a list of translations that may be better suited to the intention of the message, which helps users translate properly certain cultural expressions used in their chat vocabulary. The cultural knowledge, expressed as commonsense knowledge, is collected by the OMCS-Br project and by the American OMCS project. Preliminary tests show that our proposal can be effective in certain situations: users could expand their own vocabulary in the second language and consequently produce a better translated message with this technique than with machine translation support alone. Although the results are not sufficient to measure the real impact of the proposed approach, they give some evidence that it works in specific situations. As future work, based on the results presented here, the prototype is evolving from a desktop prototype to a web application working with Google Translate (instead of the SMT Moses Toolkit). We can then perform experiments to analyze how the combination of cultural knowledge and machine translation affects the communication between users with different cultural backgrounds.

7. ACKNOWLEDGMENTS
The authors thank CAPES and FAPESP for partial financial support of this research. We also thank all the collaborators of the Open Mind Common Sense Project who have been building the common sense knowledge base considered in this research.

8. REFERENCES
[1] Anacleto, J. C., Carvalho, A. F. P. de. 2008. Improving Human-Computer Interaction by Developing Culture-sensitive Applications based on Common Sense Knowledge. In Advances in Human-Computer Interaction. Vienna.
[2] Anacleto, J. C., Fels, S., and Villena, J. M. 2010. Design of a web-based therapist tool to promote emotional closeness. In Proceedings of the 28th of the international Conference Extended Abstracts on Human Factors in Computing Systems (Atlanta, Georgia, USA, April 10 - 15, 2010). CHI EA '10. ACM, New York, NY, 3565-3570. DOI= http://doi.acm.org/10.1145/1753846.1754019.
[3] Brown, P. F., Cocke, J., Pietra, S. A., Pietra, V. J., Jelinek, F., Lafferty, J. D., Mercer, R. L., and Roossin, P. S. 1990. A statistical approach to machine translation. Comput. Linguist. 16, 2 (Jun. 1990), 79-85.
[4] Caseli, H.M. and Nunes, I.A. 2009. Statistical Machine Translation: little changes big impacts. In Proceedings of the 7th Brazilian Symposium in Information and Human Language Technology. São Carlos, SP, Brazil, pp. 1-9.
[5] Caseli, H.M., Sugiyama, B.A., and Anacleto, J.C. 2010. Using Common Sense to generate culturally contextualized Machine Translation. In Proceedings of the NAACL HLT 2010 Young Investigators Workshop on Computational Approaches to Languages of the Americas. Los Angeles, California, June 2010, pp. 24-31.
[6] Cooper, A., Reimann, R., and Cronin, D. 2007. About Face 3: the Essentials of Interaction Design. John Wiley & Sons, Inc.
[7] Doddington, G. 2002. Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the Second international Conference on Human Language Technology Research (San Diego, California, March 24 - 27, 2002). Human Language Technology Conference. Morgan Kaufmann Publishers, San Francisco, CA, 138-145.
[8] Engelberg, D. and Seffa, A. 2002. A Framework for Rapid Mid-Fidelity Prototyping of Web Sites. In Proceedings of the IFIP 17th World Computer Congress - Tc13 Stream on Usability: Gaining A Competitive Edge (August 25 - 29, 2002). J. Hammond, T. Gross, and J. Wesson, Eds. IFIP Conference Proceedings, vol. 226. Kluwer B.V., Deventer, The Netherlands, 203-215.
[9] Eune, J. and Lee, K. P. 2009. Analysis on Intercultural Differences through User Experiences of Mobile Phone for globalization. In Proceedings of International Association of Societies of Design Research (Coex, Seoul, Korea, October 18 - 22, 2009).
[10] Faaborg, A. and Lieberman, H. 2006. A goal-oriented web browser. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Montréal, Québec, Canada, April 22 - 27, 2006). R. Grinter, T. Rodden, P. Aoki, E. Cutrell, R. Jeffries, and G. Olson, Eds. CHI '06. ACM, New York, NY, 751-760. DOI= http://doi.acm.org/10.1145/1124772.1124883
[11] Ginsberg, M. 1991. Marvin Minsky: The Society of Mind. Artif. Intell. 48, 3 (Apr. 1991), 335-339. DOI= http://dx.doi.org/10.1016/0004-3702(91)90033-G.
[12] Leone, P., D. Gillihan, and T. Rauch. 2000. Web-based prototyping for user sessions: Medium-fidelity prototyping. In Proceedings of the Society for Technical Communications 44th Annual Conference, pp. 231-234. Toronto: STC.
[13] Morris, M. and Ogan, C. 1996. The Internet as Mass Medium. The Journal of Communication v. 46. 39-50. DOI= http://dx.doi.org/10.1111/j.1460-2466.1996.tb01460.x
[14] Myers, B., Hollan, J., Cruz, I., Bryson, S., Bulterman, D., Catarci, T., Citrin, W., Glinert, E., Grudin, J., and Ioannidis, Y. 1996. Strategic directions in human-computer interaction. ACM Comput. Surv. 28, 4 (Dec. 1996), 794-809. DOI= http://doi.acm.org/10.1145/242223.246855.
[15] Papineni, K., Roukos, S., Ward, T., and Zhu, W. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association For Computational Linguistics (Philadelphia, Pennsylvania, July 07 - 12, 2002). Annual Meeting of the ACL. Association for Computational Linguistics, Morristown, NJ, 311-318. DOI= http://dx.doi.org/10.3115/1073083.1073135.
[16] Snyder, C. Paper Prototyping: the fast and easy way to design and refine user interfaces. San Francisco, CA: Morgan Kaufmann, 2003, 408 p.
[17] Tylor, E. 1920. Primitive Culture. New York: J.P. Putnam’s Sons. 1.
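
As a supplementary illustration of the lookup step summarized in the conclusion (source words are looked up in a cultural knowledge base and the related terms are offered to the user as candidates), the following minimal Python sketch may help. It is not the C2C implementation: the triple storage format, the "related-to" relation label, the function names and the word "jogadores" are assumptions made for illustration. Only the relation “defined-as(feijoada, food)” and the terms suggested for “galinha” come from the experiment reported above.

```python
from collections import defaultdict

# Hypothetical toy knowledge base stored as (relation, word, term) triples.
# "defined-as(feijoada, food)" and the terms for "galinha" are taken from the
# paper; the "related-to" label and the triple layout are assumptions.
TOY_KB = [
    ("defined-as", "feijoada", "food"),
    ("related-to", "galinha", "womanizer"),
    ("related-to", "galinha", "flirt"),
    ("related-to", "galinha", "surface dating"),
]


def build_index(triples):
    """Group (relation, term) pairs by source-language word, keeping order."""
    index = defaultdict(list)
    for relation, word, term in triples:
        index[word.lower()].append((relation, term))
    return index


def suggest(word, index):
    """Return candidate terms for one source word (empty list if unknown)."""
    return [term for _relation, term in index.get(word.lower(), [])]


def suggestions_for_message(source_words, index):
    """Map each source word found in the knowledge base to its candidates."""
    result = {}
    for word in source_words:
        candidates = suggest(word, index)
        if candidates:
            result[word] = candidates
    return result


if __name__ == "__main__":
    index = build_index(TOY_KB)
    # "jogadores" is an illustrative word with no entry in the toy base.
    print(suggestions_for_message(["jogadores", "galinha", "feijoada"], index))
```

In this sketch the user would still choose among the returned candidates, which mirrors the selection step described in the conclusion; ranking or filtering the candidates is left out on purpose.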
