LaoNet- Lao Human Resources Network
Melbourne, Australia
pan@Lao.Net
A case study: the Montaigne-Lao project⁵

1. General description

Basically, the Montaigne project's idea is to offer a free cooperative work facility on the web for developing linguistic resources and machine translation tools. The Montaigne-Lao project is the first implemented step of this concept and has Lao as its target language.

One important question is how to go about developing a collaborative effort targeting specific needs, or in other terms how to harvest the goodwill and channel the energies towards producing workable solutions. The cyber-forum described in the next section is the way we propose for fostering such participation and for guiding it. It is the basis of an ICT cyber-cooperation.

More generally, we describe here three already existing components of the Montaigne-Lao project:
Forum,
Dictionary,
Virtual keyboard,
and two projects:
Text to speech,
Lao New Coding.

We would like to emphasize that, although these existing or prospective developments were initially driven separately⁶, our discussions on the web allowed the achievement of synergies and of technical emulation.

2. Forum & mailing list: a method to emulate the cooperation

It is often said that the Internet has modernized the workgroup relationship across cyberspace. However, a forum does not necessarily become a project team. What are the ingredients that would attract like minds to consolidate their efforts, instead of projects being worked on in isolation, duplicating efforts and resources?

The keys are: a central contact point, shared goals, commitment, and participants. Obviously participants are very important, as they are the how and the fuel that keep the central contact forum going. Even if some people think it is a waste of effort, it can quickly expand and sometimes replace the traditional networking of peer groups.

To create such a forum is to enlist a group of people with the same interest; Internet newsgroups are a good place to start recruiting, as are private mailing lists. If such a contact point does not exist, then individuals or organizations should allocate resources to set it up, to foster dialogue among peers, to encourage people to take ownership of the issues and problems, and thereafter to search for and identify ways and means of realizing the goals.

The initial step requires coaching and mentorship by an experienced teamwork overseer to manage the evolution of the group and keep the group focused.

As in any community, cyber-communities suffer from a weak "signal to noise ratio" and weak effectiveness. It is often the case that groups of people break away into more compatible teams working on some common goals.

The driving force is not always money or fame, but the desire for the common good, so the 'why' would be answered by the 'why not'. Often the effort of participants, mostly in time, could expand their horizons and bring new challenges, as well as unforeseen solutions.

The resourcefulness and contacts of the mentors, who could see to the group's needs, could steer the group up to some level of self-management. "If there is a will there is a way", aided by good resources, would make it easy.

Once the means and tools are available for wider participation, the virtual workgroup could be replicated in other fields. Trust in the process would bring peer groups to work on common problems for mutual bilateral benefits.

The LaoNet forum was modeled on such cyber cooperation; with its limited resources, and despite many difficulties and a lack of financial resources, we have managed to achieve some milestone results: Lao Romanization for Transliteration (LRT), 1995; STEA Lao IT Technology seminar, VTE, August 1995, leading to the introduction of the Internet into Laos; first Lao History Symposium at Berkeley, USA, in March 2003; STEA PAN-Laos ICT affairs seminar, VTE, March 2003. The forum has brought many, many people to exchange ideas across cyberspace without ever having needed to meet physically to work together.

3. Dictionary and Translation Support Environment (DTSE)

3.1. Functional architecture

The following three tasks contribute to an initial step of the Montaigne-Lao dictionary project:
a) A group of skilled persons builds lexical entries, partly starting from existing paper dictionaries.
b) A panel of specialists derives a reference on-line dictionary from the gathered lexical entries.
c) Various tools and applications are developed to utilize the resource.

An original point is that, in parallel with this work of specialists, a lexical contribution coming from a worldwide public is also compiled. This process relies on a translation support environment provided on the web site. In particular, a Lao-French bilingual editor is offered to visitors together with a word-for-word translation service.

5: http://sabaidi.imag.fr/
6: Mostly in France: dictionary and LaoWord/virtual keyboard; and in Australia: forum, Lao new coding and text to speech.
[Figure: Bilingual editor]
[Figure: Functional architecture]

7: http://www.papillon-dictionary.org/
8: On this matter, see André Clas, Igor Mel'cuk and Alain Polguère's book, Introduction à la lexicographie explicative et combinatoire, Duculot 1995.
9: http://www.inalco.fr
10: Lao is written from left to right with an alphabet deriving from Indian scripts. A major characteristic of Lao writing is that words are not separated by spaces, as in Khmer, Thai or Burmese writing.
11: Another important characteristic of Lao writing is that some vowels are placed before the consonant. This contributes to making the automatic sorting of Lao dictionaries more complex.
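
The sorting difficulty mentioned in note 11 can be illustrated with a small Python sketch. It assumes Unicode text, in which the five Lao vowel signs U+0EC0 to U+0EC4 are stored before the consonant they follow in pronunciation; the function name `sort_key` and the three sample words are chosen here for illustration only. This is not the collation used by the Montaigne-Lao dictionary, and it only handles a vowel at the very start of a word.

```python
# Lao vowel signs that are written, and stored in Unicode, before the consonant.
PREPOSED_VOWELS = {"\u0ec0", "\u0ec1", "\u0ec2", "\u0ec3", "\u0ec4"}


def sort_key(word: str) -> str:
    """Return the word with a leading pre-posed vowel moved behind its consonant.

    Simplification (assumption): only the first syllable is handled. A real
    dictionary sort would first split the word into syllables and would also
    need rules for vowel order and tone marks.
    """
    if len(word) >= 2 and word[0] in PREPOSED_VOWELS:
        return word[1] + word[0] + word[2:]
    return word


words = [
    "\u0ec4\u0e81\u0ec8",  # ໄກ່ : vowel sign AI written before consonant KO
    "\u0e81\u0eb2",        # ກາ : consonant KO + vowel AA
    "\u0e82\u0eb2",        # ຂາ : consonant KHO SUNG + vowel AA
]

plain = sorted(words)                       # raw code point order
by_consonant = sorted(words, key=sort_key)  # grouped by initial consonant
```

With raw code point order, the word starting with a pre-posed vowel ends up after every word that starts with a consonant. With the key function it is filed next to the other word whose consonant is KO, which is what a dictionary user would expect.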
4. Virtual Keyboard (VK)

Derived from the Lao word processor LaoWord, which is an add-in for Microsoft Word, a virtual keyboard was developed to provide a general Unicode input device for Lao: LaoUniKey¹². LaoUniKey is part of the Montaigne project both because it is available on-line and because its source code is available under the GPL license.

Its code has already been used as the basis of an ongoing Montaigne-Lao project addressing an improved Lao keyboard evaluator. Because operating systems lack native technology for handling foreign-language scripts, the enhanced VK is designed to provide a smart pre-emptive keyboard across applications and platforms. In the first place, it simply interprets the keystrokes and maps them to character codes (8-bit codes or 16-bit Unicode). By buffering keystrokes, keystroke history recall could be implemented. By adding a lookup in a keystroke translation table, macro programming of keystrokes is achieved. A single macro keystroke could be expanded to a stream of keystrokes, paving the way to implementing the multi-element vowel sounds of a phonetic virtual keyboard.

By enhancing the keystroke buffering with some intelligent rules or a dictionary lookup, word correction could be performed pre-emptively.

The virtual keyboard is written in C and is limited to Windows platforms at the moment. Later on, by implementing it in a platform-independent language such as Java, it will be possible to create a smart keyboard across platforms.

5. Text To Speech (TTS)

A text to speech synthesizer (TTS) is currently being worked on to broaden the appeal and usefulness of the resource, and to attract an audience that would sustain and contribute to the growth of the online dictionary. It is planned that the service will be an extension of the dictionary server, where a client application could make a word-to-sound lookup request. Such an approach would remove the need for a voice synthesizer engine at the client end. It will also foster simple voice-capable front-end applications, e.g. a language teaching web page.

The voice synthesizer is based on the freely available Festival system (http://www.festvox.org/). Although a full native Lao word-to-sound library is yet to be constructed, an intermediate solution has been adopted to demonstrate the concept. With Thai TTS already being more advanced, we hope that a future GMS forum could see Lao TTS benefiting from NECTEC TTS work.

In the current implementation, the Lao text is transcribed into roman characters using an earlier work of romanization for transliteration (LRT). The transcription engine was carefully coded to follow Lao grammar closely, so as to guarantee a reversible transcription.

Once the text has been transcribed, the romanized Lao text is fed into a Festival voice synthesizer engine. Minor modifications to the sound rules of the Standard English sound library are sufficient to make the engine produce a workable Lao text-to-voice system.

Future Lao TTS work could share the basis of this approach in terms of word construction, word breaks, sentence formation, and utterance rules.

6. Lao New Coding (LNC)

6.1 Lao writing basics

Lao writing is derived from Indian Sanskrit script and many words are borrowed to build Lao vocabulary. A word may be made of a single syllable or of multiple syllables. The syllable is formed by: [consonant/cluster consonants][vowel/derived vowel][consonant modifier (optional)][tone (optional)]; this structure is referred to as a word-cluster herein.

Lao vowels are composed of multiple sub-elements; each element occupies a predefined position within the word-cluster, and some elements precede and displace the consonant when invoked. Such symbol compounding is not supported for Lao within any operating system, and the lack of standards has bred many incoherent implementations that have confused the computerization of the Lao language.

It is interesting to note that the term "Lao grammar" is classically defined as the rules for Lao speech and writing and for word construction, with little about how words link to one another. Multi-word words are built by joining a number of words. Lao grammar does not have extensive word identifiers to explicitly define word relationships in the multi-word construction.

Traditionally, a Lao text is not punctuated, other than being organized in paragraphs of long continuous streams of characters.

So when 'Lao grammar' is referred to in this text, we are referring to the word-cluster construction rule.

6.2 Current technology

In current technology, Lao words require some five bytes on average. The keystroke or byte stream of a Lao word must be in the correct sequence to position the elements correctly within the word-cluster. A possible solution is to assign extra bytes that would preserve the elements' attributes and position information. The recent font standard OpenType¹³, developed by Adobe and Microsoft, combines TrueType and PostScript technologies to provide new typographic features, such as the desirable capability of element compounding, and promises a possible solution to Lao script problems.

Reliance on keystroke sequence alone could complicate lexical search and matching of words.

6.3 New encoding

LNC offers a new way of correctly storing the position of symbol elements at bit level within the word-cluster, in two to three bytes compared with current technology
12: LaoUniKey's technical principles are described in (Berment 2002a) at section 4: "An input method using hooks".
13: http://www.adobe.com/type/opentype/main.html
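
The three keystroke mechanisms described in section 4 (mapping keys to character codes, buffering keystrokes for history recall, and expanding a macro key through a translation table) can be sketched in a few lines of Python. This is only an illustration and not LaoUniKey's code, which is written in C: the class name `VirtualKeyboard`, the key assignments in `KEYMAP` and the macro key in `MACROS` are all hypothetical.

```python
from collections import deque

# Hypothetical layout: each physical key maps to one Unicode character.
KEYMAP = {
    "k": "\u0e81",  # LAO LETTER KO
    "t": "\u0e95",  # LAO LETTER TO
    "o": "\u0ebb",  # LAO VOWEL SIGN MAI KON
    "w": "\u0ea7",  # LAO LETTER WO
}

# Hypothetical translation table: one macro key expands to a stream of keys.
# Here "U" stands for a two-element vowel typed with a single keystroke.
MACROS = {"U": ["o", "w"]}


class VirtualKeyboard:
    def __init__(self, history_size: int = 16):
        # Bounded buffer of the most recent raw keystrokes.
        self.history = deque(maxlen=history_size)

    def press(self, key: str) -> str:
        """Record the keystroke and return the characters it produces."""
        self.history.append(key)
        stream = MACROS.get(key, [key])  # macro expansion, if any
        # Keys without an entry in KEYMAP pass through unchanged.
        return "".join(KEYMAP.get(k, k) for k in stream)

    def recall(self, n: int) -> list:
        """Return the last n raw keystrokes, oldest first."""
        return list(self.history)[-n:] if n > 0 else []


vk = VirtualKeyboard()
typed = "".join(vk.press(k) for k in "tU")  # consonant key, then macro key
```

Typing the consonant key `t` followed by the macro key `U` yields three characters from two keystrokes, while `vk.recall(2)` still returns the two raw keys. The rule-based or dictionary-based correction mentioned in section 4 would operate on that buffer and is not shown here.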
19: See also (Berment 2002b) for a general approach.

Conclusion