
An Introduction to Language Processing with Perl and Prolog

Chapter 1: An Overview of Language Processing

Pierre Nugues
Lund University Pierre.Nugues@cs.lth.se http://www.cs.lth.se/home/Pierre_Nugues/


Applications of Language Processing

Spelling and grammatical checkers: MS Word
Text indexing and information retrieval on the Internet: Google, Microsoft Bing, Yahoo
Telephone information services that understand some spoken questions: SJ (trains in Sweden) or Tellme.com in the United States
Speech dictation of letters or reports: IBM ViaVoice, Windows Vista
Translation: Google Translate, SYSTRAN


Applications of Language Processing (cont'd)

Direct translation from spoken English to spoken Swedish in a restricted domain: SRI and SICS
Voice control of domestic devices such as tape recorders (Philips) or disc changers (MS Persona)
Conversational agents able to dialogue and to plan: TRAINS
Spoken navigation in virtual worlds: Ulysse, Higgins
Generation of 3D scenes from text: Carsim


Linguistics Layers

Sounds
Phonemes
Words and morphology
Syntax and functions
Semantics
Dialogue


Sounds and Phonemes

Serious

C'est par là: It is that way


Lexicon and Parts of Speech

The big cat ate the gray mouse
The/article big/adjective cat/noun ate/verb the/article gray/adjective mouse/noun
Le/article gros/adjectif chat/nom mange/verbe la/article souris/nom grise/adjectif
Die/Artikel große/Adjektiv Katze/Substantiv ißt/Verb die/Artikel graue/Adjektiv Maus/Substantiv
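The annotation of the English sentence above can be imitated with a plain dictionary lookup. The following is only a minimal sketch: the `LEXICON` table and the `tag` function are hypothetical names introduced for illustration, and the sketch assumes that each word has exactly one part of speech, which is rarely true in practice.

```python
# Toy lexicon (an assumption for this sketch): one part of speech per word form.
LEXICON = {
    "the": "article",
    "big": "adjective",
    "gray": "adjective",
    "cat": "noun",
    "mouse": "noun",
    "ate": "verb",
}


def tag(sentence):
    """Return (word, part_of_speech) pairs found by dictionary lookup."""
    return [(word, LEXICON.get(word.lower(), "unknown"))
            for word in sentence.split()]


tagged = tag("The big cat ate the gray mouse")
print(" ".join(f"{word}/{pos}" for word, pos in tagged))
```

Words missing from the table are tagged `unknown`; a realistic tagger would also have to choose among several candidate tags per word.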


Morphology

Word          Root form
worked        to work + verb + preterit
travaillé     travailler + verb + past participle
gearbeitet    arbeiten + verb + past participle
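One way to picture the mapping from a word to its root form and features is a short list of suffix-rewriting rules. This is a sketch under strong assumptions: the three rules in `RULES` and the `analyze` function are hypothetical, they cover only the three example words, and they ignore irregular forms and competing analyses.

```python
import re

# Hypothetical rewriting rules: (pattern on the word, root form, features).
RULES = [
    (r"^(\w+)ed$", r"to \1", "verb + preterit"),           # worked
    (r"^(\w+)é$", r"\1er", "verb + past participle"),      # travaillé
    (r"^ge(\w+)et$", r"\1en", "verb + past participle"),   # gearbeitet
]


def analyze(word):
    """Return 'root form + features' for the first matching rule, else None."""
    for pattern, root, features in RULES:
        if re.match(pattern, word):
            return re.sub(pattern, root, word) + " + " + features
    return None


for word in ["worked", "travaillé", "gearbeitet"]:
    print(word, "->", analyze(word))
```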


Syntactic Tree

sentence
    noun phrase
        article: The
        noun: boy
    verb phrase
        verb: hit
        noun phrase
            article: the
            noun: ball
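The tree for "The boy hit the ball" can be held in a nested data structure. The sketch below uses nested tuples, with the node label first and the children after it. This encoding and the `leaves` and `bracketed` helpers are assumptions chosen for illustration, not a representation prescribed by the slides.

```python
# Each node is (label, child, ...); a preterminal node is (label, word).
TREE = (
    "sentence",
    ("noun phrase", ("article", "The"), ("noun", "boy")),
    ("verb phrase",
     ("verb", "hit"),
     ("noun phrase", ("article", "the"), ("noun", "ball"))),
)


def leaves(node):
    """Collect the words of the tree from left to right."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def bracketed(node):
    """Render the tree in bracketed notation."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return f"[{label} {children[0]}]"
    return f"[{label} " + " ".join(bracketed(c) for c in children) + "]"


print(leaves(TREE))
print(bracketed(TREE))
```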


Syntax: A Classical View

A graph of dependencies and functions

Verb: hit
Subject: The boy
Object: the ball


Semantics

As opposed to syntax:
1. Colorless green ideas sleep furiously.
2. *Furiously sleep ideas green colorless.

Determining the logical form:

Sentence                     Logical representation
Frank is writing notes       writing(Frank, notes).
François écrit des notes     écrit(François, notes).
Franz schreibt Notizen       schreibt(Franz, Notizen).
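As a rough illustration of how the three logical representations could be derived, the sketch below drops a few function words and reads what remains as subject, predicate, and object. The `FUNCTION_WORDS` set and the `logical_form` function are hypothetical, and the approach assumes simple subject-verb-object sentences only. A real system would rely on a syntactic analysis.

```python
# Words ignored when building the logical form (an assumption of this sketch).
FUNCTION_WORDS = {"is", "des"}


def logical_form(sentence):
    """Map a subject-verb-object sentence to 'predicate(subject, object).'"""
    content = [w for w in sentence.split() if w not in FUNCTION_WORDS]
    if len(content) != 3:
        raise ValueError("expected a subject, a verb, and an object")
    subject, predicate, obj = content
    return f"{predicate}({subject}, {obj})."


for sentence in ["Frank is writing notes",
                 "François écrit des notes",
                 "Franz schreibt Notizen"]:
    print(logical_form(sentence))
```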


Lexical Semantics

Word senses:
1. note (noun) short piece of writing;
2. note (noun) a single sound at a particular level;
3. note (noun) a piece of paper money;
4. note (verb) to take notice of;
5. note (noun) of note: of importance.


Reference

1. Sentence: Pierre wrote notes
2. Logical representation: wrote(pierre, notes)
3. Real world, reached by referencing: Louis, Pierre, Charlotte; operating systems, computational linguistics, Prolog programming


Ambiguity

Many analyses are ambiguous, which makes language processing difficult.
Ambiguity occurs in any layer: speech recognition, part-of-speech tagging, parsing, etc.
Example of an ambiguous phonetic transcription: The boys eat the sandwiches
That may correspond to:
The boy seat the sandwiches;
the boy seat this and which is;
the buoys eat the sand which is
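The same kind of ambiguity can be reproduced on written text by removing the spaces and listing every way to cut the string back into words. The sketch below is an analogy on letters rather than on phonemes. The `LEXICON` set and the `segmentations` function are assumptions made for illustration.

```python
# Small word list assumed for this sketch.
LEXICON = {"the", "boy", "boys", "eat", "seat", "sandwiches"}


def segmentations(text, lexicon):
    """Return every way to split text into a sequence of lexicon words."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


for words in segmentations("theboyseatthesandwiches", LEXICON):
    print(" ".join(words))
```

With this word list, the unspaced string has two segmentations. Choosing between them needs information from other layers, such as syntax or semantics.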


Models and Tools

Linguistics has produced an impressive set of theories and models.
Language processing requires significant resources.
Models and tools have matured. Resources are available.
Tools notably involve finite-state automata, regular expressions, rewriting rules, logic, statistics, and machine learning.
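As a small taste of one of these tools, a regular expression can split a sentence into words and punctuation marks. This is a minimal sketch: the `TOKEN` pattern and the `tokenize` function are illustrative assumptions, and the pattern does not handle contractions, numbers with separators, or abbreviations.

```python
import re

# A token is either a run of word characters or a single punctuation mark.
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Return the list of word and punctuation tokens in text."""
    return TOKEN.findall(text)


print(tokenize("The boy hit the ball."))
```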


The Carsim System: A Text-to-Scene Converter


Texts:
Véhicule B venant de ma gauche, je me trouve dans le carrefour, à faible vitesse environ 40 km/h, quand le véhicule B, percute mon véhicule, et me refuse la priorité à droite. Le premier choc atteint mon aile arrière gauche,
(Vehicle B coming from my left, I am in the intersection, at low speed, about 40 km/h, when vehicle B hits my vehicle and refuses me the priority to the right. The first impact hits my left rear wing,)

XML Templates:
// Static Objects
STATIC [ ROAD TREE ]
// Dynamic Objects
DYNAMIC [ VEHICLE [ ID = vehicule b; INITDIRECTION = east;

3D Animation

Processing components: NLP engine, Java 3D animation program



Dialogue: The Persona Project from Microsoft Research


A conversation with Peedy

[Peedy is asleep on his perch]
User: Good morning, Peedy.
[Peedy rouses]
Peedy: Good morning.
User: Let's do a demo.
[Peedy stands up, smiles]
Peedy: Your wish is my command, what would you like to hear?
User: What have you got by Bonnie Raitt?
[Peedy waves in a stream of notes, and grabs one as they rush by.]
Peedy: I have The Bonnie Raitt Collection from 1990.
User: Pick something from that
Peedy: How about Angel from Montgomery?



User: Sounds good.
[Peedy drops note on pile]
Peedy: OK.
User: Play some rock after that.
[Peedy scans the notes again, selects one]
Peedy: How about Fools in love?
User: Who wrote that?
[Peedy cups one wing to his ear]
Peedy: Huh?
User: Who wrote that?
[Peedy looks up, scrunches his brow]
Peedy: Joe Jackson
User: Fine.
[Drops note on pile]


Persona System Architecture

Source: http://research.microsoft.com/research/pubs/view.aspx?pubid=439



Research Relevance
Large companies like Microsoft, Google, Yahoo, IBM, or Xerox conduct research in natural language processing.
The 7th European framework program (2007-2013) names six technology pillars in information technologies. Two of them are related to language processing:
Knowledge, cognitive and learning systems: semantic systems; capturing and exploiting knowledge embedded in web and multimedia content; bio-inspired artificial systems that perceive, understand, learn and evolve, and act autonomously; learning by convivial machines and humans based on a better understanding of human cognition.
Simulation, visualization, interaction and mixed realities: tools for innovative design and creativity in products, services and digital media, and for natural, language-enabled and context-rich interaction and communication.
