An Introduction to Language Processing with Perl and Prolog
Chapter 1: An Overview of Language Processing
Pierre Nugues
Lund University Pierre.Nugues@cs.lth.se http://www.cs.lth.se/home/Pierre_Nugues/
Applications of Language Processing
Spelling and grammatical checkers: MS Word
Text indexing and information retrieval on the Internet: Google, Microsoft Bing, Yahoo
Telephone information that understands some spoken questions: SJ (trains in Sweden) or Tellme.com in the United States
Speech dictation of letters or reports: IBM ViaVoice, Windows Vista
Translation: Google Translate, SYSTRAN
Applications of Language Processing (continued)
Direct translation from spoken English to spoken Swedish in a restricted domain: SRI and SICS
Voice control of domestic devices such as tape recorders (Philips) or disc changers (MS Persona)
Conversational agents able to dialogue and to plan: TRAINS
Spoken navigation in virtual worlds: Ulysse, Higgins
Generation of 3D scenes from text: Carsim
Linguistic Layers
Sounds
Phonemes
Words and morphology
Syntax and functions
Semantics
Dialogue
Sounds and Phonemes
Serious
C'est par là (It is that way)
Lexicon and Parts of Speech
The big cat ate the gray mouse
The/article big/adjective cat/noun ate/verb the/article gray/adjective mouse/noun
Le/article gros/adjectif chat/nom mange/verbe la/article souris/nom grise/adjectif
Die/Artikel große/Adjektiv Katze/Substantiv ißt/Verb die/Artikel graue/Adjektiv Maus/Substantiv
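As a minimal sketch, the English annotation on this slide could be produced by a plain lexicon lookup. The lexicon below is a hypothetical toy that covers only the example sentence, and the names `LEXICON`, `tag`, and `format_tagged` are illustrative assumptions. A real tagger must also choose among several possible tags for a word, which this sketch does not attempt.

```python
# Hypothetical toy lexicon: it covers only the words of the example sentence.
LEXICON = {
    "the": "article",
    "big": "adjective",
    "gray": "adjective",
    "cat": "noun",
    "mouse": "noun",
    "ate": "verb",
}


def tag(sentence):
    """Return (word, tag) pairs; words missing from the lexicon get 'unknown'."""
    return [(word, LEXICON.get(word.lower(), "unknown"))
            for word in sentence.split()]


def format_tagged(pairs):
    """Render the pairs in the word/tag notation used on the slide."""
    return " ".join(f"{word}/{pos}" for word, pos in pairs)


print(format_tagged(tag("The big cat ate the gray mouse")))
```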
Morphology
Word          Root form
worked        to work + verb + preterit
travaillé     travailler + verb + past participle
gearbeitet    arbeiten + verb + past participle
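The English row can be illustrated with a single suffix-stripping rule. This is only a sketch: the function name `analyze` is an assumption, and the rule handles regular preterits in -ed where the stem is unchanged. It would wrongly return "lik" for "liked", so a real analyzer needs a lexicon and spelling rules.

```python
import re


def analyze(word):
    """Toy analysis: strip a final -ed and report the root form and features.

    Returns None when the rule does not apply.
    """
    match = re.fullmatch(r"(\w+?)ed", word)
    if match:
        return f"to {match.group(1)} + verb + preterit"
    return None


print(analyze("worked"))
```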
Syntactic Tree
sentence
  noun phrase
    article: The
    noun: boy
  verb phrase
    verb: hit
    noun phrase
      article: the
      noun: ball
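One possible encoding of this tree, sketched here with nested Python tuples, is shown next. The tuple layout and the helper names `leaves` and `bracketed` are assumptions made for illustration, not a representation taken from the slides.

```python
# A phrase is (label, child, child, ...); a leaf is (label, word).
TREE = ("sentence",
        ("noun phrase", ("article", "The"), ("noun", "boy")),
        ("verb phrase",
         ("verb", "hit"),
         ("noun phrase", ("article", "the"), ("noun", "ball"))))


def is_leaf(node):
    """A leaf has exactly one child, and that child is the word itself."""
    return len(node) == 2 and isinstance(node[1], str)


def leaves(node):
    """Collect the words of the tree from left to right."""
    if is_leaf(node):
        return [node[1]]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def bracketed(node):
    """Render the tree in labelled bracket notation."""
    if is_leaf(node):
        return f"[{node[0]} {node[1]}]"
    return f"[{node[0]} " + " ".join(bracketed(c) for c in node[1:]) + "]"


print(" ".join(leaves(TREE)))
print(bracketed(TREE))
```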
Syntax: A Classical View
A graph of dependencies and functions
Verb: hit
Subject: The boy
Object: the ball
Semantics
As opposed to syntax:
1. Colorless green ideas sleep furiously.
2. *Furiously sleep ideas green colorless.
Determining the logical form:
Sentence                    Logical representation
Frank is writing notes      writing(Frank, notes).
François écrit des notes    écrit(François, notes).
Franz schreibt Notizen      schreibt(Franz, Notizen).
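The logical representations above follow one pattern: the verb becomes the predicate, and the subject and object become its arguments. A minimal sketch, assuming the sentence has already been analysed into those three parts (the function name `logical_form` is hypothetical):

```python
def logical_form(predicate, *arguments):
    """Build a predicate-argument string in the notation used on the slide."""
    return f"{predicate}({', '.join(arguments)})."


# Each triple is (predicate, subject, object), assumed to come from a parser.
for triple in [("writing", "Frank", "notes"),
               ("schreibt", "Franz", "Notizen")]:
    print(logical_form(*triple))
```

The hard part, which this sketch skips, is obtaining the predicate and its arguments from the raw sentence.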
Lexical Semantics
Word senses:
1. note (noun) short piece of writing;
2. note (noun) a single sound at a particular level;
3. note (noun) a piece of paper money;
4. note (verb) to take notice of;
5. note (noun) of note: of importance.
Reference
1. Sentence: Pierre wrote notes
2. Logical representation: wrote(pierre, notes)
3. Real world, reached from the logical representation by referencing:
   Louis, Pierre, Charlotte
   operating systems, computational linguistics, Prolog programming
Ambiguity
Many analyses are ambiguous, which makes language processing difficult.
Ambiguity occurs in any layer: speech recognition, part-of-speech tagging, parsing, etc.
Example of an ambiguous phonetic transcription: The boys eat the sandwiches
That may correspond to:
The boy seat the sandwiches;
the boy seat this and which is;
the buoys eat the sand which is
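The effect can be imitated on written text by removing the spaces and listing every way to cut the string back into words. This is a rough orthographic stand-in for the phonetic case, under the assumption of a tiny hypothetical lexicon; the names `segmentations` and `LEXICON` are illustrative.

```python
def segmentations(text, lexicon):
    """Return every way to split text into a sequence of lexicon words."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in lexicon:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([prefix] + rest)
    return results


# Hypothetical toy lexicon, chosen to expose the ambiguity.
LEXICON = {"the", "boy", "boys", "eat", "seat", "sandwiches"}

for words in segmentations("theboyseatthesandwiches", LEXICON):
    print(" ".join(words))
```

With this lexicon the string has two readings, "the boy seat the sandwiches" and "the boys eat the sandwiches". The number of readings grows with the size of the lexicon.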
Models and Tools
Linguistics has produced an impressive set of theories and models.
Language processing requires significant resources.
Models and tools have matured. Resources are available.
Tools notably include finite-state automata, regular expressions, rewriting rules, logic, statistics, and machine learning.
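As one small illustration of the regular-expression tools mentioned here, the sketch below splits a sentence into word and punctuation tokens with Python's standard `re` module. The pattern and the name `tokenize` are assumptions chosen for brevity, and they ignore cases such as abbreviations or hyphenated words.

```python
import re

# A token is either a run of word characters or one punctuation character.
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Return the list of tokens found in text, in order."""
    return TOKEN.findall(text)


print(tokenize("The boy hit the ball."))
```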
The Carsim System: A Text-to-Scene Converter
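Texts:
  Véhicule B venant de ma gauche, je me trouve dans le carrefour, à faible vitesse environ 40 km/h, quand le véhicule B, percute mon véhicule, et me refuse la priorité à droite. Le premier choc atteint mon aile arrière gauche,
  (Vehicle B coming from my left, I am in the intersection, at low speed, about 40 km/h, when vehicle B hits my vehicle and refuses me the right of way. The first impact hits my rear left wing,)
XML Templates (produced from the texts by the NLP engine):
  // Static Objects
  STATIC [ ROAD TREE ]
  // Dynamic Objects
  DYNAMIC [ VEHICLE [ ID = vehicule b; INITDIRECTION = east;
3D Animation (produced from the templates by the Java 3D animation program)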
Dialogue: The Persona Project from Microsoft Research
A conversation with Peedy
Turn     Utterance
         [Peedy is asleep on his perch]
User:    Good morning, Peedy.
         [Peedy rouses]
Peedy:   Good morning.
User:    Let's do a demo.
         [Peedy stands up, smiles]
Peedy:   Your wish is my command, what would you like to hear?
User:    What have you got by Bonnie Raitt?
         [Peedy waves in a stream of notes, and grabs one as they rush by.]
Peedy:   I have "The Bonnie Raitt Collection" from 1990.
User:    Pick something from that.
Peedy:   How about "Angel from Montgomery"?
User:    Sounds good.
         [Peedy drops note on pile]
Peedy:   OK.
User:    Play some rock after that.
         [Peedy scans the notes again, selects one]
Peedy:   How about "Fools in love"?
User:    Who wrote that?
         [Peedy cups one wing to his ear]
Peedy:   Huh?
User:    Who wrote that?
         [Peedy looks up, scrunches his brow]
Peedy:   Joe Jackson
User:    Fine.
         [Drops note on pile]
Persona System Architecture
Source: http://research.microsoft.com/research/pubs/view.aspx?pubid=439
Research Relevance
Large companies such as Microsoft, Google, Yahoo, IBM, and Xerox conduct research in natural language processing.
The 7th European Framework Programme (2007-2013) names six technology pillars in information technologies. Two of them are related to language processing:
Knowledge, cognitive and learning systems: semantic systems; capturing and exploiting knowledge embedded in web and multimedia content; bio-inspired artificial systems that perceive, understand, learn and evolve, and act autonomously; learning by convivial machines and humans based on a better understanding of human cognition.
Simulation, visualization, interaction and mixed realities: tools for innovative design and creativity in products, services and digital media, and for natural, language-enabled and context-rich interaction and communication.