Language and Statistics II
Lecture 1: Introduction
Noah Smith
Today’s Plan
• What’s this course about?
– Goals
– Topics
• How will we be evaluated?
• Some history
• Q&A
My Goals
• You’ll read, write, and present technical
information better.
• You’ll gain a deeper, more connected understanding of
the EMNLP literature.
• You’ll find new ideas for bringing NLP “tasks”
closer to NLP applications.
• You’ll refine your taste for good research.
• You’ll enjoy the next ACL/NAACL/EMNLP and
impress people with your insight and
communication skills!
“Who Am I and Why Am I Here?”
(introductions all around)
Presumptions
• You took L&S I, or equivalent
• You believe in statistical methods for
language technology
• You want to know more about current
practice in the area
Topics
• In lectures, we’ll mostly cover tools and “tasks”
• Your lit review will cover an application
• If you’ve taken ML, this course may feel
applications-focused.
• If you’ve taken applied NLP courses (ASR, IR, MT,
IE, etc.), this course may feel theoretical.
• If you’ve taken both, this course will tie things
together.
• If you’ve taken linguistics, this course will be
frustrating - that’s a good thing!
Topics
• Sequence models (might be review; see the sketch after this list)
– Markov models
– HMMs
– Algorithms and applications
• Log-linear models
– Theory
– Practice
– Examples: Ratnaparkhi tagger, CRFs
– Tasks
• Weighted finite-state technology
– Algorithms
– Applications
– Tools
• Weighted grammars and parsing
– Theory
– Eisner, Charniak, Ratnaparkhi, Collins, McDonald
– Maybe: LTAG, CCG
– Practical issues
• Dynamic programming
– Unified framework for aligning, labeling, parsing, …
– Implementation challenges
– Limitations
• Going discriminative
– Blast from the past: transformation-based learning
– Perceptrons and maximum-margin training
– Reranking
• Going unsupervised
– Expectation-Maximization
– Contrastive estimation
– Dirichlet processes (maybe)
• Going semi-supervised
– Self-training
– Yarowsky algorithm, co-training
• Time and interest permitting: MT, OT, kernels
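To make the sequence-model and dynamic-programming topics concrete, here is a minimal Viterbi decoder for a toy HMM; the tag set, vocabulary, and probabilities are hypothetical, chosen only for illustration, not drawn from any of the readings.

# Minimal Viterbi sketch for a toy HMM; all parameters are made up for illustration.
from math import log

states = ["N", "V"]                      # hypothetical tag set
start = {"N": 0.7, "V": 0.3}             # p(first tag)
trans = {"N": {"N": 0.4, "V": 0.6},      # p(tag_t | tag_{t-1})
         "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"fish": 0.6, "swim": 0.4}, # p(word_t | tag_t)
        "V": {"fish": 0.3, "swim": 0.7}}

def viterbi(words):
    # delta[t][s] = best log-probability of any tag sequence ending in state s at position t
    delta = [{s: log(start[s]) + log(emit[s][words[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(words)):
        delta.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: delta[t - 1][r] + log(trans[r][s]))
            delta[t][s] = delta[t - 1][prev] + log(trans[prev][s]) + log(emit[s][words[t]])
            back[t][s] = prev
    # Recover the best tag sequence by following backpointers from the best final state.
    tags = [max(states, key=lambda s: delta[-1][s])]
    for t in range(len(words) - 1, 0, -1):
        tags.append(back[t][tags[-1]])
    return list(reversed(tags))

print(viterbi(["fish", "swim"]))         # prints ['N', 'V'] with these toy parameters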
Evaluation
• Lectures, suggested readings
– ~4-6 Assignments (20%)
– Final Exam (20%)
• Literature review
– Written document (35%)
– Oral presentation (25%)
Literature Review
Comprehensive review of the literature:
– Clearly define a problem within NLP
– Describe existing evaluation procedures
– Discuss available datasets
– Thorough, coherent discussion of existing
techniques
– Comparison among techniques, if possible
– Current obstacles
– Insights on tackling or avoiding those obstacles,
improving evaluation, “scaling up,” etc.
Suggested Topics
• Question answering
• Textual entailment and paraphrase
• Morphology induction and modeling
• Syntax-based machine translation
• Data-oriented parsing and translation
• Syntax-based language modeling
• Finite-state parsing
• Optimality theory
(You’re welcome to propose other areas!)
Carrots
• Theses usually have literature reviews.
• Computational Linguistics will start
publishing literature reviews soon.
• Well-written reviews, when put online,
tend to be oft-referenced and oft-cited.
Deliverables
• Sept. 12: pick topics, initial reading list
• Oct. 16-20: progress meeting
• Nov. 10: first draft
• Last 1-2 weeks of class: talks
• Dec. 8: final version
Question
• Interspeech is the fourth week of term;
who intends to be there?
Supplications
• New faculty member, new course …
– Please have patience!
– All feedback is welcome!
• Ask questions!
– I don’t know everything.
– But I probably know where to look or who
to ask.
(Most of) The Rest (of the Lecture) is History
Let’s not repeat the mistakes of the past.
Cocktail party conversation at the next ACL.
Issues to keep in mind as we proceed.
Zellig Harris (1909-1992)
• Validation criteria for linguistic
analysis
• Linguistic transformations as a tool
for describing language
mathematically
• Centrality of data!
• Students: Chomsky, Gleitman,
Joshi, …
• Note: structuralism never died in
Europe.
Claude Shannon (1916-2001)
• Father of information theory
• Entropy: a mathematical measure
of uncertainty (see the formula below)
• Information can be encoded
digitally; questions include how to
encode information efficiently and
reliably.
• Huge impact on speech
recognition (and space exploration
and digital media invention and …)
• 1949: Weaver compared
translation to cryptography
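For reference, the entropy of a discrete random variable X with distribution p is

H(X) = -\sum_x p(x) \log_2 p(x)

measured in bits when the logarithm is base 2.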
Victor Yngve (1920-)
• Early computational linguist
• Showed “depth limit” of human sentence
processing - restricted left branching (but
not right)
• Theme: what are the real observables in
language study? Sound waves!
• Early programming language, COMIT, for
linguists (influenced SNOBOL)
Yehoshua Bar-Hillel (1915-1975)
• First academic to work on MT
• Believed in the close
relationship of logic and
language
• Tremendous foresight in
identifying the problems in
MT … before it existed.
Noam Chomsky (1928-)
• Universal grammar, productivity
• Chomsky hierarchy
• Generative grammar (P&P, GB,
Minimalism), a series of (mainly
syntax) theories that are based
largely on the
grammatical/ungrammatical
boundary.
• Data? Native speaker judgments.
• Claim: “probability of a sentence” is a
meaningless idea.
ALPAC Report
• “Automatic Language Processing Advisory
Committee” (1964-6)
• Skeptical of MT research; to paraphrase, “we
don’t have it and it looks like we never will.”
• Supportive of basic research in linguistics:
“we need understanding!”
• Bar-Hillel left the field
• Killed MT for a while
The Rise of Rationalism
(1960-85)
• In linguistics, more and more focus on
syntax, less on processing &
algorithms
• Rule-based approaches in AI ≈ innate
knowledge of language (reasoning, etc.)
• Many linguists didn’t/don’t care about
applications
Science-Engineering Debate
• NLP = CL?
• Are we doing science or engineering?
• Can computational experiments tell us
anything about human intelligence?
• Can theories of human intelligence give
insight to engineering problems?
• Beware the worst of both worlds:
– Science requires no application …
– Engineering requires no rigor …
The Return of Empiricism
(1985-today)
• Late 1980s: ASR meets NLP
• Major efforts at IBM
– Also, AT&T, U. Penn, CMU
• Candide - statistical MT model
• Spatter - statistical parser
• Now empirical methods are mainstream
Rationalist-Empiricist Debate
• Visible across AI, Cognitive Science, Linguistics
• Skinner/Chomsky
• ASR/GOFAI
• Connectionism/Symbolic systems
• Corpus-based linguistics/“theoretical” linguistics
• Statistical NLP/Knowledge-based NLP
• Mature view: science (many unresolved questions)
vs. engineering (use what you’ve got)
Some Opinions
• Good engineering is
– Intuitive and understandable
– Formally rigorous
– Replicable (like good science!)
• Mediocre engineering can often be “cleaned up”
later (if replicable)
• Linguistics (& cognitive science) can inspire smarter
models and features
• Empirical NLP has more to offer to linguistics than
rule-based NLP!
• NLP is one of the most interesting & difficult ML
application areas