Natural Language Processing
Lecture 7: Parsing with Context Free Grammars II.
CKY for PCFGs. Earley Parser.
11/13/2020
COMS W4705
Yassine Benajiba
     Recall: Syntactic Ambiguity

 S   → NP VP      NP → she
VP   → V NP       NP → glasses
VP   → VP PP       D → the
PP   → P NP        N → cat
NP   → D N         N → glasses
NP   → NP PP       V → saw
                   P → with

[Two parse trees for "she saw the cat with glasses": one attaches the PP
"with glasses" to the VP (via VP → VP PP), the other attaches it to the
object NP (via NP → NP PP).]

                          Which parse tree is “better”? More probable?
Probabilities for Parse Trees
•   Let T_G be the set of all parse trees generated by
    grammar G.
•   We want a model that assigns a probability P(t) to each parse
    tree t, such that Σ_{t ∈ T_G} P(t) = 1.
•   We can use this model to select the most probable parse
    tree compatible with an input sentence.
    •   This is another example of a generative model!
Selecting Parse Trees
•   Let T_G(s) be the set of trees generated by grammar G whose
    yield (sequence of leaves) is string s.
•   The most likely parse tree produced by G for string s is
        t* = argmax_{t ∈ T_G(s)} P(t)
    •   How do we define P(t)?
    •   How do we learn such a model from training data (annotated or
        unannotated)?
    •   How do we find the highest probability tree for a given
        sentence? (parsing/decoding)
        Probabilistic Context Free
           Grammars (PCFG)
•   A PCFG consists of a Context Free Grammar
    G=(N, Σ, R, S) and a probability P(A → β) for each
    production A → β ∈ R.
    •   The probabilities for all rules with the same left-hand
        side sum up to 1:  Σ_{β : A → β ∈ R} P(A → β) = 1  for each A ∈ N.
    •   Think of this as the conditional probability for A → β,
        given the left-hand-side nonterminal A.
     PCFG Example
 S   → NP VP   [1.0]   NP   → she     [0.05]
VP   → V NP    [0.6]   NP   → glasses [0.05]
VP   → VP PP   [0.4]    D   → the     [1.0]
PP   → P NP    [1.0]    N   → cat     [0.3]
NP   → D N     [0.7]    N   → glasses [0.7]
NP   → NP PP   [0.2]    V   → saw     [1.0]
                        P   → with    [1.0]
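As a quick sanity check (my own sketch, not from the slides), the following Python snippet stores the example grammar above and verifies that the rule probabilities for each left-hand side sum to 1; the dictionary layout is just one possible representation.

    from collections import defaultdict

    # the example PCFG above, as {(lhs, rhs): probability}
    rules = {
        ("S", ("NP", "VP")): 1.0,
        ("VP", ("V", "NP")): 0.6,   ("VP", ("VP", "PP")): 0.4,
        ("PP", ("P", "NP")): 1.0,
        ("NP", ("D", "N")): 0.7,    ("NP", ("NP", "PP")): 0.2,
        ("NP", ("she",)): 0.05,     ("NP", ("glasses",)): 0.05,
        ("D", ("the",)): 1.0,       ("V", ("saw",)): 1.0,   ("P", ("with",)): 1.0,
        ("N", ("cat",)): 0.3,       ("N", ("glasses",)): 0.7,
    }

    totals = defaultdict(float)
    for (lhs, _), p in rules.items():
        totals[lhs] += p

    # every left-hand side should sum to 1 (up to floating-point error)
    assert all(abs(total - 1.0) < 1e-9 for total in totals.values())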
     Parse Tree Probability
•   Given a parse tree t containing rules A_1 → β_1, …, A_n → β_n,
    the probability of t is  P(t) = ∏_{i=1…n} P(A_i → β_i)

    [First parse tree: the PP "with glasses" attached to the VP.]
    Rules used: S → NP VP (1.0), NP → she (.05), VP → VP PP (.4),
    VP → V NP (.6), V → saw (1.0), NP → D N (.7), D → the (1.0),
    N → cat (.3), PP → P NP (1.0), P → with (1.0), NP → glasses (.05)

             1 x .05 x .4 x .6 x 1 x .7 x 1 x .3 x 1 x 1 x .05 = .000126
     Parse Tree Probability
•   Given a parse tree t containing rules A_1 → β_1, …, A_n → β_n,
    the probability of t is  P(t) = ∏_{i=1…n} P(A_i → β_i)

    [Second parse tree: the PP "with glasses" attached to the object NP.]
    Rules used: S → NP VP (1.0), NP → she (.05), VP → V NP (.6),
    V → saw (1.0), NP → NP PP (.2), NP → D N (.7), D → the (1.0),
    N → cat (.3), PP → P NP (1.0), P → with (1.0), NP → glasses (.05)

             1 x .05 x .6 x 1 x .2 x .7 x 1 x .3 x 1 x 1 x .05 = 0.000063 < 0.000126
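A minimal Python sketch (my own illustration, not part of the slides) of the computation above: the probability of a tree is simply the product of the probabilities of the rules it uses. The rule list below corresponds to the first (VP-attachment) tree.

    from math import prod

    # rules used in the VP-attachment tree, with their probabilities
    rules_used = [
        ("S → NP VP", 1.0), ("NP → she", 0.05), ("VP → VP PP", 0.4),
        ("VP → V NP", 0.6), ("V → saw", 1.0), ("NP → D N", 0.7),
        ("D → the", 1.0), ("N → cat", 0.3), ("PP → P NP", 1.0),
        ("P → with", 1.0), ("NP → glasses", 0.05),
    ]

    print(prod(p for _, p in rules_used))   # ≈ 0.000126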
                Estimating PCFG
                  probabilities
•   Supervised training: We can estimate PCFG probabilities from a
    treebank, a corpus manually annotated with constituency
    structure, using maximum likelihood estimates:
        P(A → β) = count(A → β) / count(A)
•   Unsupervised training:
    •   What if we have a grammar and a corpus, but no annotated
        parses?
    •   We can use the inside-outside algorithm to estimate the
        probabilities with EM (not discussed in this course).
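A minimal sketch of the maximum likelihood estimate, assuming the treebank has already been flattened into a list of (lhs, rhs) rule occurrences, one per node in each annotated tree; this representation is hypothetical, not a real treebank reader.

    from collections import Counter

    def estimate_pcfg(rule_occurrences):
        """rule_occurrences: list of (lhs, rhs) pairs, one per tree node."""
        rule_counts = Counter(rule_occurrences)                     # count(A → β)
        lhs_counts = Counter(lhs for lhs, _ in rule_occurrences)    # count(A)
        return {(lhs, rhs): c / lhs_counts[lhs]
                for (lhs, rhs), c in rule_counts.items()}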
        The Penn Treebank
•   Syntactically annotated corpus of newspaper text (1989
    Wall Street Journal Articles).
•   The source text is naturally occurring but the treebank is
    not:
    •   Assumes a specific linguistic theory (although a simple
        one).
    •   Very flat structure (NPs, Ss, VPs).
             PTB Example
( (S (NP-SBJ (NP (NNP Pierre) (NNP Vinken))
       (, ,)
       (ADJP (NML (CD 61) (NNS years))
       (JJ old))
       (, ,))
     (VP (MD will)
   (VP (VB join)
       (NP (DT the) (NN board))
       (PP-CLR (IN as)
         (NP (DT a) (JJ nonexecutive) (NN director)))
       (NP-TMP (NNP Nov.) (CD 29))))
     (. .)))
         Parsing with PCFG
•   We want to use a PCFG to answer the following questions:
    •   What is the total probability of the sentence under the
        PCFG?
    •   What is the most probable parse tree for a sentence
        under the PCFG? (decoding/parsing)
•   We can modify the CKY algorithm.
    Basic idea: Compute these probabilities bottom-up using
    dynamic programming.
Computing Probabilities Bottom-Up

    [Bottom-up computation for the NP-attachment tree of "she saw the cat
    with glasses"; each node's probability is the rule probability times
    the probabilities of its children:]

    leaves:      NP → she (.05), V → saw (1.0), D → the (1.0),
                 N → cat (.3), P → with (1.0), NP → glasses (.05)
    NP → D N:    .7 x 1 x .3 = .21
    PP → P NP:   1 x 1 x .05 = .05
    NP → NP PP:  .2 x .21 x .05 = .0021
    VP → V NP:   .6 x 1 x .0021 = .00126
    S → NP VP:   1 x .05 x .00126 = .000063
    CKY for PCFG Parsing
•   Let T_G(s, A) be the set of trees generated by grammar G
    starting at nonterminal A, whose yield is string s.
•   Use a chart π so that π[i,j,A] contains the probability of the highest-
    scoring parse tree for the substring s[i,j] starting in nonterminal A.
•   We want to find π[0,length(s),S] -- the probability of the highest-
    scoring parse tree for s rooted in the start symbol S.
    CKY for PCFG Parsing
•   To compute π[0,length(s),S] we can use the following recursive
    definition:
        π[i,j,A] = max over rules A → B C ∈ R and split points i < k < j of
                   P(A → B C) × π[i,k,B] × π[k,j,C]
Base case:  π[i,i+1,A] = P(A → s[i]) if A → s[i] ∈ R, else 0
•   Then fill the chart using dynamic programming.
     CKY for PCFG Parsing
 •   Input: PCFG G=(N, Σ, R, S), input string s of length n.
 •   for i=0…n-1:                                                   initialization
         for A ∈ N:
             π[i,i+1,A] = P(A → s[i]) if A → s[i] ∈ R, else 0
 •   for length=2…n:                                                main loop
       for i=0…(n-length):
            j = i+length
            for k=i+1…j-1:
                for A ∈ N:
                    for each rule A → B C ∈ R:
                        π[i,j,A] = max(π[i,j,A], P(A → B C) × π[i,k,B] × π[k,j,C])
Use backpointers to retrieve the highest-scoring parse tree (see previous lecture).
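The following Python sketch (my own illustration, not the lecture's code) fills the π chart exactly as in the pseudocode above. It assumes the grammar is in CNF and is stored in two hypothetical dictionaries: binary maps a pair of child nonterminals (B, C) to a list of (A, P(A → B C)), and lexical maps a word to a list of (A, P(A → word)).

    from collections import defaultdict

    def pcky(tokens, binary, lexical):
        n = len(tokens)
        pi = defaultdict(float)      # pi[(i, j, A)] = best probability for s[i,j]
        back = {}                    # backpointers for recovering the best tree

        # initialization: spans of length 1
        for i in range(n):
            for A, p in lexical.get(tokens[i], []):
                pi[(i, i + 1, A)] = p
                back[(i, i + 1, A)] = tokens[i]

        # main loop: build longer spans from shorter ones
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                for k in range(i + 1, j):                      # split point
                    for (B, C), rules in binary.items():
                        pB, pC = pi[(i, k, B)], pi[(k, j, C)]
                        if pB == 0.0 or pC == 0.0:
                            continue
                        for A, p in rules:
                            prob = p * pB * pC
                            if prob > pi[(i, j, A)]:
                                pi[(i, j, A)] = prob
                                back[(i, j, A)] = (k, B, C)
        return pi, back

π[(0, n, 'S')] then holds the probability of the best parse for the whole sentence, and the tree itself can be read off the backpointers as in the previous lecture.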
    Probability of a Sentence
•   What if we are interested in the probability of a sentence,
    not of a single parse tree (for example, because we want
    to use the PCFG as a language model)?
•   Problem: Spurious ambiguity. Need to sum the
    probabilities of all parse trees for the sentence.
•   How do we have to change CKY to compute this?
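One possible answer, sketched as a variant of the pcky function above (same hypothetical grammar representation): summing over split points and rules instead of maximizing yields the total ("inside") probability of the sentence.

    from collections import defaultdict

    def sentence_probability(tokens, binary, lexical, start="S"):
        n = len(tokens)
        pi = defaultdict(float)            # pi[(i, j, A)] = total probability
        for i in range(n):
            for A, p in lexical.get(tokens[i], []):
                pi[(i, i + 1, A)] += p
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                for k in range(i + 1, j):
                    for (B, C), rules in binary.items():
                        pB, pC = pi[(i, k, B)], pi[(k, j, C)]
                        if pB and pC:
                            for A, p in rules:
                                pi[(i, j, A)] += p * pB * pC   # sum, not max
        return pi[(0, n, start)]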
                 Earley Parser
•   The CKY parser starts with the words and builds parse trees bottom-
    up; it requires the grammar to be in CNF.
•   The Earley parser instead starts at the start symbol and tries
    to “guess” derivations top-down.
    •   It discards derivations that are incompatible with the
        sentence.
    •   The Earley parser sweeps through the sentence left-to-right
        only once. It keeps partial derivations in a table (“chart”).
    •   Allows arbitrary CFGs, no limitation to CNF.
                 Parser States
•   Earley parser keeps track of partial derivations using parser
    states / items.
•   States represent hypotheses about constituent structure based
    on the grammar, taking into account the input.
•   Parser states are represented as dotted rules with spans.
    •   The constituents to the left of the · have already been seen
        in the input string s (corresponding to the span).
 S → · NP VP [0,0]     “According to the grammar, there may be an NP
                        starting in position 0.”
NP → D A · N [0,2]     “There is a determiner followed by an adjective in s[0,2].”
NP → NP PP · [3,8]     “There is a complete NP in s[3,8], consisting of an NP and a PP.”
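A minimal sketch (my own, not from the lecture) of one possible way to represent such parser states in Python; the field names are assumptions.

    from collections import namedtuple

    # lhs: left-hand side, rhs: tuple of right-hand-side symbols,
    # dot: index of the · within rhs, start/end: the span covered so far
    State = namedtuple("State", ["lhs", "rhs", "dot", "start", "end"])

    s1 = State("S", ("NP", "VP"), 0, 0, 0)      # S → · NP VP [0,0]
    s2 = State("NP", ("D", "A", "N"), 2, 0, 2)  # NP → D A · N [0,2]
    s3 = State("NP", ("NP", "PP"), 2, 3, 8)     # NP → NP PP · [3,8]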
        Earley Parser (sketch)

 S   → NP VP      V → saw
VP   → V NP       P → with
VP   → VP PP      D → the
PP   → P NP       N → cat
NP   → D N        N → tail
NP   → NP PP      N → student

Input:    the student saw the cat with the tail
          (word boundary positions numbered 0 through 8)

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.
3. Complete with passive states.

The operations applied, and the states they add to the chart:

Predict:   S → · NP VP [0,0] (initialization),  NP → · NP PP [0,0],
           NP → · D N [0,0],  D → · the [0,0]
Scan:      D → the · [0,1]       (a passive state: the dot is at the end)
Complete:  NP → D · N [0,1]
Predict:   N → · cat [1,1],  N → · tail [1,1],  N → · student [1,1]
Scan:      N → student · [1,2]
Complete:  NP → D N · [0,2]
Complete:  S → NP · VP [0,2],  NP → NP · PP [0,2]
                  Earley Algorithm
•   Keep track of parser states in a table (“chart”). Chart[k]
    contains the set of all parser states that end in position k.
•   Input: Grammar G=(N, Σ, R, S), input string s of length n.
•   Initialization: For each production S→α ∈R
                   add a state S →·α[0,0] to Chart[0].
•   for i = 0 to n:
    •   for each state in Chart[i]:
        •   if state is of form A →α ·s[i] β [k,i]:
                scan(state)
        •   elif state is of form A →α ·B β [k,i]:
               predict(state)
        •   elif state is of form A →α · [k,i]
               complete(state)
        (If none of the three cases applies, the state has the form
        A → α · a β [k,i] where a is a terminal that is not s[i]; such a state
        is simply left alone and no operation is applied to it.)
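A minimal sketch of the main loop, using the State namedtuple from above and a hypothetical grammar dict {nonterminal: list of right-hand-side tuples}; it relies on the scan, predict and complete helpers sketched below, after the slides that define each operation.

    def earley(tokens, grammar, start="S"):
        n = len(tokens)
        chart = [set() for _ in range(n + 1)]       # Chart[k]: states ending at k
        for rhs in grammar[start]:                  # initialization: S → · α [0,0]
            chart[0].add(State(start, rhs, 0, 0, 0))
        nonterminals = set(grammar)
        for i in range(n + 1):
            agenda = list(chart[i])
            while agenda:
                state = agenda.pop()
                before = set(chart[i])
                if state.dot < len(state.rhs) and state.rhs[state.dot] not in nonterminals:
                    scan(state, tokens, chart)      # dot in front of a terminal
                elif state.dot < len(state.rhs):
                    predict(state, grammar, chart)  # dot in front of a nonterminal
                else:
                    complete(state, chart)          # passive state
                agenda.extend(chart[i] - before)    # process newly added states

        return chart

The input is accepted if Chart[n] contains a passive state S → α · [0,n].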
    Earley Algorithm - Scan
•   The scan operation can only be applied to a state if the dot is
    in front of a terminal symbol that matches the next input
    terminal.
•   function scan(state):      // state is of form A →α ·s[i] β [k,i]
    •   Add a new state A →α s[i]·β [k,i+1]
        to Chart[i+1]
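A possible implementation of scan under the same assumptions as the sketch above; it silently does nothing when the terminal after the dot does not match the next input token.

    def scan(state, tokens, chart):
        i = state.end
        # the dot must be in front of a terminal that matches the next input token
        if i < len(tokens) and state.rhs[state.dot] == tokens[i]:
            chart[i + 1].add(State(state.lhs, state.rhs, state.dot + 1,
                                   state.start, i + 1))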
        Earley Algorithm - Predict
•       The predict operation can only be applied to a state if the dot is
        in front of a non-terminal symbol.
    •    function predict(state):   // state is of form A →α ·B β [k,i]
         •   For each production B → γ ∈ R, add a new state B →· γ [i,i]
             to Chart[i]
•       Note that this modifies Chart[i] while the algorithm is looping
        through it.
•       No duplicate states are added (Chart[i] is a set)
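A possible implementation of predict under the same assumptions; because Chart[i] is a set, adding an already-present state is a no-op.

    def predict(state, grammar, chart):
        i = state.end
        b = state.rhs[state.dot]                   # the nonterminal after the dot
        for rhs in grammar.get(b, []):             # every production B → γ
            chart[i].add(State(b, rhs, 0, i, i))   # add B → · γ [i,i]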
    Earley Algorithm - Complete
•   The complete operation may only be applied to a passive item.
•   function complete(state):       // state is of form A →α · [k,j]
    •   for each state B → β ·A γ [i,k] add a new state
        B → β A · γ[i,j] to Chart[j]
•   Note that this modifies Chart[j], which is the chart entry the algorithm
    is currently looping through (j = i in the main loop).
•   Note that it is important to make a copy of the old state
    before moving the dot.
•   This operation is similar to the combination operation in CKY!
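A possible implementation of complete under the same assumptions; since State is an immutable tuple, "moving the dot" always creates a fresh copy of the old state, and the list() call guards against modifying a set while iterating over it.

    def complete(state, chart):                         # state is A → α · [k,j]
        j = state.end
        for st in list(chart[state.start]):             # states B → β · A γ [i,k]
            if st.dot < len(st.rhs) and st.rhs[st.dot] == state.lhs:
                chart[j].add(State(st.lhs, st.rhs, st.dot + 1, st.start, j))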
    Earley Algorithm - Runtime
•   The runtime depends on the number of items in the chart
    (each item is “visited” exactly once).
•   We proceed through the input exactly once, which takes
    O(N).
    •   For each position in the chart, there are O(N) possible split
        points (start positions) for an item.
    •   Each complete operation can produce O(N) possible new
        items (with different starting points).
•   Total: O(N3)
                 Earley Algorithm -
                Some Observations
•   How do we recover parse trees?
    •   What happens in case of ambiguity?
        •   Multiple ways to Complete the same state.
    •   Keep back-pointers in the parser state objects.
    •   Or use a separate data structure (CKY-style table or
        hashed states)
•   How do we make the algorithm work with PCFG?
    •   It is easy to compute probabilities during Complete: follow the
        backpointer with the maximum probability.