Lecture15 Parsing

The document provides an overview of parsing in Natural Language Processing, detailing top-down and bottom-up parsing methods, their advantages and disadvantages, and specific algorithms like the Earley and CYK parsers. It also discusses probabilistic parsing and its benefits, particularly in the context of Indian languages and their unique grammatical structures. References to foundational works in the field are included, highlighting the complexity of parsing in diverse linguistic contexts.

|| Jai Sri Gurudev||

Sri Adichunchanagiri Shikshana Trust (R)

SJB INSTITUTE OF TECHNOLOGY


(Affiliated to Visvesvaraya Technological University, Belagavi & Approved by AICTE, New Delhi)
No. 67, BGS Health & Education City, Dr. Vishnuvardhan Road Kengeri, Bengaluru – 560 060

Subject: Natural Language Processing (18CS743)


By
CHETAN R, Assistant Professor
Semester / Section: 7A and B

Department of Information Science & Engineering


Academic Year: Odd Semester, 2021-22
PARSING

Overview
• The task that uses the rewrite rules of a grammar either to generate a
particular sequence of words or to reconstruct its derivation is termed
parsing.
• The following constraints guide the search process:
1. Input: the words in the input sentence. A valid parse is one that covers
all the words in the sentence; these words must constitute the leaves of
the final parse tree.
2. Grammar: the root of the final parse tree must be the start symbol
of the grammar.
Top-down parsing
Top-down search space
Bottom-up parsing
Pros and Cons
• The top-down parser starts generating trees with the start symbol of the
grammar, so it never wastes time exploring a tree leading to a different
root. However, it wastes time exploring S-rooted trees that eventually
result in words inconsistent with the input.
• The bottom-up parser never explores a tree that does not match the
input. However, it wastes time generating trees that have no chance of
leading to an S-rooted tree.
Basic Top-Down Parser
Derivation using top-down, depth first algorithm
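The derivation slide is an image that did not survive extraction. As a complement, here is a minimal recursive-descent recognizer illustrating top-down, depth-first search. The toy grammar, lexicon, and sentences are assumptions for illustration only, not content from the slides.

```python
# A minimal recursive-descent (top-down, depth-first) recognizer.
# The grammar and lexicon below are illustrative assumptions.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["Pronoun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
}
LEXICON = {
    "the": "Det", "a": "Det", "dog": "Noun", "bone": "Noun",
    "she": "Pronoun", "ate": "Verb",
}

def parse(symbol, words, pos):
    """Try to expand `symbol` starting at words[pos].
    Returns the position after the matched span, or None on failure."""
    if symbol in GRAMMAR:                     # non-terminal: try each rule
        for rule in GRAMMAR[symbol]:
            p = pos
            for child in rule:
                p = parse(child, words, p)
                if p is None:
                    break
            else:
                return p                      # every child matched
        return None
    # pre-terminal: match one input word whose category is `symbol`
    if pos < len(words) and LEXICON.get(words[pos]) == symbol:
        return pos + 1
    return None

def recognize(sentence):
    words = sentence.split()
    return parse("S", words, 0) == len(words)

print(recognize("the dog ate a bone"))  # True
print(recognize("ate the dog"))         # False
```

For simplicity this sketch commits to the first expansion of each sub-constituent that matches; a full top-down parser would also backtrack over alternative split points, which is exactly the source of the wasted work described in the Pros and Cons above.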
Disadvantages
1. Left recursion: causes the search to get stuck in an infinite loop.
2. Structural ambiguity: occurs when the grammar assigns more than one parse
to a sentence.
3. Attachment ambiguity: occurs when a constituent fits in more than one
position in the parse tree.
4. Coordination ambiguity: occurs when it is not clear which phrases are being
combined with a conjunction like 'and'.
5. Local ambiguity: occurs when a part of a sentence is ambiguous even though
the sentence as a whole is not.
6. Repeated parsing: the parser often has to rebuild valid trees for portions of
the input that it discarded during backtracking.
EARLEY PARSER
• It implements an efficient parallel top-down search using
dynamic programming.
• It builds a table of sub-trees for each of the constituents in the
input.
• The most important component of this algorithm is the Earley
chart, which has n+1 entries, where n is the number of words in
the input.
• The algorithm makes a left-to-right scan of the input to fill the
entries in this chart.
State Information
1. A sub-tree corresponding to a grammar rule.
2. Information about the progress made in completing the
sub-tree.
3. The position of the sub-tree with respect to the input.
A state is represented as a dotted rule together with a pair of numbers
giving the starting position and the position of the dot:
A → X1 … • C … Xm, [i, j]
Algorithm
Predictor
• Generates new states representing potential expansions of the
non-terminal in the left-most derivation.
• It is applied to every state that has a non-terminal to the right of
the dot, when the category of that non-terminal is not a
part-of-speech category.
• If A → X1 … • C … Xm, [i, j], then for every rule of the form
C → α, the operation adds to chart[j] the state:
C → • α, [j, j]
Example
• When the generating state is S → • NP VP, [0, 0], the predictor
adds the following states to chart[0]:
NP → • Det Nominal, [0, 0]
NP → • Noun, [0, 0]
NP → • Pronoun, [0, 0]
NP → • Det Noun PP, [0, 0]
Scanner
• The scanner is used when a state has a part-of-speech category to
the right of the dot.
• It examines the input to see whether the part-of-speech category to
the right of the dot matches one of the parts-of-speech
associated with the current input word.
• If yes, it creates a new state using that rule.
• If the state is A → … • a …, [i, j] and 'a' is a part-of-speech
associated with the word wj, then it adds a → wj •, [j, j+1]
to chart[j+1].
Completer
• The completer is used when the dot reaches the right end of a
rule.
• The presence of such a state signifies successful completion of
the parse of some grammatical category.
• If A → … •, [j, k], then the completer adds
B → … A • …, [i, k] to chart[k] for all states
B → … • A …, [i, j] in chart[j].
CYK Parser
• Cocke-Younger-Kasami (CYK) is a dynamic programming parsing
algorithm.
• It follows a bottom-up approach: it builds the parse tree
incrementally, and each entry in the table is based on previous
entries.
• The CYK algorithm assumes the grammar to be in Chomsky Normal
Form (CNF). A CFG is in CNF if all its rules are of only two forms:
A → B C
A → w, where w is a word.
CYK Algorithm
Example
Sentence: “The girl wrote an essay”
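The worked table for this slide was an image that did not survive extraction. As a sketch, here is a minimal CYK recognizer run on the example sentence; the small CNF grammar below is an assumption invented to cover just this sentence.

```python
# A minimal CYK recognizer for a grammar in Chomsky Normal Form.
UNARY = {               # A -> w  (terminal rules)
    "the": {"Det"}, "an": {"Det"}, "girl": {"N"}, "essay": {"N"},
    "wrote": {"V"},
}
BINARY = {              # A -> B C (binary rules), keyed by (B, C)
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
    ("NP", "VP"): {"S"},
}

def cyk(words):
    n = len(words)
    # table[i][j] holds the non-terminals that span words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[i][i + 1] = set(UNARY.get(w, ()))
    for span in range(2, n + 1):            # span length, shortest first
        for i in range(n - span + 1):       # span start
            j = i + span
            for k in range(i + 1, j):       # split point
                for B in table[i][k]:
                    for C in table[k][j]:
                        table[i][j] |= BINARY.get((B, C), set())
    return "S" in table[0][n]

print(cyk("the girl wrote an essay".split()))  # True
```

Filling shorter spans before longer ones guarantees that every entry the algorithm reads has already been computed, which is the dynamic programming property mentioned above.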
Probabilistic Parsing
• A statistical parser works by assigning probabilities to the possible
parses of a sentence and returning the most likely parse as the final one.
• More formally, given a grammar G, a sentence s, and the set of possible
parse trees of s, which we denote by τ(s), a probabilistic parser finds
the most likely parse t̂ of s as follows:
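The formula on this slide was an image that did not survive extraction; the standard form of this criterion, stated here as an assumption rather than the slide's own rendering, is:

```latex
\hat{t} \;=\; \operatorname*{argmax}_{t \in \tau(s)} P(t \mid s)
        \;=\; \operatorname*{argmax}_{t \in \tau(s)} P(t)
```

The second equality holds because every tree in τ(s) yields s, so P(t | s) is proportional to P(t).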
Advantages
1. A probabilistic parser offers a principled way of resolving ambiguity:
among competing parses, it returns the most likely one.
2. The search becomes more efficient, since low-probability sub-parses
can be pruned.
Probabilistic Context Free Grammar
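This slide's content was an image that was lost in extraction. As background, and stated as an assumption rather than the slide's own formulation: a PCFG attaches a probability to each rule such that the probabilities of all rules with the same left-hand side sum to one, and the probability of a parse tree is the product of the probabilities of the rules used in it:

```latex
\sum_{\beta} P(A \rightarrow \beta) = 1,
\qquad
P(t) = \prod_{(A \rightarrow \beta) \,\in\, t} P(A \rightarrow \beta)
```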
Example
Probability Estimation
Estimating Rule Probability
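The estimation formula on this slide was an image that was lost in extraction. The standard maximum-likelihood estimate of a rule probability from a treebank, given here as background rather than as the slide's own formula, is:

```latex
P(A \rightarrow \beta \mid A)
  \;=\; \frac{\mathrm{Count}(A \rightarrow \beta)}{\sum_{\gamma} \mathrm{Count}(A \rightarrow \gamma)}
  \;=\; \frac{\mathrm{Count}(A \rightarrow \beta)}{\mathrm{Count}(A)}
```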
Two Parse Trees
Parsing PCFGs
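The content of this slide was an image that was lost in extraction. As a sketch of one standard approach: the CYK algorithm extends naturally to PCFGs by storing, for each non-terminal and span, the probability of its best sub-parse (a Viterbi score). The grammar and its probabilities below are invented for illustration only.

```python
# A probabilistic CYK (Viterbi) sketch: each cell maps a non-terminal
# to the probability of its best sub-parse over that span.
UNARY = {"the": {"Det": 1.0}, "girl": {"N": 1.0},
         "wrote": {"V": 1.0}, "an": {"Det": 1.0}, "essay": {"N": 1.0}}
BINARY = {("Det", "N"): {"NP": 0.7}, ("V", "NP"): {"VP": 0.9},
          ("NP", "VP"): {"S": 1.0}}

def viterbi_cyk(words):
    n = len(words)
    best = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        best[i][i + 1] = dict(UNARY.get(w, {}))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for B, pb in best[i][k].items():
                    for C, pc in best[k][j].items():
                        for A, pr in BINARY.get((B, C), {}).items():
                            p = pr * pb * pc      # rule prob x children
                            if p > best[i][j].get(A, 0.0):
                                best[i][j][A] = p
    return best[0][n].get("S", 0.0)

# best probability for S: 1.0 * 0.7 * (0.9 * 0.7) = 0.441
print(viterbi_cyk("the girl wrote an essay".split()))
```

Keeping only the best score per non-terminal and span is what makes the search efficient: lower-probability analyses of the same span are pruned as soon as they are dominated.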
Indian Languages
• The majority of Indian languages are free-word-order languages:
the order of words in a sentence can be changed without making the
sentence grammatically incorrect.
Contd..
• Extensive and productive use of complex predicates (CPs) is another property
that most Indian languages have in common.
• A complex predicate combines a noun, verb, or adjective with a light verb
to produce a new verb.
Parsing Indian Languages
• Bharti and Sangal described an approach to parsing Indian
languages based on the Paninian grammar formalism. It has two stages:
1. The first stage identifies word groups.
2. The second stage assigns a parse structure to the input sentence.
Karaka Chart
Constraint Graph
Constraints
Parse of the sentence
References
1. Bharti, Akshar and Rajeev Sangal, 1990, 'A Karaka-based approach to
parsing of Indian languages', Proceedings of the 13th Conference on
Computational Linguistics, Association for Computational Linguistics,
Vol. 3.
2. Chomsky, N., 1957, Syntactic Structures, Mouton, The Hague.