Solutions To NLP I Mid Set A

ADITYA COLLEGE OF ENGINEERING

PUNGANUR ROAD, MADANAPALLE-517325


IV-B.Tech (R13) II Sem- I Internal Examinations FEB-2017 (Descriptive) (CODE A)
(13A05802) NATURAL LANGUAGE PROCESSING (Computer Science & Engineering)
Time: 90 min Max Marks: 30

Part A
(Compulsory)
1. Answer the following questions.
a. Define Parsing and partial parsing.
Parsing or syntactic analysis is the process of analysing a string of symbols, whether in a natural language or in a
computer language, according to the rules of a formal grammar.
Partial parsing is used to parse an input only as far as is necessary to satisfy the current
reference. Partial parsing techniques aim to recover syntactic information efficiently and reliably from unrestricted
text by sacrificing completeness and depth of analysis.
b. Explain with an example the elements of simple noun phrases.
Noun phrases (NPs) are used to refer to things: objects, places, concepts, events, qualities, and so on. The simplest NP
consists of a single pronoun: he, she, they, you, me, it, I, and so on. Pronouns can refer to physical objects, to events,
and to qualities. Pronouns do not take modifiers except in rare forms. Another basic form of
noun phrase consists of a name or proper noun. These nouns appear in capitalized form in carefully written English,
and names may consist of multiple words. Excluding pronouns and proper names, the head of a noun phrase is
usually a common noun.
Nouns divide into two main classes:
count nouns - nouns that describe specific objects or sets of objects.
mass nouns - nouns that describe composites or substances; mass nouns cannot be counted.
In addition to a head, a noun phrase may contain specifiers and qualifiers preceding the head. The qualifiers further
describe the general class of objects identified by the head, while the specifiers indicate how many such objects are
being described, as well as how the objects being described relate to the speaker and hearer.
Specifiers are constructed out of ordinals (such as first and second), cardinals (such as
one and two), and determiners.
Determiners can be subdivided into the following general classes:
articles - the words the, a, and an.
demonstratives - words such as this, that, these, and those.
possessives - noun phrases followed by the suffix 's, such as John's and the fat man's, as well as possessive pronouns,
such as her, my, and whose.
wh-determiners - words used in questions, such as which and what.
quantifying determiners - words such as some, every, most, no, any, both, and half.
A simple noun phrase may have at most one determiner, one ordinal, and one cardinal. It is possible to have all three,
as in the first three contestants. An exception to this rule exists with a few quantifying determiners such as many, few,
several, and little. The qualifiers in a noun phrase occur after the specifiers (if any) and before the head. They consist
of adjectives and nouns being used as modifiers. The following are more precise definitions:
adjectives - words that attribute qualities to objects yet do not refer to the qualities themselves.
noun modifiers - mass or count nouns used to modify another noun.
Nouns take different inflectional forms in English: the singular and the plural.
Pronouns take forms based on person (first, second, and third) and gender (masculine, feminine, and neuter).
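This specifier-qualifier-head ordering can be sketched as a toy grammar; the following is a minimal sketch assuming NLTK is installed, and its rules and vocabulary are illustrative only, not a standard grammar:

    import nltk

    # Toy CFG for simple NPs: optional determiner, ordinal, cardinal (specifiers),
    # then adjectives (qualifiers), then the head noun. Illustrative only.
    np_grammar = nltk.CFG.fromstring("""
        NP   -> DET ORD CARD ADJS N | DET ADJS N | DET N | N
        ADJS -> ADJ ADJS | ADJ
        DET  -> 'the'
        ORD  -> 'first'
        CARD -> 'three'
        ADJ  -> 'lucky'
        N    -> 'contestants'
    """)
    parser = nltk.ChartParser(np_grammar)
    for tree in parser.parse("the first three lucky contestants".split()):
        print(tree)

This accepts "the first three contestants" from the text above (here with an added qualifier) while rejecting orderings such as "three the first contestants", mirroring the at-most-one determiner, ordinal, and cardinal rule.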
c. Explain about Deterministic Parser.
In natural language processing, deterministic parsing refers to parsing algorithms that do not back up; LR parsers are
an example. A deterministic parser can be built that depends entirely on matching parse states to direct its operation.
Instead of allowing only shift and reduce actions, however, a richer set of actions is allowed, operating on an input
stack called the buffer.
d. List various stages involved in NLP.
Lexical Analysis - It involves identifying and analyzing the structure of words. The lexicon of a language is the
collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs,
sentences, and words (a small code sketch follows this list).
Syntactic Analysis (Parsing) - It involves analysis of the words in a sentence for grammar, arranging the words in a
manner that shows the relationships among them. A sentence such as "The school goes to boy" is rejected by an
English syntactic analyzer.
Semantic Analysis - It draws the exact meaning, or the dictionary meaning, from the text. The text is checked for
meaningfulness. This is done by mapping syntactic structures to objects in the task domain. The semantic analyzer
disregards sentences such as "hot ice cream".
Discourse Integration - The meaning of any sentence depends upon the meaning of the sentence just before it. In
addition, it also bears on the meaning of the immediately succeeding sentence.
Pragmatic Analysis - During this stage, what was said is re-interpreted in terms of what was actually meant. It involves
deriving those aspects of language which require real-world knowledge.
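As a small, hedged illustration of the lexical-analysis stage using NLTK (assuming the library and its 'punkt' tokenizer models are installed):

    import nltk

    text = "The visitor saw the old painting. It was in the den."
    sentences = nltk.sent_tokenize(text)                # divide the text into sentences
    words = [nltk.word_tokenize(s) for s in sentences]  # and each sentence into words
    print(words)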
e. What are the sources of ambiguity?
Ambiguity in a sentence can arise from the following sources (a toy-grammar demonstration follows the list):
1. Multiple meanings of words
2. Multiple attachment points of prepositional phrases.
3. Clause attachment points
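Source 2 can be made concrete with a toy grammar; in this hedged NLTK sketch (the grammar is the classic textbook example, not a claim about any production system), one sentence receives two parses, one per attachment point of the prepositional phrase:

    import nltk

    # Illustrative grammar in which "with a telescope" can attach to the NP
    # ("the man with a telescope") or to the VP (the seeing was done with it).
    pp_grammar = nltk.CFG.fromstring("""
        S   -> NP VP
        PP  -> P NP
        NP  -> Det N | Det N PP | 'I'
        VP  -> V NP | VP PP
        Det -> 'the' | 'a'
        N   -> 'man' | 'telescope'
        V   -> 'saw'
        P   -> 'with'
    """)
    parser = nltk.ChartParser(pp_grammar)
    for tree in parser.parse("I saw the man with a telescope".split()):
        print(tree)   # prints two trees, one per attachment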

Part-B
UNIT-I & II
2.(a) List various applications of NLP.
Text-based applications involve the processing of written text, such as books, newspapers, reports, manuals,
e-mail messages, and so on. These are all reading-based tasks.
Text-based natural language applications are:
Finding appropriate documents on certain topics from a database of texts
Extracting information from messages or articles on certain topics
Translating documents from one language to another
Summarizing texts for certain purposes
Machine Translation
Fighting Spam
Information Extraction
Summarization
Sentiment Analysis
Text Classification

Dialogue-based applications involve human-machine communication. Typical potential applications include


Question-answering systems, where natural language is used to query a database
Automated customer service over the telephone
Tutoring systems, where the machine interacts with a student
Spoken language control of a machine
General cooperative problem-solving systems

(b) Draw the parse tree for the following sentence. the visitor saw the old painting in the den.
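The parse-tree figure has not survived here; one plausible bracketing, attaching the PP "in the den" to the object NP (it could equally attach to the VP), is:

    (S (NP (ART the) (N visitor))
       (VP (V saw)
           (NP (ART the) (ADJ old) (N painting)
               (PP (P in)
                   (NP (ART the) (N den))))))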
(c) Natural Language Understanding requires a capability to represent and reason about knowledge of
the world? Justify?
Knowledge representation and reasoning (KR) is the field of artificial intelligence (AI) dedicated to representing
information about the world in a form that a computer system can utilize to solve complex tasks such as diagnosing a
medical condition or having a dialog in a natural language. Knowledge representation incorporates findings from
psychology about how humans solve problems and represent knowledge in order to design formalisms that will
make complex systems easier to design and build. Knowledge representation and reasoning also incorporates
findings from logic to automate various kinds of reasoning, such as the application of rules or the relations of sets
and subsets.

Examples of knowledge representation formalisms include semantic nets, systems architecture, frames, rules, and
ontologies. Examples of automated reasoning engines include inference engines, theorem provers, and classifiers.

The justification for knowledge representation is that conventional procedural code is not the best formalism to use
to solve complex problems. Knowledge representation makes complex software easier to define and maintain than
procedural code and can be used in expert systems.
Knowledge representation goes hand in hand with automated reasoning because one of the main purposes of
explicitly representing knowledge is to be able to reason about that knowledge, to make inferences, assert new
knowledge, etc. Virtually all knowledge representation languages have a reasoning or inference engine as part of the
system. A key trade-off in the design of a knowledge representation formalism is that between expressivity and
practicality. The ultimate knowledge representation formalism in terms of expressive power and compactness is First
Order Logic (FOL). There is no more powerful formalism than that used by mathematicians to define general
propositions about the world. However, FOL has two drawbacks as a knowledge representation formalism: ease of
use and practicality of implementation. First order logic can be intimidating even for many software developers.
Languages which do not have the complete formal power of FOL can still provide close to the same expressive power
with a user interface that is more practical for the average developer to understand. The issue of practicality of
implementation is that FOL in some ways is too expressive. With FOL it is possible to create statements (e.g.
quantification over infinite sets) that would cause a system to never terminate if it attempted to verify them.
or
3. (a) What are Inflectional form & Derivational forms?
The study of morphology concerns the construction of words from more basic components. There are two
basic ways that new words are formed, traditionally classified as inflectional forms and derivational forms.
Inflectional forms use a root form of a word and typically add a suffix so that the word appears in the
appropriate form given the sentence. Verbs are the best examples of this in English. Each verb has a basic
form that then is typically changed depending on the subject and the tense of the sentence. For example,
the verb sigh will take suffixes such as -s, -ing, and -ed to create the verb forms sighs, sighing, and sighed,
respectively. These new words are all verbs and share the same basic meaning. Derivational morphology
involves the derivation of new words from other forms. The new words may be in completely different
categories from their subparts. For example, the noun friend is made into the adjective friendly by adding
the suffix -ly. A more complex derivation would allow you to derive the noun friendliness from the
adjective form. There are many interesting issues concerned with how words are derived and how the
choice of word form is affected by the syntactic structure of the sentence that constrains it.
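A hedged sketch of the inflectional example above as code; the suffix rules are deliberately naive and cover only regular verbs such as sigh:

    # Naive inflectional rules for regular English verbs (illustrative only;
    # real morphology needs spelling rules, e.g. 'try' -> 'tried').
    def inflect(root):
        return {"3sg": root + "s", "progressive": root + "ing", "past": root + "ed"}

    print(inflect("sigh"))   # {'3sg': 'sighs', 'progressive': 'sighing', 'past': 'sighed'}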

(b) What is Morphology?


Morphology - the study of the construction of words from primitive meaningful units.
Morpheme - the primitive unit of meaning in a language.
(c) What are Transition Networks? Represent a noun phrase segment of a transition network.
Parse the sentence "Kathy jumped the horse" using both top-down & bottom-up methods.
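A transition network consists of nodes (states) and labeled arcs; an arc labeled with a word category can be followed when the next input word belongs to that category, and a phrase is accepted if an accepting (pop) state is reached. Simple transition networks are equivalent to finite state machines. A noun-phrase segment can be sketched in text, following the conventions of Allen's Natural Language Understanding (the original figure has not survived), as:

    NP:   (NP) --art--> (NP1) --noun--> (pop)
                (NP1) --adj--> (NP1)        adjectives may loop before the noun

A top-down parse of "Kathy jumped the horse" starts from S, expands it to NP VP, and matches the words against the predicted categories; a bottom-up parse starts from the word categories (NAME V ART N), reduces Kathy to NP and "the horse" to NP, then "jumped NP" to VP, and finally NP VP to S.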
UNIT- II & III
4. (a) What do you mean by Unification of two literals?
Unification is a "pattern matching" procedure that takes two atomic sentences, called literals, as input, and returns
"failure" if they do not match. All variables in the two literals are implicitly universally quantified. To make
literals match, replace (universally quantified) variables by terms.
Two feature structures A and B unify (A ⊔ B) if they can be merged into one consistent feature structure C;
otherwise, unification fails.
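A hedged sketch of feature-structure unification using NLTK's FeatStruct (this assumes NLTK is available; the feature names are illustrative):

    from nltk.featstruct import FeatStruct

    fs1 = FeatStruct(NUMBER="sg", PERSON=3)
    fs2 = FeatStruct(NUMBER="sg", CASE="nom")
    print(fs1.unify(fs2))   # consistent: one merged structure with all three features

    fs3 = FeatStruct(NUMBER="pl")
    print(fs1.unify(fs3))   # None: NUMBER values clash, so unification fails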

(b) Describe various efficient techniques for encoding ambiguity.


Local Ambiguity Packing: local ambiguity arises when a portion of the input sentence can be
parsed and reduced to a particular grammar category in multiple ways. Packing stores these multiple sub-parses
in a single common data structure, indexed by a single pointer. Any constituent further up in the
parse tree can then refer to the whole set of sub-parses via this single pointer instead of referring to each
sub-analysis individually (a data-structure sketch follows this list).
Chart parsing also supports encoding ambiguity compactly.
The grammar can be modified so that the ambiguities become semantic ambiguities rather than syntactic
ambiguities.
Attachment ambiguities can be addressed by canonical interpretations: rather than generating all
interpretations, a single interpretation is constructed with the widest scope.
D-Theory (or Description Theory) approach - here the meaning of the output of the parser is modified. It
defines a transitive dominates relation for the encoding process, and the output is taken as a set of
dominance relationships.
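A minimal sketch of local ambiguity packing as a data structure (hypothetical code, not a particular parser's API): all sub-parses of the same span and category live under one key, so larger constituents hold a single pointer.

    # Packed chart: one (category, start, end) key -> list of alternative sub-parses.
    chart = {}

    def pack(category, start, end, analysis):
        chart.setdefault((category, start, end), []).append(analysis)

    # Two hypothetical analyses of the same NP span share one entry:
    pack("NP", 2, 7, ("NP", "the man", ("PP", "with a telescope")))
    pack("NP", 2, 7, ("NP", "the man with a telescope"))
    print(chart[("NP", 2, 7)])   # one pointer, two packed sub-parses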

(c) Describe Minimal attachment and Right Association with an example for each.
The minimal attachment principle states that there is a preference for the syntactic analysis that creates the smallest
number of nodes in the parse tree: try to group the latest words received under existing category nodes;
otherwise, build a new category.
Example:
The man kept the dog in the house

Minimal attachment predicts the reading in which the PP "in the house" attaches to the VP (that is, it describes where
the dog was kept) rather than to the NP "the dog", since this analysis creates fewer nodes; that preference probably
agrees with your intuition.
Right Association or Late Closure

This principle states that, all other things being equal, new constituents tend to be interpreted as being part of the
current constituent under construction (rather than part of some constituent higher in the parse tree).

Example:
George said that Henry left in his car.

The preferred interpretation is that Henry left in the car, rather than that George spoke in the car; the right
association principle prefers the former because "in his car" attaches to the most recent constituent, "left".


or
5. Write short notes on the following
a. Shift Reduce Parser.
A shift-reduce parser uses techniques that encode uncertainty, so that the parser need not make an arbitrary choice
and later backtrack. Rather, the uncertainty is passed forward through the parse to the point where the input
eliminates all but one of the possibilities. If you did this explicitly at parse time, you would have an algorithm
similar to the breadth-first parser. Instead, all the possibilities are considered in advance, and the information is
stored in a table that controls the parser, resulting in parsing algorithms that can be much faster. These techniques
were developed for use with unambiguous context-free grammars, but they can be extended in various ways to make
them applicable to natural language parsing.
Shift Reduce Parser Elements
Parse Stack
Input Stack
Shift/Reduce Actions
Parse (Oracle) Table
Reduce Action: States that consist of a single rule with the dot at the far right-hand side, such as S2',
S -> NP VP •
indicate that the parser should rewrite the top symbols on the parse stack according to this rule. The newly
derived symbol (S in this case) is pushed onto the top of the input stack.
Shift Action:
Any other state, not containing any completed rules, is interpreted by the transition diagram. If the top input
symbol matches an arc, then it and the new state (at the end of the arc) are pushed onto the parse stack.
Example :

Consider parsing "The man ate the carrot". The initial state of the parser is
Parse Stack Input Stack
(S0) (The man ate the carrot)
Looking up the entry in the table for state S0 for the input ART
(the category of the word the), you see a shift action and a move to state S1:
Parse Stack Input Stack
(S1 ART S0) (man ate the carrot)
Looking up the entry for state S1 for the input N, you see a shift action
and a move to state S1':
Parse Stack Input Stack
(S1' N S1 ART S0) (ate the carrot)
Looking up the entry for state S1', you then reduce by rule 2.2, which removes the S1',
N, S1, and ART from the parse stack and adds NP to the input stack:

Parse Stack Input Stack


(S0) (NP ate the carrot)
Again, consulting the table for state S0 with input NP, you now do a shift and move to
state S2:
Parse Stack Input Stack
(S2 NP S0) (ate the carrot)
Next, the three remaining words all cause shifts and a move to a new state, ending up with the parse state:
Parse Stack Input Stack
(S1' N S1 ART S3 V S2 NP S0) ( )
The reduce action by rule 2.2 specified in state S1' pops the N and ART from the stack (thereby popping S1 and S1' as
well), producing the state:
Parse Stack Input Stack
(S3 V S2 NP S0) (NP)
You are now back at state S3, with an NP in the input, and after a shift to state S3', you reduce by rule 2.4,
producing:
Parse Stack Input Stack
(S2 NP S0) (VP)
Finally, from state S2 you shift to state S2' and reduce by rule 2.1, producing:
Parse Stack Input Stack
(S0) (S)
From this state you shift to state S0' and are in a position to accept the sentence.
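NLTK ships a simple (non-backtracking) shift-reduce parser; the following is a hedged sketch of the example sentence, with a grammar written to mirror rules 2.1-2.4 above (the grammar here is an assumption reconstructed from the trace, not the book's table):

    import nltk

    grammar = nltk.CFG.fromstring("""
        S   -> NP VP
        NP  -> ART N
        VP  -> V NP
        ART -> 'the'
        N   -> 'man' | 'carrot'
        V   -> 'ate'
    """)
    sr = nltk.ShiftReduceParser(grammar, trace=2)   # trace prints each shift/reduce step
    for tree in sr.parse("the man ate the carrot".split()):
        print(tree)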

b. POS
The process of assigning a part-of-speech or lexical class marker to each word in a corpus:

WORDS: the, koala, put, the, keys, on, the, table
TAGS: N, V, P, DET
(each word in the corpus is assigned one tag from the tagset)
Applications for POS Tagging
Speech synthesis pronunciation: lead vs. lead, INsult vs. inSULT, OBject vs. obJECT, OVERflow vs. overFLOW,
DIScount vs. disCOUNT, CONtent vs. conTENT
Parsing: e.g. Time flies like an arrow - is flies an N or a V?
Word prediction in speech recognition: possessive pronouns (my, your, her) are likely to be followed by nouns;
personal pronouns (I, you, he) are likely to be followed by verbs
Machine Translation

To do POS tagging, one first needs to choose a set of tags. We could pick a very coarse (small) tagset: N, V, Adj, Adv.
A richer option is the Brown Corpus tagset of 87 tags: more informative, but more difficult to tag with. Most commonly
used is the Penn Treebank tagset of 45 tags.
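A hedged sketch of tagging with NLTK's default tagger, which uses the Penn Treebank tagset (this assumes the tokenizer and tagger models have been downloaded):

    import nltk

    tokens = nltk.word_tokenize("Time flies like an arrow")
    print(nltk.pos_tag(tokens))   # Penn Treebank tags; the tagger must decide
                                  # whether 'flies' is a noun or a verb here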

c. Viterbi
The Viterbi algorithm is used to compute the most probable state path (as well as its probability). It requires
knowledge of the parameters of the HMM and a particular output sequence, and it finds the state sequence that is
most likely to have generated that output sequence. It works by finding a maximum over all possible state sequences.
In sequence analysis, this method can be used, for example, to predict coding vs. non-coding sequences.
In fact there are often many state sequences that can produce the same output sequence, but with
different probabilities. It is possible to calculate the probability of the HMM generating that output
sequence by summing over all possible state sequences. This can be done efficiently using the
Forward algorithm (or the Backward algorithm), which is also a dynamic programming algorithm. In sequence
analysis, this method can be used, for example, to predict the probability that a particular DNA region matches an
HMM motif (i.e., was emitted by the HMM). An HMM motif can represent, for example, a transcription factor (TF)
binding site.
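A compact sketch of the Viterbi recursion itself; the two-state toy HMM below (N/V tags emitting three words, with made-up probabilities) is hypothetical, purely for illustration:

    # Viterbi: V[t][s] holds (best probability of a path ending in state s at time t,
    # back-pointer to the previous state on that best path).
    def viterbi(obs, states, start_p, trans_p, emit_p):
        V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
        for t in range(1, len(obs)):
            V.append({})
            for s in states:
                prob, prev = max(
                    (V[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                    for p in states)
                V[t][s] = (prob, prev)
        # Backtrack from the most probable final state.
        prob, state = max((V[-1][s][0], s) for s in states)
        path = [state]
        for t in range(len(obs) - 1, 0, -1):
            state = V[t][state][1]
            path.insert(0, state)
        return prob, path

    states = ("N", "V")
    start_p = {"N": 0.6, "V": 0.4}
    trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
    emit_p = {"N": {"time": 0.5, "flies": 0.3, "fast": 0.2},
              "V": {"time": 0.1, "flies": 0.6, "fast": 0.3}}
    print(viterbi(("time", "flies", "fast"), states, start_p, trans_p, emit_p))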

ADITYA COLLEGE OF ENGINEERING


PUNGANUR ROAD, MADANAPALLE-517325
IV-B.Tech (R13) II Sem - I Internal Examinations FEB-2017 (Objective) (CODE A)
(13A05802) NATURAL LANGUAGE PROCESSING (Computer Science & Engineering)
Name: Roll No:

Time: 20 min Max Marks: 10

Answer all the questions. 5 × 1 = 5M


1. Define particle.
Particles are words that help to construct certain verb forms. Particles generally overlap with the class of
prepositions.
2. Describe ELIZA.
ELIZA is an early natural language processing computer program created from 1964 to 1966 at the MIT
Artificial Intelligence Laboratory by Joseph Weizenbaum. Created to demonstrate the superficiality of
communication between man and machine, ELIZA simulated conversation by using a pattern-matching and
substitution methodology that gave users an illusion of understanding on the part of the program, but it had
no built-in framework for contextualizing events.
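A hedged, miniature sketch of the pattern-matching-and-substitution idea (one made-up rule, not Weizenbaum's actual script):

    import re

    # One illustrative ELIZA-style rule: match a pattern, reflect it back.
    def respond(utterance):
        m = re.match(r"i am (.*)", utterance.lower())
        if m:
            return f"Why do you say you are {m.group(1)}?"
        return "Please go on."

    print(respond("I am unhappy"))   # Why do you say you are unhappy?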
3. Define Feature.
Feature structures form the basis for many grammar formalisms used in computational linguistics.
Feature structure grammars (aka attribute-value grammars, or unification grammars) can be used as a more
compact way of representing rich CFGs and as a way to represent more expressive grammars.
4. What is Discourse?
It deals with how the immediately preceding sentence can affect the interpretation of the next sentence.
5. What are the elements of a shift-reduce parser?
Parse Stack
Input Stack
Shift/Reduce Actions
Parse (Oracle) Table
Choose the correct answer for the following questions. 10 × ½ = 5M
1. _______________ are the words that attribute qualities to objects yet do not refer to the qualities themselves
a) adjectives b) nouns c) pronouns d) adverbs [ a ]
2. ____ mass or count nouns used to modify another noun.
a) noun modifiers b) Noun Phrases c) PRO d) None [ a ]
3. A simple VP may consist of some adverbial modifiers followed by the ______ and its complements.
a) head verb b) Noun c) Verb d) all the above [ a ]
4. A simple declarative sentence consists of an NP, the subject, followed by a verb phrase (VP), the predicate.
a) True b) False [ a ]
5. Transitive verbs allow another form of verb group called the _____, which is constructed using a be auxiliary
followed by the past participle.
a) passive form b) active form c) noun form d) None [ a ]
6. Simple transition networks are often called finite state machines (FSMs).
a) True b) False [ a ]
7. Popular method of building a parser for CFGs is to encode the rules of the grammar directly in a logic
programming language such as
a) PROLOG b) LISP c) COBOL d) LOGIC [ a ]
8. An instance of the word are, however, may agree with second person singular or any of the plural forms, so its
AGR feature would be a variable ranging over the values
a) {2s 1p 2p 3p} b) {2p 1p 2p 3p} c) {2s 1p 2s 3p} d) {2s 1p 2p 3s} [ a ]
9. The unification-based formalism can be defined precisely by representing feature structures as
a)DAGs b) FSM c) DFSM d) All the above. [ a ]
10. Natural Language Processing (NLP) is a field of
a) Computer Science b) Artificial Intelligence c) Linguistics d) All of the mentioned [ d ]
