NLP Question Bank

The document is a comprehensive question bank covering various topics in Natural Language Processing (NLP), including definitions, theories, models, and applications. It consists of multiple modules, each containing questions on NLP concepts, parsing techniques, machine learning classifiers such as Naïve Bayes, information retrieval systems, and machine translation. The questions range from theoretical explanations to practical applications and algorithms, providing a thorough overview of the field.


Module 1:

1. Define NLP and explain the different approaches to NLP.


2. Describe the following:
a) X-bar Theory b) Projection principle c) Theta role d) Sub-categorization

3. Explain Transformational Grammar (all components) with an example.


4. Interpret the advantages of using Government and Binding theory over Transformational
Grammar.
5. Infer the different modeling approaches with explanations, and explain any 4 NLP
applications.
6. What makes Natural Language Processing difficult? Contrast the differences
between English and Indian languages.
7. List and explain the challenges of NLP.
8. Illustrate with suitable examples the different levels of NLP.
9. Explain the role of transformational rules in transformational grammar with the
help of an example.
10. Explain the Statistical Language Model and find the probability of the test sentence
P(they play in a big garden) in the following training set using the bi-gram model
(a computational sketch follows the training set):
There is a big garden
Children play in the garden.
They play inside beautiful garden.
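A minimal Python sketch of the bi-gram computation for Q10. The whitespace tokenization, lowercasing, and <s>/</s> sentence padding are illustrative assumptions; a solution that ignores sentence boundaries will give a different value.

```python
from collections import Counter

# Training corpus from Q10, lowercased; tokenization is a simple split.
corpus = [
    "there is a big garden",
    "children play in the garden",
    "they play inside beautiful garden",
]
unigrams, bigrams = Counter(), Counter()
for line in corpus:
    tokens = ["<s>"] + line.split() + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p(word, prev):
    # MLE estimate: P(word | prev) = C(prev, word) / C(prev)
    return bigrams[(prev, word)] / unigrams[prev]

test = ["<s>"] + "they play in a big garden".split() + ["</s>"]
prob = 1.0
for prev, word in zip(test, test[1:]):
    prob *= p(word, prev)
# The unseen bigram ("in", "a") drives the unsmoothed probability to 0,
# which is exactly what motivates the smoothing questions below.
print(prob)
```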

11. Explain applications of NLP.


12. List the problems associated with the n-gram model. Explain how these problems are
handled.
13. What are Karaka relations? Explain Karaka theory with example.
14. Explain the n-gram model. How is the data sparseness problem handled in the n-gram model?
15. Explain with example binding theory.
16. Consider the following training set:
The Arabian Knights.
These are the fairy tales of the east.
The stories of the Arabian Knights are translated in many languages.
Find the probability of the following test sentence using bi-gram model:
“The Arabian Knights are the fairy tales of the east.”
17. List and explain different phases of analysis in NLP with an example for each.
18. Write regular expressions for the following (a sketch follows this list):
a) To accept the strings book or books.
b) To accept colour and color.
c) To accept any positive integer with an optional decimal point.
d) To check whether a string is an email address or not.
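One possible set of answers for Q18, written as Python regular expressions. The whole-string anchors (^...$) and the deliberately simplified email pattern are assumptions; production email validation is far more involved.

```python
import re

book   = re.compile(r"^books?$")                      # a) book or books
colour = re.compile(r"^colou?r$")                     # b) colour or color
posint = re.compile(r"^\d+(\.\d+)?$")                 # c) positive integer, optional decimal part
email  = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")  # d) simplified email check

for pattern, s in [(book, "books"), (colour, "color"),
                   (posint, "42.5"), (email, "user@example.com")]:
    print(s, bool(pattern.match(s)))   # all True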
19. Identify the phrase type (noun phrase, verb phrase, adjective phrase) of the
following sentence segments:
Important to Bill
Looked up the tree
20. Construct the Surface structure and Deep Structure for the following sentences:
The police will catch the snatchers.
She saw stars in the sky.
21. Explain the different levels of NLP with example.
22. Explain the Paninian framework and its issues.
23. Explain different smoothing techniques to handle the data sparseness problem in
the n-gram model (a sketch of add-one smoothing follows).
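For reference on Q23, a minimal sketch of the simplest technique, Laplace (add-one) smoothing of the bi-gram estimate, where V is the vocabulary size; add-k, Good-Turing, and backoff/interpolation are the usual alternatives.

```python
def laplace_bigram(c_bigram, c_prev, V):
    # Add-one smoothed estimate: P(w | prev) = (C(prev, w) + 1) / (C(prev) + V)
    # Unseen bigrams now get a small non-zero probability instead of 0.
    return (c_bigram + 1) / (c_prev + V)
```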
24. Explain Lexical Functional Grammar (LFG)
25. Write the c-structure and f-structure for the following sentence “ she saw stars”.
Consider the CFG rules:
S → NP VP
VP → V {NP} {NP} PP* {Sꞌ}
PP → P NP
NP → Det N {PP}
Sꞌ → comp S

Module 2:
1. Explain the working of two-step morphological parser. Write a simple Finite State
Transducer (FST) for mapping English nouns.
2. Illustrate part-of-speech (POS) tagging and explain the different categories of POS tagging.
3. Explain the Minimum Edit Distance (MED) algorithm and compute the minimum edit
distance between EXECUTION and INTENTION.
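A sketch of the dynamic-programming (Wagner-Fischer) solution for Q3. Whether substitutions cost 1 or 2 depends on the textbook convention, so both are shown.

```python
def min_edit_distance(source, target, sub_cost=1):
    """Wagner-Fischer dynamic programming; insertions and deletions cost 1."""
    n, m = len(source), len(target)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i                                # delete all of source
    for j in range(1, m + 1):
        d[0][j] = j                                # insert all of target
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else sub_cost
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[n][m]

print(min_edit_distance("INTENTION", "EXECUTION"))              # 5 with unit costs
print(min_edit_distance("INTENTION", "EXECUTION", sub_cost=2))  # 8 if substitutions cost 2
```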
4. Design the CYK algorithm. Tabulate the sequence of states created by the CYK algorithm
while parsing “A pilot likes flying planes.”
Consider the following simplified grammar in CNF (a recognizer sketch follows the grammar):
S → NP VP
NP → DT NN
NP → JJ NNS
VP → VBG NNS
VP → VBZ NP
DT → a
NN → pilot
NNS → planes
JJ → flying
VBG → flying
VBZ → likes
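A compact CYK recognizer for Q4 over the CNF grammar above. The chart representation is an illustrative choice; the states to tabulate correspond to the non-empty chart cells.

```python
# Terminal productions: word -> set of non-terminals (from the grammar above).
unary = {
    "a": {"DT"}, "pilot": {"NN"}, "likes": {"VBZ"},
    "flying": {"VBG", "JJ"}, "planes": {"NNS"},
}
# Binary productions A -> B C.
binary = [
    ("S", "NP", "VP"), ("NP", "DT", "NN"), ("NP", "JJ", "NNS"),
    ("VP", "VBG", "NNS"), ("VP", "VBZ", "NP"),
]

def cyk(words):
    n = len(words)
    # table[i][j] holds the non-terminals deriving words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):                    # length-1 spans
        table[i][i + 1] = set(unary.get(w, ()))
    for span in range(2, n + 1):                     # longer spans, bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                # every split point
                for a, b, c in binary:
                    if b in table[i][k] and c in table[k][j]:
                        table[i][j].add(a)
    return table

words = "a pilot likes flying planes".split()
chart = cyk(words)
print("S" in chart[0][len(words)])   # True: the sentence is in the language
```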
5. Explain top-down parsing and bottom-up parsing with an example.
6. List out the disadvantages of Probabilistic Context Free Grammar (PCFG).
7. Explain the Minimum Edit Distance (MED) algorithm and compute the minimum edit
distance between ‘tumour’ and ‘tutor’.
8. List POS tagging methods. Explain the rule-based tagger with an example.
9. Explain Probabilistic CYK algorithm. List any two problems associated with PCFG.
10. Write a short note on:
a) Phrase level construction.
b) Sentence level construction.
11. The parse tree for the sentence “A restaurant serves dosa” is given below. Perform
semantic analysis and show the semantic interpretations of the constituents. Explain
the process. (The parse tree is not reproduced in this text.)

12. Perform parsing using simple top-down parsing for the sentence “The dogs cried”,
using the grammar given below (a sketch follows the grammar):
S → NP VP
NP → ART N
NP → ART ADJ N
VP → V
VP → V NP
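A toy recursive-descent sketch of the top-down parse in Q12. The lexicon entries are assumptions, and a real parser would build the tree rather than merely accept the sentence.

```python
grammar = {
    "S":  [["NP", "VP"]],
    "NP": [["ART", "N"], ["ART", "ADJ", "N"]],
    "VP": [["V"], ["V", "NP"]],
}
lexicon = {"the": "ART", "dogs": "N", "cried": "V"}  # assumed lexicon

def parse(symbols, tokens):
    # True if the symbol sequence can derive exactly the token sequence.
    if not symbols:
        return not tokens
    head, rest = symbols[0], symbols[1:]
    if head in grammar:                       # non-terminal: try each rule in order
        return any(parse(rule + rest, tokens) for rule in grammar[head])
    # terminal (POS tag): must match the category of the next token
    return bool(tokens) and lexicon.get(tokens[0]) == head and parse(rest, tokens[1:])

print(parse(["S"], "the dogs cried".split()))  # True
```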
13. Derive a top-down, depth-first, left-to-right parse tree for the given sentence:
The angry bear chased the frightened little squirrel
Use the following grammar rules to create the parse tree. (The grammar rules are not
reproduced in this text.)

14.
15. What is meant by spelling correction? How are spelling errors generated, and what
different algorithms are used for spelling correction?
16. Explain, with a neat diagram, the Finite State Transducer.
17. What are unknown words?
18. Explain the different sources used for a morphological parser.
19. Explain the minimum edit distance algorithm.
20. Compute the minimum edit distance between “tutor” and “tumor”, and between “peaceful”
and “peaceful”; also write the algorithm.
21. What is morphological parsing? Explain the two-level morphological model with an example.
22. Explain the different spelling correction algorithms
23. Explain the different types of parsers
i. Rule based parsers
ii. Stochastic parsers
iii. Hybrid parsers
24. What is meant by POS tagging? How can it be done?
25. Explain CFG with a sample rule and parse tree
26. What is parsing? Explain Top down and Bottom up parsing with an example parse tree
with the grammar rules used.
27. Explain the basic top-down parser and the advantage of using it, with the required
algorithm.
28. Comment on the validity of the following statements:
a) Rule-based taggers are non-deterministic.
b) Stochastic taggers are language dependent.
29. Construct the parse tree for the sentence “The girls plucked the flower with a long
stick.” Discuss the ambiguity that arises from the parse tree.
30. Explain the different smoothing techniques to handle the data sparseness problem in
the n-gram model.
31. Explain the statistical n-gram model with an example sentence for word prediction.
32. Explain Spelling Correction Algorithms.
33. Explain Hybrid tagger.
34. With example, explain basic top down, depth first algorithm.
35. Explain the Levenshtein minimum edit distance algorithm.

Module 3:
1. Assume the following likelihoods for each word being part of a positive or negative
movie review, and equal prior probabilities for each class. (The likelihood table is
not reproduced in this text.)
What class will Naive Bayes assign to the sentence “I always like foreign
films.”?
2. Given the following short movie reviews, each labeled with a genre, either
comedy or action:
1. fun, couple, love, love (comedy)
2. fast, furious, shoot (action)
3. couple, fly, fast, fun, fun (comedy)
4. furious, shoot, shoot, fun (action)
5. fly, fast, shoot, love (action)
and a new document D: fast, couple, shoot, fly
Compute the most likely class for D. Assume a naive Bayes classifier and use
add-1 smoothing for the likelihoods.
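A worked sketch for Q2: multinomial naive Bayes with add-1 smoothing over the five training documents. Priors are estimated from document counts, and unnormalized posteriors are compared; the class names and data come from the question.

```python
from collections import Counter

train = [
    ("fun couple love love", "comedy"),
    ("fast furious shoot", "action"),
    ("couple fly fast fun fun", "comedy"),
    ("furious shoot shoot fun", "action"),
    ("fly fast shoot love", "action"),
]
counts = {"comedy": Counter(), "action": Counter()}  # word counts per class
docs = Counter()                                     # document counts per class
for text, label in train:
    counts[label].update(text.split())
    docs[label] += 1
vocab = {w for c in counts.values() for w in c}      # 7 distinct words here

def score(doc, label):
    # Unnormalized posterior: prior * product of add-1 smoothed likelihoods.
    prior = docs[label] / sum(docs.values())
    total = sum(counts[label].values())
    p = prior
    for w in doc.split():
        p *= (counts[label][w] + 1) / (total + len(vocab))
    return p

d = "fast couple shoot fly"
print({c: score(d, c) for c in counts})  # 'action' scores higher for D
```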
3. Explain the working of the Naïve Bayes classifier in text classification. How does the
assumption of conditional independence impact its performance?
4. Derive the mathematical formulation of Naïve Bayes classification and explain how
it can be applied to sentiment analysis.
5. Describe the training process of a Naïve Bayes classifier. What are the key steps
involved in estimating probabilities from text data?
6. Consider a dataset with positive and negative movie reviews. Explain with a worked
example how the Naïve Bayes classifier determines the sentiment of a new review.
7. Discuss the role of Laplace smoothing in training a Naïve Bayes classifier. How does
it help in handling zero probabilities?
8. Train two models, multinomial naive Bayes and binarized naive Bayes, both
with add-1 smoothing, on the following document counts for key sentiment
words, with positive or negative class assigned as noted.
doc “good” “poor” “great” (class)
d1. 3 0 3 pos
d2. 0 1 2 pos
d3. 1 3 0 neg
d4. 1 5 2 neg
d5. 0 2 0 neg
Use both naive Bayes models to assign a class (pos or neg) to this sentence:
A good, good plot and great characters, but poor acting.
9. What are the key challenges in optimizing Naïve Bayes for sentiment analysis? How
can feature engineering improve classification accuracy?
10. Compare and contrast the performance of Naïve Bayes with other machine learning
models for text classification. When is Naïve Bayes a preferred choice?
11. Explain how Naïve Bayes can be extended beyond sentiment analysis to other text
classification tasks, such as spam detection or topic categorization.
12. Describe how Naïve Bayes can be used as a language model. What are the
advantages and limitations of this approach?
13. Discuss the limitations of Naïve Bayes in handling complex linguistic structures and
dependencies in text. How can these limitations be mitigated?
14. Explain the fundamental concept of the Naïve Bayes classifier. How does it work,
and why is it called ‘Naïve’?
15. Derive the Naïve Bayes formula using Bayes’ Theorem. Explain the assumptions
made in Naïve Bayes classification.
16. Discuss the advantages and disadvantages of the Naïve Bayes classifier in
comparison to other machine learning algorithms for text classification.
17. Describe the steps involved in training a Naïve Bayes classifier for text classification.
What preprocessing techniques are commonly used?
18. Explain how probability estimates for words in a Naïve Bayes classifier are
computed. What challenges arise in estimating these probabilities?
19. What is Laplace smoothing (additive smoothing) in Naïve Bayes? Why is it
necessary, and how does it affect classification performance?
20. Given a set of labeled text documents, illustrate the step-by-step working of a Naïve
Bayes classifier for text classification using a numerical example.
21. Explain how the prior probability and likelihood are computed in Naïve Bayes text
classification with an example dataset.
22. Discuss feature selection techniques that can improve the accuracy of a Naïve Bayes
classifier in sentiment analysis.
23. What role does the choice of n-grams (unigrams, bigrams, trigrams) play in Naïve
Bayes sentiment analysis? Provide examples to illustrate.
24. Explain how stopword removal, stemming, and lemmatization impact the
performance of a Naïve Bayes sentiment analysis model.
25. Describe how Naïve Bayes can be applied to spam detection. What are the key
features used in such a model?
26. How can Naïve Bayes be adapted for topic classification in large datasets? Discuss
its scalability and efficiency.
27. Discuss the application of Naïve Bayes for document classification in Natural
Language Processing. How does it compare with other approaches?
28. Explain how Naïve Bayes can be used as a probabilistic language model. What are its
strengths and limitations in this context?

Module 4:
1. Explain design features of information retrieval systems with a neat diagram.
2. Define term weighting. Consider a document represented by the 3 terms {tornado,
swirl, wind} with raw tf 4, 1 and 1 respectively. In a collection of 100 documents,
15 documents contain the term tornado, 20 contain swirl, and 10 contain wind. Find the
idf and term weight of the 3 terms (a computational sketch follows).
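A short sketch of the computation in Q2, assuming idf = log10(N / df) and weight = tf * idf; the logarithm base varies across textbooks, so treat base 10 as an assumption.

```python
import math

N = 100                                      # documents in the collection
tf = {"tornado": 4, "swirl": 1, "wind": 1}   # raw term frequencies
df = {"tornado": 15, "swirl": 20, "wind": 10}  # document frequencies

for term in tf:
    idf = math.log10(N / df[term])           # e.g. wind: log10(100/10) = 1.0
    print(term, round(idf, 3), round(tf[term] * idf, 3))
```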
3. Explain the benefits of eliminating stop words. Give an example in which stop word
elimination may be harmful.
4. List different IR models. Explain the classical Information Retrieval models.
5. Explain WordNet and list the applications of WordNet.
6. Explain six criteria that can be used for evaluation of IR (Information Retrieval)
System.
7. Write a short note on:
i. Indexing
ii. Eliminating stop words.
iii. Stemming
iv. Zipf’s law
8. Define the following with respect to Information Retrieval:
i. Vector Space Model
ii. Term Frequency
iii. Inverse Document Frequency
9. Explain the architecture of an Information Retrieval system with a neat diagram.
10. Write the hypernym chain for “RIVER” extracted from WordNet 2.0.
11. How does stemming affect the performance of IR systems?
12. With an example, explain the Boolean model for classical Information Retrieval.
13. Explain the Probabilistic model of Information Retrieval
14. Explain how stemming affects the performance of IR system.
15. Explain Non-classical model of IR (Information Retrieval).
16. Write a short note on:
i. WordNet
ii. FrameNet
iii. Stemmer
iv. POS tagging
17. A user submitted a query to an IR system. Of the first 15 documents returned by
the system, those ranked 1, 2, 5, 8, and 12 were relevant. Compute the non-interpolated
average precision for this retrieval. Assume there are six relevant documents
(a computational sketch follows).
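A minimal sketch for Q17: non-interpolated average precision sums the precision at each rank where a relevant document appears, then divides by the total number of relevant documents.

```python
relevant_ranks = [1, 2, 5, 8, 12]   # ranks of the relevant retrieved documents
total_relevant = 6                  # six relevant documents exist in total

# Precision at rank r after seeing the i-th relevant document is (i + 1) / r.
ap = sum((i + 1) / rank for i, rank in enumerate(relevant_ranks)) / total_relevant
print(round(ap, 4))  # 0.5861
```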
Module 5:

1. What are language divergences in machine translation? Explain different types of
divergences with examples.
2. Discuss language typology and its impact on machine translation. How does
structural variation among languages affect translation quality?
3. Compare and contrast word-order differences in English and other languages. How do
machine translation models handle these variations?
4. Explain morphological richness in different languages and its impact on machine
translation. How does MT handle languages with complex morphology?
5. Explain the working of the Encoder-Decoder architecture in neural machine
translation (NMT). How does it improve upon traditional MT approaches?
6. Discuss the role of the encoder in the encoder-decoder framework. How does it
represent input sentences for translation?
7. What are the limitations of a simple Encoder-Decoder model in machine translation?
How can they be addressed?
8. Compare statistical machine translation (SMT) and neural machine translation
(NMT). What advantages does NMT offer over SMT?
9. Describe the key components of an Encoder-Decoder model. How do recurrent neural
networks (RNNs) or transformers enhance its performance?
10. What is the role of attention mechanisms in encoder-decoder models for MT? Explain
with an example.
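To make Q10 concrete, a toy NumPy sketch of scaled dot-product attention, the mechanism used in transformer-based NMT. Shapes and values are illustrative; Bahdanau and Luong attention score queries against keys differently but follow the same weighted-sum pattern.

```python
import numpy as np

def attention(Q, K, V):
    # Similarity of each query to each key, scaled by sqrt of key dimension.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Numerically stable softmax over the source positions.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                 # weighted sum of values = context vector

rng = np.random.default_rng(0)
Q = rng.standard_normal((1, 4))        # one decoder query state
K = rng.standard_normal((5, 4))        # five encoder states as keys
V = rng.standard_normal((5, 4))        # and their values
print(attention(Q, K, V).shape)        # (1, 4): one context vector
```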
11. Compare the effectiveness of different types of attention mechanisms (e.g., Bahdanau
vs. Luong attention) in machine translation.
12. What challenges arise in machine translation for low-resource languages? Discuss
possible solutions.
13. How does transfer learning help in machine translation for low-resource languages?
Provide examples of successful implementations.
14. Discuss the use of data augmentation techniques such as back-translation in low-
resource machine translation.
15. How does multilingual machine translation help improve translation quality for low-
resource languages? Provide examples of multilingual NMT models.
16. Explain different evaluation metrics for machine translation, such as BLEU,
METEOR, and TER. What are their advantages and limitations?
17. What are human evaluation techniques in machine translation? How do they compare
to automated evaluation methods?
18. What are the challenges of evaluating machine translation quality? Discuss why
human evaluation is still necessary despite automated metrics.
19. Discuss bias in machine translation models. How do gender and cultural biases
manifest, and what measures can be taken to mitigate them?
20. How can ethical concerns such as misinformation and manipulation arise in machine
translation? Discuss strategies to ensure fairness and reliability in MT systems.
