NLP Unit 1

1. Explain what is NLP.

i. Natural Language Processing (NLP) is a branch of AI that enables computers to
understand, interpret, and generate human language in both text and speech.
ii. It bridges the gap between human communication and machine understanding,
supporting tasks like text classification, sentiment analysis, translation, speech
recognition, named entity recognition, and question answering.
iii. NLP has two major subfields: Natural Language Understanding (NLU), which focuses
on interpreting intent and context, and Natural Language Generation (NLG),
which creates human-like responses from data.
iv. A key strength of NLP is its ability to understand context, resolving ambiguities like
multiple meanings of a word (e.g., “bank”).
v. Modern NLP incorporates grammar, syntax, semantics, and pragmatics, enhanced by
deep learning and transformer-based models like BERT and GPT, enabling
nuanced understanding.
vi. It powers applications such as chatbots, virtual assistants, translators, voice-controlled
devices, and automated support systems.

2. What are the applications of NLP?

i. Sentiment Analysis: This involves analyzing text to determine the sentiment behind it—
whether it is positive, negative, or neutral. It is widely used in social media
monitoring, product reviews, and customer feedback analysis.

ii. Named Entity Recognition (NER): NER identifies and categorizes entities in text such as
names of people, organizations, locations, dates, and other proper nouns. It is used in
information extraction, news categorization, and search engines. The input to such a
model is generally text, and the output is the various named entities along with their
start and end positions. Named entity recognition is useful in applications such as
summarizing news articles and combating disinformation.

iii. Machine Translation: NLP powers automatic translation tools like Google Translate by
converting text from one language to another while preserving meaning and context.

iv. Speech Recognition: Converts spoken language into text. This is used in voice assistants
like Siri, Alexa, and Google Assistant, as well as in transcription services.

v. Text Summarization: Automatically generates a concise summary of a long document
while retaining its main ideas. It is useful for news articles, research papers, and legal
documents.

vi. Text Classification: Involves categorizing text into predefined labels, such as spam
detection in emails or topic classification in articles.

3. Explain what is Topic Modeling.

i. Topic modeling is an unsupervised Natural Language Processing (NLP) technique used to
automatically discover the hidden thematic structure or topics present in a corpus, which
is a large collection of text documents.
ii. It helps in understanding and organizing unstructured textual data by identifying patterns of
word co-occurrence and grouping similar words and documents under common topics.
iii. Unlike text classification, topic modeling does not rely on predefined labels, making it ideal
for exploring large, unlabeled datasets.
iv. One of the most popular algorithms used for topic modeling is Latent Dirichlet Allocation
(LDA), which assumes that each document in the corpus is a mixture of several topics,
and each topic is a distribution over words.
v. For example, if the corpus consists of thousands of news articles, topic modeling can
automatically detect topics such as politics, sports, technology, and health, without any
prior knowledge of the content.
vi. Topic modeling is widely applied in areas such as content recommendation, document
clustering, academic research, trend analysis, and organizing digital libraries.
vii. It allows users to summarize, explore, and extract meaningful themes from large volumes of
text, making it a powerful tool for text mining and information retrieval.
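
Example (illustrative): a minimal sketch of topic modeling with LDA, assuming the gensim
library is installed; the toy documents, topic count, and training parameters below are
made up for demonstration only.

    # LDA on a tiny, already-tokenized corpus (gensim assumed available)
    from gensim import corpora, models

    docs = [
        ["election", "vote", "government", "policy"],
        ["cricket", "match", "score", "team"],
        ["election", "policy", "minister", "vote"],
        ["football", "team", "goal", "match"],
    ]
    dictionary = corpora.Dictionary(docs)              # map each unique word to an integer id
    corpus = [dictionary.doc2bow(d) for d in docs]     # bag-of-words counts per document
    lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                          passes=20, random_state=0)
    for topic_id, words in lda.print_topics(num_words=4):
        print(topic_id, words)                         # top words of each discovered topic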

4. Explain what is tokenization.

i. Tokenization is a fundamental step in Natural Language Processing (NLP) that involves
breaking down a large piece of text into smaller units called tokens.
ii. These tokens can be words, sentences, or even characters, depending on the task.
iii. The main purpose of tokenization is to simplify the processing of text by converting it
into manageable pieces that a machine can understand and analyze.
iv. For example, consider the sentence: "NLP is very interesting."
Word-level tokenization would split it into: ["NLP", "is", "very", "interesting", "."]
v. Each of these tokens can then be used for further analysis such as part-of-speech
tagging, sentiment analysis, or machine translation.
vi. Tokenization is an important preprocessing step because it helps in standardizing text,
removing unnecessary punctuation, and preparing the data for algorithms to
extract meaning.
vii. It also helps in handling tasks like word frequency analysis, text classification, and
information retrieval. Without tokenization, machines would struggle to
understand the structure and meaning of natural language.
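
Example (illustrative): a minimal sketch using NLTK's tokenizers, assuming NLTK is
installed and the punkt tokenizer data has been downloaded once (the resource is named
"punkt" or "punkt_tab" depending on the NLTK version).

    import nltk
    nltk.download("punkt", quiet=True)   # one-time download of tokenizer data
    from nltk.tokenize import word_tokenize, sent_tokenize

    text = "NLP is very interesting. It has many applications."
    print(sent_tokenize(text))   # ['NLP is very interesting.', 'It has many applications.']
    print(word_tokenize(text))   # ['NLP', 'is', 'very', 'interesting', '.', 'It', ...]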

5. Explain what is Part-of-speech Tagging.

i. Part-of-speech (POS) tagging is a fundamental technique in Natural Language Processing
(NLP) where each word in a sentence is labeled with its correct grammatical category
or part of speech, such as noun, verb, adjective, adverb, pronoun, etc.
ii. The goal of POS tagging is to help computers understand the grammatical structure and
meaning of a sentence.
iii. For example, in the sentence: "The quick brown fox jumps over the lazy dog."
POS tagging would assign labels like:
The (Determiner), quick (Adjective), brown (Adjective), fox (Noun), jumps (Verb),
over (Preposition), the (Determiner), lazy (Adjective), dog (Noun)
iv. POS tagging is important because many words in English can have more than one
grammatical role depending on the context. For instance, the word "play" can be a
noun ("I watched the play") or a verb ("Children play outside").
v. POS taggers use rules or machine learning models to decide the correct tag based on the
context.
vi. This technique is used in many NLP applications such as syntactic parsing, information
extraction, machine translation, and text-to-speech systems.
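
Example (illustrative): a minimal sketch using NLTK's built-in perceptron tagger; the
resource names may vary slightly between NLTK versions.

    import nltk
    nltk.download("punkt", quiet=True)
    nltk.download("averaged_perceptron_tagger", quiet=True)
    from nltk import pos_tag, word_tokenize

    tokens = word_tokenize("The quick brown fox jumps over the lazy dog.")
    print(pos_tag(tokens))
    # e.g. [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'), ('jumps', 'VBZ'), ...]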

6. Explain what is Stemming.

i. Stemming is a text preprocessing technique used in Natural Language Processing (NLP) to
reduce a word to its root or base form by removing prefixes or suffixes.
ii. The resulting root word, called the stem, may not always be a valid word in the dictionary,
but it helps in treating related words as the same during analysis.
iii. For example: "running", "runs", and "runner" may all be reduced to a stem like "run",
and "happily", "happiness", and "happy" might be reduced to "happi". Irregular forms
such as "ran" are not handled, because stemming only strips affixes.
iv. Stemming helps in reducing the complexity of textual data by minimizing the number of
unique words.
v. This is especially useful in tasks like information retrieval, search engines, and text
classification, where different forms of a word should be treated similarly.
vi. One common stemming algorithm is the Porter Stemmer, which uses a set of rules to strip
suffixes from words.
vii. However, stemming is often considered a rough approach because it may produce stems
that are not actual words and can sometimes lead to over-stemming (merging
unrelated words) or under-stemming (failing to merge similar words).
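
Example (illustrative): a minimal sketch using NLTK's Porter stemmer.

    from nltk.stem import PorterStemmer

    ps = PorterStemmer()
    for word in ["running", "runs", "happily", "happiness", "happy"]:
        print(word, "->", ps.stem(word))
    # e.g. "running" -> "run", "happiness" -> "happi" (stems need not be dictionary words)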

7. Explain what is lemmatization.

i. Lemmatization is a text preprocessing technique in Natural Language Processing (NLP)
that involves reducing a word to its base or dictionary form, known as the lemma.
ii. Unlike stemming, which may simply chop off word endings, lemmatization uses linguistic
knowledge (such as vocabulary and grammar) to ensure that the root word is a valid
word with meaning.
iii. For example:
"running", "ran", and "runs" are all reduced to "run"
"better" is reduced to "good" (which cannot be done by simple stemming)
iv. Lemmatization considers the part of speech and the context of the word to determine its
correct base form.
v. This makes it more accurate than stemming, although it is also more computationally
intensive.
vi. Lemmatization is important for tasks like information retrieval, text classification,
question answering, and machine translation, where understanding the actual meaning
of words is essential.
vii. It helps in treating different forms of a word as a single item, improving the performance
and accuracy of NLP models.
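
Example (illustrative): a minimal sketch using NLTK's WordNet lemmatizer; the wordnet data
is downloaded once, and the part-of-speech hint ('v' = verb, 'a' = adjective) affects the
result.

    import nltk
    nltk.download("wordnet", quiet=True)
    from nltk.stem import WordNetLemmatizer

    wnl = WordNetLemmatizer()
    print(wnl.lemmatize("running", pos="v"))   # run
    print(wnl.lemmatize("ran", pos="v"))       # run
    print(wnl.lemmatize("better", pos="a"))    # good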

8. What is Chunking?

i. Chunking is a process in Natural Language Processing where groupings of related words
(called chunks) are formed based on part-of-speech (POS) tags.
ii. It is also known as shallow parsing because it captures partial syntactic structures rather
than complete parse trees.
iii. The most common type of chunking is identifying noun phrases (NPs) — groups of words
that act as a noun.
iv. For example, in the sentence:
"The quick brown fox jumps over the lazy dog."
Chunking might extract:
[The quick brown fox] → a noun phrase
[the lazy dog] → another noun phrase
v. Chunking helps in grouping meaningful pieces of information that can be further analyzed
in tasks like named entity recognition (NER), information extraction, and question
answering.
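
Example (illustrative): a minimal sketch of noun-phrase chunking with NLTK's RegexpParser;
the grammar "an optional determiner, any adjectives, then a noun" is a common textbook NP
pattern.

    import nltk

    tagged = [("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
              ("jumps", "VBZ"), ("over", "IN"),
              ("the", "DT"), ("lazy", "JJ"), ("dog", "NN")]
    chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN>}")
    print(chunker.parse(tagged))
    # e.g. (S (NP The/DT quick/JJ brown/JJ fox/NN) jumps/VBZ over/IN (NP the/DT lazy/JJ dog/NN))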

9. What is Chinking?

i. While chunking includes a sequence of words into a chunk, chinking removes specific
words or POS patterns from an already chunked phrase.
ii. It’s useful when the chunk includes unwanted words that should be excluded based on
their POS tags.
iii. For example, suppose a chunk includes all words between a determiner and a noun, but
you want to exclude verbs from that group. You would use chinking to "cut out" those
verbs.
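
Example (illustrative): a minimal sketch of chinking with NLTK, reusing a tagged sentence:
everything is first chunked, and verbs/prepositions are then chinked (cut out).

    import nltk

    tagged = [("The", "DT"), ("quick", "JJ"), ("fox", "NN"),
              ("jumps", "VBZ"), ("over", "IN"),
              ("the", "DT"), ("lazy", "JJ"), ("dog", "NN")]
    grammar = r"""
      NP:
        {<.*>+}            # chunk every sequence of tokens
        }<VB.*|IN>+{       # then chink verbs and prepositions out of the chunks
    """
    print(nltk.RegexpParser(grammar).parse(tagged))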

10. What is Named Entity Recognition (NER)?

i. Named Entity Recognition (NER) is a key technique in Natural Language Processing
(NLP) that involves identifying and classifying named entities in a text into
predefined categories such as person names, organizations, locations, dates, time
expressions, monetary values, percentages, and more.
ii. For example, in the sentence:
"Sachin Tendulkar scored a century at Wankhede Stadium in Mumbai on April 2,
2011."
NER would identify:
Sachin Tendulkar → Person
Wankhede Stadium → Location
Mumbai → Location
April 2, 2011 → Date
iii. The main goal of NER is to extract structured information from unstructured text, making
it easier for machines to understand and analyze data.
iv. It is widely used in applications like information retrieval, question answering systems,
news categorization, resume parsing, chatbots, and knowledge graph construction.
v. NER systems often rely on machine learning, deep learning, or rule-based approaches to
recognize and classify entities accurately.
vi. Advanced NER models can also handle context, disambiguate similar names, and even
detect new entity types in domain-specific texts.
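
Example (illustrative): a minimal sketch using spaCy, assuming the small English model has
been installed with "python -m spacy download en_core_web_sm"; entity labels may differ
slightly between model versions.

    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Sachin Tendulkar scored a century at Wankhede Stadium in Mumbai on April 2, 2011.")
    for ent in doc.ents:
        print(ent.text, ent.label_, ent.start_char, ent.end_char)   # entity, type, span positions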

11. Explain what do you mean by finding collocations.

i. Finding collocations in Natural Language Processing (NLP) refers to the process of
identifying pairs or groups of words that frequently occur together in a language more
often than would be expected by chance.
ii. These word combinations often form meaningful expressions or phrases and are important
for understanding natural language as they convey specific meanings.
iii. Examples of collocations include:
"fast food" (not quick food)
"make a decision" (not do a decision)
"strong tea" (not powerful tea)
iv. Collocations can be:
Bigrams (two-word combinations), e.g., "high school", "climate change"
Trigrams (three-word combinations), e.g., "New York City", "as soon as"
v. Finding collocations helps in:
Improving language models
Enhancing search engines
Generating more natural text in chatbots or translators
Reducing ambiguity in language processing
vi. Statistical methods such as Pointwise Mutual Information (PMI), t-score, and frequency
counts are commonly used to detect collocations by analyzing large corpora of text.
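
Example (illustrative): a minimal sketch using NLTK's collocation finder with PMI scoring;
the tiny token list stands in for a large corpus.

    from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

    tokens = ("we ordered fast food and strong tea then we ordered fast food "
              "again with strong tea").split()
    finder = BigramCollocationFinder.from_words(tokens)
    finder.apply_freq_filter(2)                          # keep bigrams seen at least twice
    print(finder.nbest(BigramAssocMeasures.pmi, 3))      # top bigrams ranked by PMI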

12. What are the steps in preprocessing for NLP.

i. Breaking sentences into tokens: Splitting text into smaller units like words or phrases for
easier analysis.

ii. Tagging parts of speech (POS): Assigning grammatical roles (noun, verb, adjective, etc.)
to each token based on context.
iii. Building an appropriate vocabulary: Creating a set of unique words or tokens present in
the text corpus.
iv. Linking the components of a created vocabulary: Mapping tokens to indices or
embeddings for computational processing.
v. Understanding the context: Analyzing surrounding words and structure to grasp the
meaning of tokens accurately.
vi. Extracting semantic meaning: Identifying the deeper meaning or intent behind words and
phrases.
vii. Named Entity Recognition (NER): Detecting and classifying proper nouns like names,
places, dates, etc., in text.
viii. Transforming unstructured data into structured data: Converting raw text into organized
formats suitable for machine learning.
ix. Ambiguity in speech: Addressing words or phrases that have multiple meanings
depending on the context.

13. Explain Word2Vec in detail.

i. Word2Vec is a widely used word embedding technique in Natural Language Processing
(NLP) that transforms words into meaningful continuous-valued vectors, capturing
their semantic and syntactic relationships.
ii. Developed by Tomas Mikolov at Google in 2013, Word2Vec allows machines to
understand the contextual meaning of words by placing similar words close to each
other in a high-dimensional vector space.
iii. The core idea behind Word2Vec is based on the distributional hypothesis — “words that
occur in similar contexts tend to have similar meanings.”
iv. Unlike traditional approaches like one-hot encoding (which creates sparse and high-
dimensional vectors), Word2Vec creates dense, low-dimensional vectors where each
word is represented by a fixed-length real-valued vector that reflects its usage in the
corpus.
v. Word2Vec uses a shallow, two-layer neural network and is trained using a large text
corpus. It has two main architectures:
a) Continuous Bag of Words (CBOW): This model predicts a target word based on its
surrounding context words. For example, in the sentence "The cat sits on the mat", if
the context is ["The", "sits", "on", "the", "mat"], CBOW will try to predict the word
"cat". CBOW is efficient for large datasets and works well with frequent words.
b) Skip-gram: This model does the opposite — it uses a target word to predict the context
words. For example, given the word "cat", Skip-gram will try to predict words like
"The", "sits", "on", etc. Skip-gram performs better for infrequent or rare words and
gives more accurate word representations.

vi. During training, Word2Vec uses optimization techniques such as Negative Sampling or
Hierarchical Softmax to improve efficiency when working with large vocabularies.
vii. One of the most powerful features of Word2Vec is its ability to capture linguistic
regularities and vector arithmetic. For example:
viii. Vector("King") - Vector("Man") + Vector("Woman") ≈ Vector("Queen")
ix. This means that the relationship between "king" and "man" is similar to the relationship
between "queen" and "woman".
x. These vector operations reflect real-world relationships and make Word2Vec highly useful
in tasks requiring semantic understanding.
xi. Word2Vec has several applications in NLP, including sentiment analysis, document
classification, machine translation, text clustering, question answering, and
recommendation systems.
xii. By converting unstructured text into structured vector representations, it enables
machines to understand word similarity, context, and meaning in a human-like way.
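
Example (illustrative): a minimal sketch using gensim (the 4.x API is assumed); the toy
corpus and parameters are for demonstration and would not produce useful vectors in
practice.

    from gensim.models import Word2Vec

    sentences = [
        ["the", "cat", "sits", "on", "the", "mat"],
        ["the", "dog", "sits", "on", "the", "rug"],
        ["a", "cat", "and", "a", "dog", "play"],
    ]
    # sg=1 selects Skip-gram; sg=0 selects CBOW
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
    print(model.wv["cat"].shape)                 # (50,) dense vector for "cat"
    print(model.wv.most_similar("cat", topn=3))  # nearest words in the learned vector space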

14. Explain CBOW in detail.

i. Continuous Bag of Words (CBOW) is one of the two main architectures used in the
Word2Vec model for learning word embeddings — the other being Skip-gram.
ii. The CBOW model aims to predict a target word based on its surrounding context words
within a given window size.
iii. It is called "Bag of Words" because the order of context words is ignored, and only their
presence matters.
iv. For example, consider the sentence: "The cat sits on the mat."
If we choose the context window size = 2, and the target word is "sits", then the
context words are ["The", "cat", "on", "the"]. The CBOW model will try to predict the
word "sits" using these four surrounding words.

v. How it Works:
a) Input Layer: The model takes the context words as input, which are first converted into
one-hot encoded vectors. These vectors are then mapped to dense representations
using a shared weight matrix (also known as the embedding matrix).

b) Hidden Layer: The embeddings of the context words are averaged to produce a single
vector representation. This step captures the general meaning of the context.

c) Output Layer: This averaged vector is passed through another weight matrix followed by a
softmax function to produce a probability distribution over the entire vocabulary.

d) Prediction: The model selects the word with the highest probability as the predicted target
word. During training, the model adjusts its weights to minimize the error between the
predicted and actual target word.
vi. Key Features:
a) CBOW is faster to train than Skip-gram because each context window produces a single
prediction (the averaged context predicts one target word), whereas Skip-gram makes a
separate prediction for every context word.

b) It is best suited for large corpora where most words appear often.

c) It captures the overall meaning of surrounding words rather than focusing on individual
pairwise relationships.

vii. Advantages:
a) Efficient and quick for large datasets.

b) Performs well when context is informative and words are frequent.

viii. Limitations:
a) Less effective for learning representations of rare words.

b) Since word order is ignored, it may lose some syntactic information.
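
Example (illustrative): a minimal NumPy sketch of a single untrained CBOW forward pass over
a toy vocabulary, mirroring steps a) to d) above; the weights are random, so the prediction
is meaningless until training updates them.

    import numpy as np

    vocab = ["the", "cat", "sits", "on", "mat"]
    word_to_idx = {w: i for i, w in enumerate(vocab)}
    V, D = len(vocab), 8                          # vocabulary size, embedding dimension

    rng = np.random.default_rng(0)
    W_in = rng.normal(size=(V, D))                # input (embedding) matrix
    W_out = rng.normal(size=(D, V))               # output weight matrix

    context = ["the", "cat", "on", "the"]         # window around the target word "sits"
    h = W_in[[word_to_idx[w] for w in context]].mean(axis=0)   # average context embeddings

    scores = h @ W_out                            # scores over the whole vocabulary
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                          # softmax -> probability of each target word
    print(vocab[int(np.argmax(probs))])           # predicted target (random until trained)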

15. Explain n-gram in detail.

16. What are the text preprocessing techniques.

i. Noise Removal -
Involves removing irrelevant information such as HTML tags, special characters,
emojis, URLs, or metadata that do not contribute to the meaning of the text.
ii. Tokenization -
Splits the text into smaller units called tokens (such as words, sentences, or
characters), which form the basis for further processing.
iii. Lowercasing -
Converts all characters in the text to lowercase to avoid treating the same words in
different cases (e.g., "Apple" and "apple") as separate tokens.
iv. Normalization (Stemming and Lemmatization) -
Stemming reduces words to their root form by chopping off suffixes (e.g., "playing"
→ "play").
Lemmatization returns the base or dictionary form of a word using linguistic
knowledge (e.g., "better" → "good").
v. Stop Word Removal -
Removes commonly used words (like is, the, and, a) that do not carry significant
meaning and are often considered irrelevant for analysis.
vi. Object Standardization -
Converts different forms of the same concept into a standard format (e.g., converting
"₹", "Rs.", and "INR" all to "rupees").
vii. Removing Punctuation -
Eliminates punctuation marks (like ., !, ?, ", etc.) which are generally not useful in
most text processing tasks.
viii. Removing Extra Whitespaces -
Trims unnecessary spaces, tabs, or newlines to ensure uniformity in the text and avoid
misleading tokenization.
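
Example (illustrative): a minimal sketch combining several of the steps above; the regular
expressions and the NLTK stop-word list are assumptions for demonstration, not an
exhaustive pipeline.

    import re
    import string
    import nltk
    nltk.download("stopwords", quiet=True)
    from nltk.corpus import stopwords

    def preprocess(text):
        text = re.sub(r"<[^>]+>|https?://\S+", " ", text)   # noise removal: HTML tags, URLs
        text = text.lower()                                 # lowercasing
        text = text.translate(str.maketrans("", "", string.punctuation))  # drop punctuation
        tokens = text.split()                               # whitespace tokenization
        stops = set(stopwords.words("english"))
        return [t for t in tokens if t not in stops]        # stop word removal

    print(preprocess("The <b>Apple</b> iPhone is GREAT!  See https://example.com"))
    # e.g. ['apple', 'iphone', 'great', 'see']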

17. Explain what is tokenization and what are the types of tokenization.

i. Tokenization is a fundamental step in Natural Language Processing (NLP) that involves
breaking down a large piece of text into smaller units called tokens.
ii. These tokens can be words, sentences, or even characters, depending on the requirement.
iii. The main goal of tokenization is to simplify the text for further analysis by allowing
algorithms to process and understand it in parts.
iv. For example, the sentence: "Natural Language Processing is fun!"
Can be tokenized into words:
["Natural", "Language", "Processing", "is", "fun", "!"]
v. Tokenization helps in tasks like text classification, sentiment analysis, language modeling,
and more by preparing the raw text into a structured format.
vi. Types of Tokenization:

a) Word Tokenization -
Splits a sentence into individual words.
Example:
"I love NLP." → ["I", "love", "NLP", "."]

b) Sentence Tokenization -
Splits a paragraph or document into individual sentences.
Example:
"NLP is interesting. It has many applications."
→ ["NLP is interesting.", "It has many applications."]

c) Character Tokenization -
Splits text into individual characters.
Example:
"Chat" → ['C', 'h', 'a', 't']

d) Subword Tokenization -
Splits words into meaningful subword units (like prefixes, suffixes, or roots).
Used in advanced NLP models like BERT and GPT.
Example:
"unhappiness" → ["un", "happi", "ness"]

e) Whitespace Tokenization -
Splits tokens wherever there is a space or tab character.
Simple but can break on punctuation or special symbols.

f) Regular Expression Tokenization -
Uses custom regex rules to define how the text should be split.
Useful for complex or domain-specific tokenization tasks.
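
Example (illustrative): a minimal sketch of regular-expression tokenization with NLTK; the
pattern is an assumption chosen to keep words and currency amounts as single tokens.

    from nltk.tokenize import RegexpTokenizer

    tokenizer = RegexpTokenizer(r"\w+|\$[\d\.]+|\S+")
    print(tokenizer.tokenize("A ticket costs $12.40, right?"))
    # e.g. ['A', 'ticket', 'costs', '$12.40', ',', 'right', '?']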

18. Explain Bag-of-Words model.

i. A bag of words is a representation of text that describes the occurrence of words within a
document.
ii. It keeps track of word counts and disregards grammatical details and word order, which is
why it is called a "bag" of words.
iii. It is concerned only with whether (and how often) known words occur in the document, not
with where they occur.
iv. Bag of words is a text modelling approach in which each document is represented as a
fixed-length vector of word counts over a known vocabulary.
v. How it works - a vocabulary of all unique words in the corpus is built, and each document
is then scored by counting how many times each vocabulary word appears in it (see the
sketch below).
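
Example (illustrative): a minimal sketch of the process using scikit-learn's
CountVectorizer on two toy documents.

    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["The cat sat on the mat", "The dog sat on the log"]
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)            # sparse document-term count matrix
    print(vectorizer.get_feature_names_out())     # the learned vocabulary
    print(X.toarray())                            # one count vector per document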

19. Explain GloVe.

i. GloVe, which stands for Global Vectors for Word Representation, is an unsupervised
learning algorithm used to generate word embeddings—dense vector representations
of words that capture their semantic relationships.
ii. It is similar in purpose to Word2Vec but differs significantly in the way it learns these
representations.
iii. Unlike Word2Vec, which relies on local context windows (i.e., predicting a word from its
neighbors or vice versa), GloVe is based on global word co-occurrence statistics.
iv. It constructs a large matrix from a corpus that counts how often pairs of words occur
together in different contexts.
v. For example, if two words frequently appear in the same context across the entire corpus
(like "doctor" and "hospital"), they are likely to have similar vector representations.
vi. The key idea behind GloVe is that the ratio of co-occurrence probabilities between words
can reveal meaningful relationships.
vii. For instance, the ratio of how often "ice" co-occurs with "solid" compared to how often
"steam" co-occurs with "solid" helps the model learn that "ice" is more closely related
to cold or solidity, while "steam" is not. GloVe uses these relationships to train word
vectors so that the dot product of two word vectors approximates their co-occurrence
probability.
viii. GloVe uses a log-bilinear regression model to minimize a cost function that captures the
difference between the actual co-occurrence of words and the dot product of their
corresponding vectors.
ix. This allows it to produce word vectors that capture both semantic similarity (words used
in similar contexts) and linear relationships (e.g., vector("king") - vector("man") +
vector("woman") ≈ vector("queen")), just like Word2Vec.
x. One of the strengths of GloVe is that it produces a single embedding per word, trained on
the entire corpus, and performs well even with rare words if enough global co-
occurrence data is available.
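
Example (illustrative): a minimal sketch that loads pretrained GloVe vectors through the
gensim downloader; the "glove-wiki-gigaword-50" package is fetched on first use, so an
internet connection and some disk space are assumed.

    import gensim.downloader as api

    glove = api.load("glove-wiki-gigaword-50")        # 50-dimensional pretrained vectors
    print(glove.most_similar("ice", topn=3))
    print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1))  # ~ queen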

20. What are Smoothing techniques in NLP.

i. Smoothing techniques in Natural Language Processing (NLP) are mathematical methods
used to handle the problem of zero probabilities when working with probabilistic
language models, especially n-gram models.
ii. Example –

In a bigram model:

You have seen "I am" → 10 times

But never seen "I swim"

Then,

P("swim" | "I") = 0 → This will make the entire sentence's probability 0.

Smoothing avoids this by assigning a small non-zero probability to "I swim".
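
Example (illustrative): a minimal sketch of one common technique, add-one (Laplace)
smoothing, using the toy counts from the example above and an assumed vocabulary size.

    count_bigram = {("I", "am"): 10}      # observed bigram counts
    count_unigram = {"I": 10}             # observed unigram counts
    V = 1000                              # assumed vocabulary size

    def p_laplace(prev, word):
        # P(word | prev) = (count(prev, word) + 1) / (count(prev) + V)
        return (count_bigram.get((prev, word), 0) + 1) / (count_unigram.get(prev, 0) + V)

    print(p_laplace("I", "am"))     # (10 + 1) / (10 + 1000) ~ 0.0109
    print(p_laplace("I", "swim"))   # (0 + 1)  / (10 + 1000) ~ 0.001, no longer zero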

21. What are the different types of Smoothing techniques.


22. Explain the architecture of neural network in detail.

i. A neural network is a computational model inspired by the human brain's structure and
function.
ii. It is made up of layers of interconnected units known as neurons or nodes. These neurons
are organized into three main types of layers: the input layer, one or more hidden
layers, and the output layer.
iii. The design and flow of data through these layers is what defines the architecture of the
neural network.
iv. The input layer is the entry point for data into the network.
v. Each neuron in this layer represents a feature of the input data. For example, if we are
working with images of 28×28 pixels, the input layer will have 784 neurons, one for
each pixel.
vi. The input layer does not perform any computation; it simply passes the raw feature values
to the next layer.
vii. The hidden layers are where the majority of computation occurs. Each neuron in a hidden
layer takes inputs from all neurons in the previous layer, multiplies them by weights,
adds a bias, and passes the result through an activation function.
viii. These activation functions introduce non-linearity into the model, enabling it to learn
complex and non-linear patterns in data.
ix. Common activation functions include ReLU (Rectified Linear Unit), sigmoid, and tanh.
ReLU is widely used because it is simple and helps prevent the vanishing gradient
problem during training.
x. The output layer is the final layer of the network. It produces the network’s prediction.
xi. The number of neurons in the output layer depends on the task at hand. For binary
classification, a single output neuron with a sigmoid activation function is typically
used, producing values between 0 and 1, interpreted as probabilities.
xii. For multi-class classification, a softmax activation function is applied. The softmax
function converts raw output scores (logits) into probabilities that sum up to 1,
ensuring that no output exceeds 1.
xiii. This makes it easier to interpret the output as a probability distribution over different
classes, which is crucial for decision-making in classification problems.
xiv. The network learns using a process called forward propagation and backpropagation. In
forward propagation, the input is passed through the network layer by layer, and
predictions are made.
xv. These predictions are compared to the actual labels using a loss function (such as cross-
entropy for classification or mean squared error for regression), which calculates the
error in prediction.
xvi. This error is then used in backpropagation to update the weights and biases using
gradients computed via the chain rule of calculus.
xvii. To optimize the network parameters (weights and biases), algorithms such as Stochastic
Gradient Descent (SGD), Adam, or RMSprop are used.
xviii. These algorithms iteratively update the parameters to minimize the loss function,
improving the network’s accuracy over time.
xix. Training is usually done over multiple epochs, where the entire dataset is repeatedly fed
into the network, and may be divided into smaller batches to make the process more
efficient and stable.
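
Example (illustrative): a minimal sketch of the architecture described above using Keras
(TensorFlow assumed installed); the layer sizes match the 28x28-pixel example and a
10-class output, but are otherwise arbitrary.

    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Input(shape=(784,)),              # input layer: one value per pixel
        keras.layers.Dense(128, activation="relu"),    # hidden layer with ReLU
        keras.layers.Dense(64, activation="relu"),     # second hidden layer
        keras.layers.Dense(10, activation="softmax"),  # output: probabilities over 10 classes
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()
    # model.fit(x_train, y_train, epochs=5, batch_size=32) would then train it on real data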

OR

23. Explain what is feedforward in Neural network.

i. The feedforward process is the fundamental mechanism by which a neural network makes
predictions or computes outputs based on input data.
ii. It refers to the unidirectional flow of information from the input layer, through the hidden
layers, to the output layer—without any cycles or loops.
iii. This is why such networks are also called Feedforward Neural Networks (FNNs).
iv. In feedforward, each neuron in a layer receives input only from the previous layer,
performs a calculation, and passes its output to the next layer.
v. The process starts when raw input data, such as an image, text, or numerical values, is fed
into the input layer.
vi. Each input neuron simply passes its data to the first hidden layer without applying any
transformation.
vii. Within the hidden layers, each neuron takes a weighted sum of its inputs, adds a bias, and
then passes the result through an activation function.
viii. This activation function introduces non-linearity, which allows the network to learn
complex patterns.
ix. For example, the ReLU (Rectified Linear Unit) function outputs 0 for negative inputs and
the input itself for positive values, helping the network to avoid problems like
vanishing gradients.
x. This process continues layer by layer until the output layer is reached.
xi. The output neurons also compute weighted sums and apply activation functions like
sigmoid (for binary classification) or softmax (for multi-class classification). The final
output represents the prediction made by the network.
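
Example (illustrative): a minimal NumPy sketch of one feedforward pass through a tiny
network; the weights are random, so the output only shows the flow of computation.

    import numpy as np

    def relu(z):
        return np.maximum(0, z)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    x = rng.random(4)                               # input features
    W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)   # input -> hidden weights and biases
    W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)   # hidden -> output weights and biases

    h = relu(W1 @ x + b1)                           # hidden layer: weighted sum + bias + ReLU
    y = softmax(W2 @ h + b2)                        # output layer: softmax over 3 classes
    print(y, y.sum())                               # probabilities that sum to 1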

24. Explain N-gram language modelling with examples.

i. N-gram language modeling is a statistical approach used in Natural Language Processing
(NLP) to predict the next word in a sequence based on the previous N – 1 words.
ii. An N-gram is simply a sequence of N words.
iii. The core idea is that the probability of a word depends only on the last N – 1 words, rather
than the entire sentence.
iv. This simplifies computation and helps in tasks like text generation, spell checking,
machine translation, and speech recognition.
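
Example (illustrative): a minimal sketch of a bigram model estimated with plain
maximum-likelihood counts (no smoothing) from a toy corpus.

    from collections import Counter

    corpus = "I am happy . I am tired . you are happy".split()
    bigrams = Counter(zip(corpus, corpus[1:]))   # counts of adjacent word pairs
    unigrams = Counter(corpus)                   # counts of single words

    def p(word, prev):
        # P(word | prev) = count(prev, word) / count(prev)
        return bigrams[(prev, word)] / unigrams[prev]

    print(p("am", "I"))      # 2/2 = 1.0
    print(p("happy", "am"))  # 1/2 = 0.5
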
25. Explain how NLP integrates with Deep Learning and Machine Learning with a set
diagram.

i. The set diagram illustrates the relationship between Artificial Intelligence (AI),
Machine Learning (ML), Deep Learning (DL), and Natural Language Processing
(NLP).
ii. At the broadest level, AI encompasses all technologies that aim to simulate human
intelligence in machines. This includes everything from rule-based systems to
learning-based models. Within this broad AI domain lies Machine Learning,
which refers to the subset of AI that enables systems to learn from data and
improve their performance over time without being explicitly programmed for
every specific task.
iii. Within Machine Learning lies a further specialization known as Deep Learning, which
involves the use of artificial neural networks with multiple layers (deep
architectures). Deep learning has shown exceptional performance in handling
large-scale data and complex tasks like image recognition, speech processing, and
most notably, sophisticated NLP tasks. This part of the diagram highlights that
while all deep learning is machine learning, not all machine learning is deep
learning.
iv. The NLP (Natural Language Processing) section overlaps both ML and DL areas of
the diagram, indicating that NLP makes use of techniques from both subfields.
NLP is the discipline concerned with enabling machines to understand, interpret,
generate, and interact using human languages.
v. This overlapping region emphasizes how NLP benefits from both traditional ML
approaches and cutting-edge DL models.
vi. NLP is a crucial application area of both ML and DL, sitting at the intersection of
language and computation.

26. What are the advantages and disadvantages of NLP.
