NLP Introduction

Natural Language Processing (NLP) is a field of AI that focuses on the interaction between computers and human languages, enabling tasks like translation, summarization, and sentiment analysis. It involves components such as Natural Language Understanding (NLU) and Natural Language Generation (NLG), and faces challenges like ambiguity and the need for extensive data. NLP applications include smart assistants, spam detection, and chatbots, utilizing techniques from both classical and deep learning approaches.

Uploaded by

nitinpandey.dev


Prerequisites for Natural Language Processing
● Python
● Basic concepts of machine learning and deep learning
Natural language processing
Natural language processing (NLP) is a subfield of computer science,
information engineering, and artificial intelligence concerned with
the interactions between computers and human (natural) languages,
in particular how to program computers to process and analyze large
amounts of natural language data.

Challenges in natural language processing frequently involve speech recognition, natural language understanding, and natural language generation.
NLP helps developers organize knowledge for tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship extraction, and topic segmentation.
•Translation: Automatically converting text or speech from one language to another, like translating
English sentences into French.

•Automatic Summarization: Generating a condensed version of a longer text while retaining its key
points. It can be extractive (picking important sentences directly) or abstractive (creating new
sentences that summarize the text).

•Named Entity Recognition (NER): Identifying and classifying entities (like names of people,
organizations, dates, etc.) in text. For example, in the sentence "John works at Microsoft," NER would
label "John" as a person and "Microsoft" as an organization.

•Speech Recognition: Converting spoken language into text. It's what happens when you use voice
assistants like Siri or Google Assistant.

•Relationship Extraction: Identifying relationships between entities in a text. For example, in "John
works at Microsoft," this technique would extract the relationship "works_at" between "John" and
"Microsoft."

•Topic Segmentation: Dividing a text into segments, each dealing with a specific topic. This is useful
in longer documents, like news articles or research papers, to identify different themes or subjects.
Applications of NLP
NLP has the following applications:
1. Smart Assistants
Smart assistants are systems that automatically answer questions that humans ask in natural language.

2. Spam Detection
Spam detection identifies unwanted e-mails before they reach a user's inbox.
3. Sentiment Analysis
Sentiment analysis is also known as opinion mining. It is used on the web to analyse the attitude, behaviour, and emotional state of the sender. It is implemented through a combination of NLP and statistics: the text is assigned a value (positive, negative, or neutral), and the mood of the context (happy, sad, angry, etc.) is identified.

4. Language Translation
Language translation is used to translate text or
speech from one natural language to another
natural language.
Example: Google Translate
5. Spelling correction
Office software such as Microsoft Word and PowerPoint provides built-in spelling correction.

6. Speech Recognition
Speech recognition is used for
converting spoken words into text. It is
used in applications, such as mobile,
home automation, video recovery,
dictating to Microsoft Word, voice
biometrics, voice user interface, and so
on.
7. Chatbot
Chatbots are one of the important applications of NLP. Many companies use them to provide customer chat services.

8. Information extraction
Information extraction is one of the most
important applications of NLP. It is used for
extracting structured information from
unstructured or semi-structured machine-
readable documents.
Advantages of NLP
•NLP lets users ask questions about any subject and get a direct response within seconds.
•NLP aims to give direct answers to a question, without unnecessary or unwanted information.
•NLP helps computers communicate with humans in their own languages.
•It is very time efficient.
•Many companies use NLP to improve the efficiency and accuracy of documentation processes and to identify information in large databases.

Disadvantages of NLP
•Training an NLP model requires a lot of data and computation.
•NLP has difficulty with informal expressions, idioms, and cultural jargon.
•NLP results are not always accurate, and their accuracy depends directly on the accuracy of the data.
•An NLP system is typically built for a single, narrow task and may not adapt well to new domains.
Classical NLP
•Overview: Classical NLP uses rule-based, statistical, and machine learning methods to process
language. These approaches were dominant before the rise of deep learning techniques.
•Key Features:
• Focus on linguistic rules: Uses grammar, syntax rules, and domain-specific dictionaries.
• Heavy reliance on handcrafted features and manual feature engineering.
• Often employs shallow machine learning models such as:
• Naive Bayes
• Support Vector Machines (SVM)
• Hidden Markov Models (HMM)
• Conditional Random Fields (CRF)
• Text representations such as n-grams, Bag of Words (BoW), and TF-IDF (Term Frequency-Inverse Document Frequency) are commonly used.
•Examples:
• Spam classification using TF-IDF with an SVM classifier.
• Named Entity Recognition (NER) using rule-based methods or CRFs.
• POS tagging using HMMs or CRFs.
•Limitations:
• Requires extensive feature engineering.
• Not very good at capturing long-term dependencies or the contextual meaning of words.
• Struggles with large datasets and complex tasks like machine translation or sentiment analysis
at scale.
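As a rough illustration of the classical approach, the sketch below trains a tiny multinomial Naive Bayes spam classifier on bag-of-words counts using only the Python standard library. Note that it swaps in Naive Bayes rather than the TF-IDF plus SVM combination named in the examples, because that keeps the sketch dependency-free. The training sentences, the `spam`/`ham` labels, and all function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set: (text, label) pairs invented for illustration.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

def tokenize(text):
    """Bag-of-words tokenization: lowercase and split on whitespace."""
    return text.lower().split()

def train_naive_bayes(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary

def predict(text, label_counts, word_counts, vocabulary):
    """Return the label with the highest log-probability (add-one smoothing)."""
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            smoothed = word_counts[label][word] + 1
            score += math.log(smoothed / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(predict("free prize now", *model))       # spam
print(predict("team meeting monday", *model))  # ham
```

The word counts play the role of the handcrafted features described above: the classifier only knows how often each word occurred with each label, and nothing about word order or context.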
Deep NLP (Deep Learning-based NLP)
•Overview: Deep NLP leverages neural networks, particularly deep learning architectures, to learn
patterns in language data without requiring handcrafted features. It has gained prominence due to its
superior performance in complex NLP tasks.
•Key Features:
• Uses deep learning models like:
• Recurrent Neural Networks (RNN) and variants like LSTMs and GRUs.
• Convolutional Neural Networks (CNN) (applied to text).
• Transformers (e.g., BERT, GPT).
• No need for manual feature engineering—models automatically learn features from raw text
data.
• Able to capture context and long-term dependencies in language.
• Often relies on word embeddings such as Word2Vec, GloVe, or BERT embeddings for text
representation.
How NLP, Deep NLP (DNLP), and Deep Learning (DL) relate
How does NLP work?
Components of NLP
NLP has the following two components:
1. Natural Language Understanding (NLU)
Natural Language Understanding (NLU) helps the machine to understand and analyse human language by extracting metadata from content, such as concepts, entities, keywords, emotion, relations, and semantic roles.
NLU is mainly used in business applications to understand the customer's problem in both spoken and written language.
NLU involves the following tasks:
•Mapping the given input into a useful representation.
•Analyzing different aspects of the language.
2. Natural Language Generation (NLG)
Natural Language Generation (NLG) acts as a translator that converts computerized data into a natural language representation. It mainly involves text planning, sentence planning, and text realization.
NLP Working
Natural Language Understanding
Phonology – the study of the sound patterns of a language, treating sound as a physical entity.

Pragmatics – the study of how language is used in different contexts.

Morphology – the study of the structure of words and the systematic relations between them.

Syntax – the study of the structure of sentences.

Semantics – the study of the literal meaning of words, phrases, and sentences.
Phases of NLP
NLP has the following five phases:
1. Lexical and Morphological Analysis
The first phase of NLP is lexical analysis. This phase scans the input text as a stream of characters and converts it into meaningful lexemes. It divides the whole text into paragraphs, sentences, and words.
2. Syntactic Analysis (Parsing)
Syntactic analysis checks grammar and word arrangement, and shows the relationships among the words.
Example: Agra goes to the Poonam
In the real world, "Agra goes to the Poonam" does not make any sense, so the syntactic analyzer rejects this sentence.
3. Semantic Analysis
Semantic analysis is concerned with meaning representation. It mainly focuses on the literal meaning of words, phrases, and sentences.
4. Discourse Integration
In discourse integration, the meaning of a sentence depends on the sentences that precede it and also influences the meaning of the sentences that follow it.
5. Pragmatic Analysis
Pragmatic analysis is the fifth and last phase of NLP. It helps you to discover the intended effect by applying a set of rules that characterize cooperative dialogues.
For example: "Open the door" is interpreted as a request instead of an order.
Natural Language Generation
Based on the result of natural language understanding, NLG decides:
● What should be said to the user.
● How to sound intelligent and conversational, like a human.
● How to use structured data.
● How to plan the text and the sentences.
Why NLP is difficult?
NLP is difficult because ambiguity and uncertainty exist in language.
Ambiguity
There are the following types of ambiguity:
Lexical Ambiguity: The tank is full of water.
Syntactic Ambiguity: Ill men and women get to hospital.
Semantic Ambiguity: The bike hit the pole while it was running.
Pragmatic Ambiguity: The army is coming.
•Lexical Ambiguity
Lexical ambiguity exists when a single word has two or more possible meanings.
Example:
Manya is looking for a match.
In the above example, the word "match" could mean that Manya is looking for a partner or that Manya is looking for a game (a cricket match or some other match).
•Syntactic Ambiguity
Syntactic ambiguity exists when the structure of a sentence allows two or more possible meanings.
Example:
I saw the moon with the binoculars.
In the above example, did I have the binoculars? Or did the moon have the binoculars?
•Referential Ambiguity
Referential ambiguity exists when it is unclear what a pronoun refers to.
Example: Kiran went to Sunita. She said, "I am hungry."
In the above sentence, you do not know who is hungry, Kiran or Sunita.
How to build an NLP pipeline
An NLP pipeline is built in the following steps:
Step 1: Sentence Segmentation
Sentence segmentation is the first step in building the NLP pipeline. It breaks the paragraph into separate sentences.
Example: Consider the following paragraph -
Independence Day is one of the important festivals for every Indian citizen. It is celebrated on the 15th of August each year ever since India got independence from the British rule. The day celebrates independence in the true sense.
Sentence segmentation produces the following result:
1. "Independence Day is one of the important festivals for every Indian citizen."
2. "It is celebrated on the 15th of August each year ever since India got independence from the British rule."
3. "The day celebrates independence in the true sense."
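Sentence segmentation on the Independence Day paragraph can be approximated with a single regular expression. This is only a minimal sketch: the rule "split after `.`, `!` or `?` followed by whitespace" and the function name `split_sentences` are assumptions made for illustration, and the rule would wrongly split abbreviations such as "Dr. Smith".

```python
import re

def split_sentences(paragraph):
    """Split after '.', '!' or '?' when followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]

paragraph = (
    "Independence Day is one of the important festivals for every Indian citizen. "
    "It is celebrated on the 15th of August each year ever since India got "
    "independence from the British rule. "
    "The day celebrates independence in the true sense."
)

sentences = split_sentences(paragraph)
for number, sentence in enumerate(sentences, start=1):
    print(number, sentence)
```

Libraries such as NLTK and spaCy ship trained or rule-based segmenters that handle abbreviations and other edge cases much better than this one-line rule.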
Step 2: Word Tokenization
A word tokenizer breaks a sentence into separate words or tokens.
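A very small tokenizer can be sketched with one regular expression that treats every run of word characters as a token and every punctuation mark as its own token. The pattern and the function name `tokenize` are illustrative assumptions, not the behaviour of any particular library.

```python
import re

def tokenize(sentence):
    """Return word tokens and single punctuation marks, in their original order."""
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("Hello, world! NLP is fun.")
print(tokens)
# ['Hello', ',', 'world', '!', 'NLP', 'is', 'fun', '.']
```

Whether punctuation is kept as tokens or dropped is a design choice; this sketch keeps it so that no information from the sentence is lost.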
Example:
They offer Corporate Training, Summer Training, Online Training, and Winter Training.
The word tokenizer generates the following result:
"They", "offer", "Corporate", "Training", ",", "Summer", "Training", ",", "Online", "Training", ",", "and", "Winter", "Training", "."
Step 3: Stemming
Stemming normalizes words into their base or root form. For example, celebrates, celebrated, and celebrating all originate from the single root word "celebrate." The big problem with stemming is that it sometimes produces a root word that does not have any meaning.
For example, intelligence, intelligent, and intelligently all originate from the single root "intelligen." In English, the word "intelligen" does not have any meaning.
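The idea behind stemming can be shown with a toy suffix stripper. This is not the Porter stemmer or any other published algorithm; the suffix list and the name `toy_stem` are assumptions chosen only to show how blindly removing endings can produce a stem that is not a real word.

```python
# Hypothetical suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

stems = [toy_stem(w) for w in ["celebrates", "celebrated", "celebrating"]]
print(stems)
# ['celebrat', 'celebrat', 'celebrat']
```

All three forms collapse to the same stem, which is what makes stemming useful for grouping words, but "celebrat" itself is not an English word.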
Step 4: Lemmatization
Lemmatization is quite similar to stemming. It groups the different inflected forms of a word under a single base form, called the lemma. The main difference between stemming and lemmatization is that lemmatization produces a root word that has a meaning.
For example: in lemmatization, the words intelligence, intelligent, and intelligently have the root word intelligent, which has a meaning.
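In contrast to suffix stripping, a lemmatizer looks words up in a dictionary, so its output is always a real word. The sketch below fakes that dictionary with a tiny hand-written table; the table `LEMMAS` and the name `toy_lemmatize` are hypothetical, and real lemmatizers rely on a full lexicon plus part-of-speech information.

```python
# Hypothetical lookup table: inflected form -> lemma.
LEMMAS = {
    "celebrates": "celebrate",
    "celebrated": "celebrate",
    "celebrating": "celebrate",
    "went": "go",
    "mice": "mouse",
}

def toy_lemmatize(word):
    """Return the dictionary form of a word, or the word itself if unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)

lemmas = [toy_lemmatize(w) for w in ["celebrates", "Celebrating", "went", "mice"]]
print(lemmas)
# ['celebrate', 'celebrate', 'go', 'mouse']
```

Irregular forms such as "went" and "mice" show why a dictionary is needed: no suffix rule would map them to "go" and "mouse".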
Step 5: Identifying Stop Words
In English, there are a lot of words that appear very frequently, like "is", "and", "the", and "a". NLP pipelines flag these words as stop words. Stop words might be filtered out before doing any statistical analysis.
Example: He is a good boy.
In this sentence, "is" and "a" are stop words.
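Filtering stop words is a simple membership test against a fixed list. The sketch below uses only the four stop words named in this step; real stop word lists are much longer, and the function name `remove_stop_words` is an assumption.

```python
import re

# Only the four stop words mentioned in the text; real lists are far longer.
STOP_WORDS = {"is", "and", "the", "a"}

def remove_stop_words(sentence):
    """Keep the alphabetic words that are not in the stop word list."""
    words = re.findall(r"[A-Za-z]+", sentence)
    return [word for word in words if word.lower() not in STOP_WORDS]

kept = remove_stop_words("He is a good boy.")
print(kept)
# ['He', 'good', 'boy']
```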
Step 6: Dependency Parsing
Dependency parsing finds how all the words in a sentence are related to each other.
Step 7: Count Vectorizer
A count vectorizer converts a given set of strings into a frequency representation: each text becomes a vector of word counts. This is useful in a very simple NLP setting, where you are given a data set of tagged texts and have to use it to predict the tag of another text.
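The frequency representation can be sketched in a few lines of plain Python. The two example texts and the helper names `fit_vocabulary` and `count_vector` are made up for illustration; scikit-learn provides a full implementation under the name `CountVectorizer`, which this sketch does not try to reproduce exactly.

```python
from collections import Counter

def fit_vocabulary(texts):
    """Collect every distinct lowercase word, sorted so column order is stable."""
    return sorted({word for text in texts for word in text.lower().split()})

def count_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

texts = ["the cat sat", "the cat saw the dog"]
vocabulary = fit_vocabulary(texts)
vectors = [count_vector(text, vocabulary) for text in texts]

print(vocabulary)  # ['cat', 'dog', 'sat', 'saw', 'the']
print(vectors)     # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Each text becomes a row of numbers with one column per vocabulary word, which is the form that classifiers expect as input.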
Step 8: Part-of-speech tagging (POS tags)
POS stands for parts of speech, which include noun, verb, adverb, and adjective. A POS tag indicates how a word functions, both in meaning and grammatically, within a sentence. A word can have one or more parts of speech depending on the context in which it is used.
Example: "Google" something on the Internet.
In the above example, Google is used as a verb, although it is a proper noun.
Step 9: Named Entity Recognition (NER)
Named Entity Recognition (NER) is the process of detecting named entities such as a person's name, a movie name, an organization name, or a location.
Example: Steve Jobs introduced iPhone at the Macworld Conference in San Francisco, California.
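The simplest possible form of NER is a dictionary (gazetteer) lookup, sketched below on the example sentence. The gazetteer entries, the label names (`PERSON`, `PRODUCT`, `EVENT`, `LOCATION`), and the function name `find_entities` are all assumptions made for illustration; practical NER systems use statistical or neural models that can also recognize names they have never seen.

```python
# Hypothetical gazetteer: known entity name -> label.
GAZETTEER = {
    "Steve Jobs": "PERSON",
    "iPhone": "PRODUCT",
    "Macworld Conference": "EVENT",
    "San Francisco": "LOCATION",
    "California": "LOCATION",
}

def find_entities(text):
    """Return (name, label) pairs for gazetteer entries found, in text order."""
    found = []
    for name, label in GAZETTEER.items():
        position = text.find(name)
        if position != -1:
            found.append((position, name, label))
    return [(name, label) for _, name, label in sorted(found)]

sentence = ("Steve Jobs introduced iPhone at the Macworld Conference "
            "in San Francisco, California.")
entities = find_entities(sentence)
for name, label in entities:
    print(name, "->", label)
```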
Step 10: Chunking
Chunking collects individual pieces of information and groups them into bigger pieces of a sentence, such as phrases.
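One common kind of chunking groups POS-tagged words into noun phrases. The sketch below assumes the input is already tagged; the tags are hand-assigned Penn Treebank-style labels (`DT` determiner, `JJ` adjective, `NN` noun, `VBD` past-tense verb), and the grouping rule and function name `chunk_noun_phrases` are illustrative assumptions.

```python
def chunk_noun_phrases(tagged_tokens):
    """Group consecutive determiner/adjective/noun tokens into noun-phrase chunks."""
    noun_phrase_tags = {"DT", "JJ", "NN"}
    chunks, current = [], []
    for word, tag in tagged_tokens:
        if tag in noun_phrase_tags:
            current.append(word)
        elif current:
            # A non-matching tag closes the chunk being built.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

# Hand-tagged example sentence (tags assigned manually for this sketch).
tagged = [
    ("the", "DT"), ("little", "JJ"), ("dog", "NN"),
    ("chased", "VBD"),
    ("a", "DT"), ("cat", "NN"),
]
chunks = chunk_noun_phrases(tagged)
print(chunks)
# ['the little dog', 'a cat']
```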
NLP Libraries
Scikit-learn: It provides a wide range of algorithms for building machine learning models in Python.
Natural Language Toolkit (NLTK): NLTK is a broad toolkit that covers most common NLP techniques.
Pattern: It is a web mining module for NLP and machine learning.
TextBlob: It provides an easy interface for basic NLP tasks like sentiment analysis, noun phrase extraction, or POS tagging.
Quepy: Quepy is used to transform natural language questions into queries in a database query language.
spaCy: spaCy is an open-source NLP library used for data extraction, data analysis, sentiment analysis, and text summarization.
Gensim: Gensim works with large datasets and processes data streams.
