NLP Basics for Beginners

Language is a key part of human communication and development. As technology advanced, new programming languages were developed to allow communication between humans and machines. More recently, natural language processing (NLP) techniques aim to allow machines to understand human languages to enhance human lives. NLP involves natural language understanding to interpret meanings and intents, and natural language generation to produce relevant and coherent responses. It uses techniques like tokenization, stemming, tagging parts of speech, and identifying entities to help machines comprehend human language.

Uploaded by

Dhruba Barman
Copyright
© All Rights Reserved

Language is one of the most intelligent inventions of human
beings. It is language that helps us communicate with each other,
know each other and eventually develop relationships with each
other. The dictionary defines language as “the method of human
communication, either spoken or written, consisting of the use of
words in a structured and conventional way”. Language sets us
apart from other living beings, and I would say it has been one of
the most important factors in shaping us as the dominant species
of this planet. It has eased human life to a great extent. And as
time passed, languages also evolved: from sign language to
letters, alphabets, words, sentences, dialects and the many
languages of so many communities and tribes.

With the invention of computers and the introduction of
information technology, human beings started thinking of ways to
enhance and ease their lives with the help of machines. This
created the need for a new kind of language, the programming
language, a way of communicating with computers. It automated and
simplified many aspects of human life.

But as technology keeps advancing, human beings now want to talk
to machines. They want machines to understand human language,
think the way humans think and reply the way humans reply. The
motivation is, of course, to ease our lives, avoid human errors,
complete tasks faster, and so on. This thinking gave rise to the
idea of processing our natural language so that machines can
understand us and respond in our own way, letting humans and
machines communicate with ease and build a more advanced society.
This technique is what we call Natural Language Processing, or
NLP.

Talk with Machines in Natural Language


Applications of NLP:

Speech recognition: This is found in most smartphones in the
form of assistants such as Google Assistant or Siri, which can
understand and communicate with humans in natural language.
Sentiment Analysis: Interprets the sentiments of users by
analyzing Twitter and Facebook posts/comments or movie reviews.
Interpreting user sentiment this way helps political and business
leaders make their decisions accordingly.
Chatbot: Many customer care services now use chatbots to interact
with users, and they have been successful in answering users'
basic queries and concerns.
Translation: Google Translate can translate text from one
language into many other languages.
Advertisement matching: This one is quite interesting. Based on
our past search history, it helps in recommending movies, series
or songs of our interest.

There are more such applications of NLP, and all of these have
enhanced and eased human life to a great extent.

NLP consists of 2 components:


1. Natural Language Understanding - NLU:
**************************************
It involves understanding the speech or text, and the intent and
meaning behind it. Natural language is often ambiguous: the same
word or sentence may carry different meanings, intents or
emotions, so working out the correct meaning and intent is
complex, especially for machines. Because of these challenges,
NLU is often described as an AI-hard problem. NLU faces the
following ambiguities:
Lexical ambiguity - A single word can have many meanings, which
leads to confusion in understanding it. Example: He is going to
the bank. Here ‘bank’ may refer to the bank where we
deposit/withdraw money or the bank alongside a river.
Syntactic ambiguity - When a single sentence can be parsed in
more than one way, it leads to this structural ambiguity.
Example: Mumpy saw someone on the hill with a telescope. Did
Mumpy use a telescope to see someone on the hill, or did she see
someone on the hill holding a telescope?
Referential ambiguity - When it is not clear what a word in a
sentence refers to. Example: Munna went to meet his father. He
was very excited. Here, who does ‘He’ refer to? Is it Munna or
his father?

2. Natural Language Generation - NLG:
*************************************
It involves generating language to present to the user. After
recognizing and understanding the input, the machine should
provide output in the same language, in such a way that the
output is intelligent, logical, relevant and conversational. For
this, NLG follows the steps below:
Text planning: The relevant words are selected from the corpus
or knowledge base after understanding the input.
Sentence planning: The selected words are framed into a sentence
so that the output is meaningful and relevant to the input. The
words are placed in the proper sequence to produce structured and
meaningful language.
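The two NLG steps can be illustrated with a very small template-based sketch in Python. Everything here is an assumption made for illustration: the tiny `KNOWLEDGE_BASE`, the function names `plan_text` and `plan_sentence`, and the fixed sentence template are hypothetical, and real NLG systems are far more sophisticated.

```python
# Hypothetical knowledge base: (subject, attribute) -> value.
KNOWLEDGE_BASE = {
    ("sweden", "capital"): "Stockholm",
}

def plan_text(question):
    """Text planning: select the entry whose subject and attribute
    both appear in the (lower-cased) question."""
    q = question.lower()
    for (subject, attribute), value in KNOWLEDGE_BASE.items():
        if subject in q and attribute in q:
            return subject, attribute, value
    return None

def plan_sentence(plan):
    """Sentence planning: place the selected words in a fixed order."""
    if plan is None:
        return "Sorry, I do not know."
    subject, attribute, value = plan
    return f"The {attribute} of {subject.capitalize()} is {value}."

answer = plan_sentence(plan_text("What is the capital of Sweden?"))
unknown = plan_sentence(plan_text("Who is Munna?"))
print(answer)
print(unknown)
```

A fixed template keeps the word order predictable, at the cost of sounding less conversational than a real NLG system.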

To overcome these challenges, NLP follows a number of processes.
Some of the important ones are described below:
Tokenization: It is the process of breaking a piece of text into
smaller pieces known as tokens. The tokens can be the words or
phrases of a sentence, or the sentences of a paragraph.
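As a minimal sketch of tokenization, the Python snippet below splits a paragraph into sentences and a sentence into words using only regular expressions. The function names `split_sentences` and `split_words` are hypothetical, and the rules are deliberately naive (for example, abbreviations like "Dr." would be split incorrectly).

```python
import re

def split_sentences(paragraph):
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]

def split_words(sentence):
    """Return runs of word characters, plus each punctuation mark
    as its own token."""
    return re.findall(r"\w+|[^\w\s]", sentence)

paragraph = "He is going to the bank. The cat is running."
sentences = split_sentences(paragraph)
words = split_words(sentences[1])
print(sentences)  # ['He is going to the bank.', 'The cat is running.']
print(words)      # ['The', 'cat', 'is', 'running', '.']
```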

Stemming: It is the process of reducing a word to its root or
base form. The process involves chopping off the suffixes or
prefixes of a word to get the base word, although the result may
be a word with no dictionary meaning. Example: a simple
suffix-stripping stemmer reduces Smoker, Smoking and Smoked to
‘Smok’, which is not an actual word. Stemming is generally useful
for information retrieval systems like search engines.
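The suffix-chopping idea can be sketched in a few lines of Python. This is a toy stemmer written for illustration only: the suffix list and the name `toy_stem` are assumptions, and established algorithms such as the Porter stemmer apply many more rules and can give different results.

```python
# Toy suffix list, checked in this order (an illustrative assumption).
SUFFIXES = ("ing", "ed", "er", "s")

def toy_stem(word):
    """Chop off the first matching suffix, keeping at least three
    characters so that short words are left alone."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

stems = [toy_stem(w) for w in ["Smoker", "Smoking", "Smoked"]]
print(stems)  # ['smok', 'smok', 'smok']
```

All three words collapse to the same stem, which is what a search engine needs for matching, even though "smok" is not a dictionary word.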

Lemmatization: It is similar to stemming, but the difference is
that lemmatization reduces the word to its meaningful base form,
or ‘lemma’. Example: Smoking and Smoked, when lemmatized as verbs,
result in ‘Smoke’, which is an actual word. The lemmas are looked
up in a dictionary or lexical knowledge base.
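Because lemmatization relies on a knowledge base rather than on chopping characters, a rough sketch is simply a dictionary lookup. The tiny `LEMMAS` table and the name `toy_lemmatize` below are hypothetical; a real lemmatizer uses a full lexicon and usually the word's part of speech as well.

```python
# Hypothetical mini knowledge base mapping word forms to lemmas.
LEMMAS = {
    "smokes": "smoke",
    "smoking": "smoke",
    "smoked": "smoke",
    "running": "run",
}

def toy_lemmatize(word):
    """Look the word up; unknown words are returned unchanged
    (lower-cased)."""
    key = word.lower()
    return LEMMAS.get(key, key)

lemmas = [toy_lemmatize(w) for w in ["Smokes", "Smoking", "Smoked"]]
print(lemmas)  # ['smoke', 'smoke', 'smoke']
```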

POS Tags: Part-of-speech tags are assigned to each word in a
sentence, which helps machines process and interpret natural
language. Tagging identifies each word as a noun, verb, pronoun,
adjective, etc.
Example: The cat is running. In this sentence the POS tags are as
follows:
‘The’ - determiner, ‘cat’ - noun, ‘is’ - auxiliary verb,
‘running’ - verb.
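To make the idea of tagging concrete, here is a toy lexicon-based tagger in Python. The `TAG_LEXICON` table, the tag names and the function `toy_pos_tag` are assumptions made for this sketch; real taggers are trained on annotated text and use the surrounding context to choose a tag.

```python
import re

# Hypothetical lexicon covering only the example sentence.
TAG_LEXICON = {
    "the": "DET",
    "cat": "NOUN",
    "is": "VERB",
    "running": "VERB",
}

def toy_pos_tag(sentence):
    """Tag each word by dictionary lookup; words missing from the
    lexicon are tagged UNKNOWN."""
    tokens = re.findall(r"\w+", sentence)
    return [(t, TAG_LEXICON.get(t.lower(), "UNKNOWN")) for t in tokens]

tagged = toy_pos_tag("The cat is running.")
print(tagged)
# [('The', 'DET'), ('cat', 'NOUN'), ('is', 'VERB'), ('running', 'VERB')]
```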

Named Entity Recognition (NER): This process involves identifying
and segmenting the words or phrases of a sentence that name
entities, and classifying them under various predefined classes.
Example: Mumpy watched Titanic during her stay in Sweden in 2016.
Here the NER technique identifies Mumpy as a person, Titanic as a
movie, Sweden as a location and 2016 as a date.
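The example sentence can be processed with a toy recognizer that combines a lookup list (a gazetteer) with a pattern for four-digit years. The `GAZETTEER` contents, the class names and the function `toy_ner` are hypothetical; production NER systems learn these decisions from data rather than from a hand-written list.

```python
import re

# Hypothetical gazetteer of known entity names and their classes.
GAZETTEER = {
    "Mumpy": "PERSON",
    "Titanic": "MOVIE",
    "Sweden": "LOCATION",
}
YEAR = re.compile(r"(1[0-9]{3}|20[0-9]{2})")

def toy_ner(sentence):
    """Return (token, class) pairs for tokens found in the gazetteer
    or shaped like a year between 1000 and 2099."""
    entities = []
    for token in re.findall(r"\w+", sentence):
        if token in GAZETTEER:
            entities.append((token, GAZETTEER[token]))
        elif YEAR.fullmatch(token):
            entities.append((token, "DATE"))
    return entities

entities = toy_ner("Mumpy watched Titanic during her stay in Sweden in 2016.")
print(entities)
# [('Mumpy', 'PERSON'), ('Titanic', 'MOVIE'), ('Sweden', 'LOCATION'), ('2016', 'DATE')]
```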

Chunking: Now that the text has been broken into pieces and
analyzed, it is time to combine different pieces into chunks and
tag them so as to get a larger picture. Good chunking facilitates
comprehension and retrieval of meaningful information.
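One simple way to picture chunking is to group tagged words into noun phrases. The sketch below assumes the input is already POS-tagged, and treats any run of determiners, adjectives and nouns as one "NP" chunk; the tag names and the function `toy_np_chunk` are illustrative assumptions, not a standard grammar.

```python
NP_TAGS = {"DET", "ADJ", "NOUN"}

def toy_np_chunk(tagged):
    """Group consecutive DET/ADJ/NOUN words into one NP chunk;
    every other word becomes a single-word chunk with its own tag."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag in NP_TAGS:
            current.append(word)
            continue
        if current:
            chunks.append(("NP", current))
            current = []
        chunks.append((tag, [word]))
    if current:
        chunks.append(("NP", current))
    return chunks

tagged_words = [("The", "DET"), ("black", "ADJ"), ("cat", "NOUN"),
                ("is", "VERB"), ("running", "VERB")]
chunks = toy_np_chunk(tagged_words)
print(chunks)
# [('NP', ['The', 'black', 'cat']), ('VERB', ['is']), ('VERB', ['running'])]
```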

Apart from the processes mentioned above, NLP also involves
removing punctuation, site links and stop words. Stop words are
words that help frame a readable sentence, but the basic meaning
can still be understood even if we remove them. Some common stop
words are: “the”, “is”, “in”, “for”, “when”, “to”, “at”, etc.
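A minimal Python sketch of this clean-up step is shown below. It uses only the stop words listed above; the function name `remove_stop_words` is hypothetical, and a real stop word list would be much longer.

```python
import re

# Only the stop words named in the text; real lists are longer.
STOP_WORDS = {"the", "is", "in", "for", "when", "to", "at"}

def remove_stop_words(text):
    """Lower-case the text, drop punctuation by keeping only word
    characters, then filter out the stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

kept = remove_stop_words("He is going to the bank.")
print(kept)  # ['he', 'going', 'bank']
```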

**Although I mentioned removing these features, it actually
depends on the use case and the problem we are dealing with. In
some use cases, stop words, site links or even punctuation (for
understanding sentiments) can be highly important. So we need to
clearly understand the relevant case or problem and decide
accordingly whether to remove these features or not.**

All the NLP processes discussed here can be implemented using the
Natural Language Toolkit (NLTK). It is a very useful and
interesting tool for implementing NLP in the Python language. It
contains modules for the different NLP processes, like tokenizing,
stemming, lemmatizing, etc. These modules come as part of the NLTK
package when it is installed, although some of them need
additional data files that are downloaded separately.
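As a small taste of NLTK, the sketch below tokenizes and stems the earlier example sentence. It assumes NLTK is installed (for example with `pip install nltk`) and deliberately uses only `wordpunct_tokenize` and `PorterStemmer`, which work without extra data files; other components, such as NLTK's lemmatizer and POS tagger, need data fetched with `nltk.download()` first.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

sentence = "The cat is running."

# Split into alphabetic and punctuation tokens with a regex-based tokenizer.
tokens = wordpunct_tokenize(sentence)

# Reduce each token to its stem with the Porter algorithm.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```

Note that the Porter stemmer applies more careful rules than simply cutting suffixes, so its output can differ from a naive suffix stripper.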

NOTE: The details are provided in the simplest and easiest way
possible so as to make them understandable for beginners. Please
feel free to leave your feedback and comments.
