CHAPTER – 2
Language Modeling and Part of Speech Tagging
Subject: NLP (Code: 3170723)
Prepared By: Asst. Prof. Chaitali Bhoi, CE, NIT
Language Model
• Predicting is difficult, especially about the future.
• But how about predicting something that seems much easier, like the next few words someone is going to say?
• Ex: "Hey, hi! How are ... you?" Most listeners can guess that the next word is "you".
Language Model
• In the following sections we will formalize this
intuition by introducing models that assign a
probability to each possible next word.
• The same models will also serve to assign a
probability to an entire sentence.
Language Model
• For example, a language model could predict that sequence (1) below has a much higher probability of appearing in a text than sequence (2):
• 1) all of a sudden I notice three guys standing on the sidewalk (makes sense)
• 2) on guys all I of notice sidewalk three a sudden standing the (word salad)
Applications
• speech recognition
• spelling correction
• grammatical error correction
• machine translation
Language Model
• Models that assign probabilities to sequences of words are called language models or LMs.
• Here we introduce the simplest model that assigns probabilities to sentences and sequences of words: the n-gram.
N-gram
• An n-gram is a sequence of n words.
• 2-gram (bigram): a two-word sequence of words.
• 3-gram (trigram): a three-word sequence of words.
• We use n-gram models to estimate the probability of the last word of an n-gram given the previous words, and also to assign probabilities to entire sequences (see the sketch below).
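A minimal sketch in Python of how such n-grams can be extracted from a token list (the function name extract_ngrams is illustrative, not from any library):

def extract_ngrams(words, n):
    # Slide a window of size n over the token list.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

tokens = "all of a sudden I notice three guys".split()
print(extract_ngrams(tokens, 2))  # bigrams: ('all', 'of'), ('of', 'a'), ...
print(extract_ngrams(tokens, 3))  # trigrams: ('all', 'of', 'a'), ...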
Conditional Probability
• We will use the P(A|B) notation to represent the conditional probability of A given that the event B has occurred. B is the "conditioning event."
• P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0.
Conditional Probability Example
• Suppose that of all individuals buying a certain digital camera, 60% include an optional memory card in their purchase, 40% include an extra battery, and 30% include both a card and a battery. Given that the selected individual purchased an extra battery, what is the probability that an optional card was also purchased?
Conditional Probability solution
• A = {memory card purchased}
• B = {battery purchased}
• P(A) = 0.60
• P(B) = 0.40
• P(A ∩ B) = 0.30
• P(A|B) = P(A ∩ B) / P(B) = 0.30 / 0.40 = 0.75
• That is, of all those purchasing an extra battery, 75% purchased an optional memory card.
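The same arithmetic as a small Python sketch (the variable names are illustrative):

# Camera example: A = memory card purchased, B = battery purchased.
p_a, p_b, p_a_and_b = 0.60, 0.40, 0.30
p_a_given_b = p_a_and_b / p_b  # P(A|B) = P(A ∩ B) / P(B)
print(p_a_given_b)  # 0.75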
Joint Probability
• The joint probability P(A ∩ B) is the probability that events A and B both occur.
• P(A ∩ B) = P(A|B) · P(B)
• Applied to word sequences, this gives the chain rule: P(w1 w2 ... wn) = P(w1) P(w2|w1) P(w3|w1 w2) ... P(wn|w1 ... wn−1)
Joint Probability Example
• From the camera example: P(A ∩ B) = P(A|B) · P(B) = 0.75 × 0.40 = 0.30, matching the 30% who purchased both.
Bigram
• <s> I am Jack </s>
• <s> Jack am I </s>
• <s> Jack I like </s>
• <s> Jack I do like </s>
• <s> Do I like Jack </s>
• Assume we use a bigram model estimated from these five sentences.
• Find the most probable next word after each context (see the sketch below):
• 1) Jack ...   2) Jack I do ...
• 3) Jack I am Jack ...   4) do I like Jack ...
• Find the sentence probability of:
• 1) I like Jack
• 2) Jack like nothing
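A minimal sketch of an unsmoothed bigram model estimated from the five training sentences above; the helper names are illustrative. Tokens are lowercased so "Do" and "do" count as the same word.

from collections import Counter

corpus = [
    "<s> I am Jack </s>",
    "<s> Jack am I </s>",
    "<s> Jack I like </s>",
    "<s> Jack I do like </s>",
    "<s> do I like Jack </s>",
]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = sent.lower().split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p_bigram(w_prev, w):
    # Maximum-likelihood estimate: C(w_prev w) / C(w_prev)
    return bigrams[(w_prev, w)] / unigrams[w_prev]

# Most probable next word after "Jack" ("i" and "</s>" tie at 2/5):
print(max(unigrams, key=lambda w: p_bigram("jack", w)))

# Sentence probability of "I like Jack":
tokens = "<s> i like jack </s>".split()
p = 1.0
for w_prev, w in zip(tokens, tokens[1:]):
    p *= p_bigram(w_prev, w)
print(p)  # (1/5) * (2/5) * (1/3) * (2/5) ≈ 0.0107

Note that "Jack like nothing" gets probability 0 under this model, because "nothing" never occurs in the training data. This is exactly the problem smoothing addresses.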
Evaluating Language Models
• extrinsic evaluation
• intrinsic evaluation
extrinsic evaluation
• The best way to evaluate the performance of a
language model is to embed it in an
application and measure how much the
application improves. Such end-to-end
evaluation is called extrinsic evaluation.
intrinsic evaluation
• An intrinsic evaluation metric is one that measures the quality of a model independent of any application.
Smoothing Techniques
• Add-1 / Laplace
• Add-K
• Backoff and Interpolation
• Advanced:
• Good-Turing
• Kneser-Ney
Add-1 / Laplace Smoothing
• Add-1 (Laplace) smoothing adds one to every count.
• Unigram: P_Add-1(wi) = (C(wi) + 1) / (N + V), where N is the total number of tokens and V is the vocabulary size.
• Bigram: P_Add-1(wn|wn−1) = (C(wn−1 wn) + 1) / (C(wn−1) + V)
Add-K Smoothing
• Add-K smoothing adds a fractional count k instead of 1 (e.g. k = 0.5).
• Unigram: P_Add-k(wi) = (C(wi) + k) / (N + kV)
• Bigram: P_Add-k(wn|wn−1) = (C(wn−1 wn) + k) / (C(wn−1) + kV)
Example
• Training corpus (N = 16 tokens, V = 4 word types: I, live, in, India):
• I live in India
• I live in
• I live
• in India
• India I live
• I in
• Test sentence: I love to live in India
Unigram
• With k = 0.5, N = 16, V = 4:
• P(I) = (5 + 0.5) / (16 + 0.5×4) = 5.5/18 ≈ 0.31
• P(love) = (0 + 0.5) / 18 ≈ 0.03 (unseen word)
• P(to) = 0.5/18 ≈ 0.03 (unseen word)
• P(live) = (4 + 0.5) / 18 = 0.25
• P(in) = 4.5/18 = 0.25
• P(India) = (3 + 0.5) / 18 ≈ 0.19
Bigram
• P(love|I) = (0 + 0.5) / (5 + 0.5×4) = 0.5/7 ≈ 0.07
• P(to|love) = (0 + 0.5) / (0 + 2) = 0.25
• P(live|to) = (0 + 0.5) / (0 + 2) = 0.25
• P(in|live) = (2 + 0.5) / (4 + 2) = 2.5/6 ≈ 0.42
• P(India|in) = (2 + 0.5) / (4 + 2) ≈ 0.42
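A sketch that reproduces the add-k numbers above (k = 0.5; the helper names are illustrative). Following the slide, V counts only the 4 word types seen in training.

from collections import Counter

corpus = ["I live in India", "I live in", "I live",
          "in India", "India I live", "I in"]

k = 0.5
unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = sent.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

N = sum(unigrams.values())  # 16 tokens
V = len(unigrams)           # 4 types: I, live, in, India

def p_unigram(w):
    # (C(w) + k) / (N + kV); Counter returns 0 for unseen words.
    return (unigrams[w] + k) / (N + k * V)

def p_bigram(w_prev, w):
    # (C(w_prev w) + k) / (C(w_prev) + kV)
    return (bigrams[(w_prev, w)] + k) / (unigrams[w_prev] + k * V)

print(p_unigram("I"))          # 5.5/18 ≈ 0.31
print(p_unigram("love"))       # 0.5/18 ≈ 0.03 (unseen word)
print(p_bigram("I", "love"))   # 0.5/7  ≈ 0.07
print(p_bigram("live", "in"))  # 2.5/6  ≈ 0.42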
Backoff and Interpolation
• Backoff
• Use the (n−1)-gram estimate instead of the n-gram whenever the n-gram probability is zero (i.e., the n-gram was never seen in training).
Backoff and Interpolation
• Interpolation
• Mix the trigram, bigram, and unigram estimates with weights λ:
• P̂(wn|wn−2 wn−1) = λ1 P(wn|wn−2 wn−1) + λ2 P(wn|wn−1) + λ3 P(wn)
• such that the λs sum to 1: Σi λi = 1
• Worked example on the corpus above, with λ1 = 0.2 (trigram), λ2 = 0.4 (bigram), λ3 = 0.4 (unigram); unigram probabilities are C(w)/16 (see the sketch below):
• P̂(to|I love) = 0.2×0 + 0.4×0 + 0.4×0 = 0 ("to" never occurs in the corpus)
• P̂(live|love to) = 0.2×0 + 0.4×0 + 0.4×(4/16) = 0.10
• P̂(in|to live) = 0.2×0 + 0.4×(2/4) + 0.4×(4/16) = 0.30
• P̂(India|live in) = 0.2×(1/2) + 0.4×(2/4) + 0.4×(3/16) = 0.375
• The same idea extends to 4-grams. Estimating P̂(India|to live in) with λ = (0.2, 0.2, 0.3, 0.3):
• 0.2×P(India|to live in) + 0.2×P(India|live in) + 0.3×P(India|in) + 0.3×P(India)
• = 0.2×0 + 0.2×(1/2) + 0.3×(2/4) + 0.3×(3/16) ≈ 0.31 (the 4-gram term is 0 because "to live in" never occurs in the corpus)
• With λ = (0.1, 0.1, 0.4, 0.4):
• = 0.1×0 + 0.1×(1/2) + 0.4×(2/4) + 0.4×(3/16) ≈ 0.33
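A sketch of linear interpolation over the same corpus (the λ values match the worked example; the helper names are illustrative):

from collections import Counter

corpus = ["I live in India", "I live in", "I live",
          "in India", "India I live", "I in"]

uni, bi, tri = Counter(), Counter(), Counter()
for sent in corpus:
    t = sent.split()
    uni.update(t)
    bi.update(zip(t, t[1:]))
    tri.update(zip(t, t[1:], t[2:]))

N = sum(uni.values())  # 16 tokens

def p1(w):        # unigram: C(w) / N
    return uni[w] / N

def p2(a, w):     # bigram: C(a w) / C(a), taken as 0 for an unseen context
    return bi[(a, w)] / uni[a] if uni[a] else 0.0

def p3(a, b, w):  # trigram: C(a b w) / C(a b), taken as 0 for an unseen context
    return tri[(a, b, w)] / bi[(a, b)] if bi[(a, b)] else 0.0

def interp(a, b, w, l1=0.2, l2=0.4, l3=0.4):
    # The lambdas must sum to 1.
    return l1 * p3(a, b, w) + l2 * p2(b, w) + l3 * p1(w)

print(interp("love", "to", "live"))   # 0.2*0 + 0.4*0 + 0.4*(4/16) = 0.10
print(interp("to", "live", "in"))     # 0.2*0 + 0.4*(2/4) + 0.4*(4/16) = 0.30
print(interp("live", "in", "India"))  # 0.2*(1/2) + 0.4*(2/4) + 0.4*(3/16) = 0.375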
Kneser-Ney Smoothing
• An advanced method that combines absolute discounting with a "continuation" probability: a word's unigram weight depends on how many different contexts it appears in, not just its raw frequency.
POS Tagging
• Part of Speech (POS) Tagging is defined as the process of assigning one of the parts of speech to a given word.
• POS tagging is the task of labelling each word in a sentence with its appropriate part of speech.
• Parts of speech include nouns, verbs, adverbs, adjectives, pronouns, conjunctions, and their sub-categories.
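For a quick hands-on illustration, the NLTK library ships a ready-made tagger (assuming nltk is installed and its tokenizer/tagger models have been downloaded; exact model names can vary across NLTK versions):

import nltk
# One-time downloads, names as of recent NLTK releases:
# nltk.download('punkt')
# nltk.download('averaged_perceptron_tagger')

tokens = nltk.word_tokenize("Jane will run home")
print(nltk.pos_tag(tokens))
# e.g. [('Jane', 'NNP'), ('will', 'MD'), ('run', 'VB'), ('home', 'NN')]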
POS Tagging
• noun --> person / place / organization name
• modal verb --> will, can, could, should, might, must
• verb --> action: run, eat, speak, listen, do
• adjective --> describes a quality of a noun
Rule-based
• Rule-based taggers use a dictionary or lexicon to get the possible tags for each word. If a word has more than one possible tag, hand-written rules are used to identify the correct tag.
Stochastic
• Word frequency: tag each word with the tag it receives most often in a tagged corpus.
• Tag sequences: prefer the tag sequence that is most probable given the neighbouring tags.
Transformation-based
• Combination of the rule-based and stochastic approaches (e.g., the Brill tagger).
POS Tagging Example
• Collect labeled data (a corpus of tagged sentences).
• Create a lookup table: tag each word with its most common POS in the labeled data.
• Tag each word of the input sentence using the lookup table (see the sketch below).
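A minimal sketch of this lookup-table approach; the tiny labeled data set and tag names are made up for illustration:

from collections import Counter, defaultdict

# Step 1: labeled data as (word, tag) pairs, e.g. from a tagged corpus.
labeled = [("I", "PRON"), ("can", "MODAL"), ("run", "VERB"),
           ("the", "DET"), ("can", "NOUN"), ("can", "MODAL"),
           ("run", "NOUN"), ("fast", "ADV")]

# Step 2: build the lookup table with each word's most common tag.
tag_counts = defaultdict(Counter)
for word, tag in labeled:
    tag_counts[word.lower()][tag] += 1
lookup = {w: c.most_common(1)[0][0] for w, c in tag_counts.items()}

# Step 3: tag a sentence; unseen words fall back to a default tag.
sentence = "I can run fast".split()
print([(w, lookup.get(w.lower(), "NOUN")) for w in sentence])
# [('I', 'PRON'), ('can', 'MODAL'), ('run', 'VERB'), ('fast', 'ADV')]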
Hidden Markov Model (HMM)
• An HMM tagger needs two things:
• Emission probabilities: how likely a word is given a tag, e.g. how likely "Jane" is to be emitted as a Noun, Modal, or Verb.
• Transition probabilities: how likely one tag is to follow another, e.g. how likely a Noun is to be followed by a Modal, which is followed by a Verb.
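A minimal sketch of how these two tables combine to score a tag sequence; the toy probabilities are made up for illustration (a real HMM tagger estimates them from a tagged corpus and uses the Viterbi algorithm to search over tag sequences):

emission = {    # P(word | tag)
    ("Jane", "NOUN"): 0.2, ("will", "MODAL"): 0.8,
    ("spot", "VERB"): 0.4, ("spot", "NOUN"): 0.1,
}
transition = {  # P(tag | previous tag); "<s>" marks the sentence start
    ("<s>", "NOUN"): 0.6, ("NOUN", "MODAL"): 0.5,
    ("MODAL", "VERB"): 0.7, ("MODAL", "NOUN"): 0.1,
}

def score(words, tags):
    # Multiply transition and emission probabilities along the sequence.
    p, prev = 1.0, "<s>"
    for w, t in zip(words, tags):
        p *= transition.get((prev, t), 0.0) * emission.get((w, t), 0.0)
        prev = t
    return p

words = ["Jane", "will", "spot"]
print(score(words, ["NOUN", "MODAL", "VERB"]))  # ≈ 0.0134
print(score(words, ["NOUN", "MODAL", "NOUN"]))  # ≈ 0.0005

The tagger prefers NOUN MODAL VERB here because both its transitions and its emissions are more probable.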
Morphology
• The study of how words are formed.
Different types of words:
- Have exact meaning on their own: pen, board, phone
- Combination of meaningful parts: showcase (show + case), useless (use + less)
- Have no meaning in isolation: ing, s, es
Morphology parsing
• Collect morphemes (the smallest meaningful units, which cannot be divided further) from words.
• Morpheme
– Stem word
– Affix (suffix: loved; prefix: reform; infix: passersby)
Morphology parser
• Lexicon – stored information such as which words are stem words and how affixes are formed.
• Morphotactics – rules about which morpheme may appear before, after, or between other morphemes.
– It is a set of rules.
– Ex: three morphemes: use, able, ness
– The only meaningful ordering is use + able + ness: "useableness" (not, e.g., use + ness + able)
Morphology parser
• Orthographic Rules – spelling rules used to adjust words when morphemes combine.
• Ex: lady + s = ladys (not a proper word)
• lady + s = ladies (proper word; see the sketch below)
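A tiny sketch of this orthographic rule in code (the rule set is deliberately simplified; real morphological analysers, often built on finite-state transducers, handle many more cases):

def pluralize(noun):
    # Orthographic rule: consonant + "y" -> "ies" (lady -> ladies);
    # otherwise simply attach the suffix "s".
    if len(noun) > 1 and noun.endswith("y") and noun[-2] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"

print(pluralize("lady"))  # ladies (not "ladys")
print(pluralize("boy"))   # boys (vowel + "y" keeps the "y")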
Types of Morpheme
• 1) Free morpheme: an independent word.
• Two types of free morpheme:
A) Lexical: content words that can be pictured or visualised, like pen, book, yellow, eyes
B) Grammatical: function words like and, or, not
Types of Morpheme
• 2) Bound – combines with a free morpheme to make a meaningful word.
– Ex: love + ing = loving
• Two types of bound morpheme:
• A) Inflectional – an affix added to a free morpheme that does not change the POS tag.
– Ex: cat (Noun) + s = cats (Noun)
Types of Morpheme
B) Derivational
Class changing – the POS tag changes
Ex: danger (Noun) + ous = dangerous (Adj.)
Class maintaining – the word changes but not the POS
Ex: law (Noun) + yer = lawyer (Noun)