NLP Experiment 3

EXPERIMENT - 3

AIM: To construct a Part-of-Speech (POS) tagger using the Hidden Markov Model (HMM) and
implement the Viterbi algorithm to decode the most probable sequence of tags for a given sentence.

DESCRIPTION: Part-of-Speech (POS) tagging is the process of assigning a grammatical category
to each word in a sentence based on its role in context. These categories include noun, verb,
adjective, adverb, preposition, and others. POS tagging helps computers understand the structure of
language, making it easier for them to process and analyze text.

Example:
Sentence: “The cat is sitting on the mat.”
POS tags:
• The → Determiner
• cat → Noun
• is → Verb
• sitting → Verb
• on → Preposition
• the → Determiner
• mat → Noun
POS tagging with a Hidden Markov Model:
A Hidden Markov Model (HMM) is a statistical model used to understand systems where the actual
states are not directly visible, but we can see outcomes that depend on those hidden states. In
simple terms, an HMM helps us figure out what is going on behind the scenes based on what we
observe. For example, suppose you are trying to guess the weather each day, but you can't look
outside. Instead, you watch what people do. If someone is walking, shopping, or cleaning, these
actions give you clues about what the weather might be like. In POS tagging, the hidden states are
the tags and the observations are the words of the sentence.
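To make the weather example concrete, here is a minimal sketch of how it could be written down as
an HMM in Python. The probability values below are invented purely for illustration and are not
part of this experiment; the POS tagger built in the following steps learns its own probabilities
from training data.

# Hidden states and visible observations for the weather example.
# All probability values here are made-up illustrative numbers.
weather_states = ["Rainy", "Sunny"]
activities = ["walk", "shop", "clean"]

weather_start_p = {"Rainy": 0.6, "Sunny": 0.4}   # initial probabilities
weather_trans_p = {                              # transition probabilities
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
weather_emit_p = {                               # emission probabilities
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}
# These tables have exactly the shape the viterbi() function in Step 3 expects.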
Step 1: Define the Training Data
We start by creating a small dataset of sentences. Each word is labeled with its correct part of
speech.

train_data = [
    [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB")],
    [("the", "DET"), ("dog", "NOUN"), ("barked", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sat", "VERB")],
]

Step 2: Calculate Probabilities

This step builds the statistical foundation for the HMM. The model counts how often:
• A sentence starts with each tag (initial)
• One tag follows another (transition)
• A word appears with a tag (emission)
Then it converts these counts into probabilities, which the Viterbi algorithm uses later.

from collections import defaultdict

transition = defaultdict(lambda: defaultdict(int))  # tag -> next-tag counts
emission = defaultdict(lambda: defaultdict(int))    # tag -> word counts
start_prob = defaultdict(int)                       # sentence-initial tag counts
tag_counts = defaultdict(int)                       # overall tag counts

for sentence in train_data:
    prev_tag = None
    for i, (word, tag) in enumerate(sentence):
        tag_counts[tag] += 1
        emission[tag][word] += 1
        if i == 0:
            start_prob[tag] += 1
        else:
            transition[prev_tag][tag] += 1
        prev_tag = tag

def normalize(d):
    """Convert a dictionary of counts into a dictionary of probabilities."""
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

start_prob = normalize(start_prob)
for tag in emission:
    emission[tag] = normalize(emission[tag])
for prev in transition:
    transition[prev] = normalize(transition[prev])
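
As a quick sanity check (assuming the code above has just run), we can print some of the learned
probabilities. With the three training sentences given, every sentence starts with DET, DET is
always followed by NOUN, and NOUN by VERB, so those probabilities come out as 1.0, while the
emission probabilities reflect word frequencies:

print(start_prob)          # {'DET': 1.0}
print(transition["DET"])   # {'NOUN': 1.0}
print(transition["NOUN"])  # {'VERB': 1.0}
print(emission["NOUN"])    # {'cat': 0.333..., 'dog': 0.666...}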

Step 3: Define the Viterbi Algorithm

This is the Viterbi algorithm. It uses the probabilities from Step 2 to find the most likely
sequence of tags for a given sentence. At each word, it considers every possible tag and keeps,
for each tag, the path with the highest probability so far. It continues this process to the end
of the sentence and returns the best sequence of POS tags.
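
Concretely, the table entry V[t][state] built by the code below follows the standard Viterbi
recurrence; this is just the math behind the max(...) call in the code, where $w_t$ is the
$t$-th word, $\pi(s)$ the start probability, and $s'$ ranges over all tags:

$$V_0(s) = \pi(s)\, P(w_0 \mid s), \qquad V_t(s) = \max_{s'} \big[ V_{t-1}(s')\, P(s \mid s') \big]\, P(w_t \mid s)$$
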
def viterbi(sentence, states, start_p, trans_p, emit_p):
    # V[t][state] = highest probability of any tag path ending in `state`
    # at position t; path[state] remembers that best path itself.
    V = [{}]
    path = {}

    # Initialization: start in each state and emit the first word.
    for state in states:
        V[0][state] = start_p.get(state, 0) * emit_p[state].get(sentence[0], 1e-6)
        path[state] = [state]

    # Recursion: extend the best paths one word at a time.
    for t in range(1, len(sentence)):
        V.append({})
        new_path = {}

        for curr_state in states:
            # Choose the predecessor that maximizes the path probability.
            max_prob, prev_state = max(
                (V[t - 1][y0] * trans_p[y0].get(curr_state, 1e-6) *
                 emit_p[curr_state].get(sentence[t], 1e-6), y0)
                for y0 in states
            )
            V[t][curr_state] = max_prob
            new_path[curr_state] = path[prev_state] + [curr_state]
        path = new_path

    # Termination: pick the most probable path at the last position.
    n = len(sentence) - 1
    prob, final_state = max((V[n][y], y) for y in states)
    return path[final_state]
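
One design choice worth noting: the .get(..., 1e-6) calls assign a tiny fallback probability to
any word/tag or tag/tag pair that never appeared in training, instead of a hard zero. This is a
crude stand-in for proper smoothing, but it keeps unseen combinations from zeroing out an entire
path, which is what lets the tagger handle the new sentence in Step 4.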
Step 4: Run the Tagger on a New Sentence
Now we test the HMM model on a new sentence. Even though the model has not seen this exact
sentence before, it uses what it learned to predict the most likely part-of-speech tag for each
word.
test_s = ["a", "cat", "barked"]
states = list(tag_counts.keys())

predicted_tags = viterbi(test_s, states, start_prob, transition, emission)

print("Sentence:", test_s)
print("Predicted Tags:", predicted_tags)

OUTPUT:
Sentence: ['a', 'cat', 'barked']
Predicted Tags: ['DET', 'NOUN', 'VERB']
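
We can check this result by hand. For the winning path DET → NOUN → VERB, the Viterbi probability
is the product of the learned start, transition, and emission probabilities from Step 2:

$$P = \pi(\text{DET}) \cdot P(\text{a} \mid \text{DET}) \cdot P(\text{NOUN} \mid \text{DET}) \cdot P(\text{cat} \mid \text{NOUN}) \cdot P(\text{VERB} \mid \text{NOUN}) \cdot P(\text{barked} \mid \text{VERB}) = 1 \cdot \tfrac{1}{3} \cdot 1 \cdot \tfrac{1}{3} \cdot 1 \cdot \tfrac{1}{3} = \tfrac{1}{27} \approx 0.037$$

Every competing path either starts with a tag whose start probability is zero or multiplies in at
least one 1e-6 fallback, so this path wins by a wide margin.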
