Lab NLP 4

This assignment for the B. Tech course in Natural Language Processing focuses on text normalization and preprocessing techniques. Students will perform tasks such as lowercasing, punctuation removal, tokenization, and stopword elimination on raw English text. The assignment includes practical coding exercises using NLTK to prepare text for NLP tasks like classification or sentiment analysis.

Uploaded by

vikrammadhad


School of Computer Science Engineering and Technology

Assignment-4
Course-B. Tech. Type- Specialization Elective
Course Code- CSET246 Course Name-Natural Language Processing

Year- 2025 Semester- Even

Date- Batch-All

Text Normalization and Preprocessing

Objective:
This assignment focuses on performing essential text preprocessing steps including lowercasing,
punctuation removal, tokenization, stopword elimination, and basic text analysis. Through
hands-on tasks, students will learn how to clean and prepare raw English text for downstream
NLP tasks like classification or sentiment analysis.

Q1. Perform basic text normalization on English text.

• Input sentence: "The Examination's RESULT was Declared!!"
• Normalize:
  o Lowercase
  o Remove punctuation
  o Remove stopwords

import string

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Download NLTK data (only needed once)
nltk.download('punkt')
nltk.download('punkt_tab')  # assumption: newer NLTK releases load the tokenizer from this resource
nltk.download('stopwords')

# Input sentence
sentence = "The Examination's RESULT was Declared!!"

# Step 1: Lowercase
sentence = sentence.lower()

# Step 2: Remove punctuation
# Note: this also drops the apostrophe, so "examination's" becomes "examinations"
sentence = sentence.translate(str.maketrans('', '', string.punctuation))

# Step 3: Tokenize the sentence
tokens = word_tokenize(sentence)

# Step 4: Remove stopwords
stop_words = set(stopwords.words('english'))
filtered_tokens = [word for word in tokens if word not in stop_words]

# Final output
print("Normalized Tokens:", filtered_tokens)

Q2. Tokenize sentences and words using NLTK.

• Input: 2–3 sentences
• Use nltk.sent_tokenize() and nltk.word_tokenize()
• Show sentence and word tokens

Q3. Identify if characters in the text are alphabets, digits, or special characters.
• Input: "Student123 scored 95%! Great job!!"
• Output:
  o Alphabets: Student, scored, Great, job
  o Digits: 123, 95
  o Special Characters: %, !, !!
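
A minimal sketch for Q3 follows, using only the standard re module. The function name classify_characters is an assumption made for illustration, and the patterns treat only ASCII letters and digits as alphabets and digits; everything else except whitespace counts as a special character. Runs of the same special character are grouped so that "!!" is reported as one item, matching the sample output above.

```python
import re


def classify_characters(text):
    """Split text into runs of letters, runs of digits, and runs of special characters."""
    # Runs of ASCII letters, e.g. "Student"
    alphabets = re.findall(r"[A-Za-z]+", text)
    # Runs of ASCII digits, e.g. "123"
    digits = re.findall(r"[0-9]+", text)
    # A special character followed by repeats of that same character, e.g. "!!"
    specials = [m.group(0) for m in re.finditer(r"([^A-Za-z0-9\s])\1*", text)]
    return alphabets, digits, specials


text = "Student123 scored 95%! Great job!!"
alphabets, digits, specials = classify_characters(text)

print("Alphabets:", ", ".join(alphabets))
print("Digits:", ", ".join(digits))
print("Special Characters:", ", ".join(specials))
```

A per-character alternative is to loop over the text and test str.isalpha() and str.isdigit(); the regular-expression version is used here because the expected output groups adjacent characters.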
