
NLP

Week 01 (3 hours) - Introduction to Natural Language Processing (NLP)

Course Introduction & Motivation: An overview of why NLP matters and where it is
applied.
Multilingualism: The challenges and importance of handling multiple languages in NLP.
Morphology in Languages: Understanding word structure and formation in different
languages.
Part-of-Speech (PoS) Tagging: Introduction to PoS tagging, which assigns a part of
speech to each word in a sentence (a short example follows this list).
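
As a taste of what a tagger produces, here is a minimal sketch using NLTK's
off-the-shelf English tagger (NLTK is an assumption; the course toolkit is not
specified):

    # Minimal PoS-tagging example with NLTK (assumed available).
    # Requires the tokenizer and tagger models, e.g.:
    #   python -m nltk.downloader punkt averaged_perceptron_tagger
    import nltk

    tokens = nltk.word_tokenize("The quick brown fox jumps over the lazy dog")
    print(nltk.pos_tag(tokens))
    # [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'), ...]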

Week 02 (3 hours) - PoS Tagging Layer of NLP

Mathematics of PoS tagging: The underlying math used in PoS tagging models.
Sequences in NLP: Understanding sequential data, such as sentences, and how it is
processed.
NLP Lab 1 (Non-graded): Focuses on simple matrix operations using NumPy and
scikit-learn for basic NLP tasks (a minimal sketch follows this list).
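
The lab's exact contents are not specified here; the sketch below illustrates the
kind of NumPy and scikit-learn matrix work it describes, building a term-document
count matrix and comparing two documents with cosine similarity:

    # Term-document counts with scikit-learn, document similarity with NumPy.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["the cat sat on the mat", "the dog sat on the log"]
    X = CountVectorizer().fit_transform(docs).toarray()   # (2, vocabulary size)

    # Cosine similarity between the two document vectors.
    cos = X[0] @ X[1] / (np.linalg.norm(X[0]) * np.linalg.norm(X[1]))
    print(round(float(cos), 3))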

Week 03 (3 hours) - Hidden Markov Models (HMM) in NLP

PoS Tagging (HMM): Using HMMs to tag parts of speech.
Viterbi Decoding for Tagging and Sequences: Applying the Viterbi algorithm for efficient
sequence tagging (a toy decoder is sketched after this list).
NLP Lab 2 (Non-graded): A PoS tagging task using the most-frequent-tag baseline, in
which each word receives the tag it carries most often in the training data.
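
To make the Viterbi step concrete, below is a toy decoder for a three-tag HMM;
every probability is invented for illustration:

    # Viterbi decoding for a toy HMM PoS tagger (all numbers are made up).
    import numpy as np

    states = ["DT", "NN", "VB"]              # hypothetical tag set
    pi = np.log([0.6, 0.3, 0.1])             # initial tag probabilities
    A = np.log([[0.1, 0.8, 0.1],             # A[i, j] = P(tag j | tag i)
                [0.1, 0.3, 0.6],
                [0.5, 0.4, 0.1]])
    B = np.log([[0.9, 0.05, 0.05],           # B[i, t] = P(word t | tag i)
                [0.05, 0.9, 0.05],           # for an observed 3-word sentence
                [0.05, 0.05, 0.9]])

    T, N = B.shape[1], len(states)
    delta = np.zeros((T, N))                 # best log-prob ending in each tag
    back = np.zeros((T, N), dtype=int)       # backpointers for path recovery
    delta[0] = pi + B[:, 0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + A + B[:, t][None, :]
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0)

    path = [int(delta[-1].argmax())]         # follow backpointers from the end
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    print([states[i] for i in reversed(path)])   # -> ['DT', 'NN', 'VB']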

Week 04 (3 hours) - Handling Sequential Tasks

Shallow Parsing: Breaking a sentence into smaller chunks without building a full
syntactic parse.
Named Entity Recognition (NER): Identifying named entities (people, organizations,
locations) within text; a small BIO-decoding sketch follows this list.
Introduction to Conditional Random Fields (CRF): An advanced method for sequence
prediction tasks such as PoS tagging and NER.
Challenges due to Morphological Richness: Handling the complexity of languages with
rich morphological structures (e.g., inflections, prefixes, suffixes).
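
Chunking and NER systems commonly encode their output with BIO labels
(Begin/Inside/Outside of a chunk). The sketch below, with an invented tag set,
recovers entity spans from such a labeling:

    # Recover (entity type, text) spans from a BIO-tagged token sequence.
    def bio_to_spans(tokens, tags):
        spans, current = [], None
        for tok, tag in zip(tokens, tags):
            if tag.startswith("B-"):                # a new chunk begins
                if current:
                    spans.append(current)
                current = (tag[2:], [tok])
            elif tag.startswith("I-") and current:  # chunk continues
                current[1].append(tok)
            else:                                   # "O" closes any open chunk
                if current:
                    spans.append(current)
                current = None
        if current:
            spans.append(current)
        return [(label, " ".join(words)) for label, words in spans]

    tokens = ["Barack", "Obama", "visited", "Paris", "."]
    tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
    print(bio_to_spans(tokens, tags))  # [('PER', 'Barack Obama'), ('LOC', 'Paris')]
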
Week 05 (3 hours) - Feature Engineering

CRF (contd.): Continuation of CRF methods.
Maximum Entropy Markov Model (MEMM): A discriminative sequence model that extends
the HMM by conditioning transitions on arbitrary input features.
Feature Extraction and Engineering: Techniques for extracting meaningful features from
text data to improve NLP models (an example feature function follows this list).
NLP Lab 3 (Non-graded): A task focused on performing NER in multiple languages.
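
A typical feature function for a CRF or MEMM tagger maps each token to a
dictionary of hand-engineered features. The feature names below are illustrative,
but dictionaries of this shape are what toolkits such as sklearn-crfsuite consume:

    # Hand-engineered per-token features for a CRF/MEMM sequence tagger.
    def token_features(tokens, i):
        word = tokens[i]
        return {
            "word.lower": word.lower(),
            "word.istitle": word.istitle(),    # capitalization cue, useful for NER
            "word.isdigit": word.isdigit(),
            "prefix2": word[:2],
            "suffix3": word[-3:],              # captures inflectional endings
            "prev.word": tokens[i - 1].lower() if i > 0 else "<BOS>",
            "next.word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        }

    print(token_features(["Delhi", "is", "warm"], 0))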

Week 06 (3 hours) - Knowledge Bases and Ambiguity

Ambiguity and NLP: Understanding and addressing the inherent ambiguities in language
processing.
Knowledge Bases (WordNet, FrameNet, VerbNet): Exploring resources like WordNet for
semantic relationships, FrameNet for event structures, and VerbNet for verb
categorization.
Word Sense Disambiguation (WSD): Techniques for determining the correct meaning of a
word in a given context (a simplified Lesk sketch follows this list).
NLP Lab 4 (Graded): A lab focused on disambiguating word senses in context.
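
As one concrete WSD approach, here is a simplified Lesk algorithm: choose the
WordNet sense whose dictionary gloss overlaps most with the context words. It
assumes NLTK with the WordNet corpus installed (python -m nltk.downloader wordnet):

    # Simplified Lesk: pick the sense whose gloss best overlaps the context.
    from nltk.corpus import wordnet as wn

    def simple_lesk(word, context):
        ctx = {w.lower() for w in context}
        best, best_overlap = None, -1
        for sense in wn.synsets(word):
            overlap = len(set(sense.definition().lower().split()) & ctx)
            if overlap > best_overlap:
                best, best_overlap = sense, overlap
        return best

    sense = simple_lesk("bank", "I deposited money at the bank".split())
    if sense:
        print(sense.name(), "-", sense.definition())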

Week 07 (3 hours) - Applications of Neural Networks (NN) in NLP

Cognate Detection and its applications: Identifying cognates (words in different
languages that share a common origin) using NLP techniques.
NER using NNs: Applying neural networks to perform NER.
Text Classification using NNs: Using neural networks to classify text into predefined
categories.
Transformer Architecture: An introduction to the Transformer model, which has
revolutionized NLP tasks like machine translation and text generation; its core
attention operation is sketched after this list.
Introduction to Distributional Semantics: Understanding how word meaning can be
represented in vector spaces based on context.
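
The computational heart of the Transformer is scaled dot-product attention. A
plain-NumPy sketch (toy shapes, random values) follows:

    # Scaled dot-product attention, the core Transformer operation.
    import numpy as np

    def attention(Q, K, V):
        scores = Q @ K.T / np.sqrt(K.shape[-1])          # query-key similarities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
        return weights @ V                               # weighted sum of values

    rng = np.random.default_rng(0)
    Q = rng.normal(size=(3, 4))   # 3 query positions, dimension 4
    K = rng.normal(size=(5, 4))   # 5 key/value positions
    V = rng.normal(size=(5, 4))
    print(attention(Q, K, V).shape)   # (3, 4)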

Week 08 (3 hours) - Distributional Semantics

word2vec, doc2vec, sent2vec: Techniques for representing words, documents, and
sentences as vectors in a continuous space.
Sub-words in NLP: Handling smaller units of text (sub-words) for languages with complex
morphology.
FastText: A model that improves on word2vec by incorporating sub-word information
(a minimal sketch follows this list).
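
A minimal sketch of both models using gensim (the library and its 4.x API are
assumptions; the course may use a different toolkit):

    # word2vec vs. FastText with gensim (assumed version 4.x).
    from gensim.models import FastText, Word2Vec

    sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["cats", "purr"]]

    w2v = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)
    print(w2v.wv.most_similar("cat", topn=2))

    # FastText composes word vectors from character n-grams, so it can
    # embed words never seen in training (useful for rich morphology).
    ft = FastText(sentences, vector_size=50, window=2, min_count=1, epochs=50)
    print(ft.wv["catz"][:5])      # out-of-vocabulary word still gets a vector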
