This repository contains a collection of Python scripts, prepared by me, demonstrating key Natural Language Processing (NLP) techniques and workflows. Each file focuses on a specific stage of the NLP pipeline, from data cleaning and preprocessing to tokenization, text normalization, and sentiment analysis. The goal of this project is to provide clear, modular, and reproducible code examples that help learners and practitioners understand how to implement core NLP concepts in Python.
🔹 Text Cleaning
🔹 Tokenization
🔹 Stemming and Lemmatization
🔹 Stop Words Removal
🔹 Bag of Words
🔹 TF-IDF
🔹 N-Grams (Text Representation)
🔹 Word2Vec (Word Embedding)
🔹 FastText (Word Embedding)
🔹 N-Gram Models (Probabilistic Language Models)
🔹 Hidden Markov Models (Probabilistic Language Models)
🔹 Maximum Entropy Models (Probabilistic Language Models)
🔹 RNN (Deep Learning)
🔹 LSTM (Deep Learning)
🔹 BERT (Transformers)
🔹 GPT (Transformers)
🔹 LLaMA (Transformers)
🔹 T5 (Transformers)
🔹 Text Classification with Decision Trees
🔹 Morphological Analysis (spaCy)
🔹 Part-of-Speech (POS) Tagging
🔹 Word Sense Disambiguation (Lesk)
🔹 Sentiment Analysis
🔹 Question Answering with Transformers
🔹 Semantic Information Retrieval
🔹 Content-Based Recommendation System with BERT
🔹 Machine Translation (NLLB)
🔹 Machine Translation (Marian)
🔹 Text Summarization
🔹 ChatBot with OpenAI
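
To give a feel for how the first few stages in the list chain together (text cleaning, tokenization, stop words removal, bag of words), here is a minimal sketch that uses only the Python standard library. It is not taken from the scripts in this repository: the function names and the tiny `STOP_WORDS` set are hypothetical and chosen purely for illustration, and the individual scripts may rely on dedicated NLP libraries instead.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "and", "of", "to", "in"}


def clean_text(text):
    """Lowercase the text and replace everything except ASCII letters,
    digits, and whitespace with a space, then collapse repeated spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split already-cleaned text into tokens on whitespace."""
    return text.split()


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word set."""
    return [token for token in tokens if token not in STOP_WORDS]


def bag_of_words(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def preprocess(text):
    """Run the four stages in order and return the token counts."""
    return bag_of_words(remove_stop_words(tokenize(clean_text(text))))


if __name__ == "__main__":
    print(preprocess("The cat sat. The CAT ran!"))
```

Keeping each stage in its own small function mirrors the one-topic-per-file layout described above: any single step can be swapped for a library-backed version without touching the others.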