lemmatization

Here are 480 public repositories matching this topic...

Documents and queries are represented as vectors. Each dimension corresponds to a separate term. If a term occurs in the document, its value in the vector is non-zero. Several different ways of computing these values, also known as (term) weights, have been developed. One of the best known schemes is tf-idf weighting (see the example below). The…

  • Updated Apr 23, 2020
  • Python
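The weighting scheme described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's own code: the function name `tf_idf_vectors`, the whitespace tokenizer, and the sample documents are all assumptions, and the formula used (raw term count times `ln(N / df) + 1`) is only one of several common tf-idf variants.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vocabulary, vectors): one weight per vocabulary term per document."""
    # Assumption: lowercase whitespace tokenization is good enough for a sketch.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Each dimension corresponds to one term; absent terms get weight 0.
        vectors.append(
            [counts[term] * (math.log(n_docs / df[term]) + 1.0) for term in vocab]
        )
    return vocab, vectors


# Hypothetical sample documents.
docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, vectors = tf_idf_vectors(docs)
```

With this variant, a term that occurs in a document always gets a non-zero weight, and terms that appear in fewer documents are weighted more heavily than terms that appear in all of them.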

A spam classifier is a program or machine learning model that automatically labels incoming messages or content as either "spam" (unwanted or irrelevant) or "ham" (legitimate or relevant).

  • Updated Nov 30, 2023
  • Jupyter Notebook

Project for the Data Laboratories course, written in Python, covering web scraping, data frame curation, data visualization, classification, natural language processing, and regression models.

  • Updated May 5, 2024
  • Jupyter Notebook

NLP Explorer is an interactive Streamlit app that lets users explore various NLP techniques like Tokenization, POS Tagging, Stemming, Lemmatization, and NER. It provides real-time analysis of text, making it a great tool for learning and experimenting with NLP concepts.

  • Updated Nov 16, 2024
  • Python
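To make the difference between two of the techniques listed above concrete, here is a toy comparison of stemming and lemmatization. It is a sketch under stated assumptions and is not taken from the NLP Explorer app: `naive_stem`, `naive_lemmatize`, and the `LEMMAS` table are hypothetical, and real tools rely on a full lexicon and part-of-speech information rather than a four-entry dictionary.

```python
def naive_stem(word):
    """Strip a few common English suffixes; the result need not be a real word."""
    for suffix in ("ing", "ies", "es", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Hypothetical, tiny lookup table standing in for a real lexicon.
LEMMAS = {"studies": "study", "running": "run", "mice": "mouse", "better": "good"}


def naive_lemmatize(word):
    """Map a word to its dictionary form, or return it unchanged if unknown."""
    return LEMMAS.get(word, word)


words = ["studies", "running", "mice", "better"]
stems = [naive_stem(w) for w in words]
lemmas = [naive_lemmatize(w) for w in words]
```

The stemmer only chops suffixes, so it can produce non-words and misses irregular forms such as "mice". The lemmatizer returns dictionary forms, but only for words its lexicon knows.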
