Topic Modelling for Humans
Updated Nov 1, 2025 - Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Starter code to solve real-world text data problems. Includes: Gensim Word2Vec, phrase embeddings, text classification with logistic regression, word count with PySpark, simple text preprocessing, pre-trained embeddings, and more.
🦆 Contextually-keyed word vectors
Data repository for pretrained NLP models and NLP corpora.
Log Anomaly Detection - Machine learning to detect abnormal event logs
A fast, efficient universal vector embedding utility package.
ML-based projects such as spam classification, time series analysis, and text classification using Random Forest, deep learning, Bayesian methods, and XGBoost in Python
ADAM - A Question Answering System. Inspired by IBM Watson
Compute Sentence Embeddings Fast!
AraVec is an open-source project of pre-trained distributed word representations (word embeddings) that aims to provide the Arabic NLP research community with free-to-use, powerful word embedding models.
Extensive tutorials for the Advanced NLP Workshop at the Open Data Science Conference Europe 2020. We will leverage machine learning, deep learning and deep transfer learning to learn and solve popular NLP tasks including NER, Classification, Recommendation / Information Retrieval, Summarization, Language Translation, Q&A and T…
The TensorFlow reference implementation of 'GEMSEC: Graph Embedding with Self Clustering' (ASONAM 2019).
Toolkit to obtain and preprocess German text corpora, train models, and evaluate them with generated test sets. Built with Gensim and TensorFlow.
A practical guide to topic mining and interactive visualizations
Web-ify your word2vec: framework to serve distributional semantic models online
D-Lab's 9-hour introduction to text analysis with Python. Learn how to perform bag-of-words, sentiment analysis, topic modeling, word embeddings, and more, using scikit-learn, NLTK, gensim, and spaCy in Python.
A PyTorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).