text-clustering

Here are 44 public repositories matching this topic...

jameswniu / research_doc_extraction_rag_agent

Turn messy survey responses into clean research insights. Dual-model pipeline: Claude Opus 4.5 extracts themes and assigns participants, GPT-5.1 writes executive summaries. Tuned temperatures for precision where it matters.

nlp text-analysis survey-analysis text-clustering qualitative-research openai-api llm thematic-analysis research-automation claude-api llm-pipeline dual-model

Updated Dec 3, 2025
Python

TranTungDuong1611 / CTAI_MachineLearning_Project

A comprehensive news aggregation and text analysis system that leverages advanced machine learning techniques to process Vietnamese news articles.

machine-learning mvc deep-learning text-classification text-summarization system-design text-clustering mlops stacking-ensemble

Updated Sep 29, 2025
Python

ArikReuter / TopicGPT

TopicGPT integrates the benefits of LLMs into topic modeling.

nlp natural-language-processing text-mining topic-modeling gpt text-clustering topic-modeling-analysis gpt-3 openai-api gpt-4 chatgpt

Updated Sep 19, 2025
Python

JuanLara18 / Text-Classification-System

Modular pipeline for text clustering, classification, and evaluation using TF-IDF and unsupervised ML techniques

nlp unsupervised-learning tfidf text-clustering

Updated Jun 30, 2025
Python

yuuusha / topic-modeling

The repository contains notebooks and data for the second-year coursework "Topic modeling for text document analysis".

data-science machine-learning topic-modeling data-analysis nlp-machine-learning lda-model text-clustering nmf-matrix-factorization lsa-model

Updated Jun 13, 2025
Python

Navy10021 / Parallel_Clustering_based_TM

Parallel clustering-based Topic Modeling

nlp topic-modeling keyword-extraction text-clustering bert-model bert-embeddings

Updated Mar 18, 2025
Python

xlang-ai / instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

information-retrieval text-classification embeddings language-model text-embedding text-clustering text-semantic-similarity text-evaluation prompt-retrieval text-reranking

Updated Jan 15, 2025
Python

michellemashutian / clusteringText

The repository provides a pipeline for preprocessing text data, extracting features, and applying clustering algorithms like K-means, DBSCAN, or hierarchical clustering.

python lda kmeans-clustering dbscan-clustering text-clustering lsi-model sklearn-library sklearn-clustering

Updated Dec 26, 2024
Python

ScottishFold007 / FastThresholdClustering

FastThresholdClustering is an efficient vector clustering algorithm based on FAISS, particularly suitable for large-scale vector data clustering tasks. The algorithm features intuitive and easy-to-select hyperparameters, uses cosine similarity as its distance metric, and supports GPU acceleration.

clustering-algorithm text-clustering

Updated Dec 17, 2024
Python

KeremZaman / semantic-sh

semantic-sh is a SimHash implementation that detects and groups similar texts by harnessing word vectors and transformer-based language models (BERT).

text-similarity simhash transformer locality-sensitive-hashing fasttext bert text-search word-vectors text-clustering

Updated Jul 25, 2024
Python

plkmo / NLP_Toolkit

Library of state-of-the-art models (PyTorch) for NLP tasks

nlp natural-language-processing text-classification machine-translation pytorch style-transfer speech-recognition text-summarization nlp-library text-clustering punctuation-restoration

Updated Jul 25, 2024
Python

jwchoi95 / matsciexp

Official source code for "Quantitative Topic Analysis of Materials Science Literature Using Natural Language Processing"

topic-modeling materials-science text-clustering megatrends

Updated Jul 18, 2024
Python

LMU-Seminar-LLMs / TopicGPT

TopicGPT integrates the benefits of LLMs into topic modeling.

nlp natural-language-processing text-mining topic-modeling gpt text-clustering openai-api large-language-models chatgpt

Updated Jun 22, 2024
Python

binhetech / text_clustering

Text clustering

nlp clustering text-clustering

Updated Jun 21, 2024
Python

saaadiqh / NLP-Learning_Analytics

Developing Natural Language Processing tools to enhance Learning Analytics. Creating an automated dashboard that diagnoses strengths and weaknesses from educational data.

natural-language-processing deep-learning text-classification learning-analytics text-summarization topic-modeling text-clustering transformer-architecture large-language-models