You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Turn messy survey responses into clean research insights. Dual-model pipeline: Claude Opus 4.5 extracts themes and assigns participants, GPT-5.1 writes executive summaries. Tuned temperatures for precision where it matters.
The repository provides a pipeline for preprocessing text data, extracting features, and applying clustering algorithms like K-means, DBSCAN, or hierarchical clustering.
FastThresholdClustering is an efficient vector clustering algorithm based on FAISS, particularly suitable for large-scale vector data clustering tasks. The algorithm features intuitive and easy-to-select hyperparameters, uses cosine similarity as its distance metric, and supports GPU acceleration.
semantic-sh is a SimHash implementation to detect and group similar texts by taking power of word vectors and transformer-based language models (BERT).
Developing Natural Language Processing tools to enhance Learning Analytics. Creating an automated dashboard that diagnoses strengths and weaknesses from educational data.