Korean R&D document pipeline for patents, grants, papers, and PDFs with KSS-based preprocessing, KoSentenceBERT embeddings, and Qdrant semantic search via FastAPI.
Updated Nov 18, 2025 - Python
Python CLI & library for automated journal vetting — GPT‑4.1 summarization, YAML configuration, reproducible analysis.
A Fast, Adaptive, Stable, and Transferable Topic Model (NeurIPS 2024)
Top2Vec learns jointly embedded topic, document and word vectors.
This Streamlit application demonstrates the integration of ChatGroq (Llama3 model), OpenAIEmbeddings, and FAISS for document embedding and retrieval.
🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted at EMNLP 2021 🌴
Content-based book recommendation system
Expose a Top2Vec model with a REST API.
An open-source framework to create and test document embeddings using topic models.
Container-first, JSON-configurable, NLP REST service based on Flair
Experiments on Neural Language Embeddings