- Moscow
Stars
State-of-the-Art Text Embeddings
A feature-rich command-line audio/video downloader
C library for generating audio fingerprints used by AcoustID
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
An open-source RAG-based tool for chatting with your documents.
"DeepDPM: Deep Clustering With An Unknown Number of Clusters" [Ronen, Finder, and Freifeld, CVPR 2022]
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports comp…
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
Streamlit: A faster way to build and share data apps.
💫 Industrial-strength Natural Language Processing (NLP) in Python
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
OpenHands: AI-Driven Development
Steering vectors for transformer language models in PyTorch / Hugging Face
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.
A scikit-learn compatible neural network library that wraps PyTorch
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
Official code for the paper "CASteer: Steering Diffusion Models for Controllable Generation"
Scalable embedding, reasoning, ranking for images and sentences with CLIP
Simple script for downloading YouTube comments without using the YouTube API
Convert code repos into an LLM prompt-friendly format. Mostly built by GPT-4.