- Berlin
- in/romangrebennikov
Stars
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Code and documentation to train Stanford's Alpaca models, and generate the data.
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
MTEB: Massive Text Embedding Benchmark
Vendor-agnostic orchestration for training, inference and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.
pytest fixture for benchmarking code
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Metric learning and retrieval pipelines, models and zoo.
Finetuning Large Language Models on One Consumer GPU in 2 Bits
Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving Product Search
Full text search that feels like a numpy array
Pure-Python Server-Sent Events (SSE) client
An efficient PyTorch implementation of the evaluation metrics in recommender systems.
Experimental code for our paper on informative and diverse sampling of negative examples for dense retrieval