tomaarsen

Organizations

@nltk @huggingface @embeddings-benchmark @Hugging-Face-Helping-Hand

Starred repositories

SSE (Stable Static Embedding): Unlocking the Potential of Static Embeddings, A Dynamic Tanh Normalization Approach without Speed Penalty

Python 3 Updated Jun 11, 2026

An extensive and commented list of resources on Late-Interaction Multivector Retrieval.

TeX 60 7 Updated Jun 17, 2026

Personal-Model First Self Evolving AI Agent 🐘

Python 565 61 Updated Jun 1, 2026

A Minimalistic Search Agent

TypeScript 75 5 Updated May 12, 2026

How Fast can you pull from Hugging Face?

Python 24 Updated May 12, 2026

🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models

Python 10,489 1,114 Updated Jun 18, 2026

Official Python library for the TeraflopAI API

Python 2 Updated Mar 9, 2026

Robust and fast topic models with sentence-transformers.

Python 114 9 Updated Jun 11, 2026

Give your agents the power of the Hugging Face ecosystem

Python 10,693 703 Updated Jun 18, 2026

AI agents running research on Hugging Face infra

Python 175 17 Updated Mar 29, 2026

Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.

Rust 749 54 Updated Jun 18, 2026

Hundreds of models & providers. One command to find what runs on your hardware.

Rust 28,265 1,725 Updated Jun 17, 2026

Fine-tune SPLADE sparse embedding models for your product catalog. CLI, web dashboard, and Python API.

Python 11 Updated Jun 10, 2026

A missing piece of the Python multitask (both threads and processes) API: An extension that supports stateful worker pools & size-aware iterators.

Python 29 2 Updated Mar 8, 2026

A lightweight inference engine supporting speculative speculative decoding (SSD).

Python 956 72 Updated May 10, 2026
Python 255 10 Updated Apr 17, 2026

Build compute kernels and load them from the Hub.

Python 697 105 Updated Jun 18, 2026

Implementation for Revela: Dense Retriever Learning via Language Modeling - ICLR 2026 Oral

Python 20 3 Updated Mar 26, 2026

Benchmark for vector databases.

Python 1,127 395 Updated Jun 17, 2026

Text and code embeddings research from CodeFuse: C2LLM, D2LLM, E2LLM, F2LLM, ML-Embed

Python 564 74 Updated May 22, 2026

👷 Build compute kernels

Nix 214 36 Updated Apr 6, 2026

Fast BM25 search engine with category theory abstractions

Python 12 Updated Feb 22, 2026
Python 24 10 Updated Apr 29, 2026
Python 81 7 Updated Dec 12, 2025

Multimodal reranker training and benchmarks

Python 4 Updated Dec 1, 2025

HSEB: Hybrid Search Engine Benchmark

Python 21 2 Updated Oct 5, 2025

Nearly Inference Free Embeddings: make your RAG queries 500x faster

Python 78 4 Updated Apr 27, 2026

AI Agent Framework, the Pydantic way

Python 17,844 2,230 Updated Jun 18, 2026

Fast Diversification for Search & Retrieval

Python 493 27 Updated May 24, 2026