Stars
ROSA+: RWKV's ROSA implementation with fallback statistical predictor
Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258)
Novel reinforcement learning based local planner that accounts for the dynamic constraints of the robot to enable smooth robot trajectories. Reward shaping is done to enable a spatially aware navig…
Language modeling with linear-cost context
Simplifying reinforcement learning for complex game environments
FastAPI framework, high performance, easy to learn, fast to code, ready for production
Python fast on-disk dictionary / RocksDB & SpeeDB Python binding
bottle.py is a fast and simple micro-framework for python web-applications.
pickleDB is an in memory key-value store using Python's orjson module for persistence.
A lightweight, pure Python implementation of Hierarchical Navigable Small World (HNSW) algorithm for approximate nearest neighbor search.
Accessible large language models via k-bit quantization for PyTorch.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Unofficial implementation of Tiny Recursive Model (TRM), improvement to HRM from Sapient AI, by Alexia Jolicoeur-Martineau
A hackable library for running and fine-tuning modern transformer models on commodity and alternative GPUs, powered by tinygrad.
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
Build, evaluate and train General Multi-Agent Assistance with ease
TransMLA: Multi-Head Latent Attention Is All You Need (NeurIPS 2025 Spotlight)
Causal Attention with Lookahead Keys
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System