Stars
Inference repo for the Falcon-Perception and Falcon-OCR models: early-fusion, natively multimodal, dense autoregressive Transformer models.
Run BitNet b1.58 ternary LLMs with WebGPU in browsers and native apps
Inference server for MioTTS, a lightweight and fast LLM-based TTS model.
A framework for few-shot evaluation of language models.
SGLang is a high-performance serving framework for large language models and multimodal models.
Ongoing research on training transformer models at scale
Renderer for the harmony response format to be used with gpt-oss
Pure Rust engine for BitNet LLMs: conversion, inference, training, and research, with streaming and GPU/CPU support
250+ fine-tuning & RL notebooks for text, vision, audio, embedding, and TTS models.
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.
Efficient implementations for emerging model architectures
Easily fine-tune, evaluate, and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open-source LLM / VLM!
All information and news about the Falcon-H1 series
Lightweight toolkit to train and fine-tune 1.58-bit language models
Build compute kernels and load them from the Hub.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
An extremely fast Python linter and code formatter, written in Rust.
Segment Anything for Microscopy
Minimalistic 4D-parallelism distributed training framework for educational purposes
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
Official inference framework for 1-bit LLMs
Baichuan-Omni: Towards a Capable Open-source Omni-modal LLM
Sharing practical insights and theoretical knowledge about LLM evaluation, gathered while managing the Open LLM Leaderboard and designing lighteval!