Stars
Compile programs directly into transformer weights. Includes a 2D convex-hull KV cache with O(log n) inference.
Minimal and scalable research codebase in JAX, designed for rapid iteration on frontier research in LLMs and other autoregressive models.
Fully autonomous AI agent system capable of performing complex penetration testing tasks
👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters.
The Destructive Command Guard (dcg) is for blocking dangerous git and shell commands from being executed by agents.
Shaping capabilities with token-level pretraining data filtering
A Foundation Model for Generalist Gaming Agents
SWE-bench: Can Language Models Resolve Real-world GitHub Issues?
MoE training for Me and You and maybe other people
LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group.
Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models
A practical guide to diffusion models, implemented from scratch.
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance to start translating, H…
The official implementation for [NeurIPS 2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
⚡ TabPFN: Foundation Model for Tabular Data ⚡
Official inference repo for FLUX.2 models
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
Official implementation of the Spatial Mask Merging (SMM) algorithm, a post-processing algorithm designed to improve instance segmentation in high-resolution images. It addresses the limitations of…
Framework-agnostic sliced/tiled inference + interactive UI + error analysis plots
When using a U-Net for image segmentation, smoothly blending predicted patches is a must to please the human eye.
A character-level language diffusion model trained on Tiny Shakespeare
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
(CVPR 2022) PyTorch implementation of "Self-supervised transformers for unsupervised object discovery using normalized cut"
Switch the backbone of Mask2Former to DINOv3 for instance segmentation