Starred repositories
JAX reimplementation of the DeepMind paper "Genie: Generative Interactive Environments"
p-doom / jasmine
Forked from FLAIROx/jafar
A simple, performant and scalable JAX-based world modeling codebase
[ICCV 2023] Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction
Enjoy the magic of Diffusion models!
MiniMax-M2, a model built for Max coding & agentic workflows.
L4P -- a feed-forward foundational model designed for multiple low-level 4D vision perception tasks.
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Salesforce Enterprise Deep Research
[EMNLP 2025 Oral] Official codebase for Seeing More, Saying More: Lightweight Language Experts are Dynamic Video Token Compressors.
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
Discrete Wavelet Transform as a Facilitator for Expressive Latent Space Representation in Variational Autoencoders in Satellite Imagery
Recipes for shrinking, optimizing, customizing cutting-edge vision models. 💜
Fast and Universal 3D reconstruction model for versatile tasks
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Just another reasonably minimal repo for class-conditional training of pixel-space diffusion transformers.
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
NVIDIA DeepStream SDK 8.0 / 7.1 / 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Claude Code superpowers: core skills library
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.