Stars
[CVPR 2026 Highlight] A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Wrap Gemini CLI, Antigravity, ChatGPT Codex, Claude Code, Grok Build as an OpenAI/Gemini/Claude/Codex compatible API service, allowing you to enjoy the free Gemini 3.1 Pro, GPT 5.5, Grok 4.3, Claud…
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.
Latent Spatial Memory for Video World Models
A compilation of the best multi-agent papers
Reproduction code for Lattice Deduction Transformers
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Gaze-LLE-DINOv3: Gaze Target Estimation via Large-Scale Learned Encoders with DINOv3.
SOTA small LMs tuned for Adreno 6xx GPUs on non-flagship Android phones. Pure C++/OpenCL.
PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion
Create stunning demos for free. Open-source, no subscriptions, no watermarks, and free for commercial use. An alternative to Screen Studio.
🔥 Official impl. of "DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing".
A Simple and Universal Swarm Intelligence Engine, Predicting Anything.
[CVPR 2026] Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
SenseNova-U series: Native Unified Paradigm with NEO-unify from the First Principles
gah is a GitHub Releases app installer that does not require sudo
BigStationW / ComfyUI-ppm
Forked from pamparamm/ComfyUI-ppm
Attention Couple for SDXL and Anima; NegPiP (negative weights in prompts) for SDXL and Anima; etc.
Real-time stream editing pipeline powered by the FLUX.2-klein-4B model, optimized for consumer GPUs
CUDA kernels that exploit LLM sparsity to improve throughput and reduce memory requirements during inference and training.
Official Repo of "D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models"
ExTV / rikkahub-agent
Forked from rikkahub/rikkahub
RikkaHub Agent is a RikkaHub fork that has a full agent mode.
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++
Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation