Starred repositories
Jandal AI — Local-first Android AI assistant with on-device LLM inference, semantic memory, and extensible skill framework
Mano-P: Open-source GUI-VLA agent for edge devices. #1 on OSWorld (specialized, 58.2%). Runs locally on Apple M4 Mac mini/MacBook; no data leaves your device. Mano-P is an open-source GUI-VLA project that supports running on Mac mini/M…
AlpaSim is an open-source autonomous vehicle simulation platform designed for development and testing of end-to-end AV policies
A framework for efficient model inference with omni-modality models
Persistent file-based planning for AI coding agents and long-running agentic tasks. Crash-proof markdown plans that survive context loss and /clear, plus a deterministic completion gate and multi-a…
QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization
This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord
Code for MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data (CVPR 2025)
[SIGGRAPH 2025] One Model to Rig Them All: Diverse Skeleton Rigging with UniRig
Simulation platform for general-purpose robotics & embodied AI learning.
[ECCV2024] [3DV Nectar 2025] FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally
[ICML 2024] GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
[CVPR'24] Interactive3D: Create What You Want by Interactive 3D Generation
A curated list of awesome LLM/VLM/VLA/World Model resources for Autonomous Driving (LLM4AD), continually updated
super-ai: unified-vision, math-think/mathink; private
[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"
A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
A comprehensive paper list on Vision Transformer/Attention, including papers, code, and related websites
Collection of Remote Sensing Vision-Language Models
Infinite Photorealistic Worlds using Procedural Generation
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Earth observation tools for Meta AI Segment Anything
A Unified Framework for Image-to-Graph Generation. Paper accepted @ ECCV22.
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision