- CASIA
- Beijing, China
- https://wangguanan.github.io/
Starred repositories
[CVPR 2026] NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
📰 Let ChatGPT Summarize Hacker News for You
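The WordPress of the AI era: the Eastern Hemisphere's first building-block-style AI application builder. Anyone can build their own AI application system for free, for example enterprise agent systems, AI comic-drama systems, AI academic paper systems, AI customer service systems...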
Master programming by recreating your favorite technologies from scratch.
💫 Toolkit to help you get started with Spec-Driven Development
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
Real-time face swap and one-click video deepfake with only a single image
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Crawl and convert any website into clean markdown
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
The official Python SDK for Model Context Protocol servers and clients
Sample microservices application for playing with
The Robot Operating System is a meta operating system for robots.
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
[ICLR 2026] When it comes to optimizers, it's always better to be safe than sorry
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Python tool for converting files and office documents to Markdown.
[ICCV2025] MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS2024 Oral] PyTorch implementation of DenoiseRep
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
[CVPR2024 Highlight] The official repo for paper "Abductive Ego-View Accident Video Understanding for Safe Driving Perception"
🔊 Text-Prompted Generative Audio Model
[ECCV 2024] DragAnything: Motion Control for Anything using Entity Representation