- Peking University
- Beijing, China
- https://azx030512.github.io
- https://orcid.org/0009-0007-2625-5190
Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
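Programmer Yupi's AI resource collection + Vibe Coding tutorial for complete beginners, sharing step-by-step OpenClaw tutorials, ways to use large models (DeepSeek / GPT / Gemini / Claude / GLM), the latest AI news, a Prompt collection, an AI knowledge encyclopedia (Agent Skills / RAG / MCP / A2A), AI programming tutorials (Harness Engineering), AI tool usage…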
MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion (CVPR 2025)
A synthetic satellite imagery dataset for semantic segmentation and domain adaptation.
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coo…
SkillOpt is a text-space optimizer that trains reusable natural-language skills for frozen LLM agents through trajectory-driven edits, validation-gated updates, and deployable best_skill.md artifacts.
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
Code & Models for 3DETR - an End-to-end transformer model for 3D object detection
[CVPR 2024] EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion
A Minimal and Elegant Framework & Tutorial for Real-Time Interactive World Models
[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
[ICCV 2025, Oral] TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
📹 A more flexible framework that can generate videos at any resolution and creates videos from images.
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation [SIGGRAPH Asia 2025]
PartFlow: two-stage image-conditioned 3D editing (inference code)
The implementation of Extreme Viewpoint 4D Video Generation
[SIGGRAPH 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
HorizonDrive: Self-Corrective Autoregressive World Model for Long-horizon Driving Simulation
deepbeepmeep / Wan2GP
Forked from Wan-Video/Wan2.1
A fast AI Video Generator for the GPU Poor. Supports Wan 2.1/2.2, LTX-2, Qwen Image, Hunyuan Video, LTX Video and Flux.
[TPAMI 2025] ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis