- University of Chinese Academy of Sciences
- Beijing
- UTC +08:00
- https://baizey.rvosuke.com
- https://orcid.org/0009-0003-8776-6980
Stars
This project is the official implementation of "UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation"
Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko,…
[NeurIPS 2024] Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images
Official implementation for "Stable Flow: Vital Layers for Training-Free Image Editing" [CVPR 2025]
The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models"
[CVPR 2025 Highlight] Official code and models for Encoder-only Mask Transformer (EoMT).
(IETIP) Stroke-Seg: A Deep Learning-Based Framework for Chinese Stroke Segmentation
(MM 2025, Oral) GraphSplat: Sparse-View Generalizable 3D Gaussian Splatting is Worth Graph of Nodes
A GUI client for Windows, Linux, and macOS that supports Xray, sing-box, and others
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
Pointcept: Perceive the world with sparse points, a codebase for point cloud perception research. Latest works: Concerto (NeurIPS'25), Sonata (CVPR'25 Highlight), PTv3 (CVPR'24 Oral)
[CVPR'25] DepthSplat: Connecting Gaussian Splatting and Depth
[CVPR 2025 Highlight] TinyFusion: Diffusion Transformers Learned Shallow
[CVPR 2025 Highlight] MonSter: Marry Monodepth to Stereo Unleashes Power
CVPR 2024: AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation
[CVPR'2024] Official implementation of the paper "ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation"
[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
Official repository of CVPR 2024 paper "EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation"
Official repository for Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs
🌊 [ECCV'24 Oral] MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
Official repository of CVPR 2024 paper "GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs"
[CVPR 2024 Oral, Best Paper Runner-Up] Code for "pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction" by David Charatan, Sizhe Lester Li, Andrea Tagliasacch…
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Official repository of HUGS: Human Gaussian Splats (CVPR 2024)
[CVPR 2024] Code release for TransNeXt model
An OpenMMLab toolbox for human pose estimation, skeleton-based action recognition, and action synthesis.