Harbin Institute of Technology (Shenzhen)
- Shenzhen, China
- https://lanncx.github.io/
Lists (7)
3D
Embodied AI
Robotics, VLN, VLA, etc.
Multimodal
Neural Network Architectures
Other Research Papers
Useful Tools
Video Analysis
Projects for video recognition, detection, segmentation, super resolution, generation, etc.
Starred repositories
A multi-label temporal action detection method based on frequency estimation.
[NeurIPS25] Official Implementation for [Embodied Crowd Counting].
Micro-Influencer Recommendation by Multi-Perspective Account Representation Learning
Discover Micro-influencers for Brands via Better Understanding
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
Momentum Human Rig is an anatomically-inspired parametric full-body digital human model developed at Meta. It includes: A parametric body skeletal model; A realistic 3D mesh skinned to the skeleton…
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
[ICCV2025] LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds
IDOL: Instant Photorealistic 3D Human Creation from a Single Image. An open-source project for fast, high-fidelity, and generalizable 3D human reconstruction from a single image.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
[CVPR2021] UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Using clash on Linux without sudo privileges
PyTorch implementation of JiT https://arxiv.org/abs/2511.13720
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
This is an official implementation for 'Embodied Human Activity Recognition' (WACV 2024). Code is actively being processed and finalized.
[AAAI 2025] Official codes of "ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models".
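微舆 (Weiyu): a multi-agent public opinion analysis assistant anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependence on any framework.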
Cambrian-S: Towards Spatial Supersensing in Video
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Official PyTorch implementation for "Large Language Diffusion Models"
[SIGGRAPH Asia 2025] Learning to Ball: Composing Policies for Long-Horizon Basketball Moves
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model