Jiankang Deng
My research focuses on two main areas: (1) multi-modal foundation models, with an emphasis on perceiving, understanding, and modeling complex multi-sensory signals such as visual, acoustic, tactile, and EEG data; and (2) generative modeling of the physical world, with the goal of synthesizing scalable and reliable digital assets that reproduce real-world entities. This work sits at the intersection of computer vision and real-world applications, striving to pioneer transformative technologies for social benefit. I actively serve as an Area Chair for leading conferences in computer vision and machine learning, including CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, and AAAI. I am also an Associate Editor for IEEE Transactions on Image Processing and Neural Networks.
Previously, I obtained my Ph.D. (2020) from the iBUG group under the supervision of Prof. Stefanos Zafeiriou. My doctoral research focused on deep face analysis and modeling, encompassing efficient geometry estimation, robust feature embedding, and photorealistic texture modeling. I developed algorithms and systems (InsightFace) to capture, represent, and synthesize diverse human faces with high efficiency, robustness, and fidelity. I also worked on visual representation learning and won multiple visual perception challenges (e.g., ImageNet and ActivityNet) over the years.
news
| Dec 19, 2025 | GenForce, a transferable force-sensing framework across diverse tactile sensors, has been accepted by Nature Communications. |
|---|---|
| Dec 18, 2025 | We released a comprehensive survey on Vision-Language-Action (VLA) models. |
| Dec 12, 2025 | Awarded an NVIDIA Academic Grant under the NVIDIA Academic Grant Program. |
| Sep 27, 2025 | We launched LLaVA-OneVision 1.5, a fully open framework for democratized multimodal training. |
| Sep 18, 2025 | We launched Embodied Arena, a comprehensive evaluation platform for Embodied AI. |