Junke Wang 「王君可」

I'm a final-year Ph.D. student at Fudan University, where I am fortunate to be supervised by Prof. Zuxuan Wu and Prof. Yu-Gang Jiang. I have interned at frontier AI labs including ByteDance Seed and Meta FAIR. I'm a recipient of the 2025 ByteDance Fellowship.

My research interest lies in multimodal general intelligence. Recently, I have been working on generative models and world (and action) models. Feel free to reach out if you are interested in working with me.

Email: jkwang0724 [at] gmail [dot] com

Technical Reports

* denotes equal contribution.

RepWAM: World Action Modeling with Representation Visual-Action Tokenizers. [Code] [Web Page]
Junke Wang, Qihang Zhang, Shuai Yang, Yiming Luo, Yujun Shen, Zuxuan Wu, Yu-Gang Jiang, Yinghao Xu.
ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations. [Web Page]
Junke Wang*, Xiao Wang*, Jiacheng Pan*, Xuefeng Hu*, Feng Li, et al.
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL. [Code]
Junke Wang, Zhi Tian, Xun Wang, Xinyu Zhang, Weilin Huang, Zuxuan Wu, Yu-Gang Jiang.
Perception Encoder: The best visual embeddings are not at the output of the network. [Code]
Core contributor, FAIR perception.

Publications

OmniGen-AR: AutoRegressive Any-to-Image Generation. [Code]
Junke Wang, Xun Wang, Qiushan Guo, Peize Sun, Weilin Huang, Zuxuan Wu, Yu-Gang Jiang.
Advances in Neural Information Processing Systems (NeurIPS), 2025.
OmniTracker: Unifying Object Tracking by Tracking-with-Detection.
Junke Wang*, Zuxuan Wu*, Dongdong Chen, Chong Luo, Xiyang Dai, Lu Yuan, Yu-Gang Jiang.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025.
Fighting Malicious Media Data: A Survey on Tampering Detection and Deepfake Detection.
Junke Wang, Zhenxin Li, Chao Zhang, Jingjing Chen, Zuxuan Wu, Larry S. Davis, Yu-Gang Jiang.
Proceedings of the IEEE, 2025.
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation. [Code]
Junke Wang, Yi Jiang, Zehuan Yuan, Binyue Peng, Zuxuan Wu, Yu-Gang Jiang.
Advances in Neural Information Processing Systems (NeurIPS), 2024.
OmniVid: A Generative Framework for Universal Video Understanding. [Code]
Junke Wang, Dongdong Chen, Chong Luo, Bo He, Lu Yuan, Zuxuan Wu, Yu-Gang Jiang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Look Before You Match: Instance Understanding Matters in Video Object Segmentation.
Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Chuanxin Tang, Xiyang Dai, Yucheng Zhao, Yujia Xie, Lu Yuan, Yu-Gang Jiang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks.
Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Luowei Zhou, Yucheng Zhao, Yujia Xie, Ce Liu, Yu-Gang Jiang, Lu Yuan.
Advances in Neural Information Processing Systems (NeurIPS), 2022.
Efficient Video Transformers with Spatial-Temporal Token Selection. [Code]
Junke Wang*, Xitong Yang*, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang.
European Conference on Computer Vision (ECCV), 2022.
ObjectFormer for Image Manipulation Detection and Localization. [Code]
Junke Wang, Zuxuan Wu, Jingjing Chen, Xintong Han, Abhinav Shrivastava, Yu-Gang Jiang, Ser-Nam Lim.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
FT-TDR: Frequency-Guided Transformer and Top-Down Refinement Network for Blind Face Inpainting.
Junke Wang, Shaoxiang Chen, Zuxuan Wu, Yu-Gang Jiang.
IEEE Transactions on Multimedia (TMM), 2022.
M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection.
Junke Wang, Zuxuan Wu, Wenhao Ouyang, Xintong Han, Jingjing Chen, Ser-Nam Lim, Yu-Gang Jiang.
International Conference on Multimedia Retrieval (ICMR), 2022.

Services

Conference Reviewer for CVPR, ICCV, ICML, NeurIPS, ICLR, ECCV, CoRL, etc.

Journal Reviewer for TPAMI, TIP, etc.

Awards

ByteDance PhD Fellowship (20 people in China and Singapore). 2025.

CCF-CV Academic Rising Star Award (3 people in China). 2025.

National Scholarship (Top 1%). 2022, 2025.