Ying Shan
2020 – today
- 2024
- [j27]Weihao Cheng, Ying Shan:
Learning layout generation for virtual worlds. Comput. Vis. Media 10(3): 577-592 (2024) - [j26]Ziqi Zhang, Zongyang Ma, Chunfeng Yuan, Yuxin Chen, Peijin Wang, Zhongang Qi, Chenglei Hao, Bing Li, Ying Shan, Weiming Hu, Stephen J. Maybank:
Chinese Title Generation for Short Videos: Dataset, Metric and Algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 46(7): 5192-5208 (2024) - [j25]Yihua Huang, Yan-Pei Cao, Yu-Kun Lai, Ying Shan, Lin Gao:
NeRF-Texture: Synthesizing Neural Radiance Field Textures. IEEE Trans. Pattern Anal. Mach. Intell. 46(9): 5986-6000 (2024) - [j24]Chong Mou, Xintao Wang, Yanze Wu, Ying Shan, Jian Zhang:
Empowering Real-World Image Super-Resolution With Flexible Interactive Modulation. IEEE Trans. Pattern Anal. Mach. Intell. 46(11): 7317-7330 (2024) - [j23]Yuxin Chen, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Jie Wang, Ying Shan, Bing Li, Weiming Hu, Xiaohu Qie, Jianping Wu:
DARTScore: DuAl-Reconstruction Transformer for Video Captioning Evaluation. IEEE Trans. Circuits Syst. Video Technol. 34(4): 2041-2055 (2024) - [j22]Zhouxia Wang, Jiawei Zhang, Xintao Wang, Tianshui Chen, Ying Shan, Wenping Wang, Ping Luo:
Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos. IEEE Trans. Image Process. 33: 5676-5687 (2024) - [j21]Dan Zhang, Wenzheng Feng, Yuandong Wang, Zhongang Qi, Ying Shan, Jie Tang:
DropConn: Dropout Connection Based Random GNNs for Molecular Property Prediction. IEEE Trans. Knowl. Data Eng. 36(2): 518-529 (2024) - [j20]Chen Li, Yixiao Ge, Dian Li, Ying Shan:
Vision-Language Instruction Tuning: A Review and Analysis. Trans. Mach. Learn. Res. 2024 (2024) - [j19]Jiashuo Yu, Junfu Pu, Ying Cheng, Rui Feng, Ying Shan:
Learning Music-Dance Representations Through Explicit-Implicit Rhythm Synchronization. IEEE Trans. Multim. 26: 8454-8463 (2024) - [j18]Jingyu Zhuang, Di Kang, Yan-Pei Cao, Guanbin Li, Liang Lin, Ying Shan:
TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts. ACM Trans. Graph. 43(4): 121:1-121:12 (2024) - [j17]Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong:
ToonCrafter: Generative Cartoon Interpolation. ACM Trans. Graph. 43(6): 245:1-245:11 (2024) - [j16]Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Yibo Wang, Xintao Wang, Ying Shan, Yujiu Yang:
StyleCrafter: Taming Artistic Video Diffusion with Reference-Augmented Adapter Learning. ACM Trans. Graph. 43(6): 251:1-251:10 (2024) - [c177]Weihao Cheng, Yan-Pei Cao, Ying Shan:
SparseGNV: Generating Novel Views of Indoor Scenes with Sparse RGB-D Images. AAAI 2024: 1308-1316 - [c176]Shi-Sheng Huang, Zi-Xin Zou, Yichi Zhang, Yan-Pei Cao, Ying Shan:
SC-NeuS: Consistent Neural Surface Reconstruction from Sparse and Noisy Views. AAAI 2024: 2357-2365 - [c175]Chong Mou, Xintao Wang, Liangbin Xie, Yanze Wu, Jian Zhang, Zhongang Qi, Ying Shan:
T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models. AAAI 2024: 4296-4304 - [c174]Tao Wu, Xuewei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li:
SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model. AAAI 2024: 6126-6134 - [c173]Yiyu Zhuang, Qi Zhang, Xuan Wang, Hao Zhu, Ying Feng, Xiaoyu Li, Ying Shan, Xun Cao:
A Pre-convolved Representation for Plug-and-Play Neural Illumination Fields. AAAI 2024: 7828-7836 - [c172]Zixin Zou, Weihao Cheng, Yan-Pei Cao, Shi-Sheng Huang, Ying Shan, Song-Hai Zhang:
Sparse3D: Distilling Multiview-Consistent Diffusion for Object Reconstruction from Sparse Views. AAAI 2024: 7900-7908 - [c171]Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ying Shan, Ping Luo:
LLaMA Pro: Progressive LLaMA with Block Expansion. ACL (1) 2024: 6518-6537 - [c170]Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong:
DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models. CVPR 2024: 958-968 - [c169]Hanchao Liu, Xiaohang Zhan, Shaoli Huang, Tai-Jiang Mu, Ying Shan:
Programmable Motion Generation for Open-Set Motion Control Tasks. CVPR 2024: 1399-1408 - [c168]Jingbo Zhang, Xiaoyu Li, Qi Zhang, Yanpei Cao, Ying Shan, Jing Liao:
HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion. CVPR 2024: 1844-1854 - [c167]Xiaohan Ding, Yiyuan Zhang, Yixiao Ge, Sijie Zhao, Lin Song, Xiangyu Yue, Ying Shan:
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition. CVPR 2024: 5513-5524 - [c166]Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong, Yixiao Ge, Ying Shan, Xiangyu Yue:
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities. CVPR 2024: 6108-6117 - [c165]Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu:
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting. CVPR 2024: 6646-6657 - [c164]Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan:
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models. CVPR 2024: 7310-7320 - [c163]Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis. CVPR 2024: 7631-7640 - [c162]Jia-Wei Liu, Yan-Pei Cao, Jay Zhangjie Wu, Weijia Mao, Yuchao Gu, Rui Zhao, Jussi Keppo, Ying Shan, Mike Zheng Shou:
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing. CVPR 2024: 7664-7674 - [c161]Yuzhou Huang, Liangbin Xie, Xintao Wang, Ziyang Yuan, Xiaodong Cun, Yixiao Ge, Jiantao Zhou, Chao Dong, Rui Huang, Ruimao Zhang, Ying Shan:
SmartEdit: Exploring Complex Instruction-Based Image Editing with Multimodal Large Language Models. CVPR 2024: 8362-8371 - [c160]Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang:
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-Based Image Editing. CVPR 2024: 8488-8497 - [c159]Zhen Li, Mingdeng Cao, Xintao Wang, Zhongang Qi, Ming-Ming Cheng, Ying Shan:
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding. CVPR 2024: 8640-8650 - [c158]Xiangjun Gao, Xiaoyu Li, Chaopeng Zhang, Qi Zhang, Yanpei Cao, Ying Shan, Long Quan:
ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis. CVPR 2024: 10084-10094 - [c157]Bohao Li, Yuying Ge, Yixiao Ge, Guangzhi Wang, Rui Wang, Ruimao Zhang, Ying Shan:
SEED-Bench: Benchmarking Multimodal Large Language Models. CVPR 2024: 13299-13308 - [c156]Ruyang Liu, Chen Li, Yixiao Ge, Thomas H. Li, Ying Shan, Ge Li:
BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning. CVPR 2024: 13658-13667 - [c155]Lin Song, Yukang Chen, Shuai Yang, Xiaohan Ding, Yixiao Ge, Ying-Cong Chen, Ying Shan:
Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs. CVPR 2024: 13763-13773 - [c154]Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan:
YOLO-World: Real-Time Open-Vocabulary Object Detection. CVPR 2024: 16901-16911 - [c153]Zhihao Liang, Qi Zhang, Ying Feng, Ying Shan, Kui Jia:
GS-IR: 3D Gaussian Splatting for Inverse Rendering. CVPR 2024: 21644-21653 - [c152]Yaofang Liu, Xiaodong Cun, Xuebo Liu, Xintao Wang, Yong Zhang, Haoxin Chen, Yang Liu, Tieyong Zeng, Raymond Chan, Ying Shan:
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models. CVPR 2024: 22139-22149 - [c151]Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou:
VIT-LENS: Towards Omni-modal Representations. CVPR 2024: 26637-26647 - [c150]Yuxin Chen, Zongyang Ma, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Bing Li, Junfu Pu, Ying Shan, Xiaojuan Qi, Weiming Hu:
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval? CVPR 2024: 26984-26993 - [c149]Ruyang Liu, Chen Li, Haoran Tang, Yixiao Ge, Ying Shan, Ge Li:
ST-LLM: Large Language Models Are Effective Temporal Learners. ECCV (57) 2024: 1-18 - [c148]Tian-Xing Xu, Wenbo Hu, Yu-Kun Lai, Ying Shan, Song-Hai Zhang:
Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing. ECCV (25) 2024: 37-53 - [c147]Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen:
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation. ECCV (36) 2024: 39-55 - [c146]Zongyang Ma, Ziqi Zhang, Yuxin Chen, Zhongang Qi, Chunfeng Yuan, Bing Li, Yingmin Luo, Xu Li, Xiaojuan Qi, Ying Shan, Weiming Hu:
EA-VTR: Event-Aware Video-Text Retrieval. ECCV (52) 2024: 76-94 - [c145]Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, Qiang Xu:
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion. ECCV (20) 2024: 150-168 - [c144]Wangbo Yu, Li Yuan, Yan-Pei Cao, Xiangjun Gao, Xiaoyu Li, Wenbo Hu, Quan Long, Ying Shan, Yonghong Tian:
HiFi-123: Towards High-Fidelity One Image to 3D Content Generation. ECCV (73) 2024: 258-274 - [c143]Qinyu Yang, Hao Chen, Yong Zhang, Menghan Xia, Xiaodong Cun, Zhixun Su, Ying Shan:
Noise Calibration: Plug-and-Play Content-Preserving Video Enhancement Using Pre-trained Video Diffusion Models. ECCV (36) 2024: 307-326 - [c142]Jinbo Xing, Menghan Xia, Yong Zhang, Hao Chen, Wangbo Yu, Hanyuan Liu, Gongye Liu, Xintao Wang, Ying Shan, Tien-Tsin Wong:
DynamiCrafter: Animating Open-Domain Images with Video Diffusion Priors. ECCV (46) 2024: 399-417 - [c141]Jing-Wen Yang, Jia-Mu Sun, Yong-Liang Yang, Jie Yang, Ying Shan, Yan-Pei Cao, Lin Gao:
DMiT: Deformable Mipmapped Tri-Plane Representation for Dynamic Scenes. ECCV (55) 2024: 436-453 - [c140]Yunpeng Bai, Xintao Wang, Yan-Pei Cao, Yixiao Ge, Chun Yuan, Ying Shan:
DreamDiffusion: High-Quality EEG-to-Image Generation with Temporal Masked Signal Modeling and CLIP Alignment. ECCV (31) 2024: 472-488 - [c139]Shansong Liu, Atin Sakkeer Hussain, Chenshuo Sun, Ying Shan:
Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning. ICASSP 2024: 286-290 - [c138]Tianjun Mao, Shansong Liu, Yunxuan Zhang, Dian Li, Ying Shan:
Unified Pretraining Target Based Video-Music Retrieval with Music Rhythm and Video Optical Flow Information. ICASSP 2024: 7890-7894 - [c137]Shansong Liu, Xu Li, Dian Li, Ying Shan:
Humtrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond. ICASSP 2024: 7915-7919 - [c136]Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion. ICASSP 2024: 12577-12581 - [c135]Yuying Ge, Sijie Zhao, Ziyun Zeng, Yixiao Ge, Chen Li, Xintao Wang, Ying Shan:
Making LLaMA SEE and Draw with SEED Tokenizer. ICLR 2024 - [c134]Yingqing He, Shaoshu Yang, Haoxin Chen, Xiaodong Cun, Menghan Xia, Yong Zhang, Xintao Wang, Ran He, Qifeng Chen, Ying Shan:
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models. ICLR 2024 - [c133]Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang:
DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models. ICLR 2024 - [c132]Haonan Qiu, Menghan Xia, Yong Zhang, Yingqing He, Xintao Wang, Ying Shan, Ziwei Liu:
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling. ICLR 2024 - [c131]Jiaxu Zhang, Shaoli Huang, Zhigang Tu, Xin Chen, Xiaohang Zhan, Gang Yu, Ying Shan:
TapMo: Shape-aware Motion Generation of Skeleton-free Characters. ICLR 2024 - [c130]Chaolei Tan, Zihang Lin, Junfu Pu, Zhongang Qi, Wei-Yi Pei, Zhi Qu, Yexin Wang, Ying Shan, Wei-Shi Zheng, Jian-Fang Hu:
SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses. ACM Multimedia 2024: 8383-8392 - [c129]Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, Ying Shan:
CustomNet: Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models. ACM Multimedia 2024: 10976-10984 - [c128]Zhouxia Wang, Ziyang Yuan, Xintao Wang, Yaowei Li, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan:
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation. SIGGRAPH (Conference Paper Track) 2024: 114 - [c127]Dan Zhang, Yangliao Geng, Wenwen Gong, Zhongang Qi, Zhiyu Chen, Xing Tang, Ying Shan, Yuxiao Dong, Jie Tang:
RecDCL: Dual Contrastive Learning for Recommendation. WWW 2024: 3655-3666 - [i209]Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ping Luo, Ying Shan:
LLaMA Pro: Progressive LLaMA with Block Expansion. CoRR abs/2401.02415 (2024) - [i208]Jay Zhangjie Wu, Guian Fang, Haoning Wu, Xintao Wang, Yixiao Ge, Xiaodong Cun, David Junhao Zhang, Jia-Wei Liu, Yuchao Gu, Rui Zhao, Weisi Lin, Wynne Hsu, Ying Shan, Mike Zheng Shou:
Towards A Better Metric for Text-to-Video Generation. CoRR abs/2401.07781 (2024) - [i207]Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan:
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models. CoRR abs/2401.09047 (2024) - [i206]Xiaohu Jiang, Yixiao Ge, Yuying Ge, Chun Yuan, Ying Shan:
Supervised Fine-tuning in turn Improves Visual Foundation Models. CoRR abs/2401.10222 (2024) - [i205]Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong, Yixiao Ge, Ying Shan, Xiangyu Yue:
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities. CoRR abs/2401.14405 (2024) - [i204]Jingyu Zhuang, Di Kang, Yan-Pei Cao, Guanbin Li, Liang Lin, Ying Shan:
TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts. CoRR abs/2401.14828 (2024) - [i203]Dan Zhang, Yangliao Geng, Wenwen Gong, Zhongang Qi, Zhiyu Chen, Xing Tang, Ying Shan, Yuxiao Dong, Jie Tang:
RecDCL: Dual Contrastive Learning for Recommendation. CoRR abs/2401.15635 (2024) - [i202]Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan:
YOLO-World: Real-Time Open-Vocabulary Object Detection. CoRR abs/2401.17270 (2024) - [i201]Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan:
Advances in 3D Generation: A Survey. CoRR abs/2401.17807 (2024) - [i200]Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang:
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing. CoRR abs/2402.02583 (2024) - [i199]Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen:
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation. CoRR abs/2402.10491 (2024) - [i198]Xiuzhe Wu, Xiaoyang Lyu, Qihao Huang, Yong Liu, Yang Wu, Ying Shan, Xiaojuan Qi:
DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos. CoRR abs/2403.05895 (2024) - [i197]Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, Qiang Xu:
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion. CoRR abs/2403.06976 (2024) - [i196]Ang Li, Qiugen Xiao, Peng Cao, Jian Tang, Yi Yuan, Zijie Zhao, Xiaoyuan Chen, Liang Zhang, Xiangyang Li, Kaitong Yang, Weidong Guo, Yukang Gan, Xu Yu, Daniell Wang, Ying Shan:
HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback. CoRR abs/2403.08309 (2024) - [i195]Duotun Wang, Hengyu Meng, Zeyu Cai, Zhijing Shao, Qianxi Liu, Lin Wang, Mingming Fan, Ying Shan, Xiaohang Zhan, Zeyu Wang:
HeadEvolver: Text to Head Avatars via Locally Learnable Mesh Deformation. CoRR abs/2403.09326 (2024) - [i194]Tao Wu, Xuewei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li:
SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model. CoRR abs/2403.10044 (2024) - [i193]Tian-Xing Xu, Wenbo Hu, Yu-Kun Lai, Ying Shan, Song-Hai Zhang:
Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing. CoRR abs/2403.10050 (2024) - [i192]Yujiao Jiang, Qingmin Liao, Xiaoyu Li, Li Ma, Qi Zhang, Chaopeng Zhang, Zongqing Lu, Ying Shan:
UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling. CoRR abs/2403.11589 (2024) - [i191]Ruyang Liu, Chen Li, Haoran Tang, Yixiao Ge, Ying Shan, Ge Li:
ST-LLM: Large Language Models Are Effective Temporal Learners. CoRR abs/2404.00308 (2024) - [i190]Jiale Xu, Weihao Cheng, Yiming Gao, Xintao Wang, Shenghua Gao, Ying Shan:
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models. CoRR abs/2404.07191 (2024) - [i189]Yuying Ge, Sijie Zhao, Jinguo Zhu, Yixiao Ge, Kun Yi, Lin Song, Chen Li, Xiaohan Ding, Ying Shan:
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation. CoRR abs/2404.14396 (2024) - [i188]Bohao Li, Yuying Ge, Yi Chen, Yixiao Ge, Ruimao Zhang, Ying Shan:
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension. CoRR abs/2404.16790 (2024) - [i187]Zidong Cao, Zhan Wang, Yexin Liu, Yan-Pei Cao, Ying Shan, Wei Zeng, Lin Wang:
Learning High-Quality Navigation and Zooming on Omnidirectional Images in Virtual Reality. CoRR abs/2405.00351 (2024) - [i186]Yuying Ge, Sijie Zhao, Chen Li, Yixiao Ge, Ying Shan:
SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing. CoRR abs/2405.04007 (2024) - [i185]Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo:
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots. CoRR abs/2405.07990 (2024) - [i184]Chong Mou, Mingdeng Cao, Xintao Wang, Zhaoyang Zhang, Ying Shan, Jian Zhang:
ReVideo: Remake a Video with Motion and Content Control. CoRR abs/2405.13865 (2024) - [i183]Xiangjun Gao, Xiaoyu Li, Yiyu Zhuang, Qi Zhang, Wenbo Hu, Chaopeng Zhang, Yao Yao, Ying Shan, Long Quan:
Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh. CoRR abs/2405.17811 (2024) - [i182]Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong:
ToonCrafter: Generative Cartoon Interpolation. CoRR abs/2405.17933 (2024) - [i181]Hanchao Liu, Xiaohang Zhan, Shaoli Huang, Tai-Jiang Mu, Ying Shan:
Programmable Motion Generation for Open-Set Motion Control Tasks. CoRR abs/2405.19283 (2024) - [i180]Muyao Niu, Xiaodong Cun, Xintao Wang, Yong Zhang, Ying Shan, Yinqiang Zheng:
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model. CoRR abs/2405.20222 (2024) - [i179]Sijie Zhao, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Muyao Niu, Xiaoyu Li, Wenbo Hu, Ying Shan:
CV-VAE: A Compatible Video VAE for Latent Generative Video Models. CoRR abs/2405.20279 (2024) - [i178]Shaoshu Yang, Yong Zhang, Xiaodong Cun, Ying Shan, Ran He:
ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation. CoRR abs/2406.00908 (2024) - [i177]Yicheng Xiao, Lin Song, Shaoli Huang, Jiangshan Wang, Siyu Song, Yixiao Ge, Xiu Li, Ying Shan:
GrootVL: Tree Topology is All You Need in State Space Model. CoRR abs/2406.02395 (2024) - [i176]Tao Yang, Yingmin Luo, Zhongang Qi, Yang Wu, Ying Shan, Chang Wen Chen:
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM. CoRR abs/2406.02884 (2024) - [i175]Xubing Ye, Yukang Gan, Xiaoke Huang, Yixiao Ge, Ying Shan, Yansong Tang:
VoCo-LLaMA: Towards Vision Compression with Large Language Models. CoRR abs/2406.12275 (2024) - [i174]Yaowei Li, Xintao Wang, Zhaoyang Zhang, Zhouxia Wang, Ziyang Yuan, Liangbin Xie, Yuexian Zou, Ying Shan:
Image Conductor: Precision Control for Interactive Video Synthesis. CoRR abs/2406.15339 (2024) - [i173]Xuan Ju, Yiming Gao, Zhaoyang Zhang, Ziyang Yuan, Xintao Wang, Ailing Zeng, Yu Xiong, Qiang Xu, Ying Shan:
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions. CoRR abs/2407.06358 (2024) - [i172]Zongyang Ma, Ziqi Zhang, Yuxin Chen, Zhongang Qi, Chunfeng Yuan, Bing Li, Yingmin Luo, Xu Li, Xiaojuan Qi, Ying Shan, Weiming Hu:
EA-VTR: Event-Aware Video-Text Retrieval. CoRR abs/2407.07478 (2024) - [i171]Yuxin Chen, Zongyang Ma, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Bing Li, Junfu Pu, Ying Shan, Xiaojuan Qi, Weiming Hu:
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval? CoRR abs/2407.07479 (2024) - [i170]Shuai Yang, Yuying Ge, Yang Li, Yukang Chen, Yixiao Ge, Ying Shan, Yingcong Chen:
SEED-Story: Multimodal Long Story Generation with Large Language Model. CoRR abs/2407.08683 (2024) - [i169]Qinyu Yang, Haoxin Chen, Yong Zhang, Menghan Xia, Xiaodong Cun, Zhixun Su, Ying Shan:
Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models. CoRR abs/2407.10285 (2024) - [i168]Xuan Ju, Junhao Zhuang, Zhaoyang Zhang, Yuxuan Bian, Qiang Xu, Ying Shan:
Image Inpainting Models are Effective Tools for Instruction-guided Image Editing. CoRR abs/2407.13139 (2024) - [i167]Chaolei Tan, Zihang Lin, Junfu Pu, Zhongang Qi, Wei-Yi Pei, Zhi Qu, Yexin Wang, Ying Shan, Wei-Shi Zheng, Jian-Fang Hu:
SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses. CoRR abs/2408.01669 (2024) - [i166]Yuzhou Huang, Yiran Qin, Shunlin Lu, Xintao Wang, Rui Huang, Ying Shan, Ruimao Zhang:
Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models. CoRR abs/2408.11801 (2024) - [i165]Tao Wu, Yong Zhang, Xintao Wang, Xianpan Zhou, Guangcong Zheng, Zhongang Qi, Ying Shan, Xi Li:
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities. CoRR abs/2408.13239 (2024) - [i164]Wangbo Yu, Jinbo Xing, Li Yuan, Wenbo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian:
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis. CoRR abs/2409.02048 (2024) - [i163]Wenbo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan:
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos. CoRR abs/2409.02095 (2024) - [i162]Zhuoyan Luo, Fengyuan Shi, Yixiao Ge, Yujiu Yang, Limin Wang, Ying Shan:
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation. CoRR abs/2409.04410 (2024) - [i161]Sijie Zhao, Wenbo Hu, Xiaodong Cun, Yong Zhang, Xiaoyu Li, Zhe Kong, Xiangjun Gao, Muyao Niu, Ying Shan:
StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos. CoRR abs/2409.07447 (2024) - [i160]Ye Liu, Zongyang Ma, Zhongang Qi, Yang Wu, Ying Shan, Chang Wen Chen:
E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding. CoRR abs/2409.18111 (2024) - [i159]Siyuan Hou, Shansong Liu, Ruibin Yuan, Wei Xue, Ying Shan, Mangsuo Zhao, Chao Zhang:
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer. CoRR abs/2410.05151 (2024) - [i158]Zhouxia Wang, Jiawei Zhang, Xintao Wang, Tianshui Chen, Ying Shan, Wenping Wang, Ping Luo:
Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos. CoRR abs/2410.11828 (2024)
- 2023
- [j15]Hao Ren, Ziqiang Zheng, Yang Wu, Hong Lu, Yang Yang, Ying Shan, Sai-Kit Yeung:
ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based Image Retrieval. IEEE Trans. Circuits Syst. Video Technol. 33(9): 5022-5035 (2023) - [j14]Xiao Wang, Weirong Ye, Zhongang Qi, Guangge Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Hanzi Wang:
Task-Aware Dual-Representation Network for Few-Shot Action Recognition. IEEE Trans. Circuits Syst. Video Technol. 33(10): 5932-5946 (2023) - [c126]Yizhen Chen, Jie Wang, Lijian Lin, Zhongang Qi, Jin Ma, Ying Shan:
Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval. AAAI 2023: 396-404 - [c125]Lijian Lin, Xintao Wang, Zhongang Qi, Ying Shan:
Accelerating the Training of Video Super-resolution Models. AAAI 2023: 1595-1603 - [c124]Liangbin Xie, Xintao Wang, Shuwei Shi, Jinjin Gu, Chao Dong, Ying Shan:
Mitigating Artifacts in Real-World Video Super-resolution Models. AAAI 2023: 2956-2964 - [c123]Binjie Zhang, Shupeng Su, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan:
Darwinian Model Upgrades: Model Evolving with Selective Compatibility. AAAI 2023: 3393-3400 - [c122]Zhihan Yang, Zhiyong Wu, Ying Shan, Jia Jia:
What Does Your Face Sound Like? 3D Face Shape towards Voice. AAAI 2023: 13905-13913 - [c121]Limao Xiong, Jie Zhou, Qunxi Zhu, Xiao Wang, Yuanbin Wu, Qi Zhang, Tao Gui, Xuanjing Huang, Jin Ma, Ying Shan:
A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition. ACL (Findings) 2023: 1375-1386 - [c120]Rui Zheng, Zhiheng Xi, Qin Liu, Wenbin Lai, Tao Gui, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan, Weifeng Ge:
Characterizing the Impacts of Instances on Robustness. ACL (Findings) 2023: 2314-2332 - [c119]Songyang Gao, Shihan Dou, Yan Liu, Xiao Wang, Qi Zhang, Zhongyu Wei, Jin Ma, Ying Shan:
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization. ACL (1) 2023: 12177-12189 - [c118]Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan:
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection. ACL (Findings) 2023: 13573-13581 - [c117]Yiming Gao, Yan-Pei Cao, Ying Shan:
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes. CVPR 2023: 108-118 - [c116]Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Öztireli, Yujiu Yang:
3D GAN Inversion with Facial Symmetry Prior. CVPR 2023: 342-351 - [c115]Youxin Pang, Yong Zhang, Weize Quan, Yanbo Fan, Xiaodong Cun, Ying Shan, Dong-Ming Yan:
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing. CVPR 2023: 427-436 - [c114]Fang Zhao, Zekun Li, Shaoli Huang, Junwu Weng, Tianfei Zhou, Guo-Sen Xie, Jue Wang, Ying Shan:
Learning Anchor Transformations for 3D Garment Animation. CVPR 2023: 491-500 - [c113]Mingdeng Cao, Chong Mou, Fanghua Yu, Xintao Wang, Yinqiang Zheng, Jian Zhang, Chao Dong, Gen Li, Ying Shan, Radu Timofte, Xiaopeng Sun, Weiqi Li, Zhenyu Zhang, Xuhan Sheng, Bin Chen, Haoyu Ma, Ming Cheng, Shijie Zhao, Wanwan Cui, Tianyu Xu, Chunyang Li, Long Bao, Heng Sun, Huaibo Huang, Xiaoqiang Zhou, Yuang Ai, Ran He, Renlong Wu, Yi Yang, Zhilu Zhang, Shuohao Zhang, Junyi Li, Yunjin Chen, Dongwei Ren, Wangmeng Zuo, Qian Wang, Hao-Hsiang Yang, Yi-Chung Chen, Zhi-Kai Huang, Wei-Ting Chen, Yuan-Chun Chiang, Hua-En Chang, I-Hsiang Chen, Chia-Hsuan Hsieh, Sy-Yen Kuo, Zebin Zhang, Jiaqi Zhang, Yuhui Wang, Shuhao Cui, Junshi Huang, Li Zhu, Shuman Tian, Wei Yu, Bingchun Luo:
NTIRE 2023 Challenge on 360° Omnidirectional Image and Video Super-Resolution: Datasets, Methods and Results. CVPR Workshops 2023: 1731-1745 - [c112]Yunpeng Bai, Yanbo Fan, Xuan Wang, Yong Zhang, Jingxiang Sun, Chun Yuan, Ying Shan:
High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors. CVPR 2023: 4541-4551 - [c111]Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Kevin Qinghong Lin, Satoshi Tsutsui, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
All in One: Exploring Unified Video-Language Pre-Training. CVPR 2023: 6598-6608 - [c110]Yue Chen, Xingyu Chen, Xuan Wang, Qi Zhang, Yu Guo, Ying Shan, Fei Wang:
Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields. CVPR 2023: 8264-8273 - [c109]Wenxuan Zhang, Xiaodong Cun, Xuan Wang, Yong Zhang, Xi Shen, Yu Guo, Ying Shan, Fei Wang:
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation. CVPR 2023: 8652-8661 - [c108]Yuxin Chen, Zongyang Ma, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Weiming Hu, Xiaohu Qie, Jianping Wu:
ViLEM: Visual-Language Error Modeling for Image-Text Retrieval. CVPR 2023: 11018-11027 - [c107]Hao Ai, Zidong Cao, Yan-Pei Cao, Ying Shan, Lin Wang:
HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions. CVPR 2023: 13273-13282 - [c106]Fanghua Yu, Xintao Wang, Mingdeng Cao, Gen Li, Ying Shan, Chao Dong:
OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer. CVPR 2023: 13283-13292 - [c105]Jiaxu Zhang, Junwu Weng, Di Kang, Fang Zhao, Shaoli Huang, Xuefei Zhe, Linchao Bao, Ying Shan, Jue Wang, Zhigang Tu:
Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry. CVPR 2023: 13864-13872 - [c104]Qiangqiang Wu, Tianyu Yang, Ziquan Liu, Baoyuan Wu, Ying Shan, Antoni B. Chan:
DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks. CVPR 2023: 14561-14571 - [c103]Jiale Xu, Xintao Wang, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao:
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models. CVPR 2023: 20908-20918 - [c102]Guangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, Xi Li:
LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation. CVPR 2023: 22490-22499 - [c101]Teng Wang, Yixiao Ge, Feng Zheng, Ran Cheng, Ying Shan, Xiaohu Qie, Ping Luo:
Accelerating Vision-Language Pretraining with Free Language Modeling. CVPR 2023: 23161-23170 - [c100]Shusheng Yang, Yixiao Ge, Kun Yi, Dian Li, Ying Shan, Xiaohu Qie, Xinggang Wang:
RILS: Masked Visual Reconstruction in Language Semantic Space. CVPR 2023: 23304-23314 - [c99]Liang Chen, Yong Zhang, Yibing Song, Ying Shan, Lingqiao Liu:
Improved Test-Time Adaptation for Domain Generalization. CVPR 2023: 24172-24182 - [c98]Wenxi Ma, Tianxiang Hou, Qianji Di, Zhongang Qi, Ying Shan, Hanzi Wang:
ERBNet: An Effective Representation Based Network for Unbiased Scene Graph Generation. ICASSP 2023: 1-5 - [c97]Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Enhancing the Vocal Range of Single-Speaker Singing Voice Synthesis with Melody-Unsupervised Pre-Training. ICASSP 2023: 1-5 - [c96]Xiaotong Li, Zixuan Hu, Yixiao Ge, Ying Shan, Ling-Yu Duan:
Exploring Model Transferability through the Lens of Potential Energy. ICCV 2023: 5406-5415 - [c95]Yuxin Fang, Shusheng Yang, Shijie Wang, Yixiao Ge, Ying Shan, Xinggang Wang:
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection. ICCV 2023: 6221-6230 - [c94]Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Stan Weixian Lei, Yuchao Gu, Yufei Shi, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. ICCV 2023: 7589-7599 - [c93]Zidong Cao, Hao Ai, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Lin Wang:
OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution. ICCV 2023: 12851-12861 - [c92]Zongyang Ma, Ziqi Zhang, Yuxin Chen, Zhongang Qi, Yingmin Luo, Zekun Li, Chunfeng Yuan, Bing Li, Xiaohu Qie, Ying Shan, Weiming Hu:
Order-Prompted Tag Sequence Generation for Video Tagging. ICCV 2023: 15635-15644 - [c91]Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen:
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing. ICCV 2023: 15886-15896 - [c90]Jia-Wei Liu, Yan-Pei Cao, Tianyuan Yang, Zhongcong Xu, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video. ICCV 2023: 18437-18448 - [c89]Xiuzhe Wu, Pengfei Hu, Yang Wu, Xiaoyang Lyu, Yan-Pei Cao, Ying Shan, Wenming Yang, Zhongqian Sun, Xiaojuan Qi:
Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video. ICCV 2023: 22111-22120 - [c88]Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng:
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing. ICCV 2023: 22503-22513 - [c87]Kun Yi, Yixiao Ge, Xiaotong Li, Shusheng Yang, Dian Li, Jianping Wu, Ying Shan, Xiaohu Qie:
Masked Image Modeling with Denoising Contrast. ICLR 2023 - [c86]Dazhao Du, Bing Su, Yu Li, Zhongang Qi, Lingyu Si, Ying Shan:
Do We Really Need Temporal Convolutions in Action Segmentation? ICME 2023: 1014-1019 - [c85]Chengyue Wu, Teng Wang, Yixiao Ge, Zeyu Lu, Ruisong Zhou, Ying Shan, Ping Luo:
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation. ICML 2023: 37713-37727 - [c84]Liangbin Xie, Xintao Wang, Xiangyu Chen, Gen Li, Ying Shan, Jiantao Zhou, Chao Dong:
DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models. ICML 2023: 38204-38226 - [c83]Xuewei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li:
SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation. IJCAI 2023: 1125-1133 - [c82]Zhihan Yang, Shansong Liu, Xu Li, Haozhe Wu, Zhiyong Wu, Ying Shan, Jia Jia:
Prosody Modeling with 3D Visual Information for Expressive Video Dubbing. INTERSPEECH 2023: 4863-4867 - [c81]Yukang Gan, Yixiao Ge, Chang Zhou, Shupeng Su, Zhouchuan Xu, Xuyuan Xu, Quanchao Hui, Xiang Chen, Yexin Wang, Ying Shan:
Binary Embedding-based Retrieval at Tencent. KDD 2023: 4056-4067 - [c80]Yuxuan Zhao, Jin Ma, Zhongang Qi, Zehua Xie, Yu Luo, Qiusheng Kang, Ying Shan:
VTLayout: A Multi-Modal Approach for Video Text Layout. ACM Multimedia 2023: 2775-2784 - [c79]Tao Yang, Fan Wang, Junfan Lin, Zhongang Qi, Yang Wu, Jing Xu, Ying Shan, Changwen Chen:
Toward Human Perception-Centric Video Thumbnail Generation. ACM Multimedia 2023: 6653-6664 - [c78]Zheng Chen, Yan-Pei Cao, Yuan-Chen Guo, Chen Wang, Ying Shan, Song-Hai Zhang:
PanoGRF: Generalizable Spherical Radiance Fields for Wide-baseline Panoramas. NeurIPS 2023 - [c77]Cheng Cheng, Lin Song, Ruoyi Xue, Hang Wang, Hongbin Sun, Yixiao Ge, Ying Shan:
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model. NeurIPS 2023 - [c76]Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. NeurIPS 2023 - [c75]Xiuzhe Wu, Peng Dai, Weipeng Deng, Handi Chen, Yang Wu, Yan-Pei Cao, Ying Shan, Xiaojuan Qi:
CL-NeRF: Continual Learning of Neural Radiance Fields for Evolving Scene Representation. NeurIPS 2023 - [c74]Rui Yang, Lin Song, Yanwei Li, Sijie Zhao, Yixiao Ge, Xiu Li, Ying Shan:
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction. NeurIPS 2023 - [c73]Li Yang, Chunfeng Yuan, Ziqi Zhang, Zhongang Qi, Yan Xu, Wei Liu, Ying Shan, Bing Li, Weiping Yang, Peng Li, Yan Wang, Weiming Hu:
Exploiting Contextual Objects and Relations for 3D Visual Grounding. NeurIPS 2023 - [c72]Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng:
Inserting Anybody in Diffusion Models via Celeb Basis. NeurIPS 2023 - [c71]Yihua Huang, Yan-Pei Cao, Yu-Kun Lai, Ying Shan, Lin Gao:
NeRF-Texture: Texture Synthesis with Neural Radiance Fields. SIGGRAPH (Conference Paper Track) 2023: 43:1-43:10 - [c70]Wangbo Yu, Yanbo Fan, Yong Zhang, Xuan Wang, Fei Yin, Yunpeng Bai, Yan-Pei Cao, Ying Shan, Yang Wu, Zhongqian Sun, Baoyuan Wu:
NOFA: NeRF-based One-shot Facial Avatar Reconstruction. SIGGRAPH (Conference Paper Track) 2023: 85:1-85:12 - [c69]Yuan-Chen Guo, Yan-Pei Cao, Chen Wang, Yu He, Ying Shan, Song-Hai Zhang:
VMesh: Hybrid Volume-Mesh Representation for Efficient View Synthesis. SIGGRAPH Asia 2023: 17:1-17:11 - [c68]Cong Wang, Di Kang, Yan-Pei Cao, Linchao Bao, Ying Shan, Song-Hai Zhang:
Neural Point-based Volumetric Avatar: Surface-guided Neural Points for Efficient and Photorealistic Volumetric Head Avatar. SIGGRAPH Asia 2023: 50:1-50:12 - [c67]Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang:
Interactive Story Visualization with Multiple Characters. SIGGRAPH Asia 2023: 101:1-101:10 - [c66]Yiyu Zhuang, Qi Zhang, Ying Feng, Hao Zhu, Yao Yao, Xiaoyu Li, Yan-Pei Cao, Ying Shan, Xun Cao:
Anti-Aliased Neural Implicit Surfaces with Encoding Level of Detail. SIGGRAPH Asia 2023: 119:1-119:10 - [i157]Youxin Pang, Yong Zhang, Weize Quan, Yanbo Fan, Xiaodong Cun, Ying Shan, Dongming Yan:
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing. CoRR abs/2301.06281 (2023) - [i156]Shusheng Yang, Yixiao Ge, Kun Yi, Dian Li, Ying Shan, Xiaohu Qie, Xinggang Wang:
Masked Visual Reconstruction in Language Semantic Space. CoRR abs/2301.06958 (2023) - [i155]Yizhen Chen, Jie Wang, Lijian Lin, Zhongang Qi, Jin Ma, Ying Shan:
Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval. CoRR abs/2301.12644 (2023) - [i154]Fanghua Yu, Xintao Wang, Mingdeng Cao, Gen Li, Ying Shan, Chao Dong:
OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer. CoRR abs/2302.03453 (2023) - [i153]Chong Mou, Xintao Wang, Liangbin Xie, Jian Zhang, Zhongang Qi, Ying Shan, Xiaohu Qie:
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models. CoRR abs/2302.08453 (2023) - [i152]Yukang Gan, Yixiao Ge, Chang Zhou, Shupeng Su, Zhouchuan Xu, Xuyuan Xu, Quanchao Hui, Xiang Chen, Yexin Wang, Ying Shan:
Binary Embedding-based Retrieval at Tencent. CoRR abs/2302.08714 (2023) - [i151]Jiaxu Zhang, Junwu Weng, Di Kang, Fang Zhao, Shaoli Huang, Xuefei Zhe, Linchao Bao, Ying Shan, Jue Wang, Zhigang Tu:
Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry. CoRR abs/2303.08658 (2023) - [i150]Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen:
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing. CoRR abs/2303.09535 (2023) - [i149]Haoyu Wang, Shaoli Huang, Fang Zhao, Chun Yuan, Ying Shan:
HMC: Hierarchical Mesh Coarsening for Skeleton-free Motion Retargeting. CoRR abs/2303.10941 (2023) - [i148]Hao Ai, Zidong Cao, Yan-Pei Cao, Ying Shan, Lin Wang:
HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions. CoRR abs/2303.11616 (2023) - [i147]Yongkang Cheng, Shaoli Huang, Jifeng Ning, Ying Shan:
BoPR: Body-aware Part Regressor for Human Shape and Pose Estimation. CoRR abs/2303.11675 (2023) - [i146]Teng Wang, Yixiao Ge, Feng Zheng, Ran Cheng, Ying Shan, Xiaohu Qie, Ping Luo:
Accelerating Vision-Language Pretraining with Free Language Modeling. CoRR abs/2303.14038 (2023) - [i145]Yuan-Chen Guo, Yan-Pei Cao, Chen Wang, Yu He, Ying Shan, Xiaohu Qie, Song-Hai Zhang:
VMesh: Hybrid Volume-Mesh Representation for Efficient View Synthesis. CoRR abs/2303.16184 (2023) - [i144]Guangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, Xi Li:
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation. CoRR abs/2303.17189 (2023) - [i143]Qiangqiang Wu, Tianyu Yang, Ziquan Liu, Baoyuan Wu, Ying Shan, Antoni B. Chan:
DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks. CoRR abs/2304.00571 (2023) - [i142]Fang Zhao, Zekun Li, Shaoli Huang, Junwu Weng, Tianfei Zhou, Guo-Sen Xie, Jue Wang, Ying Shan:
Learning Anchor Transformations for 3D Garment Animation. CoRR abs/2304.00761 (2023) - [i141]Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong:
DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models. CoRR abs/2304.00916 (2023) - [i140]Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen:
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos. CoRR abs/2304.01186 (2023) - [i139]Chen Li, Yixiao Ge, Jiayong Mao, Dian Li, Ying Shan:
TagGPT: Large Language Models are Zero-shot Multimodal Taggers. CoRR abs/2304.03022 (2023) - [i138]Liang Chen, Yong Zhang, Yibing Song, Ying Shan, Lingqiao Liu:
Improved Test-Time Adaptation for Domain Generalization. CoRR abs/2304.04494 (2023) - [i137]Mingdeng Cao, Xintao Wang, Zhongang Qi, Ying Shan, Xiaohu Qie, Yinqiang Zheng:
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing. CoRR abs/2304.08465 (2023) - [i136]Yiyu Zhuang, Qi Zhang, Xuan Wang, Hao Zhu, Ying Feng, Xiaoyu Li, Ying Shan, Xun Cao:
NeAI: A Pre-convoluted Representation for Plug-and-Play Neural Ambient Illumination. CoRR abs/2304.08757 (2023) - [i135]Yiming Gao, Yan-Pei Cao, Ying Shan:
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes. CoRR abs/2304.08971 (2023) - [i134]Jiawei Liu, Yan-Pei Cao, Tianyuan Yang, Eric Zhongcong Xu, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video. CoRR abs/2304.12281 (2023) - [i133]Chengyue Wu, Teng Wang, Yixiao Ge, Zeyu Lu, Ruisong Zhou, Ying Shan, Ping Luo:
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation. CoRR abs/2304.14381 (2023) - [i132]Weihao Cheng, Yan-Pei Cao, Ying Shan:
SparseGNV: Generating Novel Views of Indoor Scenes with Sparse Input Views. CoRR abs/2305.07024 (2023) - [i131]Guangzhi Wang, Yixiao Ge, Xiaohan Ding, Mohan S. Kankanhalli, Ying Shan:
What Makes for Good Visual Tokenizers for Large Language Models? CoRR abs/2305.12223 (2023) - [i130]Limao Xiong, Jie Zhou, Qunxi Zhu, Xiao Wang, Yuanbin Wu, Qi Zhang, Tao Gui, Xuanjing Huang, Jin Ma, Ying Shan:
A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition. CoRR abs/2305.12485 (2023) - [i129]Ziyun Zeng, Yixiao Ge, Zhan Tong, Xihui Liu, Shu-Tao Xia, Ying Shan:
TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale. CoRR abs/2305.14173 (2023) - [i128]Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang:
TaleCrafter: Interactive Story Visualization with Multiple Characters. CoRR abs/2305.18247 (2023) - [i127]Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. CoRR abs/2305.18292 (2023) - [i126]Rui Yang, Lin Song, Yanwei Li, Sijie Zhao, Yixiao Ge, Xiu Li, Ying Shan:
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction. CoRR abs/2305.18752 (2023) - [i125]Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng:
Inserting Anybody in Diffusion Models via Celeb Basis. CoRR abs/2306.00926 (2023) - [i124]Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong:
Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance. CoRR abs/2306.00943 (2023) - [i123]Zheng Chen, Yan-Pei Cao, Yuan-Chen Guo, Chen Wang, Ying Shan, Song-Hai Zhang:
PanoGRF: Generalizable Spherical Radiance Fields for Wide-baseline Panoramas. CoRR abs/2306.01531 (2023) - [i122]Xuewei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li:
SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation. CoRR abs/2306.03403 (2023) - [i121]Sijie Zhao, Yixiao Ge, Zhongang Qi, Lin Song, Xiaohan Ding, Zehua Xie, Ying Shan:
Sticker820K: Empowering Interactive Retrieval with Stickers. CoRR abs/2306.06870 (2023) - [i120]Jiale Xu, Xintao Wang, Yan-Pei Cao, Weihao Cheng, Ying Shan, Shenghua Gao:
InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions. CoRR abs/2306.07154 (2023) - [i119]Binjie Zhang, Yixiao Ge, Xuyuan Xu, Ying Shan, Mike Zheng Shou:
TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter. CoRR abs/2306.12642 (2023) - [i118]Qianji Di, Wenxi Ma, Zhongang Qi, Tianxiang Hou, Ying Shan, Hanzi Wang:
Towards Unseen Triples: Effective Text-Image-joint Learning for Scene Graph Generation. CoRR abs/2306.13420 (2023) - [i117]Chen Li, Xutan Peng, Teng Wang, Yixiao Ge, Mengyang Liu, Xuyuan Xu, Yexin Wang, Ying Shan:
PTVD: A Large-Scale Plot-Oriented Multimodal Dataset Based on Television Dramas. CoRR abs/2306.14644 (2023) - [i116]Songyang Gao, Shihan Dou, Yan Liu, Xiao Wang, Qi Zhang, Zhongyu Wei, Jin Ma, Ying Shan:
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization. CoRR abs/2306.15164 (2023) - [i115]Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan:
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection. CoRR abs/2306.15705 (2023) - [i114]Yunpeng Bai, Xintao Wang, Yan-Pei Cao, Yixiao Ge, Chun Yuan, Ying Shan:
DreamDiffusion: Generating High-Quality Images from Brain EEG Signals. CoRR abs/2306.16934 (2023) - [i113]Weihao Cheng, Yan-Pei Cao, Ying Shan:
ID-Pose: Sparse-view Camera Pose Estimation by Inverting Diffusion Models. CoRR abs/2306.17140 (2023) - [i112]Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang:
DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models. CoRR abs/2307.02421 (2023) - [i111]Liangbin Xie, Xintao Wang, Xiangyu Chen, Gen Li, Ying Shan, Jiantao Zhou, Chao Dong:
DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models. CoRR abs/2307.02457 (2023) - [i110]Wangbo Yu, Yanbo Fan, Yong Zhang, Xuan Wang, Fei Yin, Yunpeng Bai, Yan-Pei Cao, Ying Shan, Yang Wu, Zhongqian Sun, Baoyuan Wu:
NOFA: NeRF-based One-shot Facial Avatar Reconstruction. CoRR abs/2307.03441 (2023) - [i109]Cong Wang, Di Kang, Yan-Pei Cao, Linchao Bao, Ying Shan, Song-Hai Zhang:
Neural Point-based Volumetric Avatar: Surface-guided Neural Points for Efficient and Photorealistic Volumetric Head Avatar. CoRR abs/2307.05000 (2023) - [i108]Yingqing He, Menghan Xia, Haoxin Chen, Xiaodong Cun, Yuan Gong, Jinbo Xing, Yong Zhang, Xintao Wang, Chao Weng, Ying Shan, Qifeng Chen:
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation. CoRR abs/2307.06940 (2023) - [i107]Yuying Ge, Yixiao Ge, Ziyun Zeng, Xintao Wang, Ying Shan:
Planting a SEED of Vision in Large Language Model. CoRR abs/2307.08041 (2023) - [i106]Fanghua Yu, Xintao Wang, Zheyuan Li, Yan-Pei Cao, Ying Shan, Chao Dong:
GET3D-: Learning GET3D from Unconstrained Image Collections. CoRR abs/2307.14918 (2023) - [i105]Bohao Li, Rui Wang, Guangzhi Wang, Yuying Ge, Yixiao Ge, Ying Shan:
SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension. CoRR abs/2307.16125 (2023) - [i104]Zidong Cao, Hao Ai, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Lin Wang:
OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution. CoRR abs/2308.08114 (2023) - [i103]Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong:
Guide3D: Create 3D Avatars from Text and Image Guidance. CoRR abs/2308.09705 (2023) - [i102]Weixian Lei, Yixiao Ge, Jianfeng Zhang, Dylan Sun, Kun Yi, Ying Shan, Mike Zheng Shou:
ViT-Lens: Towards Omni-modal Representations. CoRR abs/2308.10185 (2023) - [i101]Shansong Liu, Atin Sakkeer Hussain, Chenshuo Sun, Ying Shan:
Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning. CoRR abs/2308.11276 (2023) - [i100]Zi-Xin Zou, Weihao Cheng, Yan-Pei Cao, Shi-Sheng Huang, Ying Shan, Song-Hai Zhang:
Sparse3D: Distilling Multiview-Consistent Diffusion for Object Reconstruction from Sparse Views. CoRR abs/2308.14078 (2023) - [i99]Xiaotong Li, Zixuan Hu, Yixiao Ge, Ying Shan, Ling-Yu Duan:
Exploring Model Transferability through the Lens of Potential Energy. CoRR abs/2308.15074 (2023) - [i98]Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training. CoRR abs/2309.00284 (2023) - [i97]Zhouxia Wang, Xintao Wang, Liangbin Xie, Zhongang Qi, Ying Shan, Wenping Wang, Ping Luo:
StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation. CoRR abs/2309.01770 (2023) - [i96]Xiuzhe Wu, Pengfei Hu, Yang Wu, Xiaoyang Lyu, Yan-Pei Cao, Ying Shan, Wenming Yang, Zhongqian Sun, Xiaojuan Qi:
Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video. CoRR abs/2309.04814 (2023) - [i95]Tianjun Mao, Shansong Liu, Yunxuan Zhang, Dian Li, Ying Shan:
Unified Pretraining Target Based Video-music Retrieval With Music Rhythm And Video Optical Flow Information. CoRR abs/2309.09421 (2023) - [i94]Shansong Liu, Xu Li, Dian Li, Ying Shan:
HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond. CoRR abs/2309.09623 (2023) - [i93]Yiyu Zhuang, Qi Zhang, Ying Feng, Hao Zhu, Yao Yao, Xiaoyu Li, Yan-Pei Cao, Ying Shan, Xun Cao:
Anti-Aliased Neural Implicit Surfaces with Encoding Level of Detail. CoRR abs/2309.10336 (2023) - [i92]Ruyang Liu, Chen Li, Yixiao Ge, Ying Shan, Thomas H. Li, Ge Li:
One For All: Video Conversation is Feasible Without Video Instruction Tuning. CoRR abs/2309.15785 (2023) - [i91]Yuying Ge, Sijie Zhao, Ziyun Zeng, Yixiao Ge, Chen Li, Xintao Wang, Ying Shan:
Making LLaMA SEE and Draw with SEED Tokenizer. CoRR abs/2310.01218 (2023) - [i90]Wangbo Yu, Li Yuan, Yan-Pei Cao, Xiangjun Gao, Xiaoyu Li, Long Quan, Ying Shan, Yonghong Tian:
HiFi-123: Towards High-fidelity One Image to 3D Content Generation. CoRR abs/2310.06744 (2023) - [i89]Yingqing He, Shaoshu Yang, Haoxin Chen, Xiaodong Cun, Menghan Xia, Yong Zhang, Xintao Wang, Ran He, Qifeng Chen, Ying Shan:
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models. CoRR abs/2310.07702 (2023) - [i88]Jiawei Liu, Yan-Pei Cao, Jay Zhangjie Wu, Weijia Mao, Yuchao Gu, Rui Zhao, Jussi Keppo, Ying Shan, Mike Zheng Shou:
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing. CoRR abs/2310.10624 (2023) - [i87]Yaofang Liu, Xiaodong Cun, Xuebo Liu, Xintao Wang, Yong Zhang, Haoxin Chen, Yang Liu, Tieyong Zeng, Raymond H. Chan, Ying Shan:
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models. CoRR abs/2310.11440 (2023) - [i86]Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Xintao Wang, Tien-Tsin Wong, Ying Shan:
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors. CoRR abs/2310.12190 (2023) - [i85]Jiaxu Zhang, Shaoli Huang, Zhigang Tu, Xin Chen, Xiaohang Zhan, Gang Yu, Ying Shan:
TapMo: Shape-aware Motion Generation of Skeleton-free Characters. CoRR abs/2310.12678 (2023) - [i84]Haonan Qiu, Menghan Xia, Yong Zhang, Yingqing He, Xintao Wang, Ying Shan, Ziwei Liu:
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling. CoRR abs/2310.15169 (2023) - [i83]Haoxin Chen, Menghan Xia, Yingqing He, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Jinbo Xing, Yaofang Liu, Qifeng Chen, Xintao Wang, Chao Weng, Ying Shan:
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation. CoRR abs/2310.19512 (2023) - [i82]Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, Ying Shan:
CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models. CoRR abs/2310.19784 (2023) - [i81]Xin He, Shaoli Huang, Xiaohang Zhan, Chao Wen, Ying Shan:
SemanticBoost: Elevating Motion Generation with Augmented Textual Cues. CoRR abs/2310.20323 (2023) - [i80]Cheng Cheng, Lin Song, Ruoyi Xue, Hang Wang, Hongbin Sun, Yixiao Ge, Ying Shan:
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model. CoRR abs/2311.03774 (2023) - [i79]Chen Li, Yixiao Ge, Dian Li, Ying Shan:
Vision-Language Instruction Tuning: A Review and Analysis. CoRR abs/2311.08172 (2023) - [i78]Atin Sakkeer Hussain, Shansong Liu, Chenshuo Sun, Ying Shan:
M2UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models. CoRR abs/2311.11255 (2023) - [i77]Xiaohan Ding, Yiyuan Zhang, Yixiao Ge, Sijie Zhao, Lin Song, Xiangyu Yue, Ying Shan:
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition. CoRR abs/2311.15599 (2023) - [i76]Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou:
ViT-Lens-2: Gateway to Omni-modal Intelligence. CoRR abs/2311.16081 (2023) - [i75]Zhihao Liang, Qi Zhang, Ying Feng, Ying Shan, Kui Jia:
GS-IR: 3D Gaussian Splatting for Inverse Rendering. CoRR abs/2311.16473 (2023) - [i74]Jingbo Zhang, Xiaoyu Li, Qi Zhang, Yanpei Cao, Ying Shan, Jing Liao:
HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion. CoRR abs/2311.16961 (2023) - [i73]Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu:
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting. CoRR abs/2311.17061 (2023) - [i72]Bohao Li, Yuying Ge, Yixiao Ge, Guangzhi Wang, Rui Wang, Ruimao Zhang, Ying Shan:
SEED-Bench-2: Benchmarking Multimodal Large Language Models. CoRR abs/2311.17092 (2023) - [i71]Xiangjun Gao, Xiaoyu Li, Chaopeng Zhang, Qi Zhang, Yanpei Cao, Ying Shan, Long Quan:
ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis. CoRR abs/2311.17123 (2023) - [i70]Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Xintao Wang, Yujiu Yang, Ying Shan:
StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter. CoRR abs/2312.00330 (2023) - [i69]Yue Ma, Xiaodong Cun, Yingqing He, Chenyang Qi, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen:
MagicStick: Controllable Video Editing via Control Handle Transformations. CoRR abs/2312.03047 (2023) - [i68]Zhouxia Wang, Ziyang Yuan, Xintao Wang, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan:
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation. CoRR abs/2312.03641 (2023) - [i67]Jiwen Yu, Xiaodong Cun, Chenyang Qi, Yong Zhang, Xintao Wang, Ying Shan, Jian Zhang:
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators. CoRR abs/2312.03793 (2023) - [i66]Zhen Li, Mingdeng Cao, Xintao Wang, Zhongang Qi, Ming-Ming Cheng, Ying Shan:
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding. CoRR abs/2312.04461 (2023) - [i65]Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion. CoRR abs/2312.04919 (2023) - [i64]Yongkang Yin, Xu Li, Ying Shan, Yuexian Zou:
AFL-Net: Integrating Audio, Facial, and Lip Modalities with Cross-Attention for Robust Speaker Diarization in the Wild. CoRR abs/2312.05730 (2023) - [i63]Yi Chen, Yuying Ge, Yixiao Ge, Mingyu Ding, Bohao Li, Rui Wang, Ruifeng Xu, Ying Shan, Xihui Liu:
EgoPlan-Bench: Benchmarking Egocentric Embodied Planning with Multimodal Large Language Models. CoRR abs/2312.06722 (2023) - [i62]Yuzhou Huang, Liangbin Xie, Xintao Wang, Ziyang Yuan, Xiaodong Cun, Yixiao Ge, Jiantao Zhou, Chao Dong, Rui Huang, Ruimao Zhang, Ying Shan:
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models. CoRR abs/2312.06739 (2023) - [i61]Jinguo Zhu, Xiaohan Ding, Yixiao Ge, Yuying Ge, Sijie Zhao, Hengshuang Zhao, Xiaohua Wang, Ying Shan:
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation. CoRR abs/2312.09251 (2023)
- 2022
- [j13]Yanping Fu, Zhenyu Gai, Haifeng Zhao, Shaojie Zhang, Ying Shan, Yang Wu, Jin Tang:
Depth-Aware Shadow Removal. Comput. Graph. Forum 41(7): 455-464 (2022) - [j12]Yu Li, Ye Zhu, Ruoteng Li, Xintao Wang, Yue Luo, Ying Shan:
Hybrid Warping Fusion for Video Frame Interpolation. Int. J. Comput. Vis. 130(12): 2980-2993 (2022) - [c65]Xiangguang Chen, Ye Zhu, Yu Li, Bingtao Fu, Lei Sun, Ying Shan, Shan Liu:
Robust Human Matting via Semantic Guidance. ACCV (2) 2022: 613-628 - [c64]Liangbin Xie, Xintao Wang, Honglun Zhang, Chao Dong, Ying Shan:
VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution. CVPR Workshops 2022: 656-665 - [c63]Shusheng Yang, Xinggang Wang, Yu Li, Yuxin Fang, Jiemin Fang, Wenyu Liu, Xun Zhao, Ying Shan:
Temporally Efficient Vision Transformer for Video Instance Segmentation. CVPR 2022: 2875-2885 - [c62]Ye Liu, Siyuan Li, Yang Wu, Chang Wen Chen, Ying Shan, Xiaohu Qie:
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection. CVPR 2022: 3032-3041 - [c61]Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Object-aware Video-language Pre-training for Retrieval. CVPR 2022: 3303-3312 - [c60]Yuying Ge, Yixiao Ge, Xihui Liu, Dian Li, Ying Shan, Xiaohu Qie, Ping Luo:
Bridging Video-text Retrieval with Multiple Choice Questions. CVPR 2022: 16146-16155 - [c59]Xixi Xu, Zhongang Qi, Jianqi Ma, Honglun Zhang, Ying Shan, Xiaohu Qie:
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild. CVPR 2022: 19130-19140 - [c58]Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan, Ming-Ming Cheng:
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder. ECCV (18) 2022: 126-143 - [c57]Xiaotong Li, Yixiao Ge, Kun Yi, Zixuan Hu, Ying Shan, Ling-Yu Duan:
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training. ECCV (30) 2022: 231-246 - [c56]Wenqi Shao, Xun Zhao, Yixiao Ge, Zhaoyang Zhang, Lei Yang, Xiaogang Wang, Ying Shan, Ping Luo:
Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space. ECCV (34) 2022: 286-302 - [c55]Yuying Ge, Yixiao Ge, Xihui Liu, Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo:
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval. ECCV (35) 2022: 691-708 - [c54]Chong Mou, Yanze Wu, Xintao Wang, Chao Dong, Jian Zhang, Ying Shan:
Metric Learning Based Interactive Modulation for Real-World Super-Resolution. ECCV (17) 2022: 723-740 - [c53]Ziyu Wang, Dejing Xu, Gus Xia, Ying Shan:
Audio-To-Symbolic Arrangement Via Cross-Modal Music Representation Learning. ICASSP 2022: 181-185 - [c52]Xiaotong Li, Yongxing Dai, Yixiao Ge, Jun Liu, Ying Shan, Lingyu Duan:
Uncertainty Modeling for Out-of-Distribution Generalization. ICLR 2022 - [c51]Wenqi Shao, Yixiao Ge, Zhaoyang Zhang, Xuyuan Xu, Xiaogang Wang, Ying Shan, Ping Luo:
Dynamic Token Normalization improves Vision Transformers. ICLR 2022 - [c50]Binjie Zhang, Yixiao Ge, Yantao Shen, Yu Li, Chun Yuan, Xuyuan Xu, Yexin Wang, Ying Shan:
Hot-Refresh Model Upgrades with Regression-Free Compatible Training in Image Retrieval. ICLR 2022 - [c49]Dazhao Du, Bing Su, Yu Li, Zhongang Qi, Lingyu Si, Ying Shan:
Convolutional Transformer with Similarity-based Boundary Prediction for Action Segmentation. ICTAI 2022: 855-860 - [c48]Binjie Zhang, Yixiao Ge, Yantao Shen, Shupeng Su, Fanzi Wu, Chun Yuan, Xuyuan Xu, Yexin Wang, Ying Shan:
Towards Universal Backward-Compatible Representation Learning. IJCAI 2022: 1615-1621 - [c47]Xu Li, Shansong Liu, Ying Shan:
A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion. INTERSPEECH 2022: 4307-4311 - [c46]Jibin Gao, Junfu Pu, Honglun Zhang, Ying Shan, Wei-Shi Zheng:
PC-Dance: Posture-controllable Music-driven Dance Synthesis. ACM Multimedia 2022: 1261-1269 - [c45]Xintao Wang, Chao Dong, Ying Shan:
RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization. ACM Multimedia 2022: 2556-2564 - [c44]Jiawei Liu, Yan-Pei Cao, Weijia Mao, Wenqiao Zhang, David Junhao Zhang, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes. NeurIPS 2022 - [c43]Yanze Wu, Xintao Wang, Gen Li, Ying Shan:
AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos. NeurIPS 2022 - [i60]Yuying Ge, Yixiao Ge, Xihui Liu, Dian Li, Ying Shan, Xiaohu Qie, Ping Luo:
BridgeFormer: Bridging Video-text Retrieval with Multiple Choice Questions. CoRR abs/2201.04850 (2022) - [i59]Binjie Zhang, Yixiao Ge, Yantao Shen, Yu Li, Chun Yuan, Xuyuan Xu, Yexin Wang, Ying Shan:
Hot-Refresh Model Upgrades with Regression-Alleviating Compatible Training in Image Retrieval. CoRR abs/2201.09724 (2022) - [i58]Xiaotong Li, Yongxing Dai, Yixiao Ge, Jun Liu, Ying Shan, Ling-Yu Duan:
Uncertainty Modeling for Out-of-Distribution Generalization. CoRR abs/2202.03958 (2022) - [i57]Binjie Zhang, Yixiao Ge, Yantao Shen, Shupeng Su, Fanzi Wu, Chun Yuan, Xuyuan Xu, Yexin Wang, Ying Shan:
Towards Universal Backward-Compatible Representation Learning. CoRR abs/2203.01583 (2022) - [i56]Alex Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
All in One: Exploring Unified Video-Language Pre-training. CoRR abs/2203.07303 (2022) - [i55]Guanyu Cai, Yixiao Ge, Alex Jinpeng Wang, Rui Yan, Xudong Lin, Ying Shan, Lianghua He, Xiaohu Qie, Jianping Wu, Mike Zheng Shou:
Revitalize Region Feature for Democratizing Video-Language Pre-training. CoRR abs/2203.07720 (2022) - [i54]Ye Liu, Siyuan Li, Yang Wu, Chang Wen Chen, Ying Shan, Xiaohu Qie:
UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection. CoRR abs/2203.12745 (2022) - [i53]Xiaotong Li, Yixiao Ge, Kun Yi, Zixuan Hu, Ying Shan, Ling-Yu Duan:
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training. CoRR abs/2203.15371 (2022) - [i52]Ziqi Zhang, Yuxin Chen, Zongyang Ma, Zhongang Qi, Chunfeng Yuan, Bing Li, Ying Shan, Weiming Hu:
CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation. CoRR abs/2203.16763 (2022) - [i51]Yuxin Fang, Shusheng Yang, Shijie Wang, Yixiao Ge, Ying Shan, Xinggang Wang:
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection. CoRR abs/2204.02964 (2022) - [i50]Shusheng Yang, Xinggang Wang, Yu Li, Yuxin Fang, Jiemin Fang, Wenyu Liu, Xun Zhao, Ying Shan:
Temporally Efficient Vision Transformer for Video Instance Segmentation. CoRR abs/2204.08412 (2022) - [i49]Yuying Ge, Yixiao Ge, Xihui Liu, Alex Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo:
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval. CoRR abs/2204.12408 (2022) - [i48]Shupeng Su, Binjie Zhang, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Ying Shan:
Privacy-Preserving Model Upgrades with Bidirectional Compatible Training in Image Retrieval. CoRR abs/2204.13919 (2022) - [i47]Liangbin Xie, Xintao Wang, Honglun Zhang, Chao Dong, Ying Shan:
VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution. CoRR abs/2205.03409 (2022) - [i46]Chong Mou, Yanze Wu, Xintao Wang, Chao Dong, Jian Zhang, Ying Shan:
Metric Learning based Interactive Modulation for Real-World Super-Resolution. CoRR abs/2205.05065 (2022) - [i45]Lijian Lin, Xintao Wang, Zhongang Qi, Ying Shan:
Accelerating the Training of Video Super-Resolution Models. CoRR abs/2205.05069 (2022) - [i44]Xintao Wang, Chao Dong, Ying Shan:
RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization. CoRR abs/2205.05671 (2022) - [i43]Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan, Ming-Ming Cheng:
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder. CoRR abs/2205.06803 (2022) - [i42]Kun Yi, Yixiao Ge, Xiaotong Li, Shusheng Yang, Dian Li, Jianping Wu, Ying Shan, Xiaohu Qie:
Masked Image Modeling with Denoising Contrast. CoRR abs/2205.09616 (2022) - [i41]Dazhao Du, Bing Su, Yu Li, Zhongang Qi, Lingyu Si, Ying Shan:
Efficient U-Transformer with Boundary-Aware Loss for Action Segmentation. CoRR abs/2205.13425 (2022) - [i40]Jiawei Liu, Yan-Pei Cao, Weijia Mao, Wenqiao Zhang, David Junhao Zhang, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes. CoRR abs/2205.15723 (2022) - [i39]Yanze Wu, Xintao Wang, Gen Li, Ying Shan:
AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos. CoRR abs/2206.07038 (2022) - [i38]Jia-Chang Feng, Fa-Ting Hong, Jia-Run Du, Zhongang Qi, Ying Shan, Xiaohu Qie, Wei-Shi Zheng, Jianping Wu:
Weakly-supervised Action Localization via Hierarchical Mining. CoRR abs/2206.11011 (2022) - [i37]Xu Li, Shansong Liu, Ying Shan:
A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion. CoRR abs/2206.13762 (2022) - [i36]Wenqi Shao, Xun Zhao, Yixiao Ge, Zhaoyang Zhang, Lei Yang, Xiaogang Wang, Ying Shan, Ping Luo:
Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space. CoRR abs/2207.03036 (2022) - [i35]Jiashuo Yu, Junfu Pu, Ying Cheng, Rui Feng, Ying Shan:
Self-Supervised Learning of Music-Dance Representation through Explicit-Implicit Rhythm Synchronization. CoRR abs/2207.03190 (2022) - [i34]Junfu Pu, Ying Shan:
Music-driven Dance Regeneration with Controllable Key Pose Constraints. CoRR abs/2207.03682 (2022) - [i33]Zi-Xin Zou, Shi-Sheng Huang, Yan-Pei Cao, Tai-Jiang Mu, Ying Shan, Hongbo Fu:
MonoNeuralFusion: Online Monocular Neural 3D Reconstruction with Geometric Priors. CoRR abs/2209.15153 (2022) - [i32]Xiangguang Chen, Ye Zhu, Yu Li, Bingtao Fu, Lei Sun, Ying Shan, Shan Liu:
Robust Human Matting via Semantic Guidance. CoRR abs/2210.05210 (2022) - [i31]Binjie Zhang, Shupeng Su, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan:
Darwinian Model Upgrades: Model Evolving with Selective Compatibility. CoRR abs/2210.06954 (2022) - [i30]Runbang Zhang, Yixiao Zhang, Kai Shao, Ying Shan, Gus Xia:
Vis2Mus: Exploring Multimodal Representation Mapping for Controllable Music Generation. CoRR abs/2211.05543 (2022) - [i29]Yue Chen, Xingyu Chen, Xuan Wang, Qi Zhang, Yu Guo, Ying Shan, Fei Wang:
Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields. CoRR abs/2211.11505 (2022) - [i28]Wenxuan Zhang, Xiaodong Cun, Xuan Wang, Yong Zhang, Xi Shen, Yu Guo, Ying Shan, Fei Wang:
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation. CoRR abs/2211.12194 (2022) - [i27]Yingqing He, Tianyu Yang, Yong Zhang, Ying Shan, Qifeng Chen:
Latent Video Diffusion Models for High-Fidelity Video Generation with Arbitrary Lengths. CoRR abs/2211.13221 (2022) - [i26]Yunpeng Bai, Yanbo Fan, Xuan Wang, Yong Zhang, Jingxiang Sun, Chun Yuan, Ying Shan:
High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors. CoRR abs/2211.15064 (2022) - [i25]Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Öztireli, Yujiu Yang:
3D GAN Inversion with Facial Symmetry Prior. CoRR abs/2211.16927 (2022) - [i24]Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis. CoRR abs/2212.03185 (2022) - [i23]Liangbin Xie, Xintao Wang, Shuwei Shi, Jinjin Gu, Chao Dong, Ying Shan:
Mitigating Artifacts in Real-World Video Super-Resolution Models. CoRR abs/2212.07339 (2022) - [i22]Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. CoRR abs/2212.11565 (2022) - [i21]Jiale Xu, Xintao Wang, Weihao Cheng, Yan-Pei Cao, Ying Shan, Xiaohu Qie, Shenghua Gao:
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models. CoRR abs/2212.14704 (2022)
- 2021
- [j11]Jianming Zhang, Junxiang Lian, Zhaoxiang Yi, Shuwang Yang, Ying Shan:
High-Accuracy Guide Star Catalogue Generation with a Machine Learning Classification Algorithm. Sensors 21(8): 2647 (2021) - [c42]Yanbei Chen, Yongqin Xian, A. Sophia Koepke, Ying Shan, Zeynep Akata:
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning. CVPR 2021: 7016-7025 - [c41]Xintao Wang, Yu Li, Honglun Zhang, Ying Shan:
Towards Real-World Blind Face Restoration With Generative Facial Prior. CVPR 2021: 9168-9178 - [c40]Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu:
Open-Book Video Captioning With Retrieve-Copy-Generate Network. CVPR 2021: 9837-9846 - [c39]Yuxin Fang, Shusheng Yang, Xinggang Wang, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu:
Instances as Queries. ICCV 2021: 6890-6899 - [c38]Shusheng Yang, Yuxin Fang, Xinggang Wang, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu:
Crossover Learning for Fast Online Video Instance Segmentation. ICCV 2021: 8023-8032 - [c37]Yanze Wu, Xintao Wang, Yu Li, Honglun Zhang, Xun Zhao, Ying Shan:
Towards Vivid and Diverse Image Colorization with Generative Color Prior. ICCV 2021: 14357-14366 - [c36]Siyuan Li, Yue Luo, Ye Zhu, Xun Zhao, Yu Li, Ying Shan:
Enforcing Temporal Consistency in Video Depth Estimation. ICCVW 2021: 1145-1154 - [c35]Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan:
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. ICCVW 2021: 1905-1914 - [c34]Xiao Wang, Weirong Ye, Zhongang Qi, Xun Zhao, Guangge Wang, Ying Shan, Hanzi Wang:
Semantic-Guided Relation Propagation Network for Few-shot Action Recognition. ACM Multimedia 2021: 816-825 - [c33]Di Jin, Zhongang Qi, Yingmin Luo, Ying Shan:
TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding. ACM Multimedia 2021: 1093-1101 - [c32]Fa-Ting Hong, Jia-Chang Feng, Dan Xu, Ying Shan, Wei-Shi Zheng:
Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization. ACM Multimedia 2021: 1591-1599 - [c31]Liangbin Xie, Xintao Wang, Chao Dong, Zhongang Qi, Ying Shan:
Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution. NeurIPS 2021: 51-61 - [i20]Xintao Wang, Yu Li, Honglun Zhang, Ying Shan:
Towards Real-World Blind Face Restoration with Generative Facial Prior. CoRR abs/2101.04061 (2021) - [i19]Tairu Qiu, Guanxian Chen, Zhongang Qi, Bin Li, Ying Shan, Xiangyang Xue:
A Generic Object Re-identification System for Short Videos. CoRR abs/2102.05275 (2021) - [i18]Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu:
Open-book Video Captioning with Retrieve-Copy-Generate Network. CoRR abs/2103.05284 (2021) - [i17]Shusheng Yang, Yuxin Fang, Xinggang Wang, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu:
Crossover Learning for Fast Online Video Instance Segmentation. CoRR abs/2104.05970 (2021) - [i16]Yanbei Chen, Yongqin Xian, A. Sophia Koepke, Ying Shan, Zeynep Akata:
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning. CoRR abs/2104.10955 (2021) - [i15]Yuxin Fang, Shusheng Yang, Xinggang Wang, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu:
Instances as Queries. CoRR abs/2105.01928 (2021) - [i14]Shusheng Yang, Yuxin Fang, Xinggang Wang, Yu Li, Ying Shan, Bin Feng, Wenyu Liu:
Tracking Instances as Queries. CoRR abs/2106.11963 (2021) - [i13]Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan:
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. CoRR abs/2107.10833 (2021) - [i12]Fa-Ting Hong, Jia-Chang Feng, Dan Xu, Ying Shan, Wei-Shi Zheng:
Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization. CoRR abs/2107.12589 (2021) - [i11]Liangbin Xie, Xintao Wang, Chao Dong, Zhongang Qi, Ying Shan:
Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution. CoRR abs/2108.01070 (2021) - [i10]Yanze Wu, Xintao Wang, Yu Li, Honglun Zhang, Xun Zhao, Ying Shan:
Towards Vivid and Diverse Image Colorization with Generative Color Prior. CoRR abs/2108.08826 (2021) - [i9]Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Object-aware Video-language Pre-training for Retrieval. CoRR abs/2112.00656 (2021) - [i8]Wenqi Shao, Yixiao Ge, Zhaoyang Zhang, Xuyuan Xu, Xiaogang Wang, Ying Shan, Ping Luo:
Dynamic Token Normalization Improves Vision Transformer. CoRR abs/2112.02624 (2021) - [i7]Ziyu Wang, Dejing Xu, Gus Xia, Ying Shan:
Audio-to-symbolic Arrangement via Cross-modal Music Representation Learning. CoRR abs/2112.15110 (2021)
- 2020
- [c30]Yu Li, Zhuoran Shen, Ying Shan:
Fast Video Object Segmentation Using the Global Context Module. ECCV (10) 2020: 735-750 - [c29]Jiayin Cai, Chun Yuan, Cheng Shi, Lei Li, Yangyang Cheng, Ying Shan:
Feature Augmented Memory with Global Attention Network for VideoQA. IJCAI 2020: 998-1004 - [c28]Lijian Lin, Haosheng Chen, Honglun Zhang, Jun Liang, Yu Li, Ying Shan, Hanzi Wang:
Dual Semantic Fusion Network for Video Object Detection. ACM Multimedia 2020: 1855-1863 - [c27]Zirui Liu, Qingquan Song, Kaixiong Zhou, Ting-Hsiang Wang, Ying Shan, Xia Hu:
Detecting Interactions from Neural Networks via Topological Analysis. NeurIPS 2020 - [i6]Yu Li, Zhuoran Shen, Ying Shan:
Fast Video Object Segmentation using the Global Context Module. CoRR abs/2001.11243 (2020) - [i5]Lijian Lin, Haosheng Chen, Honglun Zhang, Jun Liang, Yu Li, Ying Shan, Hanzi Wang:
Dual Semantic Fusion Network for Video Object Detection. CoRR abs/2009.07498 (2020) - [i4]Binjie Zhang, Yu Li, Chun Yuan, Dejing Xu, Pin Jiang, Ying Shan:
A Simple Yet Effective Method for Video Temporal Grounding with Cross-Modality Attention. CoRR abs/2009.11232 (2020) - [i3]Zirui Liu, Qingquan Song, Kaixiong Zhou, Ting-Hsiang Wang, Ying Shan, Xia Hu:
Towards Interaction Detection Using Topological Analysis on Neural Networks. CoRR abs/2010.13015 (2020)
2010 – 2019
- 2019
- [c26]Ying Shan, Anqi Cui, Luchen Tan, Kun Xiong:
Overview of the NLPCC 2019 Shared Task: Open Domain Conversation Evaluation. NLPCC (2) 2019: 829-834
- 2018
- [c25]Ying Shan, Jian Jiao, Jie Zhu, J. C. Mao:
Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors. KDD 2018: 2170-2179 - [i2]Ying Shan, Jian Jiao, Jie Zhu, J. C. Mao:
Recurrent Binary Embedding for GPU-Enabled Exhaustive Retrieval from Billion-Scale Semantic Vectors. CoRR abs/1802.06466 (2018)
- 2017
- [c24]Jie Zhu, Ying Shan, J. C. Mao, Dong Yu, Holakou Rahmanian, Yi Zhang:
Deep Embedding Forest: Forest-based Serving with Deep Embedding Features. KDD 2017: 1703-1711 - [i1]Jie Zhu, Ying Shan, J. C. Mao, Dong Yu, Holakou Rahmanian, Yi Zhang:
Deep Embedding Forest: Forest-based Serving with Deep Embedding Features. CoRR abs/1703.05291 (2017)
- 2016
- [c23]Ying Shan, T. Ryan Hoens, Jian Jiao, Haijing Wang, Dong Yu, J. C. Mao:
Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features. KDD 2016: 255-262
- 2013
- [c22]Liang Yin, Junmiao Wang, Ying Shan, Yi Jin, Zilin Sun, Weifeng Huang, Binbin Li:
An Empirical Research on Designing and Promoting the Brand Logo of Yangshan Shuimi Peaches Based on the Theory of Brand Experience. HCI (20) 2013: 165-174
- 2010
- [j10]Shai Avidan, Simon Baker, Ying Shan:
Internet Vision. Proc. IEEE 98(8): 1367-1369 (2010)
2000 – 2009
- 2009
- [c21]Gang Hua, Cha Zhang, Zicheng Liu, Zhengyou Zhang, Ying Shan:
Efficient Scale-Space Spatiotemporal Saliency Tracking for Distortion-Free Video Retargeting. ACCV (2) 2009: 182-192 - [c20]Ying Shan, Nianmin Yao:
NSM: A Security Mechanism for Object-Based Storage System. CSIE (4) 2009: 48-52 - [c19]Ying Shan, Guang Deng:
Kernel PCA Regression for Missing Data Estimation in DNA Microarray Analysis. ISCAS 2009: 1477-1480
- 2008
- [j9]James A. Sethian, Ying Shan:
Solving partial differential equations on irregular domains with moving interfaces, with applications to superconformal electrodeposition in semiconductor manufacturing. J. Comput. Phys. 227(13): 6411-6447 (2008) - [j8]Ying Shan, Harpreet S. Sawhney, Rakesh Kumar:
Unsupervised Learning of Discriminative Edge Measures for Vehicle Matching between Nonoverlapping Cameras. IEEE Trans. Pattern Anal. Mach. Intell. 30(4): 700-711 (2008) - [c18]Feng Han, Ying Shan, Harpreet S. Sawhney, Rakesh Kumar:
Discovering class specific composite features through discriminative sampling with Swendsen-Wang Cut. CVPR 2008
- 2007
- [j7]Yanlin Guo, Steven C. Hsu, Harpreet S. Sawhney, Rakesh Kumar, Ying Shan:
Robust Object Matching for Persistent Tracking with Heterogeneous Features. IEEE Trans. Pattern Anal. Mach. Intell. 29(5): 824-839 (2007) - [c17]Yanlin Guo, Ying Shan, Harpreet S. Sawhney, Rakesh Kumar:
PEET: Prototype Embedding and Embedding Transition for Matching Vehicles over Disparate Viewpoints. CVPR 2007
- 2006
- [j6]Ying Shan, Harpreet S. Sawhney, Bogdan Matei, Rakesh Kumar:
Shapeme Histogram Projection and Matching for Partial Object Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(4): 568-577 (2006) - [j5]Bogdan Matei, Ying Shan, Harpreet S. Sawhney, Yi Tan, Rakesh Kumar, Daniel F. Huber, Martial Hebert:
Rapid Object Indexing Using Locality Sensitive Hashing and Joint 3D-Signature Space Estimation. IEEE Trans. Pattern Anal. Mach. Intell. 28(7): 1111-1126 (2006) - [c16]Ying Shan, Feng Han, Harpreet S. Sawhney, Rakesh Kumar:
Learning Exemplar-Based Categorization for the Detection of Multi-View Multi-Pose Objects. CVPR (2) 2006: 1431-1438
- 2005
- [j4]Ying Shan, Harpreet S. Sawhney, Art Pope:
Clustering multiple image sequences with a sequence-to-sequence similarity measure. Int. J. Pattern Recognit. Artif. Intell. 19(4): 551-564 (2005) - [c15]Yanlin Guo, Steven C. Hsu, Ying Shan, Harpreet S. Sawhney, Rakesh Kumar:
Vehicle Fingerprinting for Reacquisition and Tracking in Videos. CVPR (2) 2005: 761-768 - [c14]Ying Shan, Harpreet S. Sawhney, Rakesh Kumar:
Unsupervised Learning of Discriminative Edge Measures for Vehicle Matching between Non-Overlapping Cameras. CVPR (1) 2005: 894-901 - [c13]Ying Shan, Harpreet S. Sawhney, Rakesh Kumar:
Vehicle Identification between Non-Overlapping Cameras without Direct Feature Matching. ICCV 2005: 378-385
- 2004
- [j3]Zicheng Liu, Zhengyou Zhang, Ying Shan:
Image-Based Surface Detail Transfer. IEEE Computer Graphics and Applications 24(3): 30-35 (2004) - [j2]Zhengyou Zhang, Zicheng Liu, Dennis Adler, Michael F. Cohen, Erik Hanson, Ying Shan:
Robust and Rapid Generation of Animated Faces from Video Images: A Model-Based Modeling Approach. Int. J. Comput. Vis. 58(2): 93-119 (2004) - [c12]Ying Shan, Bogdan Matei, Harpreet S. Sawhney, Rakesh Kumar, Daniel F. Huber, Martial Hebert:
Linear Model Hashing and Batch RANSAC for Rapid and Accurate Object Recognition. CVPR (2) 2004: 121-128 - [c11]Ying Shan, Harpreet S. Sawhney, Bogdan Matei, Rakesh Kumar:
Partial Object Matching with Shapeme Histograms. ECCV (3) 2004: 442-455
- 2003
- [c10]Zhengyou Zhang, Ying Shan:
Incremental motion estimation through modified bundle adjustment. ICIP (2) 2003: 343-346
- 2002
- [j1]Ying Shan, Zhengyou Zhang:
New Measurements and Corner-Guidance for Curve Matching with Probabilistic Relaxation. Int. J. Comput. Vis. 46(2): 157-171 (2002)
- 2001
- [c9]Ying Shan, Zicheng Liu, Zhengyou Zhang:
Image-Based Surface Detail Transfer. CVPR (2) 2001: 794- - [c8]Ying Shan, Zicheng Liu, Zhengyou Zhang:
Model-Based Bundle Adjustment with Application to Face Modeling. ICCV 2001: 644-651 - [c7]Zhengyou Zhang, Zicheng Liu, Dennis Adler, Michael F. Cohen, Erik Hanson, Ying Shan:
Cloning Your Own Face with a Desktop Camera. ICCV 2001: 745 - [c6]Zhengyou Zhang, Ying Wu, Ying Shan, Steven Shafer:
Visual panel: virtual mouse, keyboard and 3D controller with an ordinary piece of paper. PUI 2001: 3:1-3:8 - [c5]Zicheng Liu, Ying Shan, Zhengyou Zhang:
Expressive expression mapping with ratio images. SIGGRAPH 2001: 271-276
- 2000
- [c4]Ying Shan, Zhengyou Zhang:
Corner Guided Curve Matching and its Application to Scene Reconstruction. CVPR 2000: 1796-1803 - [c3]Zhengyou Zhang, Ying Shan:
Visual Screen: Transforming an Ordinary Screen into a Touch Screen. MVA 2000: 215-218 - [c2]Ying Shan, Zhengyou Zhang:
Curve Matching with Probabilistic Relaxation. MVA 2000: 248-253 - [c1]Zhengyou Zhang, Ying Shan:
A Progressive Scheme for Stereo Matching. SMILE 2000: 68-85
last updated on 2024-12-23 19:32 CET by the dblp team
all metadata released as open data under CC0 1.0 license