Weizhu Chen
2020 – today
- 2024
- [c109]Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen:
Competition-Level Problems are Effective LLM Evaluators. ACL (Findings) 2024: 13526-13544 - [c108]Shengnan An, Zexiong Ma, Siqi Cai, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen:
Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks. EMNLP (Findings) 2024: 833-854 - [c107]Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen:
Automatic Instruction Evolving for Large Language Models. EMNLP 2024: 6998-7018 - [c106]Ming Zhong, Chenxin An, Weizhu Chen, Jiawei Han, Pengcheng He:
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective. ICLR 2024 - [c105]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen:
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing. ICLR 2024 - [c104]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen:
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving. ICLR 2024 - [c103]Yixiao Li, Yifan Yu, Chen Liang, Nikos Karampatziakis, Pengcheng He, Weizhu Chen, Tuo Zhao:
LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models. ICLR 2024 - [c102]Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang:
Supervised Knowledge Makes Large Language Models Better In-context Learners. ICLR 2024 - [c101]Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen:
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators. NAACL (Industry Track) 2024: 165-190 - [c100]Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan:
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models. NAACL-HLT (Findings) 2024: 2299-2314 - [c99]Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen:
Language Models can be Deductive Solvers. NAACL-HLT (Findings) 2024: 4026-4042 - [i114]Yueqin Yin, Zhendong Wang, Yi Gu, Hai Huang, Weizhu Chen, Mingyuan Zhou:
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts. CoRR abs/2402.10958 (2024) - [i113]Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Hassan Awadalla, Weizhu Chen:
SciAgent: Tool-augmented Language Models for Scientific Reasoning. CoRR abs/2402.11451 (2024) - [i112]Ming Zhong, Yelong Shen, Shuohang Wang, Yadong Lu, Yizhu Jiao, Siru Ouyang, Donghan Yu, Jiawei Han, Weizhu Chen:
Multi-LoRA Composition for Image Generation. CoRR abs/2402.16843 (2024) - [i111]Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen:
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning. CoRR abs/2403.02333 (2024) - [i110]Xinzhe Ni, Yeyun Gong, Zhibin Gou, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen:
Exploring the Mystery of Influential Data for Mathematical Reasoning. CoRR abs/2404.01067 (2024) - [i109]Vlad Fomenko, Han Yu, Jongho Lee, Stanley Hsieh, Weizhu Chen:
A Note on LoRA. CoRR abs/2404.05086 (2024) - [i108]Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen:
Rho-1: Not All Tokens Are What You Need. CoRR abs/2404.07965 (2024) - [i107]Marah I Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat S. Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou:
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone. CoRR abs/2404.14219 (2024) - [i106]Yueqin Yin, Zhendong Wang, Yujia Xie, Weizhu Chen, Mingyuan Zhou:
Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment. CoRR abs/2405.20830 (2024) - [i105]Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen:
Automatic Instruction Evolving for Large Language Models. CoRR abs/2406.00770 (2024) - [i104]Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen:
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling. CoRR abs/2406.07522 (2024) - [i103]Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, Jianguang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen:
Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena. CoRR abs/2407.10627 (2024) - [i102]Liyuan Liu, Young Jin Kim, Shuohang Wang, Chen Liang, Yelong Shen, Hao Cheng, Xiaodong Liu, Masahiro Tanaka, Xiaoxia Wu, Wenxiang Hu, Vishrav Chaudhary, Zeqi Lin, Chengruidong Zhang, Jilong Xue, Hany Awadalla, Jianfeng Gao, Weizhu Chen:
GRIN: GRadient-INformed MoE. CoRR abs/2409.12136 (2024) - [i101]Yaming Yang, Dilxat Muhtar, Yelong Shen, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Denvy Deng, Feng Sun, Qi Zhang, Weizhu Chen, Yunhai Tong:
MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning. CoRR abs/2410.09437 (2024)
- 2023
- [c98]Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan:
Code Execution with Pre-trained Language Models. ACL (Findings) 2023: 4984-4999 - [c97]Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen:
Making Language Models Better Reasoners with Step-Aware Verifier. ACL (1) 2023: 5315-5333 - [c96]Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen:
Joint Generator-Ranker Learning for Natural Language Generation. ACL (Findings) 2023: 7681-7699 - [c95]Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng:
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models. ACL (1) 2023: 8208-8222 - [c94]Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen:
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation. EMNLP 2023: 2471-2484 - [c93]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy. EMNLP (Findings) 2023: 9248-9274 - [c92]Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou:
Skill-Based Few-Shot Selection for In-Context Learning. EMNLP 2023: 13472-13492 - [c91]Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen:
CodeT: Code Generation with Generated Tests. ICLR 2023 - [c90]Pengcheng He, Jianfeng Gao, Weizhu Chen:
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing. ICLR 2023 - [c89]Zhendong Wang, Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Diffusion-GAN: Training GANs with Diffusion. ICLR 2023 - [c88]Qingru Zhang, Minshuo Chen, Alexander Bukharin, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao:
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning. ICLR 2023 - [c87]Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders. ICLR 2023 - [c86]Yixiao Li, Yifan Yu, Qingru Zhang, Chen Liang, Pengcheng He, Weizhu Chen, Tuo Zhao:
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation. ICML 2023: 20336-20350 - [c85]Chen Liang, Simiao Zuo, Qingru Zhang, Pengcheng He, Weizhu Chen, Tuo Zhao:
Less is More: Task-aware Layer-wise Distillation for Language Model Compression. ICML 2023: 20852-20867 - [c84]Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Nan Duan, Weizhu Chen:
Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise. ICML 2023: 21051-21064 - [c83]Jason Phang, Yi Mao, Pengcheng He, Weizhu Chen:
HyperTuning: Toward Adapting Large Language Models without Back-propagation. ICML 2023: 27854-27875 - [c82]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models. ICML 2023: 30706-30775 - [c81]Anh Nguyen, Nikos Karampatziakis, Weizhu Chen:
Meet in the Middle: A New Pre-training Paradigm. NeurIPS 2023 - [c80]Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang (Atlas) Wang, Mingyuan Zhou:
In-Context Learning Unlocked for Diffusion Models. NeurIPS 2023 - [c79]Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou:
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models. NeurIPS 2023 - [c78]Tong Wu, Zhihao Fan, Xiao Liu, Hai-Tao Zheng, Yeyun Gong, Yelong Shen, Jian Jiao, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen:
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation. NeurIPS 2023 - [i100]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models. CoRR abs/2302.00618 (2023) - [i99]Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao:
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback. CoRR abs/2302.12813 (2023) - [i98]Anh Nguyen, Nikos Karampatziakis, Weizhu Chen:
Meet in the Middle: A New Pre-training Paradigm. CoRR abs/2303.07295 (2023) - [i97]Qingru Zhang, Minshuo Chen, Alexander Bukharin, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao:
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning. CoRR abs/2303.10512 (2023) - [i96]Fengji Zhang, Bei Chen, Yue Zhang, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen:
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation. CoRR abs/2303.12570 (2023) - [i95]Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen:
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators. CoRR abs/2303.16854 (2023) - [i94]Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan:
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models. CoRR abs/2304.06364 (2023) - [i93]Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou:
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models. CoRR abs/2304.12526 (2023) - [i92]Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou:
In-Context Learning Unlocked for Diffusion Models. CoRR abs/2305.01115 (2023) - [i91]Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan:
Code Execution with Pre-trained Language Models. CoRR abs/2305.05383 (2023) - [i90]Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen:
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation. CoRR abs/2305.09515 (2023) - [i89]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen:
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing. CoRR abs/2305.11738 (2023) - [i88]Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou:
Skill-Based Few-Shot Selection for In-Context Learning. CoRR abs/2305.14210 (2023) - [i87]Woojeong Jin, Subhabrata Mukherjee, Yu Cheng, Yelong Shen, Weizhu Chen, Ahmed Hassan Awadallah, Damien Jose, Xiang Ren:
GRILL: Grounded Vision-language Pre-training via Aligning Text and Image Regions. CoRR abs/2305.14676 (2023) - [i86]Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen:
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy. CoRR abs/2305.15294 (2023) - [i85]Yixiao Li, Yifan Yu, Qingru Zhang, Chen Liang, Pengcheng He, Weizhu Chen, Tuo Zhao:
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation. CoRR abs/2306.11222 (2023) - [i84]Alexander Bukharin, Yixiao Li, Pengcheng He, Weizhu Chen, Tuo Zhao:
Deep Reinforcement Learning from Hierarchical Weak Preference Feedback. CoRR abs/2309.02632 (2023) - [i83]Baizhou Huang, Shuai Lu, Weizhu Chen, Xiaojun Wan, Nan Duan:
Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency. CoRR abs/2309.17272 (2023) - [i82]Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen:
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving. CoRR abs/2309.17452 (2023) - [i81]Liyuan Liu, Jianfeng Gao, Weizhu Chen:
Sparse Backpropagation for MoE Training. CoRR abs/2310.00811 (2023) - [i80]Yixiao Li, Yifan Yu, Chen Liang, Pengcheng He, Nikos Karampatziakis, Weizhu Chen, Tuo Zhao:
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models. CoRR abs/2310.08659 (2023) - [i79]Ming Zhong, Chenxin An, Weizhu Chen, Jiawei Han, Pengcheng He:
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective. CoRR abs/2310.11451 (2023) - [i78]Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen:
Learning From Mistakes Makes LLM Better Reasoner. CoRR abs/2310.20689 (2023) - [i77]Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen:
Language Models can be Logical Solvers. CoRR abs/2311.06158 (2023) - [i76]Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen:
Competition-Level Problems are Effective LLM Evaluators. CoRR abs/2312.02143 (2023) - [i75]Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang:
Supervised Knowledge Makes Large Language Models Better In-context Learners. CoRR abs/2312.15918 (2023)
- 2022
- [j2]Caihong Mu, Weizhu Chen, Yi Liu, Dongchang Lei, Ruochen Liu:
Virtual information core optimization for collaborative filtering recommendation based on clustering and evolutionary algorithms. Appl. Soft Comput. 116: 108355 (2022) - [c77]Xiaoze Jiang, Yaobo Liang, Weizhu Chen, Nan Duan:
XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge. AAAI 2022: 10840-10848 - [c76]Zhuocheng Gong, Di He, Yelong Shen, Tie-Yan Liu, Weizhu Chen, Dongyan Zhao, Ji-Rong Wen, Rui Yan:
Finding the Dominant Winning Ticket in Pre-Trained Language Models. ACL (Findings) 2022: 1459-1472 - [c75]Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren:
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models. ACL (1) 2022: 2763-2775 - [c74]Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen:
Controllable Natural Language Generation with Contrastive Prefixes. ACL (Findings) 2022: 2912-2924 - [c73]Wei Chen, Yeyun Gong, Song Wang, Bolun Yao, Weizhen Qi, Zhongyu Wei, Xiaowu Hu, Bartuer Zhou, Yi Mao, Weizhu Chen, Biao Cheng, Nan Duan:
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation. ACL (1) 2022: 4852-4864 - [c72]Tianyu Liu, Yizhe Zhang, Chris Brockett, Yi Mao, Zhifang Sui, Weizhu Chen, Bill Dolan:
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation. ACL (1) 2022: 6723-6737 - [c71]Chen Liang, Pengcheng He, Yelong Shen, Weizhu Chen, Tuo Zhao:
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing. ACL (1) 2022: 7162-7175 - [c70]Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen:
What Makes Good In-Context Examples for GPT-3? DeeLIO@ACL 2022: 100-114 - [c69]Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Chen, Ahmed Awadallah, Zhangyang Wang:
Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models. ECCV (23) 2022: 389-405 - [c68]Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan:
Soft-Labeled Contrastive Pre-Training for Function-Level Code Representation. EMNLP (Findings) 2022: 118-129 - [c67]Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Qiang Fu, Yan Gao, Jian-Guang Lou, Weizhu Chen:
Reasoning Like Program Executors. EMNLP 2022: 761-779 - [c66]Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan:
CodeRetriever: A Large Scale Contrastive Pre-Training Method for Code Search. EMNLP 2022: 2898-2910 - [c65]Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen:
LoRA: Low-Rank Adaptation of Large Language Models. ICLR 2022 - [c64]Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models. ICLR 2022 - [c63]Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou:
TAPEX: Table Pre-training via Learning a Neural SQL Executor. ICLR 2022 - [c62]Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen:
Adversarial Retriever-Ranker for Dense Text Retrieval. ICLR 2022 - [c61]Qingru Zhang, Simiao Zuo, Chen Liang, Alexander Bukharin, Pengcheng He, Weizhu Chen, Tuo Zhao:
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance. ICML 2022: 26809-26823 - [c60]Daoguang Zan, Bei Chen, Dejian Yang, Zeqi Lin, Minsu Kim, Bei Guan, Yongji Wang, Weizhu Chen, Jian-Guang Lou:
CERT: Continual Pre-training on Sketches for Library-oriented Code Generation. IJCAI 2022: 2369-2375 - [c59]Zhengbao Jiang, Yi Mao, Pengcheng He, Graham Neubig, Weizhu Chen:
OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering. NAACL-HLT 2022: 932-942 - [c58]Shujian Zhang, Chengyue Gong, Xingchao Liu, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
ALLSH: Active Learning Guided by Local Sensitivity and Hardness. NAACL-HLT (Findings) 2022: 1328-1342 - [c57]Simiao Zuo, Qingru Zhang, Chen Liang, Pengcheng He, Tuo Zhao, Weizhu Chen:
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation. NAACL-HLT 2022: 1610-1623 - [i74]Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan:
CodeRetriever: Unimodal and Bimodal Contrastive Learning. CoRR abs/2201.10866 (2022) - [i73]Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Yan Gao, Qiang Fu, Jian-Guang Lou, Weizhu Chen:
Reasoning Like Program Executors. CoRR abs/2201.11473 (2022) - [i72]Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models. CoRR abs/2202.02664 (2022) - [i71]Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs. CoRR abs/2202.06510 (2022) - [i70]Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Truncated Diffusion Probabilistic Models. CoRR abs/2202.09671 (2022) - [i69]Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen:
Controllable Natural Language Generation with Contrastive Prefixes. CoRR abs/2202.13257 (2022) - [i68]Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng, Jian-Guang Lou:
Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models. CoRR abs/2203.03131 (2022) - [i67]Greg Yang, Edward J. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, Jakub Pachocki, Weizhu Chen, Jianfeng Gao:
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer. CoRR abs/2203.03466 (2022) - [i66]Chen Liang, Pengcheng He, Yelong Shen, Weizhu Chen, Tuo Zhao:
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing. CoRR abs/2204.06625 (2022) - [i65]Simiao Zuo, Qingru Zhang, Chen Liang, Pengcheng He, Tuo Zhao, Weizhu Chen:
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation. CoRR abs/2204.07675 (2022) - [i64]Wei Chen, Yeyun Gong, Song Wang, Bolun Yao, Weizhen Qi, Zhongyu Wei, Xiaowu Hu, Bartuer Zhou, Yi Mao, Weizhu Chen, Biao Cheng, Nan Duan:
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation. CoRR abs/2204.13031 (2022) - [i63]Shujian Zhang, Chengyue Gong, Xingchao Liu, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
ALLSH: Active Learning Guided by Local Sensitivity and Hardness. CoRR abs/2205.04980 (2022) - [i62]Weizhen Qi, Yeyun Gong, Yelong Shen, Jian Jiao, Yu Yan, Houqiang Li, Ruofei Zhang, Weizhu Chen, Nan Duan:
A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation. CoRR abs/2205.11162 (2022) - [i61]Zhendong Wang, Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou:
Diffusion-GAN: Training GANs with Diffusion. CoRR abs/2206.02262 (2022) - [i60]Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen:
On the Advance of Making Language Models Better Reasoners. CoRR abs/2206.02336 (2022) - [i59]Daoguang Zan, Bei Chen, Dejian Yang, Zeqi Lin, Minsu Kim, Bei Guan, Yongji Wang, Weizhu Chen, Jian-Guang Lou:
CERT: Continual Pre-Training on Sketches for Library-Oriented Code Generation. CoRR abs/2206.06888 (2022) - [i58]Qingru Zhang, Simiao Zuo, Chen Liang, Alexander Bukharin, Pengcheng He, Weizhu Chen, Tuo Zhao:
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance. CoRR abs/2206.12562 (2022) - [i57]Weizhou Shen, Yeyun Gong, Yelong Shen, Song Wang, Xiaojun Quan, Nan Duan, Weizhu Chen:
Joint Generator-Ranker Learning for Natural Language Generation. CoRR abs/2206.13974 (2022) - [i56]Zhengbao Jiang, Yi Mao, Pengcheng He, Graham Neubig, Weizhu Chen:
OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering. CoRR abs/2207.03637 (2022) - [i55]Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen:
CodeT: Code Generation with Generated Tests. CoRR abs/2207.10397 (2022) - [i54]Chen Liang, Simiao Zuo, Qingru Zhang, Pengcheng He, Weizhu Chen, Tuo Zhao:
Less is More: Task-aware Layer-wise Distillation for Language Model Compression. CoRR abs/2210.01351 (2022) - [i53]Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan:
Soft-Labeled Contrastive Pre-training for Function-level Code Representation. CoRR abs/2210.09597 (2022) - [i52]Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, Weizhu Chen:
SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval. CoRR abs/2210.11773 (2022) - [i51]Biyang Guo, Yeyun Gong, Yelong Shen, Songqiao Han, Hailiang Huang, Nan Duan, Weizhu Chen:
GENIUS: Sketch-based Language Model Pre-training via Extreme and Selective Masking for Text Generation and Augmentation. CoRR abs/2211.10330 (2022) - [i50]Jason Phang, Yi Mao, Pengcheng He, Weizhu Chen:
HyperTuning: Toward Adapting Large Language Models without Back-propagation. CoRR abs/2211.12485 (2022) - [i49]Dong Li, Yelong Shen, Ruoming Jin, Yi Mao, Kuan Wang, Weizhu Chen:
Generation-Augmented Query Expansion For Code Retrieval. CoRR abs/2212.10692 (2022) - [i48]Zhenghao Lin, Yeyun Gong, Yelong Shen, Tong Wu, Zhihao Fan, Chen Lin, Weizhu Chen, Nan Duan:
GENIE: Large Scale Pre-training for Text Generation with Diffusion Model. CoRR abs/2212.11685 (2022)
- 2021
- [c56]Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen:
Reader-Guided Passage Reranking for Open-Domain Question Answering. ACL/IJCNLP (Findings) 2021: 344-350 - [c55]Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan:
GLGE: A New General Language Generation Evaluation Benchmark. ACL/IJCNLP (Findings) 2021: 408-420 - [c54]Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao:
UnitedQA: A Hybrid Approach for Open Domain Question Answering. ACL/IJCNLP (1) 2021: 3080-3090 - [c53]Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen:
Generation-Augmented Retrieval for Open-Domain Question Answering. ACL/IJCNLP (1) 2021: 4089-4100 - [c52]Yuekai Zhao, Li Dong, Yelong Shen, Zhihua Zhang, Furu Wei, Weizhu Chen:
Memory-Efficient Differentiable Transformer Architecture Search. ACL/IJCNLP (Findings) 2021: 4254-4264 - [c51]Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang:
HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalizability. ACL/IJCNLP (1) 2021: 4380-4390 - [c50]Chen Liang, Simiao Zuo, Minshuo Chen, Haoming Jiang, Xiaodong Liu, Pengcheng He, Tuo Zhao, Weizhu Chen:
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization. ACL/IJCNLP (1) 2021: 6524-6538 - [c49]Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Tuo Zhao:
Token-wise Curriculum Learning for Neural Machine Translation. EMNLP (Findings) 2021: 3658-3670 - [c48]Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
ARCH: Efficient Adversarial Regularized Training with Caching. EMNLP (Findings) 2021: 4118-4131 - [c47]Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach. EMNLP (1) 2021: 6562-6577 - [c46]Jiaxin Huang, Chunyuan Li, Krishan Subudhi, Damien Jose, Shobana Balakrishnan, Weizhu Chen, Baolin Peng, Jianfeng Gao, Jiawei Han:
Few-Shot Named Entity Recognition: An Empirical Baseline Study. EMNLP (1) 2021: 10408-10423 - [c45]Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, Noah A. Smith:
Finetuning Pretrained Transformers into RNNs. EMNLP (1) 2021: 10630-10643 - [c44]Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen:
DeBERTa: Decoding-enhanced BERT with Disentangled Attention. ICLR 2021 - [c43]Kevin J. Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin:
MixKD: Towards Efficient Distillation of Large-scale Language Models. ICLR 2021 - [c42]Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Weizhu Chen, Jiawei Han:
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding. ICLR 2021 - [c41]Weizhen Qi, Yeyun Gong, Jian Jiao, Yu Yan, Weizhu Chen, Dayiheng Liu, Kewen Tang, Houqiang Li, Jiusheng Chen, Ruofei Zhang, Ming Zhou, Nan Duan:
BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining. ICML 2021: 8630-8639 - [c40]Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen:
Poolingformer: Long Document Modeling with Pooling Attention. ICML 2021: 12437-12446 - [c39]Sandra Sajeev, Jade Huang, Nikos Karampatziakis, Matthew Hall, Sebastian Kochman, Weizhu Chen:
Contextual Bandit Applications in a Customer Support Bot. KDD 2021: 3522-3530 - [c38]Ge Yang, Edward J. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, Jakub Pachocki, Weizhu Chen, Jianfeng Gao:
Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer. NeurIPS 2021: 17084-17097 - [i47]Sewon Min, Jordan L. Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick S. H. Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Sejr Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih:
NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned. CoRR abs/2101.00133 (2021) - [i46]Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao:
UnitedQA: A Hybrid Approach for Open Domain Question Answering. CoRR abs/2101.00178 (2021) - [i45]Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen:
Reader-Guided Passage Reranking for Open-Domain Question Answering. CoRR abs/2101.00294 (2021) - [i44]Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen:
What Makes Good In-Context Examples for GPT-3? CoRR abs/2101.06804 (2021) - [i43]Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Tuo Zhao:
Token-wise Curriculum Learning for Neural Machine Translation. CoRR abs/2103.11088 (2021) - [i42]Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, Noah A. Smith:
Finetuning Pretrained Transformers into RNNs. CoRR abs/2103.13076 (2021) - [i41]Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach. CoRR abs/2104.04886 (2021) - [i40]Tianyu Liu, Yizhe Zhang, Chris Brockett, Yi Mao, Zhifang Sui, Weizhu Chen, Bill Dolan:
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation. CoRR abs/2104.08704 (2021) - [i39]Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen:
Poolingformer: Long Document Modeling with Pooling Attention. CoRR abs/2105.04371 (2021) - [i38]Chen Liang, Simiao Zuo, Minshuo Chen, Haoming Jiang, Xiaodong Liu, Pengcheng He, Tuo Zhao, Weizhu Chen:
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization. CoRR abs/2105.12002 (2021) - [i37]Yuekai Zhao, Li Dong, Yelong Shen, Zhihua Zhang, Furu Wei, Weizhu Chen:
Memory-Efficient Differentiable Transformer Architecture Search. CoRR abs/2105.14669 (2021) - [i36]Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang:
HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization. CoRR abs/2106.00149 (2021) - [i35]Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Weizhu Chen:
LoRA: Low-Rank Adaptation of Large Language Models. CoRR abs/2106.09685 (2021) - [i34]Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao:
ARCH: Efficient Adversarial Regularized Training with Caching. CoRR abs/2109.07048 (2021) - [i33]Xiaoze Jiang, Yaobo Liang, Weizhu Chen, Nan Duan:
XLM-K: Improving Cross-Lingual Language Model Pre-Training with Multilingual Knowledge. CoRR abs/2109.12573 (2021) - [i32]Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen:
Adversarial Retriever-Ranker for dense text retrieval. CoRR abs/2110.03611 (2021) - [i31]Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren:
A Good Prompt Is Worth Millions of Parameters? Low-resource Prompt-based Learning for Vision-Language Models. CoRR abs/2110.08484 (2021) - [i30]Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Chen, Zhangyang Wang, Ahmed Hassan Awadallah:
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models. CoRR abs/2111.00160 (2021) - [i29]Pengcheng He, Jianfeng Gao, Weizhu Chen:
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing. CoRR abs/2111.09543 (2021) - [i28]Sandra Sajeev, Jade Huang, Nikos Karampatziakis, Matthew Hall, Sebastian Kochman, Weizhu Chen:
Contextual Bandit Applications in Customer Support Bot. CoRR abs/2112.03210 (2021) - 2020
- [c37]Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao:
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding. ACL (demo) 2020: 118-126 - [c36]Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao:
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization. ACL 2020: 2177-2190 - [c35]Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Jiawei Han:
Understanding the Difficulty of Training Transformers. EMNLP (1) 2020: 5747-5763 - [c34]Tao Shen, Yi Mao, Pengcheng He, Guodong Long, Adam Trischler, Weizhu Chen:
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning. EMNLP (1) 2020: 8980-8994 - [c33]Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Jiawei Han:
On the Variance of the Adaptive Learning Rate and Beyond. ICLR 2020 - [c32]Sewon Min, Jordan L. Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick S. H. Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Sejr Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih:
NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned. NeurIPS (Competition and Demos) 2020: 86-111 - [i27]Yujia Xie, Tianyi Zhou, Yi Mao, Weizhu Chen:
Conditional Self-Attention for Query-based Summarization. CoRR abs/2002.07338 (2020) - [i26]Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao:
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding. CoRR abs/2002.07972 (2020) - [i25]Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Jiawei Han:
Understanding the Difficulty of Training Transformers. CoRR abs/2004.08249 (2020) - [i24]Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao:
Adversarial Training for Large Neural Language Models. CoRR abs/2004.08994 (2020) - [i23]Tao Shen, Yi Mao, Pengcheng He, Guodong Long, Adam Trischler, Weizhu Chen:
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning. CoRR abs/2004.14224 (2020) - [i22]Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen:
DeBERTa: Decoding-enhanced BERT with Disentangled Attention. CoRR abs/2006.03654 (2020) - [i21]Morteza Ziyadi, Yuting Sun, Abhishek Goswami, Jade Huang, Weizhu Chen:
Example-Based Named Entity Recognition. CoRR abs/2008.10570 (2020) - [i20]Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen:
Generation-Augmented Retrieval for Open-domain Question Answering. CoRR abs/2009.08553 (2020) - [i19]Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen:
A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation. CoRR abs/2009.13818 (2020) - [i18]Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao:
Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model. CoRR abs/2010.06040 (2020) - [i17]Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen:
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding. CoRR abs/2010.08670 (2020) - [i16]Kevin J. Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin:
MixKD: Towards Efficient Distillation of Large-scale Language Models. CoRR abs/2011.00593 (2020) - [i15]Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan:
GLGE: A New General Language Generation Evaluation Benchmark. CoRR abs/2011.11928 (2020) - [i14]Jiaxin Huang, Chunyuan Li, Krishan Subudhi, Damien Jose, Shobana Balakrishnan, Weizhu Chen, Baolin Peng, Jianfeng Gao, Jiawei Han:
Few-Shot Named Entity Recognition: A Comprehensive Study. CoRR abs/2012.14978 (2020) - [i13]Weizhen Qi, Yeyun Gong, Jian Jiao, Yu Yan, Dayiheng Liu, Weizhu Chen, Kewen Tang, Houqiang Li, Jiusheng Chen, Ruofei Zhang, Ming Zhou, Nan Duan:
BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining. CoRR abs/2012.15525 (2020)
2010 – 2019
- 2019
- [j1]Lin Xiao, Adams Wei Yu, Qihang Lin, Weizhu Chen:
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization. J. Mach. Learn. Res. 20: 43:1-43:58 (2019) - [c31]Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao:
Multi-Task Deep Neural Networks for Natural Language Understanding. ACL (1) 2019: 4487-4496 - [c30]Ziyi Yang, Chenguang Zhu, Weizhu Chen:
Parameter-free Sentence Embedding via Orthogonal Basis. EMNLP/IJCNLP (1) 2019: 638-648 - [c29]Jianmo Ni, Chenguang Zhu, Weizhu Chen, Julian J. McAuley:
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering. NAACL-HLT (1) 2019: 335-344 - [i12]Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao:
Multi-Task Deep Neural Networks for Natural Language Understanding. CoRR abs/1901.11504 (2019) - [i11]Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao:
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding. CoRR abs/1904.09482 (2019) - [i10]Nikos Karampatziakis, Sebastian Kochman, Jade Huang, Paul Mineiro, Kathy Osborne, Weizhu Chen:
Lessons from Real-World Reinforcement Learning in a Customer Support Bot. CoRR abs/1905.02219 (2019) - [i9]Pengcheng He, Xiaodong Liu, Weizhu Chen, Jianfeng Gao:
A Hybrid Neural Network Model for Commonsense Reasoning. CoRR abs/1907.11983 (2019) - [i8]Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Jiawei Han:
On the Variance of the Adaptive Learning Rate and Beyond. CoRR abs/1908.03265 (2019) - [i7]Pengcheng He, Yi Mao, Kaushik Chakrabarti, Weizhu Chen:
X-SQL: reinforce schema representation with context. CoRR abs/1908.08113 (2019) - [i6]Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao:
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization. CoRR abs/1911.03437 (2019) - 2018
- [c28]Hsin-Yuan Huang, Chenguang Zhu, Yelong Shen, Weizhu Chen:
FusionNet: Fusing via Fully-aware Attention with Application to Machine Comprehension. ICLR (Poster) 2018 - [i5]Jianmo Ni, Chenguang Zhu, Weizhu Chen, Julian J. McAuley:
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Scientific Question Answering. CoRR abs/1808.09492 (2018) - [i4]Tianze Shi, Kedar Tatwawadi, Kaushik Chakrabarti, Yi Mao, Oleksandr Polozov, Weizhu Chen:
IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles. CoRR abs/1809.05054 (2018) - [i3]Ziyi Yang, Chenguang Zhu, Weizhu Chen:
Zero-training Sentence Embedding via Orthogonal Basis. CoRR abs/1810.00438 (2018) - 2017
- [c27]Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen:
ReasoNet: Learning to Stop Reading in Machine Comprehension. KDD 2017: 1047-1055 - [c26]Ching-Pei Lee, Po-Wei Wang, Weizhu Chen, Chih-Jen Lin:
Limited-memory Common-directions Method for Distributed Optimization and its Application on Empirical Risk Minimization. SDM 2017: 732-740 - [i2]Hsin-Yuan Huang, Chenguang Zhu, Yelong Shen, Weizhu Chen:
FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension. CoRR abs/1711.07341 (2017) - 2016
- [c25]Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen:
ReasoNet: Learning to Stop Reading in Machine Comprehension. CoCo@NIPS 2016 - [i1]Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen:
ReasoNet: Learning to Stop Reading in Machine Comprehension. CoRR abs/1609.05284 (2016) - 2014
- [c24]Yangqiu Song, Haixun Wang, Weizhu Chen, Shusen Wang:
Transfer Understanding from Head Queries to Tail Queries. CIKM 2014: 1299-1308 - [c23]Weizhu Chen, Zhenghao Wang, Jingren Zhou:
Large-scale L-BFGS using MapReduce. NIPS 2014: 1332-1340 - 2012
- [c22]Weizhu Chen, Dong Wang, Yuchen Zhang, Zheng Chen, Adish Singla, Qiang Yang:
A noise-aware click model for web search. WSDM 2012: 313-322 - [c21]Si Shen, Botao Amber Hu, Weizhu Chen, Qiang Yang:
Personalized click model through collaborative filtering. WSDM 2012: 323-332 - [c20]Danqi Chen, Weizhu Chen, Haixun Wang, Zheng Chen, Qiang Yang:
Beyond ten blue links: enabling user click modeling in federated web search. WSDM 2012: 463-472 - 2011
- [c19]Weizhu Chen, Zhanglong Ji, Si Shen, Qiang Yang:
A Whole Page Click Model to Better Interpret Search Engine Click Data. AAAI 2011: 1140-1145 - [c18]Danqi Chen, Weizhu Chen, Qiang Yang:
Characterizing Inverse Time Dependency in Multi-class Learning. ICDM 2011: 1020-1025 - [c17]Yangqiu Song, Haixun Wang, Zhongyuan Wang, Hongsong Li, Weizhu Chen:
Short Text Conceptualization Using a Probabilistic Knowledgebase. IJCAI 2011: 2330-2336 - [c16]Yuchen Zhang, Weizhu Chen, Dong Wang, Qiang Yang:
User-click modeling for understanding and predicting search-behavior. KDD 2011: 1388-1396 - [c15]Dakan Wang, Gang Wang, Xiaofeng Ke, Weizhu Chen:
Action prediction and identification from mining temporal user behaviors. WSDM 2011: 435-444 - [c14]Botao Amber Hu, Yuchen Zhang, Weizhu Chen, Gang Wang, Qiang Yang:
Characterizing search intent diversity into click models. WWW 2011: 17-26 - 2010
- [c13]Yuchen Zhang, Dong Wang, Gang Wang, Weizhu Chen, Zhihua Zhang, Botao Amber Hu, Li Zhang:
Learning click models via probit bayesian inference. CIKM 2010: 439-448 - [c12]Dong Wang, Weizhu Chen, Gang Wang, Yuchen Zhang, Botao Amber Hu:
Explore click models for search ranking. CIKM 2010: 1417-1420 - [c11]Feimin Zhong, Dong Wang, Gang Wang, Weizhu Chen, Yuchen Zhang, Zheng Chen, Haixun Wang:
Incorporating post-click behaviors into a click model. SIGIR 2010: 355-362 - [c10]Zeyuan Allen Zhu, Weizhu Chen, Tom Minka, Chenguang Zhu, Zheng Chen:
A novel click model and its applications to online advertising. WSDM 2010: 321-330 - [c9]Dong Wang, Chenguang Zhu, Weizhu Chen, Gang Wang, Zheng Chen:
Co-optimization of multiple relevance metrics in web search. WWW 2010: 1199-1200
2000 – 2009
- 2009
- [c8]Chenguang Zhu, Weizhu Chen, Zeyuan Allen Zhu, Gang Wang, Dong Wang, Zheng Chen:
A general magnitude-preserving boosting algorithm for search ranking. CIKM 2009: 817-826 - [c7]Zeyuan Allen Zhu, Weizhu Chen, Tao Wan, Chenguang Zhu, Gang Wang, Zheng Chen:
To divide and conquer search ranking by learning query difficulty. CIKM 2009: 1883-1886 - [c6]Zeyuan Allen Zhu, Weizhu Chen, Chenguang Zhu, Gang Wang, Haixun Wang, Zheng Chen:
Inverse Time Dependency in Convex Regularized Learning. ICDM 2009: 667-676 - [c5]Zeyuan Allen Zhu, Weizhu Chen, Gang Wang, Chenguang Zhu, Zheng Chen:
P-packSVM: Parallel Primal grAdient desCent Kernel SVM. ICDM 2009: 677-686 - 2008
- [c4]Rong Hu, Weizhu Chen, Jian Hu, Yansheng Lu, Zheng Chen, Qiang Yang:
Mining Translations of Web Queries from Web Click-through Data. AAAI 2008: 1144-1149 - [c3]Rong Hu, Weizhu Chen, Peng Bai, Yansheng Lu, Zheng Chen, Qiang Yang:
Web query translation via web log mining. SIGIR 2008: 749-750 - 2007
- [c2]Dou Shen, Min Qin, Weizhu Chen, Qiang Yang, Zheng Chen:
Mining Web Query Hierarchies from Clickthrough Data. AAAI 2007: 341-346 - [c1]Weizhu Chen, Jun Yan, Benyu Zhang, Zheng Chen, Qiang Yang:
Document Transformation for Multi-label Feature Selection in Text Categorization. ICDM 2007: 451-456
last updated on 2024-12-22 19:00 CET by the dblp team
all metadata released as open data under CC0 1.0 license