default search action
Nicolas Heess
Person information
- affiliation: University College London, Centre for Computational Statistics and Machine Learning
- affiliation: University of Edinburgh, Institute for Adaptive and Neural Computation
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [j11]Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Jan Humplik, Markus Wulfmeier, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess:
Learning agile soccer skills for a bipedal robot with deep reinforcement learning. Sci. Robotics 9(89) (2024) - [j10]Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Manon Devin, Alex X. Lee, Maria Bauzá Villalonga, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Fernandes Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Zolna, Scott E. Reed, Sergio Gómez Colmenarejo, Jon Scholz, Abbas Abdolmaleki, Oliver Groth, Jean-Baptiste Regli, Oleg Sushkov, Thomas Rothörl, José Enrique Chen, Yusuf Aytar, Dave Barker, Joy Ortiz, Martin A. Riedmiller, Jost Tobias Springenberg, Raia Hadsell, Francesco Nori, Nicolas Heess:
RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation. Trans. Mach. Learn. Res. 2024 (2024) - [c96]Noah Y. Siegel, Oana-Maria Camburu, Nicolas Heess, María Pérez-Ortiz:
The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models. ACL (Short Papers) 2024: 530-546 - [c95]Siqi Liu, Luke Marris, Marc Lanctot, Georgios Piliouras, Joel Z. Leibo, Nicolas Heess:
Neural Population Learning beyond Symmetric Zero-Sum Games. AAMAS 2024: 1247-1255 - [c94]Siqi Liu, Luke Marris, Georgios Piliouras, Ian Gemp, Nicolas Heess:
NfgTransformer: Equivariant Representation Learning for Normal-form Games. ICLR 2024 - [c93]Dhruva Tirumala, Thomas Lampe, José Enrique Chen, Tuomas Haarnoja, Sandy H. Huang, Guy Lever, Ben Moran, Tim Hertweck, Leonard Hasenclever, Martin A. Riedmiller, Nicolas Heess, Markus Wulfmeier:
Replay across Experiments: A Natural Extension of Off-Policy RL. ICLR 2024 - [c92]Jake Bruce, Michael D. Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal M. P. Behbahani, Stephanie C. Y. Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott E. Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel:
Genie: Generative Interactive Environments. ICML 2024 - [c91]Soroush Nasiriany, Fei Xia, Wenhao Yu, Ted Xiao, Jacky Liang, Ishita Dasgupta, Annie Xie, Danny Driess, Ayzaan Wahid, Zhuo Xu, Quan Vuong, Tingnan Zhang, Tsang-Wei Edward Lee, Kuang-Huei Lee, Peng Xu, Sean Kirmani, Yuke Zhu, Andy Zeng, Karol Hausman, Nicolas Heess, Chelsea Finn, Sergey Levine, Brian Ichter:
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs. ICML 2024 - [c90]Jost Tobias Springenberg, Abbas Abdolmaleki, Jingwei Zhang, Oliver Groth, Michael Bloesch, Thomas Lampe, Philemon Brakel, Sarah Bechtle, Steven Kapturowski, Roland Hafner, Nicolas Heess, Martin A. Riedmiller:
Offline Actor-Critic Reinforcement Learning Scales to Large Models. ICML 2024 - [c89]Abby O'Neill, Abdul Rehman, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alexander Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, Anthony Brohan, Antonin Raffin, Archit Sharma, Arefeh Yavary, Arhan Jain, Ashwin Balakrishna, Ayzaan Wahid, Ben Burgess-Limerick, Beomjoon Kim, Bernhard Schölkopf, Blake Wulfe, Brian Ichter, Cewu Lu, Charles Xu, Charlotte Le, Chelsea Finn, Chen Wang, Chenfeng Xu, Cheng Chi, Chenguang Huang, Christine Chan, Christopher Agia, Chuer Pan, Chuyuan Fu, Coline Devin, Danfei Xu, Daniel Morton, Danny Driess, Daphne Chen, Deepak Pathak, Dhruv Shah, Dieter Büchler, Dinesh Jayaraman, Dmitry Kalashnikov, Dorsa Sadigh, Edward Johns, Ethan Paul Foster, Fangchen Liu, Federico Ceola, Fei Xia, Feiyu Zhao, Freek Stulp, Gaoyue Zhou, Gaurav S. Sukhatme, Gautam Salhotra, Ge Yan, Gilbert Feng, Giulio Schiavi, Glen Berseth, Gregory Kahn, Guanzhi Wang, Hao Su, Haoshu Fang, Haochen Shi, Henghui Bao, Heni Ben Amor, Henrik I. Christensen, Hiroki Furuta, Homer Walke, Hongjie Fang, Huy Ha, Igor Mordatch, Ilija Radosavovic, Isabel Leal, Jacky Liang, Jad Abou-Chakra, Jaehyung Kim, Jaimyn Drake, Jan Peters, Jan Schneider, Jasmine Hsu, Jeannette Bohg, Jeffrey Bingham, Jeffrey Wu, Jensen Gao, Jiaheng Hu, Jiajun Wu, Jialin Wu, Jiankai Sun, Jianlan Luo, Jiayuan Gu, Jie Tan, Jihoon Oh, Jimmy Wu, Jingpei Lu, Jingyun Yang, Jitendra Malik, João Silvério, Joey Hejna, Jonathan Booher, Jonathan Tompson, Jonathan Yang, Jordi Salvador, Joseph J. Lim, Junhyek Han, Kaiyuan Wang, Kanishka Rao, Karl Pertsch, Karol Hausman, Keegan Go, Keerthana Gopalakrishnan, Ken Goldberg, Kendra Byrne, Kenneth Oslund, Kento Kawaharazuka, Kevin Black, Kevin Lin, Kevin Zhang, Kiana Ehsani, Kiran Lekkala, Kirsty Ellis, Krishan Rana, Krishnan Srinivasan, Kuan Fang, Kunal Pratap Singh, Kuo-Hao Zeng, Kyle Hatch, Kyle Hsu, Laurent Itti, Lawrence Yunliang Chen, Lerrel Pinto, Li Fei-Fei, Liam Tan, Linxi Jim Fan, Lionel Ott, Lisa Lee, Luca Weihs, Magnum Chen, Marion Lepert, Marius Memmel, Masayoshi Tomizuka, Masha Itkina, Mateo Guaman Castro, Max Spero, Maximilian Du, Michael Ahn, Michael C. Yip, Mingtong Zhang, Mingyu Ding, Minho Heo, Mohan Kumar Srirama, Mohit Sharma, Moo Jin Kim, Naoaki Kanazawa, Nicklas Hansen, Nicolas Heess, Nikhil J. Joshi, Niko Sünderhauf, Ning Liu, Norman Di Palo, Nur Muhammad (Mahi) Shafiullah, Oier Mees, Oliver Kroemer, Osbert Bastani, Pannag R. Sanketi, Patrick Tree Miller, Patrick Yin, Paul Wohlhart, Peng Xu, Peter David Fagan, Peter Mitrano, Pierre Sermanet, Pieter Abbeel, Priya Sundaresan, Qiuyu Chen, Quan Vuong, Rafael Rafailov, Ran Tian, Ria Doshi, Roberto Martín-Martín, Rohan Baijal, Rosario Scalise, Rose Hendrix, Roy Lin, Runjia Qian, Ruohan Zhang, Russell Mendonca, Rutav Shah, Ryan Hoque, Ryan Julian, Samuel Bustamante, Sean Kirmani, Sergey Levine, Shan Lin, Sherry Moore, Shikhar Bahl, Shivin Dass, Shubham D. Sonawani, Shuran Song, Sichun Xu, Siddhant Haldar, Siddharth Karamcheti, Simeon Adebola, Simon Guist, Soroush Nasiriany, Stefan Schaal, Stefan Welker, Stephen Tian, Subramanian Ramamoorthy, Sudeep Dasari, Suneel Belkhale, Sungjae Park, Suraj Nair, Suvir Mirchandani, Takayuki Osa, Tanmay Gupta, Tatsuya Harada, Tatsuya Matsushima, Ted Xiao, Thomas Kollar, Tianhe Yu, Tianli Ding, Todor Davchev, Tony Z. Zhao, Travis Armstrong, Trevor Darrell, Trinity Chung, Vidhi Jain, Vincent Vanhoucke, Wei Zhan, Wenxuan Zhou, Wolfram Burgard, Xi Chen, Xiaolong Wang, Xinghao Zhu, Xinyang Geng, Xiyuan Liu, Liangwei Xu, Xuanlin Li, Yao Lu, Yecheng Jason Ma, Yejin Kim, Yevgen Chebotar, Yifan Zhou, Yifeng Zhu, Yilin Wu, Ying Xu, Yixuan Wang, Yonatan Bisk, Yoonyoung Cho, Youngwoon Lee, Yuchen Cui, Yue Cao, Yueh-Hua Wu, Yujin Tang, Yuke Zhu, Yunchu Zhang, Yunfan Jiang, Yunshuang Li, Yunzhu Li, Yusuke Iwasawa, Yutaka Matsuo, Zehan Ma, Zhuo Xu, Zichen Jeff Cui, Zichen Zhang, Zipeng Lin:
Open X-Embodiment: Robotic Learning Datasets and RT-X Models : Open X-Embodiment Collaboration. ICRA 2024: 6892-6903 - [c88]Thomas Lampe, Abbas Abdolmaleki, Sarah Bechtle, Sandy H. Huang, Jost Tobias Springenberg, Michael Bloesch, Oliver Groth, Roland Hafner, Tim Hertweck, Michael Neunert, Markus Wulfmeier, Jingwei Zhang, Francesco Nori, Nicolas Heess, Martin A. Riedmiller:
Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots. ICRA 2024: 7772-7779 - [i131]Siqi Liu, Luke Marris, Marc Lanctot, Georgios Piliouras, Joel Z. Leibo, Nicolas Heess:
Neural Population Learning beyond Symmetric Zero-sum Games. CoRR abs/2401.05133 (2024) - [i130]Jost Tobias Springenberg, Abbas Abdolmaleki, Jingwei Zhang, Oliver Groth, Michael Bloesch, Thomas Lampe, Philemon Brakel, Sarah Bechtle, Steven Kapturowski, Roland Hafner, Nicolas Heess, Martin A. Riedmiller:
Offline Actor-Critic Reinforcement Learning Scales to Large Models. CoRR abs/2402.05546 (2024) - [i129]Soroush Nasiriany, Fei Xia, Wenhao Yu, Ted Xiao, Jacky Liang, Ishita Dasgupta, Annie Xie, Danny Driess, Ayzaan Wahid, Zhuo Xu, Quan Vuong, Tingnan Zhang, Tsang-Wei Edward Lee, Kuang-Huei Lee, Peng Xu, Sean Kirmani, Yuke Zhu, Andy Zeng, Karol Hausman, Nicolas Heess, Chelsea Finn, Sergey Levine, Brian Ichter:
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs. CoRR abs/2402.07872 (2024) - [i128]Siqi Liu, Luke Marris, Georgios Piliouras, Ian Gemp, Nicolas Heess:
NfgTransformer: Equivariant Representation Learning for Normal-form Games. CoRR abs/2402.08393 (2024) - [i127]Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauzá, Matthew Bennice, Alex Bewley, Adil Dostmohamed, Chuyuan Kelly Fu, Nimrod Gileadi, Marissa Giustina, Keerthana Gopalakrishnan, Leonard Hasenclever, Jan Humplik, Jasmine Hsu, Nikhil J. Joshi, Ben Jyenis, J. Chase Kew, Sean Kirmani, Tsang-Wei Edward Lee, Kuang-Huei Lee, Assaf Hurwitz Michaely, Joss Moore, Ken Oslund, Dushyant Rao, Allen Z. Ren, Baruch Tabanpour, Quan Vuong, Ayzaan Wahid, Ted Xiao, Ying Xu, Vincent Zhuang, Peng Xu, Erik Frey, Ken Caluwaerts, Tingnan Zhang, Brian Ichter, Jonathan Tompson, Leila Takayama, Vincent Vanhoucke, Izhak Shafran, Maja J. Mataric, Dorsa Sadigh, Nicolas Heess, Kanishka Rao, Nik Stewart, Jie Tan, Carolina Parada:
Learning to Learn Faster from Human Feedback with Language Model Predictive Control. CoRR abs/2402.11450 (2024) - [i126]Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal M. P. Behbahani, Stephanie Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott E. Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel:
Genie: Generative Interactive Environments. CoRR abs/2402.15391 (2024) - [i125]Noah Y. Siegel, Oana-Maria Camburu, Nicolas Heess, María Pérez-Ortiz:
The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models. CoRR abs/2404.03189 (2024) - [i124]Dhruva Tirumala, Markus Wulfmeier, Ben Moran, Sandy H. Huang, Jan Humplik, Guy Lever, Tuomas Haarnoja, Leonard Hasenclever, Arunkumar Byravan, Nathan Batchelor, Neil Sreendra, Kushal Patel, Marlon Gwira, Francesco Nori, Martin A. Riedmiller, Nicolas Heess:
Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning. CoRR abs/2405.02425 (2024) - [i123]Yusheng Jiao, Feng Ling, Sina Heydari, Nicolas Heess, Josh Merel, Eva Kanso:
Deep Dive into Model-free Reinforcement Learning for Biological and Robotic Systems: Theory and Practice. CoRR abs/2405.11457 (2024) - [i122]Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Ávila Pires, Yunhao Tang, Clare Lyle, Mark Rowland, Nicolas Heess, Diana Borsa, Arthur Guez, Will Dabney:
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning. CoRR abs/2406.02035 (2024) - [i121]Maria Bauzá, José Enrique Chen, Valentin Dalibard, Nimrod Gileadi, Roland Hafner, Murilo F. Martins, Joss Moore, Rugile Pevceviciute, Antoine Laurens, Dushyant Rao, Martina Zambelli, Martin A. Riedmiller, Jon Scholz, Konstantinos Bousmalis, Francesco Nori, Nicolas Heess:
DemoStart: Demonstration-led auto-curriculum applied to sim-to-real with multi-fingered robots. CoRR abs/2409.06613 (2024) - [i120]Abbas Abdolmaleki, Bilal Piot, Bobak Shahriari, Jost Tobias Springenberg, Tim Hertweck, Rishabh Joshi, Junhyuk Oh, Michael Bloesch, Thomas Lampe, Nicolas Heess, Jonas Buchli, Martin A. Riedmiller:
Preference Optimization as Probabilistic Inference. CoRR abs/2410.04166 (2024) - 2023
- [j9]Giulia Vezzani, Dhruva Tirumala, Markus Wulfmeier, Dushyant Rao, Abbas Abdolmaleki, Ben Moran, Tuomas Haarnoja, Jan Humplik, Roland Hafner, Michael Neunert, Claudio Fantacci, Tim Hertweck, Thomas Lampe, Fereshteh Sadeghi, Nicolas Heess, Martin A. Riedmiller:
SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration. Trans. Mach. Learn. Res. 2023 (2023) - [c87]Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess, Alex Lamb:
Representation Learning in Deep RL via Discrete Information Bottleneck. AISTATS 2023: 8699-8722 - [c86]Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montserrat Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik, Brian Ichter, Ted Xiao, Peng Xu, Andy Zeng, Tingnan Zhang, Nicolas Heess, Dorsa Sadigh, Jie Tan, Yuval Tassa, Fei Xia:
Language to Rewards for Robotic Skill Synthesis. CoRL 2023: 374-404 - [c85]Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Curtis Mozer, Nicolas Heess, Yoshua Bengio:
Stateful Active Facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning. ICLR 2023 - [c84]Mohit Sharma, Claudio Fantacci, Yuxiang Zhou, Skanda Koppula, Nicolas Heess, Jon Scholz, Yusuf Aytar:
Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation. ICLR 2023 - [c83]Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, Tuomas Haarnoja, Ben Moran, Steven Bohez, Fereshteh Sadeghi, Bojan Vujatovic, Nicolas Heess:
NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields. ICRA 2023: 9362-9369 - [c82]Joe Watson, Sandy H. Huang, Nicolas Heess:
Coherent Soft Imitation Learning. NeurIPS 2023 - [i119]Jingwei Zhang, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Abbas Abdolmaleki, Dushyant Rao, Nicolas Heess, Martin A. Riedmiller:
Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains. CoRR abs/2302.12617 (2023) - [i118]Mohit Sharma, Claudio Fantacci, Yuxiang Zhou, Skanda Koppula, Nicolas Heess, Jon Scholz, Yusuf Aytar:
Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation. CoRR abs/2304.06600 (2023) - [i117]Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Markus Wulfmeier, Jan Humplik, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess:
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning. CoRR abs/2304.13653 (2023) - [i116]Ingmar Schubert, Jingwei Zhang, Jake Bruce, Sarah Bechtle, Emilio Parisotto, Martin A. Riedmiller, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Nicolas Heess:
A Generalist Dynamics Model for Control. CoRR abs/2305.10912 (2023) - [i115]Ken Caluwaerts, Atil Iscen, J. Chase Kew, Wenhao Yu, Tingnan Zhang, Daniel Freeman, Kuang-Huei Lee, Lisa Lee, Stefano Saliceti, Vincent Zhuang, Nathan Batchelor, Steven Bohez, Federico Casarini, José Enrique Chen, Omar Cortes, Erwin Coumans, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela, Erik Frey, Roland Hafner, Deepali Jain, Bauyrjan Jyenis, Yuheng Kuang, Tsang-Wei Edward Lee, Linda Luu, Ofir Nachum, Ken Oslund, Jason Powell, Diego Reyes, Francesco Romano, Fereshteh Sadeghi, Ron Sloat, Baruch Tabanpour, Daniel Zheng, Michael Neunert, Raia Hadsell, Nicolas Heess, Francesco Nori, Jeff Seto, Carolina Parada, Vikas Sindhwani, Vincent Vanhoucke, Jie Tan:
Barkour: Benchmarking Animal-level Agility with Quadruped Robots. CoRR abs/2305.14654 (2023) - [i114]Joe Watson, Sandy H. Huang, Nicolas Heess:
Coherent Soft Imitation Learning. CoRR abs/2305.16498 (2023) - [i113]Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik, Brian Ichter, Ted Xiao, Peng Xu, Andy Zeng, Tingnan Zhang, Nicolas Heess, Dorsa Sadigh, Jie Tan, Yuval Tassa, Fei Xia:
Language to Rewards for Robotic Skill Synthesis. CoRR abs/2306.08647 (2023) - [i112]Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauzá, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo F. Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Zolna, Scott E. Reed, Sergio Gómez Colmenarejo, Jon Scholz, Abbas Abdolmaleki, Oliver Groth, Jean-Baptiste Regli, Oleg Sushkov, Thomas Rothörl, José Enrique Chen, Yusuf Aytar, Dave Barker, Joy Ortiz, Martin A. Riedmiller, Jost Tobias Springenberg, Raia Hadsell, Francesco Nori, Nicolas Heess:
RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation. CoRR abs/2306.11706 (2023) - [i111]Norman Di Palo, Arunkumar Byravan, Leonard Hasenclever, Markus Wulfmeier, Nicolas Heess, Martin A. Riedmiller:
Towards A Unified Agent with Foundation Models. CoRR abs/2307.09668 (2023) - [i110]Shruti Mishra, Ankit Anand, Jordan Hoffmann, Nicolas Heess, Martin A. Riedmiller, Abbas Abdolmaleki, Doina Precup:
Policy composition in reinforcement learning via multi-objective policy optimization. CoRR abs/2308.15470 (2023) - [i109]Zhe Wang, Petar Velickovic, Daniel Hennes, Nenad Tomasev, Laurel Prince, Michael Kaisers, Yoram Bachrach, Romuald Elie, Li Kevin Wenliang, Federico Piccinini, William Spearman, Ian Graham, Jerome T. Connor, Yi Yang, Adrià Recasens, Mina Khan, Nathalie Beauguerlange, Pablo Sprechmann, Pol Moreno, Nicolas Heess, Michael Bowling, Demis Hassabis, Karl Tuyls:
TacticAI: an AI assistant for football tactics. CoRR abs/2310.10553 (2023) - [i108]Dhruva Tirumala, Thomas Lampe, José Enrique Chen, Tuomas Haarnoja, Sandy H. Huang, Guy Lever, Ben Moran, Tim Hertweck, Leonard Hasenclever, Martin A. Riedmiller, Nicolas Heess, Markus Wulfmeier:
Replay across Experiments: A Natural Extension of Off-Policy RL. CoRR abs/2311.15951 (2023) - [i107]Markus Wulfmeier, Arunkumar Byravan, Sarah Bechtle, Karol Hausman, Nicolas Heess:
Foundations for Transfer in Reinforcement Learning: A Taxonomy of Knowledge Modalities. CoRR abs/2312.01939 (2023) - [i106]Thomas Lampe, Abbas Abdolmaleki, Sarah Bechtle, Sandy H. Huang, Jost Tobias Springenberg, Michael Bloesch, Oliver Groth, Roland Hafner, Tim Hertweck, Michael Neunert, Markus Wulfmeier, Jingwei Zhang, Francesco Nori, Nicolas Heess, Martin A. Riedmiller:
Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots. CoRR abs/2312.11374 (2023) - 2022
- [j8]Dhruva Tirumala, Alexandre Galashov, Hyeonwoo Noh, Leonard Hasenclever, Razvan Pascanu, Jonathan Schwarz, Guillaume Desjardins, Wojciech Marian Czarnecki, Arun Ahuja, Yee Whye Teh, Nicolas Heess:
Behavior Priors for Efficient Reinforcement Learning. J. Mach. Learn. Res. 23: 221:1-221:68 (2022) - [j7]Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess:
From motor control to team play in simulated humanoid football. Sci. Robotics 7(69) (2022) - [j6]Scott E. Reed, Konrad Zolna, Emilio Parisotto, Sergio Gómez Colmenarejo, Alexander Novikov, Gabriel Barth-Maron, Mai Gimenez, Yury Sulsky, Jackie Kay, Jost Tobias Springenberg, Tom Eccles, Jake Bruce, Ali Razavi, Ashley Edwards, Nicolas Heess, Yutian Chen, Raia Hadsell, Oriol Vinyals, Mahyar Bordbar, Nando de Freitas:
A Generalist Agent. Trans. Mach. Learn. Res. 2022 (2022) - [c81]Wenxuan Zhou, Steven Bohez, Jan Humplik, Nicolas Heess, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja:
Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data. CoLLAs 2022: 294-309 - [c80]Sasha Salter, Markus Wulfmeier, Dhruva Tirumala, Nicolas Heess, Martin A. Riedmiller, Raia Hadsell, Dushyant Rao:
MO2: Model-Based Offline Options. CoLLAs 2022: 902-919 - [c79]Jongmin Lee, Cosmin Paduraru, Daniel J. Mankowitz, Nicolas Heess, Doina Precup, Kee-Eung Kim, Arthur Guez:
COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation. ICLR 2022 - [c78]Arunkumar Byravan, Leonard Hasenclever, Piotr Trochim, Mehdi Mirza, Alessandro Davide Ialongo, Yuval Tassa, Jost Tobias Springenberg, Abbas Abdolmaleki, Nicolas Heess, Josh Merel, Martin A. Riedmiller:
Evaluating Model-Based Planning and Planner Amortization for Continuous Control. ICLR 2022 - [c77]Siqi Liu, Luke Marris, Daniel Hennes, Josh Merel, Nicolas Heess, Thore Graepel:
NeuPL: Neural Population Learning. ICLR 2022 - [c76]Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell:
Learning transferable motor skills with hierarchical latent mixture policies. ICLR 2022 - [c75]Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adrià Puigdomènech Badia, Arthur Guez, Mehdi Mirza, Peter Conway Humphreys, Ksenia Konyushkova, Michal Valko, Simon Osindero, Timothy P. Lillicrap, Nicolas Heess, Charles Blundell:
Retrieval-Augmented Reinforcement Learning. ICML 2022: 7740-7765 - [c74]Siqi Liu, Marc Lanctot, Luke Marris, Nicolas Heess:
Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games. ICML 2022: 13793-13806 - [c73]Tony Z. Zhao, Jianlan Luo, Oleg Sushkov, Rugile Pevceviciute, Nicolas Heess, Jon Scholz, Stefan Schaal, Sergey Levine:
Offline Meta-Reinforcement Learning for Industrial Insertion. ICRA 2022: 6386-6393 - [c72]Philémon Brakel, Steven Bohez, Leonard Hasenclever, Nicolas Heess, Konstantinos Bousmalis:
Learning Coordinated Terrain-Adaptive Locomotion by Imitating a Centroidal Dynamics Planner. IROS 2022: 10335-10342 - [c71]Alexandre Galashov, Joshua Scott Merel, Nicolas Heess:
Data augmentation for efficient learning from parametric experts. NeurIPS 2022 - [d1]Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess:
Figure Data for the paper "From Motor Control to Team Play in Simulated Humanoid Football". Zenodo, 2022 - [i105]Siqi Liu, Luke Marris, Daniel Hennes, Josh Merel, Nicolas Heess, Thore Graepel:
NeuPL: Neural Population Learning. CoRR abs/2202.07415 (2022) - [i104]Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adrià Puigdomènech Badia, Arthur Guez, Mehdi Mirza, Ksenia Konyushkova, Michal Valko, Simon Osindero, Timothy P. Lillicrap, Nicolas Heess, Charles Blundell:
Retrieval-Augmented Reinforcement Learning. CoRR abs/2202.08417 (2022) - [i103]Steven Bohez, Saran Tunyasuvunakool, Philemon Brakel, Fereshteh Sadeghi, Leonard Hasenclever, Yuval Tassa, Emilio Parisotto, Jan Humplik, Tuomas Haarnoja, Roland Hafner, Markus Wulfmeier, Michael Neunert, Ben Moran, Noah Y. Siegel, Andrea Huber, Francesco Romano, Nathan Batchelor, Federico Casarini, Josh Merel, Raia Hadsell, Nicolas Heess:
Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors. CoRR abs/2203.17138 (2022) - [i102]Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess:
Offline Distillation for Robot Lifelong Learning with Imbalanced Experience. CoRR abs/2204.05893 (2022) - [i101]Jongmin Lee, Cosmin Paduraru, Daniel J. Mankowitz, Nicolas Heess, Doina Precup, Kee-Eung Kim, Arthur Guez:
COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation. CoRR abs/2204.08957 (2022) - [i100]Bobak Shahriari, Abbas Abdolmaleki, Arunkumar Byravan, Abe Friesen, Siqi Liu, Jost Tobias Springenberg, Nicolas Heess, Matt Hoffman, Martin A. Riedmiller:
Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach. CoRR abs/2204.10256 (2022) - [i99]Scott E. Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Alexander Novikov, Gabriel Barth-Maron, Mai Gimenez, Yury Sulsky, Jackie Kay, Jost Tobias Springenberg, Tom Eccles, Jake Bruce, Ali Razavi, Ashley Edwards, Nicolas Heess, Yutian Chen, Raia Hadsell, Oriol Vinyals, Mahyar Bordbar, Nando de Freitas:
A Generalist Agent. CoRR abs/2205.06175 (2022) - [i98]Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio:
Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel. CoRR abs/2205.10607 (2022) - [i97]Alexandre Galashov, Josh Merel, Nicolas Heess:
Data augmentation for efficient learning from parametric experts. CoRR abs/2205.11448 (2022) - [i96]Siqi Liu, Marc Lanctot, Luke Marris, Nicolas Heess:
Simplex Neural Population Learning: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games. CoRR abs/2205.15879 (2022) - [i95]Sasha Salter, Markus Wulfmeier, Dhruva Tirumala, Nicolas Heess, Martin A. Riedmiller, Raia Hadsell, Dushyant Rao:
MO2: Model-Based Offline Options. CoRR abs/2209.01947 (2022) - [i94]Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio:
Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning. CoRR abs/2210.03022 (2022) - [i93]Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, Tuomas Haarnoja, Ben Moran, Steven Bohez, Fereshteh Sadeghi, Bojan Vujatovic, Nicolas Heess:
NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields. CoRR abs/2210.04932 (2022) - [i92]Giulia Vezzani, Dhruva Tirumala, Markus Wulfmeier, Dushyant Rao, Abbas Abdolmaleki, Ben Moran, Tuomas Haarnoja, Jan Humplik, Roland Hafner, Michael Neunert, Claudio Fantacci, Tim Hertweck, Thomas Lampe, Fereshteh Sadeghi, Nicolas Heess, Martin A. Riedmiller:
SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration. CoRR abs/2211.13743 (2022) - [i91]Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess, Alex Lamb:
Representation Learning in Deep RL via Discrete Information Bottleneck. CoRR abs/2212.13835 (2022) - 2021
- [j5]Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis:
Game Plan: What AI can do for Football, and What Football can do for AI. J. Artif. Intell. Res. 71: 41-88 (2021) - [c70]Sandy H. Huang, Abbas Abdolmaleki, Giulia Vezzani, Philemon Brakel, Daniel J. Mankowitz, Michael Neunert, Steven Bohez, Yuval Tassa, Nicolas Heess, Martin A. Riedmiller, Raia Hadsell:
A Constrained Multi-Objective Reinforcement Learning Framework. CoRL 2021: 883-893 - [c69]Michael Bloesch, Jan Humplik, Viorica Patraucean, Roland Hafner, Tuomas Haarnoja, Arunkumar Byravan, Noah Yamamoto Siegel, Saran Tunyasuvunakool, Federico Casarini, Nathan Batchelor, Francesco Romano, Stefano Saliceti, Martin A. Riedmiller, S. M. Ali Eslami, Nicolas Heess:
Towards Real Robot Learning in the Wild: A Case Study in Bipedal Locomotion. CoRL 2021: 1502-1511 - [c68]Martin A. Riedmiller, Jost Tobias Springenberg, Roland Hafner, Nicolas Heess:
Collect & Infer - a fresh look at data-efficient Reinforcement Learning. CoRL 2021: 1736-1744 - [c67]Thomas Mesnard, Theophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Thomas S. Stepleton, Nicolas Heess, Arthur Guez, Eric Moulines, Marcus Hutter, Lars Buesing, Rémi Munos:
Counterfactual Credit Assignment in Model-Free Reinforcement Learning. ICML 2021: 7654-7664 - [c66]Markus Wulfmeier, Dushyant Rao, Roland Hafner, Thomas Lampe, Abbas Abdolmaleki, Tim Hertweck, Michael Neunert, Dhruva Tirumala, Noah Y. Siegel, Nicolas Heess, Martin A. Riedmiller:
Data-efficient Hindsight Off-policy Option Learning. ICML 2021: 11340-11350 - [c65]Steven Hansen, Guillaume Desjardins, Kate Baumli, David Warde-Farley, Nicolas Heess, Simon Osindero, Volodymyr Mnih:
Entropic Desired Dynamics for Intrinsic Control. NeurIPS 2021: 11436-11448 - [c64]Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Charles Blundell, Philippe Beaudoin, Nicolas Heess, Michael Mozer, Yoshua Bengio:
Neural Production Systems. NeurIPS 2021: 25673-25687 - [i90]Anirudh Goyal, Aniket Didolkar, Nan Rosemary Ke, Charles Blundell, Philippe Beaudoin, Nicolas Heess, Michael Mozer, Yoshua Bengio:
Neural Production Systems. CoRR abs/2103.01937 (2021) - [i89]Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess:
From Motor Control to Team Play in Simulated Humanoid Football. CoRR abs/2105.12196 (2021) - [i88]Abbas Abdolmaleki, Sandy H. Huang, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, András György, Csaba Szepesvári, Raia Hadsell, Nicolas Heess, Martin A. Riedmiller:
On Multi-objective Policy Optimization as a Tool for Reinforcement Learning. CoRR abs/2106.08199 (2021) - [i87]Martin A. Riedmiller, Jost Tobias Springenberg, Roland Hafner, Nicolas Heess:
Collect & Infer - a fresh look at data-efficient Reinforcement Learning. CoRR abs/2108.10273 (2021) - [i86]Oliver Groth, Markus Wulfmeier, Giulia Vezzani, Vibhavari Dasagi, Tim Hertweck, Roland Hafner, Nicolas Heess, Martin A. Riedmiller:
Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration. CoRR abs/2109.08603 (2021) - [i85]Michael Lutter, Leonard Hasenclever, Arunkumar Byravan, Gabriel Dulac-Arnold, Piotr Trochim, Nicolas Heess, Josh Merel, Yuval Tassa:
Learning Dynamics Models for Model Predictive Agents. CoRR abs/2109.14311 (2021) - [i84]Arunkumar Byravan, Leonard Hasenclever, Piotr Trochim, Mehdi Mirza, Alessandro Davide Ialongo, Yuval Tassa, Jost Tobias Springenberg, Abbas Abdolmaleki, Nicolas Heess, Josh Merel, Martin A. Riedmiller:
Evaluating model-based planning and planner amortization for continuous control. CoRR abs/2110.03363 (2021) - [i83]Tony Z. Zhao, Jianlan Luo, Oleg Sushkov, Rugile Pevceviciute, Nicolas Heess, Jonathan Scholz, Stefan Schaal, Sergey Levine:
Offline Meta-Reinforcement Learning for Industrial Insertion. CoRR abs/2110.04276 (2021) - [i82]Philemon Brakel, Steven Bohez, Leonard Hasenclever, Nicolas Heess, Konstantinos Bousmalis:
Learning Coordinated Terrain-Adaptive Locomotion by Imitating a Centroidal Dynamics Planner. CoRR abs/2111.00262 (2021) - [i81]Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell:
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies. CoRR abs/2112.05062 (2021) - 2020
- [j4]Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Siqi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy P. Lillicrap, Nicolas Heess, Yuval Tassa:
dm_control: Software and tasks for continuous control. Softw. Impacts 6: 100022 (2020) - [j3]Josh Merel, Saran Tunyasuvunakool, Arun Ahuja, Yuval Tassa, Leonard Hasenclever, Vu Pham, Tom Erez, Greg Wayne, Nicolas Heess:
Catch & Carry: reusable neural controllers for vision-guided whole-body tasks. ACM Trans. Graph. 39(4): 39 (2020) - [c63]Lars Buesing, Nicolas Heess, Theophane Weber:
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions. AISTATS 2020: 624-634 - [c62]Rae Jeong, Jost Tobias Springenberg, Jackie Kay, Daniel Zheng, Alexandre Galashov, Nicolas Heess, Francesco Nori:
Learning Dexterous Manipulation from Suboptimal Experts. CoRL 2020: 915-934 - [c61]Roland Hafner, Tim Hertweck, Philipp Klöppner, Michael Bloesch, Michael Neunert, Markus Wulfmeier, Saran Tunyasuvunakool, Nicolas Heess, Martin A. Riedmiller:
Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion. CoRL 2020: 1084-1099 - [c60]Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos:
A Generalized Training Approach for Multiagent Learning. ICLR 2020 - [c59]Noah Y. Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin A. Riedmiller:
Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning. ICLR 2020 - [c58]H. Francis Song, Abbas Abdolmaleki, Jost Tobias Springenberg, Aidan Clark, Hubert Soyer, Jack W. Rae, Seb Noury, Arun Ahuja, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Dan Belov, Martin A. Riedmiller, Matthew M. Botvinick:
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control. ICLR 2020 - [c57]Abbas Abdolmaleki, Sandy H. Huang, Leonard Hasenclever, Michael Neunert, H. Francis Song, Martina Zambelli, Murilo F. Martins, Nicolas Heess, Raia Hadsell, Martin A. Riedmiller:
A distributional view on multi-objective policy optimization. ICML 2020: 11-22 - [c56]Leonard Hasenclever, Fabio Pardo, Raia Hadsell, Nicolas Heess, Josh Merel:
CoMic: Complementary Task Learning & Mimicry for Reusable Skills. ICML 2020: 4105-4115 - [c55]Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Çaglar Gülçehre, Siddhant M. Jayakumar, Max Jaderberg, Raphaël Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell:
Stabilizing Transformers for Reinforcement Learning. ICML 2020: 7487-7498 - [c54]Ziyu Wang, Alexander Novikov, Konrad Zolna, Josh Merel, Jost Tobias Springenberg, Scott E. Reed, Bobak Shahriari, Noah Y. Siegel, Çaglar Gülçehre, Nicolas Heess, Nando de Freitas:
Critic Regularized Regression. NeurIPS 2020 - [c53]Arthur Guez, Fabio Viola, Theophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess:
Value-driven Hindsight Modelling. NeurIPS 2020 - [c52]Çaglar Gülçehre, Ziyu Wang, Alexander Novikov, Thomas Paine, Sergio Gómez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matthew Hoffman, Nicolas Heess, Nando de Freitas:
RL Unplugged: A Collection of Benchmarks for Offline Reinforcement Learning. NeurIPS 2020 - [c51]Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow:
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces. NeurIPS 2020 - [c50]Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Noah Y. Siegel, Tim Hertweck, Thomas Lampe, Nicolas Heess, Martin A. Riedmiller:
Compositional Transfer in Hierarchical Reinforcement Learning. Robotics: Science and Systems 2020 - [i80]Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe, Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli, Nicolas Heess, Martin A. Riedmiller:
Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics. CoRR abs/2001.00449 (2020) - [i79]Arthur Guez, Fabio Viola, Théophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess:
Value-driven Hindsight Modelling. CoRR abs/2002.08329 (2020) - [i78]Noah Y. Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin A. Riedmiller:
Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning. CoRR abs/2002.08396 (2020) - [i77]Giambattista Parascandolo, Lars Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B. Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber:
Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning. CoRR abs/2004.11410 (2020) - [i76]Abbas Abdolmaleki, Sandy H. Huang, Leonard Hasenclever, Michael Neunert, H. Francis Song, Martina Zambelli, Murilo F. Martins, Nicolas Heess, Raia Hadsell, Martin A. Riedmiller:
A Distributional View on Multi-Objective Policy Optimization. CoRR abs/2005.07513 (2020) - [i75]Tim Hertweck, Martin A. Riedmiller, Michael Bloesch, Jost Tobias Springenberg, Noah Y. Siegel, Markus Wulfmeier, Roland Hafner, Nicolas Heess:
Simple Sensor Intentions for Exploration. CoRR abs/2005.07541 (2020) - [i74]Yuval Tassa, Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Siqi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy P. Lillicrap, Nicolas Heess:
dm_control: Software and Tasks for Continuous Control. CoRR abs/2006.12983 (2020) - [i73]Çaglar Gülçehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gómez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas:
RL Unplugged: Benchmarks for Offline Reinforcement Learning. CoRR abs/2006.13888 (2020) - [i72]Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott E. Reed, Bobak Shahriari, Noah Y. Siegel, Josh Merel, Çaglar Gülçehre, Nicolas Heess, Nando de Freitas:
Critic Regularized Regression. CoRR abs/2006.15134 (2020) - [i71]Markus Wulfmeier, Dushyant Rao, Roland Hafner, Thomas Lampe, Abbas Abdolmaleki, Tim Hertweck, Michael Neunert, Dhruva Tirumala, Noah Y. Siegel, Nicolas Heess, Martin A. Riedmiller:
Data-efficient Hindsight Off-policy Option Learning. CoRR abs/2007.15588 (2020) - [i70]Roland Hafner, Tim Hertweck, Philipp Klöppner, Michael Bloesch, Michael Neunert, Markus Wulfmeier, Saran Tunyasuvunakool, Nicolas Heess, Martin A. Riedmiller:
Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion. CoRR abs/2008.12228 (2020) - [i69]Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl J. Friston, Nicolas Heess:
Action and Perception as Divergence Minimization. CoRR abs/2009.01791 (2020) - [i68]Alexandre Galashov, Jakub Sygnowski, Guillaume Desjardins, Jan Humplik, Leonard Hasenclever, Rae Jeong, Yee Whye Teh, Nicolas Heess:
Importance Weighted Policy Learning and Adaption. CoRR abs/2009.04875 (2020) - [i67]Mehdi Mirza, Andrew Jaegle, Jonathan J. Hunt, Arthur Guez, Saran Tunyasuvunakool, Alistair Muldal, Théophane Weber, Péter Karkus, Sébastien Racanière, Lars Buesing, Timothy P. Lillicrap, Nicolas Heess:
Physically Embedded Planning Problems: New Challenges for Reinforcement Learning. CoRR abs/2009.05524 (2020) - [i66]Yusheng Jiao, Feng Ling, Sina Heydari, Nicolas Heess, Josh Merel, Eva Kanso:
Learning to swim in potential flow. CoRR abs/2009.14280 (2020) - [i65]Péter Karkus, Mehdi Mirza, Arthur Guez, Andrew Jaegle, Timothy P. Lillicrap, Lars Buesing, Nicolas Heess, Theophane Weber:
Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban. CoRR abs/2010.01298 (2020) - [i64]Sebastian Flennerhag, Jane X. Wang, Pablo Sprechmann, Francesco Visin, Alexandre Galashov, Steven Kapturowski, Diana L. Borsa, Nicolas Heess, André Barreto, Razvan Pascanu:
Temporal Difference Uncertainties as a Signal for Exploration. CoRR abs/2010.02255 (2020) - [i63]Jost Tobias Springenberg, Nicolas Heess, Daniel J. Mankowitz, Josh Merel, Arunkumar Byravan, Abbas Abdolmaleki, Jackie Kay, Jonas Degrave, Julian Schrittwieser, Yuval Tassa, Jonas Buchli, Dan Belov, Martin A. Riedmiller:
Local Search for Policy Iteration in Continuous Control. CoRR abs/2010.05545 (2020) - [i62]Rae Jeong, Jost Tobias Springenberg, Jackie Kay, Daniel Zheng, Yuxiang Zhou, Alexandre Galashov, Nicolas Heess, Francesco Nori:
Learning Dexterous Manipulation from Suboptimal Experts. CoRR abs/2010.08587 (2020) - [i61]Daniel J. Mankowitz, Dan A. Calian, Rae Jeong, Cosmin Paduraru, Nicolas Heess, Sumanth Dathathri, Martin A. Riedmiller, Timothy A. Mann:
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification. CoRR abs/2010.10644 (2020) - [i60]Dhruva Tirumala, Alexandre Galashov, Hyeonwoo Noh, Leonard Hasenclever, Razvan Pascanu, Jonathan Schwarz, Guillaume Desjardins, Wojciech Marian Czarnecki, Arun Ahuja, Yee Whye Teh, Nicolas Heess:
Behavior Priors for Efficient Reinforcement Learning. CoRR abs/2010.14274 (2020) - [i59]Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis:
Game Plan: What AI can do for Football, and What Football can do for AI. CoRR abs/2011.09192 (2020) - [i58]Thomas Mesnard, Théophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Arthur Guez, Marcus Hutter, Lars Buesing, Rémi Munos:
Counterfactual Credit Assignment in Model-Free Reinforcement Learning. CoRR abs/2011.09464 (2020)
2010 – 2019
- 2019
- [c49]Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Rémi Munos, Doina Precup:
The Termination Critic. AISTATS 2019: 2231-2240 - [c48]Théophane Weber, Nicolas Heess, Lars Buesing, David Silver:
Credit Assignment Techniques in Stochastic Computation Graphs. AISTATS 2019: 2650-2660 - [c47]Diana Borsa, Nicolas Heess, Bilal Piot, Siqi Liu, Leonard Hasenclever, Rémi Munos, Olivier Pietquin:
Observational Learning by Reinforcement Learning. AAMAS 2019: 1117-1124 - [c46]Dylan Banarse, Yoram Bachrach, Siqi Liu, Guy Lever, Nicolas Heess, Chrisantha Fernando, Pushmeet Kohli, Thore Graepel:
The Body is Not a Given: Joint Agent Policy Learning and Morphology Evolution. AAMAS 2019: 1134-1142 - [c45]Arunkumar Byravan, Jost Tobias Springenberg, Abbas Abdolmaleki, Roland Hafner, Michael Neunert, Thomas Lampe, Noah Y. Siegel, Nicolas Heess, Martin A. Riedmiller:
Imagined Value Gradients: Model-Based Policy Optimization with Tranferable Latent Dynamics Models. CoRL 2019: 566-589 - [c44]Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe, Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli, Nicolas Heess, Martin A. Riedmiller:
Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics. CoRL 2019: 735-751 - [c43]Lars Buesing, Theophane Weber, Yori Zwols, Nicolas Heess, Sébastien Racanière, Arthur Guez, Jean-Baptiste Lespiau:
Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search. ICLR (Poster) 2019 - [c42]Alexandre Galashov, Siddhant M. Jayakumar, Leonard Hasenclever, Dhruva Tirumala, Jonathan Schwarz, Guillaume Desjardins, Wojciech M. Czarnecki, Yee Whye Teh, Razvan Pascanu, Nicolas Heess:
Information asymmetry in KL-regularized RL. ICLR (Poster) 2019 - [c41]Siqi Liu, Guy Lever, Josh Merel, Saran Tunyasuvunakool, Nicolas Heess, Thore Graepel:
Emergent Coordination Through Competition. ICLR (Poster) 2019 - [c40]Josh Merel, Arun Ahuja, Vu Pham, Saran Tunyasuvunakool, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Greg Wayne:
Hierarchical Visuomotor Control of Humanoids. ICLR (Poster) 2019 - [c39]Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess:
Neural Probabilistic Motor Primitives for Humanoid Control. ICLR (Poster) 2019 - [c38]Jonathan Uesato, Ananya Kumar, Csaba Szepesvári, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy (Dj) Dvijotham, Nicolas Heess, Pushmeet Kohli:
Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures. ICLR (Poster) 2019 - [c37]Jonathan J. Hunt, André Barreto, Timothy P. Lillicrap, Nicolas Heess:
Composing Entropic Policies using Divergence Correction. ICML 2019: 2911-2920 - [c36]Peter Sunehag, Guy Lever, Siqi Liu, Josh Merel, Nicolas Heess, Joel Z. Leibo, Edward Hughes, Tom Eccles, Thore Graepel:
Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems. ALIFE 2019: 103-110 - [c35]Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. NeurIPS 2019: 12467-12476 - [i57]Carlos Florensa, Jonas Degrave, Nicolas Heess, Jost Tobias Springenberg, Martin A. Riedmiller:
Self-supervised Learning of Image Embedding for Continuous Control. CoRR abs/1901.00943 (2019) - [i56]Théophane Weber, Nicolas Heess, Lars Buesing, David Silver:
Credit Assignment Techniques in Stochastic Computation Graphs. CoRR abs/1901.01761 (2019) - [i55]Steven Bohez, Abbas Abdolmaleki, Michael Neunert, Jonas Buchli, Nicolas Heess, Raia Hadsell:
Value constrained model-free continuous control. CoRR abs/1902.04623 (2019) - [i54]Siqi Liu, Guy Lever, Josh Merel, Saran Tunyasuvunakool, Nicolas Heess, Thore Graepel:
Emergent Coordination Through Competition. CoRR abs/1902.07151 (2019) - [i53]Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Rémi Munos, Doina Precup:
The Termination Critic. CoRR abs/1902.09996 (2019) - [i52]Dhruva Tirumala, Hyeonwoo Noh, Alexandre Galashov, Leonard Hasenclever, Arun Ahuja, Greg Wayne, Razvan Pascanu, Yee Whye Teh, Nicolas Heess:
Exploiting Hierarchy for Learning and Transfer in KL-regularized RL. CoRR abs/1903.07438 (2019) - [i51]Alexandre Galashov, Siddhant M. Jayakumar, Leonard Hasenclever, Dhruva Tirumala, Jonathan Schwarz, Guillaume Desjardins, Wojciech M. Czarnecki, Yee Whye Teh, Razvan Pascanu, Nicolas Heess:
Information asymmetry in KL-regularized RL. CoRR abs/1905.01240 (2019) - [i50]Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alexander Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin J. Miller, Mohammad Gheshlaghi Azar, Ian Osband, Neil C. Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew M. Botvinick, Shane Legg:
Meta-learning of Sequential Strategies. CoRR abs/1905.03030 (2019) - [i49]Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess:
Meta reinforcement learning as task inference. CoRR abs/1905.06424 (2019) - [i48]Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow:
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces. CoRR abs/1906.06062 (2019) - [i47]Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Y. Siegel, Nicolas Heess, Martin A. Riedmiller:
Regularized Hierarchical Policies for Compositional Transfer in Robotics. CoRR abs/1906.11228 (2019) - [i46]H. Francis Song, Abbas Abdolmaleki, Jost Tobias Springenberg, Aidan Clark, Hubert Soyer, Jack W. Rae, Seb Noury, Arun Ahuja, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Dan Belov, Martin A. Riedmiller, Matthew M. Botvinick:
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control. CoRR abs/1909.12238 (2019) - [i45]Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Pérolat, Siqi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Rémi Munos:
A Generalized Training Approach for Multiagent Learning. CoRR abs/1909.12823 (2019) - [i44]Arunkumar Byravan, Jost Tobias Springenberg, Abbas Abdolmaleki, Roland Hafner, Michael Neunert, Thomas Lampe, Noah Y. Siegel, Nicolas Heess, Martin A. Riedmiller:
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models. CoRR abs/1910.04142 (2019) - [i43]Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Çaglar Gülçehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell:
Stabilizing Transformers for Reinforcement Learning. CoRR abs/1910.06764 (2019) - [i42]Lars Buesing, Nicolas Heess, Theophane Weber:
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions. CoRR abs/1910.06862 (2019) - [i41]Jonas Degrave, Abbas Abdolmaleki, Jost Tobias Springenberg, Nicolas Heess, Martin A. Riedmiller:
Quinoa: a Q-function You Infer Normalized Over Actions. CoRR abs/1911.01831 (2019) - [i40]Josh Merel, Saran Tunyasuvunakool, Arun Ahuja, Yuval Tassa, Leonard Hasenclever, Vu Pham, Tom Erez, Greg Wayne, Nicolas Heess:
Reusable neural skill embeddings for vision-guided whole body movement and object manipulation. CoRR abs/1911.06636 (2019) - [i39]Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. CoRR abs/1912.02503 (2019) - 2018
- [c34]Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, Nicolas Heess, Martin A. Riedmiller:
Maximum a Posteriori Policy Optimisation. ICLR (Poster) 2018 - [c33]Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy P. Lillicrap:
Distributed Distributional Deterministic Policy Gradients. ICLR (Poster) 2018 - [c32]Karol Hausman, Jost Tobias Springenberg, Ziyu Wang, Nicolas Heess, Martin A. Riedmiller:
Learning an Embedding Space for Transferable Robot Skills. ICLR (Poster) 2018 - [c31]Wojciech Marian Czarnecki, Siddhant M. Jayakumar, Max Jaderberg, Leonard Hasenclever, Yee Whye Teh, Nicolas Heess, Simon Osindero, Razvan Pascanu:
Mix & Match Agent Curricula for Reinforcement Learning. ICML 2018: 1095-1103 - [c30]Martin A. Riedmiller, Roland Hafner, Thomas Lampe, Michael Neunert, Jonas Degrave, Tom Van de Wiele, Vlad Mnih, Nicolas Heess, Jost Tobias Springenberg:
Learning by Playing Solving Sparse Reward Tasks from Scratch. ICML 2018: 4341-4350 - [c29]Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, Josh Merel, Martin A. Riedmiller, Raia Hadsell, Peter W. Battaglia:
Graph Networks as Learnable Physics Engines for Inference and Control. ICML 2018: 4467-4476 - [c28]Yuke Zhu, Ziyu Wang, Josh Merel, Andrei A. Rusu, Tom Erez, Serkan Cabi, Saran Tunyasuvunakool, János Kramár, Raia Hadsell, Nando de Freitas, Nicolas Heess:
Reinforcement and Imitation Learning for Diverse Visuomotor Skills. Robotics: Science and Systems 2018 - [i38]Yuke Zhu, Ziyu Wang, Josh Merel, Andrei A. Rusu, Tom Erez, Serkan Cabi, Saran Tunyasuvunakool, János Kramár, Raia Hadsell, Nando de Freitas, Nicolas Heess:
Reinforcement and Imitation Learning for Diverse Visuomotor Skills. CoRR abs/1802.09564 (2018) - [i37]Martin A. Riedmiller, Roland Hafner, Thomas Lampe, Michael Neunert, Jonas Degrave, Tom Van de Wiele, Volodymyr Mnih, Nicolas Heess, Jost Tobias Springenberg:
Learning by Playing - Solving Sparse Reward Tasks from Scratch. CoRR abs/1802.10567 (2018) - [i36]Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy P. Lillicrap:
Distributed Distributional Deterministic Policy Gradients. CoRR abs/1804.08617 (2018) - [i35]Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, Josh Merel, Martin A. Riedmiller, Raia Hadsell, Peter W. Battaglia:
Graph networks as learnable physics engines for inference and control. CoRR abs/1806.01242 (2018) - [i34]Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinícius Flores Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Çaglar Gülçehre, H. Francis Song, Andrew J. Ballard, Justin Gilmer, George E. Dahl, Ashish Vaswani, Kelsey R. Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matthew M. Botvinick, Oriol Vinyals, Yujia Li, Razvan Pascanu:
Relational inductive biases, deep learning, and graph networks. CoRR abs/1806.01261 (2018) - [i33]Wojciech Marian Czarnecki, Siddhant M. Jayakumar, Max Jaderberg, Leonard Hasenclever, Yee Whye Teh, Simon Osindero, Nicolas Heess, Razvan Pascanu:
Mix&Match - Agent Curricula for Reinforcement Learning. CoRR abs/1806.01780 (2018) - [i32]Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Rémi Munos, Nicolas Heess, Martin A. Riedmiller:
Maximum a Posteriori Policy Optimisation. CoRR abs/1806.06920 (2018) - [i31]Lars Buesing, Theophane Weber, Yori Zwols, Sébastien Racanière, Arthur Guez, Jean-Baptiste Lespiau, Nicolas Heess:
Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search. CoRR abs/1811.06272 (2018) - [i30]Josh Merel, Arun Ahuja, Vu Pham, Saran Tunyasuvunakool, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Greg Wayne:
Hierarchical visuomotor control of humanoids. CoRR abs/1811.09656 (2018) - [i29]Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess:
Neural probabilistic motor primitives for humanoid control. CoRR abs/1811.11711 (2018) - [i28]Jonathan Uesato, Ananya Kumar, Csaba Szepesvári, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy Dvijotham, Nicolas Heess, Pushmeet Kohli:
Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures. CoRR abs/1812.01647 (2018) - [i27]Jonathan J. Hunt, André Barreto, Timothy P. Lillicrap, Nicolas Heess:
Entropic Policy Composition with Generalized Policy Improvement and Divergence Correction. CoRR abs/1812.02216 (2018) - [i26]Abbas Abdolmaleki, Jost Tobias Springenberg, Jonas Degrave, Steven Bohez, Yuval Tassa, Dan Belov, Nicolas Heess, Martin A. Riedmiller:
Relative Entropy Regularized Policy Iteration. CoRR abs/1812.02256 (2018) - 2017
- [c27]Andrei A. Rusu, Matej Vecerík, Thomas Rothörl, Nicolas Heess, Razvan Pascanu, Raia Hadsell:
Sim-to-Real Robot Learning from Pixels with Progressive Nets. CoRL 2017: 262-270 - [c26]Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Rémi Munos, Koray Kavukcuoglu, Nando de Freitas:
Sample Efficient Actor-Critic with Experience Replay. ICLR (Poster) 2017 - [c25]Jessica B. Hamrick, Andrew J. Ballard, Razvan Pascanu, Oriol Vinyals, Nicolas Heess, Peter W. Battaglia:
Metacontrol for Adaptive Imagination-Based Optimization. ICLR (Poster) 2017 - [c24]Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh:
Particle Value Functions. ICLR (Workshop) 2017 - [c23]Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu:
FeUdal Networks for Hierarchical Reinforcement Learning. ICML 2017: 3540-3549 - [c22]Yee Whye Teh, Victor Bapst, Wojciech M. Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu:
Distral: Robust multitask reinforcement learning. NIPS 2017: 4496-4506 - [c21]Ziyu Wang, Josh Merel, Scott E. Reed, Nando de Freitas, Gregory Wayne, Nicolas Heess:
Robust Imitation of Diverse Behaviors. NIPS 2017: 5320-5329 - [c20]Sébastien Racanière, Theophane Weber, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter W. Battaglia, Demis Hassabis, David Silver, Daan Wierstra:
Imagination-Augmented Agents for Deep Reinforcement Learning. NIPS 2017: 5690-5701 - [c19]Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh:
Filtering Variational Objectives. NIPS 2017: 6573-6583 - [c18]Danijar Hafner, Alexander Irpan, James Davidson, Nicolas Heess:
Learning Hierarchical Information Flow with Recurrent Neural Modules. NIPS 2017: 6724-6733 - [i25]Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu:
FeUdal Networks for Hierarchical Reinforcement Learning. CoRR abs/1703.01161 (2017) - [i24]Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh:
Particle Value Functions. CoRR abs/1703.05820 (2017) - [i23]Ivaylo Popov, Nicolas Heess, Timothy P. Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerík, Thomas Lampe, Yuval Tassa, Tom Erez, Martin A. Riedmiller:
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation. CoRR abs/1704.03073 (2017) - [i22]Jessica B. Hamrick, Andrew J. Ballard, Razvan Pascanu, Oriol Vinyals, Nicolas Heess, Peter W. Battaglia:
Metacontrol for Adaptive Imagination-Based Optimization. CoRR abs/1705.02670 (2017) - [i21]Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh:
Filtering Variational Objectives. CoRR abs/1705.09279 (2017) - [i20]Danijar Hafner, Alex Irpan, James Davidson, Nicolas Heess:
Learning Hierarchical Information Flow with Recurrent Neural Modules. CoRR abs/1706.05744 (2017) - [i19]Josh Merel, Yuval Tassa, Dhruva TB, Sriram Srinivasan, Jay Lemmon, Ziyu Wang, Greg Wayne, Nicolas Heess:
Learning human behaviors from motion capture by adversarial imitation. CoRR abs/1707.02201 (2017) - [i18]Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin A. Riedmiller, David Silver:
Emergence of Locomotion Behaviours in Rich Environments. CoRR abs/1707.02286 (2017) - [i17]Ziyu Wang, Josh Merel, Scott E. Reed, Greg Wayne, Nando de Freitas, Nicolas Heess:
Robust Imitation of Diverse Behaviors. CoRR abs/1707.02747 (2017) - [i16]Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu:
Distral: Robust Multitask Reinforcement Learning. CoRR abs/1707.04175 (2017) - [i15]Razvan Pascanu, Yujia Li, Oriol Vinyals, Nicolas Heess, Lars Buesing, Sébastien Racanière, David P. Reichert, Theophane Weber, Daan Wierstra, Peter W. Battaglia:
Learning model-based planning from scratch. CoRR abs/1707.06170 (2017) - [i14]Theophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter W. Battaglia, David Silver, Daan Wierstra:
Imagination-Augmented Agents for Deep Reinforcement Learning. CoRR abs/1707.06203 (2017) - [i13]Matej Vecerík, Todd Hester, Jonathan Scholz, Fumin Wang, Olivier Pietquin, Bilal Piot, Nicolas Heess, Thomas Rothörl, Thomas Lampe, Martin A. Riedmiller:
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards. CoRR abs/1707.08817 (2017) - 2016
- [c17]S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, Geoffrey E. Hinton:
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models. NIPS 2016: 3225-3233 - [c16]Danilo Jimenez Rezende, S. M. Ali Eslami, Shakir Mohamed, Peter W. Battaglia, Max Jaderberg, Nicolas Heess:
Unsupervised Learning of 3D Structure from Images. NIPS 2016: 4997-5005 - [c15]Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra:
Continuous control with deep reinforcement learning. ICLR (Poster) 2016 - [i12]S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, Koray Kavukcuoglu, Geoffrey E. Hinton:
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models. CoRR abs/1603.08575 (2016) - [i11]Danilo Jimenez Rezende, S. M. Ali Eslami, Shakir Mohamed, Peter W. Battaglia, Max Jaderberg, Nicolas Heess:
Unsupervised Learning of 3D Structure from Images. CoRR abs/1607.00662 (2016) - [i10]Andrei A. Rusu, Matej Vecerík, Thomas Rothörl, Nicolas Heess, Razvan Pascanu, Raia Hadsell:
Sim-to-Real Robot Learning from Pixels with Progressive Nets. CoRR abs/1610.04286 (2016) - [i9]Nicolas Heess, Gregory Wayne, Yuval Tassa, Timothy P. Lillicrap, Martin A. Riedmiller, David Silver:
Learning and Transfer of Modulated Locomotor Controllers. CoRR abs/1610.05182 (2016) - [i8]Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Rémi Munos, Koray Kavukcuoglu, Nando de Freitas:
Sample Efficient Actor-Critic with Experience Replay. CoRR abs/1611.01224 (2016) - 2015
- [c14]Nicolas Heess, Gregory Wayne, David Silver, Timothy P. Lillicrap, Tom Erez, Yuval Tassa:
Learning Continuous Control Policies by Stochastic Value Gradients. NIPS 2015: 2944-2952 - [c13]John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel:
Gradient Estimation Using Stochastic Computation Graphs. NIPS 2015: 3528-3536 - [c12]Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess, S. M. Ali Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, Zoltán Szabó:
Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages. UAI 2015: 405-414 - [i7]Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess:
Passing Expectation Propagation Messages with Kernel Methods. CoRR abs/1501.00375 (2015) - [i6]Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess, S. M. Ali Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, Zoltán Szabó:
Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages. CoRR abs/1503.02551 (2015) - [i5]John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel:
Gradient Estimation Using Stochastic Computation Graphs. CoRR abs/1506.05254 (2015) - [i4]Nicolas Heess, Greg Wayne, David Silver, Timothy P. Lillicrap, Yuval Tassa, Tom Erez:
Learning Continuous Control Policies by Stochastic Value Gradients. CoRR abs/1510.09142 (2015) - [i3]Nicolas Heess, Jonathan J. Hunt, Timothy P. Lillicrap, David Silver:
Memory-based control with recurrent neural networks. CoRR abs/1512.04455 (2015) - 2014
- [j2]S. M. Ali Eslami, Nicolas Heess, Christopher K. I. Williams, John M. Winn:
The Shape Boltzmann Machine: A Strong Model of Object Shape. Int. J. Comput. Vis. 107(2): 155-176 (2014) - [c11]Jyri J. Kivinen, Christopher K. I. Williams, Nicolas Heess:
Visual Boundary Prediction: A Deep Neural Prediction Network and Quality Dissection. AISTATS 2014: 512-521 - [c10]David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin A. Riedmiller:
Deterministic Policy Gradient Algorithms. ICML 2014: 387-395 - [c9]Arthur Guez, Nicolas Heess, David Silver, Peter Dayan:
Bayes-Adaptive Simulation-based Search with Value Function Approximation. NIPS 2014: 451-459 - [c8]Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu:
Recurrent Models of Visual Attention. NIPS 2014: 2204-2212 - [i2]Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu:
Recurrent Models of Visual Attention. CoRR abs/1406.6247 (2014) - 2013
- [c7]Nicolas Heess, Daniel Tarlow, John M. Winn:
Learning to Pass Expectation Propagation Messages. NIPS 2013: 3219-3227 - 2012
- [b1]Nicolas Manfred Otto Heess:
Learning generative models of mid-level structure in natural images. University of Edinburgh, UK, 2012 - [c6]S. M. Ali Eslami, Nicolas Heess, John M. Winn:
The Shape Boltzmann Machine: A strong model of object shape. CVPR 2012: 406-413 - [c5]Nicolas Heess, David Silver, Yee Whye Teh:
Actor-Critic Reinforcement Learning with Energy-Based Policies. EWRL 2012: 43-58 - [c4]Bogdan Alexe, Nicolas Heess, Yee Whye Teh, Vittorio Ferrari:
Searching for objects driven by context. NIPS 2012: 890-898 - 2011
- [j1]Nicolas Le Roux, Nicolas Heess, Jamie Shotton, John M. Winn:
Learning a Generative Model of Images by Factoring Appearance and Shape. Neural Comput. 23(3): 593-650 (2011) - [c3]Nicolas Heess, Nicolas Le Roux, John M. Winn:
Weakly Supervised Learning of Foreground-Background Segmentation Using Masked RBMs. ICANN (2) 2011: 9-16 - [c2]Hannes P. Saal, Nicolas Manfred Otto Heess, Sethu Vijayakumar:
Multimodal Nonlinear Filtering Using Gauss-Hermite Quadrature. ECML/PKDD (3) 2011: 81-96 - [i1]Nicolas Heess, Nicolas Le Roux, John M. Winn:
Weakly Supervised Learning of Foreground-Background Segmentation using Masked RBMs. CoRR abs/1107.3823 (2011)
2000 – 2009
- 2009
- [c1]Nicolas Heess, Christopher K. I. Williams, Geoffrey E. Hinton:
Learning Generative Texture Models with extended Fields-of-Experts. BMVC 2009: 1-11
Coauthor Index
aka: Joshua Scott Merel
aka: Noah Yamamoto Siegel
aka: Gregory Wayne
aka: Théophane Weber
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-11-13 23:53 CET by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint