User profiles for Tim Hertweck
Tim Hertweck, Google DeepMind. Verified email at google.com. Cited by 335.
The challenges of exploration for offline reinforcement learning
Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked
processes of reinforcement learning: collecting informative experience and inferring optimal …
Data-efficient hindsight off-policy option learning
We introduce Hindsight Off-policy Options (HO2), a data-efficient option learning algorithm.
Given any trajectory, HO2 infers likely option choices and backpropagates through the …
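To make the HO2 snippet above concrete, here is a toy sketch of the general idea of inferring latent option probabilities along a trajectory with a forward recursion that could later be differentiated through. This is not the paper's actual formulation; the shapes and the two probability tables (`log_pi_o`, `log_switch`) are hypothetical placeholders.

```python
# Toy sketch: infer per-step option probabilities for a given trajectory.
# Not the HO2 implementation; all inputs are assumed placeholders.
import numpy as np

def forward_option_probs(log_pi_o, log_switch):
    """log_pi_o:   [T, K] log-likelihood of each option explaining step t's action.
    log_switch: [K, K] log-probability of moving from option i to option j.
    Returns a [T, K] array of normalized log-probabilities over options per step."""
    T, K = log_pi_o.shape
    alpha = np.zeros((T, K))
    alpha[0] = log_pi_o[0] - np.log(K)          # uniform prior over the first option
    for t in range(1, T):
        # marginalise over the previous option (log-sum-exp for numerical stability)
        trans = alpha[t - 1][:, None] + log_switch
        alpha[t] = log_pi_o[t] + np.logaddexp.reduce(trans, axis=0)
    # normalise each step so every row is a distribution over options
    return alpha - np.logaddexp.reduce(alpha, axis=1, keepdims=True)
```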
Compositional transfer in hierarchical reinforcement learning
…, JT Springenberg, M Neunert, T Hertweck… - arXiv preprint arXiv …, 2019 - arxiv.org
The successful application of general reinforcement learning algorithms to real-world robotics
applications is often limited by their high data requirements. We introduce Regularized …
Towards general and autonomous learning of core skills: A case study in locomotion
R Hafner, T Hertweck, P Klöppner… - … on Robot Learning, 2021 - proceedings.mlr.press
Modern Reinforcement Learning (RL) algorithms promise to solve difficult motor control
problems directly from raw sensory inputs. Their attraction is due in part to the fact that they can …
Is curiosity all you need? on the utility of emergent behaviours from curious exploration
Curiosity-based reward schemes can present powerful exploration mechanisms which
facilitate the discovery of solutions for complex, sparse or long-horizon tasks. However, as the …
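For context on the snippet above, one common form of curiosity-based reward in the literature is the prediction error of a learned forward model. The minimal sketch below only illustrates what such a reward scheme typically looks like; the paper itself studies the behaviours that emerge from curious exploration rather than this exact formula, and `forward_model` is a hypothetical callable.

```python
# Minimal sketch of a prediction-error curiosity signal (illustrative only).
import numpy as np

def curiosity_reward(forward_model, state, action, next_state):
    """Intrinsic reward = squared error of a learned dynamics model's prediction.
    `forward_model` is an assumed callable mapping (state, action) -> predicted next state."""
    predicted_next = forward_model(state, action)
    return float(np.mean((predicted_next - next_state) ** 2))

# The intrinsic term is usually added to (or used in place of) the task reward:
# r_total = r_task + beta * curiosity_reward(model, s, a, s_next)
```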
Mastering stacking of diverse shapes with large-scale iterative reinforcement learning on real robots
Reinforcement learning solely from an agent’s self-generated data is often believed to be
infeasible for learning on real robots, due to the amount of data needed. However, if done right, …
Simultaneously learning vision and feature-based control policies for real-world ball-in-a-cup
We present a method for fast training of vision based control policies on real robots. The key
idea behind our method is to perform multi-task Reinforcement Learning with auxiliary tasks …
Replay across experiments: A natural extension of off-policy rl
Replaying data is a principal mechanism underlying the stability and data efficiency of off-policy
reinforcement learning (RL). We present an effective yet simple framework to extend the …
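As a rough illustration of the mechanism the snippet above points at, an off-policy learner can pre-load its replay buffer with transitions saved from earlier experiments before any new data is collected. This is a sketch of the general idea only, not the paper's framework; the file layout and helper names are hypothetical.

```python
# Rough sketch: seed a replay buffer with transitions from prior experiments.
import pickle
import random
from collections import deque

def build_replay_buffer(prior_experiment_files, capacity=1_000_000):
    """Pre-load a bounded buffer with (s, a, r, s', done) tuples saved from earlier runs.
    `prior_experiment_files` is an assumed list of pickled transition lists."""
    buffer = deque(maxlen=capacity)
    for path in prior_experiment_files:
        with open(path, "rb") as f:
            buffer.extend(pickle.load(f))
    return buffer

def sample_batch(buffer, batch_size=256):
    # Uniform sampling over old and newly appended transitions alike.
    return random.sample(buffer, min(batch_size, len(buffer)))
```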
Less is more--the Dispatcher/Executor principle for multi-task Reinforcement Learning
Humans instinctively know how to neglect details when it comes to solving complex decision
making problems in environments with unforeseeable variations. This abstraction process …
[PDF] Regularized hierarchical policies for compositional transfer in robotics
…, JT Springenberg, M Neunert, T Hertweck… - arXiv preprint arXiv …, 2019 - academia.edu
The successful application of flexible, general learning algorithms—such as deep
reinforcement learning—to real-world robotics applications is often limited by their poor data-efficiency…