User profiles for Tim Hertweck

Tim Hertweck

Google DeepMind
Verified email at google.com
Cited by 335

The challenges of exploration for offline reinforcement learning

…, A Byravan, M Bloesch, V Dasagi, T Hertweck… - arXiv preprint arXiv …, 2022 - arxiv.org
Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked
processes of reinforcement learning: collecting informative experience and inferring optimal …

Data-efficient hindsight off-policy option learning

…, T Lampe, A Abdolmaleki, T Hertweck… - International …, 2021 - proceedings.mlr.press
We introduce Hindsight Off-policy Options (HO2), a data-efficient option learning algorithm.
Given any trajectory, HO2 infers likely option choices and backpropagates through the …

Compositional transfer in hierarchical reinforcement learning

…, JT Springenberg, M Neunert, T Hertweck… - arXiv preprint arXiv …, 2019 - arxiv.org
The successful application of general reinforcement learning algorithms to real-world robotics
applications is often limited by their high data requirements. We introduce Regularized …

Towards general and autonomous learning of core skills: A case study in locomotion

R Hafner, T Hertweck, P Klöppner… - … on Robot Learning, 2021 - proceedings.mlr.press
Modern Reinforcement Learning (RL) algorithms promise to solve difficult motor control
problems directly from raw sensory inputs. Their attraction is due in part to the fact that they can …

Is curiosity all you need? on the utility of emergent behaviours from curious exploration

…, M Wulfmeier, G Vezzani, V Dasagi, T Hertweck… - arXiv preprint arXiv …, 2021 - arxiv.org
Curiosity-based reward schemes can present powerful exploration mechanisms which
facilitate the discovery of solutions for complex, sparse or long-horizon tasks. However, as the …

Mastering stacking of diverse shapes with large-scale iterative reinforcement learning on real robots

…, O Groth, R Hafner, T Hertweck… - … on Robotics and …, 2024 - ieeexplore.ieee.org
Reinforcement learning solely from an agent’s self-generated data is often believed to be
infeasible for learning on real robots, due to the amount of data needed. However, if done right, …

Simultaneously learning vision and feature-based control policies for real-world ball-in-a-cup

…, M Neunert, A Abdolmaleki, T Hertweck… - arXiv preprint arXiv …, 2019 - arxiv.org
We present a method for fast training of vision based control policies on real robots. The key
idea behind our method is to perform multi-task Reinforcement Learning with auxiliary tasks …

Replay across experiments: A natural extension of off-policy rl

…, S Huang, G Lever, B Moran, T Hertweck… - arXiv preprint arXiv …, 2023 - arxiv.org
Replaying data is a principal mechanism underlying the stability and data efficiency of off-policy
reinforcement learning (RL). We present an effective yet simple framework to extend the …

Less is more--the Dispatcher/Executor principle for multi-task Reinforcement Learning

M Riedmiller, T Hertweck, R Hafner - arXiv preprint arXiv:2312.09120, 2023 - arxiv.org
Humans instinctively know how to neglect details when it comes to solve complex decision
making problems in environments with unforeseeable variations. This abstraction process …

Regularized hierarchical policies for compositional transfer in robotics

…, JT Springenberg, M Neunert, T Hertweck… - arXiv preprint arXiv …, 2019 - academia.edu
The successful application of flexible, general learning algorithms—such as deep
reinforcement learning—to real-world robotics applications is often limited by their poor data-efficiency…