
Code for "Distributionally Robust Deep Q-learning"

Chung I Lu, Julian Sester, Aijia Zhang

Abstract

We propose a novel distributionally robust $Q$-learning algorithm for the non-tabular case with continuous state spaces, in which the state transitions of the underlying Markov decision process are subject to model uncertainty. The uncertainty is taken into account by considering the worst-case transition from a ball around a reference probability measure. To determine the optimal policy under the worst-case state transition, we solve the associated non-linear Bellman equation by dualising and regularising the Bellman operator with the Sinkhorn distance, which is then parametrised with deep neural networks. This approach allows us to modify the Deep Q-Network algorithm to optimise for the worst-case state transition.
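
In the notation of the abstract, the robust Bellman operator being solved can be written as

$$(\mathcal{T}Q)(s,a) = r(s,a) + \gamma \inf_{\mathbb{P} \in B_{\delta}(\widehat{\mathbb{P}}(s,a))} \mathbb{E}_{s' \sim \mathbb{P}}\left[\max_{a'} Q(s',a')\right],$$

where $B_{\delta}$ denotes a ball of radius $\delta$, in Sinkhorn distance, around the reference transition kernel $\widehat{\mathbb{P}}(s,a)$. As a rough sketch of how the dualised operator can be estimated, the snippet below computes a Monte Carlo approximation of the worst-case target for a fixed dual multiplier $\lambda$. Everything in it is an assumption for exposition rather than the repository's actual code: the function and argument names, the Gaussian sampling kernel, the quadratic transport cost, and the choice to hold $\lambda$ fixed (a full treatment would also optimise over $\lambda$).

```python
# Illustrative sketch only: a Monte Carlo estimate of the Sinkhorn-dual
# worst-case Bellman target. Names, the Gaussian sampling kernel, the
# quadratic cost and all hyperparameters are assumptions, not this repo's API.
import math
import torch

def robust_bellman_target(q_net, rewards, next_states, gamma=0.99,
                          delta=0.1, eps=0.05, lam=1.0, n_samples=32):
    """Estimate r + gamma * inf_{P in Sinkhorn ball} E_P[max_a' Q(s', a')].

    q_net:       maps a batch of states to a batch of per-action Q-values
    rewards:     (batch,) tensor of observed rewards
    next_states: (batch, state_dim) reference next states s' ~ P_ref
    lam:         fixed dual multiplier (optimised over in a full treatment)
    """
    b, d = next_states.shape
    # Draw perturbed next states z around each reference s'
    # (assumed Gaussian sampling measure for the Sinkhorn regularisation).
    z = next_states.unsqueeze(1) + math.sqrt(eps) * torch.randn(b, n_samples, d)
    # Greedy continuation value max_a' Q(z, a') at each sampled state.
    with torch.no_grad():
        v = q_net(z.reshape(b * n_samples, d)).max(dim=-1).values
    v = v.reshape(b, n_samples)
    # Quadratic transport cost c(s', z).
    cost = ((z - next_states.unsqueeze(1)) ** 2).sum(dim=-1)
    # Dual of the worst case (an infimum, hence -v in the exponent):
    #   -lam*delta - lam*eps * log E_z[ exp((-v - lam*cost) / (lam*eps)) ]
    log_mean_exp = (torch.logsumexp((-v - lam * cost) / (lam * eps), dim=1)
                    - math.log(n_samples))
    worst_case_v = -lam * delta - lam * eps * log_mean_exp
    return rewards + gamma * worst_case_v
```

In a DQN-style training loop, this target would stand in for the usual $r + \gamma \max_{a'} Q(s',a')$ when forming the temporal-difference loss against the target network.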

Preprint

TBD

Contents

  1. gambling_env is the notebook for the experiments on gambling on the unit square.
  2. mmd_simulator_env is the notebook for the experiments on portfolio optimisation using the simulator from here.

About

Robust DQN using the Sinkhorn distance
