Skip to content

Fix PER bug. Add loadable memory. N-step TD.#276

Open
jakegrigsby wants to merge 5 commits into
keras-rl:keras-rl-v0.4.2from
jakegrigsby:keras-rl-v0.4.2
Open

Fix PER bug. Add loadable memory. N-step TD.#276
jakegrigsby wants to merge 5 commits into
keras-rl:keras-rl-v0.4.2from
jakegrigsby:keras-rl-v0.4.2

Conversation

@jakegrigsby

Copy link
Copy Markdown

This is basically #214 with most of the extra features cut out, because the PER bug needs to get fixed, whether or not the rest ever merges. DQfD, NoisyNets, and the csv logging callback are removed.

  1. Fixes bug discussed in Prioritized Experience Replay #212 where observation indices rotate while their priorities stay static.

  2. Adds loadable memory. The PartitionedRingBuffer is similar to the old implementation of RingBuffer but the data stays locked to the idx it was originally assigned to. That's what fixed the PER issue, and also makes it pretty easy to load demonstration data and never delete it. I originally used it in DQfD but I think it has more applications than that. A feature like this was requested in [feature-request] Add the possibility to load and store memory. #262.

  3. Adds n-step temporal difference error to dqn (only for prioritized memories). This basically involves a change to how the experiences are retrieved from memory. I'd appreciate someone double checking my math on that.

Comment thread rl/agents/dqn.py
Comment thread rl/memory.py Outdated
Comment thread rl/memory.py Outdated
Comment thread rl/memory.py
@RaphaelMeudec

Copy link
Copy Markdown
Contributor

@jakegrigsby Thanks for your PR! I've left you some comments. Also, it seems it doesn't pass on Tensorflow backend. Can you clear this out?

@jakegrigsby

Copy link
Copy Markdown
Author

@RaphaelMeudec I use tensorflow as my backend and it seems to be working. I use 1.10 so I'll update and see if that's the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants