Skip to content

Conversation

@jcformanek
Copy link
Contributor

@jcformanek jcformanek commented Jul 19, 2021

What?

Implemented importance sampling (IS) / prioritized experience replay for feedforward madqn.

Why?

Hopefully importance sampling will assist performance.

How?

After each learning step we compute new priorities for the samples we drew from the replay buffer. This is done using Q-value errors. The mutate_priorities() function is used to update the priorities in the reverb table.

Extra

For now IS only works in feedforward madqn. I have also tested to make sure all of the other systems that inherit from feedforward MADQN still work. Recurrent MADQN, MADQN with comms, Dial, VDN and Qmix all still work after I made these changes.

Because so many systems inherit from feedforward MADQN, it is quite hard to make changes without breaking the other systems. So my strategy mocing forward is going to be to make very incremental upgrades to MADQN and ensure at each step that nothing breaks the other systems.

@jcformanek jcformanek added the enhancement New feature or request label Jul 19, 2021
@jcformanek jcformanek requested a review from arnupretorius July 19, 2021 13:46
@jcformanek jcformanek self-assigned this Jul 19, 2021
Copy link
Contributor

@DriesSmit DriesSmit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great thanks @jcformanek 🔥 So this PR should go in after the new adder PR?

Copy link
Contributor

@KaleabTessera KaleabTessera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really great @jcformanek ! Thanks so much for this. The only thing left from my side is to benchmark this. After that, I am happy to approve.

@jcformanek
Copy link
Contributor Author

jcformanek commented Jul 26, 2021

@KaleabTessera: I am going to benchmark this on Flatland now.

Copy link
Collaborator

@arnupretorius arnupretorius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @jcformanek! 🔥

@arnupretorius arnupretorius merged commit 0608255 into develop Jul 28, 2021
@arnupretorius arnupretorius deleted the feature/feedforward-madqn-importance-sampling branch July 28, 2021 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants