Open
Description
I tried to run mario_a2c.py, mario_ppo.py and mario_curio.py, but the reward does not improve with any of them.
Did you use the same hyper-parameters as in the files to conduct the evaluation (e.g. number of workers, learning rate)?
Which versions of the libraries did you use?
For instance, A2C without ICM (after 3M time-steps):
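Regarding the library versions question, here is a minimal sketch for listing installed versions so that the two environments can be compared. The package names in `PACKAGES` are assumptions (the repo's scripts may import different libraries), and `installed_versions` is a hypothetical helper, not part of the repository.

```python
from importlib import metadata

# Assumed package list: adjust to whatever the training scripts actually import.
PACKAGES = ["torch", "gym", "gym-super-mario-bros", "nes-py", "numpy"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Running the same script in both environments and comparing the output would show whether a version mismatch could explain the difference.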