When rolling out the policy [here](https://github.com/jrobine/twm/blob/e6a8e599864f5539b166ee9a8d173ece8f3fba84/twm/trainer.py#L78), the nested function `policy(index)` always assumes `dreamer` is `None`, i.e. it never reaches the `else` branch.