Hi, thank you for your code. I'm a little confused by the infinite bootstrap in curl/train.py (line 269 in 8416d6e):

`done_bool = 0 if episode_step + 1 == env._max_episode_steps else float(`

Will it be wrong when sampling a transition at the end of an episode (where the next_obs is the start observation of the next episode)? It seems you simply ignore this.
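To make the concern concrete, here is a minimal sketch of the case I have in mind. Everything in it is hypothetical and not taken from curl/train.py: `ToyEnv`, `collect`, and the list buffer are illustration only, and the `float(done)` ending is my assumption, since the quoted line is cut off. It shows that if the transition is stored before the environment is reset, the `next_obs` at the time limit is still the last observation of the same episode, so bootstrapping from it with `done_bool = 0` would be fine. If the reset observation were stored instead, the bootstrap target would come from an unrelated state.

```python
class ToyEnv:
    """Hypothetical environment: the observation is the step counter.

    It never terminates on its own; `done` only signals the time limit.
    """

    _max_episode_steps = 3

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self._max_episode_steps  # time limit only
        return self.t, 1.0, done, {}


def collect(env, num_steps):
    """Collect transitions, storing each one BEFORE any reset happens."""
    buffer = []
    obs, episode_step, done = env.reset(), 0, False
    for _ in range(num_steps):
        if done:
            # The reset happens at the start of the next iteration,
            # after the final transition has already been stored.
            obs, episode_step, done = env.reset(), 0, False
        next_obs, reward, done, _ = env.step(0)
        # Assumed completion of the quoted line: treat a time-limit
        # cutoff as non-terminal so the target keeps bootstrapping.
        timeout = episode_step + 1 == env._max_episode_steps
        done_bool = 0.0 if timeout else float(done)
        buffer.append((obs, reward, next_obs, done_bool))
        obs = next_obs
        episode_step += 1
    return buffer


transitions = collect(ToyEnv(), 4)
```

In this sketch the third transition has `next_obs == 3` (the final observation of the first episode), and only the fourth transition starts from the reset observation `0`. Does the training loop store transitions in this order?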