Tags: araffin/sbx

v0.26.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Update OnPolicyAlgorithmJax & PPO to support custom rollout_buffer_class (#90)

* Update OnPolicyAlgorithmJax & PPO to support custom rollout_buffer_class

* Added assertion to prevent PPO from using DictRolloutBuffer implicitly

* Update links to https

* Update version

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

v0.25.0

Bump version

v0.24.0

Add Python 3.13 support, drop Python 3.9 (#83)

* Drop Python 3.9 support

* Apply autofixes for Python 3.10

* Reformat

v0.23.0

Release v0.23.0

v0.22.0

Add n-step return support with `n_steps` parameter (#74)

* Add support for n-step returns

* Add type hint

* Cleanup ppo code

* Log policy and entropy loss separately

* Cleanup vf init

* Update version

* Add test for n steps

* Cap Jax version

* Reformat
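
For readers unfamiliar with n-step returns, the idea can be sketched in plain Python. This is a generic illustration, not the code added in this release: the function name `n_step_return` and its arguments are hypothetical, and the way the `n_steps` parameter is wired into the replay buffer is not shown here.

```python
def n_step_return(rewards, dones, bootstrap_value, gamma=0.99):
    """Discounted n-step return for a single transition (illustrative only).

    rewards: the next n rewards r_t, ..., r_{t+n-1}
    dones: matching termination flags (True if the episode ended at that step)
    bootstrap_value: value estimate for the state reached after n steps
    """
    ret = 0.0
    discount = 1.0
    for reward, done in zip(rewards, dones):
        ret += discount * reward
        discount *= gamma
        if done:
            # The episode terminated inside the window: do not bootstrap.
            return ret
    # No termination: bootstrap from the value of the state after n steps.
    return ret + discount * bootstrap_value
```

With `n = 1` this reduces to the usual one-step target `r + gamma * V(s')`.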

v0.21.0

KL Adaptive LR for PPO and LR schedule for SAC/TQC (#72)

* Only check for terminated episodes

* Start adding ortho init

* Add SimbaPolicy for PPO

* Try adding ortho init to SAC

* Enable lr schedule for PPO

* Allow to pass lr, prepare for adaptive lr

* Implement adaptive lr

* Add small test

* Refactor adaptive lr

* Add adaptive lr for SAC

* Fix qf_learning_rate

* Revert "Fix qf_learning_rate"

This reverts commit ab33983.

* Revert "Add adaptive lr for SAC"

This reverts commit 5832702.

* Revert kl div for SAC changes

* Revert dist.mode() in two lines

* Cleanup code

* Add support for Gaussian actor for SAC

* Enable Gaussian actor for TQC

* Log std too

* Avoid NaN in kl div approx

* Allow to use layer_norm in actor

* Reformat

* Allow max grad norm for TQC and fix optimizer class

* Comment out max grad norm

* Update to schedule classes

* Add lr schedule support for TQC

* Revert experimental changes and add support for lr schedule for SAC

* Add test for adaptive kl div, remove squash output param
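
As a rough illustration of what a KL-adaptive learning rate does, the sketch below uses one common heuristic: shrink the learning rate when the measured KL divergence overshoots the target, and grow it when the KL is well below the target. The thresholds (`2.0` and `0.5`), the factor `1.5`, the bounds, and the function name `adapt_learning_rate` are all assumptions chosen for this example and may differ from what this release implements.

```python
def adapt_learning_rate(lr, kl, target_kl, factor=1.5, min_lr=1e-5, max_lr=1e-2):
    """Return an adjusted learning rate based on the observed KL divergence.

    All constants here are illustrative assumptions, not values from sbx.
    """
    if kl > 2.0 * target_kl:
        # The policy moved too far in the last update: take smaller steps.
        return max(lr / factor, min_lr)
    if kl < 0.5 * target_kl:
        # The policy barely moved: allow larger steps.
        return min(lr * factor, max_lr)
    # KL is close enough to the target: keep the current learning rate.
    return lr
```

The clamping to `[min_lr, max_lr]` keeps repeated adjustments from driving the learning rate to zero or to an unstable value.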

v0.20.0

Update PPO to support `net_arch`, and additional fixes (#65)

* Add support for flexible arch in PPO

* Fix ent_coeff logging for TQC

* Fix name order

* Fix ent_coeff logging for SAC

* Hotfix for PPO, do not squash output at test time

* Fix typo

* Fix typo in common policy

* Try Gaussian dist for TQC

* Revert "Try Gaussian dist for TQC"

This reverts commit 6eeaf23.

* Fix CrossQ ent_coef logging

* Log PPO std when possible

* Fix for CrossQ

v0.19.0

Add SimBa Policy: Simplicity Bias for Scaling Up Parameters in DRL (#59)

* Start testing simba

* Quick try with CrossQ

* Add actor for CrossQ

* Add simba net for TQC

* Remove unused param

* Add parameter resets for TQC

* Fix reset

* Add missing param

* Update documentation

* Add parameter resets

* Reformat pyproject.toml

* Refactor: share actor between SAC and TQC

* Add run tests for simba

* Upgrade to python 3.9 (#64)

* Fix mypy error, update version

v0.18.0

Optimize the log of the entropy coeff instead of the entropy coeff (#56)

* optimize the log of the entropy coeff instead of the entropy coeff

* Update log ent coef for SAC and derivatives

* Reformat yaml

* Use uv for faster downloads

* Remove TODO

* Remove redundant call

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
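
To make the idea of optimizing the log of the entropy coefficient concrete, here is a minimal sketch of one common formulation, written in plain Python with a hand-derived gradient. The loss used below, the function name `update_log_ent_coef`, and the plain gradient-descent step are assumptions for illustration; the exact loss and optimizer used in this release may differ.

```python
import math


def update_log_ent_coef(log_ent_coef, mean_log_prob, target_entropy, lr=3e-4):
    """One gradient-descent step on the log of the entropy coefficient.

    Assumed loss (illustrative): L = -log_ent_coef * (mean_log_prob + target_entropy)
    so dL/dlog_ent_coef = -(mean_log_prob + target_entropy).
    """
    grad = -(mean_log_prob + target_entropy)
    return log_ent_coef - lr * grad


# Hypothetical numbers: the policy entropy (-mean_log_prob = -2.0) is below
# the target (-1.0), so the coefficient should increase.
log_ent_coef = update_log_ent_coef(0.0, mean_log_prob=2.0, target_entropy=-1.0, lr=0.1)
ent_coef = math.exp(log_ent_coef)  # always positive by construction
```

Working with the log keeps the coefficient strictly positive without any explicit constraint, since it is recovered through `exp`.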

v0.17.0

Add CNN support for DQN (#49)

* Add CNN support for DQN

* Update version and deps

* Fix CNN, channel last, padding and reshape