Does Ludwig Support PPO? #3949
-
Hi all, I didn't see anything on the doc site, so I'm asking here: does Ludwig support PPO training? And what would an example .yaml config file look like for this? I assume the config would need to include parameters for a supervised fine-tuned (SFT) model, a reward model, and any parameters for the PPO loss calculations and gradient updates to the SFT model. Thanks!
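For readers landing here: as the reply below notes, Ludwig has no PPO trainer, so the following is only a hypothetical sketch of what such a config might look like, written in the style of Ludwig's LLM fine-tuning YAML. The `type: ppo` trainer, the `reward_model` key, and everything under `ppo` are invented for illustration and are not real Ludwig options:

```yaml
# HYPOTHETICAL ONLY: Ludwig does not currently support PPO; the trainer
# section below is invented to illustrate the structure the question describes.
model_type: llm
base_model: meta-llama/Llama-2-7b-hf   # the SFT policy model to be updated

trainer:
  type: ppo                            # hypothetical trainer type
  reward_model: path/to/reward-model   # hypothetical: frozen reward model that scores rollouts
  ppo:                                 # hypothetical PPO hyperparameters
    kl_coefficient: 0.2                # penalty for drifting from the SFT reference policy
    clip_range: 0.2                    # epsilon in the clipped surrogate objective
    rollout_batch_size: 64             # prompts sampled per PPO update
  learning_rate: 1.0e-5
```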
-
Hi @braunagn! Unfortunately, Ludwig currently doesn't support PPO or DPO, but it is something we intend to add in the next few months. Would you be interested in contributing support for either of them?