ppo

Fine-tunes FLAN-T5 with reinforcement learning (PPO) and PEFT to generate less toxic summaries, using Meta AI's hate-speech reward model for detoxification.

ppo generative-ai flan-t5

Updated May 25, 2025
HTML

tganamur / RL-vs-MPC-Racing

Compares the performance of MPC-based and RL-based racing.

python reinforcement-learning autonomous-driving model-predictive-control ppo f1tenth stable-baselines3

Updated May 27, 2024
HTML

ppo

Here are 7 public repositories matching this topic...

koshachya-myata / Data_Center_Simulation

MatheusAlves11 / TCC-AcademiaGo

Fashad-Ahmed / Tablut-challenge

nedamhs / DiabetesRL

GuickerZ / sistema-poo-eda

Ajairajv / Detoxified-Summaries-with-FLAN-T5-PPO

tganamur / RL-vs-MPC-Racing
