reward

An advanced desktop automation tool for Microsoft Rewards. It performs Bing searches and collects Daily Sets using mathematically driven, human-like input simulation (W3C Actions, Bezier curves, and smart scrolling). Built with Python/Selenium and packaged as an executable Windows app for a seamless, plug-and-play experience.

bot automation selenium inno-setup pyinstaller reward bing-search pywebview microsoft-rewards

Updated Apr 19, 2026
Python

holarissun / RewardModelingBeyondBradleyTerry

Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives

reward inverse-reinforcement-learning large-language-models rlhf reward-models largelanguagemodels reward-modeling llm-aligment llmalignment

Updated Apr 2, 2025
Python

far3x / lumen

The official CLI for the Lumen Protocol & Local Prompt Generation.

client ai code lumen reward

Updated Oct 29, 2025
Python

manglu097 / Thoth

[ICLR 2026] Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism

protocol reward llm ai4s

Updated Jan 30, 2026
Python

khinthandarkyaw98 / Optimizing-UAV-trajectory-for-maximum-data-rate-via-Q-Learning

During our participation in the Internship Exchange Program, my friend and I collaborated on this project under the guidance of our supervisor from NTHU.

reinforcement-learning uav reward data-rate uav-trajectory

Updated May 18, 2024
Python

ssbuild / chatglm_rlhf

chatglm_rlhf_finetuning

chat lora reward finetuning rlhf chatglm qlora

Updated Oct 10, 2023
Python

ssbuild / llm_rlhf

Implements reinforcement learning training for LLMs such as GPT-2, LLaMA, and BLOOM

lora reward trl llm rlhf trlx llm-rlhf

Updated Sep 19, 2023
Python

dp770 / aws_deepracer_worksheet

Worksheet and utilities for AWS DeepRacer, one of the most exciting ways to build strong reinforcement learning skills through a hands-on approach. This repository offers: 1) a feature-rich, flexible reward function; 2) utilities with Jupyter notes for racing line calculation and track visualisation; 3) scripts to parse RoboMak…

learning aws reinforcement-learning utilities racing line notes excel scripts function track coordinates reinforcement jupiter reward worksheet hands-on deepracer

Updated Sep 16, 2022
Python

Miraclemarvel55 / LLaMA-MOSS-RLHF-LoRA

Training LLaMA or MOSS with RLHF, optionally using LoRA

similarity chinese rl llama lora moss reward ppo rlhf

Updated May 16, 2023
Python

corbosiny / AIVO-StreetFigherReinforcementLearning

Creating an environment to quickly train a variety of Deep Reinforcement Learning algorithms on Street Fighter 2 using tournaments between learning agents

reinforcement-learning retro reward fighter street-fighter game-environment reinforcement-learning-environments