Applying AlphaZero Self-Play Tactics to LLaMA for Enhanced Chatbot Interaction
Updated Jan 5, 2024 - Python
MuZero reinforcement learning algorithm for Chinese chess (Xiangqi)
GenesisZERO: potential applications of MCTS agents with LLMs for sequential decision-making
Meta-learning experiments for the game of minichess and related rule variants.
Trains deep reinforcement learning agents in Atari environments via the DRLA library.
MuZero for Super Mario Bros
Simple Muesli RL algorithm implementation (PyTorch)
A set of experiments and comparisons against human players with the MuZero agent from Google DeepMind, made as part of a research project with École Polytechnique.
A notebook implementation of the pseudocode from the original MuZero paper
[IEEE TAI] Interpretable MuZero with a decoder that reconstructs observations from hidden states for demystifying planning
Trains a deep reinforcement learning agent in simulation testbed environments with the DRLA library.
An implementation of the MuZero algorithm by Google DeepMind.
Materials for AlphaGo
A robust variant of MuZero
Deep Q-learning black-box strategies for casino games