The GitHub repository for "Accelerating Approximate Thompson Sampling with Underdamped Langevin Monte Carlo", AISTATS 2024.
Over-parameterization = exploration?
Repository containing a comparison of two methods for dealing with the exploration-exploitation dilemma in multi-armed bandits.
A reinforcement learning project where a snake learns to navigate and survive in a dynamic environment through Q-learning.
A systematic parameter study of exploration–exploitation trade-offs in an Active Inference agent under varying precision and sensory noise.
Uses GloVe embeddings + bandit-style exploration/exploitation with adaptive diversification, UCB-driven cluster search, and stagnation recovery.
PyTorch-native contextual bandits with Neural Thompson Sampling for scalable exploration and large action sets.
Deep Intrinsically Motivated Exploration in Continuous Control
Official implementation of LECO (NeurIPS'22)
This project compares different reinforcement learning algorithms, including Monte Carlo, Q-learning, lambda Q-learning epsilon-greedy variations, etc.
An implementation of the reinforcement learning multi-armed bandit experiment using different exploration techniques.
This repository collects reference code for several multi-agent and distributed contextual bandit algorithms.
🤖 Explore and optimize rewards with Bandexa, a PyTorch-native library for Neural-Linear Thompson Sampling in contextual bandits.
Explore the 10-Arm Testbed Simulation! 🎲 Utilize Python to test various ε-greedy strategies in a reinforcement learning environment. Visualize and compare agents' performance as they balance exploration and exploitation. Perfect for learners and enthusiasts! 🚀📊
MACE-RL is a novel framework that learns to dynamically regulate its own exploration strategy. It integrates a curiosity module, informed by episodic memory of past experiences, with a meta-learning network that monitors the agent's performance and adaptively regulates the influence of the intrinsic curiosity bonus.
The official code release for "More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling", Reinforcement Learning Conference (RLC) 2024
Research Thesis - Reinforcement Learning
The official code release for "Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo", ICLR 2024.