haoyangzheng-ai / ts_ulmc

The GitHub repository for "Accelerating Approximate Thompson Sampling with Underdamped Langevin Monte Carlo", AISTATS 2024.

monte-carlo thompson-sampling multi-armed-bandit langevin-dynamics exploration-exploitation

Updated Oct 19, 2024
Python

siavashadpey / MultiArmedBandits

reinforcement-learning active-learning bandit-algorithms exploration-exploitation

Updated Mar 27, 2022
Python

rom1mouret / exploration

over-parameterization = exploration?

global-optimization gradient-descent hypernetworks exploration-exploitation over-parameterization

Updated Aug 23, 2020
Python

Amshra267 / Thompson-Greedy-Comparison-for-MultiArmed-Bandits

Repository comparing two methods for handling the exploration-exploitation dilemma in multi-armed bandits.

thompson-sampling epsilon-greedy exploration-exploitation optimistic-bayesian-sampling

Updated Jul 2, 2021
Python

ivotints / Learn2Slither

A reinforcement learning project where a snake learns to navigate and survive in a dynamic environment through Q-learning.

reinforcement-learning neural-network tensorflow keras q-learning snake-game exploration-exploitation ai-agent

Updated Apr 16, 2025
Python

isabellahmann / active-inference-exploration

A systematic parameter study of exploration–exploitation trade-offs in an Active Inference agent under varying precision and sensory noise.

uncertainty computational-neuroscience exploration-exploitation active-inference

Updated Jan 14, 2026
Python

nagsujosh / contexto-solver-agent

Uses GloVe embeddings + bandit-style exploration/exploitation with adaptive diversification, UCB-driven cluster search, and stagnation recovery.

ucb glove-embeddings bandit-algorithms exploration-exploitation

Updated Aug 29, 2025
Python

keyvar / bandexa

PyTorch-native contextual bandits with Neural Thompson Sampling for scalable exploration and large action sets.

python pytorch thompson-sampling experimentation multi-armed-bandits online-learning contextual-bandits bayesian-linear-regression bandit-algorithms replay-buffer exploration-exploitation neural-linear two-tower

Updated Feb 7, 2026
Python

baturaysaglam / DISCOVER

Deep Intrinsically Motivated Exploration in Continuous Control

deep-reinforcement-learning actor-critic exploration-exploitation

Updated Mar 2, 2024
Python

kakaobrain / leco

Official implementation of LECO (NeurIPS'22)

reinforcement-learning exploration-exploitation

Updated May 11, 2023
Python

kochlisGit / Reinforcement-Learning-Algorithms

This project compares different reinforcement learning algorithms, including Monte Carlo, Q-learning, lambda Q-learning, epsilon-greedy variations, etc.

python reinforcement-learning monte-carlo openai-gym q-learning policy rl-agents epsilon-greedy dynamic-programming markov-chains approximation-algorithms ucb1 q-lambda exploration-exploitation thomson-sampling frozen-lake multi-bandit-army

Updated Feb 15, 2022
Python

ruqoyyasadiq / deep_RL-multi-arm-bandit-exploration

This is an implementation of the Reinforcement Learning multi-arm-bandit experiment using different exploration techniques.

reinforcement-learning reinforcement-learning-algorithms bandit-algorithms exploration-exploitation exploration-strategy

Updated Oct 4, 2021
Python

MINDS-THU / multi_agent_bandit_algorithms

This repository collects reference code for several multi-agent and distributed contextual bandit algorithms.

reinforcement-learning bandit-algorithms exploration-exploitation

Updated Nov 25, 2025
Python

fanfan45 / bandexa

🤖 Explore and optimize rewards with Bandexa, a PyTorch-native library for Neural-Linear Thompson Sampling in contextual bandits.

python pytorch thompson-sampling experimentation multi-armed-bandits online-learning contextual-bandits bayesian-linear-regression bandit-algorithms replay-buffer exploration-exploitation neural-linear two-tower

Updated Apr 19, 2026
Python

KaranAnchan / 10_Arm_Testbed

Explore the 10-Arm Testbed Simulation! 🎲 Utilize Python to test various ε-greedy strategies in a reinforcement learning environment. Visualize and compare agents' performance as they balance exploration and exploitation. Perfect for learners and enthusiasts! 🚀📊

python machine-learning reinforcement-learning decision-making epsilon-greedy multi-armed-bandit exploration-exploitation

Updated May 27, 2024
Python

ayushsi42 / mace_rl

MACE-RL is a novel framework that learns to dynamically regulate its own exploration strategy. It integrates a curiosity module, informed by episodic memory of past experiences, with a meta-learning network that monitors the agent's performance and adaptively regulates the influence of the intrinsic curiosity bonus.

rl exploration-exploitation

Updated Jan 20, 2026
Python

mbhenaff / neural-e3

deep-learning deep-reinforcement-learning model-based-rl exploration-exploitation

Updated Dec 17, 2019
Python

panxulab / LSVI-ASE

The official code release for "More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling", Reinforcement Learning Conference (RLC) 2024

reinforcement-learning thompson-sampling langevin-dynamics exploration-exploitation langevin-mc

Updated Jun 19, 2024
Python

hridayns / Research-Project-on-Reinforcement-learning

Research Thesis - Reinforcement Learning

reinforcement-learning openai-gym dqn ddqn exploration-exploitation

Updated May 22, 2019
Python

hmishfaq / LMC-LSVI

The official code release for "Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo", ICLR 2024.

reinforcement-learning thompson-sampling exploration-exploitation langevin-mc

Updated Feb 19, 2026
Python

exploration-exploitation

Here are 28 public repositories matching this topic...