Thompson Sampling for Bandits using UCB policy
Updated Jul 29, 2017 - Python
Implementations of basic concepts covered under the Reinforcement Learning umbrella. This project is a collection of assignments for CS747: Foundations of Intelligent and Learning Agents (Autumn 2017) at IIT Bombay.
R.I.T project
Foundations Of Intelligent Learning Agents (FILA) Assignments
Code and templates for ML algorithms, created, modified, and optimized in Python and R.
We implemented Monte Carlo Tree Search (MCTS) from scratch and successfully applied it to the game of Tic-Tac-Toe.
We compare different policies for the checkers game using reinforcement learning algorithms.
Python package for the Unity Cloud Build API
Structure and Interpretation of Computer Programs
Multi-Armed Bandits implementation using the Jester Dataset
Author's implementation of the paper Correlated Age-of-Information Bandits.
My programs during CS747 (Foundations of Intelligent and Learning Agents) Autumn 2021-22
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems
Repository for the course project done as part of CS-747 (Foundations of Intelligent & Learning Agents) at IIT Bombay in Autumn 2022.
Multi-armed bandit algorithm with TensorFlow and 11 policies
Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.
Complete Tutorial Guide with Code for learning ML
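For readers unfamiliar with the two algorithms named in the listings above, here is a minimal sketch of UCB1 and Epsilon-Greedy on a simulated bandit. It is not taken from any of the listed repositories. The `BernoulliBandit` environment, the `run` helper, the arm probabilities, and the default `epsilon=0.1` are all assumptions made for illustration.

```python
import math
import random


class BernoulliBandit:
    """Hypothetical test environment: each arm pays 1 with a fixed probability."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)
        self.rng = random.Random(seed)

    def pull(self, arm):
        return 1.0 if self.rng.random() < self.probs[arm] else 0.0


def epsilon_greedy(counts, means, rng, epsilon=0.1):
    """Explore a uniformly random arm with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(counts)), key=lambda a: means[a])


def ucb1(counts, means, rng=None):
    """Pick the arm with the highest empirical mean plus confidence bonus."""
    # Play every arm once before the bonus term is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda a: means[a] + math.sqrt(2.0 * math.log(total) / counts[a]),
    )


def run(policy, probs, horizon=5000, seed=0):
    """Play `horizon` rounds and return (total reward, pull counts per arm)."""
    env = BernoulliBandit(probs, seed=seed)
    rng = random.Random(seed + 1)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)
    total_reward = 0.0
    for _ in range(horizon):
        arm = policy(counts, means, rng)
        reward = env.pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean for the chosen arm.
        means[arm] += (reward - means[arm]) / counts[arm]
        total_reward += reward
    return total_reward, counts
```

Calling `run(ucb1, [0.2, 0.8])` or `run(epsilon_greedy, [0.2, 0.8])` returns the total reward and the number of pulls per arm. Both policies share the same signature so they can be swapped in the simulation loop; the `rng` argument is unused by `ucb1` because that policy is deterministic given the counts and means.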