Here, there, everywhere
- London
- https://scholar.google.com/citations?user=KYmFMxsAAAAJ&hl=en
Stars
The simplest, fastest repository for training/finetuning medium-sized GPTs.
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Our library for RL environments + evals
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.
This repo contains the Hugging Face Deep Reinforcement Learning Course.
Python code, PDFs and resources for the series of posts on Reinforcement Learning which I published on my personal blog
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
Whisper realtime streaming for long speech-to-text transcription and translation
An interactive dashboard to display Formula 1 data and statistics
A dataset focused on summarization of dialogs, which represents the rich domain of Twitter customer care conversations
A tool for generating .pex (Python EXecutable) files, lock files and venvs.
Simple transformer implementation from scratch in PyTorch. (archival, latest version on Codeberg)
I will build a Transformer from scratch
nannyml: post-deployment data science in Python
Implementation of Bayesian Hyperparameter Optimization of Machine Learning Algorithms
Uplift modeling and causal inference with machine learning algorithms
Distributed Asynchronous Hyperparameter Optimization in Python