- McGill University, MILA
- Montreal, QC
- https://hmishfaq.github.io/
Stars
🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
JAX implementation of LMC-LSVI and Adam LMCDQN.
This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
A playbook for systematically maximizing the performance of deep learning models.
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
Kinetics: Rethinking Test-Time Scaling Laws
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
A little Python script to collect LaTeX sources for upload to the arXiv.
Template Makefile for ML projects in Python.
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
Solve puzzles. Improve your PyTorch.
Minimal reproduction of DeepSeek R1-Zero
verl: Volcano Engine Reinforcement Learning for LLMs
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
Recipes for training reward models for RLHF.
A high-throughput and memory-efficient inference and serving engine for LLMs
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Machine Learning Foundations: Linear Algebra, Calculus, Statistics & Computer Science