Lichang-Chen

Lichang Chen (Lichang-Chen)

38 followers · 21 following

University of Maryland
College Park
lichang-chen.github.io

Stars

SakanaAI / robust-kbench

Python 96 11 Updated Nov 22, 2025

UCB-ADRS / ADRS

AI-Driven Research Systems (ADRS)

Jupyter Notebook 142 23 Updated Dec 17, 2025

Dramwig / FlowLine

Automated tool for running Python programs in a streamlined manner

JavaScript 390 24 Updated Jan 12, 2026

Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

Python 1,285 191 Updated Feb 9, 2021

Rhyme0730 / CS234-Reinforcement-Learning

This repo mainly contains the coding problems from the CS234 (Spring 2024) assignments

Python 60 22 Updated Feb 4, 2025

ksang / cs234-assignments

Stanford CS234: Reinforcement Learning assignments and practices

Python 63 11 Updated Jul 31, 2024

microsoft / PhiCookBook

This is a Phi Family of SLMs book for getting started with Phi Models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…

Jupyter Notebook 3,737 497 Updated May 13, 2026

gmongaras / PPO_CartPole

Using PPO, I am attempting to solve the cartpole environment

Python 1 Updated Jan 20, 2022

stanford-cs336 / assignment1-basics

Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch

Python 1,620 2,073 Updated Apr 7, 2026

shreyansh26 / LLM-Training-Puzzles-Solutions

The LLM Training Puzzles by Sasha Rush

Jupyter Notebook 4 Updated Jul 8, 2023

TIGER-AI-Lab / verl-tool

A version of verl to support diverse tool use

Python 982 80 Updated Mar 2, 2026

srush / Autodiff-Puzzles

Jupyter Notebook 502 44 Updated Oct 18, 2024

newfacade / LeetCodeDataset

LeetCode Training and Evaluation Dataset

Python 49 2 Updated Apr 22, 2025

ganler / code-r1

Reproducing R1 for Code with Reliable Rewards

Python 309 20 Updated May 5, 2025

open-lm-engine / lm-engine

LM engine is a library for pretraining/finetuning LLMs

Python 171 29 Updated May 17, 2026

bytedance / MegaTTS3

Python 6,084 471 Updated Aug 29, 2025

srush / LLM-Training-Puzzles

What would you do with 1000 H100s...

Jupyter Notebook 1,172 72 Updated Jan 10, 2024

srush / GPU-Puzzles

Solve puzzles. Learn CUDA.

Jupyter Notebook 12,151 935 Updated Sep 1, 2024

gpu-mode / Triton-Puzzles

Puzzles for learning Triton

Jupyter Notebook 2,440 229 Updated Apr 1, 2026

RLHFlow / Self-rewarding-reasoning-LLM

Recipes to train the self-rewarding reasoning LLMs.

Python 232 14 Updated Mar 2, 2025

srush / Tensor-Puzzles

Solve puzzles. Improve your pytorch.

Jupyter Notebook 4,054 367 Updated Jul 15, 2024

chiphuyen / ml-interviews-book

https://huyenchip.com/ml-interviews-book/

HTML 4,624 669 Updated Mar 21, 2025

RLHFlow / Online-DPO-R1

Codebase for Iterative DPO Using Rule-based Rewards

Python 271 34 Updated Apr 11, 2025

alirezadir / Machine-Learning-Interviews

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 8,272 1,473 Updated Nov 28, 2025

MingLiiii / Layer_Gradient

[ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Python 76 3 Updated Jun 25, 2025

srush / awesome-o1

A bibliography and survey of the papers surrounding o1

TeX 1,214 51 Updated Nov 16, 2024

RLHFlow / RLHF-Reward-Modeling

Recipes to train reward model for RLHF.

Python 1,531 109 Updated Apr 24, 2025

google-deepmind / open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

C++ 5,224 1,131 Updated May 16, 2026

Lichang-Chen / ODIN

ODIN: Disentangled Reward Mitigates Hacking in RLHF (ICML 2024)

Python 6 Updated Sep 5, 2024

dair-ai / ml-visuals

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

17,199 1,560 Updated Feb 13, 2023