12 results for forked starred repositories

Verifiers for LLM Reinforcement Learning

Python 79 11 Updated Apr 15, 2025
Jupyter Notebook 22 4 Updated Apr 23, 2024

Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.

Python 75 16 Updated Aug 17, 2024

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 2,187 365 Updated Aug 14, 2025
Python 3 Updated Mar 14, 2023

Locally run an Instruction-Tuned Chat-Style LLM

C 10,198 879 Updated Apr 19, 2023

A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick

Python 293 25 Updated Nov 25, 2023

MATE: the Multi-Agent Tracking Environment.

Python 48 12 Updated Mar 31, 2023

18.337 - Parallel Computing and Scientific Machine Learning

Jupyter Notebook 244 45 Updated Apr 24, 2023

A collection of MuJoCo based environments.

Python 20 5 Updated Nov 30, 2020

Code for the CIKM 2020 paper "S3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization"

Python 261 35 Updated Nov 22, 2020

Padavan

C 1,715 1,594 Updated Aug 4, 2024