A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.

Python 1,139 151 Updated Oct 1, 2024

openai / InfoGAN

Code for reproducing key results in the paper "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets"

Python 1,072 305 Updated Mar 25, 2021

jondurbin / airoboros

Customizable implementation of the self-instruct paper.

Python 1,051 67 Updated Mar 7, 2024

Xwin-LM / Xwin-LM

Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment

Python 1,042 44 Updated May 31, 2024

ray-project / llmperf

LLMPerf is a library for validating and benchmarking LLMs

Python 1,039 191 Updated Dec 9, 2024

trotsky1997 / MathBlackBox

Python 1,035 108 Updated Dec 17, 2024

bilibili / Index-1.9B

A lightweight multilingual LLM

Python 1,004 48 Updated Aug 8, 2025

pjlab-sys4nlp / llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

Python 995 62 Updated Dec 6, 2024

airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".

Python 963 160 Updated Oct 22, 2022

feifeibear / LLMSpeculativeSampling

Fast inference from large language models via speculative decoding

Python 848 89 Updated Aug 22, 2024

soskek / bookcorpus

Crawl BookCorpus

Python 846 109 Updated Jul 14, 2023

buppt / ChineseNRE

Chinese entity relation extraction with PyTorch, using BiLSTM + attention

Python 766 179 Updated Nov 13, 2021

fastnlp / fastHan

fastHan is a Chinese natural language processing toolkit built on fastNLP and PyTorch, as convenient to call as spaCy.

Python 759 88 Updated Dec 9, 2023

RUCAIBox / Slow_Thinking_with_LLMs

A series of technical reports on slow thinking with LLMs

Python 743 41 Updated Aug 13, 2025

cszn / DPIR

Plug-and-Play Image Restoration with Deep Denoiser Prior (IEEE TPAMI 2021) (PyTorch)

Python 724 110 Updated Nov 21, 2022

deepseek-ai / ESFT

Expert Specialized Fine-Tuning

Python 708 260 Updated May 22, 2025

ottokart / punctuator2

A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text

Python 681 196 Updated Sep 19, 2021

THUDM / ReST-MCTS

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 676 50 Updated Jan 20, 2025

sinovation / ZEN

A BERT-based Chinese Text Encoder Enhanced by N-gram Representations

Python 649 106 Updated Jul 24, 2022

hpcaitech / EnergonAI

Large-scale model inference.

Python 629 86 Updated Sep 12, 2023

mahmoudnafifi / Deep_White_Balance

Reference code for the paper: Deep White-Balance Editing (CVPR 2020). Our method is a deep learning multi-task framework for white-balance editing.

Python 589 70 Updated Jul 5, 2023

CR-Gjx / LeakGAN

Code for the AAAI 2018 paper "Long Text Generation via Adversarial Training with Leaked Information". Text generation using GAN and Hierarchical Reinforcement Learning.

Python 576 183 Updated Jul 2, 2022

hahnyuan / LLM-Viewer

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 573 70 Updated Sep 11, 2024

Zhuang Liu zhuango

Stars

lsdefine / simple_GRPO

bigscience-workshop / Megatron-DeepSpeed

SamLynnEvans / Transformer

Open-Source-O1 / Open-O1

NVIDIA-NeMo / Curator

lucidrains / performer-pytorch

AimeeLee77 / keyword_extraction

ericyangyu / PPO-for-Beginners