zhuango
🎯 Focusing
  • Peking

191 stars written in Python

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)

Python 10,116 1,395 Updated Jul 15, 2025

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,116 974 Updated Jul 1, 2024

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Python 9,483 2,060 Updated Apr 16, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,785 572 Updated May 3, 2024

A collection of libraries to optimise AI model performance

Python 8,367 632 Updated Jul 22, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,330 807 Updated Oct 31, 2025

A faster PyTorch implementation of Faster R-CNN

Python 7,842 2,322 Updated May 20, 2022

Accessible large language models via k-bit quantization for PyTorch.

Python 7,726 793 Updated Nov 4, 2025

Collection of generative models (e.g. GAN, VAE) in PyTorch and TensorFlow.

Python 7,473 2,033 Updated Mar 24, 2024

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,112 391 Updated Jul 11, 2024

Example models using DeepSpeed

Python 6,709 1,109 Updated Oct 15, 2025

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 6,581 586 Updated Oct 24, 2024

Repo for external large-scale work

Python 6,547 721 Updated Apr 27, 2024

A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility

Python 6,303 1,795 Updated Aug 6, 2023

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 6,263 680 Updated Oct 24, 2025

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Python 6,177 1,167 Updated May 28, 2023

Supercharge Your LLM with the Fastest KV Cache Layer

Python 5,916 691 Updated Nov 6, 2025

Example code for the book Fluent Python, 1st Edition (O'Reilly, 2015)

Python 5,589 2,180 Updated Dec 2, 2021

OpenChat: Advancing Open-source Language Models with Imperfect Data

Python 5,439 429 Updated Sep 13, 2024

Language Technology Platform

Python 5,203 1,057 Updated Jun 2, 2025

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,983 524 Updated Apr 11, 2025

Chinese natural language processing datasets, material for everyday experiments. Additions and pull requests are welcome.

Python 4,515 794 Updated Nov 21, 2023

An Open-Source Package for Neural Relation Extraction (NRE)

Python 4,437 1,053 Updated Jan 10, 2024

Search all Chinese NLP datasets, with commonly used English NLP datasets included

Python 4,391 628 Updated Nov 21, 2022

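Repository of Chinese post-trained versions of Llama3 and Llama3.1: interesting weights from fine-tuned and modified versions, plus tutorial videos and documentation for training, inference, evaluation, and deployment.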

Python 4,165 338 Updated May 7, 2025

An Open-Source Package for Knowledge Embedding (KE)

Python 3,992 991 Updated Jan 10, 2024

A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS: large-scale Chinese pre-trained ALBERT models

Python 3,988 750 Updated Nov 21, 2022

Simple RL training for reasoning

Python 3,782 279 Updated Aug 3, 2025

Live-streamed development of RL tuning for LLM agents

Python 3,577 498 Updated Oct 8, 2025

Entropy Based Sampling and Parallel CoT Decoding

Python 3,421 327 Updated Nov 13, 2024