Stars
All Algorithms implemented in Python
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
TensorFlow Tutorial and Examples for Beginners (supports TF v1 & v2)
ChatGLM-6B: An Open Bilingual (Chinese-English) Dialogue Language Model
Google Research
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
Pre-Training with Whole Word Masking for Chinese BERT (the Chinese BERT-wwm model series)
My blogs and code for machine learning. http://cnblogs.com/pinard
Accessible large language models via k-bit quantization for PyTorch.
The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
Implementation of Graph Convolutional Networks in TensorFlow
An open source library for deep learning end-to-end dialog systems and chatbots.
An annotated implementation of the Transformer paper.
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Transformer-related optimizations, including BERT and GPT
XLNet: Generalized Autoregressive Pretraining for Language Understanding
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
LightSeq: A High Performance Library for Sequence Processing and Generation
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
⚡LLM Zoo is a project that provides data, models, and evaluation benchmarks for large language models.⚡
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training