Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 1,979 220 Updated Nov 5, 2025

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,658 187 Updated Jun 25, 2024

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,841 245 Updated Nov 4, 2025

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 2,212 182 Updated Mar 27, 2024

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 41,151 5,213 Updated Jun 27, 2024

⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡

Python 2,942 200 Updated Nov 26, 2023

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Python 18,941 1,877 Updated Jul 15, 2025

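xk-time is a toolkit for time conversion, time calculation, time formatting, time parsing, calendars, time cron expressions, and time NLP. It uses Java 8 (JSR-310), is thread-safe and easy to use, provides more than 70 common date formatting templates, supports the Java 8 time classes and Date, and is lightweight with no third-party dependencies.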

Java 336 86 Updated Sep 22, 2024

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Python 1,541 186 Updated Jul 12, 2024

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

Python 492 74 Updated Dec 30, 2022

A Chinese NLP Preprocessing & Parsing Package: accurate, efficient, and easy to use. www.jionlp.com

Python 3,748 441 Updated Oct 30, 2025

Time semantics recognition for Chinese sentences: analyzes a Chinese sentence to identify the times mentioned in it.

Java 653 184 Updated Dec 17, 2023

Accessible large language models via k-bit quantization for PyTorch.

Python 7,720 791 Updated Nov 4, 2025

ONNX-TensorRT: TensorRT backend for ONNX

C++ 3,165 543 Updated Sep 8, 2025

Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, and word importance

C++ 3,974 594 Updated May 25, 2021

Transformer related optimization, including BERT, GPT

C++ 6,342 920 Updated Mar 27, 2024

This repository contains the code for "Generating Datasets with Pretrained Language Models".

Python 189 24 Updated Aug 17, 2021

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,609 531 Updated Oct 16, 2024

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Python 3,143 642 Updated Jan 22, 2024

Implementation of RETRO, DeepMind's Retrieval based Attention net, in PyTorch

Python 875 110 Updated Oct 30, 2023

Google Research

Jupyter Notebook 36,660 8,225 Updated Oct 30, 2025

A small Python library for retrofitting autoregressive decoder transformers to use DeepMind's Retro framework: https://arxiv.org/pdf/2112.04426.pdf

Jupyter Notebook 6 2 Updated Jan 5, 2022

A Keras TensorFlow 2.0 implementation of BERT, ALBERT and adapter-BERT.

Python 810 196 Updated Jan 13, 2023
Python 44 9 Updated Jul 14, 2021

Easy and Efficient Transformer: Scalable Inference Solution for Large NLP Models

Python 264 44 Updated Nov 30, 2024

[ICML'21 Oral] I-BERT: Integer-only BERT Quantization

Python 260 42 Updated Jan 29, 2023

Block-sparse primitives for PyTorch

Python 160 23 Updated Apr 5, 2021

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 18,282 3,532 Updated Nov 5, 2025

Bolt is a deep learning library with high performance and heterogeneous flexibility.

C++ 953 162 Updated Apr 11, 2025

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,301 333 Updated May 16, 2023