markwwen
🐍
Happy new year~

Highlights

  • Pro

Organizations

@SUSTech-CS-Courses


❤️Emotional First Aid Dataset, a corpus for psychological counseling Q&A and chatbots

Python 730 88 Updated May 24, 2025

[EMNLP 2024] MeChat, a Chinese-language large model for mental health dialogue

Python 504 56 Updated Nov 17, 2024

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 4,261 350 Updated Dec 19, 2025

The official repo for the paper Direct Multi-token Decoding

Python 3 Updated Oct 17, 2025

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 1,692 218 Updated Dec 20, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,087 333 Updated Dec 20, 2025

[EMNLP 2025 Main] SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning

Python 30 1 Updated Dec 2, 2025

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,800 74 Updated Jun 5, 2025

Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS)

Python 52 7 Updated Mar 14, 2025

Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

Python 46 1 Updated Dec 9, 2023

Reproductions of LLM-related algorithms, plus some study notes

Python 2,708 370 Updated Dec 15, 2025

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 6,167 568 Updated Aug 22, 2025

Summary of some awesome work for optimizing LLM inference

150 5 Updated Nov 30, 2025

[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning

Python 56 6 Updated Oct 31, 2025

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,311 78 Updated Mar 6, 2025

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 1 Updated Aug 5, 2025
Python 3 2 Updated Aug 1, 2025
Jupyter Notebook 583 25 Updated Aug 23, 2024

📰 Must-read papers and blogs on Speculative Decoding ⚡️

1,060 55 Updated Dec 11, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 2,065 232 Updated Dec 18, 2025
Python 3 Updated Jun 24, 2025
Python 26 6 Updated Jan 16, 2025

PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning

Python 567 77 Updated Nov 12, 2025

The best ChatGPT that $100 can buy.

Python 38,914 4,919 Updated Dec 9, 2025

Official Schlably Repository by the Institute for TMDT

Python 94 33 Updated Feb 23, 2023

PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".

Python 93 19 Updated May 23, 2023

Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an LLM (with low latency overhead!)

Jupyter Notebook 49 7 Updated Jun 1, 2024