Zhiteng Li ZHITENGLI

Master Student @ CSE, SJTU. Research Interest: LLM/VLM/DiT Compression and Acceleration.

33 followers · 20 following

Shanghai Jiao Tong University
Shanghai
https://zhitengli.github.io/

Stars

guangshuoqin / VEQ

6 stars · Updated Feb 3, 2026

XIANGLONGYAN / PT2-LLM

13 stars · 1 fork · Updated Mar 11, 2026

ZTA2785 / Quant-dLLM

11 stars · 1 fork · Updated Oct 8, 2025

deepseek-ai / DeepSeek-V3.2-Exp

Python · 1,603 stars · 176 forks · Updated Nov 18, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python · 20,162 stars · 2,095 forks · Updated Jun 9, 2026

MoonshotAI / Kimi-K2

Kimi K2 is the large language model series developed by Moonshot AI team

10,855 stars · 852 forks · Updated Jan 21, 2026

lhxcs / DVD-Quant

17 stars · 1 fork · Updated Oct 5, 2025

ali-vilab / VACE

[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing

Python · 3,818 stars · 266 forks · Updated Oct 17, 2025

Shenyi-Z / TaylorSeer

[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Python · 401 stars · 22 forks · Updated Mar 2, 2026

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python · 16,256 stars · 2,858 forks · Updated Mar 5, 2026

JunyiWuCode / QuantCache

[ICCV 2025] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation

18 stars · 1 fork · Updated Sep 26, 2025

ZHITENGLI / AdaSVD

PyTorch code for our paper "AdaSVD: Adaptive Singular Value Decomposition for Large Language Models"

15 stars · Updated Mar 9, 2025

ZHITENGLI / ARB-LLM

[ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models

Python · 30 stars · 2 forks · Updated Aug 5, 2025

Tencent-Hunyuan / HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python · 12,204 stars · 1,253 forks · Updated Nov 21, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 82,842 stars · 18,045 forks · Updated Jun 14, 2026

xuyang-liu16 / Awesome-Generation-Acceleration

📚 Collection of awesome generation acceleration resources.

399 stars · 12 forks · Updated Jul 7, 2025

hao-ai-lab / FastVideo

A unified inference and post-training framework for accelerated video generation.

Python · 3,707 stars · 360 forks · Updated Jun 14, 2026

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python · 5,985 stars · 380 forks · Updated Mar 12, 2026

thu-nics / ViDiT-Q

[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation

Python · 162 stars · 27 forks · Updated Mar 21, 2025

unslothai / unsloth

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Python · 66,508 stars · 5,961 forks · Updated Jun 14, 2026

Kai-Liu001 / BiMaCoSR

Python · 24 stars · 1 fork · Updated Jul 14, 2025

XIANGLONGYAN / PBS2P

PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs"

13 stars · Updated Mar 11, 2026

mit-han-lab / omniserve

[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

C++ · 844 stars · 65 forks · Updated Mar 6, 2025

dropbox / gemlite

Fast low-bit matmul kernels in Triton

Python · 471 stars · 34 forks · Updated May 15, 2026

dropbox / hqq

Official implementation of Half-Quadratic Quantization (HQQ)

Python · 943 stars · 90 forks · Updated Feb 26, 2026

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python · 13,867 stars · 2,465 forks · Updated Jun 14, 2026

jkwang28 / OSDFace

Official Repo for CVPR 2025 paper "OSDFace: One-Step Diffusion Model for Face Restoration"

Python · 280 stars · 14 forks · Updated Dec 23, 2025

deepseek-ai / DeepSeek-V3

Python · 103,753 stars · 16,731 forks · Updated Aug 28, 2025

AIoT-MLSys-Lab / Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey

1,260 stars · 98 forks · Updated Jun 23, 2025

microsoft / BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python · 765 stars · 59 forks · Updated Aug 6, 2025