6 Updated Feb 3, 2026

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 20,162 2,095 Updated Jun 9, 2026

Kimi K2 is the large language model series developed by Moonshot AI team

10,855 852 Updated Jan 21, 2026
17 1 Updated Oct 5, 2025

[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing

Python 3,818 266 Updated Oct 17, 2025

[ICCV 2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Python 401 22 Updated Mar 2, 2026

Wan: Open and Advanced Large-Scale Video Generative Models

Python 16,256 2,858 Updated Mar 5, 2026

[ICCV 2025] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation

18 1 Updated Sep 26, 2025

PyTorch code for our paper "AdaSVD: Adaptive Singular Value Decomposition for Large Language Models"

15 Updated Mar 9, 2025

[ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models

Python 30 2 Updated Aug 5, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 12,204 1,253 Updated Nov 21, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 82,842 18,045 Updated Jun 14, 2026

📚 Collection of awesome generation acceleration resources.

399 12 Updated Jul 7, 2025

A unified inference and post-training framework for accelerated video generation.

Python 3,707 360 Updated Jun 14, 2026

Solve Visual Understanding with Reinforced VLMs

Python 5,985 380 Updated Mar 12, 2026

[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation

Python 162 27 Updated Mar 21, 2025

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Python 66,508 5,961 Updated Jun 14, 2026
Python 24 1 Updated Jul 14, 2025

PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs"

13 Updated Mar 11, 2026

[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

C++ 844 65 Updated Mar 6, 2025

Fast low-bit matmul kernels in Triton

Python 471 34 Updated May 15, 2026

Official implementation of Half-Quadratic Quantization (HQQ)

Python 943 90 Updated Feb 26, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,867 2,465 Updated Jun 14, 2026

Official Repo for CVPR 2025 paper "OSDFace: One-Step Diffusion Model for Face Restoration"

Python 280 14 Updated Dec 23, 2025

[TMLR 2024] Efficient Large Language Models: A Survey

1,260 98 Updated Jun 23, 2025

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 765 59 Updated Aug 6, 2025