  • Peking University
  • Beijing


Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

Python 1,209 131 Updated May 20, 2026

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 4,146 589 Updated May 20, 2026

End-to-end benchmark using Slidesparse GEMM kernels

Python 4 1 Updated Mar 31, 2026

🧠 "Large models": train a 64M-parameter small LLM completely from scratch in just 2 hours!

Python 50,279 6,409 Updated May 19, 2026

Accessible large language models via k-bit quantization for PyTorch.

Python 8,214 854 Updated May 15, 2026

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 2,728 401 Updated May 20, 2026

Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)

Python 2,716 94 Updated Apr 25, 2023

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 104,384 13,753 Updated May 20, 2026

This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transformers library) into inference-ready formats that run efficien…

Python 89 87 Updated May 20, 2026

Piecewise-Affine Regularized Quantization

Python 19 4 Updated Feb 5, 2026
Python 43 2 Updated May 19, 2026

Official repository of paper titled "CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications"

Python 93 12 Updated Jan 15, 2026

Real-time face swap and one-click video deepfake with only a single image

Python 93,205 13,568 Updated May 20, 2026

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Python 11,075 872 Updated May 20, 2026

[CVPR 2025] Official repository for GETA

Python 42 5 Updated Nov 5, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 4,070 328 Updated May 20, 2026

[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

Python 715 80 Updated May 14, 2026

[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.

Python 180 17 Updated Apr 24, 2026

Material for gpu-mode lectures

Jupyter Notebook 6,088 613 Updated May 9, 2026

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

Python 4,414 737 Updated Mar 15, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,229 376 Updated Apr 20, 2026

An Easy-to-Use and High-Performance AI Deployment Framework

C++ 1,820 219 Updated Apr 25, 2026

PyTorch Tutorial for Deep Learning Researchers

Python 32,359 8,246 Updated Aug 15, 2023

How to learn PyTorch and OneFlow

496 30 Updated May 20, 2026

A good project for campus recruiting (autumn and spring rounds) and internships! Implement a high-performance deep learning inference library from scratch, step by step, supporting inference for models such as llama2, Unet, Yolov5, and Resnet.

C++ 3,430 363 Updated Jun 22, 2025

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

Python 878 134 Updated Mar 3, 2026

A coding-free framework built on PyTorch for reproducible deep learning studies. PyTorch Ecosystem. 🏆26 knowledge distillation methods presented at TPAMI, CVPR, ICLR, ECCV, NeurIPS, ICCV, AAAI, etc…

Python 1,616 145 Updated Mar 31, 2026

A collection of design patterns/idioms in Python

Python 42,753 7,035 Updated Mar 13, 2026

[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.

Python 3,309 382 Updated Sep 7, 2025

An official implementation of "Network Quantization with Element-wise Gradient Scaling" (CVPR 2021) in PyTorch.

Python 96 17 Updated Jul 14, 2023