ZhangYunchenY

Starred repositories

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,353 91 Updated Dec 19, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,923 918 Updated Dec 15, 2025

[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLM, VLM, and video generation models.

Python 643 64 Updated Nov 19, 2025

Dynamically get the suggested clusters in the data for unsupervised learning.

Rust 225 50 Updated Jul 31, 2024

UP-TO-DATE LLM Watermark paper. 🔥🔥🔥

367 20 Updated Dec 12, 2024

Parallelformers: An Efficient Model Parallelization Toolkit for Deployment

Python 792 61 Updated Apr 24, 2023

Fast inference from large language models via speculative decoding

Python 868 93 Updated Aug 22, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,795 290 Updated Dec 19, 2025

Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...

2,176 178 Updated Apr 30, 2025

🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code

Python 2,824 231 Updated Jun 23, 2023

CLI platform to experiment with codegen. Precursor to: https://lovable.dev

Python 55,122 7,343 Updated May 14, 2025

Python 3,389 282 Updated Sep 29, 2023

A Chinese-language introductory tutorial for LangChain

8,688 682 Updated Apr 19, 2025

🦜🔗 The platform for reliable agents.

Python 122,268 20,158 Updated Dec 19, 2025

Build a Flutter Q&A bot of Flutter Docs Site (https://docs.flutter.dev/)

Python 15 4 Updated May 13, 2023

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,393 284 Updated Jul 17, 2025

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 2,081 122 Updated Jun 1, 2023

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,789 869 Updated Jun 10, 2024

[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333

Python 1,136 85 Updated Jan 11, 2024

Aligning pretrained language models with instruction data generated by themselves.

Python 4,548 524 Updated Mar 27, 2023

This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.

Python 551 44 Updated Mar 10, 2024

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 17,442 2,853 Updated Nov 3, 2025

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Python 7,879 567 Updated Jul 11, 2025

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 20,306 2,131 Updated Dec 18, 2025

Open-source keyboard firmware for Atmel AVR and Arm USB families

C 19,946 43,196 Updated Dec 19, 2025

Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.

Python 585 47 Updated Mar 25, 2023

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,323 4,780 Updated Jun 2, 2025

Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023]

Shell 37 6 Updated Aug 27, 2024