Starred repositories

143 results for source starred repositories

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,466 95 Updated Feb 6, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,448 981 Updated Jan 20, 2026

[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

Python 675 69 Updated Nov 19, 2025

Dynamically get the suggested clusters in the data for unsupervised learning.

Rust 227 50 Updated Jul 31, 2024

UP-TO-DATE LLM Watermark paper. 🔥🔥🔥

371 20 Updated Dec 12, 2024

Parallelformers: An Efficient Model Parallelization Toolkit for Deployment

Python 791 61 Updated Apr 24, 2023

Fast inference from large language models via speculative decoding

Python 886 96 Updated Aug 22, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,882 298 Updated Feb 5, 2026

Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...

2,197 185 Updated Apr 30, 2025

🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code

Python 2,823 231 Updated Jun 23, 2023

CLI platform to experiment with codegen. Precursor to: https://lovable.dev

Python 55,208 7,338 Updated May 14, 2025

A Chinese-language introductory tutorial for LangChain

8,795 695 Updated Apr 19, 2025

🦜🔗 The platform for reliable agents.

Python 126,042 20,725 Updated Feb 6, 2026

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,432 292 Updated Jul 17, 2025

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 2,090 122 Updated Jun 1, 2023

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,833 872 Updated Jun 10, 2024

[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333

Python 1,143 85 Updated Jan 11, 2024

Aligning pretrained language models with instruction data generated by themselves.

Python 4,571 525 Updated Mar 27, 2023

This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.

Python 551 44 Updated Mar 10, 2024

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 17,650 2,877 Updated Nov 3, 2025

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Python 7,926 568 Updated Jul 11, 2025

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 20,591 2,172 Updated Feb 3, 2026

Open-source keyboard firmware for Atmel AVR and Arm USB families

C 20,103 43,468 Updated Feb 5, 2026

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,392 4,777 Updated Jun 2, 2025

Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023]

Shell 38 7 Updated Aug 27, 2024

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Python 9,919 591 Updated Sep 7, 2024

C++ implementation for BLOOM

C 809 58 Updated May 13, 2023

Tensor library for machine learning

C++ 13,916 1,459 Updated Jan 30, 2026