This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 13,893 1,303 Updated Oct 28, 2025

Minimalist developer portfolio using Next.js 14, React, TailwindCSS, Shadcn UI and Magic UI

TypeScript 1,195 312 Updated Apr 21, 2025

🐶 Kubernetes CLI To Manage Your Clusters In Style!

Go 32,234 2,030 Updated Dec 20, 2025

Ongoing research training transformer models at scale

Python 14,659 3,403 Updated Dec 21, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,921 353 Updated Dec 21, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,181 120 Updated Nov 9, 2025

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,241 253 Updated Dec 20, 2025

Tile primitives for speedy kernels

Cuda 3,009 217 Updated Dec 9, 2025

KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)

Jupyter Notebook 718 102 Updated Dec 19, 2025
Python 125 6 Updated Aug 18, 2025

Fast and memory-efficient exact attention

Python 21,205 2,234 Updated Dec 20, 2025

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 50,768 4,209 Updated Dec 16, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,443 1,995 Updated Nov 1, 2025

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 1,693 218 Updated Dec 20, 2025

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python 2,551 287 Updated Dec 19, 2025

A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.

Python 2,148 243 Updated Dec 18, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,437 328 Updated Dec 19, 2025

[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

C++ 794 56 Updated Mar 6, 2025
Jupyter Notebook 154 17 Updated Mar 4, 2025

A framework for few-shot evaluation of language models.

Python 10,989 2,914 Updated Dec 18, 2025

Code for Neurips24 paper: QuaRot, an end-to-end 4-bit inference of large language models.

Python 468 54 Updated Nov 26, 2024

PyTorch native quantization and sparsity for training and inference

Python 2,582 387 Updated Dec 21, 2025

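AIInfra (AI infrastructure) refers to the AI system stack, from low-level hardware such as chips up to the upper-layer software stack, that supports training and inference of large AI models.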

Jupyter Notebook 5,472 760 Updated Dec 3, 2025

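AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.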

Jupyter Notebook 15,887 2,275 Updated Sep 3, 2025

This repository contains the training code of ParetoQ introduced in our work "ParetoQ Scaling Laws in Extremely Low-bit LLM Quantization"

Python 116 8 Updated Oct 15, 2025

Code repo for the paper "LLM-QAT Data-Free Quantization Aware Training for Large Language Models"

Python 322 25 Updated Mar 4, 2025

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.

Python 961 82 Updated Sep 4, 2024

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Python 23,175 3,036 Updated Aug 15, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 51,273 8,585 Updated Nov 12, 2025

Code for the paper "Language Models are Unsupervised Multitask Learners"

Python 24,485 5,835 Updated Aug 14, 2024