Yanming W. ymwangg

21 followers · 11 following

@aws

Achievements

x2 x2

Stars

rohan-gopalam / helion_nki

Forked from pytorch/helion

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 1 Updated Jun 11, 2026

aws-neuron / nkipy

NKIPy: Rapid Prototyping on Trainium

Python 28 9 Updated Jun 19, 2026

aws-neuron / nki-library

Python 64 7 Updated Jun 2, 2026

pytorch / helion

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 890 155 Updated Jun 22, 2026

awslabs / nki-autotune

Python 17 6 Updated Jun 19, 2026

huggingface / tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Rust 10,835 1,125 Updated Jun 22, 2026

billmei / every-chatgpt-gui

Every front-end GUI client for ChatGPT, Claude, and other LLMs

3,985 274 Updated Jan 22, 2026

xlite-dev / Awesome-LLM-Inference

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,351 410 Updated Apr 20, 2026

usc-isi / PipeEdge

PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices

Python 40 28 Updated Jan 31, 2024

NVIDIA / cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,939 1,919 Updated Jun 21, 2026

NVIDIA / Model-Optimizer

A unified library of SOTA model optimization techniques like quantization, distillation, pruning, neural architecture search, speculative decoding, etc. It compresses deep learning models for downs…

Python 2,964 453 Updated Jun 22, 2026

Guangxuan-Xiao / torch-int

This repository contains integer operators on GPUs for PyTorch.

Python 235 55 Updated Sep 29, 2023

karpathy / minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,583 1,071 Updated Jul 1, 2024

Tebmer / Awesome-Knowledge-Distillation-of-LLMs

This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…

1,293 72 Updated Mar 9, 2025