TokenSpeed is a speed-of-light LLM inference engine.

Python 1,538 183 Updated Jul 2, 2026

High Performance LLM Inference Operator Library

C++ 989 102 Updated Jul 2, 2026

A Triton JIT runtime and ffi provider in C++

C++ 37 18 Updated Jun 30, 2026

Efficient Triton Kernels for LLM Training

Python 6,476 549 Updated Jul 2, 2026

Towards Holistic evaluation of Generative Diffusion Transformers!

Python 91 4 Updated Jul 1, 2026

An LLM post-training framework with vLLM for RL Scaling

Python 318 42 Updated Jul 2, 2026

Skills for writing tilelang and debugging with CUDA toolkits.

Python 127 5 Updated May 20, 2026

Open Neural Network Exchange to C compiler.

C 400 76 Updated Apr 23, 2026

Large Language Model (LLM) Systems Paper List

2,157 113 Updated Jun 21, 2026

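Master OpenClaw from scratch: the most comprehensive Chinese-language tutorial, covering installation, configuration, hands-on case studies, and a guide to avoiding pitfalls (GitHub edition)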

Shell 4,512 678 Updated Jun 19, 2026

claude code profiler

Python 7 Updated Mar 6, 2026

TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)

C++ 73 26 Updated Jul 1, 2026

let coding agents use ncu skills analysis cuda program automatically!

Shell 116 8 Updated May 25, 2026

Glamourous agentic coding for all 💘

Go 25,972 1,894 Updated Jul 2, 2026

[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Python 3,898 260 Updated Mar 7, 2026

Fast and memory efficient c++ flat hash table/map/set

C++ 721 73 Updated Jul 2, 2026

Claude Code agent skill for autonomous AI/scientific research workflows

HTML 12 Updated Mar 5, 2026

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 10,314 770 Updated Jun 16, 2026

KsanaDiT: High-Performance DiT (Diffusion Transformer) Inference Framework for Video & Image Generation

Python 59 6 Updated May 13, 2026

from vibe coding to agentic engineering - practice makes claude perfect

HTML 61,829 6,183 Updated Jul 2, 2026

Material for gpu-mode lectures

Jupyter Notebook 6,275 629 Updated Jun 15, 2026

how to optimize some algorithm in cuda.

Cuda 3,119 283 Updated Jun 28, 2026

AI Tensor Engine for ROCm

Python 476 386 Updated Jul 2, 2026

A lightweight alternative to OpenClaw that runs in containers for security. Connects to WhatsApp, Telegram, Slack, Discord, Gmail and other messaging apps, has memory, scheduled jobs, and runs dir…

TypeScript 30,079 12,901 Updated Jul 2, 2026
Python 1 Updated Feb 25, 2026

LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model

Python 118 6 Updated Apr 28, 2026

Create beautiful slides on the web using a coding agent's frontend skills

JavaScript 24,320 1,983 Updated Jun 23, 2026
Python 5 Updated Feb 26, 2026