View yhyang201's full-sized avatar


Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

Python 1,077 120 Updated Jun 12, 2026

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 890 155 Updated Jun 23, 2026

A compiler, optimizer and executor for financial expressions and factors

C++ 292 56 Updated May 29, 2026
Python 267 32 Updated Jun 9, 2026
Python 196 31 Updated Jun 23, 2026

Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claude Code, Codex, Cursor, Copilot, Gemini C…

TypeScript 66,525 5,520 Updated Jun 20, 2026

LLM KV cache compression made easy

Python 1,117 155 Updated Jun 22, 2026

Simple samples for TensorRT programming

Python 1,662 349 Updated May 5, 2026

TokenSpeed is a speed-of-light LLM inference engine.

Python 1,484 168 Updated Jun 23, 2026

A project to improve skills of large language models

Python 984 190 Updated Jun 22, 2026

High-performance linear attention kernel library built on TileLang

Python 556 48 Updated May 7, 2026

From Automated Idea Factory to Realization

Shell 1,177 97 Updated Jun 13, 2026

A unified library of SOTA model optimization techniques like quantization, distillation, pruning, neural architecture search, speculative decoding, etc. It compresses deep learning models for downs…

Python 2,970 453 Updated Jun 23, 2026

Run your GitHub Actions locally 🚀

Go 70,820 1,960 Updated Jun 1, 2026

CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.

Python 525 65 Updated Jun 23, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 194,195 109,906 Updated Jun 8, 2026

An agentic skills framework & software development methodology that works.

Shell 236,491 20,987 Updated Jun 23, 2026

Bash is all you need - A nano claude code–like 「agent harness」, built from 0 to 1

Python 68,013 11,063 Updated Jun 22, 2026

A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.

Python 314 23 Updated Jun 23, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 903 265 Updated Jun 22, 2026

An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation

Python 169 7 Updated Jan 15, 2024

AI agents running research on single-GPU nanochat training automatically

Python 88,230 12,770 Updated Mar 26, 2026

Automated High-Performance GPU Kernel Generation

Python 116 22 Updated Jun 1, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,710 1,063 Updated Apr 30, 2026

0 - 1 learn OpenClaw: sections to build a claw-AI agent from scratch

Python 2,979 344 Updated Mar 18, 2026

Public repository for Agent Skills

Python 154,164 18,171 Updated Jun 9, 2026
Cuda 154 20 Updated Mar 18, 2024

💖🧸 Self hosted, you-owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achieve Neuro-sama's altitude. Capable of realtime voice chat, Minec…

TypeScript 41,197 4,146 Updated Jun 23, 2026