jason-huang03
  • Tsinghua University
  • Beijing, China

Organizations

@thu-nics @thu-ml

269 results for source starred repositories

🚀🚀 Efficient implementations of Native Sparse Attention

Python 994 8 Updated Sep 29, 2025

A powerful toolkit for compressing large models including LLM, VLM, and video generation models.

Python 612 62 Updated Nov 5, 2025

High-throughput tensor loading for PyTorch

Python 194 11 Updated Oct 27, 2025

Development repository for the Triton language and compiler

MLIR 17,502 2,365 Updated Nov 7, 2025

Propositions of solutions to the exercises from Terence Tao's textbooks, Analysis I & II. Mirrored from https://gitlab.com/f-santos/taoanalysissolutions

TeX 97 11 Updated Jan 17, 2023

Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Large Language Models.

Python 72 14 Updated Oct 1, 2025

NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer

Cuda 142 11 Updated Sep 18, 2025

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 433 72 Updated Nov 7, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,399 244 Updated Nov 7, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,553 84 Updated Nov 4, 2025

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 395 8 Updated Nov 7, 2025

Light Video Generation Inference Framework

Python 769 48 Updated Nov 7, 2025

Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

Python 916 36 Updated Oct 14, 2025

CUDA Kernel Benchmarking Library

Cuda 762 90 Updated Oct 21, 2025
Python 120 6 Updated Aug 18, 2025

青稞Talk (Qingke Talk)

159 1 Updated Nov 5, 2025

Hands-On Practical MLIR Tutorial

C++ 647 93 Updated Oct 20, 2023

DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.

Python 66 4 Updated Nov 5, 2025
C++ 311 27 Updated Nov 6, 2025
Python 9 Updated Jul 25, 2025

This is a repo to track the latest autoregressive visual generation papers.

409 5 Updated Jun 25, 2025

😼 Elegantly use a clash/mihomo-based proxy environment

Shell 5,591 709 Updated Nov 7, 2025

CUDA on non-NVIDIA GPUs

Rust 13,386 848 Updated Nov 6, 2025

The missing star history graph of GitHub repos - https://star-history.com

TypeScript 8,027 302 Updated Nov 7, 2025

Distributed query engine providing simple and reliable data processing for any modality and scale

Rust 4,676 330 Updated Nov 7, 2025

A compiler for the SYSY language (a subset of C). My homework for the course "compiler principles"

C++ 8 Updated Aug 6, 2024

NanoGPT (124M) in 3 minutes

Python 3,774 489 Updated Nov 6, 2025

A Quirky Assortment of CuTe Kernels

Python 650 58 Updated Oct 30, 2025