Stars
PTX ISA 9.1 documentation converted to searchable markdown. Includes Claude Code skill for CUDA development.
Unofficial description of the CUDA assembly (SASS) instruction sets.
A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.
My Python scripts to make high-quality figures for publications in top AI conferences and journals.
A curated list of projects related to the reMarkable tablet
CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…
A lightweight design for computation-communication overlap.
A Datacenter Scale Distributed Inference Serving Framework
A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS
FlashInfer: Kernel Library for LLM Serving
FlashMLA: Efficient Multi-head Latent Attention Kernels
GitHub mirror of the triton-lang/triton repo.
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower execution latency, and lower execution cost. Also has a simple …
Translation of C++ Core Guidelines [https://github.com/isocpp/CppCoreGuidelines] into Simplified Chinese.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Make a personal website using Notion and GitHub Pages
CUDA Templates and Python DSLs for High-Performance Linear Algebra
Paper collection on retrieval-based (augmented) language models.
Universal cross-platform tokenizer bindings to HF tokenizers and sentencepiece
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training