32 results for source starred repositories

A curated list of projects related to the reMarkable tablet

7,223 247 Updated Feb 2, 2026

CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…

MLIR 825 60 Updated Jan 14, 2026

A lightweight design for computation-communication overlap.

Python 219 10 Updated Jan 20, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,052 846 Updated Feb 8, 2026

A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS

251 12 Updated May 6, 2025

FlashInfer: Kernel Library for LLM Serving

Python 4,913 698 Updated Feb 8, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,458 981 Updated Feb 6, 2026

Makefile tutorial

HTML 302 35 Updated Mar 4, 2024

GitHub mirror of the triton-lang/triton repo.

MLIR 128 37 Updated Feb 8, 2026

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

C++ 72 8 Updated Sep 15, 2025

Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower execution latency, and lower execution cost. Also has a simple …

Python 269 32 Updated May 16, 2025

Translation of C++ Core Guidelines [https://github.com/isocpp/CppCoreGuidelines] into Simplified Chinese.

2,523 348 Updated Dec 22, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 2,120 172 Updated Jan 29, 2026

Make a personal website using Notion and GitHub Pages

Shell 142 66 Updated Oct 27, 2023

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,241 1,664 Updated Feb 4, 2026

An Attention Superoptimizer

C++ 22 Updated Jan 20, 2025

MLX: An array framework for Apple silicon

C++ 23,836 1,500 Updated Feb 8, 2026

Paper collection on retrieval-based (augmented) language models.

232 12 Updated May 24, 2024

Papers and their code for AI systems

347 23 Updated Dec 13, 2025

Universal cross-platform tokenizers binding to HF and sentencepiece

C++ 451 111 Updated Jan 23, 2026

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,859 248 Updated Feb 7, 2026

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 53,383 6,229 Updated Sep 18, 2024

📝 My blog / notes

245 34 Updated Sep 22, 2022

Quick, visual, principled introduction to PyTorch code through five Colab notebooks.

Jupyter Notebook 460 71 Updated Jan 13, 2025

A repo illustrating the usage of transformers in Chinese

Shell 3,099 497 Updated Aug 18, 2024

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

C++ 1,265 128 Updated Aug 12, 2024