View xinhaoc's full-sized avatar

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 53,809 6,301 Updated Sep 18, 2024

MLX: An array framework for Apple silicon

C++ 24,873 1,616 Updated Mar 30, 2026

A curated list of awesome READMEs

20,655 3,936 Updated Mar 10, 2026

A collection of full time roles in SWE, Quant, and PM for new grads.

16,592 1,271 Updated Mar 30, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,545 1,007 Updated Feb 6, 2026

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,505 1,756 Updated Mar 30, 2026

A curated list of projects related to the reMarkable tablet

7,317 250 Updated Mar 4, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,441 971 Updated Mar 30, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,237 833 Updated Mar 30, 2026

We want to create a repo to illustrate the usage of transformers in Chinese.

Shell 3,168 501 Updated Aug 18, 2024

Translation of C++ Core Guidelines [https://github.com/isocpp/CppCoreGuidelines] into Simplified Chinese.

2,546 348 Updated Dec 22, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 2,179 186 Updated Mar 30, 2026

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,870 248 Updated Mar 25, 2026

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

C++ 1,304 128 Updated Mar 30, 2026

CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…

C++ 884 67 Updated Mar 24, 2026

My Python scripts to make high-quality figures for publications in top AI conferences and journals.

Python 725 55 Updated Mar 18, 2026

Universal cross-platform tokenizers binding to HF and sentencepiece

C++ 471 116 Updated Feb 20, 2026

A quick, visual, principled introduction to PyTorch code through five Colab notebooks.

Jupyter Notebook 471 75 Updated Jan 13, 2025

Papers and their code for AI systems

356 23 Updated Feb 10, 2026

Makefile tutorial

HTML 301 35 Updated Mar 4, 2024

Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower execution latency, and lower execution cost. Also has a simple …

Python 273 30 Updated May 16, 2025

A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.

Python 254 18 Updated Mar 28, 2026

A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS

254 13 Updated May 6, 2025

📝 My blog / notes

245 34 Updated Sep 22, 2022

A collection of papers on retrieval-based (augmented) language models.

232 12 Updated May 24, 2024

A lightweight design for computation-communication overlap.

Python 225 15 Updated Jan 20, 2026

Unofficial description of the CUDA assembly (SASS) instruction sets.

Python 207 19 Updated Jul 18, 2025