kayzee3327
🎯 Focusing

Starred repositories

📖 A Chinese translation of "C++ Concurrency in Action - SECOND EDITION".

2,321 457 Updated Jan 26, 2021

A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.

Go 105,820 14,985 Updated Mar 29, 2026

CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning

Cuda 435 27 Updated Mar 30, 2026

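Build large language models from scratch with only basic Python; build GLM4, Llama3, and RWKV6 step by step from scratch to gain a deep understanding of how large models work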

Jupyter Notebook 4,073 564 Updated Mar 26, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 90,597 13,900 Updated Apr 11, 2026

Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 923 114 Updated Apr 9, 2026

Official PyTorch implementation for "Large Language Diffusion Models"

Python 3,716 255 Updated Nov 12, 2025

paper list, tutorial, and nano code snippet for Diffusion Large Language Models.

Jupyter Notebook 164 9 Updated Jan 19, 2026

[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

761 37 Updated Feb 28, 2026

A curated list for Efficient Large Language Models

Python 1,980 159 Updated Jun 17, 2025

From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓

3,588 202 Updated May 7, 2025

📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

Python 534 26 Updated Mar 19, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,141 361 Updated Apr 9, 2026

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,561 1,783 Updated Apr 9, 2026

Mirror of the Xen Repository (PRs not accepted see: http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches)

C 800 385 Updated Apr 10, 2026

Tsinghua University operating systems course labs (OS Kernel Labs)

C 2,239 453 Updated Aug 26, 2022

OS Labs for MOOC

C 417 199 Updated Sep 28, 2014

100 Days of ML Coding

50,447 11,437 Updated Dec 29, 2023

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,289 15,488 Updated Apr 12, 2026

The Rust Programming Language (2024 edition, work complete)

Markdown 5,416 727 Updated Mar 28, 2026

Let's write an OS which can run on RISC-V in Rust from scratch!

Rust 2,011 546 Updated Mar 30, 2026

2023 Autumn/Winter Open-Source Operating System Training Camp

1 Updated Apr 13, 2025

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 120,879 13,246 Updated Apr 10, 2026

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 9,069 2,312 Updated Mar 30, 2026

Material for gpu-mode lectures

Jupyter Notebook 5,945 600 Updated Feb 1, 2026

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 10,249 1,039 Updated Apr 12, 2026

Several simple examples for popular neural network toolkits calling custom CUDA operators.

Python 1,530 204 Updated Apr 29, 2021

Machine Learning Engineering Open Book

Python 17,670 1,121 Updated Mar 16, 2026