kayzee3327
🎯 Focusing

Starred repositories

📖 A Chinese translation of "C++ Concurrency in Action - SECOND EDITION".

2,323 457 Updated Jan 26, 2021

A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.

Go 105,840 14,984 Updated Mar 29, 2026

CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning

Cuda 436 27 Updated Mar 30, 2026

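Build large language models from scratch with only basic Python; build GLM4, Llama3, and RWKV6 step by step from scratch to gain a deep understanding of how large models work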

Jupyter Notebook 4,078 564 Updated Mar 26, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 90,637 13,906 Updated Apr 11, 2026

Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 925 114 Updated Apr 9, 2026

Official PyTorch implementation for "Large Language Diffusion Models"

Python 3,718 256 Updated Nov 12, 2025

paper list, tutorial, and nano code snippet for Diffusion Large Language Models.

Jupyter Notebook 164 9 Updated Jan 19, 2026

[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

761 37 Updated Feb 28, 2026

A curated list for Efficient Large Language Models

Python 1,983 159 Updated Jun 17, 2025

From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓

3,590 202 Updated May 7, 2025

📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

Python 534 26 Updated Mar 19, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 5,140 360 Updated Apr 9, 2026

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,564 1,784 Updated Apr 9, 2026

Mirror of the Xen Repository (PRs not accepted see: http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches)

C 800 385 Updated Apr 10, 2026

Tsinghua University operating systems course labs (OS Kernel Labs)

C 2,239 453 Updated Aug 26, 2022

OS Labs for MOOC

C 417 199 Updated Sep 28, 2014

100 Days of ML Coding

50,489 11,441 Updated Dec 29, 2023

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,371 15,513 Updated Apr 13, 2026

The Rust Programming Language (2024 edition, work completed)

Markdown 5,419 727 Updated Mar 28, 2026

Let's write an OS which can run on RISC-V in Rust from scratch!

Rust 2,012 546 Updated Mar 30, 2026

2023 Autumn/Winter Open Source Operating System Training Camp

1 Updated Apr 13, 2025

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 120,922 13,250 Updated Apr 13, 2026

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 9,072 2,312 Updated Mar 30, 2026

Material for gpu-mode lectures

Jupyter Notebook 5,948 600 Updated Feb 1, 2026

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 10,256 1,041 Updated Apr 12, 2026

Several simple examples for popular neural network toolkits calling custom CUDA operators.

Python 1,530 204 Updated Apr 29, 2021

Machine Learning Engineering Open Book

Python 17,681 1,122 Updated Mar 16, 2026