aizyler

Showing results

📖 A Chinese translation of "C++ Concurrency in Action, Second Edition".

2,319 456 Updated Jan 26, 2021

RDMA core userspace libraries and daemons

C 2,175 836 Updated Mar 16, 2026

Learning Deep Representations of Data Distributions

TeX 919 90 Updated Mar 30, 2026
C 9 1 Updated Mar 12, 2026

Flash Attention from Scratch on CUDA Ampere

Assembly 158 23 Updated Sep 1, 2025

This is an implementation of flash attention from scratch, without importing any external libraries.

Cuda 22 2 Updated Mar 15, 2026
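The flash-attention entries above both rely on the same core idea: compute exact attention in tiles, carrying a running row-max and running softmax denominator so the full score matrix never has to be materialized. As a rough illustration only (not code from either repository), here is a minimal NumPy sketch of that online-softmax tiling; the function names and block size are invented for the example.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Reference: full softmax(QK^T / sqrt(d)) @ V in one pass.
    d = Q.shape[-1]
    S = Q @ K.T / np.sqrt(d)
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def flash_attention(Q, K, V, block=4):
    # Tiled exact attention with the online-softmax rescaling trick:
    # walk over K/V in blocks, keeping per-query-row running statistics.
    n, d = Q.shape
    O = np.zeros((n, d))
    m = np.full(n, -np.inf)   # running row maximum of the scores
    l = np.zeros(n)           # running softmax denominator
    for j in range(0, n, block):
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = Q @ Kj.T / np.sqrt(d)              # scores for this tile only
        m_new = np.maximum(m, S.max(axis=-1))
        P = np.exp(S - m_new[:, None])         # tile softmax numerator
        scale = np.exp(m - m_new)              # rescale previous accumulator
        l = l * scale + P.sum(axis=-1)
        O = O * scale[:, None] + P @ Vj
        m = m_new
    return O / l[:, None]
```

Because the rescaling is exact, the tiled result matches the naive computation to floating-point precision; the real CUDA kernels add tiling over queries, shared-memory staging, and warp-level reductions on top of this scheme.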

Perplexity open source garden for inference technology

Rust 386 36 Updated Dec 25, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.

Cuda 3,265 385 Updated Jan 17, 2026
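The entry above centers on quantizing the attention inputs so the score matmul runs on low-precision tensor cores. As a hedged illustration of the general idea only (not the repository's actual kernels or its specific quantization scheme), the sketch below applies symmetric per-tensor INT8 quantization to Q and K, accumulates the score matmul in INT32, and dequantizes afterward; all names here are invented for the example.

```python
import numpy as np

def quantize_int8(X):
    # Symmetric per-tensor INT8 quantization: x ~= scale * q, q in [-127, 127].
    scale = np.abs(X).max() / 127.0
    q = np.clip(np.round(X / scale), -127, 127).astype(np.int8)
    return q, scale

def quantized_scores(Q, K):
    # Compute Q @ K^T with INT8 inputs and INT32 accumulation,
    # then dequantize back to float before the softmax.
    qQ, sQ = quantize_int8(Q)
    qK, sK = quantize_int8(K)
    S_int = qQ.astype(np.int32) @ qK.astype(np.int32).T
    return S_int * (sQ * sK)
```

Production kernels typically use finer granularity (per-block or per-channel scales) and smooth outliers before quantizing, which is where the "without losing end-to-end metrics" claim is earned.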

"Build a Large Language Model (From Scratch)" is an e-book that digs into the principles and implementation of large language models, suited to readers who want to understand the architecture, training process, and application development of GPT-style models. To make this valuable textbook accessible to more Chinese readers, I decided to translate it into Chinese and share the translation openly on GitHub.

HTML 3,464 586 Updated Sep 7, 2025

The best ChatGPT that $100 can buy.

Python 50,769 6,663 Updated Mar 27, 2026

Ongoing research training transformer models at scale

Python 15,869 3,771 Updated Mar 31, 2026

Implement a Pytorch-like DL library in C++ from scratch, step by step

C++ 230 32 Updated Mar 26, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,547 1,005 Updated Mar 31, 2026

😱 A source-level analysis of how mainstream Internet-industry technologies are implemented under the hood, helping developers deepen their technical understanding. Currently covers the Spring family, the MyBatis, Netty, and Dubbo frameworks, and middleware such as Redis and Tomcat.

Java 23,133 4,254 Updated Jan 9, 2026

Fast and memory-efficient exact attention

Python 23,064 2,569 Updated Mar 31, 2026

🚀🚀 Train a 64M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 45,090 5,449 Updated Mar 31, 2026

A tutorial for CUDA & PyTorch

Cuda 378 51 Updated Mar 23, 2026

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,667 248 Updated Jan 22, 2026

Your own personal AI assistant. Any OS. Any platform. The lobster way. 🦞

TypeScript 343,012 67,812 Updated Mar 31, 2026

My learning notes for ML SYS.

Python 5,818 377 Updated Mar 19, 2026

AIInfra (AI infrastructure) covers the AI system stack, from underlying hardware such as chips up through the software layers that support large-model training and inference.

Jupyter Notebook 6,584 866 Updated Dec 22, 2025

Sharing AI Infra knowledge and coding exercises: introductions to the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more.

Jupyter Notebook 1,461 116 Updated Mar 28, 2026

Machine Learning Engineering Open Book

Python 17,587 1,114 Updated Mar 16, 2026

Nano vLLM

Python 12,613 1,833 Updated Nov 3, 2025
Cuda 3 Updated Jan 23, 2026

Source code for the book Real-Time C++, by Christopher Kormanyos

C++ 776 190 Updated Feb 27, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,245 835 Updated Mar 31, 2026

High Performance LLM Inference Operator Library

C++ 805 75 Updated Feb 5, 2026

An unofficial personal Chinese translation of "Template Metaprogramming with C++".

TeX 96 17 Updated May 25, 2023

Chinese translation of "Designing Data-Intensive Applications" (DDIA), first and second editions.

Python 22,844 4,526 Updated Feb 24, 2026