Stars
Flash Attention from Scratch on CUDA Ampere
This is an implementation of flash attention from scratch, without importing any external libraries.
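For context, the computation such from-scratch attention kernels reproduce is scaled dot-product attention, softmax(QKᵀ/√d)·V. A minimal pure-Python sketch (no external libraries; it omits the tiling and online-softmax tricks that make FlashAttention memory-efficient, and the `attention` helper here is illustrative, not the repo's API):

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating, for numerical stability.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    Q, K, V are lists of row vectors (lists of floats)."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Dot each query against every key, scaled by 1/sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        # Output row is the attention-weighted sum of the value rows.
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out
```

The naive version above materializes a full row of scores per query; FlashAttention-style kernels instead process keys/values in tiles and maintain a running softmax so the score matrix never hits global memory.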
Perplexity open source garden for inference technology
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
"Build a Large Language Model (From Scratch)" is an e-book that explores the principles and implementation of large language models in depth, suited to learners who want to understand the architecture, training process, and application development of GPT-style models. To make this valuable textbook accessible to more Chinese readers, I decided to translate it into Chinese and share it openly on GitHub.
Ongoing research training transformer models at scale
Implement a PyTorch-like DL library in C++ from scratch, step by step
FlashMLA: Efficient Multi-head Latent Attention Kernels
😱 Dissects the underlying implementation of mainstream internet-industry technologies at the source-code level, helping developers deepen their technical understanding. Currently covers the Spring family, the MyBatis, Netty, and Dubbo frameworks, and middleware such as Redis and Tomcat.
Fast and memory-efficient exact attention
🚀🚀 「LLM」Train a 64M-parameter GPT completely from scratch in just 2 hours! 🌏
CV-CUDA™ is an open-source, GPU-accelerated library for cloud-scale image processing and computer vision.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
My learning notes for ML SYS.
AIInfra (AI infrastructure) refers to the full AI-system stack, from underlying hardware such as chips up to the software layers that support training and inference of large AI models.
Shares AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, LLM fundamentals 🧠, AI hardware and software 🔧, and more.
Machine Learning Engineering Open Book
Source code for the book Real-Time C++, by Christopher Kormanyos
FlashInfer: Kernel Library for LLM Serving
An unofficial personal Chinese translation of "Template Metaprogramming with C++"
Chinese translation of "Designing Data-Intensive Applications" (DDIA), first and second editions
A high-throughput and memory-efficient inference and serving engine for LLMs
Tile primitives for speedy kernels
A self-learning tutorial for CUDA high-performance programming.