Stars
Optimized FP16/BF16 x FP4 GPU kernels for AMD GPUs
FlashMLA: Efficient Multi-head Latent Attention Kernels
🚀🚀 Efficient implementations of Native Sparse Attention
An introductory guide to the digital circuits lab at the University of Science and Technology of China, created in 2022 by teaching assistant Ma Zirui (马子睿). The repository is meant to let subsequent teaching assistants keep iterating on it
CUDA Templates and Python DSLs for High-Performance Linear Algebra
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
🥢 Cook like Lao Xiang Ji (老乡鸡) 🐔. The main content was completed in 2024; this is not an official Lao Xiang Ji repository. The text comes from the "Lao Xiang Ji Dish Traceability Report" and has been collated, edited, and organized. CookLikeHOC.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
A high-throughput and memory-efficient inference and serving engine for LLMs
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Development repository for the Triton language and compiler
Updated December 2025: a roundup of Docker registry mirrors currently usable in mainland China, a list of DockerHub mirror accelerators for China, 🚀 DockerHub mirror accelerator
Fast and memory-efficient exact attention
Flash Attention in ~100 lines of CUDA (forward pass only)
Stanford computer networking lab: an elegant TCP/IP implementation
A wrapper script to build whole-program LLVM bitcode files
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Fully open reproduction of DeepSeek-R1