airhaohan
😁 keep learning

Starred repositories

C++ 22 8 Updated Aug 14, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,080 11,032 Updated Nov 5, 2025

Expert Parallelism Load Balancer

Python 1,291 195 Updated Mar 24, 2025

A curated list for Efficient Large Language Models

Python 1,891 144 Updated Jun 17, 2025

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,658 187 Updated Jun 25, 2024

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,873 305 Updated Mar 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,861 736 Updated Oct 15, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,691 972 Updated Nov 5, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,842 896 Updated Sep 30, 2025

Awesome LLMs on Device: A Comprehensive Survey

1,243 109 Updated Jan 12, 2025

Tensor library for machine learning

C++ 13,373 1,374 Updated Nov 4, 2025

MLX: An array framework for Apple silicon

C++ 22,703 1,377 Updated Nov 5, 2025

LLM inference in C/C++

C++ 89,010 13,535 Updated Nov 5, 2025

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…

C++ 13,414 2,098 Updated Nov 4, 2025

The full minitorch student suite.

Python 2,213 516 Updated Aug 17, 2024

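Study notes on high-performance computing, including notes and code demos for related topics; continuously being improved. If it helps, please give it a Star; it helps the author a lot, thanks!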

Jupyter Notebook 448 37 Updated Mar 28, 2023

Pretrained Language Models for Source code

Jupyter Notebook 254 32 Updated Jun 1, 2021

4 labs + 2 challenges + 4 docs

Shell 1,565 251 Updated Oct 14, 2023

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Python 6,451 788 Updated Apr 28, 2025

Jupyter Notebook 12 1 Updated Jan 19, 2022

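This project aims to share the technical principles of large models and hands-on experience (large-model engineering and deploying large-model applications in practice)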

HTML 21,702 2,540 Updated Oct 19, 2025

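"A Guide to Using Open-Source Large Models": tutorials, tailored for Chinese beginners, on quickly fine-tuning (full-parameter/LoRA) and deploying open-source large language models (LLMs) and multimodal large models (MLLMs), both domestic and international, in a Linux environment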

Jupyter Notebook 25,705 2,583 Updated Nov 4, 2025

A list of awesome research on log analysis, anomaly detection, fault localization, and AIOps

776 125 Updated Dec 31, 2023

Quantitative research: reproductions of brokerage financial-engineering research reports

Jupyter Notebook 4,033 1,042 Updated Jul 23, 2025

A Python-based open-source framework for developing quantitative trading platforms

Python 33,500 10,292 Updated Nov 2, 2025

🔥 A lightweight C++ WebServer for Linux

C++ 18,832 4,193 Updated Jul 5, 2024

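A Chinese-language teaching guide to C++ templates. Unlike the well-known book C++ Templates, this tutorial series teaches C++ templates as a Turing-complete language, aiming to help readers thoroughly master metaprogramming. (Under construction)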

C++ 10,442 1,618 Updated Aug 20, 2024

Audio visualization display module

JavaScript 255 70 Updated Mar 6, 2024

Paper Lists for Graph Neural Networks

2,279 383 Updated Dec 29, 2023