🏠
Working from home
  • SJTU & Alibaba Cloud
  • Hangzhou, China
  • 13:06 (UTC -12:00)

Highlights

  • Pro

Large Language Model (LLM) Systems Paper List

1,705 90 Updated Dec 22, 2025
Python 5 Updated May 10, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,465 1,999 Updated Nov 1, 2025

Renderer for the harmony response format to be used with gpt-oss

Rust 4,090 240 Updated Dec 15, 2025

Python-based open-source framework for developing quantitative trading platforms

Python 34,875 10,544 Updated Dec 24, 2025

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,645 2,256 Updated Dec 1, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

9,760 709 Updated Nov 7, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,212 85 Updated Aug 28, 2025

Linux kernel source tree

C 211,660 59,558 Updated Dec 24, 2025

Copilot Chat extension for VS Code

TypeScript 9,166 1,519 Updated Dec 24, 2025

Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Python 14,192 3,058 Updated Jul 31, 2025

Aims to implement dual-port and multi-QP solutions in the DeepEP IBRC transport

Cuda 73 3 Updated May 9, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,868 1,815 Updated Oct 13, 2025

hostCC is a congestion control architecture which handles host congestion, along with in-network congestion

Shell 58 13 Updated Aug 10, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 21,943 3,861 Updated Dec 25, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 66,116 12,169 Updated Dec 25, 2025

Set of datasets for the deep learning recommendation model (DLRM).

48 17 Updated Dec 21, 2022

Efficient and easy multi-instance LLM serving

Python 518 44 Updated Sep 3, 2025

rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.

C++ 137 42 Updated Dec 22, 2025

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 20,015 1,673 Updated Nov 26, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,767 2,890 Updated Dec 24, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,454 124 Updated Dec 24, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,289 114 Updated Dec 16, 2025

DeepSeek-V3/R1 inference performance simulator

Jupyter Notebook 174 26 Updated Mar 27, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,679 753 Updated Dec 25, 2025

A lightweight, powerful framework for multi-agent workflows

Python 17,955 3,011 Updated Dec 24, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 64,433 7,814 Updated Dec 24, 2025