ZHENG, Zhen JamesTheZ

106 followers · 71 following

https://jamesthez.github.io/

Stars

Dao-AILab / quack

A Quirky Assortment of CuTe Kernels

Python 1,027 136 Updated Jun 20, 2026

SandAI-org / MagiAttention

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 855 58 Updated Jun 22, 2026

GAIR-NLP / ASI-Evolve

Python 764 190 Updated Apr 17, 2026

bytedance / flux

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,331 104 Updated Aug 28, 2025

microsoft / MixLLM

LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

C++ 6 2 Updated Mar 31, 2026

pranjalssh / fast.cu

Fastest kernels written from scratch

Cuda 583 76 Updated Sep 18, 2025

d2l-ai / d2l-en

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

Python 29,034 5,074 Updated Aug 18, 2024

HW-whistleblower / True-Story-of-Pangu

The true story of the hardship and darkness behind the development of the Noah Pangu large model.

11,538 1,312 Updated Jul 9, 2025

microsoft / Tutel

Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4

C 994 109 Updated Jun 21, 2026

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 72,381 8,855 Updated Jun 22, 2026

Kevinstone-199898 / vllm

Python 1 Updated Oct 6, 2025

dafish-ai / NTU-Machine-learning

Machine learning course by Professor Hung-yi Lee of National Taiwan University

Jupyter Notebook 1,178 383 Updated Jul 15, 2019

HuangOwen / Awesome-LLM-Compression

Awesome LLM compression research papers and tools.

1,848 128 Updated Feb 23, 2026

deepseek-ai / DeepSeek-V3

Python 103,781 16,734 Updated Aug 28, 2025

udlbook / udlbook

Understanding Deep Learning - Simon J.D. Prince

Jupyter Notebook 9,579 2,260 Updated Feb 24, 2026

dropbox / gemlite

Fast low-bit matmul kernels in Triton

Python 474 35 Updated May 15, 2026

liguodongiot / llm-action

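This project shares the technical principles and hands-on experience behind large models (large-model engineering and real-world application deployment).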

HTML 24,564 2,809 Updated May 25, 2026

facebookresearch / SpinQuant

Code repo for the paper "SpinQuant: LLM quantization with learned rotations"

Python 405 90 Updated Feb 14, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes for ML SYS.

Python 6,565 448 Updated Jun 18, 2026

openai / human-eval

Code for the paper "Evaluating Large Language Models Trained on Code"

Python 3,270 444 Updated Jan 17, 2025

bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.

Python 1,048 264 Updated Jul 22, 2025

intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python 2,663 309 Updated Jun 22, 2026

microsoft / sarathi-serve

A low-latency & high-throughput serving engine for LLMs

Python 506 64 Updated Jan 8, 2026

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 963 50 Updated Mar 29, 2026

ROCm / flash-attention

Forked from Dao-AILab/flash-attention

Fast and memory-efficient exact attention

Python 233 78 Updated Jun 22, 2026

ROCm / bitsandbytes

Forked from bitsandbytes-foundation/bitsandbytes

8-bit CUDA functions for PyTorch

Python 72 13 Updated Jun 16, 2026

Infini-AI-Lab / MagicPIG

FMInference / FlexLLMGen

October2001 / Awesome-KV-Cache-Compression

bytedance / ABQ-LLM