- Z.ai
- Beijing
- (UTC +08:00)
- https://www.zhihu.com/people/zhu-xiao-lin-22-96
-
sglang Public
Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
-
es Public archive
A JavaScript interpreter from scratch, supporting ES5 syntax.
-
Megatron-Bridge Public
Forked from fzyzcjy/Megatron-Bridge
Training library for Megatron-based models
Python · Apache License 2.0 · Updated Dec 22, 2025
-
asystem-amem Public
Forked from inclusionAI/asystem-amem
An NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library.
-
ring-flash-attention Public
Ring attention implementation with flash attention
-
torch_memory_saver Public
Forked from fzyzcjy/torch_memory_saver
Allow torch tensor memory to be released and resumed later
-
pytorch-malloc Public
An external memory allocator example for PyTorch.
-
torch_utils Public
Forked from fzyzcjy/torch_utils
Utility scripts for PyTorch
Python · Updated Jul 5, 2025
-
Megatron-LM Public
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
Python · Other · Updated Mar 20, 2025
-
faster-nougat Public
An implementation of Nougat that focuses on processing PDFs locally.
-
OpenRLHF Public
Forked from OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
-
pdf-with-its-own-md5 Public
A PDF template that contains its own MD5!
-
unilm Public
Forked from microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
-
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python · Apache License 2.0 · Updated Oct 7, 2024
-
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python · Apache License 2.0 · Updated Sep 30, 2024
-
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
-
grouped_gemm Public
Forked from fanshiqing/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
Cuda · Apache License 2.0 · Updated Jul 18, 2024
-
scattermoe Public
Forked from shawntan/scattermoe
Triton-based implementation of Sparse Mixture of Experts.
-
instruct-eval Public
Forked from declare-lab/instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
Python · Apache License 2.0 · Updated Jan 18, 2024