Stars
xhx1022 / vllm
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
NiklasFreymuth / troll
Forked from verl-project/verl
TROLL: Trust Region Optimization for Large Language models
meituan-search / verl
Forked from verl-project/verl
verl: Volcano Engine Reinforcement Learning for LLMs
moojink / openvla-oft
Forked from openvla/openvla
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
yuanzhoulvpi2017 / nano_rl
Forked from verl-project/verl
Customized reward development on top of verl
sail-sg / VocabularyParallelism
Forked from NVIDIA/Megatron-LM
Vocabulary Parallelism
Zero Bubble Pipeline Parallelism
thunlp / Seq1F1B
Forked from NVIDIA/Megatron-LM
Sequence-level 1F1B schedule for LLMs.
MayDomine / Seq1F1B
Forked from NVIDIA/Megatron-LM
Sequence-level 1F1B schedule for LLMs.
fanshiqing / grouped_gemm
Forked from tgale96/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Adlik / smoothquantplus
Forked from mit-han-lab/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
AniZpZ / smoothquant
Forked from mit-han-lab/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
alibaba / Megatron-LLaMA
Forked from NVIDIA/Megatron-LM
Best practice for training LLaMA models in Megatron-LM