78 results for source starred repositories

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 3,574 241 Updated Jan 14, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,449 981 Updated Jan 20, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,158 813 Updated Feb 3, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 8,962 1,088 Updated Feb 5, 2026

Magnum IO community repo

C++ 109 20 Updated Dec 5, 2025

Train transformer language models with reinforcement learning.

Python 17,294 2,472 Updated Feb 6, 2026

ByteCheckpoint: A Unified Checkpointing Library for LFMs

Python 269 19 Updated Feb 2, 2026

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 8,950 873 Updated Feb 5, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,019 3,196 Updated Feb 6, 2026

Infiniband Verbs Performance Tests

C 908 374 Updated Jan 11, 2026

RDMA core userspace libraries and daemons

C 2,126 822 Updated Feb 4, 2026

Large Context Attention

Python 766 53 Updated Oct 13, 2025

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 9,485 1,279 Updated Feb 4, 2026

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 32,703 6,741 Updated Feb 6, 2026

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,965 337 Updated Jan 18, 2026

A PyTorch native platform for training generative AI models

Python 5,039 699 Updated Feb 6, 2026

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

4,995 534 Updated Sep 25, 2024

Rotary Transformer

Python 1,078 61 Updated Mar 21, 2022

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 26,494 1,872 Updated Jan 9, 2026

Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs

Python 926 55 Updated Nov 27, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 156,148 31,970 Updated Feb 5, 2026

Paragraph-by-paragraph close readings of classic and new deep learning papers

32,527 2,774 Updated Mar 22, 2025

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

7,644 935 Updated Aug 21, 2024

A QoS-based scheduling system that brings optimal layout and status to workloads such as microservices, web services, big data jobs, AI jobs, etc.

Go 1,651 405 Updated Feb 5, 2026

An industrial deep learning framework for high-dimension sparse data

PureBasic 4,307 1,028 Updated Sep 25, 2024

Kubernetes-native Deep Learning Framework

Python 746 116 Updated Jan 26, 2024

DLRover: An Automatic Distributed Deep Learning System

Python 1,631 210 Updated Feb 6, 2026

flannel is a network fabric for containers, designed for Kubernetes

Go 9,394 2,899 Updated Feb 4, 2026

gRPC to JSON proxy generator following the gRPC HTTP spec

Go 19,800 2,368 Updated Feb 6, 2026

PyTorch extensions for high performance and large scale training.

Python 3,397 295 Updated Apr 26, 2025