KivenChen
🎯 Focusing
  • Mountain View, CA
  • 14:13 (UTC -07:00)

Starred repositories

A theoretical reconstruction of the Claude Mythos architecture, built from first principles using the available research literature.

Python 13,810 3,115 Updated May 23, 2026

Mamba SSM architecture

Python 18,436 1,755 Updated Jun 9, 2026

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 2,069 140 Updated Jun 13, 2026

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,634 321 Updated Jun 8, 2026

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 7,999 503 Updated Feb 10, 2026

Offline optimization of your disaggregated Dynamo graph

Python 335 126 Updated Jun 13, 2026

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 4,875 600 Updated Jun 13, 2026

Contexts Optical Compression

Python 23,288 2,152 Updated Jan 27, 2026

The best ChatGPT that $100 can buy.

Python 54,987 7,492 Updated May 5, 2026

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 2,013 212 Updated Jun 12, 2026

Allow torch tensor memory to be released and resumed later

Python 250 58 Updated May 16, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 888 252 Updated Jun 13, 2026

An extremely fast Python package and project manager, written in Rust.

Rust 86,336 3,197 Updated Jun 13, 2026

Powerful system-level package manager for Linux, macOS and Windows written in Rust – building on top of the Conda ecosystem.

Rust 7,279 530 Updated Jun 13, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 7,373 1,044 Updated Jun 4, 2026

A version of verl to support diverse tool use [TMLR 2026]

Python 997 83 Updated Jun 8, 2026

Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team.

Python 16,616 1,204 Updated Mar 24, 2026

Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.

Python 72 7 Updated May 5, 2025

Implementation for FP8/INT8 Rollout for RL training without performance drop.

Python 303 23 Updated Nov 7, 2025
Python 71 7 Updated Jun 8, 2026

NVIDIA GPU metrics exporter for Prometheus leveraging DCGM

Go 1,764 298 Updated May 12, 2026
Python 2 Updated Nov 12, 2025

slime is an LLM post-training framework for RL Scaling.

Python 6,109 893 Updated Jun 13, 2026

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,577 229 Updated Dec 15, 2025

Manage multiple AI terminal agents like Claude Code, Codex, OpenCode, and Amp.

Go 7,798 553 Updated May 18, 2026

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,569 755 Updated Jun 13, 2026

Open-source unified multimodal model

Python 6,007 532 Updated May 4, 2026

[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training

Python 265 25 Updated Aug 9, 2025

Fast, Flexible and Portable Structured Generation

C++ 1,739 153 Updated Jun 11, 2026

Interactive visualization and analytics on ADS-B data with ClickHouse

JavaScript 446 14 Updated Mar 17, 2026