Stars
"ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"
nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)
A prototype implementation of the "dataset as a queue" pattern for processing web pages into interleaved image/text content.
Ongoing research training transformer models at scale
PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
Multilingual Document Layout Parsing in a Single Vision-Language Model
This repo contains the code for a 1D tokenizer and generator
This is the official repo for the paper "LongCat-Flash-Omni Technical Report"
Simple IO APIs with pluggable storage backends and rich format handlers.
Official Implementation of the ICCV 2023 paper: Perpetual Humanoid Control for Real-time Simulated Avatars
Tiny AutoEncoder for Hunyuan Video (and other video models)
Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.
RES: Refined Exponential Solver. https://arxiv.org/abs/2308.02157
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
[NeurIPS 2025] Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
Evaluation harness for diffusion world models
A place to store reusable transformer components of my own creation or found on the interwebs
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
Open-source coding LLM for software engineering tasks
[ICML 2025 Spotlight] Direct Discriminative Optimization: Supercharging Diffusion/Autoregressive with GAN-type Discrimination
The official implementation of CVPR'25 Oral paper "Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise"
Quickly rewrite git repository history (filter-branch replacement)