Fully open reproduction of DeepSeek-R1

Python 25,742 2,405 Updated Nov 24, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,641 2,856 Updated Dec 20, 2025
Python 960 101 Updated Dec 16, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,918 351 Updated Dec 19, 2025

BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.

Jupyter Notebook 230 21 Updated Sep 2, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 2,017 128 Updated Apr 3, 2025

Large World Model -- Modeling Text and Video with Millions Context

Python 7,390 560 Updated Oct 19, 2024

Fully Open Framework for Democratized Multimodal Training

Python 659 50 Updated Dec 15, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,439 121 Updated Dec 20, 2025

Triton-based implementation of Sparse Mixture of Experts.

Python 257 24 Updated Oct 3, 2025

Code release for paper "Test-Time Training Done Right"

Python 342 16 Updated Nov 18, 2025

Ring attention implementation with flash attention

Python 948 90 Updated Sep 10, 2025

NanoGPT (124M) in 3 minutes

Python 3,972 523 Updated Dec 17, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 51,241 8,582 Updated Nov 12, 2025

Ongoing research training transformer models at scale

Python 14,652 3,398 Updated Dec 20, 2025

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Python 1,293 85 Updated Jul 14, 2024

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,747 271 Updated Jul 18, 2025

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 6,442 705 Updated Dec 17, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,201 745 Updated Dec 12, 2025

Puzzles for learning Triton

Jupyter Notebook 2,186 178 Updated Nov 18, 2024

Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Cuda 76 5 Updated Jul 14, 2024

Automatically split your PyTorch models on multiple GPUs for training & inference

Python 657 45 Updated Jan 2, 2024

A PyTorch native platform for training generative AI models

Python 4,858 644 Updated Dec 20, 2025

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Python 1,396 118 Updated Nov 13, 2025

Mamba SSM architecture

Python 16,770 1,541 Updated Nov 11, 2025

Fast and memory-efficient exact attention

Python 21,196 2,232 Updated Dec 18, 2025

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook 4,289 367 Updated Dec 4, 2025

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Python 2,698 294 Updated Aug 14, 2024