Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 833 101 Updated Jan 28, 2026

Implement a reasoning LLM in PyTorch from scratch, step by step

Jupyter Notebook 2,904 415 Updated Feb 11, 2026

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,730 80 Updated May 11, 2025

Minimal yet performant LLM examples in pure JAX

Python 241 31 Updated Jan 14, 2026

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

1,439 139 Updated Jul 18, 2025

[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)

Python 520 36 Updated Sep 27, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,435 164 Updated Mar 20, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 51,998 4,306 Updated Feb 12, 2026

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python 1,894 302 Updated Jan 16, 2024

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,514 208 Updated Jan 25, 2026

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,708 2,238 Updated Feb 1, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 12,739 1,555 Updated Apr 24, 2025

Simple RL training for reasoning

Python 3,828 284 Updated Dec 23, 2025

Fully open reproduction of DeepSeek-R1

Python 25,881 2,412 Updated Nov 24, 2025

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 4,068 586 Updated Apr 24, 2024

DeepSeek LLM: Let there be answers

Makefile 6,729 1,055 Updated Feb 4, 2024
Python 1,391 126 Updated Sep 12, 2025

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 20,360 1,708 Updated Jan 30, 2026

DeepSeek R1 distilled into smaller OSS models

Python 15 3 Updated Dec 2, 2025

Friends of OLMo and their links.

357 30 Updated Sep 15, 2025

A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.

Python 223 18 Updated Jul 25, 2025

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Jupyter Notebook 698 70 Updated Apr 20, 2025
Python 2 Updated Jun 12, 2024
Python 554 65 Updated Jan 2, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,805 103 Updated Mar 18, 2025

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 8,993 876 Updated Feb 6, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,184 3,239 Updated Feb 12, 2026

Let your Claude be able to think

TypeScript 16,808 1,981 Updated Nov 4, 2025