Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 912 113 Updated Jan 28, 2026

Implement a reasoning LLM in PyTorch from scratch, step by step

Jupyter Notebook 3,960 559 Updated Apr 1, 2026

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,772 83 Updated May 11, 2025

Minimal yet performant LLM examples in pure JAX

Python 246 32 Updated Jan 14, 2026

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

1,441 139 Updated Jul 18, 2025

[NeurIPS 2025 Spotlight] LLM post-training suite — featuring ReasonFlux, ReasonFlux-PRM, and ReasonFlux-Coder.

Python 528 38 Updated Sep 27, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,444 164 Updated Mar 20, 2025

Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.

Python 58,982 5,006 Updated Apr 2, 2026

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python 1,906 304 Updated Jan 16, 2024

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,576 214 Updated Mar 28, 2026

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,701 2,235 Updated Feb 1, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 13,014 1,585 Updated Feb 27, 2026

Simple RL training for reasoning

Python 3,846 289 Updated Dec 23, 2025

Fully open reproduction of DeepSeek-R1

Python 25,966 2,410 Updated Apr 2, 2026

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 4,088 585 Updated Apr 24, 2024

DeepSeek LLM: Let there be answers

Makefile 6,796 1,062 Updated Feb 4, 2024
Python 1,406 127 Updated Sep 12, 2025

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 20,902 1,761 Updated Mar 5, 2026

DeepSeek R1 distilled into smaller OSS models

Python 17 3 Updated Dec 2, 2025

Friends of OLMo and their links.

359 29 Updated Sep 15, 2025

A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.

Python 223 19 Updated Jul 25, 2025

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Jupyter Notebook 709 73 Updated Apr 20, 2025
Python 2 Updated Jun 12, 2024
Python 553 65 Updated Jan 2, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,838 108 Updated Mar 18, 2025

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 9,296 910 Updated Mar 30, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,396 3,558 Updated Apr 2, 2026

Let your Claude be able to think

TypeScript 16,981 1,977 Updated Nov 4, 2025