Starred repositories

The agent that grows with you

Python 199,802 35,527 Updated Jun 22, 2026

Lightweight, open-source AI agent for your tools, chats, and workflows.

Python 44,576 7,871 Updated Jun 22, 2026

[TPAMI 2026] Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey.

82 2 Updated Mar 25, 2026

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

Python 10,151 906 Updated Jun 21, 2026

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 20,182 2,099 Updated Jun 9, 2026

Book_5_《统计至简》 (Statistics Made Simple) | Iris Book series: From Basic Arithmetic to Machine Learning; now available!

Jupyter Notebook 3,675 753 Updated May 1, 2026

Pioneering Automated GUI Interaction with Native Agents

Python 11,023 832 Updated Jan 27, 2026

PDF textbooks for all elementary, middle, and high school levels, as well as university.

Roff 74,452 16,675 Updated Oct 18, 2025

A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning materials.

321 18 Updated Feb 28, 2025

Revisiting Mid-training in the Era of Reinforcement Learning Scaling

Jupyter Notebook 188 14 Updated Jul 23, 2025

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,866 94 Updated Apr 18, 2025

Agent2Agent (A2A) is an open protocol enabling communication and interoperability between opaque agentic applications.

Shell 24,400 2,472 Updated Jun 22, 2026

🐉 Loong: Synthesize Long CoTs at Scale through Verifiers.

Python 503 42 Updated Jun 12, 2026

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,829 84 Updated May 11, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 1,490 102 Updated Jun 22, 2026

Wan: Open and Advanced Large-Scale Video Generative Models

Python 16,306 2,892 Updated Mar 5, 2026

An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…

Python 809 98 Updated Mar 13, 2025

Official Repo for Open-Reasoner-Zero

Python 2,095 120 Updated Jun 2, 2025

Python 84 7 Updated Mar 11, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 13,177 1,585 Updated Feb 27, 2026

[COLM 2025] LIMO: Less is More for Reasoning

Python 1,077 54 Updated Jul 30, 2025

Fully open reproduction of DeepSeek-R1

Python 26,341 2,442 Updated Apr 2, 2026

The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention

Python 3,437 330 Updated Jul 7, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,863 112 Updated Mar 18, 2025

A series of technical reports on Slow Thinking with LLMs

Python 766 41 Updated Aug 13, 2025