pengzju

peng2001 pengzju

39 followers · 15 following

Zhejiang University

Stars

200 results for source starred repositories

hkust-nlp / RL-Verifier-Robustness

From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.

Python 24 1 Updated Oct 7, 2025

IAAR-Shanghai / xVerify

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Jupyter Notebook 143 7 Updated Nov 13, 2025

daonet / HPA-HLE

HPA-HLE is an open-source framework for Humanity's Last Exam using multi-agent collaboration, dynamic routing, and entropy-reducing evaluation. It achieved 27.5% accuracy across multiple tests withou…

Python 2 1 Updated Jun 13, 2025

BytedTsinghua-SIA / Enigmata

Resources for the Enigmata Project.

Python 77 5 Updated Aug 13, 2025

open-thoughts / open-thoughts

Fully open data curation for reasoning models

Python 2,205 185 Updated Dec 2, 2025

jiangjin1999 / LogicPro

[ACL2025] A novel complex reasoning enhancement method that utilizes widely available algorithmic questions and their codes to generate logical reasoning data.

8 Updated Aug 4, 2025

Mihir3009 / LogicBench

LogicBench is a natural language question-answering dataset consisting of 25 different reasoning patterns spanning over propositional, first-order, and non-monotonic logics.

36 4 Updated May 2, 2024

Simple-Efficient / RL-Factory

Train your Agent model via our easy and efficient framework

Python 1,701 159 Updated Dec 5, 2025

InternLM / InternBootcamp

Python 333 25 Updated Aug 29, 2025

multimodal-art-projection / KORGym

Python 55 2 Updated May 21, 2025

langfengQ / verl-agent

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,492 130 Updated Jan 30, 2026

NovaSky-AI / SkyRL

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,537 247 Updated Feb 4, 2026

mll-lab-nu / VAGEN

Training VLM agents with multi-turn reinforcement learning

Python 390 43 Updated Feb 1, 2026

OpenManus / OpenManus-RL

A live stream development of RL tuning for LLM agents

Python 3,889 533 Updated Oct 8, 2025

mll-lab-nu / RAGEN

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,504 205 Updated Jan 25, 2026

hkust-nlp / CodeIO

[ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Python 566 32 Updated May 6, 2025

huziyi19 / RMath

4 Updated Aug 30, 2024

lmgame-org / GamingAgent

[ICLR 2026] LLM/VLM gaming agents and model evaluation through games.

Python 857 91 Updated Nov 16, 2025

TextArena / TextArena

A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

Python 348 81 Updated Feb 3, 2026

bytedance / UI-TARS-desktop

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

TypeScript 25,553 2,476 Updated Jan 14, 2026

neoneye / arc-dataset-collection

Multiple datasets for ARC (Abstraction and Reasoning Corpus)

Python 87 15 Updated Mar 28, 2025

linhaowei1 / kumo

☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models

Jupyter Notebook 19 Updated Jun 4, 2025

Aiden0526 / Aristotle

Code and Data for ACL 2025 Paper "Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework".

Python 23 5 Updated Oct 3, 2025

R2E-Gym / R2E-Gym

[COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents

Python 230 49 Updated Jul 13, 2025

MAC-AutoML / Awesome-Abstract-Reasoning-Benchmark-List

4 Updated Feb 26, 2025

victorvikram / ConceptARC

Materials for ConceptARC paper

112 9 Updated Nov 6, 2024

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

Python 1,205 56 Updated Aug 27, 2025

fchollet / ARC-AGI

The Abstraction and Reasoning Corpus

JavaScript 4,716 700 Updated Apr 4, 2025

warpdynamicsltd / guessn

Logical puzzles generator

Python 1 Updated Dec 7, 2024

open-thought / reasoning-gym

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,329 112 Updated Jan 16, 2026