jxzhangjhu
🎯
Focusing
  • Mountain View

Starred repositories

"🐈 nanobot: The Ultra-Lightweight OpenClaw"

Python 35,923 6,127 Updated Mar 24, 2026

🦞+🔬: NanoResearch: The Autonomous AI Research Assistant

Python 233 19 Updated Mar 24, 2026

We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference…

353 10 Updated Mar 23, 2026

OpenClaw-RL: Train any agent simply by talking

Python 4,129 405 Updated Mar 23, 2026

Can AI agents predict whether they will succeed at a task?

Python 6 1 Updated Feb 9, 2026

PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support

Python 386 101 Updated Mar 24, 2026

Train transformer language models with reinforcement learning.

Python 17,772 2,586 Updated Mar 24, 2026

OpenTinker is an RL-as-a-Service infrastructure for foundation models

Python 650 61 Updated Mar 21, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 681 64 Updated Feb 18, 2026

Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.

Python 749 107 Updated Sep 11, 2025

Training Recipes for Agentic Reinforcement Learning in LLMs: A Survey

21 Updated Jan 30, 2026

📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice

Python 30,090 3,423 Updated Mar 24, 2026
Python 52 7 Updated Feb 12, 2025
Python 289 19 Updated Jan 3, 2026

Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"

Python 170 20 Updated Oct 20, 2025

Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.

301 13 Updated Mar 23, 2026

MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest models, MiroThinker-1.7 and MiroThinker-H1, achieve 74.0 and 88.2 on BrowseComp, respectively.

Python 8,061 585 Updated Mar 24, 2026

Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

Python 1,994 140 Updated Mar 22, 2026

We introduce BabyVision, a benchmark revealing the infancy of AI vision.

Python 201 7 Updated Jan 13, 2026
Python 12 1 Updated Mar 14, 2026

Salesforce Enterprise Deep Research

Python 1,146 178 Updated Jan 30, 2026

AgentEvolver: Towards Efficient Self-Evolving Agent System

Python 1,302 148 Updated Jan 30, 2026

This is an AI implementation (not official) of the DreamGym framework from the paper "Scaling Agent Learning via Experience Synthesis" (arXiv:2511.03773).

Python 39 3 Updated Nov 9, 2025

[WWW 2026] 🛠️ DeepAgent: A General Reasoning Agent with Scalable Toolsets

Python 1,030 130 Updated Mar 9, 2026
Python 21 2 Updated Dec 14, 2024
Python 454 60 Updated Mar 23, 2026

Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks

Python 85 24 Updated Oct 16, 2025

A live-streamed development of RL tuning for LLM agents

Python 3,960 543 Updated Oct 8, 2025
Python 77 10 Updated Oct 1, 2025

Self-Reflection in LLM Agents: Effects on Problem-Solving Performance

Python 94 9 Updated Nov 25, 2024