jxzhangjhu
🎯 Focusing
  • Mountain View

Starred repositories

"🐈 nanobot: The Ultra-Lightweight Personal AI Agent"

Python 39,804 6,984 Updated Apr 16, 2026

🦞+🔬: NanoResearch: The Autonomous AI Research Assistant

Python 697 138 Updated Apr 13, 2026

We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference…

391 10 Updated Mar 29, 2026

OpenClaw-RL: Train any agent simply by talking

Python 4,991 527 Updated Apr 16, 2026

Can AI agents predict whether they will succeed at a task?

Python 7 1 Updated Feb 9, 2026

🚀 PyTorch Distributed native training library for LLMs/VLMs with out-of-the-box Hugging Face support

Python 441 129 Updated Apr 16, 2026

Train transformer language models with reinforcement learning.

Python 18,072 2,647 Updated Apr 16, 2026

OpenTinker is an RL-as-a-Service infrastructure for foundation models

Python 660 63 Updated Mar 21, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 776 83 Updated Feb 18, 2026

Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.

Python 766 109 Updated Sep 11, 2025

Training Recipes for Agentic Reinforcement Learning in LLMs: A Survey

26 Updated Jan 30, 2026

📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice

Python 37,550 4,451 Updated Apr 15, 2026
Python 53 7 Updated Feb 12, 2025
Python 303 20 Updated Jan 3, 2026

Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"

Python 171 20 Updated Oct 20, 2025

Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.

352 18 Updated Apr 16, 2026

MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest models, MiroThinker-1.7 and MiroThinker-H1, achieve 74.0 and 88.2 on BrowseComp, respectively.

Python 8,123 607 Updated Apr 13, 2026

Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

Python 2,151 149 Updated Apr 15, 2026

We introduce BabyVision, a benchmark revealing the infancy of AI vision.

Python 211 7 Updated Jan 13, 2026
Python 13 1 Updated Mar 14, 2026

Salesforce Enterprise Deep Research

Python 1,153 180 Updated Jan 30, 2026

AgentEvolver: Towards Efficient Self-Evolving Agent System

Python 1,392 160 Updated Apr 1, 2026

This is an AI implementation (not official) of the DreamGym framework from the paper "Scaling Agent Learning via Experience Synthesis" (arXiv:2511.03773).

Python 39 3 Updated Nov 9, 2025

[WWW'26 Oral🔥] DeepAgent: A General Reasoning Agent with Scalable Toolsets

Python 1,052 133 Updated Apr 13, 2026
Python 21 2 Updated Dec 14, 2024
Python 465 64 Updated Apr 9, 2026

Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks

Python 87 26 Updated Oct 16, 2025

A live-streamed development of RL tuning for LLM agents

Python 3,985 539 Updated Oct 8, 2025
Python 80 9 Updated Mar 30, 2026