Starred repositories

"🐈 nanobot: The Ultra-Lightweight Personal AI Assistant"

Python 37,712 6,535 Updated Apr 3, 2026

🦞+🔬: NanoResearch: The Autonomous AI Research Assistant

Python 437 77 Updated Apr 3, 2026

We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference…

378 10 Updated Mar 29, 2026

OpenClaw-RL: Train any agent simply by talking

Python 4,588 462 Updated Mar 31, 2026

Can AI agents predict whether they will succeed at a task?

Python 7 1 Updated Feb 9, 2026

PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support

Python 406 110 Updated Apr 3, 2026

Train transformer language models with reinforcement learning.

Python 17,894 2,602 Updated Apr 2, 2026

OpenTinker is an RL-as-a-Service infrastructure for foundation models

Python 654 61 Updated Mar 21, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 721 77 Updated Feb 18, 2026

Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.

Python 759 107 Updated Sep 11, 2025

Training Recipes for Agentic Reinforcement Learning in LLMs: A Survey

23 Updated Jan 30, 2026

📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice

Python 33,283 3,816 Updated Mar 30, 2026
Python 54 7 Updated Feb 12, 2025
Python 293 19 Updated Jan 3, 2026

Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"

Python 170 20 Updated Oct 20, 2025

Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.

322 13 Updated Apr 2, 2026

MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest models, MiroThinker-1.7 and MiroThinker-H1, achieve 74.0 and 88.2 on BrowseComp, respectively.

Python 8,216 602 Updated Mar 31, 2026

Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

Python 2,072 144 Updated Apr 2, 2026

We introduce BabyVision, a benchmark revealing the infancy of AI vision.

Python 205 7 Updated Jan 13, 2026
Python 12 1 Updated Mar 14, 2026

Salesforce Enterprise Deep Research

Python 1,150 179 Updated Jan 30, 2026

AgentEvolver: Towards Efficient Self-Evolving Agent System

Python 1,333 151 Updated Apr 1, 2026

This is an AI implementation (not official) of the DreamGym framework from the paper "Scaling Agent Learning via Experience Synthesis" (arXiv:2511.03773).

Python 39 3 Updated Nov 9, 2025

[WWW 2026] 🛠️ DeepAgent: A General Reasoning Agent with Scalable Toolsets

Python 1,037 132 Updated Mar 9, 2026
Python 21 2 Updated Dec 14, 2024
Python 459 62 Updated Apr 2, 2026

Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks

Python 86 25 Updated Oct 16, 2025

A live-streamed development of RL tuning for LLM agents

Python 3,973 542 Updated Oct 8, 2025
Python 78 10 Updated Mar 30, 2026