A benchmark for evaluating realistic preference-following in personalized user-LLM interactions.

Python 3 Updated Mar 13, 2026

OpenClaw-RL: Train any agent simply by talking

Python 4,578 461 Updated Mar 31, 2026

PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory

Python 12 1 Updated Apr 1, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 720 77 Updated Feb 18, 2026

A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.

Python 973 68 Updated Jul 31, 2025

The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning"

Python 193 17 Updated Dec 25, 2025
Python 18 Updated Mar 3, 2026

Code and Data for Tau-Bench

Python 1,152 188 Updated Mar 18, 2026

Training Proactive and Personalized LLM Agents

Python 106 9 Updated Jan 20, 2026
Python 27 6 Updated Nov 27, 2025

Baselines for personalized RLHF methods including GPO, DPO, and various reward modeling approaches

Python 2 Updated Oct 16, 2025

[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers

Python 3,580 248 Updated Dec 21, 2025

Official Repo for Open-Reasoner-Zero

Python 2,087 119 Updated Jun 2, 2025

Code and data for "Inferring Rewards from Language in Context" [ACL 2022].

Python 16 Updated May 22, 2022

The official repo for the code and data of paper SMART

Python 40 4 Updated Feb 20, 2025

Official Implementation of "DeLLMa: Decision Making Under Uncertainty with Large Language Models"

Python 72 12 Updated Oct 21, 2024

DialOp: Decision-oriented dialogue environments for collaborative language agents

Python 111 8 Updated Nov 15, 2024
Python 37 6 Updated May 15, 2025
Python 34 3 Updated Jun 10, 2025

Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals

Python 11 Updated Jan 8, 2026

[ICLR 2025] No Preference Left Behind: Group Distributional Preference Optimization

Python 15 Updated Apr 21, 2025

Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)

Python 545 35 Updated Oct 24, 2025

[ACL 2025 Demo] Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths

HTML 512 48 Updated Mar 9, 2026

[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

Python 106 9 Updated Aug 5, 2024

[NeurIPS D&B '25] The one-stop repository for LLM unlearning

Python 515 147 Updated Mar 18, 2026

Framework and toolkits for building and evaluating collaborative agents that can work together with humans.

Python 124 19 Updated Dec 4, 2025

[EMNLP 2024] Ask-before-Plan: Proactive Language Agents for Real-World Planning

Python 21 3 Updated Jul 28, 2025