
This repository is based on our paper "How Does Personalized Memory Shape LLM Behavior? Benchmarking Rational Preference Utilization in Personalized Assistants"

Python · 4 stars · Updated Jan 23, 2026

Codebase for "PersistBench: When Should Long-Term Memories Be Forgotten by LLMs", benchmarking cross-domain leakage and sycophancy in memory-augmented LLMs.

Python · 6 stars · 2 forks · Updated Feb 16, 2026

HumanLM: Simulating Users with State Alignment Beats Response Imitation

Python · 71 stars · 9 forks · Updated Feb 27, 2026

A benchmark for evaluating realistic preference-following in personalized user-LLM interactions.

Python · 4 stars · Updated Mar 13, 2026

OpenClaw-RL: Train any agent simply by talking

Python · 4,817 stars · 503 forks · Updated Apr 11, 2026

PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory

Python · 13 stars · 1 fork · Updated Apr 1, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python · 757 stars · 81 forks · Updated Feb 18, 2026

A MemAgent framework that can be extrapolated to 3.5M-token contexts, along with a framework for RL training of any agent workflow.

Python · 984 stars · 69 forks · Updated Jul 31, 2025

The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning"

Python · 194 stars · 17 forks · Updated Dec 25, 2025

Python · 19 stars · Updated Mar 3, 2026

Code and Data for Tau-Bench

Python · 1,175 stars · 191 forks · Updated Mar 18, 2026

Training Proactive and Personalized LLM Agents

Python · 106 stars · 9 forks · Updated Jan 20, 2026

Python · 27 stars · 6 forks · Updated Nov 27, 2025

Baselines for personalized RLHF methods including GPO, DPO, and various reward modeling approaches

Python · 2 stars · Updated Oct 16, 2025

[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers

Python · 3,609 stars · 252 forks · Updated Dec 21, 2025

Official Repo for Open-Reasoner-Zero

Python · 2,089 stars · 119 forks · Updated Jun 2, 2025

Code and data for "Inferring Rewards from Language in Context" [ACL 2022].

Python · 16 stars · Updated May 22, 2022

The official repo for the code and data of the SMART paper.

Python · 40 stars · 4 forks · Updated Feb 20, 2025

Official Implementation of "DeLLMa: Decision Making Under Uncertainty with Large Language Models"

Python · 72 stars · 12 forks · Updated Oct 21, 2024

DialOp: Decision-oriented dialogue environments for collaborative language agents

Python · 111 stars · 8 forks · Updated Nov 15, 2024

Python · 37 stars · 6 forks · Updated May 15, 2025

Python · 34 stars · 3 forks · Updated Jun 10, 2025

Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals

Python · 11 stars · Updated Jan 8, 2026

[ICLR 2025] No Preference Left Behind: Group Distributional Preference Optimization

Python · 15 stars · Updated Apr 21, 2025

Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)

Python · 642 stars · 42 forks · Updated Oct 24, 2025

[ACL 2025 Demo] Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths

HTML · 510 stars · 48 forks · Updated Mar 9, 2026

[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

Python · 106 stars · 8 forks · Updated Aug 5, 2024

[NeurIPS D&B '25] The one-stop repository for LLM unlearning

Python · 522 stars · 147 forks · Updated Mar 18, 2026