Stars
A benchmark for evaluating realistic preference-following in personalized user-LLM interactions.
OpenClaw-RL: Train any agent simply by talking
PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory
Reinforcement Learning via Self-Distillation (SDPO)
A MemAgent framework that can extrapolate to 3.5M tokens, along with a framework for RL training of any agent workflow.
The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning"
Training Proactive and Personalized LLM Agents
Baselines for personalized RLHF methods including GPO, DPO, and various reward modeling approaches
[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers
Official Repo for Open-Reasoner-Zero
Code and data for "Inferring Rewards from Language in Context" [ACL 2022].
The official repo for the code and data of the SMART paper
Official Implementation of "DeLLMa: Decision Making Under Uncertainty with Large Language Models"
DialOp: Decision-oriented dialogue environments for collaborative language agents
Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals
[ICLR 2025] No Preference Left Behind: Group Distributional Preference Optimization
Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)
[ACL 2025 Demo] Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
[NeurIPS D&B '25] The one-stop repository for LLM unlearning
A framework and toolkits for building and evaluating collaborative agents that work together with humans.
[EMNLP 2024] Ask-before-Plan: Proactive Language Agents for Real-World Planning