User profiles for Paul Weng

Paul Weng

Duke Kunshan University
Verified email at duke.edu
Cited by 3608

A survey of reinforcement learning from human feedback

T Kaufmann, P Weng, V Bengs… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL)
that learns from human feedback instead of relying on an engineered reward function. …

Analytics and machine learning in vehicle routing research

…, J Jin, G Kendall, J Li, Z Lu, J Ren, P Weng… - … Journal of Production …, 2023 - Taylor & Francis
The Vehicle Routing Problem (VRP) is one of the most intensively studied combinatorial
optimisation problems for which numerous models and algorithms have been proposed. To …

A survey on interpretable reinforcement learning

C Glanois, P Weng, M Zimmer, D Li, T Yang, J Hao… - Machine Learning, 2024 - Springer
Although deep reinforcement learning has become a promising machine learning approach
for sequential decision-making problems, it is still not mature enough for high-stake …

Dual graph attention networks for deep latent representation of multifaceted social effects in recommender systems

Q Wu, H Zhang, X Gao, P He, P Weng, H Gao… - The world wide web …, 2019 - dl.acm.org
Social recommendation leverages social information to solve data sparsity and cold-start
problems in traditional collaborative filtering methods. However, most existing models assume …

INViT: A generalizable routing problem solver with invariant nested view transformer

H Fang, Z Song, P Weng, Y Ban - arXiv preprint arXiv:2402.02317, 2024 - arxiv.org
Recently, deep reinforcement learning has shown promising results for learning fast heuristics
to solve routing problems. Meanwhile, most of the solvers suffer from generalizing to an …

Learning fair policies in multi-objective (deep) reinforcement learning with average and discounted rewards

U Siddique, P Weng, M Zimmer - … Conference on Machine …, 2020 - proceedings.mlr.press
As the operations of autonomous systems generally affect simultaneously several users, it is
crucial that their designs account for fairness considerations. In contrast to standard (deep) …

Teacher-student framework: a reinforcement learning approach

M Zimmer, P Viappiani, P Weng - AAMAS Workshop autonomous …, 2014 - hal.science
We propose a reinforcement learning approach to learning to teach. Following Torrey and
Taylor’s framework [18], an agent (the “teacher”) advises another one (the “student”) by …

Learning fair policies in decentralized cooperative multi-agent reinforcement learning

…, C Glanois, U Siddique, P Weng - … conference on machine …, 2021 - proceedings.mlr.press
We consider the problem of learning fair policies in (deep) cooperative multi-agent reinforcement
learning (MARL). We formalize it in a principled way as the problem of optimizing a …

Top-k selection based on adaptive sampling of noisy preferences

…, B Szorenyi, W Cheng, P Weng… - International …, 2013 - proceedings.mlr.press
We consider the problem of reliably selecting an optimal subset of fixed size from a given
set of choice alternatives, based on noisy information about the quality of these alternatives. …

Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm

R Busa-Fekete, B Szörényi, P Weng, W Cheng… - Machine learning, 2014 - Springer
We introduce a novel approach to preference-based reinforcement learning, namely a
preference-based variant of a direct policy search method based on evolutionary optimization. …