extended-abstract

How to Guide a Non-Cooperative Learner to Cooperate: Exploiting No-Regret Algorithms in System Design

Authors: Nicholas Bishop, Le Cong Dinh, Long Tran-Thanh

AAMAS '21: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems

Pages 1464 - 1466

Published: 03 May 2021

Abstract

We investigate a repeated two-player game setting where the column player is also the designer of the system and has full control over the payoff matrices. In addition, we assume that the row player uses a no-regret algorithm to efficiently learn how to adapt their strategy to the column player's behaviour over time. The goal of the column player is to guide her opponent into picking the mixed strategy that she, as system designer, prefers. Therefore, she needs to: (i) design appropriate payoffs for both players; and (ii) strategically interact with the row player during a sequence of plays in order to guide her opponent to converge to the desired mixed strategy. To design appropriate payoffs, we propose a novel zero-sum game construction whose unique minimax solution contains the desired behaviour. We also propose another construction in which only the minimax strategy of the row player is unique. Finally, we propose a new game-playing algorithm for the system designer and show that it can guide the row player to their minimax strategy, under the assumption that the row player adopts a stable no-regret algorithm.
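To make the setting concrete, the following is a minimal sketch of a no-regret row player in a repeated zero-sum game. It is not the payoff construction or the designer's algorithm proposed in the paper, neither of which is specified in this abstract. It assumes a hypothetical 2x2 loss matrix `LOSS` whose unique row minimax strategy is (2/3, 1/3), a row player running the standard multiplicative-weights (Hedge) update with an arbitrarily chosen learning rate `eta`, and a column player who simply best-responds each round. Under these assumptions the row player's time-averaged strategy approaches the minimax strategy; the sketch says nothing about convergence of the strategy played in the final round.

```python
import math


def hedge_vs_best_response(loss, rounds=5000, eta=0.05):
    """Simulate a Hedge row player against a best-responding column player.

    `loss[i][j]` is the row player's loss when row i meets column j
    (the column player gains the same amount, so the game is zero-sum).
    Returns the row player's time-averaged mixed strategy.
    """
    n_rows, n_cols = len(loss), len(loss[0])
    p = [1.0 / n_rows] * n_rows          # current mixed strategy (uniform start)
    avg = [0.0] * n_rows                 # running average of the mixed strategies
    for _ in range(rounds):
        avg = [a + pi / rounds for a, pi in zip(avg, p)]
        # Column player picks the column maximising the row player's expected loss.
        expected = [sum(p[i] * loss[i][j] for i in range(n_rows))
                    for j in range(n_cols)]
        j = max(range(n_cols), key=expected.__getitem__)
        # Hedge update: down-weight each row by the loss it would have incurred,
        # then renormalise so the weights stay a probability distribution.
        w = [p[i] * math.exp(-eta * loss[i][j]) for i in range(n_rows)]
        total = sum(w)
        p = [wi / total for wi in w]
    return avg


# Hypothetical example game: the row player's unique minimax strategy is (2/3, 1/3).
LOSS = [[1.0, 0.0],
        [0.0, 2.0]]
avg_strategy = hedge_vs_best_response(LOSS)
print(avg_strategy)
```

The per-round strategy in this simulation keeps oscillating around the minimax point while only its average settles.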

Index Terms

1. Theory of computation
  1. Theory and algorithms for application domains
    1. Algorithmic game theory and mechanism design
      1. Convergence and learning in games

Published In

AAMAS '21: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems

May 2021

1899 pages

ISBN:9781450383073

General Chairs: Frank Dignum (Umeå University, Sweden), Alessio Lomuscio (Imperial College London, UK)

Program Chairs: Ulle Endriss (University of Amsterdam, Netherlands), Ann Nowé (Vrije Universiteit Brussel, Belgium)

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Qualifiers

Extended-abstract

Conference

AAMAS '21

Sponsor:

SIGAI

AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems

May 3 - 7, 2021

Virtual Event, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
