DOI: 10.5555/3398761.3398827

Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games

Published: 13 May 2020

Abstract

Zero-sum games have long guided artificial intelligence research, since they possess both a rich strategy space of best responses and a clear evaluation metric. What's more, competition is a vital mechanism in many real-world multi-agent systems capable of generating intelligent innovations: Darwinian evolution, the market economy and the AlphaZero algorithm, to name a few. In two-player zero-sum games, the challenge is usually viewed as finding Nash equilibrium strategies, safeguarding against exploitation regardless of the opponent. While this captures the intricacies of chess or Go, it avoids the notion of cooperation with co-players, a hallmark of the major transitions leading from unicellular organisms to human civilization. Beyond two players, alliance formation often confers an advantage; however, this requires trust, namely the promise of mutual cooperation in the face of incentives to defect. Successful play therefore requires adaptation to co-players rather than the pursuit of non-exploitability. Here we argue that a systematic study of many-player zero-sum games is a crucial element of artificial intelligence research. Using symmetric zero-sum matrix games, we demonstrate formally that alliance formation may be seen as a social dilemma, and empirically that naïve multi-agent reinforcement learning therefore fails to form alliances. We introduce a toy model of economic competition, and show how reinforcement learning may be augmented with a peer-to-peer contract mechanism to discover and enforce alliances. Finally, we generalize our agent model to incorporate temporally extended contracts, presenting opportunities for further work.
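To make the alliance dilemma concrete, here is a minimal sketch in Python, assuming a hypothetical three-player zero-sum matrix game; the payoff values, the prisoner's-dilemma reading of the allies' choices, and the contract-masking function `play` are all illustrative assumptions, not the paper's actual game or mechanism.

```python
# Hypothetical three-player zero-sum matrix game illustrating the alliance
# dilemma. Payoff values are illustrative assumptions, not from the paper.
# Players 1 and 2 choose whether to honour an alliance against player 0,
# who is passive in this toy version. Actions: 0 = cooperate, 1 = defect.
# payoffs[(a1, a2)] = (r0, r1, r2); every outcome sums to zero.
payoffs = {
    (0, 0): (-2.0,  1.0,  1.0),  # alliance holds: the allies split the spoils
    (0, 1): (-1.0, -1.5,  2.5),  # player 2 defects on a cooperating player 1
    (1, 0): (-1.0,  2.5, -1.5),  # player 1 defects on a cooperating player 2
    (1, 1): ( 2.0, -1.0, -1.0),  # mutual defection: player 0 profits
}

# Sanity check: zero-sum in every joint outcome.
assert all(abs(sum(r)) < 1e-9 for r in payoffs.values())

# The allies' restricted game has prisoner's-dilemma structure. By symmetry
# between players 1 and 2, it suffices to check player 1's four payoffs:
R = payoffs[(0, 0)][1]  # reward for mutual cooperation
T = payoffs[(1, 0)][1]  # temptation to defect on a cooperator
S = payoffs[(0, 1)][1]  # sucker payoff when the partner defects
P = payoffs[(1, 1)][1]  # punishment for mutual defection
assert T > R > P > S    # defection dominates, so naive independent
                        # learners converge to (1, 1), the worst joint
                        # outcome for the would-be allies.

# A toy contract mechanism (an assumption about the mechanism's shape, not
# the paper's implementation): once both allies sign a binding pact, their
# defect actions are masked, so cooperation becomes enforceable.
def play(a1: int, a2: int, both_signed: bool) -> tuple:
    if both_signed:
        a1, a2 = 0, 0  # the contract overrides any attempt to defect
    return payoffs[(a1, a2)]

print(play(1, 1, both_signed=True))  # -> (-2.0, 1.0, 1.0)
```

Under such a contract, mutual cooperation is the only reachable outcome for the allies, which illustrates the role the paper's peer-to-peer contracts are described as playing: turning a promise of mutual cooperation into a binding commitment that removes the incentive to defect.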


Cited By

  • (2024) Resolving social dilemmas with minimal reward transfer. Autonomous Agents and Multi-Agent Systems 38, 2. https://doi.org/10.1007/s10458-024-09675-4. Online: 12 Oct 2024.
  • (2023) Get It in Writing: Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 448-456. https://doi.org/10.5555/3545946.3598670. Online: 30 May 2023.
  • (2021) Evaluating Strategic Structures in Multi-Agent Inverse Reinforcement Learning. Journal of Artificial Intelligence Research 71, 925-951. https://doi.org/10.1613/jair.1.12594. Online: 10 Sep 2021.


Published In

AAMAS '20: Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems
May 2020
2289 pages
ISBN: 9781450375184

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC


Author Tags

  1. bargaining and negotiation
  2. coalition formation (strategic)
  3. deep reinforcement learning
  4. multi-agent learning

Qualifiers

  • Research-article

Conference

AAMAS '20

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
