Thomas W. Anthony 0001
Person information
- affiliation: DeepMind, UK
Other persons with the same name
- Tom Anthony — disambiguation page
- Thomas Anthony 0002 — University of Alabama at Birmingham, AL, USA
2020 – today
- 2023
- [j4] Max Olan Smith, Thomas W. Anthony, Michael P. Wellman:
Learning to play against any mixture of opponents. Frontiers Artif. Intell. 6 (2023)
- [j3] Max Olan Smith, Thomas W. Anthony, Michael P. Wellman:
Strategic Knowledge Transfer. J. Mach. Learn. Res. 24: 233:1-233:96 (2023)
- [j2] Marc Lanctot, John Schultz, Neil Burch, Max Olan Smith, Daniel Hennes, Thomas Anthony, Julien Pérolat:
Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [i17] Marc Lanctot, John Schultz, Neil Burch, Max Olan Smith, Daniel Hennes, Thomas W. Anthony, Julien Pérolat:
Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning. CoRR abs/2303.03196 (2023)
- [i16] Udari Madhushani, Kevin R. McKee, John P. Agapiou, Joel Z. Leibo, Richard Everett, Thomas W. Anthony, Edward Hughes, Karl Tuyls, Edgar A. Duéñez-Guzmán:
Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas. CoRR abs/2305.00768 (2023)
- [i15] Marc Lanctot, Kate Larson, Yoram Bachrach, Luke Marris, Zun Li, Avishkar Bhoopchand, Thomas W. Anthony, Brian Tanner, Anna Koop:
Evaluating Agents using Social Choice Theory. CoRR abs/2312.03121 (2023)
- 2022
- [j1] Ian Gemp, Thomas W. Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome T. Connor, Vibhavari Dasagi, Bart De Vylder, Edgar A. Duéñez-Guzmán, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, Siqi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Pérolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls:
Developing, evaluating and scaling learning agents in multi-agent environments. AI Commun. 35(4): 271-284 (2022)
- [c9] Ian Gemp, Rahul Savani, Marc Lanctot, Yoram Bachrach, Thomas W. Anthony, Richard Everett, Andrea Tacchetti, Tom Eccles, János Kramár:
Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent. AAMAS 2022: 507-515
- [c8] Luke Marris, Ian Gemp, Thomas Anthony, Andrea Tacchetti, Siqi Liu, Karl Tuyls:
Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers. NeurIPS 2022
- [d1] Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning". Zenodo, 2022
- [i14] Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022)
- [i13] Ian Gemp, Thomas W. Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome T. Connor, Vibhavari Dasagi, Bart De Vylder, Edgar A. Duéñez-Guzmán, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, Siqi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Pérolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls:
Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments. CoRR abs/2209.10958 (2022)
- [i12] Luke Marris, Ian Gemp, Thomas W. Anthony, Andrea Tacchetti, Siqi Liu, Karl Tuyls:
Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers. CoRR abs/2210.09257 (2022)
- 2021
- [b1] Thomas W. Anthony:
Expert iteration. University College London (University of London), UK, 2021
- [c7] Jessica B. Hamrick, Abram L. Friesen, Feryal M. P. Behbahani, Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Holger Buesing, Petar Velickovic, Theophane Weber:
On the role of planning in model-based deep reinforcement learning. ICLR 2021
- [c6] Max Olan Smith, Thomas Anthony, Michael P. Wellman:
Iterative Empirical Game Solving via Single Policy Best Response. ICLR 2021
- [c5] Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. ICML 2021: 8525-8535
- [i11] Ian Gemp, Rahul Savani, Marc Lanctot, Yoram Bachrach, Thomas W. Anthony, Richard Everett, Andrea Tacchetti, Tom Eccles, János Kramár:
Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent. CoRR abs/2106.01285 (2021)
- [i10] Max Olan Smith, Thomas Anthony, Michael P. Wellman:
Iterative Empirical Game Solving via Single Policy Best Response. CoRR abs/2106.01901 (2021)
- 2020
- [c4] Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach:
Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games. AAMAS 2020: 538-547
- [c3] David Balduzzi, Wojciech M. Czarnecki, Tom Anthony, Ian Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel:
Smooth markets: A basic mechanism for organizing gradient-based learners. ICLR 2020
- [c2] Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach:
Learning to Play No-Press Diplomacy with Best Response Policy Iteration. NeurIPS 2020
- [i9] David Balduzzi, Wojciech M. Czarnecki, Thomas W. Anthony, Ian M. Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel:
Smooth markets: A basic mechanism for organizing gradient-based learners. CoRR abs/2001.04678 (2020)
- [i8] Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. CoRR abs/2002.08456 (2020)
- [i7] Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach:
Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games. CoRR abs/2003.00799 (2020)
- [i6] Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach:
Learning to Play No-Press Diplomacy with Best Response Policy Iteration. CoRR abs/2006.04635 (2020)
- [i5] Max Olan Smith, Thomas Anthony, Yongzhao Wang, Michael P. Wellman:
Learning to Play against Any Mixture of Opponents. CoRR abs/2009.14180 (2020)
- [i4] Jessica B. Hamrick, Abram L. Friesen, Feryal M. P. Behbahani, Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Buesing, Petar Velickovic, Théophane Weber:
On the role of planning in model-based deep reinforcement learning. CoRR abs/2011.04021 (2020)
2010 – 2019
- 2019
- [i3] Thomas Anthony, Robert Nishihara, Philipp Moritz, Tim Salimans, John Schulman:
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees. CoRR abs/1904.03646 (2019)
- [i2] Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinícius Flores Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas W. Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis:
OpenSpiel: A Framework for Reinforcement Learning in Games. CoRR abs/1908.09453 (2019)
- 2017
- [c1] Thomas Anthony, Zheng Tian, David Barber:
Thinking Fast and Slow with Deep Learning and Tree Search. NIPS 2017: 5360-5370
- [i1] Thomas Anthony, Zheng Tian, David Barber:
Thinking Fast and Slow with Deep Learning and Tree Search. CoRR abs/1705.08439 (2017)
last updated on 2024-10-30 21:32 CET by the dblp team
all metadata released as open data under CC0 1.0 license