Thomas W. Anthony 0001

Tom Anthony 0001 – Thomas Anthony 0001

Person information

affiliation: DeepMind, UK

2020 – today

2023
[j4]
Max Olan Smith, Thomas W. Anthony, Michael P. Wellman:
Learning to play against any mixture of opponents. Frontiers Artif. Intell. 6 (2023)
[j3]
Max Olan Smith, Thomas W. Anthony, Michael P. Wellman:
Strategic Knowledge Transfer. J. Mach. Learn. Res. 24: 233:1-233:96 (2023)
[j2]
Marc Lanctot, John Schultz, Neil Burch, Max Olan Smith, Daniel Hennes, Thomas Anthony, Julien Pérolat:
Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
[i17]
Marc Lanctot, John Schultz, Neil Burch, Max Olan Smith, Daniel Hennes, Thomas W. Anthony, Julien Pérolat:
Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning. CoRR abs/2303.03196 (2023)
[i16]
Udari Madhushani, Kevin R. McKee, John P. Agapiou, Joel Z. Leibo, Richard Everett, Thomas W. Anthony, Edward Hughes, Karl Tuyls, Edgar A. Duéñez-Guzmán:
Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas. CoRR abs/2305.00768 (2023)
[i15]
Marc Lanctot, Kate Larson, Yoram Bachrach, Luke Marris, Zun Li, Avishkar Bhoopchand, Thomas W. Anthony, Brian Tanner, Anna Koop:
Evaluating Agents using Social Choice Theory. CoRR abs/2312.03121 (2023)
2022
[j1]
Ian Gemp, Thomas W. Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome T. Connor, Vibhavari Dasagi, Bart De Vylder, Edgar A. Duéñez-Guzmán, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, Siqi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Pérolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls:
Developing, evaluating and scaling learning agents in multi-agent environments. AI Commun. 35(4): 271-284 (2022)
[c9]
Ian Gemp, Rahul Savani, Marc Lanctot, Yoram Bachrach, Thomas W. Anthony, Richard Everett, Andrea Tacchetti, Tom Eccles, János Kramár:
Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent. AAMAS 2022: 507-515
[c8]
Luke Marris, Ian Gemp, Thomas Anthony, Andrea Tacchetti, Siqi Liu, Karl Tuyls:
Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers. NeurIPS 2022
[d1]
Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning". Zenodo, 2022
[i14]
Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022)
[i13]
Ian Gemp, Thomas W. Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome T. Connor, Vibhavari Dasagi, Bart De Vylder, Edgar A. Duéñez-Guzmán, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, Siqi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Pérolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls:
Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments. CoRR abs/2209.10958 (2022)
[i12]
Luke Marris, Ian Gemp, Thomas W. Anthony, Andrea Tacchetti, Siqi Liu, Karl Tuyls:
Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers. CoRR abs/2210.09257 (2022)
2021
[b1]
Thomas W. Anthony:
Expert iteration. University College London (University of London), UK, 2021
[c7]
Jessica B. Hamrick, Abram L. Friesen, Feryal M. P. Behbahani, Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Holger Buesing, Petar Velickovic, Theophane Weber:
On the role of planning in model-based deep reinforcement learning. ICLR 2021
[c6]
Max Olan Smith, Thomas Anthony, Michael P. Wellman:
Iterative Empirical Game Solving via Single Policy Best Response. ICLR 2021
[c5]
Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. ICML 2021: 8525-8535
[i11]
Ian Gemp, Rahul Savani, Marc Lanctot, Yoram Bachrach, Thomas W. Anthony, Richard Everett, Andrea Tacchetti, Tom Eccles, János Kramár:
Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent. CoRR abs/2106.01285 (2021)
[i10]
Max Olan Smith, Thomas Anthony, Michael P. Wellman:
Iterative Empirical Game Solving via Single Policy Best Response. CoRR abs/2106.01901 (2021)
2020
[c4]
Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach:
Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games. AAMAS 2020: 538-547
[c3]
David Balduzzi, Wojciech M. Czarnecki, Tom Anthony, Ian Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel:
Smooth markets: A basic mechanism for organizing gradient-based learners. ICLR 2020
[c2]
Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach:
Learning to Play No-Press Diplomacy with Best Response Policy Iteration. NeurIPS 2020
[i9]
David Balduzzi, Wojciech M. Czarnecki, Thomas W. Anthony, Ian M. Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel:
Smooth markets: A basic mechanism for organizing gradient-based learners. CoRR abs/2001.04678 (2020)
[i8]
Julien Pérolat, Rémi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro A. Ortega, Neil Burch, Thomas W. Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls:
From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization. CoRR abs/2002.08456 (2020)
[i7]
Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach:
Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games. CoRR abs/2003.00799 (2020)
[i6]
Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach:
Learning to Play No-Press Diplomacy with Best Response Policy Iteration. CoRR abs/2006.04635 (2020)
[i5]
Max Olan Smith, Thomas Anthony, Yongzhao Wang, Michael P. Wellman:
Learning to Play against Any Mixture of Opponents. CoRR abs/2009.14180 (2020)
[i4]
Jessica B. Hamrick, Abram L. Friesen, Feryal M. P. Behbahani, Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Buesing, Petar Velickovic, Théophane Weber:
On the role of planning in model-based deep reinforcement learning. CoRR abs/2011.04021 (2020)

2010 – 2019

2019
[i3]
Thomas Anthony, Robert Nishihara, Philipp Moritz, Tim Salimans, John Schulman:
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees. CoRR abs/1904.03646 (2019)
[i2]
Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinícius Flores Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas W. Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis:
OpenSpiel: A Framework for Reinforcement Learning in Games. CoRR abs/1908.09453 (2019)
2017
[c1]
Thomas Anthony, Zheng Tian, David Barber:
Thinking Fast and Slow with Deep Learning and Tree Search. NIPS 2017: 5360-5370
[i1]
Thomas Anthony, Zheng Tian, David Barber:
Thinking Fast and Slow with Deep Learning and Tree Search. CoRR abs/1705.08439 (2017)
