Tom Zahavy
2020 – today
- 2023
- [j2] Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy: POMRL: No-Regret Learning-to-Plan with Increasing Horizons. Trans. Mach. Learn. Res. 2023 (2023)
- [c29] Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, Sebastian Flennerhag: Discovering Evolution Strategies via Meta-Black-Box Optimization. GECCO Companion 2023: 29-30
- [c28] Robert Tjarko Lange, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard, Sebastian Flennerhag: Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization. GECCO 2023: 929-937
- [c27] Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, Sebastian Flennerhag: Discovering Evolution Strategies via Meta-Black-Box Optimization. ICLR 2023
- [c26] Tom Zahavy, Yannick Schroecker, Feryal M. P. Behbahani, Kate Baumli, Sebastian Flennerhag, Shaobo Hou, Satinder Singh: Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality. ICLR 2023
- [c25] Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy: ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs. ICML 2023: 25303-25336
- [c24] Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado Philip van Hasselt, András György, Satinder Singh: Optimistic Meta-Gradients. NeurIPS 2023
- [i40] Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado van Hasselt, András György, Satinder Singh: Optimistic Meta-Gradients. CoRR abs/2301.03236 (2023)
- [i39] Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy: ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs. CoRR abs/2302.01275 (2023)
- [i38] Robert Tjarko Lange, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard, Sebastian Flennerhag: Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization. CoRR abs/2304.03995 (2023)
- [i37] Veronica Chelu, Tom Zahavy, Arthur Guez, Doina Precup, Sebastian Flennerhag: Optimism and Adaptivity in Policy Optimization. CoRR abs/2306.10587 (2023)
- [i36] Tom Zahavy, Vivek Veeriah, Shaobo Hou, Kevin Waugh, Matthew Lai, Edouard Leurent, Nenad Tomasev, Lisa Schut, Demis Hassabis, Satinder Singh: Diversifying AI: Towards Creative Chess with AlphaZero. CoRR abs/2308.09175 (2023)
- [i35] Hadar Schreiber Galler, Tom Zahavy, Guillaume Desjardins, Alon Cohen: APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT. CoRR abs/2308.12649 (2023)
- 2022
- [c23] Lior Shani, Tom Zahavy, Shie Mannor: Online Apprenticeship Learning. AAAI 2022: 8240-8248
- [c22] Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh: Meta-Gradients in Non-Stationary Environments. CoLLAs 2022: 886-901
- [c21] Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh: Bootstrapped Meta-Learning. ICLR 2022
- [c20] Hao Liu, Tom Zahavy, Volodymyr Mnih, Satinder Singh: Palm up: Playing in the Latent Manifold for Unsupervised Pretraining. NeurIPS 2022
- [i34] Tom Zahavy, Yannick Schroecker, Feryal M. P. Behbahani, Kate Baumli, Sebastian Flennerhag, Shaobo Hou, Satinder Singh: Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality. CoRR abs/2205.13521 (2022)
- [i33] Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh: Meta-Gradients in Non-Stationary Environments. CoRR abs/2209.06159 (2022)
- [i32] Hao Liu, Tom Zahavy, Volodymyr Mnih, Satinder Singh: Palm up: Playing in the Latent Manifold for Unsupervised Pretraining. CoRR abs/2210.10913 (2022)
- [i31] Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, Sebastian Flennerhag: Discovering Evolution Strategies via Meta-Black-Box Optimization. CoRR abs/2211.11260 (2022)
- [i30] Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy: POMRL: No-Regret Learning-to-Plan with Increasing Horizons. CoRR abs/2212.14530 (2022)
- 2021
- [j1] Stav Belogolovsky, Philip Korsunsky, Shie Mannor, Chen Tessler, Tom Zahavy: Inverse reinforcement learning in contextual MDPs. Mach. Learn. 110(9): 2295-2334 (2021)
- [c19] Dan A. Calian, Daniel J. Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy A. Mann: Balancing Constraints and Rewards with Meta-Gradient D4PG. ICLR 2021
- [c18] Tom Zahavy, André Barreto, Daniel J. Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh: Discovering a set of policies for the worst case reward. ICLR 2021
- [c17] Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt: Emphatic Algorithms for Deep Reinforcement Learning. ICML 2021: 5023-5033
- [c16] Ofir Nabati, Tom Zahavy, Shie Mannor: Online Limited Memory Neural-Linear Bandits with Likelihood Matching. ICML 2021: 7905-7915
- [c15] Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh: Reward is enough for convex MDPs. NeurIPS 2021: 25746-25759
- [c14] Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh: Discovery of Options via Meta-Learned Subgoals. NeurIPS 2021: 29861-29873
- [i29] Ofir Nabati, Tom Zahavy, Shie Mannor: Online Limited Memory Neural-Linear Bandits with Likelihood Matching. CoRR abs/2102.03799 (2021)
- [i28] Tom Zahavy, André Barreto, Daniel J. Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh: Discovering a set of policies for the worst case reward. CoRR abs/2102.04323 (2021)
- [i27] Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh: Discovery of Options via Meta-Learned Subgoals. CoRR abs/2102.06741 (2021)
- [i26] Lior Shani, Tom Zahavy, Shie Mannor: Online Apprenticeship Learning. CoRR abs/2102.06924 (2021)
- [i25] Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh: Reward is enough for convex MDPs. CoRR abs/2106.00661 (2021)
- [i24] Tom Zahavy, Brendan O'Donoghue, André Barreto, Volodymyr Mnih, Sebastian Flennerhag, Satinder Singh: Discovering Diverse Nearly Optimal Policies with Successor Features. CoRR abs/2106.00669 (2021)
- [i23] Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt: Emphatic Algorithms for Deep Reinforcement Learning. CoRR abs/2106.11779 (2021)
- [i22] Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh: Bootstrapped Meta-Learning. CoRR abs/2109.04504 (2021)
- 2020
- [c13] Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour: Apprenticeship Learning via Frank-Wolfe. AAAI 2020: 6720-6728
- [c12] Tom Zahavy, Avinatan Hassidim, Haim Kaplan, Yishay Mansour: Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies. ALT 2020: 906-934
- [c11] Uri Shaham, Tom Zahavy, Cesar Caraballo, Shiwani Mahajan, Daisy Massey, Harlan M. Krumholz: Learning to Ask Medical Questions using Reinforcement Learning. MLHC 2020: 2-26
- [c10] Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh: A Self-Tuning Actor-Critic Algorithm. NeurIPS 2020
- [c9] Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour: Unknown mixing times in apprenticeship and reinforcement learning. UAI 2020: 430-439
- [i21] Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh: Self-Tuning Deep Reinforcement Learning. CoRR abs/2002.12928 (2020)
- [i20] Uri Shaham, Tom Zahavy, Cesar Caraballo, Shiwani Mahajan, Daisy Massey, Harlan M. Krumholz: Learning to Ask Medical Questions using Reinforcement Learning. CoRR abs/2004.00994 (2020)
- [i19] Dan A. Calian, Daniel J. Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy A. Mann: Balancing Constraints and Rewards with Meta-Gradient D4PG. CoRR abs/2010.06324 (2020)
2010 – 2019
- 2019
- [i18] Tom Zahavy, Shie Mannor: Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching. CoRR abs/1901.08612 (2019)
- [i17] Tom Zahavy, Avinatan Hassidim, Haim Kaplan, Yishay Mansour: Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies. CoRR abs/1902.10140 (2019)
- [i16] Chen Tessler, Tom Zahavy, Deborah Cohen, Daniel J. Mankowitz, Shie Mannor: Action Assembly: Sparse Imitation Learning for Text Based Games with Combinatorial Action Spaces. CoRR abs/1905.09700 (2019)
- [i15] Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour: Average reward reinforcement learning with unknown mixing times. CoRR abs/1905.09704 (2019)
- [i14] Philip Korsunsky, Stav Belogolovsky, Tom Zahavy, Chen Tessler, Shie Mannor: Inverse Reinforcement Learning in Contextual MDPs. CoRR abs/1905.09710 (2019)
- [i13] Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour: Apprenticeship Learning via Frank-Wolfe. CoRR abs/1911.01679 (2019)
- [i12] Ron Ziv, Alex Dikopoltsev, Tom Zahavy, Ittai Rubinstein, Pavel Sidorenko, Oren Cohen, Mordechai Segev: Deep learning reconstruction of ultrashort pulses from 2D spatial intensity patterns recorded by an all-in-line system in a single-shot. CoRR abs/1911.10326 (2019)
- 2018
- [c8] Tom Zahavy, Abhinandan Krishnan, Alessandro Magnani, Shie Mannor: Is a Picture Worth a Thousand Words? A Deep Multi-Modal Architecture for Product Classification in E-Commerce. AAAI 2018: 7873-7881
- [c7] Matan Haroush, Tom Zahavy, Daniel J. Mankowitz, Shie Mannor: Learning How Not to Act in Text-based Games. ICLR (Workshop) 2018
- [c6] Tom Zahavy, Bingyi Kang, Alex Sivak, Jiashi Feng, Huan Xu, Shie Mannor: Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms. ICLR (Workshop) 2018
- [c5] Tom Zahavy, Matan Haroush, Nadav Merlis, Daniel J. Mankowitz, Shie Mannor: Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning. NeurIPS 2018: 3566-3577
- [i11] Guy Tennenholtz, Tom Zahavy, Shie Mannor: Train on Validation: Squeezing the Data Lemon. CoRR abs/1802.05846 (2018)
- [i10] Tom Zahavy, Avinatan Hassidim, Haim Kaplan, Yishay Mansour: Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies. CoRR abs/1803.04674 (2018)
- [i9] Tom Zahavy, Alex Dikopoltsev, Oren Cohen, Shie Mannor, Mordechai Segev: Deep Learning Reconstruction of Ultra-Short Pulses. CoRR abs/1803.06024 (2018)
- [i8] Tom Zahavy, Matan Haroush, Nadav Merlis, Daniel J. Mankowitz, Shie Mannor: Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning. CoRR abs/1809.02121 (2018)
- 2017
- [c4] Chen Tessler, Shahar Givony, Tom Zahavy, Daniel J. Mankowitz, Shie Mannor: A Deep Hierarchical Approach to Lifelong Learning in Minecraft. AAAI 2017: 1553-1561
- [c3] Nir Levine, Tom Zahavy, Daniel J. Mankowitz, Aviv Tamar, Shie Mannor: Shallow Updates for Deep Reinforcement Learning. NIPS 2017: 3135-3145
- [i7] Nir Levine, Tom Zahavy, Daniel J. Mankowitz, Aviv Tamar, Shie Mannor: Shallow Updates for Deep Reinforcement Learning. CoRR abs/1705.07461 (2017)
- 2016
- [c2] Tom Zahavy, Nir Ben-Zrihem, Shie Mannor: Graying the black box: Understanding DQNs. ICML 2016: 1899-1908
- [i6] Jiashi Feng, Tom Zahavy, Bingyi Kang, Huan Xu, Shie Mannor: Ensemble Robustness of Deep Learning Algorithms. CoRR abs/1602.02389 (2016)
- [i5] Tom Zahavy, Nir Ben-Zrihem, Shie Mannor: Graying the black box: Understanding DQNs. CoRR abs/1602.02658 (2016)
- [i4] Chen Tessler, Shahar Givony, Tom Zahavy, Daniel J. Mankowitz, Shie Mannor: A Deep Hierarchical Approach to Lifelong Learning in Minecraft. CoRR abs/1604.07255 (2016)
- [i3] Nir Baram, Tom Zahavy, Shie Mannor: Deep Reinforcement Learning Discovers Internal Models. CoRR abs/1606.05174 (2016)
- [i2] Nir Ben-Zrihem, Tom Zahavy, Shie Mannor: Visualizing Dynamics: from t-SNE to SEMI-MDPs. CoRR abs/1606.07112 (2016)
- [i1] Tom Zahavy, Alessandro Magnani, Abhinandan Krishnan, Shie Mannor: Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce. CoRR abs/1611.09534 (2016)
- 2014
- [c1] Tom Zahavy, Oran Shayer, Deborah Cohen, Alex Tolmachev, Yonina C. Eldar: Sub-Nyquist sampling of OFDM signals for cognitive radios. ICASSP 2014: 8092-8096
last updated on 2024-11-20 21:55 CET by the dblp team
all metadata released as open data under CC0 1.0 license