Benjamin Van Roy
Person information
- affiliation: Stanford University, USA
2020 – today
- 2024
  - [j44] Wanqiao Xu, Shi Dong, Benjamin Van Roy: Posterior Sampling for Continuing Environments. RLJ 5: 2107-2122 (2024)
  - [c60] Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao, Benjamin Van Roy: Efficient Exploration for LLMs. ICML 2024
  - [c59] Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy: An Information-Theoretic Analysis of In-Context Learning. ICML 2024
  - [i89] Anmol Kagrecha, Henrik Marklund, Benjamin Van Roy, Hong Jun Jeon, Richard Zeckhauser: Adaptive Crowdsourcing Via Self-Supervised Learning. CoRR abs/2401.13239 (2024)
  - [i88] Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy: An Information-Theoretic Analysis of In-Context Learning. CoRR abs/2401.15530 (2024)
  - [i87] Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao, Benjamin Van Roy: Efficient Exploration for LLMs. CoRR abs/2402.00396 (2024)
  - [i86] Hong Jun Jeon, Benjamin Van Roy: Information-Theoretic Foundations for Neural Scaling Laws. CoRR abs/2407.01456 (2024)
  - [i85] Dilip Arumugam, Wanqiao Xu, Benjamin Van Roy: Exploration Unbound. CoRR abs/2407.12178 (2024)
  - [i84] Dilip Arumugam, Saurabh Kumar, Ramki Gummadi, Benjamin Van Roy: Satisficing Exploration for Deep Reinforcement Learning. CoRR abs/2407.12185 (2024)
  - [i83] Hong Jun Jeon, Benjamin Van Roy: Information-Theoretic Foundations for Machine Learning. CoRR abs/2407.12288 (2024)
  - [i82] Saurabh Kumar, Hong Jun Jeon, Alex Lewandowski, Benjamin Van Roy: The Need for a Big World Simulator: A Scientific Challenge for Continual Learning. CoRR abs/2408.02930 (2024)
  - [i81] Hong Jun Jeon, Benjamin Van Roy: Aligning AI Agents via Information-Directed Sampling. CoRR abs/2410.14807 (2024)
  - [i80] Henrik Marklund, Benjamin Van Roy: Choice between Partial Trajectories. CoRR abs/2410.22690 (2024)
- 2023
  - [j43] Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen: Reinforcement Learning, Bit by Bit. Found. Trends Mach. Learn. 16(6): 733-865 (2023)
  - [j42] Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, Seyed Mohammad Asghari, Benjamin Van Roy: Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping. Trans. Mach. Learn. Res. 2023 (2023)
  - [c58] Yueyang Liu, Benjamin Van Roy, Kuang Xu: Nonstationary Bandit Learning via Predictive Sampling. AISTATS 2023: 6215-6244
  - [c57] Zheqing Zhu, Benjamin Van Roy: Scalable Neural Contextual Bandit for Recommender Systems. CIKM 2023: 3636-3646
  - [c56] Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen: Leveraging Demonstrations to Improve Online Learning: Quality Matters. ICML 2023: 12527-12545
  - [c55] David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado Philip van Hasselt, Satinder Singh: A Definition of Continual Reinforcement Learning. NeurIPS 2023
  - [c54] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy: Epistemic Neural Networks. NeurIPS 2023
  - [c53] Zheqing Zhu, Benjamin Van Roy: Deep Exploration for Recommendation Systems. RecSys 2023: 963-970
  - [c52] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy: Approximate Thompson Sampling via Epistemic Neural Networks. UAI 2023: 1586-1595
  - [i79] Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen: Leveraging Demonstrations to Improve Online Learning: Quality Matters. CoRR abs/2302.03319 (2023)
  - [i78] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy: Approximate Thompson Sampling via Epistemic Neural Networks. CoRR abs/2302.09205 (2023)
  - [i77] Yueyang Liu, Benjamin Van Roy, Kuang Xu: A Definition of Non-Stationary Bandits. CoRR abs/2302.12202 (2023)
  - [i76] Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy: Bayesian Reinforcement Learning with Limited Cognitive Load. CoRR abs/2305.03263 (2023)
  - [i75] Wanqiao Xu, Shi Dong, Dilip Arumugam, Benjamin Van Roy: Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models. CoRR abs/2305.11455 (2023)
  - [i74] Zheqing Zhu, Benjamin Van Roy: Scalable Neural Contextual Bandit for Recommender Systems. CoRR abs/2306.14834 (2023)
  - [i73] Saurabh Kumar, Henrik Marklund, Ashish Rao, Yifan Zhu, Hong Jun Jeon, Yueyang Liu, Benjamin Van Roy: Continual Learning as Computationally Constrained Reinforcement Learning. CoRR abs/2307.04345 (2023)
  - [i72] David Abel, André Barreto, Hado van Hasselt, Benjamin Van Roy, Doina Precup, Satinder Singh: On the Convergence of Bounded Agents. CoRR abs/2307.11044 (2023)
  - [i71] David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh: A Definition of Continual Reinforcement Learning. CoRR abs/2307.11046 (2023)
  - [i70] Saurabh Kumar, Henrik Marklund, Benjamin Van Roy: Maintaining Plasticity via Regenerative Regularization. CoRR abs/2308.11958 (2023)
  - [i69] Zheqing Zhu, Yueyang Liu, Xu Kuang, Benjamin Van Roy: Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling. CoRR abs/2310.07786 (2023)
  - [i68] Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy: RLHF and IIA: Perverse Incentives. CoRR abs/2312.01057 (2023)
- 2022
  - [j41] Shi Dong, Benjamin Van Roy, Zhengyuan Zhou: Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent States. J. Mach. Learn. Res. 23: 255:1-255:54 (2022)
  - [j40] Daniel Russo, Benjamin Van Roy: Satisficing in Time-Sensitive Bandit Learning. Math. Oper. Res. 47(4): 2815-2839 (2022)
  - [c51] Dilip Arumugam, Benjamin Van Roy: Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning. NeurIPS 2022
  - [c50] Hong Jun Jeon, Benjamin Van Roy: An Information-Theoretic Framework for Deep Learning. NeurIPS 2022
  - [c49] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Dieterich Lawson, Botao Hao, Brendan O'Donoghue, Benjamin Van Roy: The Neural Testbed: Evaluating Joint Predictions. NeurIPS 2022
  - [c48] Chao Qin, Zheng Wen, Xiuyuan Lu, Benjamin Van Roy: An Analysis of Ensemble Sampling. NeurIPS 2022
  - [c47] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Benjamin Van Roy: Evaluating high-order predictive distributions in deep learning. UAI 2022: 1552-1560
  - [i67] Yueyang Liu, Adithya M. Devraj, Benjamin Van Roy, Kuang Xu: Gaussian Imagination in Bandit Learning. CoRR abs/2201.01902 (2022)
  - [i66] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Benjamin Van Roy: Evaluating High-Order Predictive Distributions in Deep Learning. CoRR abs/2202.13509 (2022)
  - [i65] Hong Jun Jeon, Benjamin Van Roy: Sample Complexity versus Depth: An Information Theoretic Analysis. CoRR abs/2203.00246 (2022)
  - [i64] Chao Qin, Zheng Wen, Xiuyuan Lu, Benjamin Van Roy: An Analysis of Ensemble Sampling. CoRR abs/2203.01303 (2022)
  - [i63] Yueyang Liu, Benjamin Van Roy, Kuang Xu: Nonstationary Bandit Learning via Predictive Sampling. CoRR abs/2205.01970 (2022)
  - [i62] Dilip Arumugam, Benjamin Van Roy: Between Rate-Distortion Theory & Value Equivalence in Model-Based Reinforcement Learning. CoRR abs/2206.02025 (2022)
  - [i61] Dilip Arumugam, Benjamin Van Roy: Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning. CoRR abs/2206.02072 (2022)
  - [i60] Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, Seyed Mohammad Asghari, Benjamin Van Roy: Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping. CoRR abs/2206.03633 (2022)
  - [i59] Xiuyuan Lu, Ian Osband, Seyed Mohammad Asghari, Sven Gowal, Vikranth Dwaracherla, Zheng Wen, Benjamin Van Roy: Robustness of Epinets against Distributional Shifts. CoRR abs/2207.00137 (2022)
  - [i58] Yifan Zhu, Hong Jun Jeon, Benjamin Van Roy: Is Stochastic Gradient Descent Near Optimal? CoRR abs/2209.08627 (2022)
  - [i57] Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy: On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning. CoRR abs/2210.16877 (2022)
  - [i56] Ian Osband, Seyed Mohammad Asghari, Benjamin Van Roy, Nat McAleese, John Aslanides, Geoffrey Irving: Fine-Tuning Language Models via Epistemic Neural Networks. CoRR abs/2211.01568 (2022)
  - [i55] Wanqiao Xu, Shi Dong, Benjamin Van Roy: Posterior Sampling for Continuing Environments. CoRR abs/2211.15931 (2022)
  - [i54] Hong Jun Jeon, Benjamin Van Roy: An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws. CoRR abs/2212.01365 (2022)
  - [i53] Dilip Arumugam, Shi Dong, Benjamin Van Roy: Inclusive Artificial Intelligence. CoRR abs/2212.12633 (2022)
- 2021
  - [c46] Dilip Arumugam, Benjamin Van Roy: Deciding What to Learn: A Rate-Distortion Approach. ICML 2021: 373-382
  - [c45] Dilip Arumugam, Benjamin Van Roy: The Value of Information When Deciding What to Learn. NeurIPS 2021: 9816-9827
  - [i52] Dilip Arumugam, Benjamin Van Roy: Deciding What to Learn: A Rate-Distortion Approach. CoRR abs/2101.06197 (2021)
  - [i51] Shi Dong, Benjamin Van Roy, Zhengyuan Zhou: Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State. CoRR abs/2102.05261 (2021)
  - [i50] Adithya M. Devraj, Benjamin Van Roy, Kuang Xu: A Bit Better? Quantifying Information for Bandit Learning. CoRR abs/2102.09488 (2021)
  - [i49] Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen: Reinforcement Learning, Bit by Bit. CoRR abs/2103.04047 (2021)
  - [i48] Ian Osband, Zheng Wen, Mohammad Asghari, Morteza Ibrahimi, Xiyuan Lu, Benjamin Van Roy: Epistemic Neural Networks. CoRR abs/2107.08924 (2021)
  - [i47] Xiuyuan Lu, Ian Osband, Benjamin Van Roy, Zheng Wen: Evaluating Probabilistic Inference in Deep Learning: Beyond Marginal Predictions. CoRR abs/2107.09224 (2021)
  - [i46] Zheqing Zhu, Benjamin Van Roy: Deep Exploration for Recommendation Systems. CoRR abs/2109.12509 (2021)
  - [i45] Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Botao Hao, Morteza Ibrahimi, Dieterich Lawson, Xiuyuan Lu, Brendan O'Donoghue, Benjamin Van Roy: Evaluating Predictive Distributions: Does Bayesian Deep Learning Work? CoRR abs/2110.04629 (2021)
  - [i44] Dilip Arumugam, Benjamin Van Roy: The Value of Information When Deciding What to Learn. CoRR abs/2110.13973 (2021)
- 2020
  - [c44] Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy: Hypermodels for Exploration. ICLR 2020
  - [c43] Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt: Behaviour Suite for Reinforcement Learning. ICLR 2020
  - [c42] Zheng Wen, Doina Precup, Morteza Ibrahimi, André Barreto, Benjamin Van Roy, Satinder Singh: On Efficiency in Hierarchical Reinforcement Learning. NeurIPS 2020
  - [i43] Vikranth Dwaracherla, Benjamin Van Roy: Langevin DQN. CoRR abs/2002.07282 (2020)
  - [i42] Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy: Hypermodels for Exploration. CoRR abs/2006.07464 (2020)
  - [i41] Dilip Arumugam, Benjamin Van Roy: Randomized Value Functions via Posterior State-Abstraction Sampling. CoRR abs/2010.02383 (2020)
2010 – 2019
- 2019
  - [j39] Ian Osband, Benjamin Van Roy, Daniel J. Russo, Zheng Wen: Deep Exploration via Randomized Value Functions. J. Mach. Learn. Res. 20: 124:1-124:62 (2019)
  - [c41] Shi Dong, Tengyu Ma, Benjamin Van Roy: On the Performance of Thompson Sampling on Logistic Bandits. COLT 2019: 1158-1160
  - [c40] Xiuyuan Lu, Benjamin Van Roy: Information-Theoretic Confidence Bounds for Reinforcement Learning. NeurIPS 2019: 2458-2466
  - [i40] Shi Dong, Tengyu Ma, Benjamin Van Roy: On the Performance of Thompson Sampling on Logistic Bandits. CoRR abs/1905.04654 (2019)
  - [i39] Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt: Behaviour Suite for Reinforcement Learning. CoRR abs/1908.03568 (2019)
  - [i38] Benjamin Van Roy, Shi Dong: Comments on the Du-Kakade-Wang-Yang Lower Bounds. CoRR abs/1911.07910 (2019)
  - [i37] Xiuyuan Lu, Benjamin Van Roy: Information-Theoretic Confidence Bounds for Reinforcement Learning. CoRR abs/1911.09724 (2019)
  - [i36] Shi Dong, Benjamin Van Roy, Zhengyuan Zhou: Provably Efficient Reinforcement Learning with Aggregated States. CoRR abs/1912.06366 (2019)
- 2018
  - [j38] Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen: A Tutorial on Thompson Sampling. Found. Trends Mach. Learn. 11(1): 1-96 (2018)
  - [j37] Daniel Russo, Benjamin Van Roy: Learning to Optimize via Information-Directed Sampling. Oper. Res. 66(1): 230-252 (2018)
  - [c39] Maria Dimakopoulou, Benjamin Van Roy: Coordinated Exploration in Concurrent Reinforcement Learning. ICML 2018: 1270-1278
  - [c38] Shi Dong, Benjamin Van Roy: An Information-Theoretic Analysis for Thompson Sampling with Many Actions. NeurIPS 2018: 4161-4169
  - [c37] Maria Dimakopoulou, Ian Osband, Benjamin Van Roy: Scalable Coordinated Exploration in Concurrent Reinforcement Learning. NeurIPS 2018: 4223-4232
  - [i35] Maria Dimakopoulou, Benjamin Van Roy: Coordinated Exploration in Concurrent Reinforcement Learning. CoRR abs/1802.01282 (2018)
  - [i34] Daniel Russo, Benjamin Van Roy: Satisficing in Time-Sensitive Bandit Learning. CoRR abs/1803.02855 (2018)
  - [i33] Maria Dimakopoulou, Ian Osband, Benjamin Van Roy: Scalable Coordinated Exploration in Concurrent Reinforcement Learning. CoRR abs/1805.08948 (2018)
  - [i32] Shi Dong, Benjamin Van Roy: An Information-Theoretic Analysis of Thompson Sampling for Large Action Spaces. CoRR abs/1805.11845 (2018)
- 2017
  - [j36] Zheng Wen, Benjamin Van Roy: Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization. Math. Oper. Res. 42(3): 762-782 (2017)
  - [c36] Ian Osband, Benjamin Van Roy: Why is Posterior Sampling Better than Optimism for Reinforcement Learning? ICML 2017: 2701-2710
  - [c35] Xiuyuan Lu, Benjamin Van Roy: Ensemble Sampling. NIPS 2017: 3258-3266
  - [c34] Abbas Kazerouni, Mohammad Ghavamzadeh, Yasin Abbasi, Benjamin Van Roy: Conservative Contextual Linear Bandits. NIPS 2017: 3910-3919
  - [i31] Ian Osband, Benjamin Van Roy: Gaussian-Dirichlet Posterior Dominance in Sequential Learning. CoRR abs/1702.04126 (2017)
  - [i30] Ian Osband, Daniel Russo, Zheng Wen, Benjamin Van Roy: Deep Exploration via Randomized Value Functions. CoRR abs/1703.07608 (2017)
  - [i29] Daniel Russo, David Tse, Benjamin Van Roy: Time-Sensitive Bandit Learning and Satisficing Thompson Sampling. CoRR abs/1704.09028 (2017)
  - [i28] Xiuyuan Lu, Benjamin Van Roy: Ensemble Sampling. CoRR abs/1705.07347 (2017)
  - [i27] Ian Osband, Benjamin Van Roy: On Optimistic versus Randomized Exploration in Reinforcement Learning. CoRR abs/1706.04241 (2017)
  - [i26] Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband: A Tutorial on Thompson Sampling. CoRR abs/1707.02038 (2017)
  - [i25] Abbas Kazerouni, Benjamin Van Roy: Learning to Price with Reference Effects. CoRR abs/1708.09020 (2017)
- 2016
  - [j35] Daniel Russo, Benjamin Van Roy: An Information-Theoretic Analysis of Thompson Sampling. J. Mach. Learn. Res. 17: 68:1-68:30 (2016)
  - [c33] Ian Osband, Benjamin Van Roy, Zheng Wen: Generalization and Exploration via Randomized Value Functions. ICML 2016: 2377-2386
  - [c32] Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy: Deep Exploration via Bootstrapped DQN. NIPS 2016: 4026-4034
  - [i24] Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy: Deep Exploration via Bootstrapped DQN. CoRR abs/1602.04621 (2016)
  - [i23] Ian Osband, Benjamin Van Roy: Why is Posterior Sampling Better than Optimism for Reinforcement Learning. CoRR abs/1607.00215 (2016)
  - [i22] Ian Osband, Benjamin Van Roy: Posterior Sampling for Reinforcement Learning Without Episodes. CoRR abs/1608.02731 (2016)
  - [i21] Ian Osband, Benjamin Van Roy: On Lower Bounds for Regret in Reinforcement Learning. CoRR abs/1608.02732 (2016)
  - [i20] Abbas Kazerouni, Mohammad Ghavamzadeh, Benjamin Van Roy: Conservative Contextual Linear Bandits. CoRR abs/1611.06426 (2016)
- 2015
  - [j34] Beomsoo Park, Benjamin Van Roy: Adaptive Execution: Exploration and Learning of Price Impact. Oper. Res. 63(5): 1058-1076 (2015)
  - [i19] Ian Osband, Benjamin Van Roy: Bootstrapped Thompson Sampling and Deep Exploration. CoRR abs/1507.00300 (2015)
- 2014
  - [j33] Yi-Hao Kao, Benjamin Van Roy: Directed Principal Component Analysis. Oper. Res. 62(4): 957-972 (2014)
  - [j32] Daniel Russo, Benjamin Van Roy: Learning to Optimize via Posterior Sampling. Math. Oper. Res. 39(4): 1221-1243 (2014)
  - [c31] Ian Osband, Benjamin Van Roy: Near-optimal Reinforcement Learning in Factored MDPs. NIPS 2014: 604-612
  - [c30] Ian Osband, Benjamin Van Roy: Model-based Reinforcement Learning and the Eluder Dimension. NIPS 2014: 1466-1474
  - [c29] Daniel Russo, Benjamin Van Roy: Learning to Optimize via Information-Directed Sampling. NIPS 2014: 1583-1591
  - [i18] Benjamin Van Roy, Zheng Wen: Generalization and Exploration via Randomized Value Functions. CoRR abs/1402.0635 (2014)
  - [i17] Ian Osband, Benjamin Van Roy: Near-optimal Regret Bounds for Reinforcement Learning in Factored MDPs. CoRR abs/1403.3741 (2014)
  - [i16] Daniel Russo, Benjamin Van Roy: An Information-Theoretic Analysis of Thompson Sampling. CoRR abs/1403.5341 (2014)
  - [i15] Daniel Russo, Benjamin Van Roy: Learning to Optimize Via Information Directed Sampling. CoRR abs/1403.5556 (2014)
  - [i14] Ian Osband, Benjamin Van Roy: Model-based Reinforcement Learning and the Eluder Dimension. CoRR abs/1406.1853 (2014)
- 2013
  - [j31] Yi-Hao Kao, Benjamin Van Roy: Learning a factor model via regularized PCA. Mach. Learn. 91(3): 279-303 (2013)
  - [c28] Daniel Russo, Benjamin Van Roy: Eluder Dimension and the Sample Complexity of Optimistic Exploration. NIPS 2013: 2256-2264
  - [c27] Ian Osband, Daniel Russo, Benjamin Van Roy: (More) Efficient Reinforcement Learning via Posterior Sampling. NIPS 2013: 3003-3011
  - [c26] Zheng Wen, Benjamin Van Roy: Efficient Exploration and Value Function Generalization in Deterministic Systems. NIPS 2013: 3021-3029
  - [i13] Paat Rusmevichientong, Benjamin Van Roy: A Tractable POMDP for a Class of Sequencing Problems. CoRR abs/1301.2308 (2013)
  - [i12] Daniel Russo, Benjamin Van Roy: Learning to Optimize Via Posterior Sampling. CoRR abs/1301.2609 (2013)
  - [i11] Morteza Ibrahimi, Adel Javanmard, Benjamin Van Roy: Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems. CoRR abs/1303.5984 (2013)
  - [i10] Ian Osband, Daniel Russo, Benjamin Van Roy: (More) Efficient Reinforcement Learning via Posterior Sampling. CoRR abs/1306.0940 (2013)
  - [i9] Zheng Wen, Benjamin Van Roy: Efficient Exploration and Value Function Generalization in Deterministic Systems. CoRR abs/1307.4847 (2013)
- 2012
  - [j30] Michael Padilla, Benjamin Van Roy: Intermediated Blind Portfolio Auctions. Manag. Sci. 58(9): 1747-1760 (2012)
  - [c25] Morteza Ibrahimi, Adel Javanmard, Benjamin Van Roy: Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems. NIPS 2012: 2645-2653
  - [i8] Yi-Hao Kao, Benjamin Van Roy: Directed Time Series Regression for Control. CoRR abs/1206.6141 (2012)
  - [i7] Yi-Hao Kao, Benjamin Van Roy, Daniel L. Rubin, Jiajing Xu, Jessica S. Faruque, Sandy Napel: A Hybrid Method for Distance Metric Learning. CoRR abs/1206.7112 (2012)
- 2011
  - [j29] Ciamac Cyrus Moallemi, Benjamin Van Roy: Resource Allocation via Message Passing. INFORMS J. Comput. 23(2): 205-219 (2011)
  - [j28] Gabriel Y. Weintraub, C. Lanier Benkard, Benjamin Van Roy: Industry dynamics: Foundations for models with an infinite number of firms. J. Econ. Theory 146(5): 1965-1994 (2011)
  - [i6] Yi-Hao Kao, Benjamin Van Roy: Learning a Factor Model via Regularized PCA. CoRR abs/1111.6201 (2011)
- 2010
  - [j27] Benjamin Van Roy: On Regression-Based Stopping Times. Discret. Event Dyn. Syst. 20(3): 307-324 (2010)
  - [j26] Vivek F. Farias, Benjamin Van Roy: Dynamic Pricing with a Prior on Market Response. Oper. Res. 58(1): 16-29 (2010)
  - [j25] Gabriel Y. Weintraub, C. Lanier Benkard, Benjamin Van Roy: Computational Methods for Oblivious Equilibrium. Oper. Res. 58(4-Part-2): 1247-1265 (2010)
  - [j24] Ramesh Johari, Gabriel Y. Weintraub, Benjamin Van Roy: Investment and Market Structure in Industries with Congestion. Oper. Res. 58(5): 1303-1317 (2010)
  - [j23] Benjamin Van Roy, Xiang Yan: Manipulation Robustness of Collaborative Filtering. Manag. Sci. 56(11): 1911-1929 (2010)
  - [j22] Ciamac Cyrus Moallemi, Benjamin Van Roy: Convergence of min-sum message-passing for convex optimization. IEEE Trans. Inf. Theory 56(4): 2041-2050 (2010)
  - [j21] Vivek F. Farias, Ciamac Cyrus Moallemi, Benjamin Van Roy, Tsachy Weissman: Universal reinforcement learning. IEEE Trans. Inf. Theory 56(5): 2441-2454 (2010)
2000 – 2009
- 2009
  - [j20] Ciamac Cyrus Moallemi, Benjamin Van Roy: Convergence of min-sum message passing for quadratic optimization. IEEE Trans. Inf. Theory 55(5): 2413-2423 (2009)
  - [c24] Yi-Hao Kao, Benjamin Van Roy, Xiang Yan: Directed Regression. NIPS 2009: 889-897
  - [c23] Benjamin Van Roy, Xiang Yan: Manipulation-resistant collaborative filtering systems. RecSys 2009: 165-172
  - [i5] Xiang Yan, Benjamin Van Roy: Manipulation Robustness of Collaborative Filtering Systems. CoRR abs/0903.0064 (2009)
- 2008
  - [j19] Haim H. Permuter, Paul Cuff, Benjamin Van Roy, Tsachy Weissman: Capacity of the Trapdoor Channel With Feedback. IEEE Trans. Inf. Theory 54(7): 3150-3165 (2008)
  - [c22] Xiang Yan, Benjamin Van Roy: Reputation markets. NetEcon 2008: 79-84
- 2007
  - [j18] Benjamin Van Roy: A short proof of optimality for the MIN cache replacement algorithm. Inf. Process. Lett. 102(2-3): 72-73 (2007)
  - [c21] Haim H. Permuter, Paul Cuff, Benjamin Van Roy, Tsachy Weissman: Capacity and Zero-Error Capacity of the Chemical Channel with Feedback. ISIT 2007: 1866-1870
  - [i4] Vivek F. Farias, Ciamac Cyrus Moallemi, Tsachy Weissman, Benjamin Van Roy: Universal Reinforcement Learning. CoRR abs/0707.3087 (2007)
- 2006
  - [j17] David Choi, Benjamin Van Roy: A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning. Discret. Event Dyn. Syst. 16(2): 207-239 (2006)
  - [j16] Paat Rusmevichientong, Benjamin Van Roy, Peter W. Glynn: A Nonparametric Approach to Multiproduct Pricing. Oper. Res. 54(1): 82-98 (2006)
  - [j15] Benjamin Van Roy: Performance Loss Bounds for Approximate Value Iteration with State Aggregation. Math. Oper. Res. 31(2): 234-244 (2006)
  - [j14] Daniela Pucci de Farias, Benjamin Van Roy: A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees. Math. Oper. Res. 31(3): 597-620 (2006)
  - [j13] Vivek F. Farias, Benjamin Van Roy: Approximation algorithms for dynamic resource allocation. Oper. Res. Lett. 34(2): 180-190 (2006)
  - [j12] Ciamac Cyrus Moallemi, Benjamin Van Roy: Consensus Propagation. IEEE Trans. Inf. Theory 52(11): 4753-4766 (2006)
  - [i3] Ciamac Cyrus Moallemi, Benjamin Van Roy: Convergence of the Min-Sum Message Passing Algorithm for Quadratic Optimization. CoRR abs/cs/0603058 (2006)
  - [i2] Ciamac Cyrus Moallemi, Benjamin Van Roy: Consensus Propagation. CoRR abs/cs/0603078 (2006)
  - [i1] Haim H. Permuter, Paul Cuff, Benjamin Van Roy, Tsachy Weissman: Capacity of the Trapdoor Channel with Feedback. CoRR abs/cs/0610047 (2006)
- 2005
  - [c20] Vivek F. Farias, Ciamac C. Moallemi, Benjamin Van Roy, Tsachy Weissman: A universal scheme for learning. ISIT 2005: 1158-1162
  - [c19] Ciamac Cyrus Moallemi, Benjamin Van Roy: Consensus Propagation. NIPS 2005: 899-906
  - [c18] Benjamin Van Roy: TD(0) Leads to Better Policies than Approximate Value Iteration. NIPS 2005: 1377-1384
  - [c17] Gabriel Y. Weintraub, C. Lanier Benkard, Benjamin Van Roy: Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games. NIPS 2005: 1489-1496
- 2004
  - [j11] Daniela Pucci de Farias, Benjamin Van Roy: On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming. Math. Oper. Res. 29(3): 462-478 (2004)
  - [c16] Daniela Pucci de Farias, Benjamin Van Roy: A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees. NIPS 2004: 417-424
  - [c15] Xiang Yan, Persi Diaconis, Paat Rusmevichientong, Benjamin Van Roy: Solitaire: Man Versus Machine. NIPS 2004: 1553-1560
  - [c14] Hui Zhang, Ashish Goel, Ramesh Govindan, Kahn Mason, Benjamin Van Roy: Making Eigenvector-Based Reputation Systems Robust to Collusion. WAW 2004: 92-104
- 2003
  - [j10] Benjamin Van Roy: Self-learning control of finite Markov chains: A.S. Poznyak, K. Najim, E. Gómez-Ramírez, Marcel Dekker, New York, 2000, $150, pp 298, ISBN 0-8247-9249-X. Autom. 39(2): 373-376 (2003)
  - [j9] Paat Rusmevichientong, Benjamin Van Roy: Decentralized decision-making in a large team with local information. Games Econ. Behav. 43(2): 266-295 (2003)
  - [j8] Daniela Pucci de Farias, Benjamin Van Roy: The Linear Programming Approach to Approximate Dynamic Programming. Oper. Res. 51(6): 850-865 (2003)
  - [c13] Daniela Pucci de Farias, Benjamin Van Roy: On constraint sampling in the linear programming approach to approximate linear programming. CDC 2003: 2441-2446
  - [c12] Ciamac Cyrus Moallemi, Benjamin Van Roy: Distributed Optimization in Adaptive Networks. NIPS 2003: 887-894
- 2002
  - [j7] John N. Tsitsiklis, Benjamin Van Roy: On Average Versus Discounted Reward Temporal-Difference Learning. Mach. Learn. 49(2-3): 179-191 (2002)
  - [c11] Daniela Pucci de Farias, Benjamin Van Roy: Approximate Linear Programming for Average-Cost Dynamic Programming. NIPS 2002: 1587-1594
- 2001
  - [j6] Paat Rusmevichientong, Benjamin Van Roy: An analysis of belief propagation on the turbo decoding graph with Gaussian densities. IEEE Trans. Inf. Theory 47(2): 745-765 (2001)
  - [j5] John N. Tsitsiklis, Benjamin Van Roy: Regression methods for pricing complex American-style options. IEEE Trans. Neural Networks 12(4): 694-703 (2001)
  - [c10] David Choi, Benjamin Van Roy: A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal Difference Learning. ICML 2001: 43-50
  - [c9] Daniela Pucci de Farias, Benjamin Van Roy: Approximate Dynamic Programming via Linear Programming. NIPS 2001: 689-695
  - [c8] Paat Rusmevichientong, Benjamin Van Roy: A Tractable POMDP for Dynamic Sequencing with Applications to Personalized Internet Content Provision. UAI 2001: 480-487
- 2000
  - [c7] Nathaniel Keohane, Benjamin Van Roy, Richard Zeckhauser: The optimal harvesting of environmental bads. CDC 2000: 234-239
  - [c6] Daniela Pucci de Farias, Benjamin Van Roy: Approximate value iteration with randomized policies. CDC 2000: 3421-3426
  - [c5] Daniela Pucci de Farias, Benjamin Van Roy: Fixed Points of Approximate Value Iteration and Temporal-Difference Learning. ICML 2000: 207-214
1990 – 1999
- 1999
  - [j4] John N. Tsitsiklis, Benjamin Van Roy: Average cost temporal-difference learning. Autom. 35(11): 1799-1808 (1999)
  - [j3] John N. Tsitsiklis, Benjamin Van Roy: Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives. IEEE Trans. Autom. Control. 44(10): 1840-1851 (1999)
  - [c4] Paat Rusmevichientong, Benjamin Van Roy: An Analysis of Turbo Decoding with Gaussian Densities. NIPS 1999: 575-581
- 1998
  - [b1] Benjamin Van Roy: Learning and value function approximation in complex decision processes. Massachusetts Institute of Technology, Cambridge, MA, USA, 1998
- 1997
  - [j2] John N. Tsitsiklis, Benjamin Van Roy: An analysis of temporal-difference learning with function approximation. IEEE Trans. Autom. Control. 42(5): 674-690 (1997)
- 1996
  - [j1] John N. Tsitsiklis, Benjamin Van Roy: Feature-Based Methods for Large Scale Dynamic Programming. Mach. Learn. 22(1-3): 59-94 (1996)
  - [c3] John N. Tsitsiklis, Benjamin Van Roy: Analysis of Temporal-Difference Learning with Function Approximation. NIPS 1996: 1075-1081
  - [c2] John N. Tsitsiklis, Benjamin Van Roy: Approximate Solutions to Optimal Stopping Problems. NIPS 1996: 1082-1088
- 1995
  - [c1] Benjamin Van Roy, John N. Tsitsiklis: Stable Linear Approximations to Dynamic Programming for Stochastic Control Problems with Local Transitions. NIPS 1995: 1045-1051
last updated on 2024-12-01 00:11 CET by the dblp team
all metadata released as open data under CC0 1.0 license