Jacob Steinhardt
2020 – today
- 2024
- [c63] Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy:
Describing Differences in Image Sets with Natural Language. CVPR 2024: 24199-24208
- [c62] Jiahai Feng, Jacob Steinhardt:
How do Language Models Bind Entities in Context? ICLR 2024
- [c61] Yossi Gandelsman, Alexei A. Efros, Jacob Steinhardt:
Interpreting CLIP's Image Representation via Text-Based Decomposition. ICLR 2024
- [c60] Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt:
Overthinking the Truth: Understanding how Language Models Process False Demonstrations. ICLR 2024
- [c59] Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen R. McKeown:
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations. ICML 2024
- [c58] Danny Halawi, Alexander Wei, Eric Wallace, Tony Tong Wang, Nika Haghtalab, Jacob Steinhardt:
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation. ICML 2024
- [c57] Alexander Pan, Erik Jones, Meena Jagadeesan, Jacob Steinhardt:
Feedback Loops With Language Models Drive In-Context Reward Hacking. ICML 2024
- [i82] Alexander Pan, Erik Jones, Meena Jagadeesan, Jacob Steinhardt:
Feedback Loops With Language Models Drive In-Context Reward Hacking. CoRR abs/2402.06627 (2024)
- [i81] Danny Halawi, Fred Zhang, Chen Yueh-Han, Jacob Steinhardt:
Approaching Human-Level Forecasting with Language Models. CoRR abs/2402.18563 (2024)
- [i80] Yossi Gandelsman, Alexei A. Efros, Jacob Steinhardt:
Interpreting the Second-Order Effects of Neurons in CLIP. CoRR abs/2406.04341 (2024)
- [i79] Erik Jones, Anca D. Dragan, Jacob Steinhardt:
Adversaries Can Misuse Combinations of Safe Models. CoRR abs/2406.14595 (2024)
- [i78] Jiahai Feng, Stuart Russell, Jacob Steinhardt:
Monitoring Latent World States in Language Models with Propositional Probes. CoRR abs/2406.19501 (2024)
- [i77] Danny Halawi, Alexander Wei, Eric Wallace, Tony T. Wang, Nika Haghtalab, Jacob Steinhardt:
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation. CoRR abs/2406.20053 (2024)
- [i76] Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt:
Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry. CoRR abs/2409.03734 (2024)
- [i75] Ruiqi Zhong, Heng Wang, Dan Klein, Jacob Steinhardt:
Explaining Datasets in Words: Statistical Models with Natural Language Parameters. CoRR abs/2409.08466 (2024)
- [i74] Jiaxin Wen, Ruiqi Zhong, Akbir Khan, Ethan Perez, Jacob Steinhardt, Minlie Huang, Samuel R. Bowman, He He, Shi Feng:
Language Models Learn to Mislead Humans via RLHF. CoRR abs/2409.12822 (2024)
- [i73] Lisa Dunlap, Krishna Mandal, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez:
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models. CoRR abs/2410.12851 (2024)
- 2023
- [j9] Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, Jacob Steinhardt:
Learning Equilibria in Matching Markets with Bandit Feedback. J. ACM 70(3): 19:1-19:46 (2023)
- [c56] Kush Bhatia, Wenshuo Guo, Jacob Steinhardt:
Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws. AISTATS 2023: 11149-11171
- [c55] Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt:
Discovering Latent Knowledge in Language Models Without Supervision. ICLR 2023
- [c54] Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt:
Progress measures for grokking via mechanistic interpretability. ICLR 2023
- [c53] Kevin Ro Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt:
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small. ICLR 2023
- [c52] Erik Jones, Anca D. Dragan, Aditi Raghunathan, Jacob Steinhardt:
Automatically Auditing Large Language Models via Discrete Optimization. ICML 2023: 15307-15329
- [c51] Yongyi Yang, Jacob Steinhardt, Wei Hu:
Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations. ICML 2023: 39453-39487
- [c50] Alexander Wei, Nika Haghtalab, Jacob Steinhardt:
Jailbroken: How Does LLM Safety Training Fail? NeurIPS 2023
- [c49] Meena Jagadeesan, Nikhil Garg, Jacob Steinhardt:
Supply-Side Equilibria in Recommender Systems. NeurIPS 2023
- [c48] Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt, Nika Haghtalab:
Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition. NeurIPS 2023
- [c47] Shengbang Tong, Erik Jones, Jacob Steinhardt:
Mass-Producing Failures of Multimodal Systems with Language Models. NeurIPS 2023
- [c46] Ruiqi Zhong, Peter Zhang, Steve Li, Jinwoo Ahn, Dan Klein, Jacob Steinhardt:
Goal Driven Discovery of Distributional Differences via Language Descriptions. NeurIPS 2023
- [i72] Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt:
Progress measures for grokking via mechanistic interpretability. CoRR abs/2301.05217 (2023)
- [i71] Kush Bhatia, Wenshuo Guo, Jacob Steinhardt:
Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws. CoRR abs/2302.12349 (2023)
- [i70] Ruiqi Zhong, Peter Zhang, Steve Li, Jinwoo Ahn, Dan Klein, Jacob Steinhardt:
Goal Driven Discovery of Distributional Differences via Language Descriptions. CoRR abs/2302.14233 (2023)
- [i69] Erik Jones, Anca D. Dragan, Aditi Raghunathan, Jacob Steinhardt:
Automatically Auditing Large Language Models via Discrete Optimization. CoRR abs/2303.04381 (2023)
- [i68] Nora Belrose, Zach Furman, Logan Smith, Danny Halawi, Igor Ostrovsky, Lev McKinney, Stella Biderman, Jacob Steinhardt:
Eliciting Latent Predictions from Transformers with the Tuned Lens. CoRR abs/2303.08112 (2023)
- [i67] Xinyan Hu, Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt:
Incentivizing High-Quality Content in Online Recommender Systems. CoRR abs/2306.07479 (2023)
- [i66] Shengbang Tong, Erik Jones, Jacob Steinhardt:
Mass-Producing Failures of Multimodal Systems with Language Models. CoRR abs/2306.12105 (2023)
- [i65] Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt, Nika Haghtalab:
Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition. CoRR abs/2306.14670 (2023)
- [i64] Yongyi Yang, Jacob Steinhardt, Wei Hu:
Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations. CoRR abs/2306.17105 (2023)
- [i63] Alexander Wei, Nika Haghtalab, Jacob Steinhardt:
Jailbroken: How Does LLM Safety Training Fail? CoRR abs/2307.02483 (2023)
- [i62] Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen R. McKeown:
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations. CoRR abs/2307.08678 (2023)
- [i61] Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt:
Overthinking the Truth: Understanding how Language Models Process False Demonstrations. CoRR abs/2307.09476 (2023)
- [i60] Yossi Gandelsman, Alexei A. Efros, Jacob Steinhardt:
Interpreting CLIP's Image Representation via Text-Based Decomposition. CoRR abs/2310.05916 (2023)
- [i59] Jiahai Feng, Jacob Steinhardt:
How do Language Models Bind Entities in Context? CoRR abs/2310.17191 (2023)
- [i58] Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy:
Describing Differences in Image Sets with Natural Language. CoRR abs/2312.02974 (2023)
- 2022
- [j8] Pang Wei Koh, Jacob Steinhardt, Percy Liang:
Stronger data poisoning attacks break data sanitization defenses. Mach. Learn. 111(1): 1-47 (2022)
- [c45] Ye Wang, Norman Mu, Daniele Grandi, Nicolas Savva, Jacob Steinhardt:
A3D: Studying Pretrained Representations with Programmable Datasets. CVPR Workshops 2022: 4877-4885
- [c44] Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Bo Li, Dawn Song, Jacob Steinhardt:
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures. CVPR 2022: 16762-16771
- [c43] Alexander Pan, Kush Bhatia, Jacob Steinhardt:
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models. ICLR 2022
- [c42] Dan Hendrycks, Steven Basart, Mantas Mazeika, Andy Zou, Joseph Kwon, Mohammadreza Mostajabi, Jacob Steinhardt, Dawn Song:
Scaling Out-of-Distribution Detection for Real-World Settings. ICML 2022: 8759-8773
- [c41] Alexander Wei, Wei Hu, Jacob Steinhardt:
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize. ICML 2022: 23549-23588
- [c40] Yaodong Yu, Zitong Yang, Alexander Wei, Yi Ma, Jacob Steinhardt:
Predicting Out-of-Distribution Error with the Projection Norm. ICML 2022: 25721-25746
- [c39] Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt:
Describing Differences between Text Distributions with Natural Language. ICML 2022: 27099-27116
- [c38] Erik Jones, Jacob Steinhardt:
Capturing Failures of Large Language Models via Human Cognitive Biases. NeurIPS 2022
- [c37] Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chan, Dawn Song, David A. Forsyth, Jacob Steinhardt, Dan Hendrycks:
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios. NeurIPS 2022
- [c36] Andy Zou, Tristan Xiao, Ryan Jia, Joe Kwon, Mantas Mazeika, Richard Li, Dawn Song, Jacob Steinhardt, Owain Evans, Dan Hendrycks:
Forecasting Future World Events With Neural Networks. NeurIPS 2022
- [i57] Alexander Pan, Kush Bhatia, Jacob Steinhardt:
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models. CoRR abs/2201.03544 (2022)
- [i56] Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt:
Summarizing Differences between Text Distributions with Natural Language. CoRR abs/2201.12323 (2022)
- [i55] Yaodong Yu, Zitong Yang, Alexander Wei, Yi Ma, Jacob Steinhardt:
Predicting Out-of-Distribution Error with the Projection Norm. CoRR abs/2202.05834 (2022)
- [i54] Erik Jones, Jacob Steinhardt:
Capturing Failures of Large Language Models via Human Cognitive Biases. CoRR abs/2202.12299 (2022)
- [i53] Alexander Wei, Wei Hu, Jacob Steinhardt:
More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize. CoRR abs/2203.06176 (2022)
- [i52] Meena Jagadeesan, Nikhil Garg, Jacob Steinhardt:
Supply-Side Equilibria in Recommender Systems. CoRR abs/2206.13489 (2022)
- [i51] Jean-Stanislas Denain, Jacob Steinhardt:
Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior. CoRR abs/2206.13498 (2022)
- [i50] Andy Zou, Tristan Xiao, Ryan Jia, Joe Kwon, Mantas Mazeika, Richard Li, Dawn Song, Jacob Steinhardt, Owain Evans, Dan Hendrycks:
Forecasting Future World Events with Neural Networks. CoRR abs/2206.15474 (2022)
- [i49] Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chan, Dawn Song, David A. Forsyth, Jacob Steinhardt, Dan Hendrycks:
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios. CoRR abs/2210.10039 (2022)
- [i48] Kevin Wang, Alexandre Variengien, Arthur Conmy, Buck Shlegeris, Jacob Steinhardt:
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small. CoRR abs/2211.00593 (2022)
- [i47] Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt:
Discovering Latent Knowledge in Language Models Without Supervision. CoRR abs/2212.03827 (2022)
- 2021
- [j7] Jacob Steinhardt:
Technical perspective: Robust statistics tackle new problems. Commun. ACM 64(5): 106 (2021)
- [c35] Ruiqi Zhong, Dhruba Ghosh, Dan Klein, Jacob Steinhardt:
Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level. ACL/IJCNLP (Findings) 2021: 3813-3827
- [c34] Collin Burns, Jacob Steinhardt:
Limitations of Post-Hoc Feature Alignment for Robustness. CVPR 2021: 2525-2533
- [c33] Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, Dawn Song:
Natural Adversarial Examples. CVPR 2021: 15262-15271
- [c32] Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, Justin Gilmer:
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization. ICCV 2021: 8320-8329
- [c31] Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, Jacob Steinhardt:
Aligning AI With Shared Human Values. ICLR 2021
- [c30] Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt:
Measuring Massive Multitask Language Understanding. ICLR 2021
- [c29] Kush Bhatia, Peter L. Bartlett, Anca D. Dragan, Jacob Steinhardt:
Agnostic Learning with Unknown Utilities. ITCS 2021: 55:1-55:20
- [c28] Frances Ding, Jean-Stanislas Denain, Jacob Steinhardt:
Grounding Representation Similarity Through Statistical Testing. NeurIPS 2021: 1556-1568
- [c27] Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, Jacob Steinhardt:
Measuring Mathematical Problem Solving With the MATH Dataset. NeurIPS Datasets and Benchmarks 2021
- [c26] Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt:
Measuring Coding Challenge Competence With APPS. NeurIPS Datasets and Benchmarks 2021
- [c25] Dan Hendrycks, Mantas Mazeika, Andy Zou, Sahil Patel, Christine Zhu, Jesus Navarro, Dawn Song, Bo Li, Jacob Steinhardt:
What Would Jiminy Cricket Do? Towards Agents That Behave Morally. NeurIPS Datasets and Benchmarks 2021
- [c24] Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, Jacob Steinhardt:
Learning Equilibria in Matching Markets from Bandit Feedback. NeurIPS 2021: 3323-3335
- [i46] Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, Jacob Steinhardt:
Measuring Mathematical Problem Solving With the MATH Dataset. CoRR abs/2103.03874 (2021)
- [i45] Collin Burns, Jacob Steinhardt:
Limitations of Post-Hoc Feature Alignment for Robustness. CoRR abs/2103.05898 (2021)
- [i44] Charlie Snell, Ruiqi Zhong, Dan Klein, Jacob Steinhardt:
Approximating How Single Head Attention Learns. CoRR abs/2103.07601 (2021)
- [i43] Yaodong Yu, Zitong Yang, Edgar Dobriban, Jacob Steinhardt, Yi Ma:
Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition. CoRR abs/2103.09947 (2021)
- [i42] Kush Bhatia, Peter L. Bartlett, Anca D. Dragan, Jacob Steinhardt:
Agnostic learning with unknown utilities. CoRR abs/2104.08482 (2021)
- [i41] Ruiqi Zhong, Dhruba Ghosh, Dan Klein, Jacob Steinhardt:
Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level. CoRR abs/2105.06020 (2021)
- [i40] Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt:
Measuring Coding Challenge Competence With APPS. CoRR abs/2105.09938 (2021)
- [i39] Frances Ding, Jean-Stanislas Denain, Jacob Steinhardt:
Grounding Representation Similarity with Statistical Testing. CoRR abs/2108.01661 (2021)
- [i38] Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, Jacob Steinhardt:
Learning Equilibria in Matching Markets from Bandit Feedback. CoRR abs/2108.08843 (2021)
- [i37] Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt:
Unsolved Problems in ML Safety. CoRR abs/2109.13916 (2021)
- [i36] Dan Hendrycks, Mantas Mazeika, Andy Zou, Sahil Patel, Christine Zhu, Jesus Navarro, Dawn Song, Bo Li, Jacob Steinhardt:
What Would Jiminy Cricket Do? Towards Agents That Behave Morally. CoRR abs/2110.13136 (2021)
- [i35] Alan Pham, Eunice Chan, Vikranth Srivatsa, Dhruba Ghosh, Yaoqing Yang, Yaodong Yu, Ruiqi Zhong, Joseph E. Gonzalez, Jacob Steinhardt:
The Effect of Model Size on Worst-Group Generalization. CoRR abs/2112.04094 (2021)
- [i34] Dan Hendrycks, Andy Zou, Mantas Mazeika, Leonard Tang, Bo Li, Dawn Song, Jacob Steinhardt:
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures. CoRR abs/2112.05135 (2021)
- 2020
- [c23] Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Jacob Steinhardt, Aleksander Madry:
Identifying Statistical Bias in Dataset Replication. ICML 2020: 2922-2932
- [c22] Zitong Yang, Yaodong Yu, Chong You, Jacob Steinhardt, Yi Ma:
Rethinking Bias-Variance Trade-off for Generalization of Neural Networks. ICML 2020: 10767-10777
- [c21] Banghua Zhu, Jiantao Jiao, Jacob Steinhardt:
When does the Tukey Median work? ISIT 2020: 1201-1206
- [c20] Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian J. Goodfellow, Percy Liang, Pushmeet Kohli:
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming. NeurIPS 2020
- [i33] Banghua Zhu, Jiantao Jiao, Jacob Steinhardt:
When does the Tukey median work? CoRR abs/2001.07805 (2020)
- [i32] Zitong Yang, Yaodong Yu, Chong You, Jacob Steinhardt, Yi Ma:
Rethinking Bias-Variance Trade-off for Generalization of Neural Networks. CoRR abs/2002.11328 (2020)
- [i31] Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Jacob Steinhardt, Aleksander Madry:
Identifying Statistical Bias in Dataset Replication. CoRR abs/2005.09619 (2020)
- [i30] Banghua Zhu, Jiantao Jiao, Jacob Steinhardt:
Robust estimation via generalized quasi-gradients. CoRR abs/2005.14073 (2020)
- [i29] Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, Justin Gilmer:
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization. CoRR abs/2006.16241 (2020)
- [i28] Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, Jacob Steinhardt:
Aligning AI With Shared Human Values. CoRR abs/2008.02275 (2020)
- [i27] Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt:
Measuring Massive Multitask Language Understanding. CoRR abs/2009.03300 (2020)
- [i26] Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian J. Goodfellow, Percy Liang, Pushmeet Kohli:
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming. CoRR abs/2010.11645 (2020)
2010 – 2019
- 2019
- [j6] Zachary C. Lipton, Jacob Steinhardt:
Research for practice: troubling trends in machine-learning scholarship. Commun. ACM 62(6): 45-53 (2019)
- [j5] Kensen Shi, Jacob Steinhardt, Percy Liang:
FrAngel: component-based synthesis with control structures. Proc. ACM Program. Lang. 3(POPL): 73:1-73:29 (2019)
- [j4] Zachary C. Lipton, Jacob Steinhardt:
Troubling Trends in Machine Learning Scholarship. ACM Queue 17(1): 80 (2019)
- [c19] Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart:
Sever: A Robust Meta-Algorithm for Stochastic Optimization. ICML 2019: 1596-1606
- [i25] Daniel Kang, Yi Sun, Tom Brown, Dan Hendrycks, Jacob Steinhardt:
Transfer of Adversarial Robustness Between Perturbation Types. CoRR abs/1905.01034 (2019)
- [i24] Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, Dawn Song:
Natural Adversarial Examples. CoRR abs/1907.07174 (2019)
- [i23] Daniel Kang, Yi Sun, Dan Hendrycks, Tom Brown, Jacob Steinhardt:
Testing Robustness Against Unforeseen Adversaries. CoRR abs/1908.08016 (2019)
- [i22] Banghua Zhu, Jiantao Jiao, Jacob Steinhardt:
Generalized Resilience and Robust Statistics. CoRR abs/1909.08755 (2019)
- [i21] Dan Hendrycks, Steven Basart, Mantas Mazeika, Mohammadreza Mostajabi, Jacob Steinhardt, Dawn Song:
A Benchmark for Anomaly Segmentation. CoRR abs/1911.11132 (2019)
- 2018
- [b1] Jacob Steinhardt:
Robust learning: information theory and algorithms. Stanford University, USA, 2018
- [c18] Aditi Raghunathan, Jacob Steinhardt, Percy Liang:
Certified Defenses against Adversarial Examples. ICLR (Poster) 2018
- [c17] Jacob Steinhardt, Moses Charikar, Gregory Valiant:
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers. ITCS 2018: 45:1-45:21
- [c16] Aditi Raghunathan, Jacob Steinhardt, Percy Liang:
Semidefinite relaxations for certifying robustness to adversarial examples. NeurIPS 2018: 10900-10910
- [c15] Pravesh K. Kothari, Jacob Steinhardt, David Steurer:
Robust moment estimation and improved clustering via sum of squares. STOC 2018: 1035-1046
- [i20] Aditi Raghunathan, Jacob Steinhardt, Percy Liang:
Certified Defenses against Adversarial Examples. CoRR abs/1801.09344 (2018)
- [i19] Miles Brundage, Shahar Avin, Jack Clark, Helen Toner, Peter Eckersley, Ben Garfinkel, Allan Dafoe, Paul Scharre, Thomas Zeitzoff, Bobby Filar, Hyrum S. Anderson, Heather Roff, Gregory C. Allen, Jacob Steinhardt, Carrick Flynn, Seán Ó hÉigeartaigh, Simon Beard, Haydn Belfield, Sebastian Farquhar, Clare Lyle, Rebecca Crootof, Owain Evans, Michael Page, Joanna Bryson, Roman Yampolskiy, Dario Amodei:
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation. CoRR abs/1802.07228 (2018)
- [i18] Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart:
Sever: A Robust Meta-Algorithm for Stochastic Optimization. CoRR abs/1803.02815 (2018)
- [i17] Zachary C. Lipton, Jacob Steinhardt:
Troubling Trends in Machine Learning Scholarship. CoRR abs/1807.03341 (2018)
- [i16] Pang Wei Koh, Jacob Steinhardt, Percy Liang:
Stronger Data Poisoning Attacks Break Data Sanitization Defenses. CoRR abs/1811.00741 (2018)
- [i15] Aditi Raghunathan, Jacob Steinhardt, Percy Liang:
Semidefinite relaxations for certifying robustness to adversarial examples. CoRR abs/1811.01057 (2018)
- [i14] Kensen Shi, Jacob Steinhardt, Percy Liang:
FrAngel: Component-Based Synthesis with Control Structures. CoRR abs/1811.05175 (2018)
- 2017
- [c14] Jacob Steinhardt, Pang Wei Koh, Percy Liang:
Certified Defenses for Data Poisoning Attacks. NIPS 2017: 3517-3529
- [c13] Moses Charikar, Jacob Steinhardt, Gregory Valiant:
Learning from untrusted data. STOC 2017: 47-60
- [i13] Jacob Steinhardt, Moses Charikar, Gregory Valiant:
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers. CoRR abs/1703.04940 (2017)
- [i12] Jacob Steinhardt:
Does robustness imply tractability? A lower bound for planted clique in the semi-random model. CoRR abs/1704.05120 (2017)
- [i11] Jacob Steinhardt, Pang Wei Koh, Percy Liang:
Certified Defenses for Data Poisoning Attacks. CoRR abs/1706.03691 (2017)
- [i10] Pravesh K. Kothari, Jacob Steinhardt:
Better Agnostic Clustering Via Relaxed Tensor Norms. CoRR abs/1711.07465 (2017)
- [i9] Jacob Steinhardt:
Does robustness imply tractability? A lower bound for planted clique in the semi-random model. Electron. Colloquium Comput. Complex. TR17 (2017)
- 2016
- [c12] Jacob Steinhardt, Gregory Valiant, Stefan Wager:
Memory, Communication, and Statistical Queries. COLT 2016: 1490-1516
- [c11] Jacob Steinhardt, Percy Liang:
Unsupervised Risk Estimation Using Only Conditional Independence Structure. NIPS 2016: 3657-3665
- [c10] Jacob Steinhardt, Gregory Valiant, Moses Charikar:
Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction. NIPS 2016: 4439-4447
- [i8] Jacob Steinhardt, Percy Liang:
Unsupervised Risk Estimation Using Only Conditional Independence Structure. CoRR abs/1606.05313 (2016)
- [i7] Jacob Steinhardt, Gregory Valiant, Moses Charikar:
Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction. CoRR abs/1606.05374 (2016)
- [i6] Dario Amodei, Chris Olah, Jacob Steinhardt, Paul F. Christiano, John Schulman, Dan Mané:
Concrete Problems in AI Safety. CoRR abs/1606.06565 (2016)
- [i5] Moses Charikar, Jacob Steinhardt, Gregory Valiant:
Learning from Untrusted Data. CoRR abs/1611.02315 (2016)
- 2015
- [c9] Tianlin Shi, Jacob Steinhardt, Percy Liang:
Learning Where to Sample in Structured Prediction. AISTATS 2015
- [c8] Jacob Steinhardt, John C. Duchi:
Minimax rates for memory-bounded sparse linear regression. COLT 2015: 1564-1587
- [c7] Jacob Steinhardt, Percy Liang:
Reified Context Models. ICML 2015: 1043-1052
- [c6] Jacob Steinhardt, Percy Liang:
Learning Fast-Mixing Models for Structured Prediction. ICML 2015: 1063-1072
- [c5] Jacob Steinhardt, Percy Liang:
Learning with Relaxed Supervision. NIPS 2015: 2827-2835
- [i4] Jacob Steinhardt, Percy Liang:
Reified Context Models. CoRR abs/1502.06665 (2015)
- [i3] Jacob Steinhardt, Percy Liang:
Learning Fast-Mixing Models for Structured Prediction. CoRR abs/1502.06668 (2015)
- [i2] Jacob Steinhardt, Gregory Valiant, Stefan Wager:
Memory, Communication, and Statistical Queries. Electron. Colloquium Comput. Complex. TR15 (2015)
- 2014
- [c4] Jacob Steinhardt, Percy Liang:
Filtering with Abstract Particles. ICML 2014: 727-735
- [c3] Jacob Steinhardt, Percy Liang:
Adaptivity and Optimism: An Improved Exponentiated Gradient Algorithm. ICML 2014: 1593-1601
- [i1] Jacob Steinhardt, Stefan Wager, Percy Liang:
The Statistics of Streaming Sparse Regression. CoRR abs/1412.4182 (2014)
- 2012
- [j3] Jacob Steinhardt, Russ Tedrake:
Finite-time regional verification of stochastic non-linear systems. Int. J. Robotics Res. 31(7): 901-923 (2012)
- [c2] Jacob Steinhardt, Zoubin Ghahramani:
Flexible Martingale Priors for Deep Hierarchies. AISTATS 2012: 1108-1116
- 2011
- [c1] Jacob Steinhardt, Russ Tedrake:
Finite-Time Regional Verification of Stochastic Nonlinear Systems. Robotics: Science and Systems 2011
- 2010
- [j2] Jacob Steinhardt:
Permutations with Ascending and Descending Blocks. Electron. J. Comb. 17(1) (2010)
2000 – 2009
- 2009
- [j1] Jacob Steinhardt:
On Coloring the Odd-Distance Graph. Electron. J. Comb. 16(1) (2009)
last updated on 2024-11-25 23:40 CET by the dblp team
all metadata released as open data under CC0 1.0 license