Kushal Tirumala
Research Scientist, Facebook AI Research
Verified email at fb.com
Title
Cited by
Year
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
K Tirumala, AH Markosyan, L Zettlemoyer, A Aghajanyan
Neural Information Processing Systems, 2022
Cited by: 235; Year: 2022
Machine learning for the Zwicky Transient Facility
A Mahabal, U Rebbapragada, R Walters, FJ Masci, N Blagorodnova, ...
Publications of the Astronomical Society of the Pacific 131 (997), 038002, 2019
Cited by: 155; Year: 2019
SemDeDup: Data-efficient learning at web-scale through semantic deduplication
A Abbas, K Tirumala, D Simig, S Ganguli, AS Morcos
arXiv preprint arXiv:2303.09540, 2023
Cited by: 150; Year: 2023
Chameleon: Mixed-modal early-fusion foundation models
Chameleon Team
arXiv preprint arXiv:2405.09818, 2024
Cited by: 130; Year: 2024
The United States COVID-19 Forecast Hub dataset
EY Cramer, Y Huang, Y Wang, EL Ray, M Cornell, J Bracher, A Brennen, ...
Scientific Data 9 (1), 462, 2022
Cited by: 107; Year: 2022
D4: Improving LLM pretraining via document de-duplication and diversification
K Tirumala, D Simig, A Aghajanyan, A Morcos
Advances in Neural Information Processing Systems 36, 53983-53995, 2023
Cited by: 89; Year: 2023
Transfusion: Predict the next token and diffuse images with one multi-modal model
C Zhou, L Yu, A Babu, K Tirumala, M Yasunaga, L Shamis, J Kahn, X Ma, ...
arXiv preprint arXiv:2408.11039, 2024
Cited by: 74; Year: 2024
The unreasonable ineffectiveness of the deeper layers
A Gromov, K Tirumala, H Shapourian, P Glorioso, DA Roberts
arXiv preprint arXiv:2403.17887, 2024
Cited by: 61; Year: 2024
An introduction to vision-language modeling
F Bordes, RY Pang, A Ajay, AC Li, A Bardes, S Petryk, O Mañas, Z Lin, ...
arXiv preprint arXiv:2405.17247, 2024
Cited by: 45; Year: 2024
DeepStreaks: identifying fast-moving objects in the Zwicky Transient Facility data with deep learning
DA Duev, A Mahabal, Q Ye, K Tirumala, J Belicki, R Dekany, S Frederick, ...
Monthly Notices of the Royal Astronomical Society 486 (3), 4158-4165, 2019
Cited by: 44; Year: 2019
A method for finding anomalous astronomical light curves and their analogues
JR Martínez-Galarza, FB Bianco, D Crake, K Tirumala, AA Mahabal, ...
Monthly Notices of the Royal Astronomical Society 508 (4), 5734-5756, 2021
Cited by: 25; Year: 2021
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks
T Thrush, K Tirumala, A Gupta, M Bartolo, P Rodriguez, T Kane, ...
ACL System Demos, 2022
Cited by: 17; Year: 2022
Effective pruning of web-scale datasets based on complexity of concept clusters
A Abbas, E Rusak, K Tirumala, W Brendel, K Chaudhuri, AS Morcos
arXiv preprint arXiv:2401.04578, 2024
Cited by: 15; Year: 2024
SemDeDup: Data-efficient learning at web-scale through semantic deduplication, 2023
A Abbas, K Tirumala, D Simig, S Ganguli, AS Morcos
Zaharia, M., Zhang, M., Zhang, T., Zhang, X., Zhang, Y., Zheng, L., Zhou, K …, 2021
Cited by: 11; Year: 2021
Decoding data quality via synthetic corruptions: Embedding-guided pruning of code data
Y Yang, AK Singh, M Elhoushi, A Mahmoud, K Tirumala, F Gloeckle, ...
arXiv preprint arXiv:2312.02418, 2023
Cited by: 8; Year: 2023
Investigating Generalization by Controlling Normalized Margin
A Farhang, J Bernstein, K Tirumala, Y Liu, Y Yue
International Conference on Machine Learning, 2022
Cited by: 6; Year: 2022
Text quality-based pruning for efficient training of language models
V Sharma, K Padthe, N Ardalani, K Tirumala, R Howes, H Xu, PY Huang, ...
arXiv preprint arXiv:2405.01582, 2024
Cited by: 4; Year: 2024
Ensemble machine learning methods for modeling Covid19 deaths
R Bathwal, P Chitta, K Tirumala, V Varadarajan
arXiv preprint arXiv:2010.04052, 2020
Cited by: 4; Year: 2020
Brevity is the soul of wit: Pruning long files for code generation
AK Singh, Y Yang, K Tirumala, M Elhoushi, AS Morcos
arXiv preprint arXiv:2407.00434, 2024
Cited by: 1; Year: 2024
CAT: Content-Adaptive Image Tokenization
J Shen, K Tirumala, M Yasunaga, I Misra, L Zettlemoyer, L Yu, C Zhou
arXiv preprint arXiv:2501.03120, 2025
Year: 2025