Olatunji Ruwase
2020 – today
- 2024
- [c22]Guanhua Wang, Heyang Qin, Sam Ade Jacobs, Xiaoxia Wu, Connor Holmes, Zhewei Yao, Samyam Rajbhandari, Olatunji Ruwase, Feng Yan, Lei Yang, Yuxiong He:
ZeRO++: Extremely Efficient Collective Communication for Large Model Training. ICLR 2024
- [c21]Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Donglin Zhuang, Zhongzhu Zhou, Olatunji Ruwase, Yuxiong He, Shuaiwen Leon Song:
Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric Algorithm-System Co-Design on Modern GPUs. USENIX ATC 2024: 699-713
- [i18]Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Donglin Zhuang, Zhongzhu Zhou, Olatunji Ruwase, Yuxiong He, Shuaiwen Leon Song:
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design. CoRR abs/2401.14112 (2024)
- [i17]Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang:
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding. CoRR abs/2403.04797 (2024)
- [i16]Marah I Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat S. Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou:
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone. CoRR abs/2404.14219 (2024)
- [i15]Guanhua Wang, Olatunji Ruwase, Bing Xie, Yuxiong He:
FastPersist: Accelerating Model Checkpointing in Deep Learning. CoRR abs/2406.13768 (2024)
- [i14]Xinyu Lian, Sam Ade Jacobs, Lev Kurilenko, Masahiro Tanaka, Stas Bekman, Olatunji Ruwase, Minjia Zhang:
Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training. CoRR abs/2406.18820 (2024)
- [i13]Jinghan Yao, Sam Ade Jacobs, Masahiro Tanaka, Olatunji Ruwase, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer. CoRR abs/2408.16978 (2024)
- [i12]Guanhua Wang, Chengming Zhang, Zheyu Shen, Ang Li, Olatunji Ruwase:
Domino: Eliminating Communication in LLM Training via Generic Tensor Slicing and Overlapping. CoRR abs/2409.15241 (2024)
- 2023
- [j3]Reza Yazdani Aminabadi, Olatunji Ruwase, Minjia Zhang, Yuxiong He, José-María Arnau, Antonio González:
SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Networks. ACM Trans. Embed. Comput. Syst. 22(2): 30:1-30:23 (2023)
- [c20]Siddharth Singh, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He, Abhinav Bhatele:
A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training. ICS 2023: 203-214
- [i11]Siddharth Singh, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He, Abhinav Bhatele:
A Novel Tensor-Expert Hybrid Parallelism Approach to Scale Mixture-of-Experts Training. CoRR abs/2303.06318 (2023)
- [i10]Guanhua Wang, Heyang Qin, Sam Ade Jacobs, Connor Holmes, Samyam Rajbhandari, Olatunji Ruwase, Feng Yan, Lei Yang, Yuxiong He:
ZeRO++: Extremely Efficient Collective Communication for Giant Model Training. CoRR abs/2306.10209 (2023)
- [i9]Zhewei Yao, Reza Yazdani Aminabadi, Olatunji Ruwase, Samyam Rajbhandari, Xiaoxia Wu, Ammar Ahmad Awan, Jeff Rasley, Minjia Zhang, Conglong Li, Connor Holmes, Zhongzhu Zhou, Michael Wyatt, Molly Smith, Lev Kurilenko, Heyang Qin, Masahiro Tanaka, Shuai Che, Shuaiwen Leon Song, Yuxiong He:
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales. CoRR abs/2308.01320 (2023)
- [i8]Zhewei Yao, Xiaoxia Wu, Conglong Li, Minjia Zhang, Heyang Qin, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He:
DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention. CoRR abs/2309.14327 (2023)
- [i7]Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, Arash Bakhtiari, Michael Wyatt, Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao:
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks. CoRR abs/2312.08583 (2023)
- 2022
- [c19]Reza Yazdani Aminabadi, Samyam Rajbhandari, Ammar Ahmad Awan, Cheng Li, Du Li, Elton Zheng, Olatunji Ruwase, Shaden Smith, Minjia Zhang, Jeff Rasley, Yuxiong He:
DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale. SC 2022: 46:1-46:15
- [i6]Reza Yazdani Aminabadi, Samyam Rajbhandari, Minjia Zhang, Ammar Ahmad Awan, Cheng Li, Du Li, Elton Zheng, Jeff Rasley, Shaden Smith, Olatunji Ruwase, Yuxiong He:
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale. CoRR abs/2207.00032 (2022)
- [i5]Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilic, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, et al.:
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. CoRR abs/2211.05100 (2022)
- 2021
- [c18]Heyang Qin, Samyam Rajbhandari, Olatunji Ruwase, Feng Yan, Lei Yang, Yuxiong He:
SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement. NeurIPS 2021: 20531-20544
- [c17]Samyam Rajbhandari, Olatunji Ruwase, Jeff Rasley, Shaden Smith, Yuxiong He:
ZeRO-infinity: breaking the GPU memory wall for extreme scale deep learning. SC 2021: 59
- [c16]Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, Yuxiong He:
ZeRO-Offload: Democratizing Billion-Scale Model Training. USENIX ATC 2021: 551-564
- [i4]Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, Yuxiong He:
ZeRO-Offload: Democratizing Billion-Scale Model Training. CoRR abs/2101.06840 (2021)
- [i3]Samyam Rajbhandari, Olatunji Ruwase, Jeff Rasley, Shaden Smith, Yuxiong He:
ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning. CoRR abs/2104.07857 (2021)
- 2020
- [c15]Jeff Rasley, Samyam Rajbhandari, Olatunji Ruwase, Yuxiong He:
DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters. KDD 2020: 3505-3506
- [c14]Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He:
ZeRO: memory optimizations toward training trillion parameter models. SC 2020: 20
2010 – 2019
- 2019
- [c13]Minjia Zhang, Samyam Rajbhandari, Wenhan Wang, Elton Zheng, Olatunji Ruwase, Jeff Rasley, Jason Li, Junhua Wang, Yuxiong He:
Accelerating Large Scale Deep Learning Inference through DeepCPU at Microsoft. OpML 2019: 5-7
- [i2]Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He:
ZeRO: Memory Optimization Towards Training A Trillion Parameter Models. CoRR abs/1910.02054 (2019)
- [i1]Reza Yazdani, Olatunji Ruwase, Minjia Zhang, Yuxiong He, José-María Arnau, Antonio González:
LSTM-Sharp: An Adaptable, Energy-Efficient Hardware Accelerator for Long Short-Term Memory. CoRR abs/1911.01258 (2019)
- 2018
- [j2]Feng Yan, Yuxiong He, Olatunji Ruwase, Evgenia Smirni:
Efficient Deep Neural Network Serving: Fast and Furious. IEEE Trans. Netw. Serv. Manag. 15(1): 112-126 (2018)
- 2017
- [c12]Samyam Rajbhandari, Yuxiong He, Olatunji Ruwase, Michael Carbin, Trishul M. Chilimbi:
Optimizing CNNs on Multicores for Scalability, Performance and Goodput. ASPLOS 2017: 267-280
- [c11]Jeff Rasley, Yuxiong He, Feng Yan, Olatunji Ruwase, Rodrigo Fonseca:
HyperDrive: exploring hyperparameters with POP scheduling. Middleware 2017: 1-13
- 2016
- [c10]Feng Yan, Yuxiong He, Olatunji Ruwase, Evgenia Smirni:
SERF: efficient scheduling for fast deep neural network serving via judicious parallelism. SC 2016: 300-311
- 2015
- [c9]Kalin Ovtcharov, Olatunji Ruwase, Joo-Young Kim, Jeremy Fowers, Karin Strauss, Eric S. Chung:
Toward accelerating deep learning at scale using specialized hardware in the datacenter. Hot Chips Symposium 2015: 1-38
- [c8]Vivek Seshadri, Gennady Pekhimenko, Olatunji Ruwase, Onur Mutlu, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry, Trishul M. Chilimbi:
Page overlays: an enhanced virtual memory framework to enable fine-grained memory management. ISCA 2015: 79-91
- [c7]Feng Yan, Olatunji Ruwase, Yuxiong He, Trishul M. Chilimbi:
Performance Modeling and Scalability Optimization of Distributed Deep Learning Systems. KDD 2015: 1355-1364
- 2014
- [c6]Olatunji Ruwase, Michael A. Kozuch, Phillip B. Gibbons, Todd C. Mowry:
Guardrail: a high fidelity approach to protecting hardware devices from buggy drivers. ASPLOS 2014: 655-670
- 2013
- [b1]Olatunji Ruwase:
Improving Device Driver Reliability through Decoupled Dynamic Binary Analyses. Carnegie Mellon University, USA, 2013
- 2010
- [c5]Olatunji Ruwase, Shimin Chen, Phillip B. Gibbons, Todd C. Mowry:
Decoupled lifeguards: enabling path optimizations for dynamic correctness checking tools. PLDI 2010: 25-35
2000 – 2009
- 2009
- [j1]Shimin Chen, Michael Kozuch, Phillip B. Gibbons, Michael P. Ryan, Theodoros Strigkos, Todd C. Mowry, Olatunji Ruwase, Evangelos Vlachos, Babak Falsafi, Vijaya Ramachandran:
Flexible Hardware Acceleration for Instruction-Grain Lifeguards. IEEE Micro 29(1): 62-72 (2009)
- 2008
- [c4]Shimin Chen, Michael Kozuch, Theodoros Strigkos, Babak Falsafi, Phillip B. Gibbons, Todd C. Mowry, Vijaya Ramachandran, Olatunji Ruwase, Michael P. Ryan, Evangelos Vlachos:
Flexible Hardware Acceleration for Instruction-Grain Program Monitoring. ISCA 2008: 377-388
- [c3]Fahad R. Dogar, Amar Phanishayee, Himabindu Pucha, Olatunji Ruwase, David G. Andersen:
Ditto: a system for opportunistic caching in multi-hop wireless networks. MobiCom 2008: 279-290
- [c2]Olatunji Ruwase, Phillip B. Gibbons, Todd C. Mowry, Vijaya Ramachandran, Shimin Chen, Michael Kozuch, Michael P. Ryan:
Parallelizing dynamic information flow tracking. SPAA 2008: 35-45
- 2004
- [c1]Olatunji Ruwase, Monica S. Lam:
A Practical Dynamic Buffer Overflow Detector. NDSS 2004
last updated on 2024-10-16 20:27 CEST by the dblp team
all metadata released as open data under CC0 1.0 license