Dhiraj D. Kalamkar
2020 – today
- 2024
- [c20] Evangelos Georganas, Dhiraj D. Kalamkar, Kirill Voronin, Abhisek Kundu, Antonio Noack, Hans Pabst, Alexander Breuer, Alexander Heinecke:
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures. IPDPS 2024: 950-963
- 2023
- [i12] Evangelos Georganas, Dhiraj D. Kalamkar, Kirill Voronin, Antonio Noack, Hans Pabst, Alexander Breuer, Alexander Heinecke:
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures. CoRR abs/2304.12576 (2023)
- 2022
- [j3] Evangelos Georganas, Dhiraj D. Kalamkar, Sasikanth Avancha, Menachem Adelman, Deepti Aggarwal, Cristina Anderson, Alexander Breuer, Jeremy Bruestle, Narendra Chaudhary, Abhisek Kundu, Denise Kutnick, Frank Laub, Md. Vasimuddin, Sanchit Misra, Ramanarayan Mohanty, Hans Pabst, Brian Retford, Barukh Ziv, Alexander Heinecke:
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning and HPC Workloads. Frontiers Appl. Math. Stat. 8: 826269 (2022)
- [c19] Narendra Chaudhary, Sanchit Misra, Dhiraj D. Kalamkar, Alexander Heinecke, Evangelos Georganas, Barukh Ziv, Menachem Adelman, Bharat Kaul:
Accelerating Deep Learning based Identification of Chromatin Accessibility from noisy ATAC-seq Data. IPDPS Workshops 2022: 176-185
- 2021
- [c18] Evangelos Georganas, Dhiraj D. Kalamkar, Sasikanth Avancha, Menachem Adelman, Cristina Anderson, Alexander Breuer, Jeremy Bruestle, Narendra Chaudhary, Abhisek Kundu, Denise Kutnick, Frank Laub, Md. Vasimuddin, Sanchit Misra, Ramanarayan Mohanty, Hans Pabst, Barukh Ziv, Alexander Heinecke:
Tensor processing primitives: a programming abstraction for efficiency and portability in deep learning workloads. SC 2021: 14
- [c17] Md. Vasimuddin, Sanchit Misra, Guixiang Ma, Ramanarayan Mohanty, Evangelos Georganas, Alexander Heinecke, Dhiraj D. Kalamkar, Nesreen K. Ahmed, Sasikanth Avancha:
DistGNN: scalable distributed training for large-scale graph neural networks. SC 2021: 76
- [i11] Evangelos Georganas, Dhiraj D. Kalamkar, Sasikanth Avancha, Menachem Adelman, Cristina Anderson, Alexander Breuer, Narendra Chaudhary, Abhisek Kundu, Md. Vasimuddin, Sanchit Misra, Ramanarayan Mohanty, Hans Pabst, Barukh Ziv, Alexander Heinecke:
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads. CoRR abs/2104.05755 (2021)
- [i10] Md. Vasimuddin, Sanchit Misra, Guixiang Ma, Ramanarayan Mohanty, Evangelos Georganas, Alexander Heinecke, Dhiraj D. Kalamkar, Nesreen K. Ahmed, Sasikanth Avancha:
DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks. CoRR abs/2104.06700 (2021)
- [i9] Narendra Chaudhary, Sanchit Misra, Dhiraj D. Kalamkar, Alexander Heinecke, Evangelos Georganas, Barukh Ziv, Menachem Adelman, Bharat Kaul:
Efficient and Generic 1D Dilated Convolution Layer for Deep Learning. CoRR abs/2104.08002 (2021)
- 2020
- [c16] Evangelos Georganas, Kunal Banerjee, Dhiraj D. Kalamkar, Sasikanth Avancha, Anand Venkat, Michael J. Anderson, Greg Henry, Hans Pabst, Alexander Heinecke:
Harnessing Deep Learning via a Single Building Block. IPDPS 2020: 222-233
- [c15] Dhiraj D. Kalamkar, Evangelos Georganas, Sudarshan Srinivasan, Jianping Chen, Mikhail Shiryaev, Alexander Heinecke:
Optimizing deep learning recommender systems training on CPU cluster architectures. SC 2020: 43
- [i8] Dhiraj D. Kalamkar, Evangelos Georganas, Sudarshan Srinivasan, Jianping Chen, Mikhail Shiryaev, Alexander Heinecke:
Optimizing Deep Learning Recommender Systems' Training On CPU Cluster Architectures. CoRR abs/2005.04680 (2020)
2010 – 2019
- 2019
- [j2] Kunal Banerjee, Evangelos Georganas, Dhiraj D. Kalamkar, Barukh Ziv, Eden Segal, Cristina Anderson, Alexander Heinecke:
Optimizing Deep Learning RNN Topologies on Intel Architecture. Supercomput. Front. Innov. 6(3): 64-85 (2019)
- [c14] Dhiraj D. Kalamkar, Kunal Banerjee, Sudarshan Srinivasan, Srinivas Sridharan, Evangelos Georganas, Mikhail E. Smorkalov, Cong Xu, Alexander Heinecke:
Training Google Neural Machine Translation on an Intel CPU Cluster. CLUSTER 2019: 1-10
- [i7] Dhiraj D. Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, Jiyan Yang, Jongsoo Park, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Abhisek Kundu, Misha Smelyanskiy, Bharat Kaul, Pradeep Dubey:
A Study of BFLOAT16 for Deep Learning Training. CoRR abs/1905.12322 (2019)
- [i6] Evangelos Georganas, Kunal Banerjee, Dhiraj D. Kalamkar, Sasikanth Avancha, Anand Venkat, Michael J. Anderson, Greg Henry, Hans Pabst, Alexander Heinecke:
High-Performance Deep Learning via a Single Building Block. CoRR abs/1906.06440 (2019)
- [i5] Abhisek Kundu, Sudarshan Srinivasan, Eric C. Qin, Dhiraj D. Kalamkar, Naveen K. Mellempudi, Dipankar Das, Kunal Banerjee, Bharat Kaul, Pradeep Dubey:
K-TanH: Hardware Efficient Activations For Deep Learning. CoRR abs/1909.07729 (2019)
- 2018
- [c13] Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj D. Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesús Corbal, Nikita Shustrov, Roman Dubtsov, Evarist Fomenko, Vadim O. Pirogov:
Mixed Precision Training of Convolutional Neural Networks using Integer Operations. ICLR (Poster) 2018
- [c12] Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj D. Kalamkar, Greg Henry, Hans Pabst, Alexander Heinecke:
Anatomy of high-performance deep learning convolutions on SIMD architectures. SC 2018: 66:1-66:12
- [i4] Srinivas Sridharan, Karthikeyan Vaidyanathan, Dhiraj D. Kalamkar, Dipankar Das, Mikhail E. Smorkalov, Mikhail Shiryaev, Dheevatsa Mudigere, Naveen Mellempudi, Sasikanth Avancha, Bharat Kaul, Pradeep Dubey:
On Scale-out Deep Learning Training for Cloud and HPC. CoRR abs/1801.08030 (2018)
- [i3] Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj D. Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesús Corbal, Nikita Shustrov, Roman Dubtsov, Evarist Fomenko, Vadim O. Pirogov:
Mixed Precision Training of Convolutional Neural Networks using Integer Operations. CoRR abs/1802.00930 (2018)
- [i2] Evangelos Georganas, Sasikanth Avancha, Kunal Banerjee, Dhiraj D. Kalamkar, Greg Henry, Hans Pabst, Alexander Heinecke:
Anatomy Of High-Performance Deep Learning Convolutions On SIMD Architectures. CoRR abs/1808.05567 (2018)
- 2016
- [j1] Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Md. Mostofa Ali Patwary, Vadim O. Pirogov, Pradeep Dubey, Xing Liu, Carlos Rosales, Cyril Mazauric, Christopher S. Daley:
Optimizations in a high-performance conjugate gradient benchmark for IA-based multi- and many-core processors. Int. J. High Perform. Comput. Appl. 30(1): 11-27 (2016)
- [c11] Bálint Joó, Dhiraj D. Kalamkar, Thorsten Kurth, Karthikeyan Vaidyanathan, Aaron Walden:
Optimizing Wilson-Dirac Operator and Linear Solvers for Intel® KNL. ISC Workshops 2016: 415-427
- [i1] Dipankar Das, Sasikanth Avancha, Dheevatsa Mudigere, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dhiraj D. Kalamkar, Bharat Kaul, Pradeep Dubey:
Distributed Deep Learning Using Synchronous Stochastic Gradient Descent. CoRR abs/1602.06709 (2016)
- 2015
- [c10] Karthikeyan Vaidyanathan, Dhiraj D. Kalamkar, Kiran Pamnany, Jeff R. Hammond, Pavan Balaji, Dipankar Das, Jongsoo Park, Bálint Joó:
Improving concurrency and asynchrony in multithreaded MPI applications using software offloading. SC 2015: 30:1-30:12
- 2014
- [c9] Karthikeyan Vaidyanathan, Kiran Pamnany, Dhiraj D. Kalamkar, Alexander Heinecke, Mikhail Smelyanskiy, Jongsoo Park, Daehyun Kim, Aniruddha G. Shet, Bharat Kaul, Bálint Joó, Pradeep Dubey:
Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters. IPDPS 2014: 1083-1092
- [c8] Simon Heybrock, Bálint Joó, Dhiraj D. Kalamkar, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Tilo Wettig, Pradeep Dubey:
Lattice QCD with Domain Decomposition on Intel® Xeon Phi Co-Processors. SC 2014: 69-80
- [c7] Srinivas Sridharan, James Dinan, Dhiraj D. Kalamkar:
Enabling Efficient Multithreaded MPI Communication through a Library-Based Implementation of MPI Endpoints. SC 2014: 487-498
- [c6] Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Xing Liu, Md. Mostofa Ali Patwary, Yutong Lu, Pradeep Dubey:
Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices. SC 2014: 945-955
- 2013
- [c5] Bálint Joó, Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Mikhail Smelyanskiy, Kiran Pamnany, Victor W. Lee, Pradeep Dubey, William A. Watson III:
Lattice QCD on Intel® Xeon Phi™ Coprocessors. ISC 2013: 40-54
- 2012
- [c4] Dhiraj D. Kalamkar, Joshua D. Trzasko, Srinivas Sridharan, Mikhail Smelyanskiy, Daehyun Kim, Armando Manduca, Yunhong Shu, Matt A. Bernstein, Bharat Kaul, Pradeep Dubey:
High Performance Non-uniform FFT on Modern X86-based Multi-core Systems. IPDPS 2012: 449-460
- [c3] Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian van Straalen, Mikhail Smelyanskiy, Ann S. Almgren, Pradeep Dubey, John Shalf, Leonid Oliker:
Optimization of geometric multigrid for emerging multi- and manycore processors. SC 2012: 96
- [c2] Mikhail Smelyanskiy, Jason Sewall, Dhiraj D. Kalamkar, Nadathur Satish, Pradeep Dubey, Nikita Astafiev, Ilya Burylov, Andrey Nikolaev, Sergey Maidanov, Shuo Li, Sunil Kulkarni, Charles H. Finan, Ekaterina Gonina:
Analysis and Optimization of Financial Analytics Benchmark on Modern Multi- and Many-core IA-Based Architectures. SC Companion 2012: 1154-1162
2000 – 2009
- 2007
- [c1] Dhiraj D. Kalamkar, Mainak Chaudhuri, Mark A. Heinrich:
Simplifying Active Memory Clusters by Leveraging Directory Protocol Threads. ISPASS 2007: 242-253
last updated on 2024-07-18 21:55 CEST by the dblp team
all metadata released as open data under CC0 1.0 license