IPDPS 2024: San Francisco, CA, USA - Workshops
- IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024 - Workshop, San Francisco, CA, USA, May 27-31, 2024. IEEE 2024, ISBN 979-8-3503-6460-6
- Gagan Agrawal, Alba Cristina Melo:
Message from the 2024 General Co-chairs. xxviii-xxix - Ananth Kalyanaraman, Suren Byna:
Message from the 2024 Workshops Chair and Vice-chair. xxx-xxxi - Xianwei Cheng, Che-Yu Liu, Roberto Proietti, S. J. Ben Yoo:
HPC Systems with Reconfigurable Optical Networks: Performance and Energy Consumption Exploration. 1 - Dhabaleswar K. Panda, Hari Subramoni:
Message from the HCW 2024 Technical Program Committee Co-Chairs. 1 - Murali Emani, Sam Foreman, Varuni Sastry, Zhen Xie, Siddhisanket Raskar, William Arnold, Rajeev Thakur, Venkatram Vishwanath, Michael E. Papka, Sanjif Shanmugavelu, Darshan Gandhi, Hengyu Zhao, Dun Ma, Kiran Ranganath, Rick Weisner, Jiunn-yeu Chen, Yuting Yang, Natalia Vassilieva, Bin C. Zhang, Sylvia Howland, Alexander Tsyplikhin:
Toward a Holistic Performance Evaluation of Large Language Models Across Diverse AI Accelerators. 1-10 - Anne C. Elster, Jan Christian Meyer:
HCW 2024 Preface and Committee List. 1 - Yuxi Tan, Riadh Ben Abdelhamid, Bingjie Guo, Qixiang Gao, Masaru Nishimura, Yoshiki Yamaguchi:
Scalable dual-instruction multiple-data processing on an efficient systolic-array architecture. 1 - Behrooz A. Shirazi, Kamesh Madduri:
Message from the HCW 2024 Steering Committee Co-Chairs. 2 - Anne C. Elster, Jan Christian Meyer:
Message from the HCW 2024 General Co-Chairs. 3 - Dhabaleswar K. Panda, Hari Subramoni:
Message from the HCW 2024 Technical Program Committee Co-Chairs. 4 - Yale N. Patt:
HCW 2024 Keynote: Hetero: Where we've been, Where we are, and What Next? 5 - Yale N. Patt:
Hetero: Where we've been, Where we are, and What Next? 5 - Josh Milthorpe, Xianghao Wang, Ahmad Azizi:
Performance Portability of the Chapel Language on Heterogeneous Architectures. 6-13 - Miroslav Demek, Jiri Filipovic:
Towards Dynamic Autotuning of SpMV in CUSP Library. 14-22 - H. Umut Suluhan, Serhan Gener, Alexander Fusco, Joshua Mack, Ismet Dagli, Mehmet E. Belviranli, Çagatay Edemen, Ali Akoglu:
A Runtime Manager Integrated Emulation Environment for Heterogeneous SoC Design with RISC-V Cores. 23-30 - Hayfa Tayeb, Bérenger Bramas, Mathieu Faverge, Abdou Guermouche:
Dynamic Tasks Scheduling with Multiple Priorities on Heterogeneous Computing Systems. 31-40 - Niccolo Nicolosi, Francesco Renato Negri, Francesco Pesce, Francesco Peverelli, Davide Conficconi, Marco Domenico Santambrogio:
PSyGS Gen: A Generator of Domain-Specific Architectures to Accelerate Sparse Linear System Resolution. 41-47 - Beau Johnston, Narasinga Rao Miniskar, Aaron R. Young, Mohammad Alaul Haque Monil, Seyong Lee, Jeffrey S. Vetter:
IRIS: Exploring Performance Scaling of the Intelligent Runtime System and its Dynamic Scheduling Policies. 58-67 - Mingxuan He, Fangping Liu, Sang Wook Stephen Do:
Heterogeneous Hyperthreading. 68-78 - Jürgen Becker, Zhenman Fang, Viktor K. Prasanna, Marco D. Santambrogio, Ramachandran Vaidyanathan:
31st Reconfigurable Architectures Workshop (RAW 2024). 79 - Marco Domenico Santambrogio:
RAW 2024 Committees. 80-81 - Deming Chen:
RAW 2024 Monday Keynote. 82 - Wayne Luk:
RAW 2024 Invited Talk-1: Auto-Generating Diverse Heterogeneous Designs. 83 - Stefania Perri:
RAW 2024 Invited Talk-2: Digital In-Memory Computing to Accelerate Deep Learning Inference on the Edge. 84 - Diana Göhringer:
RAW 2024 Invited Talk-3: Self-aware Reliable and Reconfigurable Computing Systems - An Overview. 85 - Dirk Stroobandt:
RAW 2024 Invited Talk-4: Reconfigurable Computing: Quo Vadis? 86 - Masato Motomura:
RAW 2024 Invited Talk-5. 87 - Kentaro Sano:
RAW 2024 Invited Talk-6: Reconfigurable Architectures for High-Performance Computing. 88 - Wei Zhang:
RAW 2024 Invited Talk-7. 89 - Hayden Kwok-Hay So:
RAW 2024 Invited Talk-8: Practical Reconfigurable Computing for Next-Generation Edge Applications. 90 - Andrew Schmidt:
RAW 2024 Invited Talk-9: Riallto: An Open-Source Exploratory Framework for Ryzen AI™. 91 - Yufei Mao, Roland Weiss, Yi Zhang, Yu Li, Marc Rothmann, Mario Porrmann:
FPGA Acceleration of DL-Based Real-Time DC Series Arc Fault Detection. 92-98 - Federico Valentino, Beatrice Branchini, Davide Conficconi, Donatella Sciuto, Marco D. Santambrogio:
An Accurate Union Find Decoder for Quantum Error Correction on the Toric Code. 99-105 - Marco Venere, Valentino Guerrini, Beatrice Branchini, Davide Conficconi, Donatella Sciuto, Marco D. Santambrogio:
Towards the Acceleration of the Sparse Blossom Algorithm for Quantum Error Correction. 106-110 - Erik H. D'Hollander, Ewout Danneels, Karel-Brecht Decorte, Senne Loobuyck, Arne Vanheule, Ian Van Kets, Dirk Stroobandt:
Exploring Large Language Models for Verilog Hardware Design Generation. 111-115 - Jessica Vandebon, José Gabriel F. Coutinho, Wayne Luk:
Auto-Generating Diverse Heterogeneous Designs. 116-123 - Diana Göhringer, Ariel Podlubne, Fabian Vargas, Milos Krstic:
Self-Aware Reliable and Reconfigurable Computing Systems - An Overview. 124-129 - Stefania Perri, Cristian Zambelli, Daniele Ielmini, Cristina Silvano:
Digital In-Memory Computing to Accelerate Deep Learning Inference on the Edge. 130-133 - Samuel Collinson, Allan Bai, Oliver Sinnen:
A Fast Scalable Hardware Priority Queue and Optimizations for Multi-Pushes. 134-140 - Claudio Rubattu, Antonio Ledda, Francesco Ratto, Chaitanya Jugade, Dip Goswami, Francesca Palumbo:
FPGA-based Implementation for Industrial Motion Control System. 141-147 - Kazuki Sunaga, Keisuke Sugiura, Hiroki Matsutani:
An FPGA-Based Accelerator for Graph Embedding using Sequential Training Algorithm. 148-154 - Carsten Heinz, Torben Kalkhof, Yannick Lavan, Andreas Koch:
TaPaS Co-AIE: An Open-Source Framework for Streaming-Based Heterogeneous Acceleration Using AMD AI Engines. 155-161 - Anna Drewes, Vitalii Burtsev, Bala Gurumurthy, Martin Wilhelm, David Broneske, Gunter Saake, Thilo Pionteck:
An Architectural Template for FPGA Overlays Targeting Data Flow Applications. 162-168 - Sahan Bandara, Ahmed Sanaullah, Zaid Tahir, Ulrich Drepper, Martin C. Herbordt:
Performance Evaluation of VirtIO Device Drivers for Host-FPGA PCIe Communication. 169-176 - Giorgos Armeniakos, Georgios Mentzos, Dimitrios Soudris:
Accelerating TinyML Inference on Microcontrollers Through Approximate Kernels. 177 - Jiajun Wu, Mo Song, Jingmin Zhao, Hayden Kwok-Hay So:
A Case for Low Bitwidth Floating Point Arithmetic on FPGA for Transformer Based DNN Inference. 178-185 - Raveena Raikar, Dirk Stroobandt:
Balancing Intra-Die and Inter-Die Placement Optimization in 2.5D FPGA Architectures. 187 - Shubhayu Das, Nanditha P. Rao, Sharad Sinha:
ConvMap: Boosting Convolution Throughput on FPGAs with Efficient Resource Mapping. 189 - Vasilis Kypriotis, Georgios Smaragdos, Pieter Kruizinga, Dimitrios Soudris, Christos Strydis:
A Reconfigurable Architecture of a Scalable, Ultrafast, Ultrasound, Delay-and-Sum Beamformer. 190 - Rui Shi, Seda Ogrenci, J. M. Arnold, J. R. Berlioz, Pierrick Hanlet, Kyle J. Hazelwood, M. A. Ibrahim, Han Liu, V. P. Nagaslaev, Aakaash Narayanan, D. J. Nicklaus, Jovan Mitrevski, Gauri Pradhan, A. L. Saewert, B. A. Schupbach, Kiyomi Seiya, Mattson Thieme, R. M. Thurman-Keup, N. V. Tran:
ML-Based Real-Time Control at the Edge: An Approach Using hls4ml. 191 - Julian Haase, Nico Volkens, Diana Göhringer:
Network Adapter for Secure Networks-on-Chip. 192 - Zaid Tahir, Sahan Bandara, Martin C. Herbordt:
Multi-Core Multi-Rule VeBPF Firewall for Secure FPGA IoT Device Deployments. 193 - Roberto A. Bertolini, Filippo Carloni, Davide Conficconi, Marco Domenico Santambrogio:
POCA: A PYNQ Offloaded Cryptographic Accelerator on Embedded FPGA-Based Systems. 194 - Jacir Luiz Bordim, Koji Nakano:
APDCM 2024 Preface and Committee List. 195-196 - Hiroki Ohtsuji:
APDCM 2024 Keynote Talk. 197 - Clayton J. Faber, Roger D. Chamberlain:
Application of Network Calculus Models to Heterogeneous Streaming Applications. 198-201 - Maxime Gonthier, Elisabeth Larsson, Loris Marchal, Carl Nettelblad, Samuel Thibault:
Data-Driven Locality-Aware Batch Scheduling. 202-211 - Rei Aoyagi, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa:
Combining Lossy Compression with Multi-Level Caching for Data Staging over Network. 212-221 - Yujiro Yahata, Keisuke Sugiura, Hiroki Matsutani:
A Scalable Secure Fault Tolerant Aggregation for P2P Federated Learning. 222-231 - Aoi Kida, Hideyuki Kawashima:
Accelerating BFT Database with Transaction Reconstruction. 232-241 - Yusuke Miyazaki, Takashi Hoshino, Hideyuki Kawashima:
Optimizing Aria Concurrency Control Protocol with Early Dependency Resolution. 242-249 - Subhajit Sahu, Kishore Kothapalli, Dip Sankar Banerjee:
Shared-Memory Parallel Algorithms for Community Detection in Dynamic Graphs. 250-259 - Pratik Nayak, Hartwig Anzt:
A Probabilistic Model for Asynchronous Iterative Methods. 260-269 - Koji Nakano:
The Logarithmic Random Bidding for the Parallel Roulette Wheel Selection with Precise Probabilities. 270-272 - Yasuaki Ito, Satoki Tsuji, Haruto Fujii, Kanta Suzuki, Nobuya Yokogawa, Koji Nakano, Akihiko Kasagi:
Introduction to Computational Quantum Chemistry for Computer Scientists. 273-282 - Shintaro Iwasaki:
AsHES 2024 Preface and Committee List. 283-284 - Philippe Tillet:
Block-based GPU Programming with Triton. 285 - James B. White:
Performance Versus Maintainability: A Case Study of Scream on Frontier. 286-292 - Ali TehraniJamsaz, Alok Mishra, Akash Dutta, Abid M. Malik, Barbara M. Chapman, Ali Jannesari:
ParaGraph: Weighted Graph Representation for Performance Optimization of HPC Kernels. 293-300 - Mikhail Kirilin, Carsten Burstedde:
Alternative Quadrant Representations with Morton Index and AVX2 Vectorization for AMR Algorithms within the p4est Software Library. 301-310 - Raúl Marichal, Ernesto Dufrechou, Pablo Ezzatti:
Avoiding Training in the Platform-Aware Optimization Process for Faster DNN Latency Reduction. 311-320 - Christoffer Åleskog, Håkan Grahn, Anton Borg:
A Comparative Study on Simulation Frameworks for AI Accelerator Evaluation. 321-328 - Zheming Jin:
Extending the SYCL Joint Matrix for Binarized Neural Networks. 329-333 - Sushil K. Prasad:
Message from the EduPar-24 Workshop Chairs. 334 - Sushil K. Prasad:
EduPar-24 Workshop Organization. 335-336 - Charles E. Leiserson:
EduPar 2024 Keynote Speaker. 337 - John D. Owens, Bruce Hoppe:
Helping Faculty Teach Software Performance Engineering. 338-341 - Brian Plancher:
Parallel Optimization for Robotics: An Undergraduate Introduction to GPU Parallel Programming and Numerical Optimization Research. 342-345 - Guy E. Blelloch, Yan Gu, Yihan Sun:
Teaching Parallel Algorithms Using the Binary-Forking Model. 346-351 - Alina Lazar, Ethan Scheelk, Elizabeth Shoop, David P. Bunde:
Peachy Parallel Assignments (EduPar 2024). 352-356 - Chris Bourke, Justin W. Firestone:
Codeless PDC Modules for Early Computing Curriculum. 357-364 - Cade Wiley, Grey Ballard:
Visualizing PRAM Algorithm for Mergesort. 365-368 - Lena Oden, Klaus Nölp, Philipp Brauner:
Integrating Interactive Performance Analysis in Jupyter Notebooks for Parallel Programming Education. 369-376 - Elizabeth Shoop, Richard A. Brown, Suzanne J. Matthews, Joel C. Adams:
Interactive Textbooks for Parallel and Distributed Computing Across the Undergraduate CS Curriculum. 377-384 - Sandino Vargas Perez:
Teaching Performance Metrics in Parallel Computing Courses. 385-390 - Tim Kaler, Xuhao Chen, Brian Wheatman, Dorothy Curtis, Bruce Hoppe, Tao B. Schardl, Charles E. Leiserson:
Speedcode: Software Performance Engineering Education via the Coding of Didactic Exercises. 391-394 - François Tessier, Weikuan Yu:
ESSA 2024 Message and Committees. 395-396 - Hariharan Devarajan, Adam Moody, Donglai Dai, Cameron Stanavige, Elsa Gonsiorowski, Marty McFadden, Olaf Faaland, Gregory Kosinovsky, Kathryn M. Mohror:
The Impact of Asynchronous I/O in Checkpoint-Restart Workloads. 397-405 - Xiang Fu, Xin Huang, Wubiao Xu, Weiping Zhang, Shiman Meng, Luanzheng Guo, Kento Sato:
Benchmarking Variables for Checkpointing in HPC Applications. 406-413 - Matthieu Dorier, Philip H. Carns, Robert B. Ross, Shane Snyder, Robert Latham, Amal Gueroudji, George Amvrosiadis, Chuck Cranor, Jérome Soumagne:
Extending the Mochi Methodology to Enable Dynamic HPC Data Services. 414-422 - Andrew Rodriguez, Noushin Azami, Martin Burtscher:
Adaptive Per-File Lossless Compression of Floating-Point Data. 423-430 - João Speglich, Navjot Kukreja, George Bisbas, Átila Saraiva, Jan Hückelheim, Fabio Luporini, John Washbourne:
Optimizing Forward Wavefield Storage Leveraging High-Speed Storage Media. 431-438 - Bin Dong, Kesheng Wu, Suren Byna:
The Art of Sparsity: Mastering High-Dimensional Tensor Storage. 439-446 - Nesreen K. Ahmed, Manoj Kumar:
GrAPL 2024 Preface and Committees. 447-448 - Altan Haan, Doru-Thom Popovici, Koushik Sen, Costin Iancu, Alvin Cheung:
To Tile or not to Tile, That is the Question. 449-458 - Chasen Milner, Hayden Jananthan, Jeremy Kepner, Vijay Gadepally, Michael Jones, Peter Michaleas, Ritesh Patel, Sandeep Pisharody, Gabriel Wachman, Alex Pentland:
Teaching Network Traffic Matrices in an Interactive Game Environment. 459-467 - Hao Xu, Shuang Song, Ze Mao:
Characterizing the Performance of Emerging Deep Learning, Graph, and High Performance Computing Workloads Under Interference. 468-477 - Raye Kimmerer, Timothy G. Mattson, Scott McMillan, Benjamin Brock, Erik Welch, Michel Pelletier, José E. Moreira:
The GraphBLAS 3.0 Project. 478-481 - Ariel Lubonja, Cencheng Shen, Carey E. Priebe, Randal C. Burns:
Edge-Parallel Graph Encoder Embedding. 482-485 - Matthieu Nastorg, Jean-Marc Gratien, Thibault Faney, Michele Alessandro Bucci, Guillaume Charpiat, Marc Schoenauer:
Multi-Level GNN Preconditioner for Solving Large Scale Problems. 486-495 - Joel Mathew Cherian, Nithin Puthalath Manoj, Kevin Jude Concessao, Unnikrishnan Cheramangalath:
STGraph: A Framework for Temporal Graph Neural Networks. 496-505 - Ali TehraniJamsaz, Hanze Chen, Ali Jannesari:
GraphBinMatch: Graph-Based Similarity Learning for Cross-Language Binary and Source Code Matching. 506-515 - Raye Kimmerer:
GraphBLAS.jl v0.1: An Update on GraphBLAS in Julia. 516-519 - Abdullah T. Mughrabi, Morteza Baradaran, Ahmed Samara, Kevin Skadron:
ECG: Expressing Locality and Prefetching for Optimal Caching in Graph Structures. 520-525 - Shaikh Arifuzzaman, Hasan S. Arikan, Md Abdul Motaleb Faysal, Maximilian H. Bremer, John Shalf, Doru Popovici:
Unlocking the Potential: Performance Portability of Graph Algorithms on Kokkos Framework. 526-529 - Gregory Schwing, Daniel Grosu, Loren Schwiebert:
Shared-Memory Parallel Edmonds Blossom Algorithm for Maximum Cardinality Matching in General Graphs. 530-539 - Alba Cristina Magalhaes Alves de Melo, Ananth Kalyanaraman:
HiCOMB 2024 Preface and Committees. 540 - Wu Feng:
Re-visiting the Third Pillar of Science for Synergistic (Bio)Computing. 541 - Giulia Guidi:
Lessons Learned Designing Irregular Genomic Algorithms on Parallel Systems and Architectures. 542 - Ian Lumsden, Hariharan Devarajan, Jack Marquez, Stephanie Brink, David Böhme, Olga Pearce, Jae-Seung Yeom, Michela Taufer:
Empirical Study of Molecular Dynamics Workflow Data Movement: DYAD vs. Traditional I/O Systems. 543-553 - Gianmarco Accordi, Davide Gadioli, Giorgio Seguini, Andrea Rosario Beccari, Gianluca Palermo:
ZSMILES: An Approach for Efficient SMILES Storage for Random Access in Virtual Screening. 554-560 - Reza Sajjadinasab, Hamed Rastaghi, Hafsah Shahzad, Sanjay Arora, Ulrich Drepper, Martin C. Herbordt:
Further Optimizations and Analysis of Smith-Waterman with Vector Extensions. 561-570 - Archit Vasan, Ozan Gökdemir, Alexander Brace, Arvind Ramanathan, Thomas S. Brettin, Rick Stevens, Venkatram Vishwanath:
High Performance Binding Affinity Prediction with a Transformer-Based Surrogate Model. 571-580 - Pete Beckman:
PAISE 2024 Preface and Committees. 581-583 - Matthew Jackson, Bo Ji, Dimitrios S. Nikolopoulos:
FrameFeedback: A Closed-Loop Control System for Dynamic Offloading Real-Time Edge Inference. 584-591 - Hasanul Mahmud, Peng Kang, Kevin Desai, Palden Lama, Sushil K. Prasad:
A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge. 592-599 - Juliana Curry, Ahmed Louri, Avinash Karanth, Razvan C. Bunescu:
PCM Enabled Low-Power Photonic Accelerator for Inference and Training on Edge Devices. 600-607 - HooYoung Ahn, SeonYoung Kim, Yoo-Mi Park, Woojong Han, Nick Contini, Bharath Ramesh, Mustafa Abduljabbar, Dhabaleswar K. Panda:
Towards Accelerating k-NN with MPI and Near-Memory Processing. 608-615 - Artur Podobas:
CGRA4HPC 2024 Welcome Message and Committee List. 616-617 - Jiangnan Li, Yazhou Yan, Jingyuan Li, Shaoyang Sun, Boyin Jin, Wenbo Yin, Lingli Wang:
An Architecture-Agnostic Dataflow Mapping Framework on CGRA. 618-625 - Jingyuan Li, Yuan Dai, Yihan Hu, Jiangnan Li, Wenbo Yin, Jun Tao, Lingli Wang:
TransMap: An Efficient CGRA Mapping Framework via Transformer and Deep Reinforcement Learning. 626-633 - Chihyo Ahn, Shinnung Jeong, Liam Paul Cooper, Nicholas Parnenzini, Hyesoon Kim:
Comparative Analysis of Executing GPU Applications on FPGA: HLS vs. Soft GPU Approaches. 634-641 - Omar Ragheb, Stephen Wicklund, Matthew Walker, Rami Beidas, Adham Ragab, Tianyi Yu, Jason Helge Anderson:
CGRA-ME 2.0: A Research Framework for Next-Generation CGRA Architectures and CAD. 642-649 - Makoto Saito, Takuya Kojima, Hideki Takase, Hiroshi Nakamura:
A Scalable Mapping Method for Elastic CGRAs. 650-657 - Maya Borowicz, James Ding, Winnie Fan, Zhongqi Gao, Davis Jackson, Ares Lu, Sophia Rohlfsen, Ray Simar:
GIM (Ghost In the Machine): A Coarse-Grained Reconfigurable Compute-In-Memory Platform for Exploring Machine-Learning Architectures. 658-663 - Seyong Lee, Lena Oden:
HIPS 2024 Preface and Committees. 664-665 - HsinYu Sidney Tsai:
Architecture and Programming of Analog In-Memory-Computing Accelerators for Deep Neural Networks. 666 - Marc González Tallada, Joel E. Denny, Pedro Valero-Lara, Seyong Lee, Keita Teranishi, Jeffrey S. Vetter:
eCC++ : A Compiler Construction Framework for Embedded Domain-Specific Languages. 667-677 - Yicheng Li, Joseph Schuchart, George Bosilca:
Comprehensive Study for Just-In-Time Pack Functions in Open MPI. 678-685 - Rajat Bhattarai, Howard Pritchard, Sheikh Ghafoor:
Dynamic Resource Management for Elastic Scientific Workflows using PMIx. 686-695 - Ian Di Dio Lavore, Davide Maffi, Marco Arnaboldi, Arnaud Delamare, Daniele Bonetta, Marco D. Santambrogio:
GrOUT: Transparent Scale-Out to Overcome UVM's Oversubscription Slowdowns. 696-705 - Jamison Kerney, Ioan Raicu, John Raicu, Kyle Chard:
Towards Fine-Grained Parallelism in Parallel and Distributed Python Libraries. 706-715 - Daniel Barry, Anthony Danalis, Jack J. Dongarra:
Automated Data Analysis for Defining Performance Metrics from Raw Hardware Events. 716-725 - Yectli A. Huerta:
Performance Analysis of the NVIDIA HPC SDK and AMD AOCC Compilers in an HPC Cluster Using Pooled, Robust and Relative Metrics. 726-737 - Pedro Valero-Lara:
9th IEEE International Workshop on Automatic Performance Tuning (iWAPT 2024). 738-739 - Damian W. I. Rouson:
iWAPT 2024 Keynote Talk: What Happens to a Dream Deferred? Chasing Automatic Offloading in Fortran 2023. 740 - Gregory Bolet, Giorgis Georgakoudis, Konstantinos Parasyris, Kirk W. Cameron, David Beckingsale, Todd Gamblin:
An Exploration of Global Optimization Strategies for Autotuning OpenMP-based Codes. 741-750 - Kengo Nakajima:
Communication-Computation Overlapping for Parallel Multigrid Methods. 751-760 - Mingzhe Han, Goutham Kalikrishna Reddy Kuncham, Benjamin Michalowicz, Rahul Vaidya, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda:
PML-MPI: A Pre-Trained ML Framework for Efficient Collective Algorithm Selection in MPI. 761-770 - Emmanuel Jeannot, Pierre Lemarinier, Guillaume Mercier, Sophie Robert-Hayek, Richard Sartori:
Application-Agnostic Auto-Tuning of Open MPI Collectives Using Bayesian Optimization. 771-781 - Toshinobu Katayama, Keichi Takahashi, Yoichi Shimomura, Hiroyuki Takizawa:
XAI-Based Feature Importance Analysis on Loop Optimization. 782-791 - Adrián Pérez Diéguez, Min Choi, Mahmut Okyay, Mauro Del Ben, Bryan M. Wong, Khaled Z. Ibrahim:
Cost-Effective Methodology for Complex Tuning Searches in HPC: Navigating Interdependencies and Dimensionality. 792-801 - Dalibor Klusácek, Julita Corbalán, Gonzalo P. Rodrigo:
27th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2024). 802 - Hien Nguyen, Jeremy E. Thompson:
ParSocial 2024 Welcome and Committee List. 803-804 - Jay Vap, Peter M. Kogge:
RIMR: Reverse Influence Maximization Rank. 805-814 - Naw Safrin Sattar, Hao Lu, Feiyi Wang, Mahantesh Halappanavar:
Distributed Multi-GPU Community Detection on Exascale Computing Platforms. 815-824 - Subhajit Sahu, Kishore Kothapalli, Hemalatha Eedi, Sathya Peri:
Lock-free Computation of PageRank in Dynamic Graphs. 825-834 - Elizabeth R. Koning, William Gropp:
Proposal for a Flexible Benchmark for Agent Based Models. 835-838 - Suresh Subramanian, Vairavan Murugappan, Eunice E. Santos:
Socio-Behavioral Influences in Epidemic Modeling: Towards a Unified Framework. 839-842 - Arindam Fadikar, Abby Stevens, Nicholson T. Collier, Kok Ben Toh, Olga Morozova, Anna Hotton, Jared Clark, David Higdon, Jonathan Ozik:
Towards Improved Uncertainty Quantification of Stochastic Epidemic Models Using Sequential Monte Carlo. 843-852 - Youcef Djenouri, Fabio Augusto de Alcantara Andrade, Gautam Srivastava, Ahmed Nabil Belbachir:
Revolutionizing Personal Recommendations via Federated Contrastive Transformer Learning. 853-856 - Mert Can Cakmak, Nitin Agarwal:
High-Speed Transcript Collection on Multimedia Platforms: Advancing Social Media Research through Parallel Processing. 857-860 - Ronaldo Canizales, Luis Mixco, Jedidiah McClurg:
Parallelizing Accelerographic Records Processing. 861-869 - Guillaume Helbecque, Ezhilmathi Krishnasamy, Nouredine Melab, Pascal Bouvry:
GPU-Accelerated Tree-Search in Chapel Versus CUDA and HIP. 872-879 - Gregory Schwing, Daniel Grosu, Loren Schwiebert:
Parallel Maximum Cardinality Matching for General Graphs on GPUs. 880-889 - Sumiaya Dabeer, Amitabha Bagchi, Rahul Narain:
GPU-LSolve: An Efficient GPU-Based Laplacian Solver for Million-Scale Graphs. 890-899 - Tarek Menouer, Christophe Cérin, Patrice Darmon:
KOptim: Kubernetes Optimization Framework. 900-908 - José Miguel Aragón-Jurado, Marina Díaz-Jiménez, Bernabé Dorronsoro, Pablo Pavón-Domínguez, Marcin Seredynski, Patricia Ruiz:
Electric Drive Assignment Strategies Optimization for Plugin Hybrid Urban Buses on Tailored Emissions Mapping. 909-918 - Mehrnaz Sharifian, Diman Zad Tootaghaj, Chen-Nee Chuah, Puneet Sharma:
DUST: Resource-Aware Telemetry Offloading with A Distributed Hardware-Agnostic Approach. 919-928 - Dayuan Chen, Noe Soto, Jonas F. Tuttle, Ziliang Zong:
Understanding Multi-Dimensional Efficiency of Fine-Tuning Large Language Models Using SpeedUp, MemoryUp, and EnergyUp. 929-937 - Florian Fey, Sergei Gorlatch:
Compiler-Driven SWAR Parallelism for High-Performance Bitboard Algorithms. 938-946 - Abass Sana, Kaoutar Senhaji, Amir Nakib:
Multiobjective Based Strategy for Neural Architecture Search for Segmentation Task. 947-955 - Didier El Baz, Jia Luo, Hao Mo, Lei Shi:
A Mathematical Model and a Convergence Result for Totally Asynchronous Federated Learning. 956-963 - Yasith Udagedara, Andrea Raith, Oliver Sinnen:
State-Space Search to Find Energy-Aware Pareto-Efficient Optimal Task Schedules. 964-973 - Sabine Roller, George Bosilca, Raphaël Couturier, Neda Ebrahimi Pour, Jean-Claude Charr, Thomas Rauber, Gudula Rünger, Laurence T. Yang:
Message from the PDSEC-24 Workshop Chairs. 974-975 - Alice Lasserre, Jean Marie Couteyen-Carpaye, Abdou Guermouche, Raymond Namyst:
Multi-Criteria Mesh Partitioning for an Explicit Temporal Adaptive Task-Distributed Finite-Volume Solver. 976-985 - Pablo Vizcaino, Jesús Labarta, Filippo Mantovani:
Graph Computing on Long Vector Architectures (Yes, It Works!). 986-995 - Camille Coti, Yann Pfau-Kempf, Markus Battarbee, Urs Ganse, Sameer Shende, Kevin A. Huck, Jordi Rodriquez, Leo Kotipalo, Jennifer Faj, Jeremy J. Williams, Ivy Peng, Allen D. Malony, Stefano Markidis, Minna Palmroth:
Integration of Modern HPC Performance Tools in Vlasiator for Exascale Analysis and Optimization. 996-1005 - Vincent Alba, Olivier Aumage, Denis Barthou, Raphaël Colin, Marie Christine Counilh, Stéphane Genaud, Amina Guermouche, Vincent Loechner, Arun Thangamani:
Performance Portability of Generated Cardiac Simulation Kernels Through Automatic Dimensioning and Load Balancing on Heterogeneous Nodes. 1006-1015 - Jurdana Masuma Iqrah, Wei Wang, Hongjie Xie, Sushil K. Prasad:
A Parallel Workflow for Polar Sea-Ice Classification Using Auto-Labeling of Sentinel-2 Imagery. 1016-1025 - Jesse McDonald, Maximilian Horzela, Frédéric Suter, Henri Casanova:
Automated Calibration of Parallel and Distributed Computing Simulators: A Case Study. 1026-1035 - Aristeidis Tsaris, Philipe Ambrozio Dias, Abhishek Potnis, Junqi Yin, Feiyi Wang, Dalton D. Lunga:
Pretraining Billion-Scale Geospatial Foundational Models on Frontier. 1036-1046 - Kshitij Mehta, Massimiliano Lupo Pasini, Stephan Irle, Pilsun Yoo, Frédéric Suter, Dmitry Ganyushin, Scott Klasky:
Scaling Ensembles of Data-Intensive Quantum Chemical Calculations for Millions of Molecules. 1047-1056 - Taufeq Mohammed Razakh, Thomas Linker, Ye Luo, Rajiv K. Kalia, Ken-ichi Nomura, Priya Vashishta, Aiichiro Nakano:
Accelerating Quantum Light-Matter Dynamics on Graphics Processing Units. 1057-1066 - Ashfaq Khokhar, Mary Eshaghian-Wilner, Robert Basili:
Q-CASA 2024 Preface and Committee List. 1067 - Shiplu Sarker, Wenyang Qian, Soham Pal, Robert Basili, Mary Eshaghian-Wilner, Ashfaq Khokhar, Glenn R. Luecke, James P. Vary:
Quantifying Performance of Wire-Based Quantum Circuit Cutting with Entanglements. 1068-1077 - Marcel Quanz, Korbinian Staudacher, Karl Fürlinger:
Parallel Quantum Circuit Extraction from MBQC-Patterns. 1078-1087 - Aniello Esposito, Tamuz Danzig:
Hybrid Classical-Quantum Simulation of MaxCut using QAOA-in-QAOA. 1088-1094 - Annalisa Massini, Federico Mingardi:
A Delay-Efficient Implementation of Quantum Carry Select Adders. 1095-1104 - Aaron Orenstein, Vipin Chaudhary:
Quantum Circuit Mapping Using Binary Integer Nonlinear Programming. 1105-1114 - Tobias Stollenwerk, Stuart Hadfield:
Measurement-Based Quantum Approximate Optimization. 1115-1127 - J. Xun, Qin Liu, Shan Huang, Andi Chen, W. Shengjun:
Image Compression and Reconstruction Based on Quantum Network. 1128-1135 - Marvin Bechtold, Johanna Barzen, Frank Leymann, Alexander Mandl:
Cutting a Wire with Non-Maximally Entangled States. 1136-1145 - William Ruys, Hochan Lee, Bozhi You, Shreya Talati, Jaeyoung Park, James Almgren-Bell, Yineng Yan, Milinda Fernando, George Biros, Mattan Erez, Martin Burtscher, Christopher J. Rossbach, Keshav Pingali, Milos Gligoric:
A Deep Dive into Task-Based Parallelism in Python. 1147-1149 - Viktoria Mayer, Wilfried N. Gansterer:
A New Exact State Reconstruction Strategy for Conjugate Gradient Methods with Arbitrary Preconditioners. 1150-1152 - Muna Tageldin, Majeed M. Hayat, Jered Dominguez-Trujillo, Patrick G. Bridges:
A Stochastic Composite Model to Understand the Impact of Rare, Colossal Interference in HPC Systems. 1153-1155 - Jin Xue, Zili Shao:
Accelerating Native Transaction Processing in LSM-Based Persistent Key-Value Stores. 1156-1158 - Xu Zhang, Guangda Zhang, Lu Wang, Xia Zhao:
AdCoalescer: An Adaptive Coalescer to Reduce the Inter-Module Traffic in MCM-GPUs. 1159-1160 - Xiang Chen, Ru Ying, Haocong Ma, Yao Wang, Xianjun Meng, Guangjun Xie, Yonghui Zhan, Fenyong Yuan, Ying Yang, Tao Lu, Jinqiang Wang, You Zhou, Fei Wu:
An SR-IOV SSD Optimized for QoS-Sensitive IaaS Cloud Storage. 1161-1163 - Matthew Whitlock, Hemanth Kolla, Aurelien Bouteiller, Jackson R. Mayo, Nicolas M. Morales, Keita Teranishi, George Bosilca:
Asynchrony and Failure Masking via Pseudo-Local Process Recovery in MPI Applications. 1164-1166 - Shinyoung Ahn, Hooyoung Ahn, Hyeonseong Choi, Jaehyun Lee:
EDDIS: Accelerating Distributed Data-Parallel DNN Training for Heterogeneous GPU Cluster. 1167-1168 - Pál András Papp, Georg Anegg, Aikaterini Karanasiou, Albert-Jan Nicholas Yzelman:
Efficient Multi-Processor Scheduling in Increasingly Realistic Models (Brief Summary). 1169-1171 - Martijn de Vos, Akash Dhasade, Paolo Dini, Elia Guerra, Anne-Marie Kermarrec, Marco Miozzo, Rafael Pires, Rishi Sharma:
Energy-Aware Decentralized Learning with Intermittent Model Training. 1172-1174 - Alok Kamatar, Valérie Hayot-Sasson, Yadu N. Babuji, André Bauer, Gourav Rattihalli, Ninad Hogade, Dejan S. Milojicic, Kyle Chard, Ian T. Foster:
Enhancing Energy Efficiency with Multi-Site Scheduling Strategies. 1175-1177 - Baodi Shan, Mauricio Araya-Polo:
Evaluation of Programming Models and Performance for Stencil Computation on GPGPUs. 1178-1180 - Eunji Lee, Yoonsang Han, Gordon Euhyun Moon:
Exploiting Tensor Cores in Sparse Matrix-Multivector Multiplication via Block-Sparsity-Aware Clustering. 1181-1183 - Md Sirajul Islam, Simin Javaherian, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng:
FedClust: Optimizing Federated Learning on Non-IID Data Through Weight-Driven Client Clustering. 1184-1186 - Grant Wilkins, Sheng Di, Jon C. Calhoun, Zilinghan Li, Kibaek Kim, Robert Underwood, Richard Mortier, Franck Cappello:
FedSZ: Leveraging Error-Bounded Lossy Compression for Federated Learning Communications. 1187-1188 - Janaina Schwarzrock, Arthur Francisco Lorenzon, Samuel Xavier de Souza, Antonio Carlos Schneider Beck:
Integration Framework for Online Thread Throttling with Thread and Page Mapping on NUMA Systems. 1189-1192 - Jonghyun Bae, Jong Youl Choi, Massimiliano Lupo Pasini, Kshitij Mehta, Khaled Z. Ibrahim:
MDLoader: A Hybrid Model-driven Data Loader for Distributed Deep Neural Networks Training. 1193-1195 - Suraiya Tairin, Haiying Shen, Anand Iyer:
Proactive, Accuracy-aware Straggler Mitigation in Machine Learning Clusters. 1196-1198 - Isuru Ranawaka, Ariful Azad:
Scalable Node Embedding Algorithms Using Distributed Sparse Matrix Operations. 1199-1201 - Jie Li, George Michelogiannakis, Brandon Cook, John Shalf, Yong Chen:
Scheduling and Allocation of Disaggregated Memory Resources in HPC Systems. 1202-1203 - Subhajit Sahu, Kishore Kothapalli, Dip Sankar Banerjee:
Shared-Memory Parallel Dynamic Louvain Algorithm for Community Detection. 1204-1205 - Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Reza Yazdani Aminabadi, Shuaiwen Leon Song, Samyam Rajbhandari, Yuxiong He:
System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models. 1206-1208 - Evgeniy Feder, Anton Paramonov, Pavel Mavrin, Iosif Salem, Stefan Schmid, Vitaly Aksenov:
Toward Self-Adjusting $k$-Ary Search Tree Networks. 1209-1211 - Dinghuang Hu, Dezun Dong:
Understanding Different Transport Coexistence in Datacenter Networks. 1212-1213 - Sanmukh Kuppannagari, Tanwi Mallick:
IPDPS 2024 PhD Forum. 1214-1233