PACT 2015: San Francisco, CA, USA
- 2015 International Conference on Parallel Architectures and Compilation, PACT 2015, San Francisco, CA, USA, October 18-21, 2015. IEEE Computer Society 2015, ISBN 978-1-4673-9524-3
Session 1A: GPUs
- Mihir Awatramani, Xian Zhu, Joseph Zambreno, Diane T. Rover:
Phase Aware Warp Scheduling: Mitigating Effects of Phase Behavior in GPGPU Applications. 1-12
- Jie Zhang, David Donofrio, John Shalf, Mahmut T. Kandemir, Myoungsoo Jung:
NVMMU: A Non-volatile Memory Management Unit for Heterogeneous GPU-SSD Architectures. 13-24
- Rachata Ausavarungnirun, Saugata Ghose, Onur Kayiran, Gabriel H. Loh, Chita R. Das, Mahmut T. Kandemir, Onur Mutlu:
Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance. 25-38
Session 1B: Algorithms
- Farzad Khorasani, Rajiv Gupta, Laxmi N. Bhuyan:
Scalable SIMD-Efficient Graph Processing on GPUs. 39-50
- Adam McLaughlin, Duane Merrill, Michael Garland, David A. Bader:
Parallel Methods for Verifying the Consistency of Weakly-Ordered Architectures. 51-62
- Farzad Khorasani, Mehmet E. Belviranli, Rajiv Gupta, Laxmi N. Bhuyan:
Stadium Hashing: Scalable and Flexible Hashing on GPUs. 63-74
Session 2A: Profiling
- Yujie Liu, Justin Gottschlich, Gilles Pokam, Michael F. Spear:
TSXProf: Profiling Hardware Transactions. 75-86
- Lev Mukhanov, Dimitrios S. Nikolopoulos, Bronis R. de Supinski:
ALEA: Fine-Grain Energy Profiling with Basic Block Sampling. 87-98
Session 2B: Architecture
- Schuyler Eldridge, Amos Waterland, Margo I. Seltzer, Jonathan Appavoo, Ajay Joshi:
Towards General-Purpose Neural Network Computing. 99-112
- Mingyu Gao, Grant Ayers, Christos Kozyrakis:
Practical Near-Data Processing for In-Memory Analytics Frameworks. 113-124
Session 3A: Language and Compilation
- Stephen T. Heumann, Alexandros Tzannes, Vikram S. Adve:
Scalable Task Scheduling and Synchronization Using Hierarchical Effects. 125-137
- Riyadh Baghdadi, Ulysse Beaugnon, Albert Cohen, Tobias Grosser, Michael Kruse, Chandan Reddy, Sven Verdoolaege, Adam Betts, Alastair F. Donaldson, Jeroen Ketema, Javed Absar, Sven van Haastregt, Alexey Kravets, Anton Lokhmotov, Robert David, Elnar Hajiyev:
PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming. 138-149
- Karthik Murthy, John M. Mellor-Crummey:
Communication Avoiding Algorithms: Analysis and Code Generation for Parallel Systems. 150-162
Session 3B: Memory
- Wei Wei, Dejun Jiang, Sally A. McKee, Jin Xiong, Mingyu Chen:
Exploiting Program Semantics to Place Data in Hybrid Memory. 163-173
- Donghyuk Lee, Lavanya Subramanian, Rachata Ausavarungnirun, Jongmoo Choi, Onur Mutlu:
Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM. 174-187
- Mark Oskin, Gabriel H. Loh:
A Software-Managed Approach to Die-Stacked DRAM. 188-200
Session 4: Best Papers
- Harshvardhan, Adam Fidel, Nancy M. Amato, Lawrence Rauchwerger:
An Algorithmic Approach to Communication Reduction in Parallel Graph Algorithms. 201-212
- Prasanth Chatarasi, Jun Shirako, Vivek Sarkar:
Polyhedral Optimizations of Explicitly Parallel Programs. 213-226
- Xiangyao Yu, Srinivas Devadas:
Tardis: Time Traveling Coherence Algorithm for Distributed Shared Memory. 227-240
- Joo Hwan Lee, Jaewoong Sim, Hyesoon Kim:
BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models. 241-252
Session 5: Keynote
- Dharmendra S. Modha:
Brain-Inspired Computing. 253
Session 6A: Compilers
- Shasha Wen, Xu Liu, Milind Chabbi:
Runtime Value Numbering: A Profiling Technique to Pinpoint Redundant Computations. 254-265
- Michelle L. Goodstein, Phillip B. Gibbons, Michael A. Kozuch, Todd C. Mowry:
Tracking and Reducing Uncertainty in Dataflow Analysis-Based Dynamic Parallel Monitoring. 266-279
- Vinit Deodhar, Hrushit Parikh, Ada Gavrilovska, Santosh Pande:
Compiler Assisted Load Balancing on Large Clusters. 280-291
Session 6B: Caches
- Marco Elver, Vijay Nagarajan:
RC3: Consistency Directed Cache Coherence for x86-64 with RC Extensions. 292-304
- Jason Jong Kyu Park, Yongjun Park, Scott A. Mahlke:
Fine Grain Cache Partitioning Using Per-Instruction Working Blocks. 305-316
- Mahdad Davari, Alberto Ros, Erik Hagersten, Stefanos Kaxiras:
An Efficient, Self-Contained, On-chip Directory: DIR1-SISD. 317-330
Session 7A: Resilience and Compilation
- Subrata Mitra, Greg Bronevetsky, Suhas Javagal, Saurabh Bagchi:
Dealing with the Unknown: Resilience to Prediction Errors. 331-342
- Prasanna Venkatesh Rengasamy, Anand Sivasubramaniam, Mahmut T. Kandemir, Chita R. Das:
Exploiting Staleness for Approximating Loads on CMPs. 343-354
- Janghaeng Lee, Mehrzad Samadi, Scott A. Mahlke:
Orchestrating Multiple Data-Parallel Kernels on Multiple Devices. 355-366
Session 7B: Caches
- Muneeb Khan, Michael A. Laurenzano, Jason Mars, Erik Hagersten, David Black-Schaffer:
AREP: Adaptive Resource Efficient Prefetching for Maximizing Multicore Performance. 367-378
- Lluc Alvarez, Miquel Moretó, Marc Casas, Emilio Castillo, Xavier Martorell, Jesús Labarta, Eduard Ayguadé, Mateo Valero:
Runtime-Guided Management of Scratchpad Memories in Multicore Architectures. 379-391
- George Kurian, Qingchuan Shi, Srinivas Devadas, Omer Khan:
OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access. 392-405
Session 8: Keynote
- Salman Habib:
Cosmology and Computers: HACCing the Universe. 406
Session 9A: Compilers
- Ivan Jibaja, Peter Jensen, Ningxin Hu, Mohammad R. Haghighat, John McCutchan, Dan Gohman, Stephen M. Blackburn, Kathryn S. McKinley:
Vector Parallelism in JavaScript: Language and Compiler Support for SIMD. 407-418
- Kazuaki Ishizaki, Akihiro Hayashi, Gita Koblents, Vivek Sarkar:
Compiling and Optimizing Java 8 Programs for GPU Execution. 419-431
- Vasileios Porpodas, Timothy M. Jones:
Throttling Automatic Vectorization: When Less is More. 432-444
Session 9B: Modeling
- Hermann Schweizer, Maciej Besta, Torsten Hoefler:
Evaluating the Cost of Atomic Operations on Modern Architectures. 445-456
- Yipeng Wang, Ganesh Balakrishnan, Yan Solihin:
MeToo: Stochastic Modeling of Memory Traffic Timing Behavior. 457-467
- Arnamoy Bhattacharyya, Grzegorz Kwasniewski, Torsten Hoefler:
Using Compiler Techniques to Improve Automatic Performance Modeling. 468-479
Poster Abstracts
- Tian Jin:
Using Hybrid Schedules to Safely Outperform Classical Polyhedral Schedules. 480-481
- Miguel Angel Aguilar, Rainer Leupers:
Unified Identification of Multiple Forms of Parallelism in Embedded Applications. 482-483
- Daichi Murakami, Kei Hiraki:
An Optimization of Resource Arrangement for Network-on-Chip using Genetic Algorithm. 484-485
- Raj Parihar, Michael C. Huang:
Load Balancing in Decoupled Look-ahead: A Do-It-Yourself (DIY) Approach. 486-487
- Shixiong Xu, David Gregg:
An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs. 488-489
- Prasanth Chatarasi, Vivek Sarkar:
Extending Polyhedral Model for Analysis and Transformation of OpenMP Programs. 490-491
- Ahmad Hassan, Hans Vandierendonck, Dimitrios S. Nikolopoulos:
Energy-Efficient Hybrid DRAM/NVM Main Memory. 492-493
- Patricia Arroba, José Manuel Moya, José L. Ayala, Rajkumar Buyya:
DVFS-Aware Consolidation for Energy-Efficient Clouds. 494-495
- Jie Zhang, David Donofrio, John Shalf, Myoungsoo Jung:
Integrating 3D Resistive Memory Cache into GPGPU for Energy-Efficient Data Processing. 496-497
- Narges Shahidi, Anand Sivasubramaniam, Mahmut T. Kandemir, Chita R. Das:
Storage Consolidation on SSDs: Not Always a Panacea, but Can We Ease the Pain? 498-499