19th ICS 2005: Cambridge, Massachusetts, USA
- Arvind, Larry Rudolph:
Proceedings of the 19th Annual International Conference on Supercomputing, ICS 2005, Cambridge, Massachusetts, USA, June 20-22, 2005. ACM 2005, ISBN 1-59593-167-8
Cache
- Aneesh Aggarwal:
Reducing latencies of pipelined cache accesses through set prediction. 2-11
- Eriko Nurvitadhi, Nirut Chalainanont, Shih-Lien Lu:
Characterization of L3 cache behavior of SPECjAppServer2002 and TPC-C. 12-20
- Jaydeep Marathe, Frank Mueller, Bronis R. de Supinski:
A hybrid hardware/software approach to efficiently determine cache coherence Bottlenecks. 21-30
- Jaehyuk Huh, Changkyu Kim, Hazim Shafi, Lixin Zhang, Doug Burger, Stephen W. Keckler:
A NUCA substrate for flexible CMP cache sharing. 31-40
Value
- Peng Zhou, Soner Önder, Steve Carr:
Fast branch misprediction recovery in out-of-order superscalar processors. 41-50
- Yongxiang Liu, Anahita Shayesteh, Gokhan Memik, Glenn Reinman:
Tornado warning: the perils of selective replay in multithreaded processors. 51-60
- Rubén González, Adrián Cristal, Miquel Pericàs, Mateo Valero, Alexander V. Veidenbaum:
An asymmetric clustered processor based on value content. 61-70
- Kaushik Rajan, Ramaswamy Govindarajan:
A heterogeneously segmented cache architecture for a packet forwarding engine. 71-80
Sampling
- Nathan Froyd, John M. Mellor-Crummey, Robert J. Fowler:
Low-overhead call path profiling of unmodified, optimized code. 81-90
- Huai Wang, Srinivasan Parthasarathy, Amol Ghoting, Shirish Tatikonda, Gregory Buehrer, Tahsin M. Kurç, Joel H. Saltz:
Design of a next generation sampling service for large scale data analysis applications. 91-100
- Reza Azimi, Michael Stumm, Robert W. Wisniewski:
Online performance analysis by statistical sampling of microprocessor performance counters. 101-110
- Robert H. Bell Jr., Lizy Kurian John:
Improved automatic testcase synthesis for performance model validation. 111-120
Compilers 1
- Alejandro Duran, Marc González, Julita Corbalán:
Automatic thread distribution for nested parallelism in OpenMP. 121-130
- Xipeng Shen, Yaoqing Gao, Chen Ding, Roch Archambault:
Lightweight reference affinity analysis. 131-140
- Kamen Yotov, Keshav Pingali, Paul Stodghill:
Think globally, search locally. 141-150
- Albert Cohen, Marc Sigler, Sylvain Girbal, Olivier Temam, David Parello, Nicolas Vasilache:
Facilitating the search for compositions of program transformations. 151-160
Compilers 2
- Masayo Haneda, Peter M. W. Knijnenburg, Harry A. G. Wijshoff:
Generating new general compiler optimization settings. 161-168
- Peng Wu, Alexandre E. Eichenberger, Amy Wang, Peng Zhao:
An integrated simdization framework using virtual vectors. 169-178
- Jose Renau, James Tuck, Wei Liu, Luis Ceze, Karin Strauss, Josep Torrellas:
Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation. 179-188
- Ayon Basumallik, Rudolf Eigenmann:
Towards automatic translation of OpenMP to MPI. 189-198
Threads
- Hassan Chafi, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, JaeWoong Chung, Lance Hammond, Christos Kozyrakis, Kunle Olukotun:
TAPE: a transactional application profiling environment. 199-208
- Madhavi Gopal Valluri, Lizy Kurian John, Kathryn S. McKinley:
Low-power, low-complexity instruction issue using compiler assistance. 209-218
- Jose Renau, Karin Strauss, Luis Ceze, Wei Liu, Smruti R. Sarangi, James Tuck, Josep Torrellas:
Thread-Level Speculation on a CMP can be energy efficient. 219-228
- Barry Lawson, Evgenia Smirni:
Power-aware resource allocation in high-end systems via online simulation. 229-238
Machines
- Gary Gostin, Jean-Francois Collard, Kirby Collins:
The architecture of the HP Superdome shared-memory multiprocessor. 239-245
- George Almási, Gyan Bhanot, Alan Gara, Manish Gupta, James C. Sexton, Robert Walkup, Vasily V. Bulatov, Andrew W. Cook, Bronis R. de Supinski, James N. Glosli, Jeffrey A. Greenough, François Gygi, Alison Kubota, Steve Louis, Thomas E. Spelce, Frederick H. Streitz, Peter L. Williams, Robert K. Yates, Charles Archer, José E. Moreira, Charles A. Rendleman:
Scaling physics and material science applications on a massively parallel Blue Gene/L system. 246-252
- George Almási, Philip Heidelberger, Charles Archer, Xavier Martorell, C. Christopher Erway, José E. Moreira, Burkhard D. Steinmacher-Burow, Yili Zheng:
Optimization of MPI collective communication on BlueGene/L systems. 253-262
Distributed Systems
- Cristiana Amza, Gokul Soundararajan, Emmanuel Cecchet:
Transparent caching with strong consistency in dynamic content web sites. 264-273
- Seung Woo Son, Guangyu Chen, Mahmut T. Kandemir:
Disk layout optimization for reducing energy consumption. 274-283
- Thanasis Loukopoulos, Petros Lampsas, Ishfaq Ahmad:
Continuous Replica Placement schemes in distributed systems. 284-292
- Wesley M. Felter, Karthick Rajamani, Tom W. Keller, Cosmin Rusu:
A performance-conserving approach for reducing peak power consumption in server systems. 293-302
Operating Systems
- Dan Tsafrir, Yoav Etsion, Dror G. Feitelson, Scott Kirkpatrick:
System noise, OS clock ticks, and fine-grained parallel applications. 303-312
- Gladys Utrera, Julita Corbalán, Jesús Labarta:
Another approach to backfilled jobs: applying virtual malleability to expired windows. 313-322
- Weikuan Yu, Shuang Liang, Dhabaleswar K. Panda:
High performance support of parallel virtual file system (PVFS2) over Quadrics. 323-331
- Richard C. Murphy, Arun Rodrigues, Peter M. Kogge, Keith D. Underwood:
The implications of working set analysis on supercomputing memory hierarchy design. 332-340
Applications
- Brian S. White, Sally A. McKee, Bronis R. de Supinski, Brian Miller, Daniel J. Quinlan, Martin Schulz:
Improving the computational intensity of unstructured mesh applications. 341-350
- Kai Shen:
Parallel sparse LU factorization on second-class message passing platforms. 351-360
- Matteo Frigo, Volker Strumpen:
Cache oblivious stencil computations. 361-366
- Christos D. Antonopoulos, Xiaoning Ding, Andrey N. Chernikov, Filip Blagojevic, Dimitrios S. Nikolopoulos, Nikos Chrisochoides:
Multigrain parallel Delaunay Mesh generation: challenges and opportunities for multithreaded architectures. 367-376
System-Wide Issues
- Julia Zilber, Ofer Amit, David Talby:
What is worth learning from parallel workloads?: a user and session based analysis. 377-386
- Henrik Löf, Sverker Holmgren:
affinity-on-next-touch: increasing the performance of an industrial PDE solver on a cc-NUMA system. 387-392
- Ahmad Faraj, Xin Yuan:
Automatic generation and tuning of MPI collective communication routines. 393-402