Georg Hager
Person information
- affiliation: Erlangen National High Performance Computing Center, Germany
2020 – today
- 2024
- [c53]Jan Laukemann, Thomas Gruber, Georg Hager, Dossay Oryspayev, Gerhard Wellein:
CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion. IPDPS 2024: 350-360
- [i81]Dane C. Lacey, Christie L. Alappat, Florian Lange, Georg Hager, Holger Fehske, Gerhard Wellein:
Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels. CoRR abs/2405.12525 (2024)
- [i80]Jan Laukemann, Georg Hager, Gerhard Wellein:
Microarchitectural comparison and in-core modeling of state-of-the-art CPUs: Grace, Sapphire Rapids, and Genoa. CoRR abs/2409.08108 (2024)
- 2023
- [j40]Ayesha Afzal, Georg Hager, Stefano Markidis, Gerhard Wellein:
Making applications faster by asynchronous execution: Slowing down processes or relaxing MPI collectives. Future Gener. Comput. Syst. 148: 472-487 (2023)
- [j39]Rafael Ravedutti Lucio Machado, Jan Eitzinger, Jan Laukemann, Georg Hager, Harald Köstler, Gerhard Wellein:
MD-Bench: A performance-focused prototyping harness for state-of-the-art short-range molecular dynamics algorithms. Future Gener. Comput. Syst. 149: 25-38 (2023)
- [j38]Dominik Ernst, Markus Holzer, Georg Hager, Matthias Knorr, Gerhard Wellein:
Analytical performance estimation during code generation on modern GPUs. J. Parallel Distributed Comput. 173: 152-167 (2023)
- [j37]Andreas Alvermann, Georg Hager, Holger Fehske:
Orthogonal Layers of Parallelism in Large-Scale Eigenvalue Computations. ACM Trans. Parallel Comput. 10(3): 16:1-16:31 (2023)
- [j36]Christie L. Alappat, Georg Hager, Olaf Schenk, Gerhard Wellein:
Level-Based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication. IEEE Trans. Parallel Distributed Syst. 34(2): 581-597 (2023)
- [j35]Ayesha Afzal, Georg Hager, Gerhard Wellein:
The Role of Idle Waves, Desynchronization, and Bottleneck Evasion in the Performance of Parallel Programs. IEEE Trans. Parallel Distributed Syst. 34(2): 623-638 (2023)
- [c52]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Physical Oscillator Model for Supercomputing. SC Workshops 2023: 1229-1235
- [c51]Ayesha Afzal, Georg Hager, Gerhard Wellein:
SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study. SC Workshops 2023: 1245-1254
- [c50]Georg Hager:
Application Knowledge Required: Performance Modeling for Fun and Profit. ICPE 2023: 5
- [c49]Jan Laukemann, Georg Hager:
Core-Level Performance Engineering with the Open-Source Architecture Code Analyzer (OSACA) and the Compiler Explorer. ICPE (Companion) 2023: 127-131
- [i79]Ayesha Afzal, Georg Hager, Stefano Markidis, Gerhard Wellein:
Making Applications Faster by Asynchronous Execution: Slowing Down Processes or Relaxing MPI Collectives. CoRR abs/2302.12164 (2023)
- [i78]Rafael Ravedutti Lucio Machado, Jan Eitzinger, Jan Laukemann, Georg Hager, Harald Köstler, Gerhard Wellein:
MD-Bench: Engineering the in-core performance of short-range molecular dynamics kernels from state-of-the-art simulation packages. CoRR abs/2302.14660 (2023)
- [i77]Christie L. Alappat, Jonas Thies, Georg Hager, Holger Fehske, Gerhard Wellein:
Algebraic Temporal Blocking for Sparse Iterative Solvers on Multi-Core CPUs. CoRR abs/2309.02228 (2023)
- [i76]Ayesha Afzal, Georg Hager, Gerhard Wellein:
SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study. CoRR abs/2309.05373 (2023)
- [i75]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Physical Oscillator Model for Supercomputing. CoRR abs/2310.05701 (2023)
- [i74]Jan Laukemann, Thomas Gruber, Georg Hager, Dossay Oryspayev, Gerhard Wellein:
CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion. CoRR abs/2311.04797 (2023)
- 2022
- [j34]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Analytic performance model for parallel overlapping memory-bound kernels. Concurr. Comput. Pract. Exp. 34(10) (2022)
- [j33]Christie L. Alappat, Nils Meyer, Jan Laukemann, Thomas Gruber, Georg Hager, Gerhard Wellein, Tilo Wettig:
Execution-Cache-Memory modeling and performance tuning of sparse matrix-vector multiplication and Lattice quantum chromodynamics on A64FX. Concurr. Comput. Pract. Exp. 34(20) (2022)
- [c48]Ayesha Afzal, Gerhard Wellein, Georg Hager:
Addressing White-box Modeling and Simulation Challenges in Parallel Computing. SIGSIM-PADS 2022: 25-26
- [c47]Ayesha Afzal, Georg Hager, Gerhard Wellein, Stefano Markidis:
Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications. PPAM (1) 2022: 155-170
- [i73]Dominik Ernst, Markus Holzer, Georg Hager, Matthias Knorr, Gerhard Wellein:
Analytical Performance Estimation during Code Generation on Modern GPUs. CoRR abs/2204.14242 (2022)
- [i72]Christie L. Alappat, Georg Hager, Olaf Schenk, Gerhard Wellein:
Level-based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication. CoRR abs/2205.01598 (2022)
- [i71]Ayesha Afzal, Georg Hager, Gerhard Wellein:
The Role of Idle Waves, Desynchronization, and Bottleneck Evasion in the Performance of Parallel Programs. CoRR abs/2205.04190 (2022)
- [i70]Ayesha Afzal, Georg Hager, Gerhard Wellein, Stefano Markidis:
Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications. CoRR abs/2205.13963 (2022)
- [i69]Andreas Alvermann, Georg Hager, Holger Fehske:
Orthogonal layers of parallelism in large-scale eigenvalue computations. CoRR abs/2209.01974 (2022)
- 2021
- [j32]Dominik Ernst, Georg Hager, Jonas Thies, Gerhard Wellein:
Performance engineering for real and complex tall & skinny matrix multiplication kernels on GPUs. Int. J. High Perform. Comput. Appl. 35(1) (2021)
- [j31]Andreas Pieper, Georg Hager, Holger Fehske:
A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials. Int. J. High Perform. Comput. Appl. 35(1) (2021)
- [c46]Christie L. Alappat, Johannes Seiferth, Georg Hager, Matthias Korch, Thomas Rauber, Gerhard Wellein:
YaskSite: Stencil Optimization Techniques Applied to Explicit ODE Methods on Modern Architectures. CGO 2021: 174-186
- [c45]Dominik Ernst, Georg Hager, Matthias Knorr, Gerhard Wellein, Markus Holzer:
Opening the Black Box: Performance Estimation during Code Generation for GPUs. SBAC-PAD 2021: 22-32
- [c44]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact. ISC 2021: 351-371
- [i68]Christie L. Alappat, Nils Meyer, Jan Laukemann, Thomas Gruber, Georg Hager, Gerhard Wellein, Tilo Wettig:
ECM modeling and performance tuning of SpMV and Lattice QCD on A64FX. CoRR abs/2103.03013 (2021)
- [i67]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact. CoRR abs/2103.03175 (2021)
- [i66]Dominik Ernst, Georg Hager, Markus Holzer, Matthias Knorr, Gerhard Wellein:
Opening the Black Box: Performance Estimation during Code Generation for GPUs. CoRR abs/2107.01143 (2021)
- 2020
- [j30]Francesco Cremonesi, Georg Hager, Gerhard Wellein, Felix Schürmann:
Analytic performance modeling and analysis of detailed neuron simulations. Int. J. High Perform. Comput. Appl. 34(4) (2020)
- [j29]Johannes Hofmann, Christie L. Alappat, Georg Hager, Dietmar Fey, Gerhard Wellein:
Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors. Supercomput. Front. Innov. 7(2): 54-78 (2020)
- [j28]Jonas Thies, Melven Röhrig-Zöllner, Nigel Overmars, Achim Basermann, Dominik Ernst, Georg Hager, Gerhard Wellein:
PHIST: A Pipelined, Hybrid-Parallel Iterative Solver Toolkit. ACM Trans. Math. Softw. 46(4): 31:1-31:26 (2020)
- [j27]Christie L. Alappat, Achim Basermann, Alan R. Bishop, Holger Fehske, Georg Hager, Olaf Schenk, Jonas Thies, Gerhard Wellein:
A Recursive Algebraic Coloring Technique for Hardware-efficient Symmetric Sparse Matrix-vector Multiplication. ACM Trans. Parallel Comput. 7(3): 19:1-19:37 (2020)
- [c43]Christie L. Alappat, Jan Laukemann, Thomas Gruber, Georg Hager, Gerhard Wellein, Nils Meyer, Tilo Wettig:
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX. PMBS@SC 2020: 1-7
- [c42]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Desynchronization and Wave Pattern Formation in MPI-Parallel and Hybrid Memory-Bound Programs. ISC 2020: 391-411
- [c41]Christie L. Alappat, Johannes Hofmann, Georg Hager, Holger Fehske, Alan R. Bishop, Gerhard Wellein:
Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors. ISC 2020: 412-433
- [p4]Christie L. Alappat, Andreas Alvermann, Achim Basermann, Holger Fehske, Yasunori Futamura, Martin Galgon, Georg Hager, Sarah Huber, Akira Imakura, Masatoshi Kawai, Moritz Kreutzer, Bruno Lang, Kengo Nakajima, Melven Röhrig-Zöllner, Tetsuya Sakurai, Faisal Shahzad, Jonas Thies, Gerhard Wellein:
ESSEX: Equipping Sparse Solvers For Exascale. Software for Exascale Computing 2020: 143-187
- [i65]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Desynchronization and Wave Pattern Formation in MPI-Parallel and Hybrid Memory-Bound Programs. CoRR abs/2002.02989 (2020)
- [i64]Christie L. Alappat, Johannes Hofmann, Georg Hager, Holger Fehske, Alan R. Bishop, Gerhard Wellein:
Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors. CoRR abs/2002.03344 (2020)
- [i63]Christie L. Alappat, Jan Laukemann, Thomas Gruber, Georg Hager, Gerhard Wellein, Nils Meyer, Tilo Wettig:
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX. CoRR abs/2009.13903 (2020)
- [i62]Ayesha Afzal, Georg Hager, Gerhard Wellein:
An analytic performance model for overlapping execution of memory-bound loop kernels on multicore CPUs. CoRR abs/2011.00243 (2020)
2010 – 2019
- 2019
- [j26]Julian Hornich, Julian Hammer, Georg Hager, Thomas Gruber, Gerhard Wellein:
Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT. Supercomput. Front. Innov. 6(3): 4-25 (2019)
- [j25]Faisal Shahzad, Jonas Thies, Moritz Kreutzer, Thomas Zeiser, Georg Hager, Gerhard Wellein:
CRAFT: A Library for Easier Application-Level Checkpoint/Restart and Automatic Fault Tolerance. IEEE Trans. Parallel Distributed Syst. 30(3): 501-514 (2019)
- [c40]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study. CLUSTER 2019: 1-10
- [c39]Dominik Ernst, Georg Hager, Jonas Thies, Gerhard Wellein:
Performance Engineering for a Tall & Skinny Matrix Multiplication Kernels on GPUs. PPAM (1) 2019: 505-515
- [c38]Jan Laukemann, Julian Hammer, Georg Hager, Gerhard Wellein:
Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels. PMBS@SC 2019: 1-6
- [i61]Francesco Cremonesi, Georg Hager, Gerhard Wellein, Felix Schürmann:
Analytic Performance Modeling and Analysis of Detailed Neuron Simulations. CoRR abs/1901.05344 (2019)
- [i60]Dominik Ernst, Georg Hager, Jonas Thies, Gerhard Wellein:
Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs. CoRR abs/1905.03136 (2019)
- [i59]Ayesha Afzal, Georg Hager, Gerhard Wellein:
Delay Propagation and Overlapping Mechanisms on Clusters: A Case Study of Idle Periods based on Workload, Communication, and Delay Granularity. CoRR abs/1905.10603 (2019)
- [i58]Julian Hornich, Julian Hammer, Georg Hager, Thomas Gruber, Gerhard Wellein:
Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT. CoRR abs/1906.08138 (2019)
- [i57]Johannes Hofmann, Christie L. Alappat, Georg Hager, Dietmar Fey, Gerhard Wellein:
Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors. CoRR abs/1907.00048 (2019)
- [i56]Christie L. Alappat, Georg Hager, Olaf Schenk, Jonas Thies, Achim Basermann, Alan R. Bishop, Holger Fehske, Gerhard Wellein:
A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication. CoRR abs/1907.06487 (2019)
- [i55]Jan Laukemann, Julian Hammer, Georg Hager, Gerhard Wellein:
Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels. CoRR abs/1910.00214 (2019)
- 2018
- [j24]Faisal Shahzad, Moritz Kreutzer, Thomas Zeiser, Rui Machado, Andreas Pieper, Georg Hager, Gerhard Wellein:
Building and utilizing fault tolerance support tools for the GASPI applications. Int. J. High Perform. Comput. Appl. 32(5): 613-626 (2018)
- [j23]Georg Hager, Gerhard Wellein:
Performance Engineering. Inform. Spektrum 41(5): 323-327 (2018)
- [j22]Tareq M. Malas, Georg Hager, Hatem Ltaief, David E. Keyes:
Multidimensional Intratile Parallelization for Memory-Starved Stencil Computations. ACM Trans. Parallel Comput. 4(3): 12:1-12:32 (2018)
- [c37]Markus Wittmann, Georg Hager, Radim Janalík, Martin Lanser, Axel Klawonn, Oliver Rheinbach, Olaf Schenk, Gerhard Wellein:
Multicore Performance Engineering of Sparse Triangular Solves Using a Modified Roofline Model. SBAC-PAD 2018: 233-241
- [c36]Jan Laukemann, Julian Hammer, Johannes Hofmann, Georg Hager, Gerhard Wellein:
Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures. PMBS@SC 2018: 121-131
- [c35]Johannes Hofmann, Georg Hager, Dietmar Fey:
On the Accuracy and Usefulness of Analytic Energy Models for Contemporary Multicore Processors. ISC 2018: 22-43
- [c34]Moritz Kreutzer, Dominik Ernst, Alan R. Bishop, Holger Fehske, Georg Hager, Kengo Nakajima, Gerhard Wellein:
Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs. ISC 2018: 329-349
- [i54]Johannes Hofmann, Georg Hager, Dietmar Fey:
On the accuracy and usefulness of analytic energy models for contemporary multicore processors. CoRR abs/1803.01618 (2018)
- [i53]Moritz Kreutzer, Georg Hager, Dominik Ernst, Holger Fehske, Alan R. Bishop, Gerhard Wellein:
Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs. CoRR abs/1803.02156 (2018)
- [i52]Jan Laukemann, Julian Hammer, Johannes Hofmann, Georg Hager, Gerhard Wellein:
Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures. CoRR abs/1809.00912 (2018)
- 2017
- [j21]Johannes Hofmann, Dietmar Fey, Michael Riedmann, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Performance analysis of the Kahan-enhanced scalar product on current multi-core and many-core processors. Concurr. Comput. Pract. Exp. 29(9) (2017)
- [j20]Moritz Kreutzer, Jonas Thies, Melven Röhrig-Zöllner, Andreas Pieper, Faisal Shahzad, Martin Galgon, Achim Basermann, Holger Fehske, Georg Hager, Gerhard Wellein:
GHOST: Building Blocks for High Performance Sparse Linear Algebra on Heterogeneous Systems. Int. J. Parallel Program. 45(5): 1046-1072 (2017)
- [c33]Thomas Röhl, Jan Eitzinger, Georg Hager, Gerhard Wellein:
LIKWID Monitoring Stack: A Flexible Framework Enabling Job Specific Performance monitoring for the masses. CLUSTER 2017: 781-784
- [c32]Johannes Hofmann, Georg Hager, Gerhard Wellein, Dietmar Fey:
An Analysis of Core- and Chip-Level Architectural Features in Four Generations of Intel Server Processors. ISC 2017: 294-314
- [i51]Julian Hammer, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels. CoRR abs/1702.04653 (2017)
- [i50]Johannes Hofmann, Georg Hager, Gerhard Wellein, Dietmar Fey:
An analysis of core- and chip-level architectural features in four generations of Intel server processors. CoRR abs/1702.07554 (2017)
- [i49]Thomas Röhl, Jan Eitzinger, Georg Hager, Gerhard Wellein:
LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses. CoRR abs/1708.01476 (2017)
- [i48]Faisal Shahzad, Jonas Thies, Moritz Kreutzer, Thomas Zeiser, Georg Hager, Gerhard Wellein:
CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance. CoRR abs/1708.02030 (2017)
- [i47]Andreas Pieper, Georg Hager, Holger Fehske:
PVSC-DTM: A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials. CoRR abs/1708.09689 (2017)
- [i46]Thomas Röhl, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Validation of hardware events for successful performance pattern identification in High Performance Computing. CoRR abs/1710.04094 (2017)
- 2016
- [j19]Georg Hager, Darren J. Kerbyson, Abhinav Vishnu, Gerhard Wellein:
Performance and power for highly parallel systems. Concurr. Comput. Pract. Exp. 28(2): 187-188 (2016)
- [j18]Georg Hager, Jan Treibig, Johannes Habich, Gerhard Wellein:
Exploring performance and power properties of modern multi-core chips via simple machine models. Concurr. Comput. Pract. Exp. 28(2): 189-210 (2016)
- [j17]Markus Wittmann, Georg Hager, Thomas Zeiser, Jan Treibig, Gerhard Wellein:
Chip-level and multi-node analysis of energy-optimized lattice Boltzmann CFD simulations. Concurr. Comput. Pract. Exp. 28(7): 2295-2315 (2016)
- [j16]Andreas Pieper, Moritz Kreutzer, Andreas Alvermann, Martin Galgon, Holger Fehske, Georg Hager, Bruno Lang, Gerhard Wellein:
High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations. J. Comput. Phys. 325: 226-243 (2016)
- [c31]Johannes Hofmann, Dietmar Fey, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Analysis of Intel's Haswell Microarchitecture Using the ECM Model and Microbenchmarks. ARCS 2016: 210-222
- [c30]Tareq M. Malas, Julian Hornich, Georg Hager, Hatem Ltaief, Christoph Pflaum, David E. Keyes:
Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization. IPDPS 2016: 142-151
- [p3]Jonas Thies, Martin Galgon, Faisal Shahzad, Andreas Alvermann, Moritz Kreutzer, Andreas Pieper, Melven Röhrig-Zöllner, Achim Basermann, Holger Fehske, Georg Hager, Bruno Lang, Gerhard Wellein:
Towards an Exascale Enabled Sparse Solver Repository. Software for Exascale Computing 2016: 295-316
- [p2]Moritz Kreutzer, Jonas Thies, Andreas Pieper, Andreas Alvermann, Martin Galgon, Melven Röhrig-Zöllner, Faisal Shahzad, Achim Basermann, Alan R. Bishop, Holger Fehske, Georg Hager, Bruno Lang, Gerhard Wellein:
Performance Engineering and Energy Efficiency of Building Blocks for Large, Sparse Eigenvalue Computations on Heterogeneous Supercomputers. Software for Exascale Computing 2016: 317-338
- [i45]Johannes Hofmann, Dietmar Fey, Michael Riedmann, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Performance analysis of the Kahan-enhanced scalar product on current multi- and manycore processors. CoRR abs/1604.01890 (2016)
- 2015
- [j15]Tareq M. Malas, Georg Hager, Hatem Ltaief, Holger Stengel, Gerhard Wellein, David E. Keyes:
Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates. SIAM J. Sci. Comput. 37(4) (2015)
- [j14]Melven Röhrig-Zöllner, Jonas Thies, Moritz Kreutzer, Andreas Alvermann, Andreas Pieper, Achim Basermann, Georg Hager, Gerhard Wellein, Holger Fehske:
Increasing the Performance of the Jacobi-Davidson Method by Blocking. SIAM J. Sci. Comput. 37(6) (2015)
- [c29]Faisal Shahzad, Moritz Kreutzer, Thomas Zeiser, Rui Machado, Andreas Pieper, Georg Hager, Gerhard Wellein:
Building a Fault Tolerant Application Using the GASPI Communication Layer. CLUSTER 2015: 580-587
- [c28]Holger Stengel, Jan Treibig, Georg Hager, Gerhard Wellein:
Quantifying Performance Bottlenecks of Stencil Computations Using the Execution-Cache-Memory Model. ICS 2015: 207-216
- [c27]Moritz Kreutzer, Andreas Pieper, Georg Hager, Gerhard Wellein, Andreas Alvermann, Holger Fehske:
Performance Engineering of the Kernel Polynomal Method on Large-Scale CPU-GPU Systems. IPDPS 2015: 417-426
- [c26]Johannes Hofmann, Dietmar Fey, Michael Riedmann, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Performance Analysis of the Kahan-Enhanced Scalar Product on Current Multicore Processors. PPAM (1) 2015: 63-73
- [c25]Julian Hammer, Georg Hager, Jan Eitzinger, Gerhard Wellein:
Automatic loop kernel analysis and performance modeling with Kerncraft. PMBS@SC 2015: 4:1-4:11
- [i44]Johannes Hofmann, Dietmar Fey, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Performance analysis of the Kahan-enhanced scalar product on current multicore processors. CoRR abs/1505.02586 (2015)
- [i43]Faisal Shahzad, Moritz Kreutzer, Thomas Zeiser, Rui Machado, Andreas Pieper, Georg Hager, Gerhard Wellein:
Building a fault tolerant application using the GASPI communication layer. CoRR abs/1505.04628 (2015)
- [i42]Markus Wittmann, Thomas Zeiser, Georg Hager, Gerhard Wellein:
Short Note on Costs of Floating Point Operations on current x86-64 Architectures: Denormals, Overflow, Underflow, and Division by Zero. CoRR abs/1506.03997 (2015)
- [i41]Moritz Kreutzer, Jonas Thies, Melven Röhrig-Zöllner, Andreas Pieper, Faisal Shahzad, Martin Galgon, Achim Basermann, Holger Fehske, Georg Hager, Gerhard Wellein:
GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems. CoRR abs/1507.08101 (2015)
- [i40]Julian Hammer, Georg Hager, Jan Eitzinger, Gerhard Wellein:
Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft. CoRR abs/1509.03778 (2015)
- [i39]Andreas Pieper, Moritz Kreutzer, Martin Galgon, Andreas Alvermann, Holger Fehske, Georg Hager, Bruno Lang, Gerhard Wellein:
High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations. CoRR abs/1510.04895 (2015)
- [i38]Tareq M. Malas, Georg Hager, Hatem Ltaief, David E. Keyes:
Multi-dimensional intra-tile parallelization for memory-starved stencil computations. CoRR abs/1510.04995 (2015)
- [i37]Tareq M. Malas, Julian Hornich, Georg Hager, Hatem Ltaief, Christoph Pflaum, David E. Keyes:
Optimization of an electromagnetics code with multicore wavefront diamond blocking and multi-dimensional intra-tile parallelization. CoRR abs/1510.05218 (2015)
- [i36]Johannes Hofmann, Dietmar Fey, Jan Eitzinger, Georg Hager, Gerhard Wellein:
Analysis of Intel's Haswell Microarchitecture Using The ECM Model and Microbenchmarks. CoRR abs/1511.03639 (2015)
- 2014
- [j13]Stefan Kronawitter, Holger Stengel, Georg Hager, Christian Lengauer:
Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model. Parallel Process. Lett. 24(3) (2014)
- [j12]Moritz Kreutzer, Georg Hager, Gerhard Wellein, Holger Fehske, Alan R. Bishop:
A Unified Sparse Matrix Data Format for Efficient General Sparse Matrix-Vector Multiplication on Modern Processors with Wide SIMD Units. SIAM J. Sci. Comput. 36(5) (2014)
- [c24]Johannes Hofmann, Jan Treibig, Georg Hager, Gerhard Wellein:
Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator. ARCS Workshops 2014: 1-8
- [c23]Andreas Alvermann, Achim Basermann, Holger Fehske, Martin Galgon, Georg Hager, Moritz Kreutzer, Lukas Krämer, Bruno Lang, Andreas Pieper, Melven Röhrig-Zöllner, Faisal Shahzad, Jonas Thies, Gerhard Wellein:
ESSEX: Equipping Sparse Solvers for Exascale. Euro-Par Workshops (2) 2014: 577-588
- [c22]Thomas Roehl, Jan Treibig, Georg Hager, Gerhard Wellein:
Overhead Analysis of Performance Counter Measurements. ICPP Workshops 2014: 176-185
- [c21]Johannes Hofmann, Jan Treibig, Georg Hager, Gerhard Wellein:
Comparing the performance of different x86 SIMD instruction sets for a medical imaging application on modern multi- and manycore chips. WPMVP@PPoPP 2014: 57-64
- [i35]Johannes Hofmann, Jan Treibig, Georg Hager, Gerhard Wellein:
Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator. CoRR abs/1401.3615 (2014)
- [i34]Johannes Hofmann, Jan Treibig, Georg Hager, Gerhard Wellein:
Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips. CoRR abs/1401.7494 (2014)
- [i33]Markus Wittmann, Thomas Zeiser, Georg Hager, Gerhard Wellein:
Modeling and analyzing performance for highly optimized propagation steps of the lattice Boltzmann method on sparse lattices. CoRR abs/1410.0412 (2014)
- [i32]Tareq M. Malas, Georg Hager, Hatem Ltaief, Holger Stengel, Gerhard Wellein, David E. Keyes:
Multicore-optimized wavefront diamond blocking for optimizing stencil updates. CoRR abs/1410.3060 (2014)
- [i31]Holger Stengel, Jan Treibig, Georg Hager, Gerhard Wellein:
Quantifying performance bottlenecks of stencil computations using the Execution-Cache-Memory model. CoRR abs/1410.5010 (2014)
- [i30]Moritz Kreutzer, Georg Hager, Gerhard Wellein, Andreas Pieper, Andreas Alvermann, Holger Fehske:
Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems. CoRR abs/1410.5242 (2014)
- [i29]Tareq M. Malas, Georg Hager, Hatem Ltaief, David E. Keyes:
Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking. CoRR abs/1410.5561 (2014)
- 2013
- [j11]Markus Wittmann, Thomas Zeiser, Georg Hager, Gerhard Wellein:
Comparison of different propagation steps for lattice Boltzmann methods. Comput. Math. Appl. 65(6): 924-935 (2013)
- [j10]Jan Treibig, Georg Hager, Hannes G. Hofmann, Joachim Hornegger, Gerhard Wellein:
Pushing the limits for medical image reconstruction on recent standard multicore processors. Int. J. High Perform. Comput. Appl. 27(2): 162-177 (2013)
- [j9]Faisal Shahzad, Markus Wittmann, Moritz Kreutzer, Thomas Zeiser, Georg Hager, Gerhard Wellein:
A Survey of Checkpoint/Restart Techniques on Distributed Memory Systems. Parallel Process. Lett. 23(4) (2013)
- [c20]Tobias Scharpff, Klaus Iglberger, Georg Hager, Ulrich Rüde:
Model-guided performance analysis of the sparse matrix-matrix multiplication. HPCS 2013: 445-452
- [c19]Faisal Shahzad, Markus Wittmann, Thomas Zeiser, Georg Hager, Gerhard Wellein:
An Evaluation of Different I/O Techniques for Checkpoint/Restart. IPDPS Workshops 2013: 1708-1716
- [i28]Markus Wittmann, Georg Hager, Thomas Zeiser, Gerhard Wellein:
Asynchronous MPI for the Masses. CoRR abs/1302.4280 (2013)
- [i27]Tobias Scharpff, Klaus Iglberger, Georg Hager, Ulrich Rüde:
Model-guided Performance Analysis of the Sparse Matrix-Matrix Multiplication. CoRR abs/1303.1651 (2013)
- [i26]Christoph Scheit, Georg Hager, Jan Treibig, Stefan Becker, Gerhard Wellein:
Optimization of FASTEST-3D for Modern Multicore Systems. CoRR abs/1303.4538 (2013)
- [i25]Markus Wittmann, Georg Hager, Thomas Zeiser, Gerhard Wellein:
An analysis of energy-optimized lattice-Boltzmann CFD simulations from the chip to the highly parallel level. CoRR abs/1304.7664 (2013)
- [i24]Moritz Kreutzer, Georg Hager, Gerhard Wellein, Holger Fehske, Alan R. Bishop:
A unified sparse matrix data format for modern processors with wide SIMD units. CoRR abs/1307.6209 (2013)
- 2012
- [j8]Klaus Iglberger, Georg Hager, Jan Treibig, Ulrich Rüde:
Expression Templates Revisited: A Performance Analysis of Current Methodologies. SIAM J. Sci. Comput. 34(2) (2012)
- [c18]Georg Hager:
Performance Engineering: From Numbers to Insight. Euro-Par Workshops 2012: 393-394
- [c17]Jan Treibig, Georg Hager, Gerhard Wellein:
Performance Patterns and Hardware Metrics on Modern Multicore Processors: Best Practices for Performance Engineering. Euro-Par Workshops 2012: 451-460
- [c16]Klaus Iglberger, Georg Hager, Jan Treibig, Ulrich Rüde:
High performance smart expression template math libraries. HPCS 2012: 367-373
- [c15]Moritz Kreutzer, Georg Hager, Gerhard Wellein, Holger Fehske, Achim Basermann, Alan R. Bishop:
Sparse Matrix-vector Multiplication on GPGPU Clusters: A New Storage Format and a Scalable Implementation. IPDPS Workshops 2012: 1696-1702
- [i23]Jan Treibig, Georg Hager, Gerhard Wellein:
Best practices for HPM-assisted performance engineering on modern multicore processors. CoRR abs/1206.3738 (2012)
- [i22]Georg Hager, Jan Treibig, Johannes Habich, Gerhard Wellein:
Exploring performance and power properties of modern multicore chips via simple machine models. CoRR abs/1208.2908 (2012)
- 2011
- [b1]Georg Hager, Gerhard Wellein:
Introduction to High Performance Computing for Scientists and Engineers. Chapman and Hall / CRC computational science series, CRC Press 2011, ISBN 978-1-439-81192-4, pp. I-XXV, 1-330
- [j7]Johannes Habich, Thomas Zeiser, Georg Hager, Gerhard Wellein:
Performance analysis and optimization strategies for a D3Q19 lattice Boltzmann kernel on nVIDIA GPUs using CUDA. Adv. Eng. Softw. 42(5): 266-272 (2011)
- [j6]Jan Treibig, Gerhard Wellein, Georg Hager:
Efficient multicore-aware parallelization strategies for iterative stencil computations. J. Comput. Sci. 2(2): 130-137 (2011)
- [j5]Christian Feichtinger, Johannes Habich, Harald Köstler, Georg Hager, Ulrich Rüde, Gerhard Wellein:
A flexible Patch-based lattice Boltzmann parallelization approach for heterogeneous GPU-CPU clusters. Parallel Comput. 37(9): 536-549 (2011)
- [j4]Gerald Schubert, Holger Fehske, Georg Hager, Gerhard Wellein:
Hybrid-Parallel Sparse Matrix-Vector Multiplication with Explicit Communication Overlap on Current Multicore-Based Systems. Parallel Process. Lett. 21(3): 339-358 (2011)
- [c14]Gerald Schubert, Georg Hager, Holger Fehske, Gerhard Wellein:
Parallel Sparse Matrix-Vector Multiplication as a Test Case for Hybrid MPI+OpenMP Programming. IPDPS Workshops 2011: 1751-1758
- [c13]Jan Treibig, Georg Hager, Gerhard Wellein:
likwid-bench: An Extensible Microbenchmarking Platform for x86 Multicore Compute Nodes. Parallel Tools Workshop 2011: 27-36
- [c12]Jan Treibig, Georg Hager, Gerhard Wellein, Michael Meier:
Poster: LIKWID: lightweight performance tools. SC Companion 2011: 29-30
- [i21]Gerald Schubert, Georg Hager, Holger Fehske, Gerhard Wellein:
Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming. CoRR abs/1101.0091 (2011)
- [i20]Markus Wittmann, Georg Hager:
Optimizing ccNUMA locality for task-parallel execution under OpenMP and TBB on multicore-based systems. CoRR abs/1101.0093 (2011)
- [i19]Klaus Iglberger, Georg Hager, Jan Treibig, Ulrich Rüde:
Expression Templates Revisited: A Performance Analysis of the Current ET Methodology. CoRR abs/1104.1729 (2011)
- [i18]Jan Treibig, Georg Hager, Gerhard Wellein:
LIKWID: Lightweight Performance Tools. CoRR abs/1104.4874 (2011)
- [i17]Jan Treibig, Georg Hager, Hannes G. Hofmann, Joachim Hornegger, Gerhard Wellein:
Pushing the limits for medical image reconstruction on recent standard multicore processors. CoRR abs/1104.5243 (2011)
- [i16]Gerald Schubert, Holger Fehske, Georg Hager, Gerhard Wellein:
Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems. CoRR abs/1106.5908 (2011)
- [i15]Markus Wittmann, Thomas Zeiser, Georg Hager, Gerhard Wellein:
Comparison of different Propagation Steps for the Lattice Boltzmann Method. CoRR abs/1111.0922 (2011)
- [i14]Markus Wittmann, Thomas Zeiser, Georg Hager, Gerhard Wellein:
Domain decomposition and locality optimization for large-scale lattice Boltzmann simulations. CoRR abs/1111.1129 (2011)
- [i13]Johannes Habich, Christian Feichtinger, Harald Köstler, Georg Hager, Gerhard Wellein:
Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results. CoRR abs/1112.0850 (2011)
- [i12]Moritz Kreutzer, Georg Hager, Gerhard Wellein, Holger Fehske, Achim Basermann, Alan R. Bishop:
Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation. CoRR abs/1112.5588 (2011)
- 2010
- [j3]Markus Wittmann, Georg Hager, Jan Treibig, Gerhard Wellein:
Leveraging Shared Caches for Parallel Temporal Blocking of Stencil Codes on Multicore Processors and Clusters. Parallel Process. Lett. 20(4): 359-376 (2010)
- [c11]Jan Treibig, Georg Hager, Gerhard Wellein:
LIKWID: Lightweight Performance Tools. CHPC 2010: 165-175
- [c10]Jan Treibig, Georg Hager, Gerhard Wellein:
LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments. ICPP Workshops 2010: 207-216
- [c9]Markus Wittmann, Georg Hager, Gerhard Wellein:
Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory. IPDPS Workshops 2010: 1-7
- [i11]Jan Treibig, Gerhard Wellein, Georg Hager:
Efficient multicore-aware parallelization strategies for iterative stencil computations. CoRR abs/1004.1741 (2010)
- [i10]Jan Treibig, Georg Hager, Gerhard Wellein:
LIKWID: A lightweight performance-oriented tool suite for x86 multicore environments. CoRR abs/1004.4431 (2010)
- [i9]Markus Wittmann, Georg Hager, Jan Treibig, Gerhard Wellein:
Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. CoRR abs/1006.3148 (2010)
- [i8]Christian Feichtinger, Johannes Habich, Harald Köstler, Georg Hager, Ulrich Rüde, Gerhard Wellein:
A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters. CoRR abs/1007.1388 (2010)
2000 – 2009
- 2009
- [j2]Thomas Zeiser, Georg Hager, Gerhard Wellein:
Benchmark Analysis and Application Results for Lattice Boltzmann Simulations on NEC SX Vector and Intel Nehalem Systems. Parallel Process. Lett. 19(4): 491-511 (2009) - [c8]Gerhard Wellein, Georg Hager, Thomas Zeiser, Markus Wittmann, Holger Fehske:
Efficient Temporal Blocking for Stencil Computations by Multicore-Aware Wavefront Parallelization. COMPSAC (1) 2009: 579-586 - [c7]Thomas Zeiser, Georg Hager, Gerhard Wellein:
The world's fastest CPU and SMP node: Some performance results from the NEC SX-9. IPDPS 2009: 1-8 - [c6]Rolf Rabenseifner, Georg Hager, Gabriele Jost:
Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes. PDP 2009: 427-436 - [c5]Jan Treibig, Georg Hager:
Introducing a Performance Model for Bandwidth-Limited Loop Kernels. PPAM (1) 2009: 615-624 - [i7]Markus Wittmann, Georg Hager:
A Proof of Concept for Optimizing Task Parallelism by Locality Queues. CoRR abs/0902.1884 (2009) - [i6]Jan Treibig, Georg Hager:
Introducing a Performance Model for Bandwidth-Limited Loop Kernels. CoRR abs/0905.0792 (2009) - [i5]Gerald Schubert, Georg Hager, Holger Fehske:
Performance limitations for sparse matrix-vector multiplications on current multicore environments. CoRR abs/0910.4836 (2009) - [i4]Jan Treibig, Georg Hager, Gerhard Wellein:
Multi-core architectures: Complexities of performance prediction and the impact of cache topology. CoRR abs/0910.4865 (2009) - [i3]Markus Wittmann, Georg Hager, Gerhard Wellein:
Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory. CoRR abs/0912.4506 (2009) - 2008
- [j1]Georg Hager, Thomas Zeiser, Gerhard Wellein:
Data Access Characteristics and Optimizations for Sun UltraSPARC T2 and T2+ Systems. Parallel Process. Lett. 18(4): 471-490 (2008) - [c4]Georg Hager, Thomas Zeiser, Gerhard Wellein:
Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. IPDPS 2008: 1-7 - [p1]Thomas Zeiser, Georg Hager, Gerhard Wellein:
Vector Computers in a World of Commodity Clusters, Massively Parallel Systems and Many-Core Many-Threaded CPUs: Recent Experience Based on an Advanced Lattice Boltzmann Flow Solver. High Performance Computing in Science and Engineering 2008: 333-347 - 2007
- [i2]Georg Hager, Thomas Zeiser, Gerhard Wellein:
Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. CoRR abs/0712.2302 (2007) - [i1]Georg Hager, Holger Stengel, Thomas Zeiser, Gerhard Wellein:
RZBENCH: Performance evaluation of current HPC architectures using low-level and application benchmarks. CoRR abs/0712.3389 (2007) - 2006
- [c3]Rolf Rabenseifner, Georg Hager, Gabriele Jost, Rainer Keller:
Hybrid MPI and OpenMP Parallel Programming. PVM/MPI 2006: 11 - 2003
- [c2]Georg Hager, Eric Jeckelmann, Holger Fehske, Gerhard Wellein:
Exact Numerical Treatment of Finite Quantum Systems Using Leading-Edge Supercomputers. HPSC 2003: 165-177 - 2002
- [c1]Gerhard Wellein, Georg Hager, Achim Basermann, Holger Fehske:
Fast Sparse Matrix-Vector Multiplication for TeraFlop/s Computers. VECPAR 2002: 287-301
last updated on 2024-10-15 00:23 CEST by the dblp team
all metadata released as open data under CC0 1.0 license