Running code on a different processor can unlock significant performance and efficiency improvements. GPUs, for example, are well known for accelerating mathematical kernels and data-parallel workloads. With Daisytuner, porting C/C++ applications to accelerators becomes straightforward: it requires only setting a compiler flag.
This repository provides introductory examples that show how docc, the Daisytuner compiler, can offload your code to different processors without any source changes:
- Example 01: Offloading a Simple Kernel to CUDA and Tenstorrent
- Example 02: Daisy Workflows: Build and Test on Multi-Accelerator Servers
- Example 03: Understanding the Offloading Behavior with Libraries
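
To give a sense of what a "simple kernel" might look like, here is a minimal sketch in plain C. It is an illustrative assumption, not code taken from the examples: the function name `saxpy` and its signature are hypothetical. The point is that the loop carries no pragmas, intrinsics, or vendor-specific annotations; any decision to offload it would be made by the compiler, not expressed in the source.

```c
#include <stddef.h>

/* Hypothetical example kernel (not taken from this repository):
 * computes y[i] = a * x[i] + y[i] for n elements.
 * Every iteration is independent of the others, which makes this the
 * kind of data-parallel loop that accelerators handle well. */
void saxpy(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Called as `saxpy(n, 2.0f, x, y)` on two arrays of length `n`, the function updates `y` in place. The same source file would be compiled unchanged for each target.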
For more advanced, application-level examples, see:
- HPCCG: A demonstration of the conjugate gradient method running on CPU, CUDA, and Tenstorrent.
We are finalizing the last components of docc and will release it as a free-to-use compiler in the coming weeks. In the meantime, docc is already available through the Daisy CI/CD workflows. Most of the underlying compiler passes and analyses are also open source and can be explored in our sdfglib repository.