High-level C++ for Accelerator Clusters
Updated Nov 27, 2024 - C++
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
Almost trivial distributed parallelization of stencil-based GPU and CPU applications on a regular staggered grid
GPU Framework for Radio Astronomical Image Synthesis
Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
Chains stable-diffusion-webui instances together to facilitate faster image generation.
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
A dual-GPU DEM solver with complex grain geometry support
Multi-GPU pre-training on one machine for BERT from scratch, without Horovod (data parallelism)
The Forge Cross-Platform Rendering Framework: PC Windows, Steamdeck (native), Ray Tracing, macOS / iOS, Android, XBOX, PS4, PS5, Switch, Quest 2
POT3D: High Performance Potential Field Solver
Multi-threaded GUI manager for mass creation of AI-generated art with support for multiple GPUs.
GPU-ready Dockerfile to run the Stability.AI stable-diffusion model v2 with a simple web interface. Includes multi-GPU support.
Hands-on NVIDIA CUDA-Q workshop at RWTH Aachen University & Technische Universität Berlin, June 2024.
Multi-GPU & CPU OpenCL kernel executor with load-balancing as if there is one big GPU.
Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involvement of the CPU beyond the initial kernel launch.