Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involvement of the CPU beyond the initial kernel launch.

Cuda · 22 stars · 3 forks · Updated Apr 25, 2024

hpides / multi-gpu-sorting

This repository contains the source code for our ACM SIGMOD '22 paper (Evaluating Multi-GPU Sorting with Modern Interconnects)

Cuda · 5 stars · 1 fork · Updated Apr 26, 2022

hummingtree / dmma-ptx-gemm

Cuda · 4 stars · 2 forks · Updated Mar 3, 2021

apuaaChen / sparse_transformer_sc21

Cuda · 2 stars · 3 forks · Updated May 30, 2021

qin-yu / cuda-multi-grid-sync

try newly released `cudaLaunchCooperativeKernelMultiDevice()` in CUDA C++

Cuda · 2 stars · 1 fork · Updated May 18, 2019

Muhammet Soytürk mabdullahsoyturk

Lists (2)

Mathematical Optimization

Python Tools

Stars

NVIDIA / cub

NVIDIA / nccl-tests

udacity / cs344

olcf / cuda-training-series

NVIDIA / multi-gpu-programming-models

baidu-research / baidu-allreduce

FZJ-JSC / tutorial-multi-gpu

tbennun / cudnn-training

anilshanbhag / gpu-topk

CUDA-Tutorial / CodeSamples

uuudown / Tartan

poojahira / spmv-cuda

prg-titech / dynasoar

apuaaChen / vectorSparse

UDC-GAC / openCNN

CGCL-codes / Graphchallenge21

ParCoreLab / CPU-Free-model

hpides / multi-gpu-sorting

hummingtree / dmma-ptx-gemm

apuaaChen / sparse_transformer_sc21

qin-yu / cuda-multi-grid-sync