Skip to content

LyleLuo/ROME

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities

This is the repository for the PPoPP'26 paper: ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities.

ROME is a GPU-optimized All-Pairs Shortest Path (APSP) system that regularizes irregular graph workloads via spatial computation restructuring and an asynchronous execution pipeline to achieve high hardware utilization.

Requirements

Hardware: NVIDIA GPU

We have tested our project on a server equipped with NVIDIA RTX 4090 (Ada Lovelace architecture, 24 GB).

OS & Software:

  • Ubuntu 20.04
  • CUDA >= 12.0
  • flex (2.6.4)
  • Bison (3.5.1)

Setup

# Clone Repo
git clone --recurse-submodules https://github.com/LyleLuo/ROME.git

# Build METIS dependency
cd modules/metis
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DUSE_THREAD=Y -Dintsize=64 ..
cmake --build .
cd ../../..

# Build Scotch dependency
sudo apt install -y flex bison
cd modules/scotch
mkdir -p build && cd build
cmake .. -DBUILD_FORTRAN=OFF -DBUILD_PTSCOTCH=OFF -DINTSIZE=64 -DSCOTCH_DETERMINISTIC=FULL
make -j4
cd ../../..

# Build ROME
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DUSE_THREAD=Y ..
cmake --build .

How to Run

Test Graphs

We support Matrix Market format (.mtx files). A sample graph delaunay_n13.mtx is provided in the mtx/ directory.

ROME

To run the experiments in the paper, please follow the instructions below:

OMP_NUM_THREADS=32 CUDA_VISIBLE_DEVICES=0 "$EXECUTABLE" "$MTX_PATH" ignore

For example:

OMP_NUM_THREADS=32 CUDA_VISIBLE_DEVICES=0 ./rome mtx/delaunay_n13.mtx ignore

Engines

ROME's core design includes the Merge Engine, Batch and Tile Engine, and Async Pipeline. These features can be manually disabled before building. In src/APSP/rome.cu, the following switches are available:

  • If NO_ASYNC is defined, the Pipeline will be disabled.
  • If BATCH_TILE_ENGINE or MERGE_ENGINE is set to false, those engines will be disabled.
//#define NO_ASYNC
const bool BATCH_TILE_ENGINE = true;
const bool MERGE_ENGINE = true;

Please refer to the paper for details on the functionality of these engines.

Paper

Please refer to this paper for more details.

@inproceedings{luo2026rome,
  title={ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities},
  author={Luo, Weile and Chen, Yuhan and Yu, Xiangrui and Wang, Qiang and Fan, Ruibo and Liu, Hongyuan and Chu, Xiaowen},
  booktitle={Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming},
  year={2026}
}

Related Code

  • SuperFW: Acknowledgment - We reused some code from this repository that is not related to ROME's core framework, and integrated its SuperFW implementation into ROME for correctness verification.
  • FastAPSP
  • cuGraph

About

ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors