GitHub - arc-research-lab/AGILE: AGILE: Lightweight and Efficient Asynchronous GPU-SSD Integration (SC25)


AGILE: Lightweight and Efficient Asynchronous GPU-SSD Integration

Citation

The paper is available on arXiv.

@inproceedings{sc25agile,
author = {Yang, Zhuoping and Zhuang, Jinming and Chen, Xingzhen and Jones, Alex K and Zhou, Peipei},
title = {AGILE: Lightweight and Efficient Asynchronous GPU-SSD Integration},
year = {2025},
booktitle = {Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025},
series = {Supercomputing '25}
}

Thanks for your interest in this project. Your growing engagement will inspire us to improve and enhance AGILE continually.


🎉 The AGILE tutorial examples are being actively updated! See ./tutorial 👀

Note: The benchmarks currently use deprecated APIs to initialize SSDs. These may not work and will be updated soon. As a temporary workaround, use the host code in the tutorial folder together with the CUDA kernel code from the benchmarks.

Installation

AGILE requires a modified version of GDRCopy, which is included in this repo (./driver/gdrcopy). Please follow the instructions to build and install it.

Note: The requirement for this modified GDRCopy driver will be removed soon.

To use AGILE, you must back up all data and switch the target NVMe SSDs to the AGILE NVMe driver. See "switch to AGILE driver" for more details.

AGILE relies on the GPU's BAR1 memory as the source and destination in GPU-SSD peer-to-peer communication. If the default BAR1 memory size is too small (typically 128MB), please refer to the NVIDIA Display Mode Selector Tool (1.67.0) to increase the BAR1 memory size.
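To see the current BAR1 size before deciding whether to change it, the memory report from `nvidia-smi -q -d MEMORY` can be inspected. The helper below is a minimal sketch and not part of AGILE: `bar1_total_mib` is a hypothetical name, and it assumes the report contains a `BAR1 Memory Usage` section followed by a `Total : <N> MiB` line. It prints the value for the first GPU only.

```shell
# Hypothetical helper (not part of AGILE): reads `nvidia-smi -q -d MEMORY`
# output on stdin and prints the BAR1 total in MiB for the first GPU.
# Assumes a "BAR1 Memory Usage" header followed by a "Total : <N> MiB" line.
bar1_total_mib() {
    awk '/BAR1 Memory Usage/ { in_bar1 = 1; next }
         in_bar1 && /Total/  { print $3; exit }'
}
```

Typical use on a machine with the NVIDIA driver installed would be `nvidia-smi -q -d MEMORY | bar1_total_mib`.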

Disable the IOMMU in /etc/default/grub by adding the intel_iommu=off flag to GRUB_CMDLINE_LINUX_DEFAULT. Then update GRUB (sudo update-grub) and reboot the machine (sudo reboot).
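The GRUB edit described above can be scripted. The following is a minimal sketch, not a script shipped with AGILE: `add_iommu_off` is a hypothetical helper name, and it assumes GNU sed and a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` line. It leaves the file unchanged if the flag is already present.

```shell
# Hypothetical helper (not part of AGILE): append intel_iommu=off to
# GRUB_CMDLINE_LINUX_DEFAULT in the given file (default: /etc/default/grub).
# Assumes GNU sed and a double-quoted value on that line.
add_iommu_off() {
    grub_file="${1:-/etc/default/grub}"
    # Do nothing if the flag is already on the GRUB_CMDLINE_LINUX_DEFAULT line.
    if grep -Eq '^GRUB_CMDLINE_LINUX_DEFAULT=.*intel_iommu=off' "$grub_file"; then
        return 0
    fi
    # Insert the flag just before the closing double quote of the value.
    sed -i -E 's/^(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*)"/\1 intel_iommu=off"/' "$grub_file"
}
```

Run it as root against /etc/default/grub after taking a backup of that file, then update GRUB and reboot. After the reboot, `cat /proc/cmdline` shows whether the kernel was booted with the flag.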

Experiments

AGILE has been evaluated on a Dell R750 server running Ubuntu 20.04, equipped with an NVIDIA RTX 5000 Ada GPU, a Dell Ent NVMe AGN MU AIC 1.6TB SSD, and two Samsung 990 PRO 1TB SSDs. The NVIDIA driver version is 550.54, and the CUDA version is 12.8.

To set up the BaM baseline, please refer to https://github.com/ZaidQureshi/bam. The BaM versions of the benchmarks can be found at ./baseline/benchmarks.

Experimental results in Figures 4 - 12

Table: Bash scripts for reproducing the results in Figures 4 - 11.

Figures	Corresponding Scripts
Figure 4	`run_ctc.sh`
Figure 5	`rand_read.sh`
Figure 6	`rand_write.sh`
Figure 7 - 10	`run_dlrm.sh` & `auto_dlrm.sh`
Figure 11	`run_bfs.sh` & `run_spmv.sh`

Todo-List

We will keep updating AGILE with more features, and you are more than welcome to request more functionalities. Currently, we have the following plans for improving AGILE:

