FlashRNA - An Efficient Model for Regulatory Genomics

An efficient genomic sequence-to-function model that significantly improves computational and memory efficiency while maintaining high predictive performance.

Overview

FlashRNA is a sequence-to-function model that significantly improves on the computational efficiency of existing transformer-based models in regulatory genomics while maintaining competitive performance. FlashRNA leverages FlashAttention to address the high computational cost of self-attention layers. Combined with additional improvements in model architecture and training setup, FlashRNA significantly reduces training and inference time. Notably, FlashRNA can be trained from scratch without depending on another pre-trained model.

FlashRNA is trained on a large set of functional genomics tracks, including RNA-seq, DNase-seq, and ATAC-seq. Here, we train FlashRNA on the open-source dataset that was processed and shared by the authors of Borzoi, after some additional processing. However, the model can be trained from scratch on any similar genomic track dataset.

This repository contains the FlashRNA model code, instructions for downloading model weights, and example usage code. We plan on open-sourcing code for model training and evaluation shortly, along with more details on our approach.

Setup

Installation

conda env create -f environment.yml
conda activate flashrna
pip install .

Usage

We have provided the model itself for use. Open-source training and inference setup is coming soon.

Important: FlashRNA uses FlashAttention and currently only runs on GPUs that support FlashAttention. Support for GPUs incompatible with FlashAttention is coming soon.
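Before loading the model, it can help to check whether the local GPU is likely to work. The snippet below is a minimal sketch, not part of the FlashRNA package: the helper names are hypothetical, and the minimum compute capability of 8.0 (Ampere or newer) is an assumption based on the requirements of FlashAttention-2, so please confirm it against the FlashAttention version pinned in environment.yml.

```python
from typing import Optional, Tuple

# Assumption: FlashAttention-2 targets Ampere or newer GPUs (compute capability >= 8.0).
# Adjust this if the installed FlashAttention version documents a different minimum.
MIN_CAPABILITY = (8, 0)


def supports_flash_attention(capability: Optional[Tuple[int, int]]) -> bool:
    """Return True if a (major, minor) compute capability meets the assumed minimum."""
    return capability is not None and tuple(capability) >= MIN_CAPABILITY


def detect_capability() -> Optional[Tuple[int, int]]:
    """Return the compute capability of the current CUDA device, or None if unavailable."""
    try:
        import torch  # imported lazily so the check itself has no hard dependency
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = detect_capability()
    if supports_flash_attention(cap):
        print(f"GPU capability {cap}: likely compatible with FlashAttention")
    else:
        print(f"GPU capability {cap}: likely NOT compatible with FlashAttention")
```

A negative result here only means the assumed threshold was not met; the FlashAttention documentation remains the authoritative source for supported hardware.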

Example Usage

Please check out examples/basic_setup.ipynb

More comprehensive usage examples, along with useful helper functions for inference, will be added soon!

Pre-trained Models

Pre-trained FlashRNA models are available as Wandb Artifacts:

Single FlashRNA models (4 replicates):

deep-genomics-open-source/FlashRNA/single-model-rep-1:v0 (link)
deep-genomics-open-source/FlashRNA/single-model-rep-2:v0 (link)
deep-genomics-open-source/FlashRNA/single-model-rep-3:v0 (link)
deep-genomics-open-source/FlashRNA/single-model-rep-4:v0 (link)

Distilled model:

deep-genomics-open-source/FlashRNA/distilled-model:v0 (link)

These models can be loaded using the following methods:

from flash_rna.models import FlashRNA

# Requires Wandb login
model = FlashRNA.from_ckpt(wandb_artifact="deep-genomics-open-source/FlashRNA/single-model-rep-1:v0")

# Alternatively, manually download the checkpoint files from URLs
model = FlashRNA.from_ckpt(ckpt_path="path/to/downloaded/model.ckpt")
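
If you prefer to fetch checkpoints programmatically rather than through the links, the artifact names listed above can be built and downloaded with the Wandb public API. This is a rough sketch under stated assumptions: replicate_artifact, all_replicate_artifacts, and download_artifact are hypothetical helpers that do not exist in flash_rna, and the layout of files inside each artifact is not documented here, so inspect the downloaded directory to find the checkpoint file.

```python
from typing import List

# Taken from the artifact list in this README.
ARTIFACT_PREFIX = "deep-genomics-open-source/FlashRNA"


def replicate_artifact(rep: int) -> str:
    """Build the artifact name for one of the four single-model replicates."""
    if rep not in (1, 2, 3, 4):
        raise ValueError(f"replicate must be 1-4, got {rep}")
    return f"{ARTIFACT_PREFIX}/single-model-rep-{rep}:v0"


def all_replicate_artifacts() -> List[str]:
    """Return the artifact names for all four replicates, in order."""
    return [replicate_artifact(rep) for rep in range(1, 5)]


def download_artifact(name: str) -> str:
    """Download an artifact and return the local directory it was saved to.

    Requires the wandb package and a prior `wandb login`.
    """
    import wandb  # imported lazily; only needed when actually downloading

    artifact = wandb.Api().artifact(name)
    return artifact.download()


if __name__ == "__main__":
    for name in all_replicate_artifacts():
        print(name)
```

The directory returned by download_artifact should contain the checkpoint, whose path can then be passed to FlashRNA.from_ckpt(ckpt_path=...) as shown above.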

Contact

Please contact andrew.jung (at) deepgenomics.com, andrewjung (at) psi.toronto.edu, or open a GitHub issue for questions.

Preprint and Citation

To help understand FlashRNA, we are sharing a short preprint: https://www.biorxiv.org/content/10.1101/2025.10.14.682350v1

We will be sharing an updated version with more details soon!

@article {Jung2025.10.14.682350,
        author = {Jung, Andrew J and Zhu, Helen and Gao, Alice J and Li, Roujia and Slobodyanyuk, Mykhaylo and Chu, Vivian and Lim, Declan and Lee, Leo J and Celaj, Albi and Frey, Brendan J},
        title = {FlashRNA: An Efficient Model for Regulatory Genomics},
        elocation-id = {2025.10.14.682350},
        year = {2025},
        doi = {10.1101/2025.10.14.682350},
        publisher = {Cold Spring Harbor Laboratory},
}

Acknowledgements

We thank the authors of Borzoi for providing a comprehensive open-source repository to reproduce their dataset, model, training, and evaluations. We would also like to thank the authors of Flashzoi for sharing a PyTorch version of Borzoi, implementing an efficient relative shift operation, and releasing their model Flashzoi.

If you use FlashRNA in your research, please also consider citing them.

This work was also pursued as part of the first author's academic research at the University of Toronto. We would like to acknowledge Deep Genomics and the Vector Institute for compute resources.

(C) Deep Genomics Inc. (2025) FlashRNA is licensed under the Apache License, Version 2.0 (the "License"); you may not use FlashRNA except in compliance with the License.
