
🔬 DeepVul: Multi-Task Transformer for Gene Essentiality and Drug Response

DeepVul is a multi-task transformer-based model designed to jointly predict gene essentiality and drug response using gene expression data. The model uses a shared feature extractor to learn robust biological representations that can be fine-tuned for downstream tasks, such as gene knockout effect prediction or treatment sensitivity profiling.


📑 Table of Contents

  • 🚀 Features
  • 📦 Installation
  • 📊 Datasets
  • ⚙️ Hyperparameters
  • 🏃 Running the Model
  • 🧠 Additional Information
  • 📄 Citation

🚀 Features

  • Joint prediction of gene essentiality and drug response
  • Shared transformer encoder for multi-task learning
  • Flexible modes: pre-training only, fine-tuning only, or both
  • Compatible with public omics and pharmacogenomic datasets
  • Fully configurable via command-line arguments

📦 Installation

Make sure you have conda installed. Then run:

conda env create --file condaenv.yml
conda activate condaenv
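
If activation fails, the environment name defined inside condaenv.yml may differ from condaenv; the name used below is assumed from the activate command above, so adjust it if needed:

conda env list            # confirm the name of the environment created from condaenv.yml
conda activate condaenv   # replace condaenv with the actual name if it differs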

📊 Datasets

To run DeepVul, download the following datasets and place them in the data/ directory:

| Dataset | Description | Source |
| --- | --- | --- |
| Gene Expression | TPM log-transformed gene expression data | Download |
| Gene Essentiality | CRISPR-Cas9 knockout effect scores | Download |
| Drug Response | PRISM log-fold-change drug response | Download |
| Sanger Essentiality | CERES gene effect data from Sanger | Download |
| Somatic Mutation | Mutation profiles for CCLE cell lines | Download |
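
After downloading, the files should sit directly in data/. As a minimal sketch, the file names below are placeholders for illustration only; keep whatever names the downloaded releases (and the loading code in src/) actually use:

mkdir -p data
# Place the downloaded files here; names below are hypothetical placeholders:
#   data/gene_expression.csv       TPM log-transformed expression matrix
#   data/gene_essentiality.csv     CRISPR-Cas9 knockout effect scores
#   data/drug_response.csv         PRISM log-fold-change drug response
#   data/sanger_essentiality.csv   CERES gene effect data (Sanger)
#   data/somatic_mutation.csv      somatic mutation profiles (CCLE)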

⚙️ Hyperparameters

DeepVul supports flexible training via CLI arguments:

| Parameter | Default | Description |
| --- | --- | --- |
| --pretrain_batch_size | 20 | Batch size during pre-training |
| --finetuning_batch_size | 20 | Batch size during fine-tuning |
| --hidden_state | 500 | Size of the transformer hidden layers |
| --pre_train_epochs | 20 | Number of pre-training epochs |
| --fine_tune_epochs | 20 | Number of fine-tuning epochs |
| --opt | Adam | Optimizer type |
| --lr | 0.0001 | Learning rate |
| --dropout | 0.1 | Dropout rate |
| --nhead | 2 | Number of attention heads |
| --num_layers | 2 | Number of transformer encoder layers |
| --dim_feedforward | 2048 | Feed-forward network size |
| --fine_tuning_mode | freeze-shared | Whether to freeze the shared layers during fine-tuning |
| --run_mode | pre-train / fine-tune / both | Execution mode |
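
As an illustration, a pre-training run that sets several of these flags explicitly might look like the following; the flag names come from the table above, and the values shown are simply the listed defaults rather than recommended settings:

cd src
python run_deepvul.py \
    --run_mode pre-train \
    --pretrain_batch_size 20 \
    --pre_train_epochs 20 \
    --hidden_state 500 \
    --nhead 2 \
    --num_layers 2 \
    --dim_feedforward 2048 \
    --lr 0.0001 \
    --dropout 0.1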

🏃 Running the Model

Change directory into the src folder:

cd src

Pre-training

python run_deepvul.py --run_mode pre-train ...

Fine-tuning

python run_deepvul.py --run_mode fine-tune ...

Full Pipeline (Pre-train + Fine-tune)

python run_deepvul.py --run_mode both ...

Customize the CLI options as needed based on your experiment setup.


🧠 Additional Information

  • Source code for model architecture, training, and evaluation is located in the src/ directory.
  • If you encounter issues or have questions, please open a GitHub Issue or contact the maintainers.
  • Model interpretation and evaluation scripts are included in the repo.

📄 Citation

If you use DeepVul in your work, please cite:

@article{Jararweh2024.10.17.618944,
  author = {Jararweh, Ala and Arredondo, David and Macaulay, Oladimeji and Dicome, Mikaela and Tafoya, Luis and Hu, Yue and Virupakshappa, Kushal and Boland, Genevieve and Flaherty, Keith and Sahu, Avinash},
  title = {DeepVul: A Multi-Task Transformer Model for Joint Prediction of Gene Essentiality and Drug Response},
  elocation-id = {2024.10.17.618944},
  year = {2024},
  doi = {10.1101/2024.10.17.618944},
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2024/10/21/2024.10.17.618944},
  journal = {bioRxiv}
}
