DeepVul is a multi-task transformer-based model designed to jointly predict gene essentiality and drug response using gene expression data. The model uses a shared feature extractor to learn robust biological representations that can be fine-tuned for downstream tasks, such as gene knockout effect prediction or treatment sensitivity profiling.
- 🚀 Features
- 📦 Installation
- 📊 Datasets
- ⚙️ Hyperparameters
- 🏃 Running the Model
- 🧠 Additional Info
- 📄 Citation
## 🚀 Features

- Joint prediction of gene essentiality and drug response
- Shared transformer encoder for multi-task learning (see the sketch below)
- Flexible modes: pre-training only, fine-tuning only, or both
- Compatible with public omics and pharmacogenomic datasets
- Fully configurable via command-line arguments
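To make the shared-encoder idea concrete, here is a minimal PyTorch sketch of the multi-task pattern. It is an illustration under assumed dimensions, not the actual DeepVul implementation (see `src/` for that): the class name, head names, and target counts are hypothetical, while the encoder defaults mirror the hyperparameter table below.

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Illustrative shared-encoder multi-task model; NOT the actual DeepVul code."""

    def __init__(self, n_genes, d_model=500, nhead=2, num_layers=2,
                 dim_feedforward=2048, dropout=0.1,
                 n_essentiality_targets=1000, n_drug_targets=500):
        super().__init__()
        # Project a gene-expression profile into the transformer's hidden size
        self.embed = nn.Linear(n_genes, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=dim_feedforward,
            dropout=dropout, batch_first=True)
        # Shared feature extractor reused by both tasks
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Task-specific regression heads
        self.essentiality_head = nn.Linear(d_model, n_essentiality_targets)
        self.drug_head = nn.Linear(d_model, n_drug_targets)

    def forward(self, x):
        # x: (batch, n_genes) -> treat each profile as a length-1 sequence
        h = self.encoder(self.embed(x).unsqueeze(1)).squeeze(1)
        return self.essentiality_head(h), self.drug_head(h)

model = MultiTaskModel(n_genes=19000)
ess_pred, drug_pred = model(torch.randn(4, 19000))
```

Both heads read from the same shared representation, which is what lets fine-tuning reuse the encoder learned during pre-training.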
## 📦 Installation

Make sure you have conda installed. Then run:

```bash
conda env create --file condaenv.yml
conda activate condaenv
```
## 📊 Datasets

To run DeepVul, download the following datasets and place them in the `data/` directory:

| Dataset | Description | Source |
|---|---|---|
| Gene Expression | TPM log-transformed gene expression data | Download |
| Gene Essentiality | CRISPR-Cas9 knockout effect scores | Download |
| Drug Response | PRISM log-fold-change drug response | Download |
| Sanger Essentiality | CERES gene effect data from Sanger | Download |
| Somatic Mutation | Mutation profiles for CCLE lines | Download |
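Before training, the expression matrix and the target matrices need to share a common set of cell lines. A small pandas sketch of that alignment step (the file names below are placeholders, not the actual download names):

```python
import pandas as pd

# Hypothetical file names -- adjust to match the files you downloaded.
expr = pd.read_csv("data/gene_expression.csv", index_col=0)   # cell lines x genes (log TPM)
ess = pd.read_csv("data/gene_essentiality.csv", index_col=0)  # cell lines x genes (CRISPR scores)
drug = pd.read_csv("data/drug_response.csv", index_col=0)     # cell lines x drugs (PRISM LFC)

# Keep only cell lines present in all three matrices so the
# multi-task targets stay aligned with the expression inputs.
shared = expr.index.intersection(ess.index).intersection(drug.index)
expr, ess, drug = expr.loc[shared], ess.loc[shared], drug.loc[shared]
print(f"{len(shared)} cell lines shared across datasets")
```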
## ⚙️ Hyperparameters

DeepVul supports flexible training via CLI arguments:

| Parameter | Default | Description |
|---|---|---|
| `--pretrain_batch_size` | 20 | Batch size during pre-training |
| `--finetuning_batch_size` | 20 | Batch size during fine-tuning |
| `--hidden_state` | 500 | Size of the transformer hidden layers |
| `--pre_train_epochs` | 20 | Pre-training epochs |
| `--fine_tune_epochs` | 20 | Fine-tuning epochs |
| `--opt` | Adam | Optimizer type |
| `--lr` | 0.0001 | Learning rate |
| `--dropout` | 0.1 | Dropout rate |
| `--nhead` | 2 | Number of attention heads |
| `--num_layers` | 2 | Transformer encoder layers |
| `--dim_feedforward` | 2048 | Feedforward network size |
| `--fine_tuning_mode` | freeze-shared | Whether to freeze the shared layers during fine-tuning |
| `--run_mode` | | Execution mode: `pre-train`, `fine-tune`, or `both` |
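With `--fine_tuning_mode freeze-shared`, the shared layers are held fixed and only the task-specific parts are updated. A minimal sketch of that pattern, reusing the illustrative `MultiTaskModel` from the Features section (an assumption about how the mode behaves, not the repo's code):

```python
import torch

model = MultiTaskModel(n_genes=19000)  # illustrative model from the sketch above

# freeze-shared: hold the shared embedding and encoder fixed,
# so fine-tuning only updates the task-specific heads.
for module in (model.embed, model.encoder):
    for p in module.parameters():
        p.requires_grad = False

# Documented defaults: Adam optimizer, lr = 0.0001.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```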
## 🏃 Running the Model

```bash
cd src

# Pre-train the shared encoder only
python run_deepvul.py --run_mode pre-train ...

# Fine-tune on the downstream tasks only
python run_deepvul.py --run_mode fine-tune ...

# Pre-train, then fine-tune
python run_deepvul.py --run_mode both ...
```

Customize the CLI options as needed based on your experiment setup.
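For illustration, one possible full invocation that spells out the documented defaults from the table above (whether every flag applies in every mode is an assumption; the `...` placeholders in the snippets above stand for flags like these):

```bash
python run_deepvul.py \
    --run_mode both \
    --pretrain_batch_size 20 \
    --finetuning_batch_size 20 \
    --pre_train_epochs 20 \
    --fine_tune_epochs 20 \
    --hidden_state 500 \
    --nhead 2 \
    --num_layers 2 \
    --dim_feedforward 2048 \
    --dropout 0.1 \
    --opt Adam \
    --lr 0.0001 \
    --fine_tuning_mode freeze-shared
```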
## 🧠 Additional Info

- Source code for the model architecture, training, and evaluation is located in the `src/` directory.
- Model interpretation and evaluation scripts are included in the repo.
- If you encounter issues or have questions, please open a GitHub Issue or contact the maintainers.
## 📄 Citation

If you use DeepVul in your work, please cite:

```bibtex
@article{Jararweh2024.10.17.618944,
  author = {Jararweh, Ala and Arredondo, David and Macaulay, Oladimeji and Dicome, Mikaela and Tafoya, Luis and Hu, Yue and Virupakshappa, Kushal and Boland, Genevieve and Flaherty, Keith and Sahu, Avinash},
  title = {DeepVul: A Multi-Task Transformer Model for Joint Prediction of Gene Essentiality and Drug Response},
  elocation-id = {2024.10.17.618944},
  year = {2024},
  doi = {10.1101/2024.10.17.618944},
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2024/10/21/2024.10.17.618944},
  journal = {bioRxiv}
}
```