GReNIMJA: Gene Regulatory Network Inference by Mixing and Jointing features of Amino acid and nucleotide sequences

This is the code for "Mixing features of transcription factors and genes enables accurate prediction of gene regulation relationships for unknown transcription factors". This project is carried out in the Funahashi Lab. at Keio University.

Overview

A key point of this study is that GReNIMJA was designed not to predict specific TFs from genes, but to predict whether a regulatory relationship exists, using both the amino acid sequences of TFs and the nucleotide sequences of genes.

Our model extracts features from both the amino acid sequences of TFs and the nucleotide sequences of target genes, mixes these features using a 2D LSTM architecture, and performs binary classification to predict the presence or absence of a regulatory relationship.

Details of this code are described in our paper, "Mixing features of transcription factors and genes enables accurate prediction of gene regulation relationships for unknown transcription factors".

Requirements

We have confirmed that our code works correctly on Ubuntu 22.04.

See requirements.txt for details.

QuickStart

1. Download this repository by `git clone`.

% git clone git@github.com:funalab/GReNIMJA.git

2. Install requirements

% cd GReNIMJA
% python -m venv venv
% source ./venv/bin/activate
% pip install --upgrade pip
% pip install -r requirements.txt
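
Since the packages are meant to be installed into `venv`, it can help to confirm that the interpreter in use really belongs to a virtual environment before running `pip install`. The following is a minimal sketch using only the standard library; the function name `in_virtualenv` is a hypothetical helper and is not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; run `source ./venv/bin/activate` first.")
```

This check applies to environments created with `python -m venv`, as in the commands above.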

3. Download datasets, embeddings, and learned models.

[NOTE]
Before downloading the related files, please check the available storage space.
Downloading and extracting the tar.gz files requires approximately 9.8 GB of storage space.
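
One way to check the available space from Python is sketched below. It assumes that the 9.8 GB figure is counted in units of 10^9 bytes and that the files are downloaded under the current directory; the helper names `free_space_gb` and `has_enough_space` are hypothetical and not part of this repository.

```python
import shutil

REQUIRED_GB = 9.8  # approximate figure quoted in the note above


def free_space_gb(path: str = ".") -> float:
    """Free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(free_gb: float, required_gb: float = REQUIRED_GB) -> bool:
    """Compare the free space against the required amount."""
    return free_gb >= required_gb


if __name__ == "__main__":
    free = free_space_gb(".")
    status = "OK" if has_enough_space(free) else "NOT ENOUGH"
    print(f"Free: {free:.1f} GB, required: about {REQUIRED_GB} GB -> {status}")
```

Run it from the repository root so that the check applies to the disk the files will be written to.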

On Linux:

% bash downloads/download_linux.sh

On macOS:

% bash downloads/download_mac.sh

The commands above download all datasets constructed in this study.
If you want to know how the data was constructed, please see scripts/ for details.
These scripts are mainly written in Shell and R.

4. Run the model (Evaluation of model performance for each unknown TF in the prediction of regulatory relationships)

On GPU (Specify GPU ID):

% python src/test.py --gpu_id 0

On CPU (a negative GPU ID selects the CPU):

% python src/test.py --gpu_id -1

The example above takes about 2 hours on a GPU (NVIDIA A100 40GB PCIe).
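
The `--gpu_id` convention (a non-negative ID selects a GPU, a negative value selects the CPU) can be illustrated with the sketch below. This is not taken from `src/test.py`: the helper `device_name`, the default value of `-1`, and the `cuda:N` device strings (as used by PyTorch) are assumptions made for illustration only.

```python
import argparse


def device_name(gpu_id: int) -> str:
    """Map a GPU ID to a device string; negative IDs fall back to the CPU."""
    return f"cuda:{gpu_id}" if gpu_id >= 0 else "cpu"


def parse_args(argv=None) -> argparse.Namespace:
    """Parse a --gpu_id option like the one shown in the commands above."""
    parser = argparse.ArgumentParser(description="Illustrative --gpu_id parsing")
    # The default of -1 (CPU) is an assumption, not the repository's actual default.
    parser.add_argument("--gpu_id", type=int, default=-1,
                        help="GPU ID to use; a negative value selects the CPU")
    return parser.parse_args(argv)
```

For example, `device_name(parse_args(["--gpu_id", "0"]).gpu_id)` gives `"cuda:0"`, and passing `-1` gives `"cpu"`.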

You can also specify the paths explicitly.
To run inference for known transcription factors:

% python src/test.py --gpu_id 0 --model_path ./models/known_best_model.pt --embeddings_path ./embeddings --dataset_path ./datasets/known_TFs/test_dataset.pickle --save_path ./results/known_test

To run inference for unknown transcription factors:

% python src/test.py --gpu_id 0 --model_path ./models/unknown_best_model.pt --embeddings_path ./embeddings --dataset_path ./datasets/unknown_TFs/test_dataset.pickle --save_path ./results/unknown_test
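
To run both evaluations from a script, the two commands can be assembled programmatically. The sketch below only builds the argument lists, using the flags and paths shown in the commands; the `RUNS` table and the `build_command` helper are hypothetical and not part of this repository.

```python
import sys

# Paths copied from the two commands above.
RUNS = {
    "known": {
        "model_path": "./models/known_best_model.pt",
        "dataset_path": "./datasets/known_TFs/test_dataset.pickle",
        "save_path": "./results/known_test",
    },
    "unknown": {
        "model_path": "./models/unknown_best_model.pt",
        "dataset_path": "./datasets/unknown_TFs/test_dataset.pickle",
        "save_path": "./results/unknown_test",
    },
}


def build_command(kind: str, gpu_id: int = 0,
                  embeddings_path: str = "./embeddings") -> list:
    """Build the src/test.py command line for 'known' or 'unknown' TFs."""
    run = RUNS[kind]  # raises KeyError for an unsupported kind
    return [
        sys.executable, "src/test.py",  # use the current (venv) interpreter
        "--gpu_id", str(gpu_id),
        "--model_path", run["model_path"],
        "--embeddings_path", embeddings_path,
        "--dataset_path", run["dataset_path"],
        "--save_path", run["save_path"],
    ]
```

From the repository root, each command can then be executed with `subprocess.run(build_command("known"), check=True)`, and likewise for `"unknown"`.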

Acknowledgement

The research was funded by JST CREST, Japan Grant Number JPMJCR2011 to Tetsuya J. Kobayashi and Akira Funahashi.
