RNA-Protein Complex Refinement via Diffusion
Clone the repo:
```shell
git clone https://github.com/zeqri/STRAND.git
```
Create and activate the conda environment:

```shell
conda create -n STRAND python=3.9.18
conda activate STRAND
pip install -r requirements.txt
```
Download the preprocessed structures used in our experiments:
👉 Download Structures from Google Drive
- Download the `test_data.zip` file.
- Place it in the `datasets` directory.
- Extract the archive:

```shell
unzip datasets/test_data.zip -d datasets/
```

Choose your inference mode based on whether you want to use the confidence model:
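Before picking a mode, a quick sanity check that the data landed where the inference scripts expect it can save a failed run (a minimal sketch; the paths are taken from the download and extract steps above):

```shell
# Sketch: check that the archive and the datasets directory are in place.
# Paths are taken from the download/extract steps above.
for p in datasets/test_data.zip datasets; do
  if [ -e "$p" ]; then echo "ok: $p"; else echo "missing: $p"; fi
done
```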
Run structure refinement with manual selection on any of the three benchmark datasets:

```shell
# RNA-Pro dataset
sh src/inference_manual.sh rnapro

# Non-X-ray dataset
sh src/inference_manual.sh nonxray

# X-ray dataset
sh src/inference_manual.sh xray
```

Run structure refinement using the confidence model for automated structure selection:
```shell
# RNA-Pro dataset
sh src/inference_conf.sh rnapro

# Non-X-ray dataset
sh src/inference_conf.sh nonxray

# X-ray dataset
sh src/inference_conf.sh xray
```

Refined structures and evaluation metrics will be saved in the `results/` directory, organized by dataset and method used.
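To sweep all three benchmarks in one go, the calls above can be wrapped in a small loop (a sketch using the manual-selection script; swap in `inference_conf.sh` for confidence-model selection):

```shell
# Sketch: run manual-selection inference on all three benchmark datasets.
# The dataset keys match the individual commands above; the || branch just
# keeps the loop going if a script is missing in your checkout.
for ds in rnapro nonxray xray; do
  sh src/inference_manual.sh "$ds" || echo "skipped: $ds"
done
```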
Download PDB files containing RNA-protein complexes released before the cutoff date of 30 September 2021 via the website and store them in `datasets/pdb_files`, or run:

```shell
mkdir -p datasets/pdb_files
sh datasets/batch_download.sh -f datasets/list_file.txt -p -o datasets/pdb_files
```
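If the batch-download script is not an option, a plain `curl` loop over the ID list gives a rough equivalent (a sketch; it assumes one PDB ID per line in `list_file.txt` and the standard RCSB download URL, so adjust the parsing if your list is comma-separated):

```shell
# Sketch fallback: fetch each PDB entry directly from RCSB.
# Assumes datasets/list_file.txt holds one PDB ID per line.
mkdir -p datasets/pdb_files
if [ -f datasets/list_file.txt ]; then
  while read -r id; do
    [ -n "$id" ] || continue
    curl -sf "https://files.rcsb.org/download/${id}.pdb" \
      -o "datasets/pdb_files/${id}.pdb" || echo "failed: ${id}"
  done < datasets/list_file.txt
fi
```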
All data must be stored as dill files; to create them, run:

```shell
python src/data/preprocessing/cache_data.py --dir_path datasets/pdb_files --save_path datasets/train/af3_1022P_1022R
```
STRAND-tr+rot uses data augmentation during training; to augment the data, run:

```shell
sh src/data/preprocessing/data_aug.sh
```
By default, STRAND trains with translation + rotation (STRAND-tr+rot).
To train with different spatial transformations, modify the boolean arguments in src/train.sh:
```shell
# Available options:
--translation True   # Enable translation refinement
--rotation True      # Enable rotation refinement
--torsion True       # Enable torsion angle refinement
```

Score model:
- Set the `Data_file` and `Data_path` variables in `src/train.sh`.
- Configure your desired spatial transformations in `src/train.sh`.
- Run the training script:

```shell
sh src/train.sh
```

Note: Training requires preprocessed datasets and sufficient computational resources (GPU recommended).
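Editing the transformation flags can also be scripted; a minimal sketch, assuming the flag appears literally as `--torsion False` in `src/train.sh` (verify against your copy of the script):

```shell
# Sketch: enable torsion-angle refinement by flipping the flag in place.
# Assumes the literal text "--torsion False" appears in src/train.sh;
# the .bak suffix keeps a backup of the original script.
if [ -f src/train.sh ]; then
  sed -i.bak 's/--torsion False/--torsion True/' src/train.sh
fi
```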
Generate samples:
After obtaining an optimised score model, use it to generate samples via:

```shell
sh src/generate_samples.sh
```
Confidence model:
Train the confidence model on the generated samples by running:

```shell
sh src/train_confidence.sh
```
Store the structures to be refined as dill files using `src/data/preprocessing/cache_data.py`.
Specify the path of the stored dataset to be refined and its corresponding csv file in the `Data_path` and `Data_file` variables, respectively, in `src/train_confidence.sh`.
To run inference without the confidence model, set `--run_inference_without_confidence_model` to `True` and run:

```shell
sh src/inference.sh
```

To run inference with the confidence model, set `--run_inference_without_confidence_model` to `False` and run:

```shell
sh src/inference.sh
```
After running inference, visualization directories containing the generated samples are created. The default path is `visualization/STRAND`.
To assess the quality of the refined samples, download the ground-truth files that were refined from the PDB as `.pdb` files and store them in `datasets/gt_dir`.
Manual Selection results:
To display manual selection results, run:

```shell
python src/visualize_inf_manual.py --gt_path datasets/gt_dir --samples_path visualization/STRAND
```
Selection via confidence model:
To display the confidence model selection results, run:

```shell
python src/visualize_inf_conf.py --gt_path datasets/gt_dir --samples_path visualization/STRAND
```