
LKE-DTA

LKE-DTA: Predicting Drug–Target Binding Affinity

LKE-DTA is an innovative framework designed to predict drug–target binding affinity by integrating Large Language Model (LLM) representations and Knowledge Graph Embeddings. By combining natural language processing techniques and graph learning methods, LKE-DTA offers a robust approach for modeling and predicting drug–target interactions with high accuracy.

The pipeline uses two separate Python environments: one for extracting drug knowledge-graph embeddings, and one for extracting drug and protein semantic representations and training the model.

LKE-DTA Architecture


Feature Embedding Extraction


Environment 1: Drug Feature Embedding

Requirements

  • Python 3.7
  • numpy 1.21.5
  • pandas 1.3.5
  • torch 1.2.0
  • dgl 0.4

Step 1: Drug Feature Embedding (TransE + iBKH)

  1. Generate Triplets
     Run the following script to prepare the knowledge graph triplets:
python ibkh-embedding.py
  2. Train TransE Model using the iBKH Knowledge Graph

Run the appropriate command based on the dataset you are using:

For Davis dataset:

DGLBACKEND=pytorch dglke_train --dataset iBKH --data_path ./data/davis --data_files training_triplets.tsv --format raw_udd_hrt --model_name TransE_l2 --batch_size 1024 --neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 10000 --log_interval 100 --batch_size_eval 1000 -adv --regularization_coef 1.00E-07 --num_thread 1 --num_proc 8 --neg_sample_size_eval 1000

For KIBA dataset:

DGLBACKEND=pytorch dglke_train --dataset iBKH --data_path ./data/kiba --data_files training_triplets.tsv --format raw_udd_hrt --model_name TransE_l2 --batch_size 1024 --neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 10000 --log_interval 100 --batch_size_eval 1000 -adv --regularization_coef 1.00E-07 --num_thread 1 --num_proc 8 --neg_sample_size_eval 1000
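After training, DGL-KE writes the learned entity embeddings to a checkpoint directory, and those vectors serve as the drugs' knowledge-graph features. The sketch below shows one way to load them and look up a single drug; the checkpoint path, the entities.tsv mapping produced by the raw_udd_hrt format, and the example entity identifier are assumptions based on DGL-KE's default output layout, not paths fixed by this repository.

import numpy as np
import pandas as pd

# Assumed default DGL-KE layout: entity embeddings saved under
# ckpts/<model>_<dataset>_<run>/<dataset>_<model>_entity.npy, and the
# raw_udd_hrt format writes an entities.tsv id/name mapping into --data_path.
EMB_PATH = "ckpts/TransE_l2_iBKH_0/iBKH_TransE_l2_entity.npy"   # assumption
MAP_PATH = "./data/davis/entities.tsv"                          # assumption

entity_emb = np.load(EMB_PATH)                  # shape: (num_entities, 400)
entity_map = pd.read_csv(MAP_PATH, sep="\t", header=None, names=["id", "name"])
name_to_id = dict(zip(entity_map["name"], entity_map["id"]))

def drug_kg_vector(entity_name: str) -> np.ndarray:
    # Return the 400-dimensional TransE embedding for one knowledge-graph entity.
    return entity_emb[name_to_id[entity_name]]

# Example (hypothetical entity identifier):
# vec = drug_kg_vector("DrugBank:DB00945")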

Semantic Representation Extraction

Environment 2: Semantic Representation and Model Training

Requirements

  • Python 3.10
  • esm 2.0.0
  • torch 2.5.1
  • dashscope
  • torch_geometric 2.6.1

Step 2: Drug Semantic Representation (Qwen)

  1. Register for the Qwen API and set your API key:
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"
  2. Run the drug representation script:
python qwen_representation.py
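The exact prompting and pooling strategy lives in qwen_representation.py; purely as an illustration of the DashScope client, the sketch below requests a text embedding for one drug SMILES string. The text-embedding-v1 model name and the single-SMILES input are assumptions for this example and are not necessarily what the script does.

import os
import dashscope
from dashscope import TextEmbedding

# DashScope can read the key from the DASHSCOPE_API_KEY environment variable;
# it is set explicitly here for clarity.
dashscope.api_key = os.environ["DASHSCOPE_API_KEY"]

def embed_drug(smiles: str):
    # Request a semantic embedding for one drug SMILES string (illustrative only).
    resp = TextEmbedding.call(model=TextEmbedding.Models.text_embedding_v1,
                              input=smiles)
    if resp.status_code != 200:
        raise RuntimeError(f"DashScope error: {resp.code} {resp.message}")
    return resp.output["embeddings"][0]["embedding"]

# Example (aspirin SMILES, used only as a placeholder):
# vec = embed_drug("CC(=O)OC1=CC=CC=C1C(=O)O")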

Step 3: Protein Feature Representation (ESM)

Using the same environment:

python esm_representation.py
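The checkpoint size and pooling used for the protein vectors are choices made inside esm_representation.py; the sketch below shows the standard fair-esm pattern with ESM-2 650M and mean pooling as one plausible configuration (the model size, layer index, and example sequence are assumptions, not confirmed by the repository).

import torch
import esm

# Load an ESM-2 checkpoint (650M is an assumption; the repo may use another size).
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

# Hypothetical example sequence; real inputs come from the Davis/KIBA protein lists.
data = [("protein1", "MKTAYIAKQRQISFVKSHFSRQ")]
labels, strs, tokens = batch_converter(data)

with torch.no_grad():
    out = model(tokens, repr_layers=[33], return_contacts=False)
reps = out["representations"][33]

# Mean-pool over residue positions (skipping BOS/EOS) to get one vector per protein.
protein_vec = reps[0, 1 : len(strs[0]) + 1].mean(dim=0)   # shape: (1280,)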

DTA Model Training

Step 4: Training

Use the same Python environment (Environment 2) and Distributed Data Parallel (DDP) for multi-GPU training. Run the training script with 4 GPUs (e.g., 4×3090):

torchrun --nproc-per-node=4 training.py
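torchrun launches one process per GPU and sets RANK, LOCAL_RANK, and WORLD_SIZE for each of them; training.py is expected to initialize a process group and wrap the model in DistributedDataParallel. A minimal skeleton of that pattern is shown below; the model, data, and hyperparameters are placeholders, not the repository's actual training code.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

# One process per GPU; torchrun provides LOCAL_RANK.
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Placeholder model and data; the real DTA model and dataset live in training.py.
model = torch.nn.Linear(1680, 1).cuda(local_rank)
model = DDP(model, device_ids=[local_rank])

dataset = TensorDataset(torch.randn(256, 1680), torch.randn(256, 1))
sampler = DistributedSampler(dataset)            # shards the data across processes
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()

for epoch in range(2):
    sampler.set_epoch(epoch)                     # reshuffle differently each epoch
    for x, y in loader:
        x, y = x.cuda(local_rank), y.cuda(local_rank)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

dist.destroy_process_group()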

Conclusion

This README outlines the complete setup and execution pipeline for the LKE-DTA project. For further support or advanced configuration, please consult the documentation of the underlying components: iBKH, DGL-KE (TransE), Qwen/DashScope, and ESM.
