LKE-DTA is a framework for predicting drug–target binding affinity that integrates Large Language Model (LLM) representations with Knowledge Graph Embeddings. It combines natural language processing techniques with graph learning methods to model drug–target interactions.
The pipeline uses two separate Python environments: Environment 1 trains the knowledge graph embeddings, and Environment 2 extracts the drug and protein feature representations and trains the model.

Environment 1:
- Python 3.7
- numpy 1.21.5
- pandas 1.3.5
- torch 1.2.0
- dgl 0.4
- Generate Triplets
Run the following script to prepare the knowledge graph triplets:
python ibkh-embedding.py

- Train TransE Model using iBKH Knowledge Graph
Run the appropriate command based on the dataset you are using:
For Davis dataset:
DGLBACKEND=pytorch dglke_train --dataset iBKH --data_path ./data/davis --data_files training_triplets.tsv --format raw_udd_hrt --model_name TransE_l2 --batch_size 1024 --neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 10000 --log_interval 100 --batch_size_eval 1000 -adv --regularization_coef 1.00E-07 --num_thread 1 --num_proc 8 --neg_sample_size_eval 1000

For KIBA dataset:
DGLBACKEND=pytorch dglke_train --dataset iBKH --data_path ./data/kiba --data_files training_triplets.tsv --format raw_udd_hrt --model_name TransE_l2 --batch_size 1024 --neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 10000 --log_interval 100 --batch_size_eval 1000 -adv --regularization_coef 1.00E-07 --num_thread 1 --num_proc 8 --neg_sample_size_eval 1000

Environment 2:
- Python 3.10
- esm 2.0.0
- torch 2.5.1
- dashscope
- torch_geometric 2.6.1
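
Before running the later steps, it can help to confirm that the active environment matches the versions listed above. The following is a minimal sketch, not part of the LKE-DTA repository: the helper `check_versions` is hypothetical, and the distribution names used as keys are assumptions (for example, the `esm` module is commonly installed from the `fair-esm` distribution, which may not be what this project uses). It relies on `importlib.metadata`, so it needs Python 3.8 or newer and therefore suits Environment 2 rather than the Python 3.7 environment.

```python
from importlib import metadata

# Pinned versions copied from the requirements list above.
# The distribution names (keys) are assumptions and may need adjusting.
PINNED = {
    "fair-esm": "2.0.0",         # assumed distribution name for `esm`
    "torch": "2.5.1",
    "torch-geometric": "2.6.1",  # assumed distribution name for `torch_geometric`
    "dashscope": None,           # listed without a version, so only presence is checked
}


def check_versions(pinned, lookup=metadata.version):
    """Return {name: problem} for each package that is missing or mismatched."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Ignore local build tags such as "+cu121" when comparing versions.
        if wanted is not None and found.split("+")[0] != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    for name, problem in check_versions(PINNED).items():
        print(f"{name}: {problem}")
```

If the script prints nothing, every listed package was found at the expected version. The `lookup` parameter exists only so the check can be exercised without touching the real environment.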
- Register for the Qwen API and set your API key
export DASHSCOPE_API_KEY="YOUR_DASHSCOPE_API_KEY"

- Run Drug Representation Script
python qwen_representation.py

Using the same environment, run the protein representation script:
python esm_representation.py

Model training also uses Python Environment 2 and relies on Distributed Data Parallel (DDP) for multi-GPU training. Run the training script with 4 GPUs (e.g., 4×3090):
torchrun --nproc-per-node=4 training.py

This README outlines the complete setup and execution pipeline for the LKE-DTA project. For further support or advanced configuration, please consult the repositories for the following components: