Supplementary Information

The document discusses a study on rapid traversal of chemical space using machine learning-guided docking screens, focusing on conformal prediction and molecular docking techniques. It includes detailed supplementary information on model architectures, hyperparameters, and performance evaluations across various protein targets, highlighting the effectiveness of classifiers like CatBoost, DNN, and RoBERTa. The findings indicate that the conformal prediction framework can accurately classify compounds as virtual active or inactive, with performance varying based on target characteristics and data quality.


nature computational science

Supplementary information
Article https://doi.org/10.1038/s43588-025-00777-x

Rapid traversal of vast chemical space using machine learning-guided docking screens
In the format provided by the
authors and unedited
Table of Contents

Supplementary Sections
1. Conformal prediction and molecular docking
Supp. Figure 1. Overview of the conformal prediction workflow.
Supp. Table 1. Protein preparation for molecular docking.
Supp. Figure 2. Partial charge redistribution scheme in amino acid residues.

2. Hyperparameters, architectures and molecular descriptors
Supp. Table 2. Model hyperparameters.
Supp. Figure 3. Learning rate and weight decay analysis for DNNs.
Supp. Figure 4. Architecture analysis for DNNs.
Supp. Figure 5. Learning rate analysis for RoBERTa.
Supp. Table 3. Sensitivity and training set size - Morgan2.
Supp. Table 4. Precision and training set size - Morgan2.
Supp. Table 5. Sensitivity and training size - CDDD.
Supp. Table 6. Precision and training size - CDDD.
Supp. Table 7. Sensitivity and training size - RoBERTa.
Supp. Table 8. Precision and training size - RoBERTa.
Supp. Table 9. Training and prediction times.
Supp. Figure 6. Performance and number of aggregated models.
Supp. Figure 7. Performance on imbalanced datasets.
Supp. Figure 8. Overview of noise addition.
Supp. Figure 9. Performance on noisy datasets.
Supp. Figure 10. Performance on non-sensical datasets - labels.
Supp. Figure 11. Performance on non-sensical datasets - features.
Supp. Figure 12. Non-exchangeable datasets.
Supp. Figure 13. Correlation between quality of information and docking rank.
Supp. Figure 14. Structural diversity analysis in top-ranked molecules.

Supplementary Tables
Supp. Table 10. Chemical structures and D2R radioligand displacement data.
Supp. Table 11. Chemical novelty of discovered D2R ligands.
Supp. Table 12. Chemical structures and A2AR radioligand displacement data.
Supp. Table 13. Chemical novelty of discovered A2AR ligands.
Supp. Table 14. Chemical novelty of discovered A2AR-D2R dual target ligand.

Supplementary Figures
Supp. Figure 15. Binding assay curves of discovered D2R ligands.
Supp. Figure 16. Functional assay curves of discovered D2R ligands.
Supp. Figure 17. Ligand enrichment curves for A2AR and D2R models.
Supp. Figure 18. Binding assay curves of discovered A2AR ligands.
Supp. Figure 19. LC-MS data for compound 1.
Supp. Figure 20. LC-MS data for compound 2.
Supp. Figure 21. LC-MS data for compound 3.
Supp. Figure 22. LC-MS data for compound 4.
Supp. Figure 23. LC-MS data for compound 5.
Supp. Figure 24. LC-MS data for compound 6.
Supp. Figure 25. 1H-NMR data for compound 1.
Supp. Figure 26. 1H-NMR data for compound 2.
Supp. Figure 27. 1H-NMR data for compound 3.
Supp. Figure 28. 1H-NMR data for compound 4.
Supp. Figure 29. 1H-NMR data for compound 5.
Supp. Figure 30. 1H-NMR data for compound 6.

Supplementary References

Supplementary Section 1: Conformal prediction and molecular docking

An overview of the conformal prediction framework is provided in Supplementary Figure 1.
The chosen proteins for our benchmarking set represented diverse types of protein
folds, binding sites, protein-ligand interactions, and ligand chemotypes. G protein-coupled
receptors (GPCRs) were represented by the A2A adenosine receptor (A2AR) and the D2
dopamine receptor (D2R). The SARS-CoV-2 main protease (Mpro), 8-oxoguanine
glycosylase 1 (OGG1), ecto-5’-nucleotidase (5’-NT), and AmpC β-lactamase (AmpC)
exemplified different types of soluble enzymes. Finally, the Kelch-like ECH-associated
protein 1 (KEAP1) and Sortilin (SORT1) represented protein-protein interaction
interfaces. Molecular docking performance was optimized for each target by fine-tuning
parameters that influence the scoring function in DOCK3.7, enhancing enrichment of
known ligands over matched decoys. The selected docking parameters are presented in
Supplementary Table 1 and Supplementary Figure 2.

Supplementary Figure 1. Overview of the conformal prediction workflow. After docking to a target of
interest, machine learning datasets are obtained through selection of a score threshold, followed by labeling
and featurization of samples. Training and test sets are assumed to be exchangeable. The training set is
split into a proper training and calibration set, and this process is repeated for each independent model that
must be trained. After training the classifiers, each sample in the test set is predicted. The corresponding
calibration sets help normalize the outputs given by the classifiers. A pair of p-values (p1 referring to the
confidence the sample belongs to the virtual actives and p0 referring to the confidence the sample belongs
to the virtual inactives class) is obtained after aggregating model outputs by taking median values. After
selecting a significance threshold, the sample can be assigned to a set prediction. For binary classifications,
Mondrian conformal prediction has four sets that a sample can be categorized into: virtual active {1}, virtual
inactive {0}, both = virtual active or inactive {0,1}, and null = no class assignment {}.
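
The set assignment described in this caption can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the conformity score is assumed to be the classifier's predicted probability for a class, compared only against calibration samples of that same class (the Mondrian condition).

```python
from bisect import bisect_right
from statistics import median


def class_p_value(calibration_scores, test_score):
    """Conformal p-value for one class and one model.

    calibration_scores: conformity scores of calibration samples that truly
    belong to this class (assumed here to be predicted class probabilities).
    """
    ordered = sorted(calibration_scores)
    n_not_larger = bisect_right(ordered, test_score)  # scores <= test_score
    return (n_not_larger + 1) / (len(ordered) + 1)


def aggregate_p_values(p_values_per_model):
    """Combine the p-values of the independent models by taking the median."""
    return median(p_values_per_model)


def prediction_set(p0, p1, significance):
    """Return the set of labels whose p-value exceeds the significance level."""
    labels = set()
    if p0 > significance:
        labels.add(0)  # virtual inactive
    if p1 > significance:
        labels.add(1)  # virtual active
    return labels


# Toy example with made-up p-values from five models.
p1 = aggregate_p_values([0.62, 0.55, 0.70, 0.58, 0.66])
p0 = aggregate_p_values([0.03, 0.05, 0.02, 0.04, 0.06])
print(prediction_set(p0, p1, significance=0.2))
```

With these toy values the medians are p1 = 0.62 and p0 = 0.04, so at a significance of 0.2 the sample is assigned to the virtual active set {1}. Raising the significance above both p-values would give the null set, and lowering it below both would give {0,1}.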

Supplementary Table 1. Protein preparation for molecular docking.

Target | Template (a) | Targeted residues (b) | Histidine protonation states | Number of matching spheres | Electrostatic radius (c) | Desolvation radius
A2AR | 4EIY [1] | N253 | δ: 155, 230; ε: 75, 250, 306; δ+ε: 264, 278 | 45 | 1.2 Å | 0.3 Å
AmpC | 6DPT [2] | S64, Q120, N152, A318 | ε: 13, 108, 186, 210, 314 | 45 | 1.2 Å | 0.2 Å
5’-NT | 6XUE [3] | N390 | δ: 33, 38, 220, 304, 440; ε: 103, 243, 375, 383, 437, 456, 518; δ+ε: 118 | 44 | 1.2 Å | 0.4 Å
D2R | 6CM4 [4] | None | δ: 393, 398; ε: 106 | 45 | 1.2 Å | 0.25 Å
KEAP1 | 5FNU [5] | S363, Q530, S555, S602 | δ: 436; ε: 424, 432, 437, 451, 516, 552, 553, 562, 575 | 45 | 1.4 Å | 0.2 Å
Mpro | 6W63 [6] | H163, G143, E166 | δ: 64, 80; ε: 41, 163, 164, 172, 246 | 64 | 1.2 Å | 0.3 Å
OGG1 | 6G3Y [7] | G42 | δ: 10, 13, 54, 97, 112, 179, 185, 195, 270, 276, 282; ε: 119, 237 | 45 | Default (1.9 Å) | None
SORT1 | 6X48 [8] | Y318 | δ: 68, 98, 360, 458, 490; ε: 70, 182, 220, 295, 331, 406, 428, 430, 506, 590, 664 | 45 | 1.6 Å | None
A2AR | 8GNE [9] | N253 | δ: 155, 230, 266; ε: 75, 250, 306; δ+ε: 278 | 45 | 1.2 Å | 0.3 Å
D2R | 7CMV [10] (d) | S197, S193 | ε: 106, 393, 398 | 45 | 1.2 Å | 0.25 Å

(a) PDB accession code. (b) Increase of dipole moments by adding partial charges to atoms, without altering
the total charge of the residue. A detailed description of the partial charge redistribution is provided in
Supplementary Figure 2. (c) Tangent thin sphere radius. Default refers to low dielectric spheres made by
blastermaster’s SPHGEN program prior to thin sphere protocols. (d) A detailed description of homology model
generation based on the D3R is given in the methods section.

Supplementary Figure 2. Partial charge redistribution in amino acid residues. For each protein target
in this study, dipole moments of the targeted residues were increased by adding partial charges to atoms,
without altering the total charge of the residue, during preparation of the molecular docking model.

Supplementary Section 2: Evaluation of classifiers and molecular descriptors

CatBoost, DNN and RoBERTa classifiers resulted in consistently high sensitivity values,
and the three molecular representations showed similar performance. The main
differences between the architectures were instead in precision, significance, and
computational cost (Supplementary Tables 3-9). On average, the significance values
ranged from 0.15 to 0.18, with prediction efficiencies exceeding 0.99. The conformal
prediction (CP) framework was hence able to classify nearly all evaluated compounds as
either virtual active or virtual inactive with an average error rate of 15-18%. Whereas
deviations in validity (the agreement between the selected significance and the resulting
error rate) are often observed in applications where insufficient data is available [11], CP
on molecular docking data yielded the expected error rate for all targets in the
benchmarking set (Figure 2c). Analysis of the results for each protein in the benchmarking
set demonstrated that performance was target-dependent, with sensitivity values ranging
from 0.76 to 0.96 (Supplementary Tables 3-8). As 100,000 compounds in each test set of
ten million belonged to the virtual actives class (1%), a maximal reduction of 100-fold
could be achieved if all compounds were correctly classified.
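
The quantities discussed in this paragraph can be made concrete with a small sketch. It rests on assumptions that are not spelled out in this section: efficiency is taken as the fraction of single-label prediction sets, an error is counted whenever the true label is missing from the prediction set, and only compounds in the {1} set are treated as predicted virtual actives. The function name `evaluate_set_predictions` and the toy data are hypothetical.

```python
def evaluate_set_predictions(prediction_sets, true_labels):
    """Summarize conformal set predictions for a binary problem (labels 0/1)."""
    n = len(true_labels)
    pairs = list(zip(prediction_sets, true_labels))

    single_label = sum(1 for s, _ in pairs if len(s) == 1)
    errors = sum(1 for s, y in pairs if y not in s)

    # Labels of the compounds that were assigned to the virtual active set {1}.
    predicted_active = [y for s, y in pairs if s == {1}]
    true_positives = sum(predicted_active)
    n_active = sum(true_labels)

    return {
        "efficiency": single_label / n,
        "error_rate": errors / n,
        "sensitivity": true_positives / n_active if n_active else 0.0,
        "precision": true_positives / len(predicted_active) if predicted_active else 0.0,
    }


# Toy data: six compounds, three of which are true virtual actives.
sets = [{1}, {1}, {0}, {0}, {0, 1}, set()]
labels = [1, 0, 0, 1, 1, 0]
print(evaluate_set_predictions(sets, labels))
```

A predictor is valid in the sense used above when the reported error rate is close to the chosen significance level, which is what the text describes for all eight targets.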

The largest database reduction was obtained for AmpC, a beta-lactamase targeted for
the development of antibiotics [12]. For AmpC, 474,646 out of the 10 million compounds in
the test set were assigned to the virtual active class, corresponding to a 21-fold database
reduction, and 96% of the true virtual actives were among these. The worst performance
was obtained for the target Mpro, which is a viral protease relevant for development of
drugs for treatment of COVID-19 [13]. In this case, the database was reduced by four-fold
and 76% of the true virtual actives were identified. Target-dependent results of
machine learning-accelerated protocols have been observed previously, and analysis of
our docking results indicates that the performance is influenced by the nature of the binding
site, the diversity of the top-ranked compounds, and the docking score distribution. For
example, the top-scoring compounds of open and solvent-exposed binding sites tend to
be more structurally diverse, which affects the ability of the classifier to recognize patterns
in the docking data.
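
As a worked check of the AmpC numbers quoted above, the fold reduction and the implied precision follow directly from the counts in the text. The helper name `screening_summary` is hypothetical, and the sensitivity is the rounded 96% from the paragraph, so the results are approximate.

```python
def screening_summary(library_size, n_predicted_active, n_true_active, sensitivity):
    """Return (fold reduction, recovered actives, precision) for one screen."""
    fold_reduction = library_size / n_predicted_active
    recovered = sensitivity * n_true_active
    precision = recovered / n_predicted_active
    return fold_reduction, recovered, precision


# AmpC: 474,646 of 10 million compounds predicted as virtual actives,
# 100,000 true virtual actives in the test set, 96% of them recovered.
fold, recovered, precision = screening_summary(10_000_000, 474_646, 100_000, 0.96)
print(f"{fold:.1f}-fold reduction, ~{recovered:.0f} actives recovered, precision ~{precision:.2f}")
```

The resulting precision of roughly 0.20 agrees with the value listed for AmpC with CatBoost and Morgan2 descriptors at one million training molecules in Supplementary Table 4.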

Increasing the number of classification models from five to ten did not substantially
increase the performance of the conformal predictor, and the results were also robust if
the size of the minority class (virtual actives) was decreased from 1% to 0.1%
(Supplementary Figures 6-7). The introduction of noise in the docking scores did not
substantially alter the performance of the predictor (Supplementary Figures 8-9), but
scrambling of class labels or features led to complete loss of predictive power
(Supplementary Figures 10-11). Exchangeability is a fundamental concept in conformal
prediction. When the criterion of exchangeability between the training and test set is
satisfied, the prediction error rate matches the selected significance level, which is one of
the major strengths of this method. We assessed how the sensitivity is influenced by the
choice of training set for two targets (A2AR and D2R). A conformal predictor was trained
on one million random molecules from WuXi’s GalaXi make-on-demand database (1.4
billion rule-of-four molecules), which has only a small overlap with Enamine’s REAL
database [14]. Predictions were then performed for the set of ten million random molecules
from Enamine’s REAL database docked to the corresponding target. For both targets,
substantially worse sensitivity values (0.19 and 0.30 for A2AR and D2R, respectively) were
obtained compared to the scenario in which both the training and test sets were randomly
extracted from Enamine’s REAL database (0.89 and 0.92, respectively) (Supplementary Figure
12). This demonstrated that full exchangeability between training and test sets is essential
for accurate predictions.

Supplementary Table 2. Model hyperparameters. Key hyperparameters used during training of models.

Architecture: CatBoost
• nr_trees = 500
• metric = AUC
• weights = balanced

Architecture: DNN
• learning_rate = 1e-4
• weight_decay = 1e-2
• batch_size = 200
• max_epochs = 100
• patience = 10
• optimizer = RangerLars
• class_weights = balanced

Architecture: RoBERTa
• learning_rate = 4e-7
• max_epochs = 10
• seyonec/PubChem10M

Supplementary Figure 3. Learning rate and weight decay analysis for deep neural networks. The
changes in training loss, validation loss, validation accuracy, and speeds during training were monitored
for deep neural networks with different learning rates (LR) and weight decays (WD). Models were trained on
one million molecules of the AmpC dataset represented by Morgan2 descriptors, and hence the input dimension
was set to 1024. The output dimension was set to two for binary classification (virtual active and virtual
inactive). The early stop patience for validation loss was set to 3, after which the best performing
checkpoint (grey dashed line) was stored as final model. The default learning rate was then set to 1e-4 and
the default weight decay was set to 1e-2 (see Supplementary Table 2). Mean values were obtained from three
independently trained models and error bars correspond to the standard error of those means.

Supplementary Figure 4. Architecture analysis for deep neural networks. The changes in training loss,
validation loss, and validation accuracy during training were monitored for deep neural networks with
different architectures, which are shown above each subplot. Models were trained on one million molecules
of the AmpC dataset represented by Morgan2 descriptors, and hence the input dimension was set to 1024. The
output dimension was set to two for binary classification (virtual active and virtual inactive). The learning
rate was set to 1e-4 and the weight decay was set to 1e-2. The early stop patience for validation loss was
set to 3, after which the best performing checkpoint (grey dashed line) was stored as final model. The
[input]-[1000]-[4000]-[2000]-[2] architecture was then selected as default. Mean values were obtained from
three independently trained models and error bars correspond to the standard error of those means.
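
To give a sense of scale for the selected [input]-[1000]-[4000]-[2000]-[2] architecture, the number of trainable parameters can be counted as follows. This sketch assumes plain fully connected layers with bias terms and no additional layers such as normalization, which the caption does not specify; `count_parameters` is a hypothetical helper.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Morgan2 input (1024 bits), three hidden layers, two output classes.
morgan2_dnn = [1024, 1000, 4000, 2000, 2]
print(count_parameters(morgan2_dnn))
```

Under these assumptions the network has about 13 million parameters, most of them in the 4000-to-2000 layer.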

Supplementary Figure 5. Learning rate analysis for RoBERTa. (A) The changes in sensitivity and
precision during training were monitored for the RoBERTa classifiers. Models were trained on one million
AmpC molecules using RoBERTa’s internal descriptors. A small external test set of 200,000 molecules was
used to obtain the sensitivity and precision metrics. Mean values were obtained from three independently
trained models and predictions, and error bars correspond to the standard error of those means. The default
number of epochs was set to ten in all other calculations. (B-D) RoBERTa models were trained on one million
A2AR molecules with three different learning rates: 1e-5 (B), 4e-6 (C), and 4e-8 (D). The relative set
distributions for different significance values are shown, together with the significance at which the
predictor achieves highest efficiency. The default learning rate was then set to 4e-7 for training RoBERTa
models (see Supplementary Table 2). The fractions of molecules predicted to be in the one-set, zero-set,
both-set, and null-set are colored in blue, red, white, and gray, respectively (B-D).

Supplementary Table 3. Sensitivity and training set size - Morgan2. Sensitivity values obtained at
optimal efficiency for different sizes of the training set.

Sensitivity (a) by training set size
Target 25K 50K 100K 200K 500K 1M
Method: CatBoost
A2AR 0.754 ± 0.011 0.799 ± 0.005 0.820 ± 0.007 0.856 ± 0.004 0.873 ± 0.003 0.891 ± 0.002
AmpC 0.857 ± 0.014 0.909 ± 0.009 0.921 ± 0.005 0.936 ± 0.001 0.945 ± 0.000 0.955 ± 0.001
5’-NT 0.719 ± 0.025 0.773 ± 0.005 0.783 ± 0.008 0.811 ± 0.002 0.834 ± 0.001 0.849 ± 0.001
D2R 0.793 ± 0.002 0.813 ± 0.016 0.854 ± 0.006 0.883 ± 0.002 0.910 ± 0.001 0.917 ± 0.001
KEAP1 0.688 ± 0.011 0.732 ± 0.008 0.777 ± 0.008 0.795 ± 0.005 0.819 ± 0.002 0.833 ± 0.003
MPRO 0.588 ± 0.010 0.650 ± 0.003 0.681 ± 0.003 0.705 ± 0.006 0.743 ± 0.003 0.765 ± 0.005
OGG1 0.720 ± 0.014 0.770 ± 0.004 0.782 ± 0.001 0.815 ± 0.002 0.836 ± 0.006 0.853 ± 0.001
SORT1 0.656 ± 0.011 0.703 ± 0.004 0.733 ± 0.003 0.773 ± 0.001 0.804 ± 0.004 0.821 ± 0.004
Average 0.722 ± 0.017 0.768 ± 0.015 0.794 ± 0.014 0.822 ± 0.014 0.845 ± 0.012 0.860 ± 0.012
Method: DNN
A2AR 0.744 ± 0.034 0.789 ± 0.010 0.814 ± 0.013 0.833 ± 0.008 0.831 ± 0.007 0.841 ± 0.002
AmpC 0.781 ± 0.013 0.836 ± 0.003 0.859 ± 0.004 0.897 ± 0.004 0.903 ± 0.001 0.919 ± 0.003
5’-NT 0.731 ± 0.009 0.753 ± 0.018 0.782 ± 0.016 0.788 ± 0.010 0.807 ± 0.002 0.804 ± 0.002
D2R 0.747 ± 0.005 0.769 ± 0.012 0.803 ± 0.009 0.838 ± 0.005 0.861 ± 0.003 0.873 ± 0.003
KEAP1 0.697 ± 0.035 0.766 ± 0.011 0.768 ± 0.014 0.784 ± 0.004 0.790 ± 0.004 0.796 ± 0.006
MPRO 0.677 ± 0.045 0.675 ± 0.014 0.682 ± 0.007 0.699 ± 0.003 0.713 ± 0.006 0.726 ± 0.002
OGG1 0.728 ± 0.030 0.772 ± 0.012 0.787 ± 0.012 0.791 ± 0.006 0.806 ± 0.002 0.817 ± 0.002
SORT1 0.704 ± 0.025 0.702 ± 0.007 0.729 ± 0.010 0.749 ± 0.010 0.760 ± 0.002 0.782 ± 0.004
Average 0.726 ± 0.010 0.758 ± 0.010 0.778 ± 0.011 0.797 ± 0.012 0.809 ± 0.012 0.820 ± 0.011

(a) Each test set contained ten million molecules. Morgan2 descriptors were used as features of the
molecules. Three independent calculations (training and prediction) were performed for each target and
error bars correspond to the standard error of the mean. Averages are reported in bold.
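
The "mean ± standard error" entries reported for each target can be reproduced from three independent runs as sketched below. The three sensitivity values are made up for illustration and are not taken from the study; the use of the sample standard deviation (n - 1 in the denominator) is likewise an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Mean and standard error of the mean of a small set of replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical sensitivities from three independent training/prediction runs.
runs = [0.889, 0.891, 0.893]
m, sem = mean_and_sem(runs)
print(f"{m:.3f} ± {sem:.3f}")
```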

Supplementary Table 4. Precision and training set size - Morgan2. Precision values obtained at optimal
efficiency for different sizes of the training set.

Precision (a) by training set size
Target 25K 50K 100K 200K 500K 1M
Method: CatBoost
A2AR 0.043 ± 0.002 0.045 ± 0.001 0.052 ± 0.002 0.055 ± 0.000 0.068 ± 0.001 0.074 ± 0.001
AmpC 0.090 ± 0.003 0.100 ± 0.005 0.117 ± 0.002 0.138 ± 0.007 0.180 ± 0.002 0.202 ± 0.001
5’-NT 0.035 ± 0.002 0.036 ± 0.001 0.039 ± 0.001 0.041 ± 0.000 0.046 ± 0.000 0.052 ± 0.000
D2R 0.047 ± 0.000 0.060 ± 0.005 0.065 ± 0.003 0.079 ± 0.002 0.093 ± 0.000 0.106 ± 0.000
KEAP1 0.034 ± 0.001 0.034 ± 0.001 0.036 ± 0.000 0.040 ± 0.000 0.044 ± 0.000 0.047 ± 0.001
MPRO 0.020 ± 0.000 0.022 ± 0.000 0.023 ± 0.000 0.025 ± 0.000 0.028 ± 0.000 0.030 ± 0.000
OGG1 0.035 ± 0.001 0.037 ± 0.001 0.040 ± 0.001 0.042 ± 0.000 0.048 ± 0.001 0.052 ± 0.000
SORT1 0.024 ± 0.001 0.027 ± 0.000 0.030 ± 0.001 0.033 ± 0.001 0.039 ± 0.001 0.044 ± 0.001
Average 0.041 ± 0.004 0.045 ± 0.005 0.050 ± 0.006 0.057 ± 0.007 0.068 ± 0.010 0.076 ± 0.011
Method: DNN
A2AR 0.039 ± 0.003 0.041 ± 0.001 0.044 ± 0.002 0.048 ± 0.001 0.054 ± 0.001 0.057 ± 0.000
AmpC 0.042 ± 0.002 0.052 ± 0.002 0.067 ± 0.002 0.080 ± 0.003 0.097 ± 0.002 0.104 ± 0.001
5’-NT 0.030 ± 0.001 0.034 ± 0.001 0.035 ± 0.001 0.039 ± 0.001 0.041 ± 0.001 0.043 ± 0.000
D2R 0.029 ± 0.001 0.040 ± 0.003 0.045 ± 0.003 0.053 ± 0.002 0.060 ± 0.001 0.068 ± 0.001
KEAP1 0.029 ± 0.002 0.030 ± 0.001 0.034 ± 0.001 0.035 ± 0.001 0.039 ± 0.001 0.040 ± 0.001
MPRO 0.018 ± 0.001 0.021 ± 0.000 0.023 ± 0.000 0.024 ± 0.000 0.026 ± 0.000 0.027 ± 0.000
OGG1 0.034 ± 0.001 0.037 ± 0.001 0.038 ± 0.001 0.041 ± 0.001 0.043 ± 0.000 0.044 ± 0.000
SORT1 0.022 ± 0.001 0.025 ± 0.001 0.028 ± 0.001 0.030 ± 0.001 0.033 ± 0.000 0.035 ± 0.000
Average 0.031 ± 0.002 0.035 ± 0.002 0.039 ± 0.003 0.044 ± 0.003 0.049 ± 0.004 0.052 ± 0.005

(a) Each test set contained ten million molecules. Morgan2 descriptors were used as features of the
molecules. Three independent calculations (training and prediction) were performed for each target and
error bars correspond to the standard error of the mean. Averages are reported in bold.

Supplementary Table 5. Sensitivity and training set size - CDDD. Sensitivity values obtained at
optimal efficiency for different sizes of the training set.

Sensitivity (a) by training set size
Target 25K 50K 100K 200K 500K 1M
Method: CatBoost
A2AR 0.784 ± 0.013 0.806 ± 0.013 0.819 ± 0.003 0.845 ± 0.002 0.852 ± 0.003 0.870 ± 0.004
AmpC 0.847 ± 0.013 0.893 ± 0.008 0.903 ± 0.004 0.919 ± 0.003 0.931 ± 0.001 0.937 ± 0.002
5’-NT 0.747 ± 0.012 0.790 ± 0.007 0.793 ± 0.005 0.815 ± 0.004 0.828 ± 0.003 0.832 ± 0.002
D2R 0.805 ± 0.014 0.839 ± 0.004 0.847 ± 0.008 0.875 ± 0.001 0.888 ± 0.001 0.896 ± 0.002
KEAP1 0.716 ± 0.014 0.759 ± 0.008 0.784 ± 0.009 0.799 ± 0.002 0.816 ± 0.003 0.827 ± 0.001
MPRO 0.605 ± 0.004 0.658 ± 0.004 0.682 ± 0.003 0.699 ± 0.005 0.728 ± 0.003 0.737 ± 0.007
OGG1 0.745 ± 0.007 0.776 ± 0.005 0.776 ± 0.006 0.809 ± 0.001 0.816 ± 0.002 0.833 ± 0.003
SORT1 0.676 ± 0.006 0.691 ± 0.011 0.722 ± 0.004 0.749 ± 0.005 0.772 ± 0.004 0.792 ± 0.001
Average 0.741 ± 0.015 0.777 ± 0.015 0.791 ± 0.014 0.814 ± 0.014 0.829 ± 0.012 0.840 ± 0.012
Method: DNN
A2AR 0.755 ± 0.005 0.832 ± 0.013 0.818 ± 0.008 0.851 ± 0.004 0.850 ± 0.003 0.862 ± 0.002
AmpC 0.841 ± 0.016 0.871 ± 0.011 0.892 ± 0.007 0.908 ± 0.005 0.923 ± 0.004 0.941 ± 0.004
5’-NT 0.777 ± 0.016 0.812 ± 0.000 0.816 ± 0.015 0.811 ± 0.007 0.822 ± 0.002 0.836 ± 0.003
D2R 0.786 ± 0.021 0.874 ± 0.005 0.857 ± 0.006 0.871 ± 0.002 0.884 ± 0.003 0.897 ± 0.003
KEAP1 0.753 ± 0.022 0.792 ± 0.015 0.791 ± 0.008 0.810 ± 0.007 0.824 ± 0.005 0.820 ± 0.001
MPRO 0.657 ± 0.025 0.707 ± 0.004 0.689 ± 0.013 0.724 ± 0.002 0.723 ± 0.008 0.737 ± 0.006
OGG1 0.772 ± 0.017 0.798 ± 0.003 0.800 ± 0.007 0.802 ± 0.014 0.821 ± 0.005 0.828 ± 0.002
SORT1 0.684 ± 0.042 0.728 ± 0.009 0.761 ± 0.003 0.772 ± 0.012 0.775 ± 0.003 0.793 ± 0.006
Average 0.753 ± 0.013 0.802 ± 0.012 0.803 ± 0.012 0.819 ± 0.012 0.828 ± 0.012 0.839 ± 0.012

(a) Each test set contained ten million molecules. Continuous-Data-Driven Descriptors (CDDD) were used as
features of the molecules. Three independent calculations (training and prediction) were performed for each
target and error bars correspond to the standard error of the mean. Averages are reported in bold.

Supplementary Table 6. Precision and training set size - CDDD. Precision values obtained at optimal
efficiency for different sizes of the training set.

Precision (a) by training set size
Target 25K 50K 100K 200K 500K 1M
Method: CatBoost
A2AR 0.046 ± 0.002 0.047 ± 0.001 0.051 ± 0.001 0.052 ± 0.001 0.059 ± 0.000 0.062 ± 0.000
AmpC 0.079 ± 0.001 0.085 ± 0.003 0.093 ± 0.000 0.100 ± 0.002 0.113 ± 0.001 0.128 ± 0.002
5’-NT 0.042 ± 0.002 0.039 ± 0.001 0.044 ± 0.001 0.043 ± 0.001 0.046 ± 0.000 0.050 ± 0.000
D2R 0.052 ± 0.002 0.057 ± 0.001 0.060 ± 0.003 0.063 ± 0.000 0.071 ± 0.000 0.079 ± 0.001
KEAP1 0.038 ± 0.001 0.037 ± 0.001 0.038 ± 0.001 0.040 ± 0.000 0.042 ± 0.000 0.045 ± 0.000
MPRO 0.021 ± 0.000 0.022 ± 0.001 0.024 ± 0.000 0.025 ± 0.000 0.026 ± 0.000 0.027 ± 0.000
OGG1 0.035 ± 0.000 0.038 ± 0.000 0.041 ± 0.000 0.041 ± 0.000 0.044 ± 0.000 0.046 ± 0.000
SORT1 0.026 ± 0.000 0.027 ± 0.001 0.028 ± 0.000 0.030 ± 0.000 0.034 ± 0.001 0.037 ± 0.000
Average 0.042 ± 0.003 0.044 ± 0.004 0.047 ± 0.004 0.049 ± 0.005 0.054 ± 0.005 0.059 ± 0.006
Method: DNN
A2AR 0.054 ± 0.002 0.047 ± 0.001 0.056 ± 0.000 0.053 ± 0.001 0.064 ± 0.001 0.067 ± 0.001
AmpC 0.089 ± 0.004 0.101 ± 0.002 0.105 ± 0.003 0.126 ± 0.005 0.135 ± 0.002 0.140 ± 0.003
5’-NT 0.042 ± 0.002 0.041 ± 0.001 0.043 ± 0.002 0.047 ± 0.001 0.049 ± 0.000 0.050 ± 0.000
D2R 0.062 ± 0.004 0.057 ± 0.001 0.068 ± 0.001 0.071 ± 0.002 0.079 ± 0.001 0.084 ± 0.001
KEAP1 0.037 ± 0.001 0.038 ± 0.002 0.042 ± 0.000 0.044 ± 0.001 0.045 ± 0.001 0.049 ± 0.000
MPRO 0.023 ± 0.001 0.023 ± 0.000 0.025 ± 0.001 0.026 ± 0.000 0.028 ± 0.000 0.029 ± 0.000
OGG1 0.038 ± 0.001 0.040 ± 0.001 0.042 ± 0.001 0.045 ± 0.001 0.048 ± 0.000 0.050 ± 0.000
SORT1 0.029 ± 0.002 0.029 ± 0.000 0.030 ± 0.001 0.034 ± 0.001 0.038 ± 0.001 0.040 ± 0.001
Average 0.047 ± 0.004 0.047 ± 0.005 0.051 ± 0.005 0.055 ± 0.006 0.061 ± 0.007 0.064 ± 0.007

(a) Each test set contained ten million molecules. Continuous-Data-Driven Descriptors (CDDD) were used as
features of the molecules. Three independent calculations (training and prediction) were performed for each
target and error bars correspond to the standard error of the mean. Averages are reported in bold.

Supplementary Table 7. Sensitivity and training set size - RoBERTa. Sensitivity values obtained at
optimal efficiency for different sizes of the training set.

Sensitivity (a) by training set size
Target 25K 50K 100K 200K 500K 1M
Method: RoBERTa
A2AR 0.765 ± 0.007 0.781 ± 0.006 0.806 ± 0.007 0.848 ± 0.007 0.861 ± 0.006 0.879 ± 0.002
AmpC 0.808 ± 0.005 0.872 ± 0.005 0.890 ± 0.004 0.916 ± 0.003 0.939 ± 0.002 0.944 ± 0.002
5’-NT 0.735 ± 0.011 0.784 ± 0.007 0.778 ± 0.005 0.808 ± 0.007 0.827 ± 0.003 0.841 ± 0.003
D2R 0.737 ± 0.003 0.817 ± 0.004 0.841 ± 0.007 0.863 ± 0.002 0.884 ± 0.003 0.901 ± 0.000
KEAP1 0.727 ± 0.011 0.764 ± 0.006 0.797 ± 0.005 0.805 ± 0.006 0.822 ± 0.005 0.830 ± 0.000
MPRO 0.627 ± 0.005 0.657 ± 0.013 0.689 ± 0.005 0.703 ± 0.002 0.729 ± 0.001 0.745 ± 0.000
OGG1 0.728 ± 0.007 0.751 ± 0.005 0.783 ± 0.003 0.805 ± 0.004 0.819 ± 0.005 0.837 ± 0.004
SORT1 0.662 ± 0.014 0.690 ± 0.010 0.730 ± 0.001 0.757 ± 0.003 0.782 ± 0.001 0.805 ± 0.004
Average 0.724 ± 0.011 0.764 ± 0.013 0.789 ± 0.012 0.813 ± 0.013 0.833 ± 0.012 0.848 ± 0.012

(a) Each test set contained ten million molecules. Internal RoBERTa descriptors were used as features of the
molecules. Three independent calculations (training and prediction) were performed for each target and
error bars correspond to the standard error of the mean. Averages are reported in bold.

Supplementary Table 8. Precision and training set size - RoBERTa. Precision values obtained at
optimal efficiency for different sizes of the training set.

Precision (a) by training set size
Target 25K 50K 100K 200K 500K 1M
Method: RoBERTa
A2AR 0.034 ± 0.001 0.042 ± 0.001 0.050 ± 0.001 0.054 ± 0.001 0.064 ± 0.002 0.070 ± 0.000
AmpC 0.052 ± 0.000 0.065 ± 0.001 0.084 ± 0.002 0.111 ± 0.002 0.143 ± 0.004 0.181 ± 0.004
5’-NT 0.032 ± 0.001 0.035 ± 0.001 0.042 ± 0.000 0.045 ± 0.001 0.050 ± 0.001 0.054 ± 0.001
D2R 0.034 ± 0.001 0.048 ± 0.001 0.058 ± 0.001 0.066 ± 0.002 0.082 ± 0.001 0.094 ± 0.001
KEAP1 0.035 ± 0.001 0.038 ± 0.001 0.040 ± 0.000 0.043 ± 0.001 0.047 ± 0.001 0.050 ± 0.000
MPRO 0.018 ± 0.000 0.021 ± 0.000 0.023 ± 0.000 0.025 ± 0.000 0.028 ± 0.000 0.031 ± 0.000
OGG1 0.030 ± 0.000 0.035 ± 0.000 0.039 ± 0.000 0.042 ± 0.001 0.048 ± 0.001 0.051 ± 0.000
SORT1 0.022 ± 0.001 0.026 ± 0.000 0.029 ± 0.000 0.031 ± 0.000 0.038 ± 0.000 0.043 ± 0.000
Average 0.032 ± 0.002 0.039 ± 0.003 0.046 ± 0.004 0.052 ± 0.005 0.062 ± 0.007 0.072 ± 0.009

(a) Each test set contained ten million molecules. Internal RoBERTa descriptors were used as features of the
molecules. Three independent calculations (training and prediction) were performed for each target and
error bars correspond to the standard error of the mean. Averages are reported in bold.

Supplementary Table 9. Training and prediction times.

Cost (s) (a) | CatBoost, Morgan2 | CatBoost, CDDD | DNN, Morgan2 | DNN, CDDD | RoBERTa (GPU)
Training (1M) | 1410 | 2330 | 12669 | 24906 | 376685
Prediction (1M) | 117 | 192 | 462 | 699 | 9125

(a) The times (in seconds) required to train a conformal predictor on one million molecules with different
architectures and descriptors, or to predict one million molecules.
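
To put the timings of Supplementary Table 9 into more familiar units, the short sketch below converts them to hours and compares architectures. The assignment of the five values to columns (CatBoost and DNN with Morgan2 and CDDD, then RoBERTa) follows our reading of the table layout, and the dictionary keys and helper names are hypothetical.

```python
# (training seconds, prediction seconds) per one million molecules,
# as read from Supplementary Table 9.
TIMINGS = {
    "CatBoost/Morgan2": (1410, 117),
    "CatBoost/CDDD": (2330, 192),
    "DNN/Morgan2": (12669, 462),
    "DNN/CDDD": (24906, 699),
    "RoBERTa (GPU)": (376685, 9125),
}


def seconds_to_hours(seconds):
    return seconds / 3600


def relative_cost(method, reference, phase):
    """How many times slower `method` is than `reference` for a given phase."""
    index = {"training": 0, "prediction": 1}[phase]
    return TIMINGS[method][index] / TIMINGS[reference][index]


for method, (train_s, predict_s) in TIMINGS.items():
    print(f"{method}: training {seconds_to_hours(train_s):.1f} h, "
          f"prediction {predict_s / 60:.1f} min per million molecules")

ratio = relative_cost("RoBERTa (GPU)", "CatBoost/Morgan2", "training")
print(f"RoBERTa training is roughly {ratio:.0f} times slower than CatBoost/Morgan2")
```

Read this way, training RoBERTa takes more than four days, whereas CatBoost on Morgan2 descriptors trains in under half an hour.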

Supplementary Figure 6. Performance and number of aggregated models. Sensitivity and precision at
optimal efficiency were analyzed for a different number of models during aggregation. Five independent
CatBoost models were trained on one million molecules represented by Morgan2 descriptors. Each test set
contained ten million molecules. Three independent calculations (training and prediction) were performed
for the eight targets and error bars correspond to the standard error of the mean.

Supplementary Figure 7. Performance on imbalanced datasets. Sensitivity and precision at optimal
efficiency were analyzed for different class imbalances. Five independent CatBoost models were trained
on one million molecules represented by Morgan2 descriptors. Each test set contained ten million
molecules. Three independent calculations (training and prediction) were performed for the eight targets
and error bars correspond to the standard error of the mean.

Supplementary Figure 8. Overview of noise addition. A zero-centered normal distribution was
constructed using the standard deviation (σscores) of the docking score distribution and a noise scaling factor
(γnoise). Noise was added to the score of each sample by taking a sample from the corresponding noise
distribution. Large noise scaling factors led to wide distributions and increased perturbations of the initial
docking score distributions.
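
The noise addition described in this caption can be sketched as follows. This is an illustration under stated assumptions: the standard deviation of the noise distribution is taken to be the product γnoise · σscores, the population standard deviation is used for σscores, and the function name `add_noise` and the example scores are hypothetical.

```python
import random
from statistics import pstdev


def add_noise(scores, gamma_noise, seed=0):
    """Perturb each docking score with zero-centered Gaussian noise.

    The noise standard deviation is assumed to be gamma_noise times the
    standard deviation of the score distribution.
    """
    rng = random.Random(seed)
    sigma_scores = pstdev(scores)
    noise_sd = gamma_noise * sigma_scores
    return [score + rng.gauss(0.0, noise_sd) for score in scores]


# Made-up docking scores (more negative means better).
scores = [-42.1, -38.5, -35.0, -30.2, -27.9, -25.3]
for gamma in (0.0, 0.5, 2.0):
    noisy = add_noise(scores, gamma)
    print(gamma, [round(s, 1) for s in noisy])
```

With a scaling factor of zero the scores are unchanged, and larger factors progressively blur the ranking on which the virtual active labels are based.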

Supplementary Figure 9. Performance on noisy datasets. Sensitivity and precision at optimal efficiency
were analyzed for datasets generated with different noise scaling factors (γnoise). Five independent CatBoost
models were trained on one million molecules represented by Morgan2 descriptors. Each test set contained
ten million molecules. Three independent calculations (training and prediction) were performed for the eight
targets and error bars correspond to the standard error of the mean.

Supplementary Figure 10. Performance on non-sensical datasets - labels. Sensitivity and precision at
optimal efficiency were analyzed for datasets where the labels were scrambled without affecting the class
imbalance. Five independent CatBoost models were trained on one million molecules represented by
Morgan2 descriptors. Each test set contained ten million molecules. Five independent calculations (training
and prediction) were performed for the eight targets. When the CP operates at an optimal efficiency of 50%,
has a sensitivity averaging around 50%, and a precision close to the class imbalance (1%), the performance
will correspond to random classification. Values represent individual datapoints and no corresponding error
bars are shown.
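
Scrambling labels without affecting the class imbalance amounts to permuting the label vector, as in the minimal sketch below. The function name and the toy dataset are hypothetical, and the exact procedure used in the study may differ.

```python
import random


def scramble_labels(labels, seed=0):
    """Return a random permutation of the labels; class counts are preserved."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


# Toy dataset with 1% virtual actives, mirroring the class imbalance in the text.
labels = [1] * 10 + [0] * 990
scrambled = scramble_labels(labels)
print(sum(labels) / len(labels), sum(scrambled) / len(scrambled))
```

Because the permutation breaks any link between features and labels, a classifier can do no better than chance, and the expected precision equals the fraction of virtual actives (1%).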

Supplementary Figure 11. Performance on non-sensical datasets - features. Sensitivity and precision
at optimal efficiency were analyzed for datasets where the feature vectors were shuffled. Five independent
CatBoost models were trained on one million molecules represented by Morgan2 descriptors. Each test set
contained ten million molecules. Five independent calculations (training and prediction) were performed for
the eight targets. When the CP operates at an optimal efficiency of 50%, has a sensitivity averaging around
50%, and a precision close to the class imbalance (1%), the performance will correspond to random
classification. Values represent individual datapoints and no corresponding error bars are shown.

Supplementary Figure 12. Structural similarity between non-exchangeable datasets and conformal
predictor performance. (a) Two-dimensional unsupervised UMAP projection illustrating the chemical
relationships in high-dimensional feature space between the WuXi training set (blue), the Enamine training
set (red), and the Enamine test set (gray). (b) Difference in sensitivity values obtained from conformal
predictors trained on one million exchangeable (red) and one million non-exchangeable (blue) molecules as a
function of the significance value (ε). Values represent individual datapoints and no corresponding error
bars are shown.

Supplementary Figure 13. Correlations between the quality of information metric and molecular docking
results. (a) Pearson correlation coefficients between the quality of information metric (p1 - p0) and molecular
docking results (ranks or scores) for eight different protein targets. (b-i) For eight different protein targets,
boxplots representing the distribution of the quality of information metric (p1 - p0) across different segments
of the library (ten million molecules) ranked by docking scores. Each box spans from the first quartile (Q1,
25th percentile) to the third quartile (Q3, 75th percentile), with the purple line inside the box indicating the
median (50th percentile). The whiskers extend to the most extreme data points within 1.5 times the
interquartile range (IQR) from the quartiles. Data points outside this range are considered outliers and are
not visualized for clarity. Blue and red dots respectively represent the median p1 and p0 values for the
different segments.
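The quantities named in this caption can be sketched in a few lines. The example assumes that p1 and p0 are the conformal p-values for the virtual active and inactive classes, and it uses one common quartile convention (the "inclusive" method of Python's statistics module); the plotting software used for the figure may compute quartiles slightly differently. All function names are hypothetical.

```python
import math
import statistics


def quality_of_information(p1, p0):
    """Per-molecule difference p1 - p0 between the two class p-values."""
    return [a - b for a, b in zip(p1, p0)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def box_stats(values):
    """Quartiles plus whiskers at the most extreme points within 1.5 x IQR."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }
```

Points outside the two fences are the outliers that the caption says are omitted from the plots.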

Supplementary Figure 14. (a) Number of unique Bemis-Murcko scaffolds in the top-ranked (1%) D2R
compounds prioritized by explicit docking (red) or the conformal predictor (blue) as a function of the size
of the virtual library. Values represent individual data points; no error bars are shown. (b)
Distributions of pairwise Tanimoto coefficients in the top-ranked (1%) compounds prioritized by explicit
docking (red) or the conformal predictor (blue) as a function of the size of the virtual library. Ten random
samples with no overlap were taken from the top-ranked (1%) compounds and their pairwise Tanimoto
coefficients were calculated, followed by division of the results in one hundred bins. Data points represent
the means of each bin, and error bars correspond to the standard errors on those means. A paired t-test
indicates that the distributions of pairwise Tanimoto coefficients in the top-ranked (1%) compounds from
the full library prioritized by explicit docking or the conformal predictor are not significantly different (p =
3.15e-6).
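The sampling and binning procedure of panel (b) might be sketched as follows. This is one reading of the caption, not the authors' code: fingerprints are represented as Python sets of on-bit indices, each of the ten disjoint samples yields a normalized histogram of its pairwise Tanimoto coefficients over one hundred bins, and the mean and standard error per bin are taken across the samples. All names are hypothetical.

```python
import itertools
import math
import random
import statistics


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def disjoint_samples(items, n_samples=10, sample_size=100, seed=0):
    """Draw n_samples random samples that share no element."""
    pool = list(items)
    if n_samples * sample_size > len(pool):
        raise ValueError("not enough items for non-overlapping samples")
    random.Random(seed).shuffle(pool)
    return [pool[i * sample_size:(i + 1) * sample_size] for i in range(n_samples)]


def binned_pairwise_tanimoto(sample, n_bins=100):
    """Fraction of pairwise coefficients falling in each of n_bins equal bins."""
    counts = [0] * n_bins
    n_pairs = 0
    for a, b in itertools.combinations(sample, 2):
        index = min(int(tanimoto(a, b) * n_bins), n_bins - 1)
        counts[index] += 1
        n_pairs += 1
    return [c / n_pairs for c in counts]


def mean_and_sem_per_bin(histograms):
    """Mean and standard error of each bin across the sample histograms."""
    n = len(histograms)
    return [
        (statistics.fmean(values), statistics.stdev(values) / math.sqrt(n))
        for values in zip(*histograms)
    ]
```

In practice the sets would be replaced by fingerprints from a cheminformatics toolkit; the set representation only keeps the sketch dependency-free.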

Supplementary Tables

Supplementary Table 10. Chemical structures and D2R radioligand displacement data.

Vendor code^a     Displacement at 10 µM (%)^b   SMILES
Z1348398263       5 ± 3%                        c1cnc(cc1C(F)(F)F)N2CCN(CC2)C3CCC3
Z8185092667       12 ± 3%                       c1ccc(cc1)CC2CN(C2)Cc3cc4c(s3)cccn4
Z8185092668       3 ± 1%                        c1ccc(c(c1)[C@H](CN2CCc3cc(sc3C2)Br)O)F
Z2833584438       1 ± 1%                        c1cnc(cn1)CCN2CCC(CC2)c3c[nH]nc3
Z8185092353       4 ± 3%                        Cc1ccc(c(c1)C)C(CN2CCC3(C2)Cc4ccccc4C3)O
Z2694200197       17 ± 4%                       c1cc(sc1)CCN2CCC(C2)Oc3ccc(cn3)C4CC4
Z2102774071       10 ± 3%                       c1cc(cc(c1)Cl)CC(CN2CCC=C(C2)c3ccco3)O
Z3516919089       1 ± 1%                        Cc1cc(ccc1F)CC(CN2CCc3c(ncn3C4CCC4)C2)O
Z3516846214       5 ± 1%                        Cc1c(nc[nH]1)CN2CCC(C2)Oc3cc(cc(c3)F)F
Z8185092665       4 ± 1%                        c1cc(ccc1COC2CCN(CC2)CCn3cc(cn3)Cl)Br
Z2436421891 (2)   56 ± 2%                       Cc1cnccc1CCN2CCC(C2)Oc3ccccc3Cl
Z8185092308       6 ± 2%                        Cc1c(c(ccc1)CN[C@H]2C[C@@H](C2)Oc3ccccc3)F
Z8185092078       3 ± 1%                        Cc1ccccc1n2cc(cn2)CNC3Cc4ccccc4OC3
Z2213086620       27 ± 1%                       c1ccc(cc1)n2nc(cn2)CN[C@H]3C[C@@H](C3)c4ccc(cc4)F
Z1973963178       1 ± 2%                        c1ccc(cc1)[C@H]2C[C@@H](C2)NCc3c(cccc3)n4nccc4
Z8185092056       2 ± 2%                        c1ccc(cc1)O[C@H]2C[C@@H](C2)NCc3n4c(nc3)cc(cc4)Br
Z3532611366       24 ± 1%                       c1ccc(cc1)C2CC(C2)NCc3cc4c([nH]3)cccn4
Z3529190238       1 ± 1%                        c1ccc(cc1)O[C@@H]2CCN(C2)Cc3cc4c(nc3Cl)CCOC4
Z8185092674       2 ± 3%                        CC(C)(C)c1c[nH]c(n1)CN2CCC(C2)Cc3nccs3
Z8185092492       5 ± 3%                        c1cc(ccc1O[C@H]2CCN(C2)Cc3cnc4n3cc(cc4)Br)F
Z8185092514       23 ± 1%                       Cc1ccc(cc1)C2CC(C2)NCCn3c4ccccc4cn3
Z3651347041       28 ± 1%                       c1ccc2c(c1)cnn2C3CCN(CC3)CC(c4cccc(n4)Cl)O
Z8185092666       24 ± 1%                       c1cc(ccc1C2CN=C(N2)NCc3cc(ccn3)OC4CCC4)Cl
Z1441695252 (1)   59 ± 1%                       c1ccc2c(c1)CC(C2)NCCc3ccn(n3)c4ccc(cc4)F
Z8185092671       5 ± 2%                        c1ccc(cc1)C2CN=C(N2)NCCc3cn4c(n3)CCCC4
Z3310096494       1 ± 2%                        c1ccc(cc1)NC2CCN(C2)Cc3cccc4n3ncn4
Z3806926084       4 ± 1%                        c1ccc2c(c1)c(ncn2)C3CN(C3)CCc4cccc(c4)Cl
Z8185092672       2 ± 1%                        c1ccc(c(c1)CNCc2nc3cccc(c3s2)F)n4cccn4
Z2629646171       7 ± 1%                        Cn1cc(cn1)c2ccc(s2)CN3CC[C@H](C3)Oc4ccccc4
Z8185092530       1 ± 2%                        Cc1ccc(cc1)CCN2C[C@@H]3CN(C[C@@H]3C2)C(=O)c4cn5c(n4)CC(CC5)C

^a Vendor code ZXXX; manuscript compound numbers are given in parentheses.
^b Data represents mean values ± SEM of two technical replicates.

Supplementary Table 11. Chemical novelty of discovered D2R ligands.

#   Tc^a   Ki (µM)^b    ChEMBL ID
1   0.41   3.0 ± 0.3    CHEMBL3589575
2   0.38   3.8 ± 0.3    CHEMBL397180

^a Maximal Tanimoto similarity coefficient (Tc) between the compound and ChEMBL human dopamine
receptor ligands with Ki < 10 µM (>11,000 compounds). Coefficients were calculated using the RDKit and
Morgan2 fingerprints.
^b Data represents mean values ± SEM from three independent experiments.
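The novelty measure in footnote a, the maximal Tanimoto coefficient against known ligands below a Ki cutoff, can be sketched without any toolkit. This is an illustrative sketch only: the source computed Morgan2 fingerprints with the RDKit, whereas here fingerprints are plain sets of on-bit indices, and the record fields ("id", "ki_um", "fp") and the function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def most_similar_known_ligand(query_fp, known_ligands, ki_cutoff_um=10.0):
    """Return (id, Tc) of the nearest known ligand with Ki below the cutoff.

    known_ligands is an iterable of dicts with hypothetical keys
    "id", "ki_um" (Ki in micromolar) and "fp" (set of on-bit indices).
    """
    best_id, best_tc = None, 0.0
    for record in known_ligands:
        if record["ki_um"] >= ki_cutoff_um:
            continue  # keep only ligands with Ki < cutoff, as in the footnote
        tc = tanimoto(query_fp, record["fp"])
        if tc > best_tc:
            best_id, best_tc = record["id"], tc
    return best_id, best_tc
```

A low maximal Tc, such as the values of about 0.4 reported in the table, indicates that no close analogue exists in the reference set under the chosen fingerprint.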

Supplementary Table 12. Chemical structures and A2AR radioligand displacement data.

Vendor code^a     Displacement at 20 µM (%)^b   SMILES
Z3591989733       19 ± 1%                       Cc1cc(ns1)NC(=O)c2cccn2CCN3CCOCC3
Z7272600070       20 ± 1%                       CC(Cc1c[nH]c2c1cccn2)NCc3c4ccccc4n(n3)C
Z4219981720       6 ± 4%                        c1ccc(cc1)CN2CCCC2C(=O)NCc3[nH]c(=O)ccn3
Z7326877935       13 ± 2%                       Cc1cnc(cc1OC)CNC(C)Cc2c[nH]c3c2cccn3
Z8854579348 (4)   1.4 µM                        c1ccc(cc1)O[C@H]2CCN(C2)CC(=O)c3c[nH]c4c3ccc(n4)Cl
Z3292775568       1 ± 2%                        c1cnc(nc1)NC(=O)C2CCN(CC2)CC(C3CCC3)O
Z8854579346       23 ± 1%                       c1cc(cc(c1)OCCN=C(N)Nc2ncccn2)C(F)(F)F
Z6743522026       5 ± 3%                        c1ccc(cc1)CC2CCN2CC(=O)Nc3[nH]ccn3
Z8854579360       26 ± 3%                       c1cnc(nc1)NC(=NCCOc2ccc(c(c2)Cl)F)N
Z8854579344       17 ± 1%                       c1cnc(nc1)NC(=NCCOc2ccc(cc2)F)N
Z8854579337       29 ± 4%                       c1ccc(c(c1)CCN=C(N)Nc2ncccn2)Cl
Z8854579336       39 ± 1%                       COc1ccc(c(c1)CCN=C(N)Nc2ncccn2)Cl
Z8854579355       28 ± 1%                       c1cc(cc(c1)Br)CCN=C(N)Nc2ncccn2
Z8854579347       32 ± 4%                       c1ccn(c1)c2c(ccs2)CN=C(N)Nc3ncccn3
Z8854579353       22 ± 5%                       c1ccc(c(c1)CCN=C(N)Nc2ncccn2)OC(F)(F)F
Z8854579362       12 ± 4%                       COc1ccc(c(c1)F)CCN=C(N)Nc2ncccn2
Z8854579358       16 ± 4%                       c1cc(c(c(c1)Cl)CCN=C(N)Nc2ncccn2)F
Z5332306614       8 ± 1%                        c1ccc(cc1)CCN2CCC(C2)CNc3[nH]c(=O)c4c(n3)ccs4
Z8854579342       30 ± 2%                       c1ccc2c(c1)C[C@H]([C@H]2N)C(=O)Nc3ccn(n3)c4ccccn4
Z8854579335       6 ± 2%                        Cc1ccc(nc1)NCCNC(=O)C2Cc3c(cccc3CN2)C
Z8854579367       10 ± 2%                       Cn1ccc(n1)NC(=O)CNC(=O)C2Cc3c(cccc3Br)CN2
Z8854579363       20 ± 1%                       c1cc(c(c(c1)Br)CCN=C(N)Nc2ncccn2)F
Z8854579366       15 ± 1%                       Cc1ccc(c(c1)CCN=C(N)Nc2ncccn2)F
Z6382061841       2 ± 2%                        Cc1c(nc([nH]1)NC(=O)C(Cc2cccc(n2)Br)N)C
Z8854579338       49 ± 2%                       c1ccc(cc1)C[C@H](CC(=O)Nc2[nH]c3cc(cc(c3n2)Br)F)N
Z8854579356       15 ± 1%                       CSc1ccc(cc1)CCN=C(N)Nc2ncccn2
Z8854579368       14 ± 1%                       c1cnc(nc1)NC(=NCCc2ccc(cc2F)F)N
Z8854579345       32 ± 2%                       c1ccc2c(c1)cc(o2)CN=C(N)Nc3ncccn3
Z8854579357 (5)   20 ± 3.0 µM                   c1cnc(nc1)NC(=NCCc2c[nH]c3c2cc(cc3Cl)Cl)N
Z2518713795       23 ± 3%                       c1ccc(cc1)c2nc([nH]n2)CN=C(N)Nc3ccccn3
Z8854579369       2 ± 3%                        c1cnc(nc1)NC(=NCCOc2ccc(cc2F)F)N
Z8854579364       17 ± 4%                       c1cnc(nc1)NC(=NCCCc2ccc(cc2)F)N
Z8854579354       1 ± 1%                        Cc1cccc2c1CC(NC2)C(=O)NCCNc3ncccn3
Z8861112994       32 ± 2%                       c1cc(c(c(c1)Cl)CN=C(N)Nc2ncccn2)Cl
Z8854579370       11 ± 1%                       c1cnc(nc1)NC(=NCCc2ccc(cc2)F)N
Z5471612810       2 ± 1%                        CCc1ccnc(n1)NCC2CN(CC2C)Cc3ccccc3
Z8857701713       3 ± 2%                        Cc1ccncc1CNC(=O)C(Cc2c[nH]c3c2cccn3)N
Z8854579350       30 ± 3%                       Cc1ccc(cc1)C[C@@H](CC(=O)Nc2[nH]c3cccc(c3n2)C)N
Z8854579339 (3)   2.5 µM                        Cc1cccc(c1)CC(C(=O)Nc2[nH]c3cc(cc(c3n2)Br)F)N
Z8854579351       31 ± 1%                       Cc1cccc2c1nc([nH]2)NC(=O)C(Cc3csc4c3cccc4)N
Z8857701715 (6)   1.3 µM                        CCOC(=O)c1cccc2c1nc([nH]2)NC(=O)C(Cc3ccco3)N
Z6437654059       3 ± 2%                        c1ccc(c(c1)CC(C(=O)N2CC(C2)C(=O)Nc3nccs3)N)Cl
Z8854579352       57 ± 4%                       Cc1cccc2c1nc([nH]2)NC(=O)C(Cc3cccc(c3)Cl)N
Z3765568162       16 ± 4%                       CCc1ccc(nc1)CNCCc2c[nH]c3c2cccc3OC
Z8854579341       2 ± 2%                        c1ccc(cc1)COC(=O)C(CC(=O)Nc2cc3ncccn3n2)N

^a Vendor code ZXXX; manuscript compound numbers are given in parentheses.
^b Percentage displacement data represents mean values ± SEM of two technical replicates. Ki values
obtained from fitting to concentration-response curve from two technical replicates, except for compound 5
(mean ± SEM) which was tested in three independent experiments.

Supplementary Table 13. A2AR ligands and the most similar known adenosine receptor ligand.

#   Tc^a   Activity^b     ChEMBL ID
3   0.38   2.5 µM         CHEMBL3763215
4   0.36   1.4 µM         CHEMBL471853
5   0.37   20 ± 3.0 µM    CHEMBL1098444
6   0.37   1.3 µM         CHEMBL3091695

^a Maximal Tanimoto similarity coefficient (Tc) between the compound and ChEMBL human adenosine
receptor ligands with Ki < 10 µM (>10,000 compounds). Coefficients were calculated using the RDKit and
Morgan2 fingerprints.
^b Ki values obtained from fitting to concentration-response curve from two technical replicates, except
for compound 5 (mean ± SEM) which was tested in three independent experiments.

Supplementary Table 14. Dual-target ligand and the most similar known dopamine and adenosine
receptor ligands.

#   Target   Tc^a   Ki (µM)^b    ChEMBL ID
5   A2AR     0.37   20 ± 3.0     CHEMBL1098444
5   D2R      0.28   14 ± 0.7     CHEMBL267014

^a Maximal Tanimoto similarity coefficient (Tc) between compound 5 and ChEMBL human adenosine and
dopamine receptor ligands with Ki < 10 µM (>21,000 compounds). Coefficients were calculated using the
RDKit and Morgan2 fingerprints.
^b Data represents mean values ± SEM from three independent experiments.

Supplementary Figures

Supplementary Figure 15. Radioligand displacement binding curves of discovered D2R ligands.
Percentage D2R radioligand displacement by compounds 1, 2, and 5 as a function of their concentration. Data
points represent mean ± SEM from three independent experiments.
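For reference, the "mean ± SEM" summary used for these data points can be written out as a short sketch. It assumes the usual definition, with the sample standard deviation (n - 1 denominator) divided by the square root of the number of experiments; the function name and the example values in the usage note are hypothetical, not taken from the study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem
```

For three hypothetical displacement readings of 55, 58 and 61%, `mean_and_sem([55.0, 58.0, 61.0])` gives a mean of 58 and an SEM of about 1.7.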

Supplementary Figure 16. Functional assay curves of discovered D2R ligands. Representative
concentration-response curves of compounds 1 and 2 in functional assays at the D2R. Data points represent
individual measurements from a single experiment and the corresponding error bars represent the error of
the curve fit on those data points.

Supplementary Figure 17. Ligand enrichment curves for A2AR and D2R models. Logarithmic receiver
operating characteristic (ROC) curves describing the enrichment of known binders of the (a) A2AR and (b)
D2R over corresponding property-matched decoys.
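A logarithmic ROC curve of this kind can be derived from two lists of docking scores, as in the following minimal sketch. It assumes that lower (more negative) scores are better, ignores tied scores, and simply drops the points with a false-positive rate of zero before taking the logarithm; the function names are hypothetical and the sketch is not the authors' implementation.

```python
import math


def roc_points(ligand_scores, decoy_scores):
    """(false-positive rate, true-positive rate) after each ranked compound."""
    ranked = sorted(
        [(score, True) for score in ligand_scores]
        + [(score, False) for score in decoy_scores],
        key=lambda item: item[0],  # best (lowest) score first
    )
    n_ligands, n_decoys = len(ligand_scores), len(decoy_scores)
    true_pos = false_pos = 0
    points = []
    for _, is_ligand in ranked:
        if is_ligand:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / n_decoys, true_pos / n_ligands))
    return points


def log_roc(points):
    """Put the x axis on a log10 scale; points with FPR = 0 are dropped."""
    return [(math.log10(fpr), tpr) for fpr, tpr in points if fpr > 0]


def tpr_at_fpr(points, max_fpr):
    """Fraction of ligands recovered before max_fpr of the decoys."""
    return max((tpr for fpr, tpr in points if fpr <= max_fpr), default=0.0)
```

The logarithmic x axis stretches the early part of the ranking, which is the region that matters when only a small top fraction of a library is selected.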

Supplementary Figure 18. Radioligand displacement binding curves of discovered A2AR ligands.
A2AR radioligand displacement by compounds 3-6 as a function of their concentration. Data points represent
mean ± SEM from two technical replicates for compounds 3, 4 and 6, and three independent experiments
for compound 5.

LC-MS Spectral Data

Supplementary Figure 19. LC-MS data for compound 1. Chemical characterization of compound 1
(Z1441695252) by chromatography (top) and mass-spectrometry (bottom).

Supplementary Figure 20. LC-MS data for compound 2. Chemical characterization of compound 2
(Z2436421891) by chromatography (top) and mass-spectrometry (bottom).

[LC-MS report, sample BC896335$22, injection date 12/14/2023. DAD 215 nm: main peak 96.63% at a retention
time of 0.881 min; minor peak 3.37% at 0.603 min. Mol Wt 391.24; Exact Mass 390.07. Main-peak m/z: 393.0
and 394.0 (ES-API, positive); 391.0 and 392.0 (ES-API, negative).]

Supplementary Figure 21. LC-MS data for compound 3. Chemical characterization of compound 3
(Z8854579339) by chromatography (top) and mass-spectrometry (bottom).

[LC-MS report, sample BC896343$2, injection date 12/6/2023. DAD 215 nm: main peak 97.25% at a retention
time of 0.869 min; minor peaks 1.09% at 0.753 min and 1.65% at 1.352 min. Mol Wt 355.82; Exact Mass
355.13. Main-peak m/z: 356.0 and 358.0 (ES-API, positive); 354.0 and 356.0 (ES-API, negative).]

Supplementary Figure 22. LC-MS data for compound 4. Chemical characterization of compound 4
(Z8854579348) by chromatography (top) and mass-spectrometry (bottom).

[LC-MS report, sample BC896314$3, injection date 12/8/2023. DAD 215 nm: single peak 100.00% at a
retention time of 0.744 min. Mol Wt 430.13; Exact Mass 348.08. Main-peak m/z: 349.0 and 351.0 (ES-API,
positive); 347.2 and 385.0 (ES-API, negative).]

Supplementary Figure 23. LC-MS data for compound 5. Chemical characterization of compound 5
(Z8854579357) by chromatography (top) and mass-spectrometry (bottom).

[LC-MS report, sample BC932714$2, injection date 12/8/2023. DAD 215 nm: main peak 96.19% at a retention
time of 0.907 min; minor peak 3.81% at 0.769 min. Mol Wt 378.81; Exact Mass 342.14. Main-peak m/z: 343.0
and 344.2 (ES-API, positive); 341.2 and 342.0 (ES-API, negative).]

Supplementary Figure 24. LC-MS data for compound 6. Chemical characterization of compound 6
(Z8857701715) by chromatography (top) and mass-spectrometry (bottom).
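As a consistency check on the LC-MS reports, the exact masses printed for compounds 4, 5 and 6 can be compared with the main ions observed in positive and negative mode. The sketch below assumes singly charged protonated [M+H]+ and deprotonated [M-H]- ions and a tolerance of 0.5 m/z units for these low-resolution spectra; the tolerance and the function names are assumptions, while the numbers in the usage note are those printed in the reports.

```python
PROTON_MASS = 1.007276  # mass of a proton in unified atomic mass units


def expected_mz(exact_mass, mode):
    """Expected m/z of [M+H]+ ("pos") or [M-H]- ("neg") for a neutral mass."""
    if mode == "pos":
        return exact_mass + PROTON_MASS
    if mode == "neg":
        return exact_mass - PROTON_MASS
    raise ValueError("mode must be 'pos' or 'neg'")


def matches(exact_mass, observed_mz, mode, tolerance=0.5):
    """True if the observed ion lies within the tolerance of the expected m/z."""
    return abs(expected_mz(exact_mass, mode) - observed_mz) <= tolerance
```

For compound 4 (exact mass 355.13), the expected [M+H]+ ion is about 356.14, in line with the observed 356.0; the same check holds for compound 5 (348.08 against 349.0) and compound 6 (342.14 against 343.0).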

NMR Spectra

Supplementary Figure 25. 1H-NMR data for compound 1. Chemical characterization of compound 1
(Z1441695252) by proton nuclear magnetic resonance.

Supplementary Figure 26. 1H-NMR data for compound 2. Chemical characterization of compound 2
(Z2436421891) by proton nuclear magnetic resonance.

Supplementary Figure 27. 1H-NMR data for compound 3. Chemical characterization of compound 3
(Z8854579339) by proton nuclear magnetic resonance.

Supplementary Figure 28. 1H-NMR data for compound 4. Chemical characterization of compound 4
(Z8854579348) by proton nuclear magnetic resonance.

Supplementary Figure 29. 1H-NMR data for compound 5. Chemical characterization of compound 5
(Z8854579357) by proton nuclear magnetic resonance.

Supplementary Figure 30. 1H-NMR data for compound 6. Chemical characterization of compound 6
(Z8857701715) by proton nuclear magnetic resonance.

Supplementary References

1. Liu W., et al. Structural basis for allosteric regulation of GPCRs by sodium ions. Science, 337,
232-236 (2012).
2. Lyu J., et al. Ultra-large library docking for discovering new chemotypes. Nature, 566, 224-229
(2019).
3. Beatty J. W., et al. Discovery of Potent and Selective Non-Nucleotide Small Molecule Inhibitors of
CD73. J. Med. Chem., 63, 3935-3955 (2020).
4. Wang S., et al. Structure of the D2 dopamine receptor bound to the atypical antipsychotic drug
risperidone. Nature, 555, 269-273 (2018).
5. Davies T. G., et al. Monoacidic Inhibitors of the Kelch-like ECH-Associated Protein 1: Nuclear
Factor Erythroid 2-Related Factor 2 (KEAP1:NRF2) Protein-Protein Interaction with High Cell
Potency Identified by Fragment-Based Discovery. J. Med. Chem., 59, 3991-4006 (2016).
6. Mesecar A. D., 2020, https://www.rcsb.org/structure/6W63
7. Visnes T., et al. Small-molecule inhibitor of OGG1 suppresses proinflammatory gene expression
and inflammation. Science, 362, 834-839 (2018).
8. Stachel S. J., et al. Identification of potent inhibitors of the sortilin-progranulin interaction. Bioorg.
Med. Chem. Lett., 30, 127403 (2020).
9. Ohno Y., et al. In Vitro Pharmacological Profile of KW-6356, a Novel Adenosine A2A Receptor
Antagonist/Inverse Agonist. Mol. Pharmacol., 103, 6, 311-324 (2023).
10. Xu P., et al. Structures of the human dopamine D3 receptor-Gi complexes. Mol. Cell, 81, 6, 1147-
1159.e4 (2021).
11. Alvarsson, J. et al. Predicting With Confidence: Using Conformal Prediction in Drug Discovery. J.
Pharm. Sci., 110, 42-49 (2021).
12. Tamma, P. D. et al. A Primer on AmpC β-Lactamases: Necessary Knowledge for an Increasingly
Multidrug-resistant World. Clin. Infect. Dis., 69, 1446-1455 (2019).
13. Ullrich, S., & Nitsche C. The SARS-CoV-2 main protease as drug target. Bioorg. Med. Chem. Lett., 30,
127377 (2020).
14. Bellmann L., Penner P., Gastreich M. & Rarey M. Comparison of Combinatorial Fragment Spaces
and Its Application to Ultralarge Make-on-Demand Compound Catalogs. J. Chem. Inf. Model., 62,
553-566 (2022).
