Main

AlphaFold2, which appeared in the 14th round of the Critical Assessment of protein Structure Prediction (CASP), is believed to have solved the half-century-old problem of predicting a protein structure from its primary sequence. This breakthrough has ushered in a new era in structure-based drug design1. Recently, the Critical Assessment of Computational Hit-finding Experiments (CACHE), a public benchmarking project, has garnered attention from the computational chemistry community and pharmaceutical industry for enhancing small-molecule hit-finding algorithms2. However, the hit-to-lead optimization process is still largely driven by hypotheses and depends on the experience of medicinal chemists. Lead optimization aims to design ligands with higher binding affinity while maintaining other properties3,4,5. During optimization, a congeneric series of ligands is generated that generally share the same core structure and differ only in some substituent groups. The extensive optimization space for a lead, spanning hundreds to thousands of compounds, necessitates substantial resources for experimental evaluations6,7. Consequently, developing in silico predictive tools is important to expedite drug discovery. By minimizing the number of design-make-test-analyze cycles, these tools facilitate the attainment of compounds possessing desired affinity and property profiles.

In recent decades, many relative binding free energy (RBFE) simulation methods have been proposed for lead optimization, benefiting from improved force fields and sampling algorithms. For example, free energy perturbation (FEP) is a widely used alchemical method8 that achieves remarkable accuracy on specific systems, approaching 1 kcal mol−1 (ref. 9). However, FEP also suffers from several limitations: its accuracy depends on the process of system preparation10, it incurs considerable computational cost9 and it tolerates only a limited number of changes between ligands. Another category of RBFE simulation method involves end-point sampling11, such as the molecular mechanics generalized Born surface area (MM-GB/SA)12,13. End-point sampling methods reduce the computational requirements, but their performance is also compromised. In summary, despite the high accuracy of RBFE simulation methods, their complicated preparation process, limited molecule throughput and low tolerance for changes between molecules hinder their practical usage in quickly navigating the optimization space of lead molecules.

In recent years, some artificial intelligence (AI) models designed to guide lead optimization have emerged14,15,16. Inspired by RBFE simulation methods, Jiménez-Luna et al. proposed a convolutional Siamese neural network (SNN), called DeltaDelta15, to directly determine the RBFE between two bound ligands. One advantage of the SNN is that it directly determines the RBFE, which eliminates the systematic error derived from the absolute binding free energies (ABFEs). Another advantage is its ability to factor in information from both input ligands, incorporating their structural differences and commonalities. However, DeltaDelta has yet to take full advantage of the SNN architecture. Specifically, DeltaDelta first predicts the ABFEs of the two input compounds, and then directly uses the difference between the predicted ABFEs as the final RBFE prediction for loss calculation. This approach does not consider the association between the two inputs (pairwise separability17). DeltaDelta showed relatively poor outcomes in retrospective lead-optimization campaigns without fine-tuning. McNutt et al. recently proposed a multitask convolutional SNN model16. Their approach uses the explicit differences between the representations of the two input ligands as the molecular-pair representation. The implicit assumption is that features common to the two ligands are irrelevant to predicting their difference, which is clearly unreasonable for RBFE prediction. Moreover, they used the prediction of the ABFE as one of the auxiliary tasks, potentially reintroducing the noise originally eliminated by RBFE prediction. Consequently, compared with DeltaDelta, their models did not show substantial performance gains.

In summary, developing an efficient and accurate method to guide lead optimization is an urgent need. To this end, we propose a pairwise binding comparison network (PBCNet) based on a physics-informed graph attention mechanism that is specifically tailored for ranking the relative binding affinity among a congeneric series of ligands. Several physical-oriented modeling strategies are introduced, considering that the formation of intermolecular interactions always follows strict geometric rules18. Based on our interpretation studies, we found that a relatively high attention score assigned to protein–ligand atom pairs may indicate a more significant interaction. Additionally, PBCNet focuses on molecular substructures that can form intermolecular interactions.

PBCNet has been evaluated in terms of the error and correlation between the predicted and experimental binding affinities. Benchmarking results show that our model substantially outperformed all baselines except FEP+. Furthermore, with a small amount of fine-tuning19 data, PBCNet is comparable to Schrödinger’s FEP+, but with substantially less computational cost. An ideal model should also have the ability to enrich key high-activity compounds from a batch of structural analogs. We built a benchmark to test whether our model can identify ‘leading’ compounds, and the results indicate that, on average, PBCNet can accelerate lead optimization projects by 473%. Finally, PBCNet has been deployed in the cloud, and the corresponding web service is accessible at https://pbcnet.alphama.com.cn/index.

Results

Model structure

The framework of PBCNet is shown in Fig. 1. It consists of three parts: (1) the message-passing phase, (2) the readout phase and (3) the prediction phase. The input of PBCNet is a pair of pocket–ligand complexes in which the ligands are structural analogs and the protein pockets are identical. The amino-acid residues of the protein whose minimum distance to the ligand is less than or equal to 8.0 Å are kept as the protein pocket. The message-passing phase is designed to obtain node-level representations. First, the graph convolutional network (GCN)20 is applied to update the atom representations of the protein pocket alone. Then, the updated protein pocket is combined with the two ligands by building edges between pairs of atoms less than 5.0 Å apart. A well-designed message-passing network (detailed in the Methods) is then used to transmit information across the molecular graphs. Finally, we remove the pocket from the molecular graphs and retain only the ligands. The goal of the readout phase is to obtain the molecular (graph-level) representations. In this phase, molecular representations of the ligands (x(i) and x(j) in Fig. 1) are computed by an Attentive FP21 readout operation. Then, the molecular-pair representations (\({\widetilde{{\bf{x}}}}^{{{(}}i,\,j{{)}}}\) in Fig. 1) are obtained by equation (7) in the Methods. In the prediction phase, molecular-pair representations are learned by optimizing the losses of two tasks: (1) prediction of the affinity difference and (2) prediction of the probability that the affinity of ligand i is greater than that of ligand j, each handled by an independent branch of three-layer feedforward neural networks (see section Model training and fine-tuning process).

Fig. 1: The framework of PBCNet.
figure 1

a, Message-passing phase. This phase is used to realize the mutual information interaction between the ligands (in red and blue) and the protein pocket (in gray), and obtain node-level representations of the ligands. b, The readout phase obtains the molecular representations (graph-level) and realizes the information interaction of the pair of ligands. The red and blue nodes represent the graph-level representations of ligand i and ligand j, respectively (x(i) and x(j)), and the yellow nodes represent the difference of the two graph-level representations, x(i) − x(j). The molecular-pair representations \({\widetilde{{\bf{x}}}}^{{{(}}i,\,j{{)}}}\) are obtained by concatenating the three. c, In the prediction phase, molecular-pair representations are learned by optimizing the losses of two tasks: (1) predictions of affinity differences \({\hat{y}}^{(i,\,j)}\) and (2) the probabilities (\({\hat{p}}^{(i,\,j)}\)) that the affinity of ligand i is greater than that of ligand j by two independent branches of three-layer feedforward neural networks.

In the inference process, we only need to provide docking poses of a pair of structurally similar small molecules to the same protein to obtain the predicted relative binding affinity. A more detailed description of the model framework, and the difference between the Siamese network and traditional networks are also demonstrated in the Methods.
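As a concrete illustration of the pocket definition used above (residues whose minimum heavy-atom distance to the ligand is at most 8.0 Å are retained), the following minimal Python sketch selects pocket residues from raw coordinates; the array layout and the function name are illustrative assumptions, not part of the released PBCNet code.

```python
import numpy as np

def extract_pocket_residues(ligand_xyz, protein_xyz, residue_ids, cutoff=8.0):
    """Return the ids of residues whose minimum heavy-atom distance to any
    ligand heavy atom is <= cutoff (in angstroms).

    ligand_xyz:  (n_lig, 3) ligand heavy-atom coordinates
    protein_xyz: (n_prot, 3) protein heavy-atom coordinates
    residue_ids: length-n_prot sequence mapping each protein atom to its residue
    """
    # pairwise distances between every protein atom and every ligand atom
    dist = np.linalg.norm(protein_xyz[:, None, :] - ligand_xyz[None, :, :], axis=-1)
    keep = dist.min(axis=1) <= cutoff          # per-protein-atom minimum distance
    return sorted({residue_ids[i] for i in np.where(keep)[0]})
```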

Performance of PBCNet

Zero-shot learning

First, we analyzed the zero-shot performance of PBCNet on the two held-out test sets (FEP1 and FEP2 sets, see section Benchmark dataset for performance assessment), and selected Schrödinger’s FEP+ (ref. 9), Schrödinger’s Glide SP22, MM-GB/SA11, as well as four AI-based models (DeltaDelta15, Default2018 (ref. 16), Dense16 and PIGNet23) as baselines. The general idea of zero-shot learning is to transfer the knowledge contained in the training instances to the task of testing instance prediction24. This evaluation is designed to simulate the early stage of a lead-optimization campaign, where there is always a lack of compounds with known activity. For each test series we randomly selected one ligand as the reference ligand to infer the absolute binding affinities of the remaining ligands (see section Mathematical formulation), and this process was repeated ten times to avoid randomness. The performances of all methods on the FEP1 and FEP2 sets are summarized in Supplementary Data 1 and 2, respectively. Pearson’s correlation coefficient (R), Spearman’s rank correlation coefficient (ρ) and the pairwise root-mean-square error (r.m.s.e.pw) are used here (see section Determination of model performance). For PIGNet, the results were calculated using its officially reported code and weights. For other baselines, we utilized performance metrics as detailed in their respective original literature.

The results show that the performance of PBCNet is substantially better than that of all baselines except FEP+, meaning that PBCNet is the best of all high-throughput methods considered here. Moreover, the average r.m.s.e.pw of PBCNet on the FEP1 set reaches 1.11 kcal mol−1, which is very close to 1 kcal mol−1, and it also achieves the lowest average r.m.s.e.pw (1.49 kcal mol−1) on the FEP2 set. Supplementary Fig. 1 visualizes the model predictions, demonstrating a strong alignment between the predicted ∆pIC50 values and the corresponding experimental values across the majority of the test series (ΔpIC50 is the difference between the pIC50 values of two ligands; pIC50 is the negative logarithm of the molar IC50; and IC50, the half-maximal inhibitory concentration, is a type of binding affinity measure; see section Training dataset and data balance).

We also find that PBCNet is robust, with more stable performance across all testing series compared with other high-throughput baseline methods. This is evident from the Spearman’s rank correlation coefficient; PBCNet shows correlations of over 0.30 in all test series, whereas other high-throughput baseline methods show a more fluctuating ρ, such as Glide SP (CDK2, ρ = −0.36; Tyk2, ρ = 0.79). This phenomenon reflects the good generalization ability of PBCNet.

We can also observe that the performance of PBCNet on the FEP1 set is better than that on the FEP2 set, possibly due to the several out-of-domain samples in the FEP2 set. As a model for lead optimization, PBCNet is designed to infer the activity differences of structural analogs, which typically share high molecular similarity. To be closely consistent with this application scenario, the training set is composed of molecule pairs whose Tanimoto similarity scores are higher than 0.6 (ref. 25). Figure 2a shows the relationship between the model accuracy and molecule similarity, and an obvious negative correlation can be observed. The similarity-dependent performance of PBCNet is not surprising, because identifying molecules with different structures is more relevant to virtual screening than to lead optimization. Correspondingly, methods and models designed for virtual screening, such as Glide and PIGNet evaluated here, are typically poor at lead optimization. We further counted the proportions of ligand pairs with different similarity scores in the FEP1 and FEP2 sets (Fig. 2b). Figure 2b shows that the proportion of molecule pairs with a Tanimoto similarity score of less than 0.6 in the FEP2 set is substantially higher than that in the FEP1 set (70.4% versus 54.4%), which may explain the performance difference of our model between the FEP1 and FEP2 sets. However, PBCNet’s ranking performance on the FEP2 set still surpassed all the baselines, except for FEP+. Given this, we may conclude that PBCNet should be of practical value for guiding lead-optimization projects.
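As a sketch of how the 0.6 similarity threshold can be applied when assembling training pairs, the snippet below computes Tanimoto similarity with RDKit; the use of 2,048-bit Morgan fingerprints of radius 2 is an assumption, as the manuscript does not specify the fingerprint type.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def tanimoto(smiles_i, smiles_j, radius=2, n_bits=2048):
    """Tanimoto similarity of two ligands from Morgan (ECFP-like) fingerprints."""
    fp_i = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles_i), radius, nBits=n_bits)
    fp_j = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles_j), radius, nBits=n_bits)
    return DataStructs.TanimotoSimilarity(fp_i, fp_j)

# keep only analog pairs that resemble a lead-optimization scenario
smiles = ["c1ccccc1CC(=O)N", "c1ccccc1CC(=O)NC"]   # two illustrative analogs
if tanimoto(smiles[0], smiles[1]) > 0.6:
    print("pair retained for training")
```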

Fig. 2: Performance analysis of PBCNet on the FEP1 and FEP2 sets.
figure 2

a, Bar plot showing the change in model accuracy with pairwise molecule similarity. We split all pairwise samples in both test sets, ordered by Tanimoto similarity scores, into five bins (x axis), and calculated the mean absolute errors (MAEs) for each bin (y axis). The error bars represent 0.1 times the standard deviation (bin 0–0.2, n = 18; bin 0.2–0.4, n = 1,567; bin 0.4–0.6, n = 3,071; bin 0.6–0.8, n = 2,404; bin 0.8–1.0, n = 195). b, Bar plot showing the proportions of ligand pairs (y axis) with different Tanimoto similarity scores (x axis) in the FEP1 and FEP2 sets. The proportion of molecule pairs with a Tanimoto similarity score of less than 0.6 in the FEP2 set is substantially higher than in the FEP1 set (70.4% versus 54.4%), and all pairs with a Tanimoto similarity score of less than 0.2 are from the FEP2 set.

Source data

Finally, we also find our model is highly robust to small changes in ligand poses (specific information is provided in Supplementary Section 1).

Few-shot learning

We assumed the ranking ability of PBCNet to be inferior to that of FEP+ because FEP+ can sample various binding conformations. Other methods, except MM-GB/SA, only use a single snapshot, which provides less comprehensive information about the molecular binding process. However, PBCNet has two advantages over FEP+ in real-world applications. First, PBCNet is not limited by molecule throughput, allowing for comprehensive exploration of the lead-optimization space. According to public information9, running FEP+ for four perturbations per day requires eight commodity Nvidia GTX-780 graphics processing units (GPUs). In contrast, PBCNet takes only 0.9 s to calculate one perturbation on a commodity Nvidia V100 GPU. Through a rough performance conversion, PBCNet is ~100,000 times faster than FEP+. The second advantage is PBCNet’s flexibility. During a lead-optimization campaign, newly generated binding affinity data can be used to fine-tune PBCNet. Few-shot learning19 is used to achieve this. For each test congeneric series, we randomly selected several ligands (~2–10) as fine-tuning ligands with known binding affinity, which also serve as reference ligands in the inference phase. The remaining ligands are still the ligands to be tested (referred to as the new testing series). We repeated the above process ten times to avoid randomness.

The performances of the fine-tuned models on the new testing series are summarized in Supplementary Data 3 and Fig. 3. Figure 3 shows that the few-shot learning strategy substantially improves the performance of PBCNet, and the performance increases with the number of fine-tuning ligands. Supplementary Table 1 shows that the performances of the fine-tuned PBCNet on the new and original testing series are similar. This suggests that the performance improvement is not due to the bias resulting from the reduced length of the test series. This consistency is also essential for comparing the fine-tuned PBCNet and FEP+ under existing conditions. We find that, after fine-tuning, PBCNet’s ranking ability is comparable to that of FEP+. For example, PBCNet fine-tuned with four ligands even outperformed FEP+ in terms of Spearman’s rank correlation coefficient on the FEP1 set (0.724 versus 0.720).

Fig. 3: Change in performance of PBCNet as the number of fine-tuning ligands varies.
figure 3

The x axis of each subplot indicates the number of fine-tuning ligands, and the y axis indicates the model ranking performance. Blue dashed lines indicate the performance of FEP+. Error bars represent the standard deviation of the ranking performance for ten independent runs (n = 10). From the graphs we can see that the performance of PBCNet increases as the number of fine-tuning compounds increases.

Source data

Using PBCNet to accelerate lead optimization

In this section we test whether our model can efficiently identify high-activity compounds in a close-to-real-world lead-optimization scenario by comparing the order of model selection to the experimental order of synthesis, similar to the study of Jiménez-Luna and others15. We use active learning (AL)26, an uncertainty-guided algorithm, to intelligently prioritize sample acquisition. Data acquisition was simulated as iterative selection from each chemical series, with PBCNet as the active learner. In each series, the compound displaying the highest activity was used as the target ligand that needs to be identified. In cases where multiple compounds shared the same highest activity, we prioritized the earliest synthesized among them as the target ligand. In the first iteration, the earliest synthesized compound in each chemical series was chosen as the reference ligand, and activity values were evaluated across the remaining compounds. Subsequently, the three ligands with the highest predicted values were selected. If the target ligand was not among these three, they became new reference ligands for the next iteration. In the second iteration, the four existing reference ligands were paired to form a fine-tuning set for refining PBCNet. Both the predicted activity values and uncertainties (equations (10) and (11) in the Methods) of the remaining ligands were evaluated by the fine-tuned PBCNet. This evaluation guided the prioritization of three ligands, according to the predefined sampling method. This procedure was repeated until the target ligand was successfully identified.
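The selection loop described above can be summarized by the following schematic sketch; `model` and `acquire` are placeholders for a fine-tunable predictor and an acquisition function (for example, equation (34) in the Methods), and the interface is a hypothetical simplification rather than the released code.

```python
def simulate_lead_optimization(ligands, target, model, acquire, batch_size=3):
    """Iterative AL-based selection used in the simulation benchmark (schematic).

    `ligands` are ordered by synthesis date and `target` is the most active
    compound that the loop must recover.
    """
    references = [ligands[0]]                      # earliest synthesized compound
    remaining = list(ligands[1:])
    iteration = 0
    while target not in references:
        iteration += 1
        if iteration > 1:
            model.fine_tune(references)            # refine on compounds with known activity
        preds, uncerts = model.predict(remaining, references)
        scores = acquire(preds, uncerts, iteration)
        # pick the three ligands with the highest acquisition score
        order = sorted(range(len(remaining)), key=lambda i: scores[i], reverse=True)
        picked = [remaining[i] for i in order[:batch_size]]
        references += picked
        remaining = [lig for lig in remaining if lig not in picked]
    return iteration                               # rounds needed to reach the target
```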

We adopted three sampling methods with different settings (see section The sample method for simulation-based experiment). Results for this simulation-based benchmark are presented in Supplementary Data 4. We find that the strategies taking uncertainty into consideration are superior to the purely exploitation-oriented one, and the model-oriented as well as user-oriented strategies do not exhibit an obvious performance difference. The model-oriented AL strategy is selected as the representative for further comparison, and three metrics are used and computed as follows:

$${\mathrm{Advantage}}\,{\mathrm{order}}={\mathrm{Experimental}}\,{\mathrm{order}}-{\mathrm{Model}}\,{\mathrm{selection}}\,{\mathrm{order}}$$
(1)
$$\begin{array}{l}{\rm{Advantage}}\,{\rm{ratio}} \\ ={\frac{{\rm{Experimental}}\,{\rm{order}}-{\rm{Model}}\,{\rm{selection}}\,{\rm{order}}}{{\rm{Number}}\,{\rm{of}}\,{\rm{ligands}}}\times 100 \%}\end{array}$$
(2)
$$\begin{array}{l}{\rm{Efficiency}}\,{\rm{improvement}}\,{\rm{ratio}} \\ ={\frac{{\rm{Experimental}}\,{\rm{order}}-{\rm{Model}}\,{\rm{selection}}\,{\rm{order}}}{{\rm{Model}}\,{\rm{selection}}\,{\rm{order}}}\times 100 \%}\end{array}$$
(3)

The ‘advantage ratio’ represents the theoretical percentage of resources saved when utilizing PBCNet for guiding lead optimization, compared to not using it. The ‘efficiency improvement ratio’ represents the increase in efficiency when completing a compound optimization project before and after using PBCNet, assuming that a project ends after obtaining the most active compound.
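A direct translation of equations (1)-(3) into code could look as follows; the 1-based ‘order’ convention is an assumption based on the text.

```python
def optimization_metrics(experimental_order, model_selection_order, n_ligands):
    """Benchmark metrics of equations (1)-(3) for a single congeneric series.

    Orders are the (1-based) positions at which the most active compound was
    synthesized experimentally or selected by the model.
    """
    advantage_order = experimental_order - model_selection_order              # equation (1)
    advantage_ratio = advantage_order / n_ligands * 100.0                     # equation (2), %
    efficiency_improvement = advantage_order / model_selection_order * 100.0  # equation (3), %
    return advantage_order, advantage_ratio, efficiency_improvement
```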

In six out of nine datasets, AL-equipped PBCNet attained the compound with the highest affinity earlier than its experimental synthesis order. On average, it accelerated the lead-optimization projects by ~473%, while also achieving an ~30% reduction in resource investment. Surprisingly, for the BCL6, sEH and AAK1 targets, the compounds with the highest affinity were found by PBCNet in the first iteration without the fine-tuning operation. We compared our results to the baseline MM-GB/SA, which was implemented using the Schrödinger Prime MM-GBSA with default settings. The results, presented in Supplementary Table 2, demonstrate that PBCNet consistently outperforms MM-GB/SA across all evaluated metrics. Overall, the results are very promising and suggest that PBCNet could be successfully applied in a prospective scenario to accelerate lead optimization.

Model interpretability analysis

Atom level

Given PBCNet’s impressive performance, it is valuable to investigate how the model makes predictions. Because PBCNet is attention-based, the attention score between a pair of atoms can be seen as a measure of importance. A strong model should assign high scores to atom pairs forming key intermolecular interactions. To illustrate this, we performed a case study on two different ligands in the FEP1 set, focusing on identifying hydrogen bonds27, which are crucial and common intermolecular interactions.

We first computed the intermolecular interactions between the ligands and proteins with Schrödinger 2020-4. Because the positions of the hydrogen atoms depend heavily on the program used to add hydrogens, we did not take them into account. For hydrogen-bond donors, we selected the heavy atoms covalently linked with hydrogen atoms for further analysis. We then extracted the attention weights, generated in the last layer of the distance-aware edge-to-node (DEN) block (Methods), of the atoms involved in the formation of hydrogen bonds. The results of these operations are illustrated in Fig. 4, and the intermolecular interactions computed by Schrödinger are summarized in Supplementary Table 3.

Fig. 4: Node-level interpretability analysis results of PBCNet on two ligands.
figure 4

a,b, A thrombin inhibitor 6a (a) and a JNK1 inhibitor 18660-1 (b). The molecular structure, three-dimensional hydrogen-bond visualization graphs and attention visualization graphs are shown for comparison. In each attention visualization graph, the ligand atom (referred to as target atom) is denoted by a purple dot, indicated by an arrow and is involved in the formation of hydrogen bonds. Other dots denote the neighbor atoms of the target atom. The black dots represent the ligand atoms (including the virtual aromatic nodes in the ligand structure) covalently linked with the target atom, the gray ones represent the protein pocket atoms (including the virtual aromatic nodes in the protein structure) linked with the target atom by virtual distance edges and the dot in blue denotes the protein pocket atom that forms the hydrogen bond with the target atom. The color of the edges is coded based on their attention score, and an edge with a dark color is favorable for protein–ligand binding.

Source data

Compound 6a from the thrombin series forms three hydrogen bonds with the target at the 3, 8 and 10 positions (Fig. 4a). We found that the hydrogen bonds formed at the 3 and 10 positions are highlighted. The covalent bonds are also emphasized. This is consistent with a chemical prior that the chemical environment of a ligand atom is largely determined by its covalently linked atoms and the protein atoms involved in key intermolecular interactions. It reveals that PBCNet is able to capture key intermolecular interactions. The computed hydrogen bond at the 8 position is not emphasized, unlike its counterparts at the 3 and 10 positions, possibly due to the relatively weaker hydrogen-bond donor nature of the amide-donor hydrogen atom28. Compound 18660-1 from the JNK1 series forms two hydrogen bonds with the target at the 12 and 18 positions (Fig. 4b). As expected, all of them are highlighted. Moreover, the carbon atom of 18660-1 at the 5 position, which does not form any key intermolecular interaction (computed by Schrödinger), was selected as a negative sample. We can clearly see that only covalent bonds are assigned relatively high attention scores, while the attention scores of the virtual distance bonds are small and uniform in value. The above results all reflect the rationality of the prediction basis of our model.

Substructure level

Medicinal chemists prefer to investigate molecular properties in terms of chemically meaningful fragments rather than individual atoms29. Therefore, we extended our analysis to include substructure-level interpretability.

In this analysis, we employed the substructure mask explanation (SME) methodology, as recently proposed by Wu and others29. Let the model’s prediction value for a compound be denoted \({\hat{y}}\). The compound is split into substructures using the BRICS method. Then, the hidden representations of the atoms of each substructure are masked in turn during the readout phase, yielding the corresponding prediction value \({\hat{y}}_{{\rm{sub}}_{i}}\), where the subscript subi represents the ith substructure. When the predicted value represents the compound’s activity, we consider that a greater decrease in \({\hat{y}}_{{\rm{sub}}_{i}}\) compared to \({\hat{y}}\) indicates that the corresponding substructure plays a more crucial role in the model’s prediction. Thus, the attribution scores used to quantify the importance of each substructure are defined by the following equation:

$${{\rm{Attribution}}}_{{\rm{sub}}_{i}}={\hat{y}-{\hat{y}}_{{\rm{sub}}_{i}}}$$
(4)

and we normalize the attribution scores to normalized attribution scores (Attribution_N) within a range of 0 and 1, according to

$${{{\rm{Attribution}}}\_{{{N}}}}_{{\rm{sub}}_{i}}={\frac{{{\rm{Attribution}}}_{{\rm{sub}}_{i}}}{\mathop{\sum }\nolimits_{i=1}^{N}{{\rm{Attribution}}}_{{\rm{sub}}_{i}}}}$$
(5)

where N is the number of substructures.
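A schematic sketch of the SME procedure, under the assumption that the model exposes a `predict(graph, masked_atoms=...)` call that zeroes the hidden representations of the listed atoms during the readout phase (this interface is hypothetical):

```python
def substructure_attributions(predict, graph, substructure_atom_ids):
    """SME-style attribution scores (equations (4) and (5)), schematically.

    substructure_atom_ids: one list of atom indices per BRICS fragment.
    """
    y_full = predict(graph, masked_atoms=None)              # prediction on the intact ligand
    raw = [y_full - predict(graph, masked_atoms=atoms)      # equation (4)
           for atoms in substructure_atom_ids]
    total = sum(raw)
    return [score / total for score in raw]                 # equation (5), normalized scores
```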

Here, we take compound 6a from the thrombin system as a case study, using compound 1a as the reference ligand to illustrate PBCNet’s activity prediction for compound 6a (Fig. 5a). Compound 6a was segmented into seven substructures using the BRICS method, with the amide group being split into two separate substructures. To provide a more intuitive representation for medicinal chemists, we manually merged these back into a single amide substructure (Supplementary Table 4). The visualization is presented in Fig. 5b.

Fig. 5: Result of PBCNet’s interpretability analysis on the substructure level.
figure 5

a, The binding modes of compound 6a (cyan) and 1a (purple) within the protein pocket. Red nodes indicate oxygen atoms, dark blue nodes indicate nitrogen atoms, green nodes indicate chlorine atoms, white nodes indicate polar hydrogen atoms and the rest of the nodes indicate carbon atoms. b, Visualization of the analysis: each substructure is color-coded according to its normalized attribution score.

Source data

As shown, we found that Sub4 and Sub1 (Supplementary Table 4) have the greatest impact on the predictive results. PBCNet is designed to predict relative binding affinities, which are predominantly derived from the differing substructures of a pair of ligands. Sub4, being the part of compound 6a that structurally deviates from compound 1a, is emphasized, suggesting that PBCNet indeed captures the structural differences between input ligands. Moreover, as depicted in Fig. 4a, Sub1 forms two hydrogen bonds with the protein, so the emphasis on Sub1 also implies that PBCNet focuses on key molecular motifs that form intermolecular interactions.

Ablation experiments

To enhance the performance of PBCNet, we implemented various strategies, which can be broadly divided into two categories: framework-related and knowledge-related. The former includes the SNN architecture and the classification assistance task, while the latter incorporates physical and prior knowledge. To verify whether these strategies really contribute to the model performance improvement, we performed the following ablation experiments on PBCNet.

PBCNet stands out due to its SNN framework with paired inputs. We constructed a single-input model termed ‘Singular PBCNet’ to remove the SNN framework. Meanwhile, to verify the effect of pairwise separability on the SNN framework, we built a pairwise separated model referred to as ‘Separated PBCNet’. Their frameworks are shown in Supplementary Fig. 2. We also removed the classification auxiliary task and obtained ‘MSE PBCNet’. Note that Singular PBCNet and Separated PBCNet lack the auxiliary task because they do not use molecular-pair information, so their performance should be compared with that of MSE PBCNet. The performance of the ablated models is shown in Supplementary Table 5.

Compared with PBCNet, MSE PBCNet showed a small decrease in performance on both the FEP1 and FEP2 sets (FEP1, 0.636 versus 0.629; FEP2, 0.513 versus 0.488). This aligns with expectations, as the auxiliary task addresses samples with small errors but wrong rankings, which constitute a small fraction of the dataset. Compared with MSE PBCNet, the performance of Singular PBCNet showed a substantial decrease on both the FEP1 set and the FEP2 set (FEP1, 0.629 versus 0.559; FEP2, 0.488 versus 0.372 (statistically significant)). This result illustrates the advantage of the SNN framework in relative binding affinity prediction. Compared with MSE PBCNet, the performance of Separated PBCNet decreases significantly on the FEP2 set (0.488 versus 0.425). These results suggest that the ability to simultaneously consider the structural information of both input molecules and their association is crucial for model performance.

We next removed the distance information, angle information and aromatic information, separately. The performance of the ablated PBCNet is shown in Supplementary Table 5. After removing any of the knowledge-related strategies, the performance of PBCNet decreases on both the FEP1 and FEP2 sets, especially the distance information. This phenomenon indicates that all three knowledge-related strategies contribute to the performance of PBCNet.

Discussion

AI has gained prominence in solving scientific problems by incorporating domain-specific knowledge into its modeling. PBCNet is an example of this integration of physical knowledge into its framework. However, there are still avenues for improvement. First, although PBCNet shows substantial predictive advancements over prior attempts, its zero-shot performance is lower than that of Schrödinger’s FEP+. Therefore, capturing protein conformational changes prompted by ligand binding, as FEP+ does, remains an ongoing pursuit to improve model accuracy. Second, the underlying assumption of this study is that similar ligands exhibit similar binding modes. Therefore, extreme cases, where highly similar ligands bind to the protein with entirely different binding modes, may pose challenges for PBCNet. Furthermore, PBCNet still relies on medicinal chemists for molecule design and on molecular docking for binding-pose generation. An end-to-end pipeline that integrates molecular generation, docking and optimization could circumvent cumulative errors in the lead-optimization process.

In the future, we will continue to refine our modeling strategies to enhance PBCNet’s predictive performance by considering the alterations of protein conformation and ligand pose. Simultaneously, we will also try to combine PBCNet with deep molecular generative models to streamline the automated design of high-potency molecules.

Methods

Mathematical formulation

In traditional modeling protocols (single-input modeling methods), suppose we are given a training set with N samples (protein–ligand complexes from the same congeneric series) \({\mathcal{D}}{\mathscr{=}}{\left\{{{\bf{x}}}^{{{(}}i{{)}}},\,{y}^{(i)}\right\}}_{i=1}^{N}\). Here, \({{\bf{x}}}^{{{(}}i{{)}}}\,{\boldsymbol{\in }}\,{{\mathbb{R}}}^{m}\) represents the feature vector of an input, m means its dimension and \({y}^{(i)}\,{\mathbb{\in }}\,{\mathbb{R}}\) is a real-valued property (pIC50 here). \({\mathcal{M}}\) is a deep learning-based regression model parameterized by weights θ and trained on \({\mathcal{D}}\), and \({\hat{y}}^{(i)}={\mathcal{M}}({{\bf{x}}}^{\left(i\right)}{;}\,{\mathbf{\uptheta }})\) represents the prediction result of \({\mathcal{M}}\) for x(i).

For Siamese models, however, these concepts are subject to slight change. First, N training samples are paired with each other to form \({N}\choose{2}\) paired training samples, and tuple p is used to index them:

$${p\in \left\{\left(i,\,j\right){\rm{|}}1\le i < j\le N\right\}}$$
(6)

where i and j correspond to indexes of the first and second complex of a paired sample. Then, the feature vector \({\widetilde{{\bf{x}}}}^{{{(}}i,\,j{{)}}}\) of a paired sample is dependent on x(i) and x(j). Here, \({\widetilde{{\bf{x}}}}^{{{(}}i,\,j{{)}}}\,{{\in }}\,{{\mathbb{R}}}^{3m}\) is constructed by the following equation:

$${{\widetilde{{\bf{x}}}}^{\left(i,\,j\right)}={{\bf{x}}}^{\left(i\right)}\oplus {{\bf{x}}}^{\left(j\right)}\oplus \left({{\bf{x}}}^{\left(i\right)}-{{\bf{x}}}^{\left(j\right)}\right)}$$
(7)

where \(\oplus\) is the concatenation operation. The label of a paired sample \({\widetilde{y}}^{(i,\,j)}\) (∆pIC50 here) is calculated according to

$${\widetilde{y}}^{\left(i,\,j\right)}={y}^{\left(i\right)}-{y}^{\left(j\right)}$$
(8)

Finally, the pairwise training dataset \({{\mathcal{D}}}_{p}={\left\{{\widetilde{{\bf{x}}}}^{(i,\,j)},\,{\widetilde{y}}^{(i,\,j)}\right\}}_{1\le i < j\le N}\) is obtained. \({{\mathcal{M}}}_{p}\) is a Siamese regression model parameterized by weights θp and trained on \({{\mathcal{D}}}_{p}\). \({\hat{y}}^{(i,\,j)}={{\mathcal{M}}}_{p}({\widetilde{{\bf{x}}}}^{(i,\,j)}{{;}}\,{{\mathbf{\uptheta }}}_{p})\) represents the prediction result of \({{\mathcal{M}}}_{p}\) for \({\widetilde{{\bf{x}}}}^{(i,\,j)}\).

For an unseen complex u whose feature vector is represented by x(u), we pair it with every complex in \({\mathcal{D}}\), which can be seen as a set of reference samples with known binding affinities in the inference phase, to obtain the pairwise test dataset \({\left\{{\widetilde{{\bf{x}}}}^{{{(}}i,\,u{{)}}},\,{\widetilde{y}}^{(i,\,u)}\right\}}_{i=1}^{N}\). \({{\cal{M}}}_{p}\) is able to output the corresponding N predictions \({\left\{{\hat{y}}^{\left(i,\,u\right)}\right\}}_{i=1}^{N}\), and the predicted absolute affinity of u \({\left\{{\hat{y}}_{i}^{\left(u\right)}\right\}}_{i=1}^{N}\) based on different reference samples can be obtained by the equations

$$\begin{array}{c}{\hat{y}}_{1}^{\left(u\right)}={y}^{\left(1\right)}-{\hat{y}}^{\left(1,\,u\right)}\\ {\hat{y}}_{2}^{\left(u\right)}={y}^{\left(2\right)}-{\hat{y}}^{\left(2,\,u\right)}\\ \vdots \\ {\hat{y}}_{N}^{\left(u\right)}={y}^{\left(N\right)}-{\hat{y}}^{\left(N,\,u\right)}\end{array}$$
(9)

The mean value and variance of \({\left\{{\hat{y}}_{{\rm{i}}}^{\left(u\right)}\right\}}_{i=1}^{N}\) can be deemed the final prediction \({\hat{y}}^{\left(u\right)}\) and uncertainty estimation \({{\sigma }^{2}}^{(u)}\) of u, respectively (equations (10) and (11)):

$${\hat{y}}^{\left(u\right)}={\frac{1}{N}\mathop{\sum }\limits_{i=1}^{N}{\hat{y}}_{i}^{\left(u\right)}}$$
(10)
$${{\sigma }^{2}}^{\left(u\right)}={\frac{1}{N}\mathop{\sum }\limits_{i=1}^{N}{\left({\hat{y}}^{\left(u\right)}-{\hat{y}}_{i}^{\left(u\right)}\right)}^{2}}$$
(11)

The structure of alternately updated message-passing neural network

A well-designed message-passing neural network (alternately updated message-passing neural network, AU-MPNN) is applied in the message-passing phase (Fig. 1a). Before the detailed introduction of AU-MPNN, some definitions need to be clarified. First, the complex of a ligand and the corresponding protein binding pocket is deemed a directed molecular graph G, in which all heavy atoms are treated as nodes (Nd), and all covalent bonds are treated as edges (E). Moreover, virtual distance edges are built between atom pairs of the ligand and the binding pocket, whose distances are less than or equal to 5.0 Å. Additionally, virtual aromatic nodes are set up for the centroid of each aromatic ring, and virtual aromatic edges are also established between virtual aromatic nodes and the nodes in corresponding aromatic rings. During message passing, all nodes (heavy atom nodes and virtual aromatic nodes) and all edges (covalent bond edges, virtual distance edges and virtual aromatic edges) are equivalent. Finally, the final whole graph G = 〈Nd, E〉 is constructed. Here, all edges are directed, and an edge \({e}_{\overrightarrow{{uv}}}\) indicates that its direction goes from node au to node av. If there is an edge \({e}_{\overrightarrow{{uv}}}\) in G, au is a neighbor node of av. In the following, av is assumed to be the target node whose representation needs to be updated. The set \(V_{nei}=\{a_{u_1},a_{u_2},a_{u_3},\cdots\}\) represents all neighbor nodes of av, and au refers to any neighbor node of av (Supplementary Fig. 3a). Correspondingly, the set \({UV}={\left\{{e}_{\overrightarrow{{u}_{1}v}},\,{e}_{\overrightarrow{{u}_{2}v}},\,{e}_{\overrightarrow{{u}_{3}v}},\,\cdots \right\}}\) is all incoming edges of av (edges that point to av). Moreover, \({e}_{\vec{{uv}}}\) is assumed to be the target edge that needs to be updated. The set \({U}_{\rm{nei}}={\left\{{a}_{{k}_{1}},\,{a}_{{k}_{2}},\,{a}_{{k}_{3}},\,\cdots \right\}}\) represents all neighbor nodes of au except av. The set \({KU}={\left\{{e}_{\overrightarrow{{k}_{1}u}},\,{e}_{\overrightarrow{{k}_{2}u}},\,{e}_{\overrightarrow{{k}_{3}u}},\,\cdots \right\}}\) stands for all neighbor edges of \({e}_{\overrightarrow{{uv}}}\), and \({e}_{\overrightarrow{{ku}}}\) refers to any neighbor edge of \({e}_{\overrightarrow{{uv}}}\) (Supplementary Fig. 3a).
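To make the graph definition concrete, the sketch below assembles nodes, directed edges and virtual aromatic nodes from heavy-atom coordinates, covalent-bond lists and ligand aromatic rings; the container format is an illustrative assumption, and pocket-side aromatic rings are omitted for brevity.

```python
import numpy as np

def build_complex_graph(lig_xyz, pock_xyz, lig_bonds, pock_bonds, lig_rings, cutoff=5.0):
    """Assemble the directed graph G = <Nd, E> described above, schematically.

    lig_xyz / pock_xyz: (n, 3) heavy-atom coordinate arrays
    lig_bonds / pock_bonds: lists of covalent-bond index pairs
    lig_rings: lists of atom indices, one per ligand aromatic ring
    """
    n_lig = len(lig_xyz)
    nodes = list(range(n_lig + len(pock_xyz)))              # ligand atoms, then pocket atoms
    edges = []

    # covalent bonds in both directions (all edges are directed)
    for u, v in list(lig_bonds) + [(a + n_lig, b + n_lig) for a, b in pock_bonds]:
        edges += [(u, v, "covalent"), (v, u, "covalent")]

    # virtual distance edges between ligand and pocket atoms within the cutoff
    dist = np.linalg.norm(lig_xyz[:, None, :] - pock_xyz[None, :, :], axis=-1)
    for i, j in zip(*np.where(dist <= cutoff)):
        edges += [(int(i), int(j) + n_lig, "distance"), (int(j) + n_lig, int(i), "distance")]

    # one virtual aromatic node per ligand aromatic ring, linked to its ring atoms
    coords = list(lig_xyz) + list(pock_xyz)
    for ring in lig_rings:
        centroid = np.mean([lig_xyz[a] for a in ring], axis=0)
        c = len(nodes)
        nodes.append(c)
        coords.append(centroid)
        for a in ring:
            edges += [(a, c, "aromatic"), (c, a, "aromatic")]
    return nodes, edges, np.array(coords)
```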

The specific architecture of AU-MPNN is shown in Supplementary Fig. 3c. In general, AU-MPNN consists of two phases: (1) distance- and angle-aware edge-to-edge (DAEE) blocks and (2) distance-aware edge-to-node (DEN) blocks. In the following sections, we give a detailed introduction to these two phases and the corresponding preparations.

Initial featurization

Node and edge features need to be defined before message passing. Here we use a total of 15 types of atomic feature (Supplementary Table 6) and five types of bond feature (Supplementary Table 7) to characterize them and their local chemical environment. Except for atomic mass, explicit valence, implicit valence and van der Waals (vdw) radius, the rest of these features are encoded in a one-hot fashion. Of note is that the feature vectors of virtual nodes and edges are set as zero vectors.

Initial hidden representations

Initial node and edge features should be further encoded as their initial hidden representations before the first step of message passing. Taking av and \({e}_{\overrightarrow{{uv}}}\) as examples, we initialize their hidden representations with

$${{\bf{h}}}_{v}^{0}={\rm{ReLU}}\left({W}_{{\rm{i}}-{\rm{node}}}\times{{\bf{x}}}_{v}+{b}_{{\rm{i}}-{\rm{node}}}\right)$$
(12)
$${{\bf{x}}}_{\overrightarrow{{uv}}}^{{\prime} }={\rm{ReLU}}\left({W}_{{\rm{i}}-{\rm{edge}}}\times{{\bf{x}}}_{\overrightarrow{{uv}}}+{b}_{{\rm{i}}-{\rm{edge}}}\right)$$
(13)
$${{\bf{h}}}_{\overrightarrow{{uv}}}^{0}={\rm{ReLU}}\left({W}_{\rm{i}}\times {\rm{cat}}\left({{\bf{h}}}_{u}^{0},\,{{\bf{x}}}_{\overrightarrow{{uv}}}^{{\prime} }\right)+{b}_{\rm{i}}\right)$$
(14)

where \({{\bf{x}}}_{v}\in {{\mathbb{R}}}^{{l}_{\rm{node}}}\) and \({{\bf{x}}}_{\overrightarrow{{uv}}}\in {{\mathbb{R}}}^{{l}_{\rm{edge}}}\) are initial features of av and \({e}_{\overrightarrow{{uv}}}\); \({{\bf{h}}}_{v}^{0}\in {{\mathbb{R}}}^{m}\), \({{\bf{h}}}_{u}^{0}\in {{\mathbb{R}}}^{m}\) and \({{\bf{h}}}_{\overrightarrow{{uv}}}^{0}\in {{\mathbb{R}}}^{m}\) are initial hidden representations of av, au and \({e}_{\overrightarrow{{uv}}}\), respectively; \({{\bf{x}}}_{\overrightarrow{{uv}}}^{{\prime} }\in {{\mathbb{R}}}^{\frac{m}{2}}\) is an intermediate vector to obtain \({{\bf{h}}}_{\overrightarrow{{uv}}}^{0}\); cat(∙) is the concatenate operation; \({W}_{{\rm{i}}-{\rm{node}}}\), \({W}_{{\rm{i}}-{\rm{edge}}}\) and Wi are learned matrices; and i means ‘initial’. This process is visualized in Supplementary Fig. 3b.
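Equations (12)-(14) amount to two linear-plus-ReLU encoders followed by a concatenation; a minimal PyTorch sketch (the layer names are ours, not those of the released implementation) is:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InitialEncoder(nn.Module):
    """Initial node/edge hidden representations of equations (12)-(14), schematic."""

    def __init__(self, l_node, l_edge, m):
        super().__init__()
        self.w_node = nn.Linear(l_node, m)            # W_i-node, b_i-node
        self.w_edge = nn.Linear(l_edge, m // 2)       # W_i-edge, b_i-edge
        self.w_init = nn.Linear(m + m // 2, m)        # W_i, b_i

    def forward(self, x_nodes, x_edges, src_index):
        # x_nodes: (n_nodes, l_node); x_edges: (n_edges, l_edge)
        # src_index[e] is the source node u of directed edge e = (u -> v)
        h0_node = F.relu(self.w_node(x_nodes))                        # equation (12)
        x_edge_prime = F.relu(self.w_edge(x_edges))                   # equation (13)
        h0_edge = F.relu(self.w_init(
            torch.cat([h0_node[src_index], x_edge_prime], dim=-1)))   # equation (14)
        return h0_node, h0_edge
```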

Distance and angle-aware edge-to-edge blocks (DAEE blocks)

The aim of this block is to use the information of the neighbor edges in KU to update the hidden representation of \({e}_{\overrightarrow{{uv}}}\). For \({e}_{\overrightarrow{{uv}}}\), the neighbor edges are not equally important. For example, a neighbor edge that stands for a key intermolecular interaction between ligand and protein should be highlighted. Hence, the attention mechanism in GAT30 is applied here. Moreover, considering that intermolecular interactions are determined by the atomic types and distances, atom pairwise statistical potentials31 are introduced as an additional attention bias term. Here, the Bayesian field theory-based potentials32 proposed by Zheng et al. are adopted. Additionally, the angle between two edges also constrains the formation of intermolecular interactions (for example, hydrogen bonds and halogen bonds). Thus, angle information is taken into account when computing the attention scores.

The computing process of this block is summarized in Supplementary Fig. 3c (left). First, on each step l, the queries of \({e}_{\overrightarrow{{uv}}}\) (\({{\bf{q}}}_{\overrightarrow{{uv}}}^{l}\)) and the keys of each of its neighbor edges \({e}_{\overrightarrow{ku}}\) (\({{\bf{k}}}_{\overrightarrow{{ku}}}^{l}\)) are obtained according to

$${{\bf{q}}}_{\overrightarrow{{uv}}}^{l}={W}_{q-{\rm{edge}}}^{l}\times {{\bf{h}}}_{\overrightarrow{{uv}}}^{l-1}+{b}_{q-{\rm{edge}}}^{l}$$
(15)
$${{\bf{k}}}_{\overrightarrow{{ku}}}^{l}={W}_{k-{\rm{edge}}}^{l}\times {{\bf{h}}}_{\overrightarrow{{ku}}}^{l-1}+{b}_{k-{\rm{edge}}}^{l}$$
(16)

where \({W}_{q-{\rm{edge}}}^{l}\) and \({W}_{k-{\rm{edge}}}^{l}\) are two learned matrices. According to the spatial coordinates of nodes ak, au and av, the angle θkuv between \({e}_{\overrightarrow{ku}}\) and \({e}_{\overrightarrow{{uv}}}\) can be computed. Then, we divide the angles into six angle domains with a width of \({\frac{\uppi }{6}}\) (Supplementary Fig. 3d), and encode them as the corresponding angle embedding. Here, the angle information is fused by extending the original attention mechanism in the GAT with angle-aware attention:

$${\varepsilon }_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}={{{\bf{w}}}_{{\rm{edge}}}^{{{l}}}}\cdot{\rm{LeakyReLU}}\left[{{\bf{q}}}_{\overrightarrow{{uv}}}^{l}+{{\bf{k}}}_{\overrightarrow{{ku}}}^{l}+{W}_{\rm{angle}}^{l}\times {{\mathrm{Divider}}}\left({\theta }_{{kuv}}\right)\right]$$
(17)

where Divider is used to map θkuv to the located angle domain one-hot vector, \({W}_{\rm{angle}}^{l}\) is a learned matrix, \({{\bf{w}}}_{\rm{edge}}^{l}\) is a learned vector and \({\varepsilon }_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}\) is the correlation coefficient of \({e}_{\overrightarrow{ku}}\) and \({e}_{\overrightarrow{{uv}}}\). After that, atom pairwise statistical potentials are converted as an additional bias term (pk,u) to combine distance information:

$${p}_{k,u}=\left\{\begin{array}{cl}1 & \mathrm{if}\ {e}_{\overrightarrow{ku}}\ \mathrm{is\ a\ covalent\ bond}\\ 2\times \log \left(P\left({\mathrm{type}}_{k},\,{\mathrm{type}}_{u},\,{\mathrm{dist}}_{\overrightarrow{ku}}\right)\right) & \mathrm{if}\ {e}_{\overrightarrow{ku}}\ \mathrm{is\ a\ virtual\ bond}\\ 0.8 & \mathrm{if}\ {\mathrm{type}}_{k}\ \mathrm{or}\ {\mathrm{type}}_{u}\ \mathrm{is\ not\ covered}\end{array}\right.$$
(18)
$${{\varepsilon }^{{\prime} }}_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}={\varepsilon }_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}+{p}_{k,\,u}$$
(19)
$${\alpha }_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}=\frac{\exp \left({{\varepsilon }^{{\prime} }}_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}\right)}{\sum_{{{e}_{\overrightarrow{ku}}}\in {KU}}\exp \left({{\varepsilon }^{{\prime} }}_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}\right)}$$
(20)

where typek and typeu are atomic types of ak and au; \({\rm{dist}}_{\overrightarrow{{ku}}}\) represents the distance between ak and au (meaning the length of \({e}_{\overrightarrow{{ku}}}\)); \({P\left(\cdot \right)}\) is the mapping function of atom pairwise statistical potentials; \({{\varepsilon }^{{\prime} }}_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}\) is the updated correlation coefficient of \({e}_{\overrightarrow{ku}}\) and \({e}_{\overrightarrow{{uv}}}\); and the final calculated attention score \({\alpha }_{\overrightarrow{{uv}},\,\overrightarrow{{ku}}}^{l}\) reflects how important \({e}_{\overrightarrow{ku}}\) is for \({e}_{\overrightarrow{{uv}}}\). Then, the message embedding (\({{\bf{m}}}_{\overrightarrow{{uv}}}^{l}\)) used to update the hidden representation of \({e}_{\overrightarrow{{uv}}}\) is computed according to:

$${{\bf{m}}}_{\overrightarrow{{uv}}}^{l}=\sum _{{e}_{\overrightarrow{{k}u}}\in {KU}}{\alpha }_{\overrightarrow{{uv}},\,\overrightarrow{{k}u}}^{l}\times {{\bf{k}}}_{\overrightarrow{{ku}}}^{l}$$
(21)

Finally, the updated hidden representation of \({e}_{\overrightarrow{{uv}}}\) (\({{\bf{h}}}_{\overrightarrow{{uv}}}^{l}\)) is acquired by residual connections by the following equation:

$${{\bf{h}}}_{\overrightarrow{{uv}}}^{l}={\rm{Res}}\left({\rm{Res}}\left({{\bf{h}}}_{\overrightarrow{{uv}}}^{l-1}+{W}_{{\rm{edge}}-2}^{l}\times{\rm{ReLU}}\left({W}_{{\rm{edge}}-1}^{l}\times {{\bf{m}}}_{\overrightarrow{{uv}}}^{l}\right)\right)\right)$$
(22)

where \({W}_{\,{\rm{edge}}-1}^{l}\) and \({W}_{{\rm{edge}}-2}^{l}\) are trained parameter matrices, and \({\rm{Res}}{(\cdot )}\) is the residual connection module (Supplementary Fig. 3e).
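A per-edge sketch of the DAEE attention step (equations (15)-(21)) is given below; `w_q`, `w_k` and `w_angle` are assumed to be `torch.nn.Linear` layers of output width m, `w_edge` a learned vector of length m, and batching over all edges is omitted for clarity.

```python
import torch
import torch.nn.functional as F

def daee_attention(h_uv, h_neighbors, angle_onehots, pair_bias, w_q, w_k, w_angle, w_edge):
    """Attention weights and message for a single target edge e_uv (schematic).

    h_uv:          (m,)   hidden representation of the target edge
    h_neighbors:   (K, m) hidden representations of its K neighbor edges
    angle_onehots: (K, 6) one-hot angle-domain encodings of theta_kuv
    pair_bias:     (K,)   statistical-potential bias p_{k,u} of equation (18)
    """
    q = w_q(h_uv)                                                              # equation (15)
    k = w_k(h_neighbors)                                                       # equation (16)
    eps = (w_edge * F.leaky_relu(q + k + w_angle(angle_onehots))).sum(dim=-1)  # equation (17)
    eps = eps + pair_bias                                                      # equation (19)
    alpha = torch.softmax(eps, dim=0)                                          # equation (20)
    message = (alpha.unsqueeze(-1) * k).sum(dim=0)                             # equation (21)
    return alpha, message
```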

Distance-aware edge-to-node blocks (DEN blocks)

The goal of this block is to use the information of the neighbor nodes in \({V}_{\rm{nei}}\) and the incoming edges in UV to update the hidden representation of av. The computing process of this block is summarized in Supplementary Fig. 3c (right). Similar to DAEE blocks, we also introduce the attention mechanism and additional distance-based bias term. Similarly, the message-passing phase of the DEN block operates according to

$${{\bf{q}}}_{v}^{l}={W}_{q-{\rm{node}}}^{l}\times {{\bf{h}}}_{v}^{l-1}+{b}_{q-{\rm{node}}}^{l}$$
(23)
$${{\bf{k}}}_{u}^{l}={W}_{k-{\rm{node}}}^{l}\times {{\bf{h}}}_{u}^{l-1}+{b}_{k-{\rm{node}}}^{l}$$
(24)

followed by

$${\varepsilon }_{u,\,v}^{l}={{{\bf{w}}}_{\rm{node}}^{l}}\cdot{\rm{L}}{\rm{eaky}}{\rm{R}}{\rm{e}}{\rm{LU}}\left({{\bf{q}}}_{v}^{l}+{{\bf{k}}}_{u}^{l}\right)$$
(25)
$${{\varepsilon }^{{\prime} }}_{{uv}}^{l}={\varepsilon }_{{uv}}^{l}+{p}_{u,\,v}$$
(26)
$${\alpha }_{u,\,v}^{l}={\frac{\exp \left({\varepsilon ^{\prime} }_{u,\,v}^{l}\right)}{\sum _{{a}_{{u}}\in {V}_{\rm{nei}}}\exp \left({\varepsilon ^{\prime} }_{u,\,v}^{l}\right)}}$$
(27)

followed by

$${{\bf{m}}}_{v}^{l}=\sum _{{e}_{\overrightarrow{{u}v}}\in {UV}}{\alpha }_{{u},\,v}^{l}\times {{\bf{h}}}_{\overrightarrow{{u}v}}^{l}$$
(28)
$${{\bf{h}}}_{v}^{l}={\rm{Res}}\left({\rm{Res}}\left({{\bf{h}}}_{v}^{l-1}+{W}_{{\rm{node}}-2}^{l}\times{\rm{ReLU}}\left({W}_{{\rm{node}}-1}^{l}\times {{\bf{m}}}_{v}^{l}\right)\right)\right)$$
(29)

Note that all the variables here correspond to those in the DAEE blocks.

Data collection and processing

Training dataset and data balance

In this study, the BindingDB protein–ligand validation sets (2020 version)33 were selected as the original training data source. A total of 1,265 congeneric series were included in the dataset, and, for each series, SMILES (Simplified Molecular Input Line Entry System) of the ligands, PDB IDs of the available cocrystal structures and corresponding binding affinity values were provided by the dataset.

The goal of data processing is to generate docking poses of all the ligands and their corresponding proteins by Glide as the input of our model. SMILES that failed during preparation with RDKit34 were removed. Binding affinity measurements without values, as well as uncertain ones (for example, qualified data with either the ‘<’ or ‘>’ sign), were discarded. The initial three-dimensional structures of the ligands were constructed using RDKit. Then, the ligands were further preprocessed for docking using the Schrödinger LigPrep module with default parameters. From the protein side, the PDB files were prepared using the Protein Preparation Wizard of the Schrödinger suite, following the default protocol. Resolved water molecules that made more than three hydrogen bonds to ligand or receptor atoms were kept, and the receptor grid generated for each protein structure was centered on the co-crystallized ligand. According to the statistics, 843 (out of 1,265) series possessed multiple available PDB files. For each of these congeneric series, a cross-docking experiment (taking the observed binding site from one protein–ligand complex and docking a different ligand into the site) was carried out to obtain the protein structure with the best pose prediction accuracy for further investigation35. After the pretreatment, docking was performed using the Glide module in Schrödinger with default parameters, and at most 100 poses per ligand were written out. Medicinal chemists have long recognized that ligands from the same chemical series tend to bind a given protein in similar poses36; therefore, a key step of pose selection was performed here. For each series, the maximum common substructure (MCS) of each ligand and the co-crystallized ligand was extracted first. Then, the r.m.s.d. between each pose of a ligand and the experimentally determined pose of the co-crystallized ligand over the MCS moiety was calculated, and if the r.m.s.d. was within 2.0 Å, the corresponding pose (referred to as an acceptable pose) was considered to share the same binding mode with the co-crystallized ligand. When there were multiple acceptable poses of a ligand, the pose with the highest Glide score was selected as the final pose. When no acceptable pose of a ligand could be obtained through docking, the ligand was discarded to ensure data quality. The above operations associated with Schrödinger were implemented with the 2020-4 version and the Schrödinger Python API. The Numpy37, Pandas38 and scikit-learn39 packages were used for data processing. Matplotlib40 was used for visualization.

A total of 1,007 (out of 1,265) series with IC50 affinity values were extracted (IC50 was the unit with the most data available), containing a diverse set of targets. The IC50 affinity values were then log-converted to avoid target scaling issues (pIC50 = −log10IC50). Accordingly, the pIC50 difference (ΔpIC50) between a pair of ligands from the same congeneric series was chosen as the model prediction target. Twenty-six congeneric series including only one ligand (which could not form ligand pairs) and ten congeneric series containing the same protein and ligand as the held-out test congeneric series (detailed in the next section) were also removed. As a result, there is no overlap between the test congeneric series and the training dataset. Finally, we obtained 971 congeneric series with an average of ~34 ligands per series.

Additionally, we found that the labels of the training data were normally distributed, and most of them were concentrated in the area of [−1, 1] (Supplementary Fig. 4a), which would easily lead to overfitting (a model is able to achieve a low training error as long as the model predicts the mean value of the training labels). Thus, we balanced the training data by undersampling the samples in the high-density regions and oversampling the samples in the low-density regions to alleviate this problem. The label distribution of the balanced training dataset is shown in Supplementary Fig. 4b. The final training dataset consists of 0.6 million pairwise samples.

Benchmark dataset for performance assessment

Datasets provided by Wang et al.9 and Schindler et al.6 were chosen as the held-out test sets and used to benchmark the performance of different methods for lead optimization in this study. Wang et al. provide eight congeneric series (referred to as the FEP1 set) on different targets with experimentally validated binding free energy ∆G values and corresponding evaluation statistics of FEP calculations. We converted ∆G values to the pIC50 range assuming non-competitive binding, generating the following equation for conversion:

$${\rm{p}}{{\rm{IC}}}_{50}\approx -{\log }_{10}\left({\rm{e}}^{\frac{\Delta {{G}}}{{{RT}}}}\right)$$
(30)

where R = 1.987 × 10−3 kcal K−1 mol−1 is the gas constant, T = 297 K is the thermodynamic temperature and e = 2.718 is the Euler number. Schindler et al. also provided eight congeneric series (referred to as the FEP2 set) with pharmaceutically relevant targets, all with experimentally measured binding affinities (IC50 values). Compared with the FEP1 set, the congeneric series in the FEP2 set contains changes in net charge and the charge distribution of molecules as well as ring openings and core hopping. For each series, we also log-converted the labels and paired the ligands as we did for the training data.
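A direct implementation of equation (30), using the constants given above:

```python
import math

R = 1.987e-3   # gas constant, kcal K^-1 mol^-1
T = 297.0      # temperature, K

def delta_g_to_pic50(delta_g):
    """Convert an experimental binding free energy (kcal mol^-1) to the pIC50
    scale via equation (30): pIC50 ~ -log10(exp(deltaG / (R * T)))."""
    return -math.log10(math.exp(delta_g / (R * T)))
```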

Benchmark dataset for simulation-based experiment

Apart from the assessment of model accuracy and ranking ability on whole congeneric series, we also intend to test whether our model is able to efficiently identify key high-activity compounds in a close-to-real-world lead-optimization scenario, by retrospectively comparing the order of model selection to the experimental order of synthesis, similar to Jiménez-Luna and others15. On this basis, we constructed a benchmark consisting of nine recently published datasets41,42,43,44,45,46,47,48,49 with available cocrystal structures and pharmaceutically relevant targets. All series were processed as we did for the training data. The information (for example, protein name and PDB ID) about the benchmark is summarized in Supplementary Table 8.

Determination of model performance

We include three different metrics used to determine the performance of the predictive models. Pearson’s correlation coefficient (R) and Spearman’s rank correlation coefficient (ρ) are used to evaluate the ranking ability, and r.m.s.e.pw is used to assess the accuracy of the predictive models.

Note that PBCNet requires at least one reference complex to infer the predictive affinities of other test samples and calculate the corresponding R and ρ. As a result, the test process was repeated ten times independently and the reference complex of each test process was randomly selected to simulate the uncertainty in real applications.

R.m.s.e. is defined as

$${\rm{R.m.s.e.}}={\sqrt{\frac{1}{N}\mathop{\sum }\limits_{u=1}^{N}{\left({y}^{\left(u\right)}-{\hat{y}}^{\left(u\right)}\right)}^{2}}}$$
(31)

where u corresponds to a test sample (a protein–ligand complex here); y(u) and \({\hat{y}}^{\left(u\right)}\) are the true label and prediction results of the test sample, respectively; and N is the total number of test samples. R.m.s.e.pw is defined as

$${{\rm{R.m.s.e.}}}_{{\rm{pw}}}={\sqrt{\frac{1}{N}\mathop{\sum }\limits_{u=1}^{N}{\left({\widetilde{y}}^{\left(i,\,u\right)}-{\hat{y}}^{\left(i,\,u\right)}\right)}^{2}}}$$
(32)

where (i, u) corresponds to a paired test sample composed of a test complex and any reference complex (from the same congeneric series), and \({\widetilde{y}}^{\left(i,\,u\right)}\) and \({\hat{y}}^{\left(i,\,u\right)}\) are the true label and prediction result of the paired test sample, respectively. Note that here we use r.m.s.e.pw to evaluate the accuracy of the models. The reason for this is that we use the experimental affinities of the reference complexes to convert between \({\hat{y}}^{\left(u\right)}\) and \({\hat{y}}^{\left(i,\,u\right)}\) (equations (9) and (10)), as Wang et al. and Schindler et al. did in their studies. Additionally, r.m.s.e.pw values of our model in both kcal mol−1 and pIC50 units are reported to allow comparison with baseline models from different studies.

Model training and fine-tuning process

As discussed in the Model structure section, a hybrid loss function is deployed in the training process with equation (33):

$${\rm{Loss}}_{\rm{total}}={\rm{Loss}}_{\rm{MSE}}+{\alpha {\rm{Loss}}}_{\rm{entropy}}$$
(33)

where α is a factor controlling the balance between the two types of loss, which can be seen as a hyperparameter. Here, α is set to 1, LossMSE is the mean-square-error loss, Lossentropy is the cross-entropy classification loss and Losstotal is the final loss. The aim of introducing the entropy loss is to penalize predictions with low errors but completely wrong rankings. For example, it is difficult for the regression loss function to penalize a sample with a label of 0.1 and a predicted value of −0.1 due to its low MSE value, but this can be effectively realized by the classification loss function. Additionally, the ranking information contained in the hidden representation of a paired sample may be further reinforced by the auxiliary task to improve the ranking ability of PBCNet.
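A minimal PyTorch sketch of the hybrid loss in equation (33); the construction of the binary classification target from the sign of the experimental difference is our assumption of how the auxiliary task is supervised.

```python
import torch.nn.functional as F

def hybrid_loss(pred_delta, pred_logit, true_delta, alpha=1.0):
    """Hybrid training loss of equation (33), schematically.

    pred_delta: predicted affinity differences y_hat^(i,j)
    pred_logit: logits for the probability that ligand i is more active than ligand j
    true_delta: experimental differences y_tilde^(i,j)
    """
    loss_mse = F.mse_loss(pred_delta, true_delta)
    label = (true_delta > 0).float()                 # 1 if ligand i is the more active one
    loss_entropy = F.binary_cross_entropy_with_logits(pred_logit, label)
    return loss_mse + alpha * loss_entropy
```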

Hyperparameter optimization was performed by grid search on the training data with inter-congeneric-series fivefold cross-validation. Considering the considerable number of training samples, 0.25 epochs was set as the unit of early stopping. In the final training process, the model was trained using a batch size of 96 samples for 5.75 epochs with a learning rate of 5e−7.

In the fine-tuning phase, we did not perform the auxiliary task of PBCNet. PBCNet was fine-tuned using a batch size of 30 samples for 10 epochs with a learning rate of 1e−5.

Sample method for simulation-based experiment

The sampling method we define here is as follows:

$${a}={\left\{\begin{array}{cc}\hat{y} & {N}_{\rm{ite}}=1\\ \hat{y}+\beta {\sigma }^{2} & {N}_{\rm{ite}}\ge 2\end{array}\right.}$$
(34)

where \({\hat{y}}\) and σ2 are the predicted activity value and uncertainty, a is the acquisition score, Nite is the number of iterations and β is a user-defined parameter adjusting the exploration–exploitation trade-off. Different values of β correspond to three different situations:

  • β is equal to zero. It is a purely exploitation-oriented AL scenario where the users do not take uncertainty into consideration.

  • β is more than zero (a hybrid AL scenario). This sampling strategy is model-oriented, or in favor of ‘exploration’. Samples with greater uncertainty have a higher possibility of being selected (meaning more of the structure–activity relationship will be explored), so that the fine-tuned model’s applicability domain may be expanded and the model is expected to give more reliable predictions in the following iterations.

  • β is less than zero. This sampling strategy is user-oriented or in favor of ‘exploitation’. In a real-world scenario, the compounds with the highest predicted activity values will be selected for further experimental verification. However, compounds with greater uncertainty are more likely to be overestimated. Given this point, users may tend to treat uncertainty as a penalty term to ensure the data quality in this iteration.

The strategies mentioned above are all simulated in our work (β = 0, 2, −2, respectively), and six independent runs with different random seeds are conducted.
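The acquisition rule of equation (34) can be written as a small helper (parameter names are illustrative):

```python
def acquisition_score(pred, uncertainty, n_iteration, beta=2.0):
    """Acquisition score of equation (34).

    beta > 0 favors exploration, beta < 0 penalizes uncertain predictions and
    beta = 0 recovers the purely exploitation-oriented strategy; the first
    iteration always uses the prediction alone.
    """
    if n_iteration == 1:
        return pred
    return pred + beta * uncertainty
```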

Statistics and reproducibility

The P values to test for differences in ablation experiments were calculated using a two-sided Wilcoxon signed rank test. The sample size for each analysis was determined by the maximum number of eligible samples available in the respective datasets. The study design did not require blinding. The model’s performance testing involves randomness in the selection of test and reference samples. To mitigate its impact, we conducted multiple repeated experiments using controlled random seed settings (n = 10). To reproduce the primary results of this research, refer to the analytical pipeline available at https://doi.org/10.5281/zenodo.8275244 (ref. 50).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this Article.