Enhancing Drug-Target Binding Affinity Prediction through Deep Learning and Protein Secondary Structure Integration

Runhua Zhang; Baozhong Zhu; Tengsheng Jiang; Zhiming Cui; Hongjie Wu

doi:10.2174/0115748936285519240110070209

Abstract

Background: Conventional approaches to drug discovery are often lengthy and costly. To expedite the discovery of new drugs, the use of artificial intelligence (AI) to predict drug-target binding affinity (DTA) has emerged as a crucial approach. Despite the proliferation of deep learning methods for DTA prediction, many of these methods concentrate primarily on the amino acid sequence of proteins. Yet the interactions between drug compounds and targets occur at distinct segments of the protein structure, whereas the primary sequence mainly captures global protein features. Consequently, sequence-only representations fall short of fully elucidating the intricate relationship between drugs and their respective targets.

Objective: This study aims to employ advanced deep-learning techniques to forecast DTA while incorporating information about the secondary structure of proteins.

Methods: In our research, both the primary sequence and the secondary structure of the protein were leveraged for protein representation. The primary sequence served as the global feature, while the secondary structure was employed as the localized feature. Convolutional neural networks and graph neural networks were utilized to independently model the intricate features of target proteins and drug compounds. This approach enhanced our ability to capture drug-target interactions.
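
As a rough illustration of how a protein might be presented to a convolutional network as two aligned tracks, the sketch below integer-encodes a primary sequence and a secondary-structure string and pads both to a fixed length. The abstract does not specify the encoding, so the alphabets (20 standard amino acids, a three-state H/E/C code), the padding index, the maximum length, and the function names `encode` and `encode_protein` are all assumptions made for this example.

```python
# Hypothetical alphabets (not taken from the paper): 20 standard amino acids
# and a 3-state secondary-structure code (H = helix, E = strand, C = coil).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SS_STATES = "HEC"

# Index 0 is reserved for padding and unknown symbols.
AA_INDEX = {ch: i + 1 for i, ch in enumerate(AMINO_ACIDS)}
SS_INDEX = {ch: i + 1 for i, ch in enumerate(SS_STATES)}


def encode(text, index, max_len):
    """Map each character to an integer, then truncate or zero-pad to max_len."""
    ids = [index.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def encode_protein(sequence, secondary_structure, max_len=10):
    """Return two aligned integer tracks: primary sequence and secondary structure."""
    if len(sequence) != len(secondary_structure):
        raise ValueError("sequence and secondary structure must have equal length")
    return (
        encode(sequence, AA_INDEX, max_len),
        encode(secondary_structure, SS_INDEX, max_len),
    )


# Toy example: an 8-residue fragment with a made-up secondary-structure string.
seq_ids, ss_ids = encode_protein("MKTAYIAK", "CCHHHHEC")
```

Each track could then be passed through an embedding layer and a 1D convolution; how the two tracks are combined in the authors' model is not stated in the abstract.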

Results: We have introduced a novel method for predicting DTA. In comparison to DeepDTA, our approach demonstrates significant enhancements, achieving a 3.9% increase in the Concordance Index (CI) and a remarkable 34% reduction in Mean Squared Error (MSE) when evaluated on the KIBA dataset.
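
For readers unfamiliar with the two metrics, the following minimal sketch computes MSE and a pairwise Concordance Index in the form commonly used in DTA benchmarks (prediction ties count as half). It is not the authors' evaluation code, and the affinity values are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    A pair (i, j) is comparable when y_true[i] > y_true[j]. It scores 1 if
    y_pred[i] > y_pred[j], 0.5 if the predictions tie, and 0 otherwise.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(n):
            if y_true[i] > y_true[j]:
                comparable += 1
                if y_pred[i] > y_pred[j]:
                    concordant += 1.0
                elif y_pred[i] == y_pred[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else 0.0


# Illustrative values only; not data from the study.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 5.0, 7.5, 8.0]

mse = mean_squared_error(y_true, y_pred)
ci = concordance_index(y_true, y_pred)
```

A higher CI means the model ranks drug-target pairs more consistently with their measured affinities, while a lower MSE means the predicted values are closer to the measurements.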

Conclusion: Our results demonstrate that augmenting DTA prediction with the protein's secondary structure as a localized feature yields significantly improved accuracy compared with relying solely on the primary structure.

DOI: https://dx.doi.org/10.2174/0115748936285519240110070209
Print ISSN: 1574-8936
Online ISSN: 2212-392X
Publisher: Bentham Science Publisher
Journal: Current Bioinformatics
