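Abstract
Cancer is one of the leading causes of death worldwide, and the angiogenesis that underlies it is one of the hallmarks of cancer. Efforts are underway to discover anti-angiogenic peptides (AAPs) as a promising therapeutic route that targets the formation of new blood vessels. Identifying AAPs therefore offers a viable way to understand their mechanistic properties, which is relevant to the discovery of new anticancer drugs. Although public databases contain an abundance of peptide sequences, experimental identification of AAPs has progressed very slowly because it is costly and laborious. Owing to its inherent ability to make sense of large volumes of data, machine learning (ML) is a valuable technique for peptide-based drug discovery. In this review, we present a comprehensive comparative analysis of the feature descriptors, ML algorithms, cross-validation methods, and prediction performance of ML-based AAP predictors. We also discuss the general framework of these AAP predictors and its inherent weaknesses. In particular, we explore future prospects for improving prediction accuracy and model interpretability, which offer interesting routes to overcoming some of the inherent weaknesses of existing AAP predictors. We anticipate that this review will help researchers rapidly screen and identify AAPs that are promising for clinical use.
Keywords: anti-angiogenic peptides, therapeutic peptides, classification, machine learning, feature representation, feature selection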