Abstract
Interactions between RNAs and proteins play critical roles in many biological processes, so characterizing them is essential for mechanistic, biomedical, and clinical studies. Many experimental methods can determine different aspects of RNA-protein interactions. However, because these interactions are tissue-specific and condition-specific, and because they are often weak and frequently compete with each other, experimental techniques alone cannot uncover the complete spectrum of RNA-protein interactions. To mitigate these limitations, continuous effort has been devoted to developing high-quality computational techniques for studying RNA-protein interactions. Considerable progress has been achieved through novel techniques and strategies, particularly machine learning. In particular, with the development and application of CLIP techniques, an increasing amount of experimental data on RNA-protein interactions under specific biological conditions has become available. Together, these CLIP data provide a rich resource for developing advanced machine learning predictors. This review summarizes recent progress in computational predictors of RNA-protein interactions in three aspects: datasets, prediction strategies, and input features. Possible future developments are discussed at the end of the review.
Keywords: RNA-protein interaction, RNA-binding protein, RNA-binding domain, RNA-binding motif, RNA-binding residue, protein-binding nucleotide, machine learning, deep learning, meta-strategy, UniProt, PDB, CLIP, sequence feature, structural feature, physicochemical feature, evolutionary information, PSSM.