Abstract
In recent years, the successful completion of the Human Genome Project has made it clear that genetic, environmental, and lifestyle factors must be studied together to understand cancer, given the complexity and varied forms of the disease. The increasing availability and rapid growth of ‘big data’ derived from various omics technologies open a new window for the study and treatment of cancer. In this paper, we introduce the application of machine learning methods to cancer big data, including artificial neural networks, support vector machines, ensemble learning, and naïve Bayes classifiers.
Keywords: Big data, Machine learning, Next-generation sequencing, High-throughput sequencing, Support vector machine, Naïve Bayes classifier, Artificial neural network, Ensemble learning, AdaBoost, Bagging.