Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Systematic Review Article

A Systematic Review of the Application of Machine Learning in CpG Island (CGI) Detection and Methylation Prediction

Author(s): Rui Wei, Le Zhang, Huiru Zheng* and Ming Xiao*

Volume 19, Issue 3, 2024

Published on: 03 July, 2023

Page: [235 - 249] Pages: 15

DOI: 10.2174/1574893618666230508104341

Abstract

Background: CpG island (CGI) detection and methylation prediction play important roles in studying the complex mechanisms by which CGIs participate in genome regulation. In recent years, machine learning (ML) has increasingly been applied to CGI detection and CGI methylation prediction in order to improve on the accuracy of traditional methods. However, there are few systematic reviews of the application of ML to these two tasks. This systematic review therefore aims to provide an overview of the application of ML in CGI detection and methylation prediction.

Methods: The review followed the PRISMA guideline. The search strategy was applied to articles in PubMed published from 2000 to July 10, 2022. Two independent researchers screened the articles according to the retrieval strategy and identified a total of 54 articles. We then developed quality assessment questions to assess study quality and retained 46 articles that met the eligibility criteria. Based on these articles, we first summarized the applications of ML methods in CGI detection and methylation prediction, and then identified the strengths and limitations of these studies.

Results: We discuss the challenges and future research directions.

Conclusion: This systematic review will contribute to the selection of algorithms and the future development of more efficient algorithms for CGI detection and methylation prediction.

© 2024 Bentham Science Publishers