Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Review Article

A Review of DNA Data Storage Technologies Based on Biomolecules

Author(s): Lichao Zhang, Yuanyuan Lv, Lei Xu* and Murong Zhou*

Volume 17, Issue 1, 2022

Published on: 13 August, 2021

Pages: 31-36 (6 pages)

DOI: 10.2174/1574893616666210813101237

Abstract

In the information age, data storage technology has become key to improving computer systems. Because traditional storage technologies cannot meet the demand for massive storage, DNA storage, a new technology based on biomolecules, has attracted much attention. DNA storage uses artificially synthesized deoxynucleotide chains to store and retrieve information of any kind, such as documents, pictures, and audio. First, the data are encoded as binary strings. Then, the four bases, A (adenine), T (thymine), C (cytosine), and G (guanine), are used to encode the binary digits, which yields the sequences of the target DNA molecules as deoxynucleotide chains. Finally, the corresponding DNA molecules are artificially synthesized, so that the data are stored within them. Compared with traditional storage systems, DNA storage offers major advantages, including high storage density, long storage lifetime, low hardware cost, high access parallelism, and strong scalability, which together meet the demands of big data storage. This manuscript first reviews the origin and development of DNA storage technology, then introduces its storage principles, contents, and methods, and finally analyzes the development of the technology. From the initial research to the cutting edge of this field and beyond, the advantages, disadvantages, and practical applications of DNA storage technology require continuous exploration.

Keywords: DNA storage, deoxynucleotide chain, base, binary number, DVDs, CDs.
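
The encoding steps outlined in the abstract (data to binary string, binary string to bases) can be illustrated with a minimal sketch. The two-bits-per-base mapping below (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode` and `decode` are assumptions chosen for illustration; the abstract does not specify a particular coding scheme, and practical schemes typically add constraints and error correction that are omitted here.

```python
# Hypothetical 2-bit-per-base mapping; the abstract does not prescribe one.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a binary string, then map each bit pair to a base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Map each base back to its bit pair, then regroup the bits into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # base sequence that would be sent for synthesis
    print(decode(strand))  # original bytes recovered after sequencing
```

Under this mapping each byte becomes four bases, so reading the data back is the reverse lookup followed by regrouping the bits into bytes.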

© 2024 Bentham Science Publishers