Abstract
In the information age, data storage technology has become key to improving computer systems. Because traditional storage technologies cannot meet the demand for massive storage, DNA storage, a new technology based on biomolecules, has attracted much attention. DNA storage uses artificially synthesized deoxynucleotide chains to store and retrieve information such as documents, pictures, and audio. First, the data are encoded as binary strings. Then the four bases, A (adenine), T (thymine), C (cytosine), and G (guanine), are used to encode the binary digits, so that the data are represented as the sequence of a target deoxynucleotide chain. The corresponding DNA molecules are then artificially synthesized, and the data are stored within them. Compared with traditional storage systems, DNA storage offers major advantages, including high storage density, long storage duration, low hardware cost, high access parallelism, and strong scalability, which together meet the demands of big data storage. This manuscript first reviews the origin and development of DNA storage technology, then introduces its storage principles, contents, and methods, and finally analyzes the development of the technology. From the initial research to the cutting edge of the field and beyond, the advantages, disadvantages, and practical applications of DNA storage technology require continued exploration.
Keywords: DNA storage, deoxynucleotide chain, base, binary number, DVDs, CDs.
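The encoding pipeline outlined in the abstract (data to binary string, binary string to bases) can be illustrated with a minimal sketch. The mapping used here (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode` and `decode` are assumptions chosen for illustration; the abstract does not specify a particular scheme, and practical schemes typically add constraints and error correction that this sketch omits.

```python
# Hypothetical 2-bits-per-base mapping (an assumption; the source names no mapping).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a binary string, then map each bit pair to one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Map each base back to its bit pair, then regroup the bits into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases under this mapping, so the strand length is four times the byte count. The sketch covers only the digital encoding and decoding steps; synthesis and sequencing of the DNA molecules are outside its scope.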