Disclaimer: This blog post produces data, reading this blog post produces data, sharing this blog post produces data – while every cell in your body produces data.
DNA Data Storage – how we can use biology for technology
The need is huge!
We produce data every day, so much that the worldwide total is expected to grow to 250 zettabytes by 2025. 66% of this data will need to remain accessible for the next 20 years, while 27% will need to be archived for longer than 100 years. This is a major challenge for technology, because the demands on storage media keep growing in terms of both capacity and longevity.
What’s the role of DNA?
The idea of using DNA as a storage medium emerged as early as the late 1950s and has been researched and further developed by many scientists ever since. The advantages of DNA data storage are obvious. Organized into 23 pairs of chromosomes, we humans carry DNA in every diploid cell of our bodies. If the DNA of a single cell is stretched out, it reaches a length of about 2 meters. Because it is so tightly and neatly packed, however, the nucleus that contains the entire human genome measures no more than 6 µm in diameter.
On top of all that, DNA manages to encode the instructions for 20,000 to 25,000 proteins using an alphabet of only 4 letters. Converted to digital terms, a diploid genome can store about 1.5 gigabytes of data. And now consider that the human body consists of tens of trillions of cells! Since DNA can encode 2 bits per nucleotide, one gram of dried DNA can theoretically store 455 exabytes of data.
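To make the "2 bits per nucleotide" idea concrete, here is a minimal Python sketch. The mapping of bit pairs to the letters A, C, G and T is an arbitrary assumption made for illustration, and the genome size of 6 billion base pairs is a rounded figure; real DNA storage schemes add addressing and error correction on top of this.

```python
# Assumed mapping for illustration only: any fixed pairing of 2 bits to 1 base works.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, 2 bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a nucleotide string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def capacity_gigabytes(base_pairs: float, bits_per_base: int = 2) -> float:
    """Raw capacity of a sequence in gigabytes (1 GB = 1e9 bytes)."""
    return base_pairs * bits_per_base / 8 / 1e9


strand = encode(b"DNA")
print(strand)                    # 3 bytes become 12 nucleotides
print(decode(strand))            # round trip back to the original bytes
print(capacity_gigabytes(6e9))   # assumed diploid genome of ~6 billion base pairs
```

With the assumed 6 billion base pairs, the last line reproduces the 1.5 gigabytes quoted above: each base pair carries 2 bits, so four base pairs make one byte.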
Less waste, extreme durability, minimal size – no problem!
When you think about current storage media, you can’t help but worry about their durability. Data must be constantly migrated to compensate for equipment failures and technology upgrades, which is costly and creates a lot of electronic waste. In addition, most long-term mass storage requires a constant supply of energy (for cooling, among other things), which makes data centers big emitters of CO2.
As one of the most robust biomolecules found in nature, DNA provides a welcome solution to this problem. It is non-volatile and, unlike silicon, does not require lithography, which makes it very hard to beat as a storage medium in terms of price/performance ratio.
DNA also lasts. Mitochondrial DNA in the bones of the moa (a flightless bird that lived in the forests of New Zealand until 1300 AD) degrades at a rate of 1 bp per 6,830,000 years at a temperature of -5°C. No modern storage medium comes even close to that.
There’s got to be a catch!
The biggest obstacle that still needs to be overcome is generating the DNA strand that stores our data. Writing DNA from scratch is still time-consuming and expensive, because the process is poorly automated and writing speed is low.
Writing speed has steadily increased thanks to ‘chip’-based DNA synthesis, which runs many syntheses in parallel in a small space. This technique makes Twist Bioscience the fastest writer of short oligos. However, the final assembly of short oligos into a DNA strand that stores exabytes is very labor-intensive in the lab, because it takes many additional steps for ligation and error correction. And these steps are far from being automated and user-friendly.
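The split between writing many short oligos and assembling them into one long strand can be pictured in software. The sketch below is a hypothetical in-silico analogy, not the method of Twist Bioscience or any other vendor: it cuts a long sequence into short pieces, prefixes each piece with an address written in bases, and later puts the pieces back in order. The function names, the payload length of 20 bases and the 4-base address are all assumptions chosen for illustration, and the chemistry of ligation and error correction is not modelled at all.

```python
import random

BASES = "ACGT"


def to_address(position: int, index_len: int) -> str:
    """Write a chunk position as a fixed-length base-4 number using A, C, G, T."""
    address = ""
    for _ in range(index_len):
        address = BASES[position % 4] + address
        position //= 4
    return address


def from_address(address: str) -> int:
    """Read a base-4 address back into an integer position."""
    value = 0
    for base in address:
        value = value * 4 + BASES.index(base)
    return value


def split_into_oligos(sequence: str, payload_len: int = 20, index_len: int = 4) -> list[str]:
    """Cut a long sequence into short oligos, each prefixed with its address."""
    chunks = [sequence[i:i + payload_len] for i in range(0, len(sequence), payload_len)]
    if len(chunks) > 4 ** index_len:
        raise ValueError("index_len is too small to address every chunk")
    return [to_address(pos, index_len) + chunk for pos, chunk in enumerate(chunks)]


def reassemble(oligos: list[str], index_len: int = 4) -> str:
    """Sort oligos by address and join their payloads into one long sequence."""
    ordered = sorted(oligos, key=lambda oligo: from_address(oligo[:index_len]))
    return "".join(oligo[index_len:] for oligo in ordered)


long_sequence = "ACGTTGCA" * 15 + "GG"   # 122 bases of made-up example data
pool = split_into_oligos(long_sequence)
random.shuffle(pool)                      # a synthesized pool has no inherent order
print(len(pool), "oligos of at most", max(len(o) for o in pool), "bases")
print(reassemble(pool) == long_sequence)
```

In software the reassembly is a single sort. In the lab, each of those joins is a physical ligation step that has to be carried out and checked, which is where the effort described above comes from.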
And who is going to solve this problem?
At Kilobaser, we have already made DNA and RNA synthesis available to everyone by using microfluidics to handle all the chemical reactions required to write oligos, fully automatically, quickly and easily. Now we are going one step further. We aim to build a single device that synthesizes oligos and ligates them into longer DNA strands in a fully automated way, finally providing a synthesizer that generates kilobase-long DNA: a true KILOBASER.
The integration of ligation is our first step towards DNA-based data storage. After that, the remaining step is to increase parallelization so that any IT department can write DNA on demand. For these developments, we are collaborating with Imperial College London.
Hoshika S, Leal NA, Kim MJ, Kim MS, Karalkar NB, Kim HJ, Bates AM, Watkins NE Jr, SantaLucia HA, Meyer AJ, DasGupta S, Piccirilli JA, Ellington AD, SantaLucia J Jr, Georgiadis MM, Benner SA. Hachimoji DNA and RNA: A genetic system with eight building blocks. Science. 2019; 363(6429):884–887.
Rebrova IM, Rebrova OY. Storage devices based on artificial DNA: the birth of an idea and the first publications. Voprosy istorii estestvoznaniia i tekhniki. 2020; 41(4):666–676 (in Russian). doi: 10.31857/S020596060013006-8. S2CID 234420446.
Grigoryev Y. How much information is stored in the human genome? Technical report from BitesizeBio. 2012. http://bitesizebio.com/8378/how-much-information-is-stored-in-the-human-genome/
Church GM, Gao Y, Kosuri S. Next-generation digital information storage in DNA. Science. 2012; 337:1628. doi: 10.1126/science.1226355
Kim S, Soltis DE, Soltis PS, Suh Y. DNA sequences from Miocene fossils: an ndhF sequence of Magnolia latahensis (Magnoliaceae) and an rbcL sequence of Persea pseudocarolinensis (Lauraceae). Am J Bot. 2004; 91:615–620. doi: 10.3732/ajb.91.4.615
Allentoft ME, Collins M, Harker D, Haile J, Oskam CL. The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proc R Soc Lond B Bio. 2012