Just when you thought efforts to find better ways to store large amounts of data couldn’t get more outré, researchers in the UK have cooked up a way to store data on synthetic DNA at a density of 2.2 petabytes per gram.
Molecular biologists Nick Goldman and Ewan Birney at the European Bioinformatics Institute (EBI) published their research in a recent edition of Nature. While some Harvard researchers did something similar last year, they were only able to cram a measly 700 terabytes of information per gram onto the double helix. The EBI team tripled that capacity.
The researchers first converted their data into binary code, then translated it into trinary code (0s, 1s, and 2s). The data was then rewritten as strings of DNA chemical bases (As, Gs, Cs, and Ts). By encoding the information multiple times over, they ensured it could be read back with 100% accuracy. As a test, they encoded several of Shakespeare’s sonnets, some photos, and Martin Luther King’s “I Have a Dream” speech.
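To make the scheme concrete, here is a minimal sketch of the trinary-to-DNA step in Python. It assumes the rotation rule reported for the Goldman scheme, where each trit selects one of the three bases that differ from the previous base (which prevents error-prone runs of identical bases); the byte-to-trit conversion below is a simplified stand-in for the paper’s actual Huffman code.

```python
# Sketch of binary -> trinary -> DNA encoding, per the scheme described above.
# Assumption: each trit picks a base different from the previous base, so the
# output never contains two identical bases in a row (a homopolymer).

BASES = "ACGT"

def bytes_to_trits(data: bytes) -> list[int]:
    """Convert bytes to base-3 digits (simplified stand-in for a Huffman code)."""
    n = int.from_bytes(data, "big")
    trits = []
    while n:
        n, r = divmod(n, 3)
        trits.append(r)
    return trits[::-1] or [0]

def trits_to_dna(trits: list[int]) -> str:
    """Map each trit to one of the three bases that differ from the previous base."""
    dna = []
    prev = "A"  # arbitrary starting reference
    for t in trits:
        choices = [b for b in BASES if b != prev]  # three legal next bases
        prev = choices[t]
        dna.append(prev)
    return "".join(dna)

def dna_to_trits(dna: str) -> list[int]:
    """Invert the mapping: recover each trit from the base transition."""
    trits = []
    prev = "A"
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits

trits = bytes_to_trits(b"Shall I compare thee")
encoded = trits_to_dna(trits)
```

Because every base is chosen from the three that differ from its predecessor, the encoded strand carries the trit stream while guaranteeing no repeated-base runs, which is the property that makes sequencing it back reliable.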
“We already know that DNA is a robust way to store information because we can extract it from woolly mammoth bones, which date back tens of thousands of years, and make sense of it,” Goldman said. “It’s also incredibly small, dense and does not need any power for storage, so shipping and keeping it is easy.”
It’s still incredibly expensive, largely because of the cost of DNA synthesis, but those costs are dropping and the technology could become cost-effective within the next 50 years. (At current prices, it would only make sense to use the technology if you planned to store the data for 500 years or more.) Two other problems: you can only write the data once, and you have to sequence large chunks of DNA to find any given piece of information.