DNA IT, the next revolution

I’ve been writing about DNA computing and storage for quite awhile now (see DNA computing and the end of natural evolution, DNA storage and the end of evolution part 2, & Random access DNA object storage system). But in the last few months there’s been a flurry of activity in this space that seems worthy of note.

DNA programing language

First up, A logic programing language for computational nucleic acid devices, a research article in ACS Synthetic Biology magazine. The research describes a new approach to programming DNA computers, that’s uniquely designed to mimic molecular algorithmic capabilities for DNA devices. T\

The language uses logical statements and predicates (reminds me of Prolog). Indeed, the language was modeled after Prolog with equational and molecular extensions to represent DNA functionality. As with Prolog, output is a function of declarative, predicate logic rather than control flow and assignment in normal programming languages. Logic programming takes a different mind set and demands an understanding of formal logic.

The article talks about applications for DNA computing for in vitro (chemical/protien) manufacturing, diagnosis, and therapeutics (operating inside living cells) devices (cells).

DNA storage device

Next up, a recent article in Scientific Reports, Demonstration of end-to-end automation of DNA data storage.

The intent here is to create a fully automated data storage device that uses DNA as its recording media. The current device (seen in the bottom right above) is a lab prototype, that fits on a bench and costs $10K that can store 5 bytes of data with error correction.

The system has three hardware modules: synthesis (writing), storage and sequencing (reading). It also includes encoding and decoding software that translates bits to nucleic acid bases and adds error correction to it. They need to add more bases to be compatible with the sequencing (reading) process.

The limits to storage may have something to do with the size of the storage vessel as well as the size of the DNA string that can be synthesized/sequenced. . Error correction is based on a 6 base (bit) hashing code (less than a byte for 5 bytes). The systems write to read-back time is ~21 hrs.

The device creates many copies of the DNA (data) strand. The 5 byte (“HELLO”) string took 4 micrograms of liquid and yielded 3469 DNA strands, 1973 of which aligned properly to their adapter sequence. Of those properly aligned DNA strands, 30 had extractable payload regions of which 1 was correct, the other 29 were corrupted.

This is a very poor BER (bit error rate). For comparison LTO-7/8 has a BER of 1:10**19 bits, and enterprise disk has a BER of 1:10**15 bits. This DNA storage device has a BER of 3469:1 or ~99.9% of all bits written were lost.

To get a better understanding of the BER, they stored a 100 base (~12 byte) data payload. Of the 25,592 strands created, 286 aligned properly and of those 251 were corrupted, 11 had invalid hashes, and 8 were corrupted but correctable (valid hashes invalid data) and 16 were perfect reads. So 25592 strands had 24 proper reads ~1K:1 BER (not entirely correct because the correctable strands actually had bit errors but we can give them that).

DNA computer architecture

Last up, an IEEE Spectrum article, discussing CalTech Research, DNA computer shows programmable chemical machines are possible, reporting on an article in Nature, Diverse and robust molecular algorithms using reprogrammable DNA self-assembly (paywall). This DNA computer system is made of just DNA and salt water. It computes algorithms on 6 bits of input and uses DNA logic gates.

The Caltech team created 2 input-2-output boolean gates out of DAN sequences, five of these gates are connected to form a computation layer. It supports 6 input and 6 output bits. But you can layer multiple computational levels on top of one another where the output of one layer can be fed in as input to the layer on top of it.

One key, is that the DNA computer self assemblies the computational layer. They use a seed layer as a starter DNA strand and then the input (mixed inside a vial) is attached to this seed layer and then the computational layers are attached one by one until the output is generated.

Each computational layer is made up of DNA computational tiles that attach together sort of like a circuit. they were able to create a 355 instruction set for their DNA computer. In comparison the IBM 360 had a one byte op code (at most 256 instructions).

They have a compiler that allows researchers to write a software algorithm and this translates code into DNA circuit tiles, computational layers and ultimately into a DNA computer.

According to the article, it takes 1-2 hours to grow the computational DNA crystal and another day or so for the computation to complete.

An interesting approach to DNA computation but it’s unclear if they have any branching mechanisms in their “instruction set”. And 6 bit input/output seems a bit limiting. However, by creating boolean gates with DNA, they could recreate any type of electronic computer that exists today.


Put it all together and someday you could have a DNA compute server and storage.

One thing that’s missing is a (packet switched or token ring) network for transferring data between cells (and maybe into and out of DNA storage). They could probably use some sort of vascular (network) system with a way to transfer data from inside a cell to the network and into another cell .

That way they could gang a number of DNA compute servers (cells) together and maybe create a cellular automata machine.

The future of computation looks wetter now.

Photo Credit(s):

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.