DNA IT, the next revolution

I’ve been writing about DNA computing and storage for quite a while now (see DNA computing and the end of natural evolution, DNA storage and the end of evolution part 2, & Random access DNA object storage system). But in the last few months there’s been a flurry of activity in this space that seems worthy of note.

DNA programming language

First up, A logic programming language for computational nucleic acid devices, a research article in the journal ACS Synthetic Biology. The research describes a new approach to programming DNA computers, one designed specifically to express the molecular algorithmic capabilities of DNA devices.

The language uses logical statements and predicates (reminds me of Prolog). Indeed, the language was modeled after Prolog, with equational and molecular extensions to represent DNA functionality. As with Prolog, output is a function of declarative predicate logic rather than the control flow and assignment of conventional programming languages. Logic programming takes a different mindset and demands an understanding of formal logic.

The article talks about applications of DNA computing in in vitro (chemical/protein) manufacturing, in diagnostics, and in therapeutic devices that operate inside living cells.

DNA storage device

Next up, a recent article in Scientific Reports, Demonstration of end-to-end automation of DNA data storage.

The intent here is to create a fully automated data storage device that uses DNA as its recording media. The current device (seen in the bottom right above) is a lab prototype that fits on a bench, costs $10K, and can store 5 bytes of data with error correction.

The system has three hardware modules: synthesis (writing), storage, and sequencing (reading). It also includes encoding and decoding software that translates bits to nucleic acid bases and adds error correction. The software also has to append extra bases so the strand is compatible with the sequencing (reading) process.
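To make the bits-to-bases step concrete, here is a minimal sketch. It assumes a plain 2-bits-per-base mapping (00=A, 01=C, 10=G, 11=T); that mapping and the function names are my own assumptions for illustration, not the encoding used in the paper, which also adds error correction and sequencing adapter bases.

```python
# Hypothetical 2-bits-per-base mapping; an assumption for illustration only.
BASES = "ACGT"

def bytes_to_bases(data: bytes) -> str:
    """Translate each byte into four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def bases_to_bytes(strand: str) -> bytes:
    """Inverse of bytes_to_bases; expects a length that is a multiple of 4."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

payload = bytes_to_bases(b"HELLO")
print(payload, len(payload))  # 5 bytes become 20 bases under this mapping
```

Under this toy mapping the 5-byte payload needs only 20 bases, so most of a real strand would be error correction and adapter overhead.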

The limits to storage may have something to do with the size of the storage vessel as well as the size of the DNA string that can be synthesized/sequenced. Error correction is based on a 6 base (bit) hashing code (less than a byte for 5 bytes). The system’s write to read-back time is ~21 hrs.

The device creates many copies of the DNA (data) strand. The 5 byte (“HELLO”) string took 4 micrograms of liquid and yielded 3469 DNA strands, 1973 of which aligned properly to their adapter sequence. Of those properly aligned DNA strands, 30 had extractable payload regions, of which 1 was correct; the other 29 were corrupted.

This is a very poor BER (bit error rate). For comparison, LTO-7/8 has a BER of 1:10**19 bits, and enterprise disk has a BER of 1:10**15 bits. This DNA storage device managed 1 correct read out of 3469 strands, a BER of 3469:1, meaning ~99.97% of what was written was lost.

To get a better understanding of the BER, they stored a 100 base (~12 byte) data payload. Of the 25,592 strands created, 286 aligned properly. Of those, 251 were corrupted, 11 had invalid hashes, 8 were corrupted but correctable (valid hashes, invalid data), and 16 were perfect reads. So 25,592 strands yielded 24 proper reads, a ~1K:1 BER (not entirely correct, because the correctable strands actually had bit errors, but we can give them that).
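The arithmetic behind those ratios is easy to check. The sketch below just replays the strand counts quoted above; the function names are mine, and strictly speaking these are strand-level yields rather than true per-bit error rates.

```python
def strands_per_good_read(total_strands: int, good_reads: int) -> float:
    """How many strands were written for each usable read."""
    return total_strands / good_reads

def loss_fraction(total_strands: int, good_reads: int) -> float:
    """Fraction of written strands that did not yield a usable read."""
    return 1 - good_reads / total_strands

# Strand counts quoted in the post
hello_run = (3469, 1)          # 5-byte "HELLO" run: 1 correct read
longer_run = (25592, 16 + 8)   # 100-base run: 16 perfect + 8 correctable

# The four outcomes of the aligned strands should add up to 286
aligned_breakdown = 251 + 11 + 8 + 16

for name, (total, good) in (("HELLO", hello_run), ("100-base", longer_run)):
    print(f"{name}: {strands_per_good_read(total, good):.0f}:1, "
          f"{loss_fraction(total, good):.2%} lost")
```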

DNA computer architecture

Last up, an IEEE Spectrum article discussing Caltech research, DNA computer shows programmable chemical machines are possible, reporting on an article in Nature, Diverse and robust molecular algorithms using reprogrammable DNA self-assembly (paywall). This DNA computer system is made of just DNA and salt water. It computes algorithms on 6 bits of input and uses DNA logic gates.

The Caltech team created 2-input, 2-output boolean gates out of DNA sequences. Five of these gates are connected to form a computation layer, which supports 6 input and 6 output bits. But you can stack multiple computational layers on top of one another, where the output of one layer is fed in as input to the layer on top of it.
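Here is a rough software sketch of that layering idea. The wiring is an assumption on my part (three gates on bit pairs (0,1), (2,3), (4,5), then two gates on pairs (1,2), (3,4), for five gates per layer), and the compare-swap gate is just an example gate I picked; neither is taken from the Nature paper.

```python
from typing import Callable, List, Tuple

Gate = Callable[[int, int], Tuple[int, int]]

def compare_swap(a: int, b: int) -> Tuple[int, int]:
    """Example 2-input, 2-output gate: emit the smaller bit first."""
    return (a & b, a | b)

def run_layer(bits: List[int], gate: Gate) -> List[int]:
    """Apply five gates to 6 bits: pairs (0,1),(2,3),(4,5), then (1,2),(3,4).

    This wiring is hypothetical, chosen only to illustrate a 5-gate layer.
    """
    out = list(bits)
    for start in (0, 1):
        for i in range(start, len(out) - 1, 2):
            out[i], out[i + 1] = gate(out[i], out[i + 1])
    return out

def run_layers(bits: List[int], gate: Gate, layers: int) -> List[int]:
    """Feed each layer's 6 output bits into the next layer up."""
    for _ in range(layers):
        bits = run_layer(bits, gate)
    return bits

print(run_layers([1, 0, 1, 1, 0, 0], compare_swap, layers=3))
# prints [0, 0, 0, 1, 1, 1]
```

With this particular gate, three stacked layers amount to an odd-even transposition sort of the 6 input bits, which shows how repeating one simple layer can add up to a complete algorithm.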

One key is that the DNA computer self-assembles the computational layers. They use a seed layer as a starter DNA strand, then the input (mixed inside a vial) attaches to this seed layer, and then the computational layers attach one by one until the output is generated.

Each computational layer is made up of DNA computational tiles that attach together, sort of like a circuit. They were able to create a 355-instruction set for their DNA computer. In comparison, the IBM 360 had a one-byte op code (at most 256 instructions).

They have a compiler that allows researchers to write a software algorithm, which it then translates into DNA circuit tiles, computational layers and ultimately into a DNA computer.

According to the article, it takes 1-2 hours to grow the computational DNA crystal and another day or so for the computation to complete.

An interesting approach to DNA computation but it’s unclear if they have any branching mechanisms in their “instruction set”. And 6 bit input/output seems a bit limiting. However, by creating boolean gates with DNA, they could recreate any type of electronic computer that exists today.


Put it all together and someday you could have a DNA compute server and storage.

One thing that’s missing is a (packet switched or token ring) network for transferring data between cells (and maybe into and out of DNA storage). They could probably use some sort of vascular (network) system with a way to transfer data from inside a cell to the network and into another cell.

That way they could gang a number of DNA compute servers (cells) together and maybe create a cellular automata machine.

The future of computation looks wetter now.

DNA as storage, the end of evolution – part 2

I have talked about DNA programming/computing previously (see my DNA computing and the end of natural evolution post) and today we have an example of another step along this journey. A new story in today’s Science News titled DNA used as rewriteable data storage in cells discusses another capability needed for computation, namely information storage.

The new synthetic biology “logic” is able to record, erase and overwrite (DNA) data in an E. coli cell.  DNA information storage like this brings us one step closer to a universal biologic Turing machine or computational engine.

Apparently the new process uses enzymes to “flip” a small segment of DNA to read backwards and then with another set of enzymes, flip it back again.  With another application of synthetic biology, they were able to have the cell fluoresce in different colors depending on whether the DNA segment was reversed or in its normal orientation.
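In software terms, flipping a segment looks like taking its reverse complement, with the orientation acting as a single stored bit. The sketch below is only an analogy: the reference sequence is made up, and the function names are mine rather than anything from the article.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def flip(segment: str) -> str:
    """Invert a DNA segment: complement each base, then reverse the order."""
    return segment.translate(COMPLEMENT)[::-1]

def read_bit(segment: str, reference: str) -> int:
    """Return 0 for the reference orientation, 1 for the flipped one."""
    if segment == reference:
        return 0
    if segment == flip(reference):
        return 1
    raise ValueError("segment matches neither orientation")

REFERENCE = "ATGGCAAC"  # made-up sequence, not from the article

stored = REFERENCE
stored = flip(stored)                 # write a 1
print(read_bit(stored, REFERENCE))
stored = flip(stored)                 # erase back to 0
print(read_bit(stored, REFERENCE))
```

Flipping twice restores the original segment, which is what makes the bit rewriteable rather than write-once.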

To top it all off, the DNA data storage device was inheritable. Scientists showed that the data device was still present in the 100th generation of the cell they originally modified. How’s that for persistent storage?

The universal biological Turing machine

Let’s see, my universal Turing machine parts list includes:

  • Tape or infinite memory device = DNA memory device – Check (today’s post, well maybe not infinite, but certainly single bits today, bytes next year, so it’s only a matter of time before it’s KB)
  • Read head or ability to read out memory information = biological read head – Check (today’s post, it can fluoresce, therefore it can be read)
  • State register = biologic counter  – Check (seems to have been discovered in 2009, see Science News article Engineered DNA counts it out, don’t know how I missed that)
  • State transition table or program = biological programming – Check (previous post plus today’s post, able to compute a new state from a given previous state and current data and write or rewrite data).
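For readers who haven’t met one, the four parts in that list map directly onto a few lines of code. This is a minimal, generic Turing machine sketch with a made-up example program (flip every bit on the tape); nothing in it models the actual biology.

```python
from collections import defaultdict

def run_turing_machine(table, tape, state="start", blank="_", max_steps=1000):
    """Run a single-tape Turing machine until it reaches the 'halt' state.

    table: transition table mapping (state, symbol) -> (write, move, next_state)
    tape:  initial tape contents (the memory device)
    """
    cells = defaultdict(lambda: blank, enumerate(tape))  # unbounded tape
    head = 0                                             # read/write head
    steps = 0
    while state != "halt":                               # state register
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        write, move, state = table[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    if not cells:
        return ""
    return "".join(cells[i] for i in range(min(cells), max(cells) + 1)).strip(blank)

# Hypothetical example program: invert every bit, then halt at the first blank.
FLIP_BITS = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine(FLIP_BITS, "1011"))  # prints 0100
```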

As far as I can tell this means we could construct an equivalent to a universal Turing machine with today’s synthetic biology. Which of course means we could perform just about any computation ever conceived within a single cell AND all generations of the cell would inherit this ability.

End of natural evolution, …

Gosh, the possibilities of this new synthetic biological Turing machine are both frightening and astonishing. My original post talked about how adding ECC-like functionality plus an ECC codeword to a human DNA strand would spell the end of natural evolution for our species.

I suppose the one comforting thought is that flipping DNA segments takes hours rather than nanoseconds, which means biological computation will never displace electronic/optronic computation. But biological computation really doesn’t have to. All it has to do is repair DNA mutations over the course of days, weeks and/or years, before they have a chance to propagate, in order to end natural evolution.

…,  the dawn of un-natural evolution

Of course with such capabilities, “un-natural” or programmed evolution is quite possible, but is it entirely desirable? With such capabilities we could readily change a cell’s DNA to whatever we desire it to be.

My real problem is its inheritability. It’s one thing to muck with a person’s genome, it’s another thing to muck with their children’s, children’s, children’s, … DNA.

Let’s say you were able to change someone’s DNA to become a super-athlete, super-brain or super-beautiful/handsome person. (Moving from a single cell’s DNA to a whole person’s is a leap, but not outside the realm of possibility). Over time, any such changes would accumulate and could confer a seemingly unassailable advantage to an individual’s gene line.

There’s probably some time to think these things through and set up some sort of policy, guideline, and/or regulatory environment around the use of the technology before capabilities get out of hand.

In my mind this goes well beyond genetically modified organisms (GMO), which are just static changes to a gene line. Programming gene lines to repair DNA, alter DNA, or even to make better copies seems to me to be an order of magnitude increase in capabilities, taking us to genetically programmed organisms that have the potential to end evolution itself.

We need to have some serious discussions before it goes that far.


Image: E. coli GFP by KitKor

DNA computing and the end of natural evolution

DNA Molecule Arrangement in the Chip (from http://dnacomputing.design.officelive.com)

Read an article the other day in the Economist on how researchers are now performing computation using DNA.  The intent is to someday come up with small biologic computers that can be inserted into cells/organisms which can cure or kill cells that are in trouble and leave the rest alone.

Computing soup?!

Research in the area of molecular computing has been going on since 1994, when a scientist (Leonard Adleman) created a DNA based solution to compute an answer to a small Hamiltonian path problem, a close cousin of the traveling salesman problem.

In those days the answer was derived from running a centrifuge on the end-product soup of DNA strings and extracting the answer from the resultant gel matrix.

Molecular computing redefined

Since then, there have been significant improvements in DNA computing. Currently, most are based on DNA strand displacement. Today’s molecular computers consist of free floating DNA or RNA snippets. A logic gate is made up of two strands, one of which is the “computational logic” and the other an “output signal”. In addition to the logic gate there is another DNA/RNA strand which is an “input signal”, almost like input data. Input signals are matched up to a specific logic gate and cause the output signal snippet to be detached, creating yet another input signal for other computations cascading down the pipeline.
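A toy model of that cascade might look like the sketch below. It treats each gate as a one-shot lookup from an input signal to the output signal it releases, and ignores concentrations, kinetics, and real base-pair matching entirely; the signal names are made up.

```python
def run_cascade(gates, inputs):
    """Simulate a strand displacement cascade.

    gates:  dict mapping an input signal name to the output signal that
            gate releases when the input binds (each gate fires once)
    inputs: signal names initially added to the soup
    Returns the output signals released, in the order they were freed.
    """
    available = dict(gates)      # copy, so gates get used up as they fire
    pending = list(inputs)
    released = []
    while pending:
        signal = pending.pop(0)
        output = available.pop(signal, None)  # does any gate match this signal?
        if output is not None:
            released.append(output)           # output strand is detached...
            pending.append(output)            # ...and becomes a new input
    return released

# Hypothetical two-gate pipeline: in_A releases sig_B, which releases sig_C.
GATES = {"in_A": "sig_B", "sig_B": "sig_C"}
print(run_cascade(GATES, ["in_A"]))  # prints ['sig_B', 'sig_C']
```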

DNA-RNA based digital logic

2-bit_ALU (from wikimedia.org)

By doing all this, researchers have been able to create DNA snippets that perform various logical computing operations such as AND, OR and NOT logic gates, and to produce the signal pathways that connect them in a computational sequence or “program”.

The molecular automata all look like elementary electronic circuits made up of base level logic gates to me, but just as in electronic digital logic, it seems to get the job done. One gets a computation done by adding 1000’s of copies of the logic gates and input sequences together and somehow assaying the end result many hours later.

Using these capabilities, they have created DNA programs made up of 74 different DNA strands that could calculate the square roots of 4 digit binary numbers.

Next, they tied an artificial neuron, which fires when input signals hit a certain level, together with a soup of 114 different DNA strands to do rudimentary pattern recognition. They then “programmed” their DNA neural net to recognize Yes/No answers provided by different scientists. The report said that the neural net was able to get the correct answer every time but took 8 hours to perform the calculations.

There are a couple of groups working on a programming language and a simulator tool for DNA or molecular computing called the DNA Strand Displacement (DSD) tool.

The report went on to say how another set of researchers were fabricating synthetic genes which, when introduced into a cell, could be used to trick the cell into producing the cellular computer itself.

The end of natural evolution?

The end game for all this is to create a computational device that can somehow be injected into tissue cells which would identify “sick” cells then cure or destroy them.

A couple of years ago, I was waiting in a doctor’s office for something or another and penned a poem on the end of human evolution involving ECC combined with DNA.  (No, you can’t see the poem.)

You see, in computers today there is a computational device called an ECC, or error correcting code, which is a circuit plus a special code word that can be appended to a sequence of data. Together they can be used to correct errors in the transmission or storage of that data.

Once someone can build digital logic out of DNA-RNA, it’s not a big leap to build an ECC circuit. Once the circuit is ready, anyone could potentially have their DNA modified to have an appropriate ECC codeword appended to it. With DNA + ECC code word and an active ECC circuit in the cell, it’s quite possible that any single, double, or triple mutation could be detected and fixed inside a cell. Of course ECC can go beyond triple error detection if needed. Also, Reed-Solomon and other erasure codes can go well beyond that.
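To show what such an ECC circuit has to do, here is a minimal sketch of the classic Hamming(7,4) code, which appends 3 parity bits to 4 data bits and repairs any single flipped bit. It works on plain bits and corrects only one error per codeword; treating a mutation as a bit flip is my simplification, and a genome-scale scheme would need something much stronger.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as the 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(codeword):
    """Repair at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the damaged bit back
    return [c[2], c[4], c[5], c[6]]

original = [1, 0, 1, 1]
codeword = hamming74_encode(original)
mutated = list(codeword)
mutated[4] ^= 1                        # simulate a single "mutation"
print(hamming74_correct(mutated) == original)  # prints True
```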

After such a device was incorporated into the human genome, it would seem to signal the end to natural evolution, at least for humans.