DNA IT, the next revolution

I’ve been writing about DNA computing and storage for quite a while now (see DNA computing and the end of natural evolution, DNA storage and the end of evolution part 2, & Random access DNA object storage system). But in the last few months there’s been a flurry of activity in this space that seems worthy of note.

DNA programming language

First up, A logic programming language for computational nucleic acid devices, a research article in the journal ACS Synthetic Biology. The research describes a new approach to programming DNA computers, one that’s designed specifically to mimic the molecular algorithmic capabilities of DNA devices.

The language uses logical statements and predicates (reminds me of Prolog). Indeed, the language was modeled after Prolog, with equational and molecular extensions to represent DNA functionality. As with Prolog, output is a function of declarative predicate logic rather than the control flow and assignment of normal programming languages. Logic programming takes a different mindset and demands an understanding of formal logic.
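To give a feel for the declarative style, here is a minimal sketch in Python of how a logic program derives output from facts and rules rather than from step-by-step control flow. This is not the paper's language, and the predicates and names are made up for illustration. It also works forwards from the facts until nothing new can be derived, whereas Prolog works backwards from a query.

```python
# Facts: parent(X, Y) means X is a parent of Y. The names are hypothetical.
PARENT = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}


def ancestors(parent: set) -> set:
    """Derive every ancestor(X, Z) pair implied by two Prolog-style rules:

    ancestor(X, Y) :- parent(X, Y).
    ancestor(X, Z) :- ancestor(X, Y), parent(Y, Z).
    """
    derived = set(parent)  # rule 1: every parent is an ancestor
    changed = True
    while changed:  # keep applying rule 2 until nothing new appears
        changed = False
        for (x, y) in list(derived):
            for (y2, z) in parent:
                if y == y2 and (x, z) not in derived:
                    derived.add((x, z))
                    changed = True
    return derived


ANCESTOR = ancestors(PARENT)
print(("ann", "dan") in ANCESTOR)
```

Nothing in the rules says in what order to search. The program only states what is true, and the answer falls out of that.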

The article talks about applications of DNA computing in in vitro (chemical/protein) manufacturing, diagnostics, and therapeutic devices that operate inside living cells.

DNA storage device

Next up, a recent article in Scientific Reports, Demonstration of end-to-end automation of DNA data storage.

The intent here is to create a fully automated data storage device that uses DNA as its recording medium. The current device (seen in the bottom right above) is a lab prototype that fits on a bench, costs $10K, and can store 5 bytes of data with error correction.

The system has three hardware modules: synthesis (writing), storage, and sequencing (reading). It also includes encoding and decoding software that translates bits to nucleic acid bases and adds error correction. The software also has to add extra bases so the strand is compatible with the sequencing (reading) process.
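As a rough sketch of the bits-to-bases step, here is a simple two-bits-per-base mapping in Python. The mapping table is my own assumption, not the codec from the article, and the sketch leaves out the error correction and the extra bases the sequencer needs.

```python
# Hypothetical 2-bits-per-base mapping; the article's actual codec may differ.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


payload = encode(b"HELLO")
print(payload, len(payload))
print(decode(payload))
```

Under this mapping the 5-byte “HELLO” payload becomes 20 bases before any error correction or adapter bases are added.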

The limits to storage may have something to do with the size of the storage vessel as well as the size of the DNA string that can be synthesized/sequenced. Error correction is based on a 6 base (bit) hashing code (less than a byte for 5 bytes). The system’s write-to-read-back time is ~21 hrs.

The device creates many copies of the DNA (data) strand. The 5 byte (“HELLO”) string took 4 micrograms of liquid and yielded 3469 DNA strands, 1973 of which aligned properly to their adapter sequence. Of those properly aligned DNA strands, 30 had extractable payload regions, of which 1 was correct and the other 29 were corrupted.

This is a very poor BER (bit error rate). For comparison, LTO-7/8 has a BER of 1:10**19 bits, and enterprise disk has a BER of 1:10**15 bits. This DNA storage device produced 1 correct read out of 3469 strands, so ~99.97% of the strands written were lost (strictly speaking that’s a strand-level loss rate rather than a per-bit error rate, but it makes the point).

To get a better understanding of the BER, they stored a 100 base (~12 byte) data payload. Of the 25,592 strands created, 286 aligned properly. Of those, 251 were corrupted, 11 had invalid hashes, 8 were corrupted but correctable (valid hashes, invalid data), and 16 were perfect reads. So 25,592 strands yielded 24 proper reads, or ~1K:1 (not entirely correct, because the correctable strands actually had bit errors, but we can give them that).
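The yield figures above are easy to check with a few lines of Python. The strand counts come from the article. The helper name is mine, and the ratios are per-strand yields, not true per-bit error rates.

```python
from fractions import Fraction


def strand_yield(total_strands: int, good_reads: int) -> Fraction:
    """Fraction of synthesized strands that read back as usable data."""
    return Fraction(good_reads, total_strands)


# 5-byte "HELLO" run: 3469 strands, 1 correct read.
hello = strand_yield(3469, 1)

# 100-base run: the 286 aligned strands break down into four buckets.
corrupted, bad_hash, correctable, perfect = 251, 11, 8, 16
assert corrupted + bad_hash + correctable + perfect == 286
longer = strand_yield(25592, correctable + perfect)

for name, y in [("5-byte HELLO", hello), ("100-base payload", longer)]:
    print(f"{name}: 1 good read per {round(1 / y)} strands, "
          f"{float(1 - y):.2%} of strands lost")
```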

DNA computer architecture

Last up, an IEEE Spectrum article discussing Caltech research, DNA computer shows programmable chemical machines are possible, reporting on an article in Nature, Diverse and robust molecular algorithms using reprogrammable DNA self-assembly (paywall). This DNA computer system is made of just DNA and salt water. It computes algorithms on 6 bits of input and uses DNA logic gates.

The Caltech team created 2-input, 2-output boolean gates out of DNA sequences. Five of these gates are connected to form a computation layer, which supports 6 input and 6 output bits. But you can stack multiple computational layers on top of one another, so that the output of one layer is fed in as input to the layer on top of it.

One key is that the DNA computer self-assembles the computational layers. They use a seed layer as a starter DNA strand, then the input (mixed inside a vial) attaches to this seed layer, and then the computational layers attach one by one until the output is generated.

Each computational layer is made up of DNA computational tiles that attach together sort of like a circuit. They were able to create a 355-instruction set for their DNA computer. In comparison, the IBM 360 had a one byte op code (at most 256 instructions).
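Here is a minimal Python sketch of layering 2-input, 2-output gates over 6 bits. The wiring (which bit pairs each of the five gates touches) and the choice of an AND/OR gate are my assumptions for illustration, not the circuits from the Nature paper.

```python
from typing import Callable, List, Tuple

Gate = Callable[[int, int], Tuple[int, int]]

# Assumed wiring: five gates per layer over six bit positions.
LAYER_PAIRS = [(0, 1), (2, 3), (4, 5), (1, 2), (3, 4)]


def and_or(a: int, b: int) -> Tuple[int, int]:
    """A 2-input, 2-output gate: (a AND b, a OR b), i.e. (min, max)."""
    return a & b, a | b


def apply_layer(bits: List[int], gate: Gate) -> List[int]:
    """Run one computation layer: apply the gate to each wired pair in turn."""
    out = list(bits)
    for i, j in LAYER_PAIRS:
        out[i], out[j] = gate(out[i], out[j])
    return out


def run(bits: List[int], gate: Gate, layers: int) -> List[int]:
    """Feed the output of each layer in as the input to the next."""
    for _ in range(layers):
        bits = apply_layer(bits, gate)
    return bits


print(run([1, 0, 1, 1, 0, 0], and_or, layers=3))
```

With this wiring and gate, three stacked layers sort any 6-bit input (zeros first), because each layer performs two rounds of compare-and-swap on neighbouring bits. Swap in a different gate function and the same stack computes something else.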

They have a compiler that allows researchers to write a software algorithm and this translates code into DNA circuit tiles, computational layers and ultimately into a DNA computer.

According to the article, it takes 1-2 hours to grow the computational DNA crystal and another day or so for the computation to complete.

An interesting approach to DNA computation, but it’s unclear if they have any branching mechanisms in their “instruction set”. And 6-bit input/output seems a bit limiting. However, by creating boolean gates with DNA, they could in principle recreate any type of electronic computer that exists today.

~~~~

Put it all together and someday you could have a DNA compute server and storage.

One thing that’s missing is a (packet switched or token ring) network for transferring data between cells (and maybe into and out of DNA storage). They could probably use some sort of vascular (network) system with a way to transfer data from inside a cell to the network and into another cell.

That way they could gang a number of DNA compute servers (cells) together and maybe create a cellular automata machine.

The future of computation looks wetter now.

DNA as storage, the end of evolution – part 2

I had talked about DNA programming/computing previously (see my DNA computing and the end of natural evolution post) and today we have an example of another step along this journey. A new story in today’s Science News titled DNA used as rewriteable data storage in cells discusses another capability needed for computation, namely information storage.

The new synthetic biology “logic” is able to record, erase and overwrite (DNA) data in an E. coli cell. DNA information storage like this brings us one step closer to a universal biologic Turing machine or computational engine.

Apparently the new process uses enzymes to “flip” a small segment of DNA so it reads backwards, and then uses another set of enzymes to flip it back again. With another application of synthetic biology, they were able to have the cell fluoresce in different colors depending on whether the DNA segment was reversed or in its normal orientation.

To top it all off, the DNA data storage device was inheritable. Scientists showed that the data device was still present in the 100th generation of the cell they originally modified. How’s that for persistent storage?
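The flip-a-segment idea can be modeled as a one-bit register. Below is a minimal Python sketch, assuming that “flipped” means the segment reads as its reverse complement and that reading just reports the orientation (standing in for the fluorescence color). The class, method names and example sequence are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert(segment: str) -> str:
    """Reverse complement: how the segment reads after being flipped in place."""
    return segment.translate(COMPLEMENT)[::-1]


class FlipBit:
    """One rewritable bit stored as the orientation of a DNA segment."""

    def __init__(self, segment: str) -> None:
        if invert(segment) == segment:
            raise ValueError("segment reads the same both ways")
        self.forward = segment  # normal orientation, read as 0
        self.current = segment

    def set(self) -> None:
        """First enzyme set: flip the segment to its reversed orientation."""
        if self.read() == 0:
            self.current = invert(self.current)

    def reset(self) -> None:
        """Second enzyme set: flip the segment back to normal orientation."""
        if self.read() == 1:
            self.current = invert(self.current)

    def read(self) -> int:
        """Stand-in for fluorescence: 0 if normal orientation, 1 if flipped."""
        return 0 if self.current == self.forward else 1


bit = FlipBit("GATTACA")
bit.set()
print(bit.current, bit.read())
```

Because the bit lives in the DNA sequence itself, copying the sequence copies the stored value, which is the inheritability part.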

The universal biological Turing machine

Let’s see, my universal Turing machine parts list includes:

  • Tape or infinite memory device = DNA memory device – Check (today’s post, well maybe not infinite, but certainly single bits today, bytes next year, so it’s only a matter of time before it’s KB)
  • Read head or ability to read out memory information = biological read head – Check (today’s post, it can fluoresce, therefore it can be read)
  • State register = biologic counter – Check (seems to have been discovered in 2009, see Science News article Engineered DNA counts it out, don’t know how I missed that)
  • State transition table or program = biological programming – Check (previous post plus today’s post, able to compute a new state from a given previous state and current data and write or rewrite data).
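
To show how the four parts on the list fit together, here is a minimal Turing machine sketch in Python: a tape, a read/write head, a state register and a transition table. The binary increment program is just an example I picked, and nothing here models the biology.

```python
from collections import defaultdict
from typing import Dict, Tuple

BLANK = "_"

# State transition table: (state, symbol read) -> (new state, symbol to write, head move)
Program = Dict[Tuple[str, str], Tuple[str, str, int]]

# Example program: add 1 to a binary number written on the tape.
INCREMENT: Program = {
    ("start", "0"): ("start", "0", +1),   # scan right to the end of the number
    ("start", "1"): ("start", "1", +1),
    ("start", BLANK): ("carry", BLANK, -1),
    ("carry", "1"): ("carry", "0", -1),   # 1 + carry = 0, keep carrying left
    ("carry", "0"): ("halt", "1", 0),     # 0 + carry = 1, done
    ("carry", BLANK): ("halt", "1", 0),   # ran off the left edge: new top digit
}


def run(program: Program, tape_input: str, max_steps: int = 1000) -> str:
    """Run the program from state 'start' until it reaches state 'halt'."""
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))  # unbounded tape
    head, state, steps = 0, "start", 0  # read head position and state register
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        symbol = tape[head]                                  # read
        state, tape[head], move = program[(state, symbol)]   # transition + write
        head += move
        steps += 1
    cells = [tape[i] for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(BLANK)


print(run(INCREMENT, "1011"))
```

The simulator stays the same for any program. Only the transition table changes, which is why a biological version of each part would be enough in principle.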

As far as I can tell this means we could construct an equivalent to a universal Turing machine with today’s synthetic biology. Which of course means we could perform just about any computation ever conceived within a single cell AND all generations of the cell would inherit this ability.

End of natural evolution, …

Gosh, the possibilities of this new synthetic biological Turing machine are both frightening and astonishing. My original post talked about how adding ECC-like functionality plus an ECC codeword to a human DNA strand would spell the end of natural evolution for our species.

I suppose the one comforting thought is that flipping DNA segments takes hours rather than nanoseconds, which means biological computation will never displace electronic/optronic computation. But biological computation really doesn’t have to. To end natural evolution, all it has to do is repair DNA mutations over the course of days, weeks and/or years, before they have a chance to propagate.

…,  the dawn of un-natural evolution

Of course with such capabilities, “un-natural” or programmed evolution is quite possible, but is it entirely desirable? With such capabilities we could readily change a cell’s DNA to whatever we desire it to be.

My real problem is its inheritability. It’s one thing to muck with a person’s genome, it’s another thing to muck with their children’s children’s children’s … DNA.

Let’s say you were able to change someone’s DNA to become a super-athlete, super-brain or super-beautiful/handsome person. (Moving from a single cell’s DNA to a whole person’s is a leap, but not outside the realm of possibility.) Over time, any such changes would accumulate and could confer a seemingly unassailable advantage on an individual’s gene line.

There’s probably some time to think these things through and set up some sort of policies, guidelines, and/or regulatory environment around the use of the technology before capabilities get out of hand.

In my mind this goes well beyond genetically modified organisms (GMOs), which are just static changes to a gene line. Programming gene lines to repair DNA, alter DNA, or even to make better copies seems to me to be an order of magnitude increase in capabilities, taking us to genetically programmed organisms that have the potential to end evolution itself.

We need to have some serious discussions before it goes that far.

Comments?

Image: E. coli GFP by KitKor