Random access, DNA object storage system

Read a couple of articles this week: Inching closer to a DNA-based file system in ArsTechnica and DNA storage gets random access in IEEE Spectrum. Both seem to be citing an article in Nature, Random access in large-scale DNA storage (paywall).

We’ve known for some time now that we can encode data into DNA strings (see my DNA as storage … and Genomic informatics takes off posts).

However, accessing DNA data has been sequential, and reading and writing DNA data has been glacial. Researchers have now started to attack the sequentiality of DNA data access. The prize: DNA can store 215PB of data in one gram, and DNA data can conceivably last millions of years.

Researchers at Microsoft and the University of Washington have come up with a solution to the sequential access limitation. They have used polymerase chain reaction (PCR) primers as a unique identifier for files. They can construct a complementary PCR primer that can be used to extract just those DNA segments that match the primer and amplify (replicate) all DNA sequences matching this primer tag that exist in the cell.

DNA data format

The researchers used a Reed-Solomon (R-S) erasure coding mechanism for data protection and encoded the DNA data into many DNA strings, each with multiple (metadata) tags on them. One of the tags is the PCR primer tag header, another tag indicates the position of the DNA data segment in the file, and an end-of-data tag repeats the same PCR primer tag.

The PCR primer tag was used as a sort of file address. They could construct a complementary PCR tag to match the primer tag of the file they wanted to access and then use the PCR process to replicate (amplify) only those DNA segments that matched the searched-for primer tag.

Apparently the researchers chunk file data into blocks of 150 base pairs. As there are 2 complementary base pairings, I assume a one-bit-to-one-base-pair mapping. As such, 150 base pairs, or 150 bits of data per segment, means ~18 bytes of data per segment. Presumably chunking this way allows for more efficient/effective encoding of data into DNA strings.

DNA strings don't work well with repeated sequences of base pairs, such as all zeros. So the researchers created a random sequence of 150 base pairs and XOR the file's DNA data with this random sequence to determine the actual DNA sequence used to encode the data. Reading the DNA data back, they XOR the data segment with the random string again to reconstruct the actual file data segment.
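
To make that randomization step concrete, here's a minimal Python sketch, assuming the one-bit-per-base-pair mapping I assumed above and a fixed 150-bit random mask. The researchers' actual encoding is more sophisticated (it maps bits onto the four bases and deals with synthesis constraints), so treat this as an illustration only:

```python
import secrets

SEGMENT_BITS = 150                      # one file-data segment, per the post's assumption

# A fixed random mask, generated once and stored with the file metadata,
# is XORed into every segment before it is written as DNA and again on readback.
MASK = [secrets.randbits(1) for _ in range(SEGMENT_BITS)]

def whiten(segment_bits):
    """XOR the data bits with the mask so long runs of 0s/1s (repeated bases) are broken up."""
    return [b ^ m for b, m in zip(segment_bits, MASK)]

def dewhiten(stored_bits):
    """XOR with the same mask again to recover the original data bits."""
    return [b ^ m for b, m in zip(stored_bits, MASK)]

data = [0] * SEGMENT_BITS               # worst case: all zeros would be one long repeated run
stored = whiten(data)                   # what would actually be encoded into bases
assert dewhiten(stored) == data         # round trip recovers the file data
```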

It's not clear how PCR-replicated DNA segments are isolated and where they are actually decoded (with a read head). But presumably once you have thousands to millions of copies of a DNA segment, it's pretty straightforward to decode them.

Once decoded and XORed, they use the R-S erasure coding scheme to ensure that all the DNA data segments represent the actual data that was encoded in them. They can then use the position tag of each DNA data segment to put the file data back together again.
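
As a rough illustration of that reassembly step (the tuple layout and tag names here are mine, not the paper's): once segments are error-corrected and de-XORed, putting a file back together is just filter by primer tag, sort by position tag, concatenate:

```python
# Hypothetical decoded segments: (primer_tag, position_index, payload_bytes).
segments = [
    ("FILE_A_PRIMER", 2, b"world"),
    ("FILE_B_PRIMER", 0, b"other file"),
    ("FILE_A_PRIMER", 0, b"hello "),
    ("FILE_A_PRIMER", 1, b"there "),
]

def reassemble(decoded_segments, primer_tag):
    """Select segments matching the file's primer tag, order them by position and concatenate."""
    mine = [s for s in decoded_segments if s[0] == primer_tag]
    mine.sort(key=lambda s: s[1])                     # the position tag gives segment order
    return b"".join(payload for _, _, payload in mine)

print(reassemble(segments, "FILE_A_PRIMER"))          # b'hello there world'
```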

What’s missing?

I am assuming the cellular data storage system has multiple distinct cells of data, which are clustered together into some sort of organism.

Each cell in the cellular data storage system would hold unique file data; a cell could be extracted, a file read out individually from it, and the cell then placed back in the organism. Cells of data could be replicated within an organism or to other organisms.

To be a true storage system, I would think we need to add:

  • DNA data parity – inside each DNA data segment, every eighth base pair would be parity for the preceding base pairs, used to indicate when a particular base pair in the group of eight has mutated (see the sketch after this list).
  • DNA data segment (block) and file checksums – standard data checksums, used to verify and correct for double and triple base pair (bit) corruption in DNA data segments and in the whole file.
  • Cell directory – used to indicate the unique Cell ID of the cell, a file [name] to PCR primer tag mapping table, a version of the DNA file metadata tags, a version of the DNA file XOR string, a DNA file data R-S version/level, the DNA file length or number of DNA data segments, the DNA data creation date-time stamp, the DNA last access date-time stamp, and the DNA data modification date-time stamp (these last two could be omitted).
  • Organism directory – used to indicate a unique organism ID, organism metadata version number, organism unique cell count, unique cell ID to file list mapping, cell ID creation date-time stamp and cell ID replication count.
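
Here's a toy sketch of the parity idea in the first bullet, with base pairs reduced to bits and a parity bit interleaved after every 8 data positions (my own approximation of the bullet's idea, not anything from the paper). A single mutated position within a group flips its parity and gets flagged:

```python
def add_parity(bits):
    """Insert an even-parity bit after every 8 data bits (an approximation of the bullet above)."""
    out = []
    for i in range(0, len(bits), 8):
        group = bits[i:i + 8]
        out.extend(group)
        out.append(sum(group) % 2)        # parity bit covering the preceding group
    return out

def check_parity(coded):
    """Return indices of 9-bit groups whose parity no longer matches (i.e., a likely mutation)."""
    bad = []
    for g, i in enumerate(range(0, len(coded), 9)):
        group = coded[i:i + 9]
        if sum(group) % 2 != 0:
            bad.append(g)
    return bad

coded = add_parity([1, 0, 1, 1, 0, 0, 1, 0] * 2)
coded[3] ^= 1                              # simulate a single mutated base pair
print(check_parity(coded))                 # [0] -> first group flagged
```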

The problem with an organism cell-ID file list is that it could get quite long. It might be better to somehow indicate a range, or list of ranges, of PCR primer tags that are in the cell-ID. I can see other alternatives using a segmented organism directory or an indirect organism cell-to-file-list b-tree, which could hold file name list to cell-ID mappings.

It's unclear whether DNA data storage should support a multi-level hierarchy, like file system directory structures, or a flat hierarchy like object storage, which just has buckets of objects. Considering the cellular structure of DNA data, it appears to me more like buckets, and the glacial access seems more suited to archive systems. So I would lean toward a flat hierarchy and an object storage structure.

Is DNA data WORM or modifiable? Given the effort required to encode and create DNA data segment storage, it would seem it's more WORM-like than modifiable storage.

How will the DNA data storage system persist or be kept alive, if that's the right word for it? There must be some standard internal cell mechanisms to maintain its existence. Perhaps the researchers have just inserted file data DNA into a standard cell as a sort of junk DNA.

If this were the case, you'd almost want to create a separate data nucleus inside a cell, one that would just hold file data and wouldn't interfere with normal cellular operations.

But doesn't the PCR primer tag approach lend itself better to a key-value store database?

Photo Credit(s): Cell structure National Cancer Institute

Prentice Hall textbook

Guide to Open VMS file applications

Unix Inodes CSE410 Washington.edu

Key Value Databases, Wikipedia, By Clescop – Own work, CC BY-SA 4.0, Link

Blockchains go mainstream…


I read an article a while back on Finland's use of blockchain technology to provide bank accounts and identity services to immigrants (see the MIT TechReview article about Finland).

Blockchains were originally invented as a way of supporting financial transactions outside the current, government-monitored financial marketplace. With Finland's experiment, the government is starting to use blockchains to support the unbanked and to monitor their financial activity – go figure.

Debit cards on blockchain

Finland is using a Helsinki-based startup, MONI, to assign a MONI card, essentially a prepaid MasterCard, to all immigrants. An immigrant can use their MONI card to pay for anything online or in real life, use it as a direct deposit account, or receive and track the use of government assistance.

Underlying the MONI card is public blockchain technology. That is, MONI is not using normal credit card services to support its bank accounts; MONI money transfers are done through the use of public blockchains.

MONI accounts are essentially (cryptocurrency) wallets but used as a debit card. The user merely enters a series of numbers into web forms or uses their MONI card at credit card terminals throughout Europe. Transferring money between MONI users anywhere in the world is also free and instantaneous.

Finland also sees an immutable record of all immigrant financial transactions, which can be monitored to track immigrant (financial) integration into the country.

MONI is intending to make this service more broadly available. A MONI card account costs €2/month and MONI takes a small cut of each monetary transaction.

IDs on blockchain

I read another article the other day “Microsoft to implement blockchain-based ID system” in CoinTelegraph about using blockchains as a universal digital ID.

India has, over the last decade, implemented a digital government ID using biometrics (see the Aadhaar Wikipedia article). Other countries have been moving to e-government, where use of government services is implemented over the Internet (see the EU article on eGovernment in Lithuania). Such eGovernment services depend on a digitized population registry.

Although it's unclear whether Aadhaar and Lithuania make use of blockchain technology for their ID services, Microsoft is definitely looking to blockchains to provide unique accounts/digital IDs to its population of users.

User sign-on has been a prevalent problem of the web for years. Each and every web and mobile App requires a person to sign on to personalize their App. Nowadays, many Apps support using a Google ID or Facebook ID for single sign-on, and there are other technologies being offered that provide similar services. Using a blockchain ID could easily support a single sign-on service.

The blockchain ID (wallet) key pair could easily be used to sign an authentication transaction identifying the App and the user. This authentication transaction would be processed by the blockchain digital ID service, which would verify it against the wallet's public key and use a backend ID App repository to check that the user logging in is the person that opened the account, acting as a sort of "proof of who you are".
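
As a sketch of how a wallet key pair could back such a sign-on (my own illustration using the Python cryptography package's Ed25519 support, not Microsoft's actual design): the user signs a challenge with the wallet's private key and the ID service verifies it against the public key registered on the blockchain.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature
import os

# The user's wallet key pair; the public key would be the one recorded on the blockchain ID.
wallet_private = Ed25519PrivateKey.generate()
wallet_public = wallet_private.public_key()

# The ID service issues a random challenge naming the App and the login attempt.
challenge = b"app=SomeApp;nonce=" + os.urandom(16).hex().encode()

# The user's wallet signs the challenge -- the "proof of who you are".
signature = wallet_private.sign(challenge)

# The ID service verifies the signature against the blockchain-registered public key.
try:
    wallet_public.verify(signature, challenge)
    print("login accepted")
except InvalidSignature:
    print("login rejected")
```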

Storage on blockchain

Filecoin and Storj are storage providers that use blockchain services to allow others to use your local (or networked) storage to provide storage to the world.

A while back I had written about (free) peer-to-peer storage and compute services (see my Free P2P cloud storage … post). But the problem was how people would benefit from hosting the P2P storage or compute. Filecoin and Storj solved this by paying in cryptocurrencies for storage hosted on your hardware.

Filecoin offers a storage auction and hosting service that anyone worldwide can log into and use. The data stored is encrypted end-to-end so that no one can see what's being stored, and the data is also erasure coded so that it is protected and accessible even when one or more hosting sites are offline.
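
Erasure coding is what keeps the data accessible when a host disappears. Here's a toy single-parity version for flavor (real systems like Filecoin use much stronger Reed-Solomon-style codes across many shards): any one missing shard can be rebuilt by XORing the survivors.

```python
def shard(data: bytes, k: int):
    """Split data into k equal data shards plus one XOR parity shard (padding with zeros)."""
    size = -(-len(data) // k)                         # ceiling division
    data = data.ljust(k * size, b"\x00")
    shards = [bytearray(data[i * size:(i + 1) * size]) for i in range(k)]
    parity = bytearray(size)
    for s in shards:
        parity = bytearray(a ^ b for a, b in zip(parity, s))
    return shards + [parity]

def rebuild(shards, missing: int):
    """Reconstruct the one missing shard by XORing all the surviving ones."""
    size = len(next(s for s in shards if s is not None))
    out = bytearray(size)
    for i, s in enumerate(shards):
        if i != missing:
            out = bytearray(a ^ b for a, b in zip(out, s))
    return out

pieces = shard(b"encrypted user data goes here", k=4)
lost = pieces[1]                      # pretend hosting site 1 goes offline
pieces[1] = None
assert rebuild(pieces, missing=1) == lost
```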

Filecoin uses "proofs of storage", "proofs of space", "proofs of data possession", and "proofs of retrievability" as ways to guarantee their storage service works properly. They also use chained "proofs of replication" as "proofs of spacetime" for service validation checks. Proofs of replication are a way of ensuring that storage providers are not deduplicating data copies and charging for non-deduped storage. (See Filecoin's Proof of Replication paper for more info.)

Storj looks somewhat similar to Filecoin, but without as much sophistication behind it.

Compute on blockchain

Ethereum was invented to support smart contracts that run on blockchain technology. IBM's Hyperledger OpenLedger project (see our GreyBeardsOnStorage podcast and RayOnStorage post on Hyperledger) is another example.

Smart contracts are essentially applications that run in a blockchain's virtualized environment. Blockchain services are used to run an application and validate that it's run only once. In some cases smart contracts query external oracles as a way to verify that something or some action has occurred outside the blockchain. Other oracles can be entirely digital entities that check on a particular commodity price, weather pattern, account value, etc. The oracle becomes a critical step in determining the go/no-go status of a smart contract.
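
To get a rough feel for the oracle's role, here's a plain Python simulation rather than an actual smart-contract language like Solidity (all the names here are made up): the contract's settlement logic only runs once, and only after the oracle has reported the outside-world fact it depends on.

```python
# Hypothetical toy: a contract that settles only once an oracle reports a commodity price.
class PriceOracle:
    """Stands in for an external data feed; on a real chain this would be an oracle service."""
    def __init__(self):
        self.reported_price = None

    def report(self, price: float):
        self.reported_price = price

class SmartContract:
    def __init__(self, oracle: PriceOracle, strike: float):
        self.oracle = oracle
        self.strike = strike
        self.settled = False                               # the chain guarantees settlement happens once

    def settle(self):
        if self.settled:
            raise RuntimeError("already settled")          # run-only-once check
        if self.oracle.reported_price is None:
            return "waiting on oracle"                     # go/no-go depends on the oracle
        self.settled = True
        return "pay buyer" if self.oracle.reported_price >= self.strike else "pay seller"

oracle = PriceOracle()
contract = SmartContract(oracle, strike=100.0)
print(contract.settle())        # "waiting on oracle"
oracle.report(104.5)
print(contract.settle())        # "pay buyer"
```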

Advertisements vs. crypto mining

Salon, a news website, offers readers an option to see advertisements or to allow Salon to use their computer (browser) to mine crypto coins. (See the Salon offers… article in CoinDesk.)

I believe this offer is made when the website detects a viewer is using  ad blockers.

~~~~

The trend is clear: people, organizations and even governments are looking at blockchain technology to provide basic and advanced services around the world.

If anyone is interested in providing a pre-paid Visa card via blockchains, please contact me. I'd like to help.

Now if I could just find my GPUs at a decent price somewhere…

Speaking of advertising… RayOnStorage doesn’t use advertising. But blogging like this takes time and money. If anyone’s interested in helping fund this blog, please consider sending some BTC our way, even 0.0001 BTC would help.

Our BTC wallet address is:

1MqBbAvMo6QbCVD6ZwtbLaPxmcUZGj9Ghw

Photo Credit(s): Blockchain and the public sector on OpenGovAsia.com

Unleash your design teams with single signon on Unifilabs.com

Understanding the difference between P2P and Client-server networks on LinkedIN

Blockgeek’s guide to smart contracts

A knowledge ark, the Arch project

Read an article last week on the Arch Mission Foundation project, a non-profit organization that intends “to continuously preserve and disseminate human knowledge throughout time and space”.

The way I read this is they want to capture, preserve  and replicate all mankind’s knowledge onto (semi-)permanent media and store this information  at various locations around the globe and wherever we may go.

Interesting way to go about doing this. There are plenty of questions and considerations to capturing all of mankind’s knowledge.

Google’s way

Google has electronically scanned every book from a number of library partners to help provide a searchable database of literature; check out the Google Books Library Project.

There are over 40 library partners around the globe, and the intent of the project was to digitize their collections. The library partners can then provide access to their digital copies. Google provides full access to books in the public domain and provides search results for all the rest, with pointers as to where the books can be found in libraries, purchased or otherwise obtained.

Google Books can be searched at Google Books. Last I heard they had digitized over 30M books from their library partners, which is pretty impressive since the Library of Congress has around 37M books. Google Books is starting to scan magazines as well.

Arch’s way

The intent is to create Archs (pronounced Arks) that can last billions of years. The organization is funding R&D into long-lived storage technologies.

Some of these technologies include:

  • 5D laser optical data storage in quartz. I wrote about this before (see my 5D storage … post). Essentially, they are able to record two-tone scans of documents in transparent quartz that can last eons. Data is recorded in 5 dimensions: size of dot, polarity of dot and the dot's location in 3 dimensions within the media. 5D media lasts for 1000s of years.
  • Nickel ion-beam atomic scale storage. Couldn't find much on this online, but I suppose this technology uses ion beams to etch nano-scale information onto a nickel surface.
  • Molecular storage on DNA molecules. I wrote about this before as well (see my DNA as storage… post), but there's been plenty of research on this more recently. A group from Padua, Italy showed a way forward using bacteria as a read/write head for DNA storage, and there are claims that a gram of DNA could hold a ZB (zettabyte, 10**21 bytes) of data. For some reason Microsoft has been very active in researching this technology and plans to add it to Azure someday.
  • Durable space-based flash drives. Couldn't find anything on this technology, but I assume this is some variant of NAND storage optimized for long duration; current NAND loses charge over time. Alternatively, this could be a version of other NVM storage, such as MRAM, 3DX, ReRAM, Graphene Flash or Memristor, all of which I have written about.
  • Long duration DVD technology. This is sort of old school, but there exist archive-class WORM DVDs on the market today (see my post on M[illeniata]-Disc…).
  • Quantum information storage. Current quantum memory lifetimes don't much exceed 180 seconds, but this is storage not memory. Couldn't find much else on this, but it might be referring to permanent data storage with light.
M-Disc (c) 2011 Millenniata (from their website)

They seem technology agnostic but want something that will last forever.

But what knowledge do they plan to store?

In Arch’s FAQ they talk about open data sets like Wikipedia and the Internet Archive. But they have an interesting perspective on which knowledge to save. From an advanced future civilization perspective, they are probably not as interested in our science and technology but rather more interested in our history, art and culture.

They believe that science and technology should be roughly the same in every advanced civilization. But history, art and culture are going to be vastly different across different civilizations. As such, history, art and culture are uniquely valuable to some future version of ourselves or any other advanced scientific civilization.

~~~~

Arch intends to have multiple libraries positioned on the Earth, on the Moon and Mars over time. And they are actively looking for donations and participation (see link above).

I agree that culture, art and history will be most beneficial to any advanced civilization. But there's always a small but distinct probability that we may not continue to exist as an advanced scientific civilization. In that case, I would think, science and technology would also be needed to bootstrap civilization.

To Wikipedia, I would add GitHub, probably Google Books, and PLOS, as well as any other publicly available scientific or humanities journals.

And don't get me started on what format to record the data in. Needless to say, outdated formats are going to be a major concern for anything but a 2D scan of information after about ten years or so.

In any case, humanity and universanity needs something like this.

Photo Credit(s): The Arch Mission Foundation web page

Google Books Library search on Republic results

“Five dimensional glass disks …” from The Verge

M-disk web page

New techniques shed light on ancient codex & palimpsests

Read an article the other day from the New York Times, A fragile biblical text gets a virtual read, about an approach that uses detailed CT scans combined with X-rays to read text on a codex (a double-sided, hand-bound book) whose pages have been mashed together for ~1500 years.

How to read a codex

Dr. Seales created the technology and has used it successfully to read a small charred chunk of material that was a copy of the earliest known version of the Masoretic text, the authoritative Hebrew bible.

However, that only had text on one side. A codex is double-sided, and being able to distinguish which side of a piece of papyrus or parchment a letter was on is yet another level of granularity.

The approach uses X-ray scanning to triangulate where the sides of the codex pages are with respect to the material, and then uses detailed CT scans to read the ink of the letters of the text in space. Together, the two techniques can read letters and place them on the sides of a codex.

Apparently the key to the technique was creating software that could model the surfaces of a codex or other contorted pieces of papyrus/parchment, and combining that with the X-ray scans to determine where in space the sides of the papyrus/parchment resided. Then, when the CT scans revealed letters in planar scans (space), they could be properly placed on sides of the codex and in sequence, to be literally read.

M.910, an unreadable codex

In the article, Dr. Seales and team were testing the technique on a codex written sometime between 400 and 600 AD that contained the Acts of the Apostles, one of the books of the New Testament, and possibly another book.

The pages had been merged together by a cinder that burned through much of the book. Most famous codexes are named, but this one was known only as M.910, for the 910th acquisition of the Morgan Library.

M.910 was so fragile that it couldn’t be moved from the library. So the team had to use a portable CT scanner and X-ray machine to scan the codex.

The scans for M.910 were completed this past December and the team should start producing (Coptic) readable pages later this month.

Reading Palimpsests

A palimpsest is a manuscript on which the original writing has been obscured or erased. Another article, from UCLA Library News, Lost ancient texts recovered and published online, talks about the use of multi-wavelength spectral imaging to reveal text and figures that have been erased or obscured from the Sinai Palimpsests. The texts can be read at Sinaipalimpsests.org and total 6800 pages in 10 languages.

In this case the text had been deliberately erased or obscured so the parchment or papyrus could be reused. The writings are from the 5th to 12th centuries. The texts are located in St. Catherine's Monastery, whose collection of ancient and medieval manuscripts is considered second only to that of the Vatican Library.

~~~~

There are many damaged codexes squirreled away in libraries throughout the world today, but up until now they were mere curiosities. If successful, these new techniques will enable scholars to read their texts, translate them and make them available for researchers and the rest of the world to read and understand.

Now if someone could just read my WordPerfect files from the 1990s and SCRIPT/VS files from the 1980s, I'd be happy.

Comments?

Picture credit(s): From NY Times article by Nicole Craine 

Acts of apostles codex

From Sinai Palimpsests Project website

Blockchain, open source and trusted data lead to better SDG impacts

Read an article today in Bitcoin Magazine, IXO Foundation: A blockchain based response to UN call for [better] data, which discusses how the UN can use blockchains to improve their development projects.

The UN introduced the 17 Global Goals for Sustainable Development (SDG), to be achieved in the world by 2030. They replace the previous 8 Millennium Development Goals (MDG), which expired in 2015.

Although significant progress has been made on the MDGs, one ongoing detriment to MDG attainment has been that progress has been very uneven, “with the poorest and economically disadvantaged often bypassed”. (See WEF, What are Sustainable Development Goals.)

Throughout the UN's 17 SDGs, the underlying objective is to end global poverty in a sustainable way.

Impact claims

In the past, organizations performing services for the UN under the MDG mandate indicated they were working toward the goals by stating, for example, that they had planted 1K acres of trees, taught 2K underage children or distributed 20 tons of food aid.

The problem with such organizational claims is that they were left mostly unverified. So the UN, NGOs and other charities funding these projects were dependent on trusting the delivering organization to tell the truth about what they were doing on the ground.

However, impact claims such as these can be independently validated and by doing so the UN and other funding agencies can determine if their money is being spent properly.

Proving impact

Proofs of Impact Claims can be done by an automated bot, an independent evaluator or some combination of the two. For instance, a bot could be used to analyze periodic satellite imagery to determine whether 1K acres of trees were actually planted or not; an independent evaluator can determine if 2K students are attending class or not; and both bots and evaluators can determine if 20 tons of food aid have been distributed or not.

Such Proofs of Impact Claims then become an important check on what organizations performing services are actually doing. With over $1T spent every year on the UN's SDG activities, understanding which organizations actually perform the work and which don't is a major step towards optimizing the SDG process. But for Impact Claims and Proofs of Impact Claims to provide such feedback, they must be adequately traced back to identified parties, certified as trustworthy and made widely available.

The ixo Foundation

The ixo Foundation is using open source, smart contract blockchains, personalized data privacy, and other technologies in the ixo Protocol for UN and other organizations to use to manage and provide trustworthy data on SDG projects from start to completion.

Trustworthy data seems a great application for blockchain technology. Blockchains have a number of features used to create trusted data:

  1. Any impact claim and proof of impact becomes inherently immutable, once entered into a blockchain.
  2. All parties to a project, funders, services and evaluators can be clearly identified and traced using the blockchain public key infrastructure.
  3. Any data can be stored in a blockchain. So, any satellite imagery used, the automated analysis bot/program used, as well as any derived analysis result could all be stored in an intelligent blockchain.
  4. Blockchain data is inherently widely available and distributed, in fact, blockchain data needs to be widely distributed in order to work properly.


The ixo Protocol

The ixo Protocol is a method to manage (SDG) Impact projects. It starts with 3 main participants: funding agencies, service agents and evaluation agents.

  • Funding agencies create and digitally sign new Impact Projects with pre-defined criteria to identify appropriate service agencies, which can do the work of the project, and evaluation agencies, which can evaluate the work being performed. Funding agencies also identify Impact Claim Template(s) for the project, which define standard ways for the service agencies doing the work to show that the project is being performed properly. Funding agencies also specify the evaluation criteria used by evaluation agencies to validate claims.
  • Service agencies select among the open Impact Projects whichever ones they want to perform. As the service agencies perform the work, impact claims are created according to the templates defined by funders, digitally signed, recorded and collected into an Impact Claim Set under the ixo protocol. For example, Impact Claims could be barcode scans of food being distributed, which are digitally signed by the servicing agent and agency. Impact claims can be constructed to hold no personal identification data but still cryptographically identify the appropriate parties performing the work.
  • Evaluation agencies then take the Impact Claim Set and perform the evaluation process as specified by the funding agencies. The evaluation ensures that the Impact Claims reflect that the work is being done correctly and that the Impact Project is being executed properly. Impact claim evaluations are also digitally signed by the evaluation agency and agent(s), recorded and widely distributed.

The Impact Project definition, Impact Claim Templates, Impact Claim Sets and Impact Claim Evaluations are all available worldwide, in a Global Impact Ledger, and accessible to any and all funding agencies, service agencies and evaluation agencies. At project completion, funding agencies should have a granular record of all claims made by the service agency's agents for the project and what the evaluation agency says was actually done or not.
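
To make the flow concrete, here's a hypothetical sketch of the claim/evaluation records (the field names and hashing are mine, not the actual ixo Protocol schema): each record is fingerprinted, attributed to a signer and appended to a shared ledger, so funders can audit the whole chain afterwards.

```python
import hashlib, json, time

ledger = []                                   # stand-in for the Global Impact Ledger

def record(entry: dict, signer: str):
    """Fingerprint and attribute an entry, then append it to the shared ledger (illustration only)."""
    entry = dict(entry, signer=signer, timestamp=time.time())
    entry["digest"] = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
    ledger.append(entry)
    return entry

# Funder defines the project and a claim template.
project = record({"type": "ImpactProject", "goal": "distribute 20 tons of food aid"},
                 signer="funding-agency-key")

# Service agent files an impact claim against the project while doing the work.
claim = record({"type": "ImpactClaim", "project": project["digest"],
                "evidence": "barcode-scan-batch-42"}, signer="service-agent-key")

# Evaluator checks the claim and records the verdict.
record({"type": "ImpactClaimEvaluation", "claim": claim["digest"], "verdict": "verified"},
       signer="evaluation-agent-key")

for e in ledger:
    print(e["type"], e["digest"][:12])
```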

Such information can then be used to guide the next round of Impact Project awards to further advance the UN SDGs.

Ambly project

The Ambly Project is using the ixo Protocol to supply childhood education to underprivileged children in South Africa.

It combines mobile apps with blockchain smart contracts to replace an existing paper based school attendance system.

The mobile app is used to record attendance each day, which creates an impact claim that can then be validated by evaluators to ensure children are being educated and properly attending class.

~~~

Blockchains have the potential to revolutionize financial services, provide supply chain provenance (e.g., diamonds with Blockchains at IBM), validate company to company contracts (Ethereum enters the enterprise) and now improve UN SDG attainment.

Welcome to the new blockchain world.

Photo Credit(s): What are Sustainable Development Goals, World Economic Forum;

IXO Foundation website

Ambly Project webpage

Magnonics for configurable electronics

Read an article today in ScienceDaily on [a] New way to write magnetic info …, which discusses research done at Imperial College London that used a magnetic force microscope (a small magnetic probe) to write magnetic fields onto a dense array of nanowires.

Frustrated metamaterials needed

The original research is written up in a Nature article, Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing (paywall). Unclear what that means, but the paper's abstract discusses geometrically frustrated magnetic metamaterials. This is where the physical size or geometrical properties of the materials at the nanometer scale restrict or limit the magnetic states the material can exhibit.

Magnetic storage deals with magnetic material, but there are a number of unique interactions between magnetic materials in close (nm) proximity to one another, and nanowire geometrically frustrated magnetic metamaterials can be magnetized to different magnetic moments, which can be exploited for other uses. These interactions and magnetic moments can be combined to provide electronic circuitry and data storage.

I believe the research provides a proof point that such materials can be written, in close proximity to one another using a magnetic force microscope.

Why it’s important

The key is the potential to create magnonic circuitry based on the pattern of moments written into an array of nanowires. In doing so, one can fabricate any electrical circuit. It's almost like photolithography but without fabs, chemicals or laser scanners.

At first I thought this could be a denser storage device, but the potential is much greater if electronic circuitry could be constructed without having to fabricate semiconductors. It would seem ideal for testing out circuitry before manufacturing. And ultimately if it could be scaled up, the manufacture/fabrication of electronic circuitry itself could be done using these techniques.

Speed, endurance, write limits?

There was no information in the public article about the speed of writing the "frustrated magnetic metamaterials". But an atomic force microscope can scan 150×150 micrometers in several minutes. If we assume a typical chip size today is 150×150 mm, then covering it would take 1E6 times several minutes, or ~2K days. With multiple scanning force microscopes operating concurrently we could cut this down by a factor of 10 or 100, and maybe someday 1000. Two days to write any electronic circuit on the order of today's 23nm devices, with nanowires and magnetic force microscopes, would be a significant advance.
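
Here's that back-of-the-envelope arithmetic in a few lines of Python; the 150×150 micrometer scan field, the "several minutes" per field and the 150×150 mm target are all my assumptions from above, not measured figures:

```python
scan_field_um = 150.0            # magnetic force microscope field, 150x150 micrometers
target_mm = 150.0                # assumed "chip" size from above, 150x150 mm
minutes_per_scan = 3.0           # "several minutes" per field

fields = (target_mm * 1000 / scan_field_um) ** 2           # ~1e6 scan fields to cover the area
days = fields * minutes_per_scan / 60 / 24                  # a single microscope
print(round(fields), round(days))                           # ~1,000,000 fields, ~2,083 days

for n in (10, 100, 1000):                                   # microscopes working in parallel
    print(n, round(days / n, 1), "days")                    # 1000 microscopes -> ~2 days
```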

Also there was no mention of endurance, write limits or other characteristics we have learned to love with Flash storage. But the assumption is that it can be written multiple times and that the pattern stays around for some amount of time.

How magnetics generate electronic circuits

Neither the Wikipedia page, the public article nor the paywalled article's abstract describes how magnonics can supply electronic circuitry. However, both the abstract and the public article discuss applications for this new technology in hardware-based neural networks using arrays of densely packed nanowires.

Presumably, by writing different magnetic patterns into these nanowire metamaterials, such patterns can be used to simulate hardware-connected neurons. This means that the magnetic information can be overwritten, because it can be trained. Also, such magnetic circuits could be constructed to: a) create different paths for electrons to flow through the material; b) restrict or enhance this electron flow; and c) integrate across a number of inputs and determine how electron flow will proceed from a simulated neuron.

If magnonics can do all that, it's very similar to the electronic gates found today in CPUs, GPUs and other electronic circuitry. Maybe it cannot simulate every gate or electronic device found in today's CPUs, but it's a step in the right direction. And magnonics is relatively new. Silicon transistors are over 70 years old and the integrated circuit is almost 60 years old. So in time, magnonics could very well become the next generation of chip technology.

Writing speed is a problem. Maybe if they spun the nanowire array around the magnetic force microscope…

Comments?

Photo Credits:  Real space observation of emergent magnetic monopoles … Nature article

Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing, Nature article


Scratch file use in HPC @ORNL, a statistical analysis

Attended SC17 (the Supercomputing Conference) this past week and received a copy of the accompanying research proceedings. There are a number of interesting papers in the proceedings, and I came across one, Scientific User Behavior and Data Sharing Trends in a Peta Scale File System by Seung-Hwan Lim, et al. from Oak Ridge National Laboratory (ORNL), on the use of files at the Oak Ridge Leadership Computing Facility (OLCF), which was very interesting.

The paper statistically describes the use of scratch files in a multi-PB file system (Lustre) at OLCF from January 2015 to August 2016. The OLCF supports over 32PB of storage, has a peak aggregate bandwidth of over 1TB/s, and Spider II (the current Lustre file system) consists of 288 Lustre Object Storage Servers, all interconnected and connected to all the supercomputing cluster servers via an InfiniBand network. Spider II supports all scratch storage requirements for active/queued jobs for Titan (#4 on the Top 500 list of supercomputer clusters worldwide) and other clusters at ORNL.

ORNL uses an HPSS (High Performance Storage System) archive for permanent storage but uses the Spider II file system for all scratch files generated and used during supercomputing applications.  ORNL is expecting Spider III (2018-2023) to host 10 billion files.

Scratch files are purged from Spider II after 90 days of no access. The paper is based on metadata analysis captured during the scratch purging process over 500 days of access.
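
For illustration, a 90-day no-access purge scan looks something like the Python sketch below. This is a generic atime walk, not OLCF's actual Lustre purge tooling, which works off the Lustre metadata servers:

```python
import os, time

PURGE_AGE_DAYS = 90
cutoff = time.time() - PURGE_AGE_DAYS * 24 * 3600

def purge_candidates(root):
    """Yield files under root whose last access time is older than the 90-day cutoff."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime < cutoff:
                    yield path
            except OSError:
                continue                      # files may vanish mid-scan on a busy scratch system

for path in purge_candidates("/tmp"):
    print(path)
```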

The paper displays a number of statistics and metrics on the use of Spider II:

  • Less than 3% of projects have a directory depth >15, the maximum directory depth was recorded at 432, with most projects having a shallow (<10) directory depth.
  • A project typically has 10X the files that a specific researcher has; the median file count per researcher is 2000 files, with a median project having 20,000 files.
  • Storage system performance is actively managed by many projects. For instance, 20 out of 35 science domains manually managed their Lustre cluster configuration to improve throughput.
  • File count continues to grow and reached a peak of 1B files during the time being analyzed.
  • On average only 3% of files were accessed readonly, 10% of files updated (read-write) and 76% of files were untouched during a week period. However, median and maximum file age was 138 and 214 days respectively, which means that these scratch files can continue to be accessed over the course of 200+ days.

There was more information in the paper, but one item missing is statistics on scratch file size distribution, which is a concern.

Nonetheless, it paints an interesting picture of scratch file use in HPC application/supercluster environments today.

Comments?

A steampunk Venusian rover

Read an article last week in The Engineer on “Designing a mechanical rover to explore … Venus“, about a group at JPL, led by Jonathon Sauder, working on a mechanical rover to study Venus.

Venus has a surface temperature of ~470°C, hot enough to melt lead, which will fry most electronics in seconds. Moreover, the Venusian surface is under a lot of pressure, roughly equivalent to a mile under water or ~160X the air pressure at Earth's surface (from NASA Venus in depth). Extreme conditions for any rover.

Going mobile

Sauder and his team were brainstorming mechanical rovers that operate similarly to Theo Jansen's StrandBeest, which walks using wind energy alone. (Check out the video of the BEEST walking.)

Jansen had told Sauder’s team that his devices work much better on smooth surfaces and that uneven, beach like surfaces presented problems.

So Sauder's team started looking at using something with tracks instead of legs/feet, sort of like a World War 1 tank, that could operate upside down as well as right side up.

Rather than sails (as on the StrandBeest), they plan to use multiple vertical-axis wind turbines, called Savonius rotors, located inside the tank to generate energy and store that energy in springs for future use.

Getting data

They're not planning to ditch electronics altogether, but they need to minimize the rover's reliance on electronics.

There are some electronics, based on silicon carbide and gallium carbide, that can operate at 450°C, but they have a very low level of integration at this time, just ~100 transistors per chip. These could be used to add electronic processing and control to the mechanical rover.

Solar panels, which can also operate at 450°C, can supply electricity to the high temperature electronics.

But to get information off the rover and back to Earth, they plan to use a highly radio-reflective spot on the rover and a mechanical shutter mechanism. By opening and closing the shutter while an orbiting satellite sends radio pulses and records whether the rover reflects them or not, the rover can send Morse code to the satellite. The orbiting satellite could record this information and then transmit it to Earth.
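
For fun, here's what driving the shutter from Morse code might look like as an encoder; a toy sketch, since the real timing, character set and satellite protocol would be whatever JPL designs:

```python
MORSE = {"S": "...", "O": "---", "1": ".----", "5": ".....", " ": "/"}  # tiny subset for the demo

def to_shutter_schedule(message: str, unit_s: float = 1.0):
    """Translate a message into (shutter_open, seconds) steps: dot = 1 unit, dash = 3, gaps closed."""
    steps = []
    for ch in message.upper():
        for symbol in MORSE.get(ch, ""):
            if symbol == ".":
                steps.append((True, unit_s))
            elif symbol == "-":
                steps.append((True, 3 * unit_s))
            else:                                   # "/" = word gap, shutter stays closed
                steps.append((False, 7 * unit_s))
            steps.append((False, unit_s))           # gap between dots/dashes
        steps.append((False, 3 * unit_s))           # gap between letters
    return steps

for open_, secs in to_shutter_schedule("SOS"):
    print("OPEN  " if open_ else "closed", secs, "s")
```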

The rover will make use of simple chemical reactions to measure soil, rock and atmospheric chemistry. Soil and rocks suitable for analysis can be scooped up, drilled out and moved to the analysis chamber(s) via mechanical devices. Wind speed and direction can be sensed with simple mechanical devices.

In order to avoid obstacles while roving around the planet, they plan to use a mechanical probe out the front (and back?) of the rover, with control systems attached to it to avoid obstacles. This way the rover can move around more of the planet's surface.

Such a mechanical rover with high temperature electronics might also be suitable for other worlds in the solar system: Mercury for sure, but the moons of the Jovian planets also have extreme pressure environments.

And such an electro-mechanical rover might also work great for probing volcanoes on Earth, although the temperatures there are 700 to 1200°C, ~2 to 3X Venus. Maybe such a rover could be used in highly radioactive environments to record information and send it back to personnel outside the environment, or even effect some preprogrammed repairs. Ocean vents could be another place where such a rover might work well.

Possible improvements

Mechanical probes would need to move vertically and swing horizontally to be effective, and would necessarily have to poke outside the tank's envelope to read obstacles ahead.

Sonar could work better. Sounds or clicks could be produced mechanically and their reflections could also be received mechanically (a mic is just a mechanical transducer). At the pressures on Venus, sound should travel far.

Morse code was designed to efficiently send alphanumerics and not much else. It would seem that another codec could be designed to send scientific information faster. And if one mechanical spot is good, multiple spots would be better, assuming the satellite could detect multiple radio-reflective spots located in close proximity to one another on the rover.

Radio works, but why not use infrared? If there were some way to read an infrared signal from the probe, it could present more information per pass.

For instance, an infrared photo of the rover's bottom or top, using a flat surface, could encode information in cold and hot spots located across that surface.

This could work at whatever infrared resolution available from the satellite orbiting overhead and would send much more information per orbital pass.

In fact, such an infrared surface readout might allow the rover to send B&W pictures up to the satellite. Sonar could provide a mechanism to record a (sound) picture of the environment being scanned. The infrared information could be encoded across the surface via pipes of cool and hot liquids, sort of like core memory of old.

What about steam power? At 450°C there ought to be more than enough heat to boil some liquid and have it cool via expansion. The cool liquid could be used to cool electronics, chemical and solar devices. And as the high temperatures on Venus seem constant, steam power and liquid cooling would be available all the time, eliminating any need for springs to hold energy.

And the cooling liquid from steam engines could be used to support an infrared signaling mechanism.

Still not sure why we need any electronics. A suitably configured, shrunken analytical engine could provide the rudimentary information processing necessary to work the shutter or other transmitter mechanisms; initiate, read out and store mechanical/chemical/sonar sensor readings; and control the other items on the rover.

And with a suitably complex analytical engine there might be some way to mechanically program it with various modes using something like punched tape or cards. Such a device could be used to hold and load information for separate programs in minimal space and could also be used to store information for later transmission, supplying a 100% mechanical storage device.

Going 100% mechanical could also lead to a potentially longer-lived rover than something using some electronics and mostly mechanical devices on a planet like Venus. Mechanical devices can fail, but their failure modes are normally less catastrophic and well understood. Perhaps with sufficient mechanical redundancy and concern for tribology, such a 100% mechanical rover could last an awfully long time without any maintenance, e.g., like Swiss watches.

Comments?

Photo Credit(s): World War One tank – mark 1 by Photos of the Past

Vintage Philmor morse code practice … by Joe Haupt

Accompanied by an instructor… by vy pham;

Core memory more detail by Kenneth Moore;

Model of the Analytical Engine By Bruno Barral (ByB), CC BY-SA 2.5;

Punched tape by Rositslav Lisovy

Steam locomotives by Jim Phillips