Western Digital at SFD15: ActiveScale object storage

Phill Bullinger and his staff from Western Digital presented at Storage Field Day 15 (SFD15) on a number of their enterprise products including Tegile and IntelliFlash but the one that caught my interest was their ActiveScale object store acquired from Amplidata back in 2015.

ActiveScale is an onprem, object storage system that provides cloud-like  economics for customer data.

ActiveScale Hardware

ActiveScale systems can both scale up and scale out within a single site. ActiveScale systems have both  storage and system nodes. Storage nodes perform erasure coding and System nodes are control points and metadata managers for the object store.

ActiveScale comes in two appliance configurations that contain both storage and system nodes and storage required.  The two appliances are:

  • ActiveScale P100 is a 7U 720TB pod system and A full rack of P100s can read 8GB/sec and can have 17-9s data availability. The P100 can scale up to 2.1PB in a single rack and up to 18PB in the same namespace. The P100 is a higher performing solution with better performing storage and system nodes
  • ActiveScale X100 is a 42U rack scale solution that holds up to 588 12TB drives or 5.8PB per rack. The X100 can scale up to 9 racks or 52PB in the same namespace. The X100 is a denser configuration with only 6 storage nodes and as such, has a better $/GB than the P100 above.

As WDC is both the supplier of the ActiveScale appliance and a supplier of disk storage they can be fairly aggressive with pricing on appliance systems.

Data integrity in ActiveScale

They make a point of saying that ActiveScale object metadata and data are stored separately. By separating data and metadata, they claim to be  more resilient to system failures. Object metadata is 3 way replicated, in a replicated database, residing in system nodes. Other object systems often store metadata and object data in the same way.

Object data can be erasure coded. That is, object data is chunked, erasure coding protected and then spread across multiple disk drives for data protection. ActiveScale erasure coding is called BitSpread. With BitSpread customers identify the number of disk drives to spread object data across and the number of drive failures the system should recover from without data loss.

A typical BitSpread configuration splits object data into 18 chunks and spreads these chunks across storage columns. A storage column is from 6-18 storage nodes. There’s no pre-allocated space in BitSpread. Object data chunks are allocated to disk storage based on current capacity and performance of the system, within redundancy constraints.

In addition, ActiveScale has a background task called BitDynamics that scans  erasure coded chunks and does a mathematical health check on the object data. If a chunk is bad, the object data chunk can be recovered and re-erasure coded back to proper health.

WDC performance testing shows that BitDynamics has 0 performance degradation when performing re-erasure coding. Indeed, they took out 98 drives in an ActiveScale cluster and BitDynamics re-coded all that data onto other disk drives and detected no performance impact. No indication how long  re-encoding 98 disk drives of data took nor the % of object store capacity utilization at the time of the test but presumably there’s a report someplace to back this up

Unlike many public cloud based object storage systems, ActiveScale is strongly consistent. That is object puts (writes) are not responded back to the entity doing the put,  until the object metadata and object data are properly and safely recorded in the object store.

ActiveScale also supports 3 site erasure coding. GeoSpread is their approach to erasure coding across sites. In this case, object metadata is replicated across 3 system nodes across the sites. Object data and erasure coded information is split into 20 chunks which are then spread across the three sites.  This way if any one site goes down, the other two sites have sufficient metadata, object data chunks and erasure coded information to reconstruct the data.

ActiveScale 5.2 now supports asynch replication. That is any one ActiveScale cluster can replicate to any other ActiveScale cluster located continent distances away.

Unclear how GeoSpread and asynch replication would interact together, but my guess is that each of the 3 GeoSpread sites could be asynchronously replicated to 3 other sites for maximum redundancy.

Both GeoSpread and ActiveScale replication impact performance,  depending on how far the sites are from one another and the speed and bandwidth of the links between sites.

ActiveScale markets

ActiveScale’s biggest market is media and entertainment (M&E), mostly used for media archive or tape replacement/augmentation. WDC showed one customer case study for the Montreaux Jazz Festival, which migrated 49 years of performance videos up to ActiveScale and can now stream any performance, on request, without delay. Montreax media is GeoSpread across 3 sites in France. Another option is to perform transcoding on the object media in realtime and stream the transcoded media.

Another large market is Bio/Life Sciences. Medical & biological scanners are transitioning to higher resolution scans which take more data space. And this sort of medical information needs to be kept a long time

Data analytics on ActiveScale

One other emerging market is data analytics. With the new S3A (S3 adapter), Hadoop clusters can now support object storage as a 2nd tier. One problem with data analytics is that they have lots of data and storing it in triplicate, costs an awful lot.

In big data world, datasets can get very large very quickly. Indeed PB sizes data sets aren’t that unusual. And with triple replication (in native HDFS). When HDFS runs out of space you have to delete data. Before S3A, the only way you could increase storage you had to scale out (with compute and storage and networking) in order to add capacity.

Using Hadoop’s S3A, ActiveScale’s can provide cold archive for data analytics.  From a Hadoop user/application perspective, S3A ActiveScale storage looks like just another directory under HDFS (Hadoop Data File System). You can run MapReduce or other Hadoop application directly against object buckets. But a more realistic approach is to move inactive or cold data from an disk resident HDFS directory to a S3A directory

HDFS and MapReduce are tightly coupled and were designed to have data close to where computation happens. So,  as long as the active data or working set data is on HDFS disk storage or directly in memory the rest of the (inactive) data could all be placed on S3A object storage. Inactive data is normally historical data no longer being actively analyzed while newer data would be actively analyzed. Older, inactive data can be manually or automatically archived off to S3A. With HIVE you can partition your database to have active data in HDFS disk storage and inactive data in S3A.

Another approach is if the active, working set data can all fit directly in memory then the data can reside on S3A object storage. This way the data is read from S3A storage into memory, analyzed there and output be done back to object store or HDFS disk. Because the data is only read (loaded) once, there’s only a minimal performance penalty to use S3A storage.

Western Digital is an active contributor to Hadoop S3A and have recently added performance improvements to S3A, such as better caching, partial object reading, and core XML performance tuning options.

~~~~
If your interested in learning more about Western Digital ActiveScale, check out the videos referenced earlier and their website.

Also you may be interested in these other posts on the WD sessions at SFD15:

The A is for Active, The S is for Scale by Dan Firth (@PenguinPunk)

Comments?

Random access, DNA object storage system

Read a couple of articles this week Inching closer to a DNA-based file system in ArsTechnica and DNA storage gets random access in IEEE Spectrum. Both of these seem to be citing an article in Nature, Random access in large-scale DNA storage (paywall).

We’ve known for some time now that we can encode data into DNA strings (see my DNA as storage … and Genomic informatics takes off posts).

However, accessing DNA data has been sequential and reading and writing DNA data has been glacial. Researchers have started to attack the sequentiality of DNA data access. The prize, DNA can store 215PB of data in one gram and DNA data can conceivably last millions of years.

Researchers at Microsoft and the University of Washington have come up with a solution to the sequential access limitation. They have used polymerase chain reaction (PCR) primers as a unique identifier for files. They can construct a complementary PCR primer that can be used to extract just DNA segments that match this primer and amplify (replicate) all DNA sequences matching this primer tag that exist in the cell.

DNA data format

The researchers used a Reed-Solomon (R-S) erasure coding mechanism for data protection and encode the DNA data into many DNA strings, each with multiple (metadata) tags on them. One of tags is the PCR primer tag header, another tag indicates the position of the DNA data segment in the file and an end of data tag that is the same PCR primer tag.

The PCR primer tag was used as sort of a file address. They could configure a complementary PCR tag to match the primer tag of the file they wanted to access and then use the PCR process to replicate (amplify) only those DNA segments that matched the searched for primer tag.

Apparently the researchers chunk file data into a block of 150 base pairs. As there are 2 complementary base pairs, I assume one bit to one base pair mapping. As such, 150 base pairs or bits of data per segment means ~18 bytes of data per segment. Presumably this is to allow for more efficient/effective encoding of data into DNA strings.

DNA strings don’t work well with replicated sequences of base pairs, such as all zeros. So the researchers created a random sequence of 150 base pairs and XOR the file DNA data with this random sequence to determine the actual DNA sequence to use to encode the data. Reading the DNA data back they need to XOR the data segment with the random string again to reconstruct the actual file data segment.

Not clear how PCR replicated DNA segments are isolated and where they are originally decoded (with a read head). But presumably once you have thousands to millions of copies of a DNA segment,  it’s pretty straightforward to decode them.

Once decoded and XORed, they use the R-S erasure coding scheme to ensure that the all the DNA data segments represent the actual data that was encoded in them. They can then use the position of the DNA data segment tag to indicate how to put the file data back together again.

What’s missing?

I am assuming the cellular data storage system has multiple distinct cells of data, which are clustered together into some sort of organism.

Each cell in the cellular data storage system would hold unique file data and could be extracted and a file read out individually from the cell and then the cell could be placed back in the organism. Cells of data could be replicated within an organism or to other organisms.

To be a true storage system, I would think we need to add:

  • DNA data parity – inside each DNA data segment, every eighth base pair would be a parity for the eight preceding base pairs, used to indicate when a particular base pair in eight has mutated.
  • DNA data segment (block) and file checksums –  standard data checksums, used to verify and correct for double and triple base pair (bit) corruption in DNA data segments and in the whole file.
  • Cell directory – used to indicate the unique Cell ID of the cell, a file [name] to PCR primer tag mapping table, a version of DNA file metadata tags, a version of the DNA file XOR string, a DNA file data R-S version/level, the DNA file length or number of DNA data segments, the DNA data creation data time stamp, the DNA last access date-time stamp,and DNA data modification data-time stamp (these last two could be omited)
  • Organism directory – used to indicate unique organism ID, organism metadata version number, organism unique cell count,  unique cell ID to file list mapping, cell ID creation data-time stamp and cell ID replication count.

The problem with an organism cell-ID file list is that this could be quite long. It might be better to somehow indicate a range or list of ranges of PCR primer tags that are in the cell-ID. I can see other alternatives using a segmented organism directory or indirect organism cell to file lists b-tree, which could hold file name lists to cell-ID mapping.

It’s unclear whether DNA data storage should support a multi-level hierarchy, like file system  directories structures or a flat hierarchy like object storage data, which just has buckets of objects data. Considering the cellular structure of DNA data it appears to me more like buckets and the glacial access seems to be more useful to archive systems. So I would lean to a flat hierarchy and an object storage structure.

Is DNA data is WORM or modifiable? Given the effort required to encode and create DNA data segment storage, it would seem it’s more WORM like than modifiable storage.

How will the DNA data storage system persist or be kept alive, if that’s the right word for it. There must be some standard internal cell mechanisms to maintain its existence. Perhaps, the researchers have just inserted file data DNA into a standard cell as sort of junk DNA.

If this were the case, you’d almost want to create a separate, data  nucleus inside a cell, that would just hold file data and wouldn’t interfere with normal cellular operations.

But doesn’t the PCR primer tag approach lend itself better to a  key-value store data base?

Photo Credit(s): Cell structure National Cancer Institute

Prentice Hall textbook

Guide to Open VMS file applications

Unix Inodes CSE410 Washington.edu

Key Value Databases, Wikipedia By ClescopOwn work, CC BY-SA 4.0, Link

Blockchains go mainstream…

 

I read an article a while back on Finland’s use of blockchain technology to provide bank accounts and identity services to immigrants (see  MIT TechReview article about Finland).

Blockchains were originally invented as a way of supporting financial transactions outside the current, government monitored, financial marketplace. With Finland’s experiment, the government is starting to use blockchains to support the unbanked and monitoring their financial activity – go figure.

Debit cards on blockchain

Finland’s using a Helsinki based startup MONI, to assign a MONI card, essentially a prepaid MasterCard, to all immigrants. An immigrant can use their MONI card to pay for anything online or in real life, use it as a direct deposit account or to receive and track the use of government assistance.

Underlying the MONI card is public blockchain technology. That is MONI  is not using normal credit card services to support it’s bank accounts, MONI money transfers are done through the use of public blockchains.

MONI accounts are essentially (crypto currency) wallets but used as a debit card. The user merely enters a series of numbers into web forms or uses their MONI card at a credit card terminals throughout Europe. Transferring money between MONI users anywhere in the World is also free and instantaneous.

Finland also sees an immutable record of all immigrant financial transactions,  that can be monitored to track immigrant (financial) integration into the country.

MONI is intending to make this service more broadly available. A MONI card account costs €2/month and MONI take’s a small cut out of each monetary transaction.

IDs on blockchain

I read another article the other day “Microsoft to implement blockchain-based ID system” in CoinTelegraph about using blockchains as a universal digital ID.

India has over the last decade, implemented a digital government ID using biometrics (see Aadhaar wikipedia article). Other countries have been moving to e-government where use of government services is implemented over the Internet (see EU article on eGovernment in Lithuania). Such eGovernment services depend on a digitized population registry.

Although it’s unclear whether Aadhaar and Lithuania make use of blockchain technology for their ID services, Microsoft’s definitely looking to blockchains to provide unique accounts/digital IDs to it’s population of users.

User signon’s has been a prevalent problem of the web for years. Each and every web and mobile App requires a person to signon to personalize their App. Nowadays, many Apps support using Google ID or Facebook ID for a single signon and there are other technologies being offered that provide similar services. Using a blockchain ID could easily support a single signon service.

The blockchain ID (wallet) public key could easily be used to encrypt an authentication transaction, identifying the App and the user. This authentication transaction would be processed by the blockchain digital ID service would use the private key to decrypt the transaction and use a backend ID App repository for the user to check to see that the user loging in, is the person that opened the account, acting as a sort of “proof of who you are”

Storage on blockchain

Filecoin and StorJ are storage providers that use blockchain services to allow others to use your local (or networked) storage to provide storage to the world.

A while back I had written about (free) peer to peer storage and compute services  (see my Free P2P cloud storage … post). But the problem was how do people benefit from hosting the P2P storage or compute. Filecoin and Storj solved this by paying in cryptocurrencies for storage hosted on your hardware.

Filecoin offers a storage auction and hosting service that anyone worldwide can log into and use. The data stored is encrypted end-to-end so that no one can see what’s being stored and the data is also erasure coded so that it  is protected and accessible even with having one or more hosting sites be offline.

Filecoin uses “proofs of storage“, “proofs of space”, “proofs of data possession“, and “proofs of retrievability” as a way to guarantee their storage service works properly. They also use chained “proofs of replication” as “proofs of spacetime” as service validation checks. Proofs of Replication are a way of insuring that storage providers are not deduplicating data copies and charging for non-deduped storage. (See Filecoin’s Proof of Replication paper for more info).

Storj looks somewhat similar to Filecoin, but without as much sophistication behind it.

Compute on blockchain

Ethereum was invented to support smart contracts that run on blockchain technology. IBM’s HyperLegder OpenLedger project (see our GreyBeardsOnStorga Podcast and RayOnStorage post on Hyperledger) is another example.

Smart contracts are essentially applications that run in a blockchains virtualized environment. Blockchain services are used to run an application and validate that’s it’s run only once. In some cases smart contracts use  external oracles to query as a way to verify something or some action has occurred outside the blockchain. Other oracles can be entirely digital entities that check on a particular commodity price, weather pattern, account value, etc. The oracle becomes a critical step in determining the go no go status of a smartcontract.

Advertisements vs. crypto mining

Salon, a news providing website, offers readers an option to see advertisements or to allow Salon to use their computer (browser) to mine crypto coins. (See Salon offers… article in CoinDesk).

I believe this offer is made when the website detects a viewer is using  ad blockers.

~~~~

Tthe trend is clear, people, organizations and even governments are looking at blockchain technology to provide basic and advanced services around the world.

If anyone would is interested in providing a pre-paid Visa card via blockchains, please contact me. I’d like to help.

Now if I could just find my GPU’s at a decent price somewhere…

Speaking of advertising… RayOnStorage doesn’t use advertising. But blogging like this takes time and money. If anyone’s interested in helping fund this blog, please consider sending some BTC our way, even 0.0001 BTC would help.

Our BTC wallet address is:

1MqBbAvMo6QbCVD6ZwtbLaPxmcUZGj9Ghw

Photo Credit(s): Blockchain and the public sector on OpenGovAsia.com

Unleash your design teams with single signon on Unifilabs.com

Understanding the difference between P2P and Client-server networks on LinkedIN

Blockgeek’s guide to smart contracts

A knowledge ark, the Arch project

Read an article last week on the Arch Mission Foundation project, which is a non-profit, organization that intends “to continuously preserve and disseminate human knowledge throughout time and space”.

The way I read this is they want to capture, preserve  and replicate all mankind’s knowledge onto (semi-)permanent media and store this information  at various locations around the globe and wherever we may go.

Interesting way to go about doing this. There are plenty of questions and considerations to capturing all of mankind’s knowledge.

Google’s way

 Google has electronically scanned every book in a number of library partners to help provide a searchable database of literature, check out the Google Books Library Project.

There’s over 40 library partners around the globe and the intent of the project was to digitize their collections. The library partners can then provide access to their digital copies. Google will provide full access to books in the public domain and will provide search results for all the rest, with pointers as to where the books can be found in libraries, purchased and otherwise obtained.

Google Books can be searched at Google Books. Last I heard they had digitized over 30M books from their library partners, which is pretty impressive since the Library of Congress has around 37M books. Google Books is starting to scan magazines as well.

Arch’s way

The intent is to create Arch’s (pronounced Ark’s) that can last billions of years. The organization is funding R&D into long lived storage technologies.

Some of these technologies include:

  • 5D laser optical data storage in quartz, I wrote about this before (see my 5D storage … post). Essentially, they are able to record two-tone scans of documents in transparent quartz that can last eons. Data is recorded in 5 dimensions, size of dot, polarity of dot  and 3 layers of dot locations through the media. 5D media lasts for 1000s of years.
  • Nickel ion-beam atomic scale storage, couldn’t find much on this online but we suppose this technology uses ion-beams to etch a nickel surface with nano-scale information.
  • Molecular storage on DNA molecules, I wrote about this before as well (see my DNA as storage… post) but there’s been plenty of research on this more recently. A group from Padua, IT  shows the way forward to use bacteria as a read/write head for DNA storage and there are claims that a gram of DNA could hold a ZB (zettabyte, 10**21 bytes) of data. For some reason Microsoft has been very active in researching this technology and plan to add it to Azure someday.
  • Durable space based flash drives, couldn’t find anything on this technology but assume this is some variant of NAND storage optimized for long duration.  Current NAND loses charge over time. Alternatively, this could be a version of other NVM storage, such as, MRAM, 3DX, ReRAM, Graphene Flash, and  Memristor all of which I have written about
  • Long duration DVD technology, this is sort of old school but there exists archive class WORM DVDs out and available on the market today, (see my post on M[illeniata]-Disc…).
  • Quantum information storage, current quantum memory lifetimes don’t much over exceed 180 seconds, but this is storage not memory. Couldn’t find much else on this, but it might be referring to permanent data storage with light.
M-Disc (c) 2011 Millenniata (from their website)
M-Disc (c) 2011 Millenniata (from their website)

They seem technology agnostic but want something that will last forever.

But what knowledge do they plan to store

In Arch’s FAQ they talk about open data sets like Wikipedia and the Internet Archive. But they have an interesting perspective on which knowledge to save. From an advanced future civilization perspective, they are probably not as interested in our science and technology but rather more interested in our history, art and culture.

They believe that science and technology should be roughly the same in every advanced civilization. But history, art and culture are going to be vastly different across different civilizations. As such, history, art and culture are uniquely valuable to some future version of ourselves or any other advanced scientific civilization.

~~~~

Arch intends to have multiple libraries positioned on the Earth, on the Moon and Mars over time. And they are actively looking for donations and participation (see link above).

Although, I agree that culture, art and history will be most beneficial to any advanced civilization. But there’s always a small but distinct probability that we may not continue to exist as an advanced scientific civilization. In that case, I would think, science and technology would also be needed to boot strap civilization.

To the Wikipedia, I would add GitHub, probably Google Books, and PLOS as well as any other publicly available scientific or humanities journals that available.

And don’t get me started on what format to record the data with. Needless to say, out-dated formats are going to be a major concern for anything but a 2D scan of information after about ten years or so.

In any case, humanity and universanity needs something like this.

Photo Credit(s): The Arch Mission Foundation web page

Google Books Library search on Republic results

“Five dimensional glass disks …” from The Verge

M-disk web page

New techniques shed light on ancient codex & palimpsests

Read an article the other day from New York Times, A fragile biblical text gets a virtual read about an approach to use detailed CT scans combined with X-rays to read text on a codex (double sided, hand bound book) that’s been mashed together for ~1500 years.

How to read a codex

Dr. Seales created the technology and has used it successfully to read a small charred chunk of material that was a copy of the earliest known version of the Masoretic text, the authoritative Hebrew bible.

However, that only had text on one side. A codex is double sided and being able to distinguish between which side of a piece of papyrus or parchment was yet another level of granularity.

The approach uses X-ray scanning to triangulate where sides of the codex pages are with respect to the material and then uses detailed CT scans to read the ink of the letters of the text in space. Together, the two techniques can read letters and place them on sides of a codex.

Apparently the key to the technique was in creating software could model the surfaces of a codex or other contorted pieces of papyrus/parchment and combining that with the X-ray scans to determine where in space the sides of the papyrus/parchment resided. Then when the CT scans revealed letters in planar scans (space), they could be properly placed on sides of the codex and in sequence to be literally read.

M.910, an unreadable codex

In the article, Dr. Seales and team were testing the technique on a codex written sometime between 400 and 600AD that contained the Acts of the Apostles and one of the books of the New Testament and possibly another book.

The pages had been merged together by a cinder that burned through much of the book. Most famous codexes are named but this one was only known as M.910 for the 910th acquisition of the Morgan Library.

M.910 was so fragile that it couldn’t be moved from the library. So the team had to use a portable CT scanner and X-ray machine to scan the codex.

The scans for M.910 were completed this past December and the team should start producing (Coptic) readable pages later this month.

Reading Palimpsests

A palimpsests is a manuscript on which the original writing has been obscured or erased. Another article from UCLA Library News, Lost ancient texts recovered and published online,  that talks about the use of multi-wave length spectral imaging to reveal text and figures that have been erased or obscured from Sinai Palimpsests.  The texts can be read at Sinaipalimpsests.org and total 6800 pages in 10 languages.

In this case the text had been deliberately erased or obscured to reuse parchment or papyrus. The writings are from the 5th to 12th centuries.  The texts were located in St. Catherine’s Monastary and access to it’s collection of ancient and medieval manuscripts is considered 2nd only to that in the Vatican Library.

~~~~

There are many damaged codexes scurried away in libraries throughout the world today but up until now they were mere curiosities. If successful, this new technique will enable scholars to read their text, translate them and make them available for researchers and the rest of the world to read and understand.

Now if someone could just read my WordPerfect files from 1990’s and SCRIPT/VS files from 1980’s I’d be happy.

Comments:

Picture credit(s): From NY Times article by Nicole Craine 

Acts of apostles codex

From Sinai Palimpsests Project website

Blockchain, open source and trusted data lead to better SDG impacts

Read an article today in Bitcoin magazine IXO Foundation: A blockchain based response to UN call for [better] data which discusses how the UN can use blockchains to improve their development projects.

The UN introduced the 17 Global Goals for Sustainable Development (SDG) to be achieved in the world by 2030. The previous 8 Millennial Development Goals (MDG) expire this year.

Although significant progress has been made on the MDGs, one ongoing determent to  MDG attainment has been that progress has been very uneven, “with the poorest and economically disadvantaged often bypassed”.  (See WEF, What are Sustainable Development Goals).

Throughout the UN 17 SDG, the underlying objective is to end global poverty  in a sustainable way.

Impact claims

In the past organizations performing services for the UN under the MDG mandate, indicated they were performing work toward the goals by stating, for example, that they planted 1K acres of trees, taught 2K underage children or distributed 20 tons of food aid.

The problem with such organizational claims is they were left mostly unverified. So the UN, NGOs and other charities funding these projects were dependent on trusting the delivering organization to tell the truth about what they were doing on the ground.

However, impact claims such as these can be independently validated and by doing so the UN and other funding agencies can determine if their money is being spent properly.

Proving impact

Proofs of Impact Claims can be done by an automated bot, an independent evaluator or some combination of the two . For instance, a bot could be used to analyze periodic satellite imagery to determine whether 1K acres of trees were actually planted or not; an independent evaluator can determine if 2K students are attending class or not, and both bots and evaluators can determine if 20 tons of food aid has been distributed or not.

Such Proofs of Impact Claims then become a important check on what organizations performing services are actually doing.  With over $1T spent every year on UN’s SDG activities, understanding which organizations actually perform the work and which don’t is a major step towards optimizing the SDG process. But for Impact Claims and Proofs of Impact Claims to provide such feedback but they must be adequately traced back to identified parties, certified as trustworthy and be widely available.

The ixo Foundation

The ixo Foundation is using open source, smart contract blockchains, personalized data privacy, and other technologies in the ixo Protocol for UN and other organizations to use to manage and provide trustworthy data on SDG projects from start to completion.

Trustworthy data seems a great application for blockchain technology. Blockchains have a number of features used to create trusted data:

  1. Any impact claim and proofs of impacts become inherently immutable, once entered into a blockchain.
  2. All parties to a project, funders, services and evaluators can be clearly identified and traced using the blockchain public key infrastructure.
  3. Any data can be stored in a blockchain. So, any satellite imagery used, the automated analysis bot/program used, as well as any derived analysis result could all be stored in an intelligent blockchain.
  4. Blockchain data is inherently widely available and distributed, in fact, blockchain data needs to be widely distributed in order to work properly.

 

The ixo Protocol

The ixo Protocol is a method to manage (SDG) Impact projects. It starts with 3 main participants: funding agencies, service agents and evaluation agents.

  • Funding agencies create and digitally sign new Impact Projects with pre-defined criteria to identify appropriate service  agencies which can do the work of the project and evaluation agencies which can evaluate the work being performed. Funding agencies also identify Impact Claim Template(s) for the project which identify standard ways to assess whether the project is being performed properly used by service agencies doing the work. Funding agencies also specify the evaluation criteria used by evaluation agencies to validate claims.
  • Service agencies select among the open Impact Projects whichever ones they want to perform.  As the service agencies perform the work, impact claims are created according to templates defined by funders, digitally signed, recorded and collected into an Impact Claim Set underthe IXO protocol.  For example Impact Claims could be barcode scans off of food being distributed which are digitally signed by the servicing agent and agency. Impact claims can be constructed to not hold personal identification data but still cryptographically identify the appropriate parties performing the work.
  • Evaluation agencies then take the impact claim set and perform the  evaluation process as specified by funding agencies. The evaluation insures that the Impact Claims reflect that the work is being done correctly and that the Impact Project is being executed properly. Impact claim evaluations are also digitally signed by the evaluation agency and agent(s), recorded and widely distributed.

The Impact Project definition, Impact Claim Templates, Impact Claim sets, Impact Claim Evaluations are all available worldwide, in an Global Impact Ledger and accessible to any and all funding agencies, service agencies and evaluation agencies.  At project completion, funding agencies should now have a granular record of all claims made by service agency’s agents for the project and what the evaluation agency says was actually done or not.

Such information can then be used to guide the next round of Impact Project awards to further advance the UN SDGs.

Ambly project

The Ambly Project is using the ixo Protocol to supply childhood education to underprivileged children in South Africa.

It combines mobile apps with blockchain smart contracts to replace an existing paper based school attendance system.

The mobile app is used to record attendance each day which creates an impact claim which can then be validated by evaluators to insure children are being educated and properly attending class.

~~~

Blockchains have the potential to revolutionize financial services, provide supply chain provenance (e.g., diamonds with Blockchains at IBM), validate company to company contracts (Ethereum enters the enterprise) and now improve UN SDG attainment.

Welcome to the new blockchain world.

Photo Credit(s): What are Sustainable Development Goals, World Economic Forum;

IXO Foundation website

Ambly Project webpage

Magnonics for configurable electronics

Read an article today in ScienceDaily on [a] New way to write magnetic info … that discusses research done at Imperial College Of London that used a magnetic force microscope (small magnetic probe) to write magnetic fields onto a dense array of nanowires.

Frustrated metamaterials needed

The original research is written up in a Nature article Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing  (paywall). Unclear what that means but the paper abstract discusses geometrically frustrated magnetic metamaterials.  This is where the physical size or geometrical properties of the materials at the nanometer scale restricts or limits the magnetic states that material can exhibit.

Magnetic storage deals with magnetic material but there are a number of unique interactions of magnetic material when in close (nm) proximity to one another and the way nanowire geometrically frustrated magnetic metamaterials can be magnetized to different magnetic moments which can be exploited for other uses.  These interactions and magnetic moments can be combined to provide electronic circuitry and data storage.

I believe the research provides a proof point that such materials can be written, in close proximity to one another using a magnetic force microscope.

Why it’s important

The key is the potential to create  magnonic circuitry based on the pattern of moments writen into an array of nanowires. In doing so, one can fabricate any electrical circuit. It’s almost like photolithography but without fabs, chemicals, or laser scanners.

At first I thought this could be a denser storage device, but the potential is much greater if electronic circuitry could be constructed without having to fabricate semiconductors. It would seem ideal for testing out circuitry before manufacturing. And ultimately if it could be scaled up, the manufacture/fabrication of electronic circuitry itself could be done using these techniques.

Speed, endurance, write limits?

There was no information in the public article about the speed of writing the “frustrated magnetic metamaterials”. But an atomic force microscope can scan 150×150 micrometers in several minutes. If we assume that a typical chip size today is 150×150 mm, then this would take 1E6 times several minutes, or ~2K days. With multiple scanning force microscopes operating concurrently we could cut this down by a factor of 10 or 100 and maybe someday 1000. 2 days to write any electronic circuit on the order of todays 23nm devices with nanowires and magnetic force microscopes would be a significant advance

Also there was no mention of endurance, write limits or other characteristics we have learned to love with Flash storage. But the assumption is that it can be written multiple times and that the pattern stays around for some amount of time.

How magnetics generate electronic circuits

Neither Wikipedia page, the public article or the paywall articles’ abstract describes how Magnonics can supply electronic circuitry. However both the abstract and the public article discuss applications for this new technology in hardware based neural networks using arrays of densely packed nanowires.

Presumably, by writing different magnetic patterns in these nanowire metamaterials, such patterns can be used to simulate hardware connected neurons. This means that the magnetic information can be overwritten because it can be trained. Also, such magnetic circuits can be constructed to: a) can create different path for electrons to flow through the material; b) can restrict or enhance this electronic flow, and c) can integrate across a number of inputs and determine how electronic flow will proceed from a simulated neuron.

If magnonics can do all that,  it’s very similar to electronic gates today in CPU, GPUs and other electronic circuitry. Maybe it cannot simulate every gate or electronic device that’s found in todays CPUs but it’s a step in the right direction. And magnonics is relatively new. Silicon transistors are over 70 years old and the integrated circuit is almost 60 years old. So in time, magnonics could very well become the next generation of chip technology.

Writing speed is a problem. Maybe if they spun the nanowire array around the magnetic force microscope…

Comments?

Photo Credits:  Real space observation of emergent magnetic monopoles … Nature article

Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing, Nature article