Living forever – the end of evolution part-3

Read an article yesterday on researchers who have been studying various mammals, trying to determine how many DNA mutations they accumulate by the time they die. The researchers found that naked mole rats die after accumulating roughly 800 mutations; see the Nature article Somatic mutation rates scale with lifespan across mammals and the Telegraph article reporting on the research, Mystery of why humans die around 80 may finally be solved.

Similarly, humans die at around 3500 mutations, dogs at around 3000 and mice at around 1500. But the really interesting thing is that DNA mutation rate and mammal lifespan are highly (negatively) correlated. That is, mammals with higher mutation rates have shorter lifespans.

(Figure: A shows microscopic images of sample mammalian cells and the DNA strands examined; B shows the distribution of different types of DNA mutations (substitutions or indels [insertions/deletions of DNA]); C shows a linear regression of somatic substitution burden (corrected for analysable genome size) on individual age for dog, human, mouse and naked mole-rat samples. Samples from the same individual are shown in the same colour, the regression was performed using mean mutation burdens per individual, and shaded areas indicate 95% confidence intervals of the regression line.)

The Telegraph article seems to imply that all mammals die at 800 mutations. But the Nature article clearly indicates that death occurs at a different mutation count for each species.

Such research shows one way to live forever. We have talked about similar topics in the distant past; see …-the end of evolution part 1 & part 2.

In any case, it turns out that one of the leading factors explaining a mammal's average age at death is its DNA mutation rate. Again, mammals with lower DNA mutation rates live longer on average, and mammals with higher DNA mutation rates live shorter lives on average.

Moral of the story

If you want to live longer, reduce your DNA mutation rate.

(Figure: zero-intercept LME regression of somatic mutation rate on inverse lifespan (1/lifespan), presented on the scale of untransformed lifespan. For simplicity, the axis shows mean mutation rates per species, although rates per crypt were used in the regression. The darker shaded area indicates the 95% CI of the regression line, and the lighter shaded area marks a twofold deviation from the line. The point estimate and 95% CI of the regression slope (k), the FVE and the range of end-of-lifespan burden are indicated.)
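
To make the relationship concrete, here is a minimal sketch of fitting that kind of zero-intercept regression of mutation rate on 1/lifespan. The end-of-life mutation burdens are the rounded numbers quoted above; the lifespans are rough figures I'm assuming for illustration, so this is emphatically not the paper's data or analysis.

```python
import numpy as np

# Illustrative numbers only (not the paper's data): end-of-life mutation
# burdens quoted above, with rough lifespans assumed for each species.
species  = ["human", "dog", "mouse", "naked mole rat"]
lifespan = np.array([80.0, 12.0, 3.7, 25.0])            # years (assumed)
burden   = np.array([3500.0, 3000.0, 1500.0, 800.0])    # mutations at death

rate = burden / lifespan          # somatic mutations per year

# Zero-intercept least-squares fit of rate on 1/lifespan: rate ≈ k / lifespan
x = 1.0 / lifespan
k = np.sum(x * rate) / np.sum(x * x)

print(f"fit: mutation rate ≈ {k:.0f} / lifespan (mutations per year)")
for name, r, yrs in zip(species, rate, lifespan):
    print(f"{name:>15}: {r:6.1f} mutations/yr over {yrs:4.1f} yr")
# k is the end-of-lifespan burden implied by the fit; the per-species
# burdens show how far each species deviates from that common value.
```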

All astronauts are subject to significant cosmic radiation, which can't help but accelerate DNA mutation. So one would have to say that a risk of being an astronaut is that you will die younger.

Moon and Martian colonists will have the same problem. People traveling, living and working there will have an increased risk of dying young. And of course anyone who works around radiation has the same risk.

Note that the mutation counts/rates that seem to govern lifespan are averages. Some individuals have lower mutation rates than their species average and some (no doubt) have higher rates; these individuals should have longer and shorter lives on average, respectively.

Given this variability in DNA mutation rates, I would propose that space agencies use a candidate's DNA mutation rate as one selection criterion, so that humans with lower-than-average DNA mutation rates have a higher priority for selection as astronauts or extra-earth colonists. Using this research, and assaying astronauts' DNA mutation counts as they come back to Earth, one could theoretically determine the impact on their average lifespan.

In addition, most life extension research is focused on rejuvenating cellular or organism functionality, mainly through the use of young blood, other select nutrients, stem cells that target specific organs, etc. For example, see MIT Scientists Say They've Invented a Treatment That Reverses Hearing Loss, which involves taking human cells, transforming them into stem cells (at a certain maturity) and injecting them into the eardrum.

Living forever

In prior posts on this topic (see parts 1 & 2 linked above) we suggested that, with DNA computation and DNA storage (see, or rather listen to, our GBoS podcast with the CTO of Catalog) now becoming viable, one could potentially come up with a DNA program that could:

  • Store an individual's DNA in some very reliable and long-lived encoding (inside a cell or external to it), and
  • Craft a DNA program that could be periodically activated (a cellular crontab) to access the individual's stored DNA (in-cell would be easiest) and use this copy to replace/correct any DNA mutations throughout the individual's cells.

And we would need a very reliable and correct copy of that person's DNA (using SHA-256 hashing, CRCs, ECC, parity and every other way to ensure the DNA as captured is stored correctly forever). And the earlier we obtain the DNA copy of an individual human, the better.
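
A minimal sketch of the integrity-checking idea (my illustration, not an actual DNA storage system): fingerprint the captured sequence with SHA-256 at capture time, then re-verify it whenever the stored copy is read back. The sequence below is made up.

```python
import hashlib

def fingerprint(dna_sequence: str) -> str:
    """Return the SHA-256 digest of a DNA sequence (A/C/G/T string)."""
    return hashlib.sha256(dna_sequence.upper().encode("ascii")).hexdigest()

# At capture time: record the sequence and its digest together.
captured = "ACGTACGTTAGCCGATACGT"            # made-up sequence
stored_digest = fingerprint(captured)

# Years later: re-read the stored copy and verify it is still intact.
retrieved = "ACGTACGTTAGCCGATACGT"
if fingerprint(retrieved) == stored_digest:
    print("stored DNA copy verified intact")
else:
    print("stored DNA copy corrupted -- recover via ECC/parity and re-check")
```

Note that a hash only detects corruption; the ECC/parity mentioned above is what would actually repair a damaged copy.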

Also, we would need a copy of the program (and probably the DNA) to be present in every cell of a human for this to work effectively.

However, if we could capture a good copy of a person's DNA early in their life, we could perhaps, sometime later, incorporate a DNA program into the individual that uses this copy to sweep through their body and correct any mutations accumulated to date. Ultimately, one could schedule this activity like an annual checkup.

So yeah, life extension research can continue along its current lines and produce a bunch of point solutions for cellular/organism malfunction, OR it can focus on correctly copying and storing DNA forever and on creating a DNA program that can correct DNA defects in every individual cell using that stored copy.

End of evolution

Yes, mammals, and that means any human, could live forever this way. But it would signify the start of the end of evolution for the human species. That is, from whenever we captured an individual's DNA copy, evolution (by mutating DNA) of that individual and of any offspring of that individual could no longer take place. And if enough humans do this throughout their lifespans, it means the end of evolution for humanity as a species.

This assumes that evolution (which is natural variation driven by genetic mutation & survival of the fittest) requires DNA variation (essentially mutation) to drive the species forward.

~~~~

So my guess is either we can live forever and stagnate as a species OR live normal lifespans and evolve as a species into something better over time. I believe nature has made its choice.

The surprising thing is that we are at a point in humanity's existence where we can conceive of doing away with this natural process, evolution, forever.

Photo Credit(s):

Blockchain, open source and trusted data lead to better SDG impacts

Read an article today in Bitcoin Magazine, IXO Foundation: A blockchain based response to UN call for [better] data, which discusses how the UN can use blockchains to improve its development projects.

The UN introduced the 17 Global Goals for Sustainable Development (SDGs) to be achieved worldwide by 2030. The previous 8 Millennium Development Goals (MDGs) expire this year.

Although significant progress has been made on the MDGs, one ongoing impediment to MDG attainment has been that progress has been very uneven, "with the poorest and economically disadvantaged often bypassed" (see WEF, What are Sustainable Development Goals).

Throughout the UN's 17 SDGs, the underlying objective is to end global poverty in a sustainable way.

Impact claims

In the past, organizations performing services for the UN under the MDG mandate indicated they were working toward the goals by stating, for example, that they had planted 1K acres of trees, taught 2K underage children or distributed 20 tons of food aid.

The problem with such organizational claims is that they were left mostly unverified. So the UN, NGOs and other charities funding these projects were dependent on trusting the delivering organization to tell the truth about what it was doing on the ground.

However, impact claims such as these can be independently validated, and by doing so the UN and other funding agencies can determine whether their money is being spent properly.

Proving impact

Proofs of Impact Claims can be produced by an automated bot, an independent evaluator or some combination of the two. For instance, a bot could analyze periodic satellite imagery to determine whether 1K acres of trees were actually planted; an independent evaluator can determine whether 2K students are attending class; and both bots and evaluators can determine whether 20 tons of food aid has been distributed.

Such Proofs of Impact Claims then become an important check on what the organizations performing services are actually doing. With over $1T spent every year on the UN's SDG activities, understanding which organizations actually perform the work and which don't is a major step towards optimizing the SDG process. But for Impact Claims and Proofs of Impact Claims to provide such feedback, they must be adequately traced back to identified parties, certified as trustworthy and made widely available.

The ixo Foundation

The ixo Foundation is combining open source software, smart contract blockchains, personalized data privacy and other technologies in the ixo Protocol, which the UN and other organizations can use to manage and provide trustworthy data on SDG projects from start to completion.

Trustworthy data seems a great application for blockchain technology. Blockchains have a number of features used to create trusted data:

  1. Any impact claim and proof of impact becomes inherently immutable once entered into a blockchain (a hash-chaining sketch follows this list).
  2. All parties to a project (funders, service agencies and evaluators) can be clearly identified and traced using the blockchain's public key infrastructure.
  3. Any data can be stored in a blockchain. So, any satellite imagery used, the automated analysis bot/program used, as well as any derived analysis result could all be stored in an intelligent blockchain.
  4. Blockchain data is inherently widely available and distributed, in fact, blockchain data needs to be widely distributed in order to work properly.
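
As a minimal sketch of the immutability idea (my own illustration, not ixo's implementation), each record can carry the hash of the previous one, so altering any earlier claim changes every hash after it and independently stored copies of the ledger immediately disagree:

```python
import hashlib
import json

def chain(records):
    """Link records into a simple hash chain; returns (record, hash) pairs."""
    prev_hash = "0" * 64                      # genesis placeholder
    ledger = []
    for rec in records:
        payload = json.dumps(rec, sort_keys=True) + prev_hash
        prev_hash = hashlib.sha256(payload.encode()).hexdigest()
        ledger.append((rec, prev_hash))
    return ledger

claims = [                                    # made-up impact claims
    {"claim": "planted 1K acres of trees", "agency": "service-co-1"},
    {"claim": "taught 2K children", "agency": "service-co-2"},
    {"claim": "distributed 20 tons of food aid", "agency": "service-co-3"},
]

print("last hash:", chain(claims)[-1][1])

# Tamper with the first claim and rebuild: the final hash no longer matches.
claims[0]["claim"] = "planted 10 acres of trees"
print("after tampering:", chain(claims)[-1][1])
```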

 

The ixo Protocol

The ixo Protocol is a method for managing (SDG) Impact Projects. It starts with 3 main participants: funding agencies, service agencies and evaluation agencies.

  • Funding agencies create and digitally sign new Impact Projects, with pre-defined criteria to identify appropriate service agencies which can do the work of the project and evaluation agencies which can evaluate the work being performed. Funding agencies also identify Impact Claim Template(s) for the project, which define standard ways for the service agencies doing the work to record whether the project is being performed properly. Funding agencies also specify the evaluation criteria used by evaluation agencies to validate claims.
  • Service agencies select among the open Impact Projects whichever ones they want to perform. As the service agencies perform the work, impact claims are created according to the templates defined by funders, digitally signed, recorded and collected into an Impact Claim Set under the ixo Protocol (a minimal signing sketch follows this list). For example, Impact Claims could be barcode scans of food being distributed, digitally signed by the servicing agent and agency. Impact claims can be constructed to hold no personal identification data while still cryptographically identifying the parties performing the work.
  • Evaluation agencies then take the Impact Claim Set and perform the evaluation process as specified by the funding agencies. The evaluation ensures that the Impact Claims reflect that the work is being done correctly and that the Impact Project is being executed properly. Impact claim evaluations are also digitally signed by the evaluation agency and agent(s), recorded and widely distributed.
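
As a minimal sketch of what digitally signing an impact claim could look like (this is not the ixo Protocol's actual implementation; the claim fields are made up and the use of Ed25519 via the Python cryptography package is my assumption), a service agent might sign a claim like this:

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Hypothetical service-agent key pair; a real system would use registered identities.
agent_key = Ed25519PrivateKey.generate()
agent_pub = agent_key.public_key()

# A made-up impact claim matching the food-distribution example above.
claim = {
    "project_id": "impact-project-001",      # hypothetical identifier
    "claim_type": "food_distribution",
    "barcode": "0012345678905",
    "quantity_kg": 25,
    "timestamp": "2018-01-15T10:32:00Z",
}
claim_bytes = json.dumps(claim, sort_keys=True).encode("utf-8")

# The agent signs the canonicalized claim; the signature travels with the claim.
signature = agent_key.sign(claim_bytes)

# An evaluator holding the agent's public key can verify the claim later.
agent_pub.verify(signature, claim_bytes)     # raises InvalidSignature if tampered
print("impact claim signature verified")
```

Notice the claim itself carries no personal data; the parties are identified only by their keys.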

The Impact Project definition, Impact Claim Templates, Impact Claim Sets and Impact Claim Evaluations are all available worldwide in a Global Impact Ledger, accessible to any and all funding agencies, service agencies and evaluation agencies. At project completion, funding agencies have a granular record of all claims made by the service agency's agents for the project and of what the evaluation agency says was or was not actually done.

Such information can then be used to guide the next round of Impact Project awards to further advance the UN SDGs.

Ambly project

The Ambly Project is using the ixo Protocol to supply childhood education to underprivileged children in South Africa.

It combines mobile apps with blockchain smart contracts to replace an existing paper based school attendance system.

The mobile app records attendance each day, which creates an impact claim that can then be validated by evaluators to ensure children are properly attending class and being educated.

~~~

Blockchains have the potential to revolutionize financial services, provide supply chain provenance (e.g., diamonds with Blockchains at IBM), validate company-to-company contracts (Ethereum enters the enterprise) and now improve UN SDG attainment.

Welcome to the new blockchain world.

Photo Credit(s): What are Sustainable Development Goals, World Economic Forum;

IXO Foundation website

Ambly Project webpage

A college course on identifying BS

Read an article the other day from Recode (These University of Washington professors teaching a course on Calling BS) that seems very timely. The syllabus is online (Calling Bullshit — Syllabus) and it looks like a great start on identifying falsehood wherever it can be found.

In the beginning, what’s BS?

The course syllabus starts out referencing Brandolini's Bullshit Asymmetry Principle (law): the amount of energy needed to refute BS is an order of magnitude bigger than the amount needed to produce it.

Then it goes into a rather lengthy definition of BS from Harry Frankfurt's 1986 On Bullshit article. In sum, it starts out reviewing a previous author's discussion of humbug and ends up at the OED. Suffice it to say Frankfurt's description of BS runs the gamut from deceptive misrepresentation to just short of lying.

The course syllabus goes on to reference two lengthy discussions/comments on Frankfurt's seminal On Bullshit article, but both Cohen's response, Deeper into BS, and Eubank & Schaeffer's A kind word for BS: … are focused more on academic research than on everyday life and news.

How to mathematically test for BS

The course then goes into mathematical tests for BS, ranging from Fermi questions to the GRIM test to Benford's 1936 Law of Anomalous Numbers. These tests are all ways of looking at data and numbers and estimating whether they are bogus (a quick first-digit check is sketched below). Benford's paper talks about how the first pages of logarithm tables were always more worn than the others, because numbers that start with 1 occur more frequently than numbers starting with any other digit.
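
As a minimal sketch (my own illustration, not course material) of using Benford's law to sanity-check a data set, here is a first-digit frequency comparison against the Benford distribution:

```python
import math
from collections import Counter

def first_digit_distribution(values):
    """Fraction of values whose leading non-zero digit is 1..9."""
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

# Benford's law: P(first digit = d) = log10(1 + 1/d)
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Made-up example data; real use would be expense reports, vote counts, etc.
data = [132, 18.4, 1090, 27, 315, 1.7, 44, 160, 2300, 19, 87, 1200, 14, 960, 110]
observed = first_digit_distribution(data)

for d in range(1, 10):
    print(f"digit {d}: observed {observed[d]:.2f}  benford {benford[d]:.2f}")
# Large, systematic deviations on a big enough sample are a hint (not proof)
# that the numbers may have been made up.
```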

How rumors propagate

The next section of the course (week 4) talks about the natural ecology of BS.

Here there's a reference to an article by Friggeri et al. on Rumor Cascades, which discusses the frequency with which true, false and partially true/partially false rumors are "shared" on social media (Facebook).

The researchers use Snopes.com, a website that evaluates the veracity of published rumors, to classify the rumors. Next they examine how these rumors are shared over time on Facebook.

Summarizing their research, both true and false rumors propagate sporadically on Facebook. But even rumors verified as false or mixed true/false (as identified by Snopes.com) continue to propagate on Facebook. This seems to indicate that rumor sharers are ignoring the rumor's truthfulness or are just unaware of the Snopes.com assessment of the rumor.

Other topics on calling BS

The course syllabus goes on to cover causality (correlation is not causation, a common misconception used in BS), statistical traps and trickery (used to create BS), data visualization (which can be used to hide BS), big data (GIGO, garbage in garbage out, leads to BS), publication bias (e.g., most published research presents positive results; where's all the negative-results research…), predatory publishing and scientific misconduct (organizations that work to create BS for others), the ethics of calling BS (the line between criticism and harassment), fake news and refuting BS.

Fake news

The section on fake news is very interesting. They reference an article in the NYT, The Agency, about how a group in Russia has been wreaking havoc across the internet with fake news and bogus news sites.

But there's more: another article on the NYT website, Inside a fake news sausage factory, details how multiple websites started publishing bogus news and then used advertising revenue to tell them which bogus stories generated the most income. Apparently there's money to be made in advertising fake news. (Sigh, that probably explains why I can't seem to get any sponsors for my websites…).

Improving the course

How could they improve the course? I'd certainly take a look at what Facebook and others are doing to identify BS/fake news and see whether those efforts are working effectively.

Another area to add might be a historical review of fake rumors, news or information. This is not a new phenomenon. It’s been going on since time began.

In addition, there's little discussion of the consequences of BS on life, politics, war, etc. The world has been irrevocably changed in the past on account of false information. Knowing how bad this has been might lend some urgency to studying how to better identify BS.

There's a lot of focus on academia in the course, and although this is no doubt needed, most people need to understand whether the news they see every day is fake or not. Focusing more on this would be worthwhile.

~~~~

I admire the University of Washington professors for putting this course together. It's really something that everyone needs to understand nowadays.

They say the lectures will be recorded and published online, good for them. Also, the current course syllabus is for a one-credit-hour course, but they would like to expand it to a three- or four-credit-hour course, another great idea.

Comments?

Photo credit(s): The Donation of Constantine; New York World, Remember the Maine, Public Domain; Benjamin Franklin's Bag of Scalps letter; fake-news-rides-sociales by Portal GDA

BlockStack, a Bitcoin secured global name space for distributed storage

At the USENIX ATC conference a couple of weeks ago, a number of researchers presented their Blockstack global name space and storage system, which is built on the blockchain-based Bitcoin network. Their paper was titled "Blockstack: A global naming and storage system secured by blockchain" (see pp. 181-194 in the USENIX ATC'16 proceedings).

Bitcoin blockchain simplified

Blockchains like Bitcoin have a number of interesting properties, including a completely distributed understanding of current state based on hashing and an append-only log of transactions.

Blockchain nodes all participate in validating the current block of transactions and some nodes (deemed “miners” in Bitcoin) supply new blocks of transactions for validation.

All blockchain transactions are sent to every node; the blockchain software in each node timestamps the transactions and accumulates them in an ordered append log (the "block"), which is then hashed, and each new block contains a hash of the previous block (the "chain" in blockchain).

The miner's block is then compared against the non-miner nodes' blocks (their hashes are compared) and, if they are equal, everyone reaches consensus (agrees) that the transaction block is valid. Then the next miner supplies a new block of transactions, and the process repeats. (See Wikipedia's article for more info.)

All blockchain transactions are owned by a cryptographic address. Each cryptographic address has a public and private key associated with it.
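
As a rough sketch of the address idea (a deliberate simplification, not Bitcoin's actual Base58Check scheme, and assuming the Python cryptography package for the key pair), an address can be derived by hashing a public key, and only the holder of the matching private key can sign transactions for that address:

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Key pair: the private key stays secret, the public key is shared.
private_key = Ed25519PrivateKey.generate()
public_bytes = private_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

# Simplified "address": a truncated hash of the public key (Bitcoin also uses
# RIPEMD-160 and Base58Check encoding; those details are omitted here).
address = hashlib.sha256(public_bytes).hexdigest()[:40]
print("address:", address)

# Ownership: only the private key holder can produce a signature that
# verifies against the public key behind this address.
tx = b"transfer name 'example.id' to a new owner"    # hypothetical transaction
signature = private_key.sign(tx)
private_key.public_key().verify(signature, tx)        # raises if forged/tampered
print("transaction signed by the address owner")
```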

Surprises from 4 years of SSD experience at Google

Flash field experience at Google 

In a FAST'16 article I recently read (Flash reliability in production: the expected and the unexpected, see p. 67), researchers at Google reported on field experience with flash drives in their data centers, totaling many millions of drive days and covering MLC, eMLC and SLC drives with a minimum of 4 years of production use (3 years for eMLC). In some cases they had two generations of the same drive in their field population. SSD reliability in the field is not what I would have expected, and it was a surprise to Google as well.

The SSDs seem to be used in a number of different application areas, but mainly as SSDs with a custom-designed PCIe interface (Fusion-io drives, maybe?). Aside from the technology changes, there were some lithography changes as well: from 50 to 34nm for SLC, from 50 to 43nm for MLC and from 32 to 25nm for eMLC NAND technology.

SCI's (Storage QoW 15-001) 3D XPoint in next year's storage, forecast=NO with 0.62 probability

So, as to my forecast for the first question of the week (#Storage-QoW 2015-001): Will 3D XPoint be GA'd in enterprise storage systems within 12 months?

I believe the answer will be Yes with 0.38 probability or, conversely, No with 0.62 probability.

We need to decompose the question to come up with a reasonable answer.

1. How much of an advantage will 3D XPoint provide storage systems?

The claim is 1000X the speed of NAND, 1000X the endurance of NAND and 10X the density of DRAM. But I believe the relative advantage of the new technology depends mostly on its price. So now the question is what 3D XPoint technology would cost ($/GB).

It's probably going to be way more expensive than NAND on a $/GB basis (@$2.44/64Gb MLC, or ~$0.31/GB). But how will it be priced relative to DRAM (@$2.23/4Gb DDR4, or ~$4.46/GB) and (asynch) SRAM (@$7.80/16Mb, or ~$3900.00/GB)?

More than likely, it's going to cost more than DRAM because it's non-volatile and almost as fast to access. As for how it relates to SRAM, the pricing gulf between DRAM and asynch SRAM is so huge that I think pricing it even at 1/10th SRAM costs would seriously reduce the market. And I don't think it's going to be too close to DRAM, so maybe ~10X the cost of DRAM, or $44.60/GB. [Probably more like a range of prices, with $44.60 at 0.5 probable, $22.30 at 0.25 and $66.90 at 0.1. Unclear how I would incorporate such pricing variability into a forecast.]

At $44.60/GB, what could 3D XPoint NVM replace in a storage system? 1) Non-volatile cache; 2) DRAM cache; 3) flash cache; 4) PCIe flash storage; or 5) SSD storage in storage control units.

Non-volatile caching uses battery-backed DRAM (with or without SSD offload) or SuperCap-backed DRAM with SSD offload. Non-volatile caches can be anywhere from 1/16 to 1/2 of total system cache size. The average enterprise-class storage system has ~412GB of cache, so a non-volatile cache could be anywhere from 26 to 206GB, or let's say ~150GB of 3D XPoint, which at ~$45/GB would cost $6.8K in chips alone; add in $1K of circuitry and it's $7.8K.

  • For battery-backed DRAM – 150GB of DRAM would cost ~$670 in chips, plus an SSD (~300GB) at ~$90, and 2 batteries (an 8hr lithium battery costs $32), so $64. Add charging/discharging circuitry, battery FRU enclosures and (probably missing something else) maybe all the extras come to another $500, or ~$1.3K total. So at $45/GB, the 3D XPoint non-volatile cache would run ~6.0X the cost of battery-backed DRAM.
  • For SuperCap-backed DRAM – similarly, a SuperCap cache would have the same DRAM and SSD costs ($670 & $90, respectively). The costs for SuperCaps in equivalent (Wh) configurations run 20X the price of batteries, so $1.3K. Charging/discharging circuitry and FRU enclosures would be simpler than for batteries, maybe half as much, so add $250 for all the extras, which means a total SuperCap-backed DRAM cost of ~$2.3K. That puts 3D XPoint at 3.4X the cost of SuperCap-backed DRAM.

In these configurations a 3D XPoint non-volatile memory would replace lots of circuitry (battery charging/discharging and other circuitry, or SuperCap charging/discharging and other circuitry) and the SSD. So 3D XPoint non-volatile cache could drastically simplify hardware logic and also the software coding for power outages/failures. Fewer parts and less coding have some intrinsic value beyond pure cost, difficult to quantify but substantive nonetheless.

As for using 3D XPoint to replace the volatile DRAM cache, another advantage is that you wouldn't need a separate non-volatile cache and systems wouldn't have to copy data between caches. But at $45/GB, costs would be significant. A 412GB DRAM cache would cost $1.8K in DRAM chips and maybe another $1K in circuitry, so ~$2.8K. Doing one in 3D XPoint would run $18K in chips and the same $1K in circuitry, so $19K. But we eliminate the non-volatile cache. Factoring that in, the all-3D XPoint cache would run ~$19K vs. $2.8K + $2.3K = $5.1K for a DRAM volatile cache plus a (SuperCap-backed) non-volatile cache, or ~3.7X higher cost.
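
To make the arithmetic above easy to check (and to re-run with different price assumptions), here is a small sketch using the same rounded estimates as the text; the ratios differ slightly from the quoted 6.0X/3.7X because the text rounds intermediate values.

```python
# Rough cost model for the comparisons above; all figures are the rounded
# estimates used in the text, not vendor pricing.
XPOINT, DRAM = 45.0, 4.46          # assumed $/GB

# Non-volatile cache (~150GB) built three ways: 3D XPoint, battery-backed
# DRAM, and SuperCap-backed DRAM (SuperCaps ~20X the battery cost).
nv_xpoint   = 150 * XPOINT + 1000                    # chips + circuitry
nv_battery  = 150 * DRAM + 90 + 2 * 32 + 500         # DRAM + SSD + batteries + extras
nv_supercap = 150 * DRAM + 90 + 20 * (2 * 32) + 250  # DRAM + SSD + SuperCaps + extras

print(f"NV cache: XPoint ${nv_xpoint:,.0f}, battery ${nv_battery:,.0f}, "
      f"SuperCap ${nv_supercap:,.0f}")
print(f"XPoint vs battery {nv_xpoint / nv_battery:.1f}X, "
      f"vs SuperCap {nv_xpoint / nv_supercap:.1f}X")

# Replacing the whole 412GB volatile cache (and dropping the NV cache entirely)
all_xpoint   = 412 * XPOINT + 1000
dram_plus_nv = (412 * DRAM + 1000) + nv_supercap
print(f"all-XPoint cache vs DRAM + SuperCap NV cache: "
      f"{all_xpoint / dram_plus_nv:.1f}X")
```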

Again, the parts cost differential is not the whole story. But replacing volatile cache AND non-volatile cache would probably require more coding not less.

As for using 3D XPoint as a replacement for flash cache, I don't think it's likely, because the cost differential at $45/GB is ~100X flash costs (not counting the PCIe controller and other logic). Ditto for PCIe flash and SSD storage.

Being 10X denser than DRAM is great, but board footprint is not a significant storage system cost factor today.

So at a $45/GB price maybe there’s a 0.35 likelihood that storage systems would adopt the technology.

2. How many vendors are likely to GA new enterprise storage hardware in the next 12 months?

We can use major vendors to help estimate this. I used IBM, EMC, HDS, HP and NetApp as representing the major vendors for this analysis.

IBM (2 for 4) 

  • They just released a new DS8880 last fall and their prior version, the DS8870, came out in Oct 2013, so the DS8K seems to be on a 24-month development cycle. So, it's very unlikely we will see a new DS8K released in the next 12 months.
  • SVC engine hardware DH8 was introduced in May 2014. The SVC CG8 engine was introduced in May 2011. So SVC hardware seems to be on a 36-month cycle. So, it's very unlikely a new SVC hardware engine will be released in the next 12 months.
  • FlashSystem 900 hardware was just rolled out in 1Q 2015 and FlashSystem 840 was introduced in January 2014. So FlashSystem hardware is on a ~15-month hardware cycle. So, it is very likely that new FlashSystem hardware will be released in the next 12 months.
  • XIV Gen 3 hardware was introduced in July of 2011. It's unclear when Gen 2 was rolled out, but IBM acquired XIV in Jan 2008 and released an IBM version in August 2008. So XIV is on a ~36-month cycle. So, it is very likely that a new generation of XIV will be released in the next 12 months.

EMC (3 for 4)

  • VMAX3 was GA'd in 3Q (Sep) 2014. VMAX2 was available Sep 2012, which puts VMAX on a 24-month cycle. So, it's very likely that a new VMAX will be released in the next 12 months.
  • VNX2 was announced May 2013 and GA'd Sep 2013. VNX1 was announced Jan 2011 and GA'd by May 2011. So that puts VNX on a ~28-month cycle, which means we should have already seen a new one, so it's very likely we will see a new version of VNX in the next 12 months.
  • XtremIO hardware was introduced in Mar 2013, with no significant hardware changes since. With a lack of history to guide us, let's assume a 24-month cycle. So, it's very likely we will see a new version of XtremIO hardware in the next 12 months.
  • Isilon S200/X200 was introduced April 2011 and X400 was released in May 2012. That put Isilon on a 13-month cycle then, but nothing has come since. So, it's very likely we will see a new version of Isilon hardware in the next 12 months.

However, EMC is unlikely to update all their storage hardware in the same 12 months. That being said, XtremIO could use a HW boost, as IBM and the startups are pushing AFA technology pretty hard here. Isilon is getting long in the tooth, so that's another likely changeover. Since VNX is more overdue than VMAX, I'd have to say it's likely new VNX, XtremIO and Isilon hardware will be seen over the next year.

HDS (1 of 3) 

  • Hitachi VSP G1000 came out in Apr of 2014. HDS VSP came out in Sep of 2010. So HDS VSP is on a 43 month cycle. So it’s very unlikely we will see a new VSP in 12 months. 
  • Hitachi HUS VM came out in Sep 2012. As far as I can tell there were no prior-generation systems. But HDS just came out with the G200-G800 series, leaving the HUS VM as the last one not updated, so it's very likely we will see a new version of HUS VM in the next 12 months.
  • Hitachi VSP G800, G600, G400, G200 series came out in Nov of 2015. The Hitachi AMS 2500 series came out in April 2012. So the mid-range systems seem to be on a 43-month cycle. So it's very unlikely we will see a new version of the HDS G200-G800 series in the next 12 months.

HP (1 of 2) 

  • HP 3PAR 20000 was introduced August, 2015 and the previous generation system, 3PAR 10000 was introduced in June, 2012. This puts the 3PAR on a 38 month cycle. So it’s very unlikely we will see a new version of 3PAR in the next 12 months. 
  • MSA 1040 was introduced in Mar 2014. MSA 2040 was introduced in May 2013. This puts the MSA on a ~10-month cycle. So it's very likely we will see a new version of MSA in the next 12 months.

NetApp (2 of 2)

  • FAS8080 EX was introduced June 2014. FAS6200 was introduced in Feb 2013. That puts the high-end FAS systems on a 16-month cycle. So it's very likely we will see a new version of the high-end FAS in the next 12 months.
  • NetApp FAS8040-8060 series scale-out systems were introduced in Feb 2014. The FAS3200 series was introduced in Nov of 2012. That puts these FAS systems on a 15-month cycle. A new midrange release seems overdue, so it's very likely we will see a new version of the mid-range FAS in the next 12 months.

Overall, the likelihood of new hardware being released by the major vendors is (2+3+1+1+2) = 9/15, or a ~0.60 probability of new hardware in the next 12 months.

Applying 0.60 to the non-major storage vendors that typically have only one storage system GA'd at a time (Coho Data, DataCore, Data Gravity, Dell, DDN, Fujitsu, Infinidat, NEC, Nexenta, NexGen Storage, Nimble, Pure, Qumulo, Quantum, SolidFire, Tegile, Tintri, Violin Memory, X-IO, and probably a couple more I'm missing), of these ~21 non-major/startup vendors we are likely to see ~13 new (non-major) hardware systems in the next 12 months.

Some of these non-major systems are based on standard off-the-shelf Intel server hardware and some vendors (Infinidat, Violin Memory & X-IO) have their own custom-designed hardware. Of the 9 major vendor products identified above, six (IBM XIV, EMC VNX, EMC Isilon, EMC XtremIO, HP MSA and NetApp mid-range) use off-the-shelf server hardware.

So, all told, my best guess is we should see (9+13=) 22 new enterprise storage systems introduced in the next 12 months from major and non-major storage vendors.

3. How likely is it that Intel-Micron will come out with GA chip products in the next 6 months?

They claimed they were sampling product to vendors back at Flash Summit in August 2015. So it's very likely (0.85 probability) that Intel-Micron will produce 3D XPoint chips in the next 6 months.

Some systems (IBM FlashSystem, NetApp high-end FAS and HDS HUS VM) could make use of raw chips or even a new level of storage connected to a memory bus. But all of them could easily take advantage of a 3D XPoint device packaged as NVMe PCIe-connected storage.

But to be usable in most vendor storage systems being GA'd over the next year, any new chip technology has to be available for use within 6 months at the latest.

4. How likely is it that Intel-Micron will produce servers with 3D XPoint in the next 6 months?

Listening in at Flash Summit, this seems to be their preferred technological approach to market. And as most storage vendors use standard Intel servers, this would seem to be the easiest way to adopt it. If the chips are available, I deem it 0.65 probable that Intel will GA server hardware with 3D XPoint technology in the next 6 months.

I'm not sure any of the major or non-major vendors above could possibly use server hardware introduced later than 6 months from now, but Qumulo uses Agile development and releases GA code every 2 weeks, so they could take this on later than most.

But given the chip pricing, the lack of significant advantage and the coding update requirements, I deem it 0.33 probable that vendors will adopt the technology even if it's in a new server they can use.

Summary

So there's a 0.85 probability of chips being available within 6 months for the 3 potential major systems identified above, which leaves us with ~2.6 systems using 3D XPoint chip technology directly.

With a 0.65 probability of servers using 3D XPoint coming out within 6 months and a 0.45 probability of new storage systems adopting the technology for caching, that gives a 0.29 probability; with 18 new systems coming out, that says ~5.2 systems could potentially adopt the server technology.

That's a total of 7.8 systems out of a potential 22 new systems, or a 0.35 probability.

That's just the known GA non-majors and storage startups; what about the stealth(ier) startups without GA storage, like Primary Data? There are probably 2 or 3 non-GA storage startups. If we assume the same 0.6 of these vendors will have GA hardware next year, that's an additional 1.8 systems. More than likely these will depend on standard servers, so the 0.65 probability of Intel servers applies. So it's likely we will see an additional ~1.2 systems here, for a total of 9.0 new systems that will adopt 3D XPoint tech in the next 12 months.

So it's 9 systems out of 23.8, or ~0.38 probable. So my forecast is Yes at 0.38 probable.
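
A quick sketch pulling the summary arithmetic together (same rounded inputs as the text, including the text's choice not to apply the 0.45 adoption factor to the stealth startups):

```python
# Expected number of new enterprise storage systems adopting 3D XPoint
# in the next 12 months, using the rounded estimates from the text.
p_chips   = 0.85        # chips GA'd in time
p_servers = 0.65        # Intel servers with 3D XPoint in time
p_adopt   = 0.45        # a new system adopts the technology for caching

direct_chip_systems = 3          # FlashSystem, high-end FAS, HUS VM
other_new_systems   = 18         # remaining new systems expected
stealth_startups    = 3 * 0.60   # 2-3 stealth startups * 0.6 refresh likelihood

expected = (p_chips * direct_chip_systems
            + p_servers * p_adopt * other_new_systems
            + p_servers * stealth_startups)

total_new_systems = 22 + stealth_startups
print(f"expected adopters ≈ {expected:.1f} of {total_new_systems:.1f} new systems"
      f" -> probability ≈ {expected / total_new_systems:.2f}")
```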

Pricing is a key factor here. I assumed a single price, but a range of prices is more likely; factoring in a pricing range would be more accurate, but I don't know how to do that yet.

~~~~

I could go on for another 1000 words and still be no closer to an estimate. Somebody please check my math.

Comments?

Photo Credit(s): (iTech Androidi) 3D XPoint – Intel’s new Storage chip is 1000 faster than flash memory

Million year optical disk

Read an article the other day about scientists creating an optical disk that would be readable in a million years or so. The article in Science Mag, titled A million-year hard disk, was about warning people in the far future about potential dangers being created today.

A while back I wrote about a 1000-year archive, which was predominantly about disappearing formats. At the time I believed that, given the growth in data density, information could easily be copied and saved over time, but the formats for that data would be long gone by the time someone tried to read it.

The million-year optical disk eliminates the format problem by using pixelated images etched on media, which works just dandy if you happen to have a microscope handy.

Why would you need a million year disk

The problem is: how do you warn people in the far future not to mess with the radioactive waste deposits buried below? If the waste is radioactive for a million years, you need something around to tell people to keep away from it.

Stone markers last for a few thousand years at best but get overgrown and wear down in time. For instance, my grandmother’s tombstone in Northern Italy has already been worn down so much that it’s almost unreadable. And that’s not even 80 yrs old yet.

But a sapphire hard disk that could easily be read with any serviceable microscope might do the job.

How to create a million year disk

This new disk is similar to the old StorageTek 100K year optical tape. Both would depend on microscopic impressions, something like bits physically marked on media.

For the optical disk the bits are created by etching a sapphire platter with platinum. Apparently the prototype costs €25K but they’re hoping the prices go down with production.

There are actually two 20cm (7.9in) wide disks that are molecularly fused together, and each disk can store 40K miniaturized pages that can hold text or images. They are doing accelerated life testing on the sapphire disks by bathing them in acid, to ensure a 10M-year life for the media and the message.

Presumably the images are grey tone (or in this case platinum tone). If I assume 100KB per page, that's about 4GB per disk, something around a single-layer DVD in a much larger form factor.
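
For what it's worth, the capacity arithmetic under my 100KB-per-page assumption:

```python
pages_per_disk = 40_000      # miniaturized pages per platter (from the article)
bytes_per_page = 100_000     # assumed average page size (text or grey-tone image)

capacity_gb = pages_per_disk * bytes_per_page / 1e9
print(f"~{capacity_gb:.0f} GB per platter")   # ~4 GB, roughly a single-layer DVD (4.7 GB)
```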

Why sapphire

It appears that sapphire is available from industrial processes, and it seems impervious to the wear that harms other materials. But that's what they are trying to prove.

It's unclear why they decided to "molecularly" fuse two platters together. It seems to me this could easily be a weak link in the technology over the course of a dozen millennia or so. On the other hand, more storage is always a good thing.

~~~~

In the end, creating dangers today that last millions of years requires some serious thought about how to warn future generations.

Image: Clock of the Long Now by Arenamontanus

Top 10 blog posts for 2011

Merry Christmas! Buon Natale! Frohe Weihnachten! by Jakob Montrasio (cc) (from Flickr)

Happy Holidays.

I ranked my blog posts using a ratio of hits to post age and have identified the top 10 most popular posts for 2011 (so far):

  1. vSphere 5 storage enhancements – We discuss some of the more interesting storage-oriented vSphere 5 announcements, which included a new DAS storage appliance, a host-based (software) replication service, storage DRS and other capabilities.
  2. Intel's 320 SSD 8MB problem – We discuss a recent bug (since fixed) which left the Intel 320 SSD drive with only 8MB of storage; we presumed the bug was in the load-leveling/block-mapping logic of the drive controller.
  3. Analog neural simulation or digital neuromorphic computing vs AI – We talk about recent advances to providing both analog (MIT) and digital versions (IBM) of neural computation vs. the more traditional AI approaches to intelligent computing.
  4. Potential data loss using SSD RAID groups – We note the possibility for catastrophic data loss when using equally used SSDs in RAID groups.
  5. How has IBM research changed – We examine some of the changes at IBM Research that have occurred over the past 50 years or so, which have led to much more productive research results.
  6. HDS buys BlueArc – We consider the implications of the recent acquisition of BlueArc storage systems by their major OEM partner, Hitachi Data Systems.
  7. OCZ’s latest Z-Drive R4 series PCIe SSD – Not sure why this got so much traffic but its OCZ’s latest PCIe SSD device with 500K IOPS performance.
  8. Will Hybrid drives conquer enterprise storage – We discuss the unlikely possibility that Hybrid drives (NAND/Flash cache and disk drive in the same device) will be used as backend storage for enterprise storage systems.
  9. SNIA CDMI plugfest for cloud storage and cloud data services – We were invited to sit in on a recent SNIA Cloud Data Management Initiative (CDMI) plugfest and talk to some of the participants about where CDMI is heading and what it means for cloud storage and data services.
  10. Is FC dead?! – What with the introduction of 40GbE FCoE just around the corner, 10GbE cards coming down in price and Brocade’s poor YoY quarterly storage revenue results, we discuss the potential implications on FC infrastructure and its future in the data center.

~~~~

I would have to say #3, 5, and 9 were the most fun for me to do. Not sure why, but #10 probably generated the most twitter traffic. Why the others were so popular is hard for me to understand.

Comments?