Random access, DNA object storage system

Read a couple of articles this week Inching closer to a DNA-based file system in ArsTechnica and DNA storage gets random access in IEEE Spectrum. Both of these seem to be citing an article in Nature, Random access in large-scale DNA storage (paywall).

We’ve known for some time now that we can encode data into DNA strings (see my DNA as storage … and Genomic informatics takes off posts).

However, accessing DNA data has been sequential and reading and writing DNA data has been glacial. Researchers have started to attack the sequentiality of DNA data access. The prize, DNA can store 215PB of data in one gram and DNA data can conceivably last millions of years.

Researchers at Microsoft and the University of Washington have come up with a solution to the sequential access limitation. They have used polymerase chain reaction (PCR) primers as a unique identifier for files. They can construct a complementary PCR primer that can be used to extract just DNA segments that match this primer and amplify (replicate) all DNA sequences matching this primer tag that exist in the cell.

DNA data format

The researchers used a Reed-Solomon (R-S) erasure coding mechanism for data protection and encode the DNA data into many DNA strings, each with multiple (metadata) tags on them. One of tags is the PCR primer tag header, another tag indicates the position of the DNA data segment in the file and an end of data tag that is the same PCR primer tag.

The PCR primer tag was used as sort of a file address. They could configure a complementary PCR tag to match the primer tag of the file they wanted to access and then use the PCR process to replicate (amplify) only those DNA segments that matched the searched for primer tag.

Apparently the researchers chunk file data into a block of 150 base pairs. As there are 2 complementary base pairs, I assume one bit to one base pair mapping. As such, 150 base pairs or bits of data per segment means ~18 bytes of data per segment. Presumably this is to allow for more efficient/effective encoding of data into DNA strings.

DNA strings don’t work well with replicated sequences of base pairs, such as all zeros. So the researchers created a random sequence of 150 base pairs and XOR the file DNA data with this random sequence to determine the actual DNA sequence to use to encode the data. Reading the DNA data back they need to XOR the data segment with the random string again to reconstruct the actual file data segment.

Not clear how PCR replicated DNA segments are isolated and where they are originally decoded (with a read head). But presumably once you have thousands to millions of copies of a DNA segment,  it’s pretty straightforward to decode them.

Once decoded and XORed, they use the R-S erasure coding scheme to ensure that the all the DNA data segments represent the actual data that was encoded in them. They can then use the position of the DNA data segment tag to indicate how to put the file data back together again.

What’s missing?

I am assuming the cellular data storage system has multiple distinct cells of data, which are clustered together into some sort of organism.

Each cell in the cellular data storage system would hold unique file data and could be extracted and a file read out individually from the cell and then the cell could be placed back in the organism. Cells of data could be replicated within an organism or to other organisms.

To be a true storage system, I would think we need to add:

  • DNA data parity – inside each DNA data segment, every eighth base pair would be a parity for the eight preceding base pairs, used to indicate when a particular base pair in eight has mutated.
  • DNA data segment (block) and file checksums –  standard data checksums, used to verify and correct for double and triple base pair (bit) corruption in DNA data segments and in the whole file.
  • Cell directory – used to indicate the unique Cell ID of the cell, a file [name] to PCR primer tag mapping table, a version of DNA file metadata tags, a version of the DNA file XOR string, a DNA file data R-S version/level, the DNA file length or number of DNA data segments, the DNA data creation data time stamp, the DNA last access date-time stamp,and DNA data modification data-time stamp (these last two could be omited)
  • Organism directory – used to indicate unique organism ID, organism metadata version number, organism unique cell count,  unique cell ID to file list mapping, cell ID creation data-time stamp and cell ID replication count.

The problem with an organism cell-ID file list is that this could be quite long. It might be better to somehow indicate a range or list of ranges of PCR primer tags that are in the cell-ID. I can see other alternatives using a segmented organism directory or indirect organism cell to file lists b-tree, which could hold file name lists to cell-ID mapping.

It’s unclear whether DNA data storage should support a multi-level hierarchy, like file system  directories structures or a flat hierarchy like object storage data, which just has buckets of objects data. Considering the cellular structure of DNA data it appears to me more like buckets and the glacial access seems to be more useful to archive systems. So I would lean to a flat hierarchy and an object storage structure.

Is DNA data is WORM or modifiable? Given the effort required to encode and create DNA data segment storage, it would seem it’s more WORM like than modifiable storage.

How will the DNA data storage system persist or be kept alive, if that’s the right word for it. There must be some standard internal cell mechanisms to maintain its existence. Perhaps, the researchers have just inserted file data DNA into a standard cell as sort of junk DNA.

If this were the case, you’d almost want to create a separate, data  nucleus inside a cell, that would just hold file data and wouldn’t interfere with normal cellular operations.

But doesn’t the PCR primer tag approach lend itself better to a  key-value store data base?

Photo Credit(s): Cell structure National Cancer Institute

Prentice Hall textbook

Guide to Open VMS file applications

Unix Inodes CSE410 Washington.edu

Key Value Databases, Wikipedia By ClescopOwn work, CC BY-SA 4.0, Link

Magnonics for configurable electronics

Read an article today in ScienceDaily on [a] New way to write magnetic info … that discusses research done at Imperial College Of London that used a magnetic force microscope (small magnetic probe) to write magnetic fields onto a dense array of nanowires.

Frustrated metamaterials needed

The original research is written up in a Nature article Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing  (paywall). Unclear what that means but the paper abstract discusses geometrically frustrated magnetic metamaterials.  This is where the physical size or geometrical properties of the materials at the nanometer scale restricts or limits the magnetic states that material can exhibit.

Magnetic storage deals with magnetic material but there are a number of unique interactions of magnetic material when in close (nm) proximity to one another and the way nanowire geometrically frustrated magnetic metamaterials can be magnetized to different magnetic moments which can be exploited for other uses.  These interactions and magnetic moments can be combined to provide electronic circuitry and data storage.

I believe the research provides a proof point that such materials can be written, in close proximity to one another using a magnetic force microscope.

Why it’s important

The key is the potential to create  magnonic circuitry based on the pattern of moments writen into an array of nanowires. In doing so, one can fabricate any electrical circuit. It’s almost like photolithography but without fabs, chemicals, or laser scanners.

At first I thought this could be a denser storage device, but the potential is much greater if electronic circuitry could be constructed without having to fabricate semiconductors. It would seem ideal for testing out circuitry before manufacturing. And ultimately if it could be scaled up, the manufacture/fabrication of electronic circuitry itself could be done using these techniques.

Speed, endurance, write limits?

There was no information in the public article about the speed of writing the “frustrated magnetic metamaterials”. But an atomic force microscope can scan 150×150 micrometers in several minutes. If we assume that a typical chip size today is 150×150 mm, then this would take 1E6 times several minutes, or ~2K days. With multiple scanning force microscopes operating concurrently we could cut this down by a factor of 10 or 100 and maybe someday 1000. 2 days to write any electronic circuit on the order of todays 23nm devices with nanowires and magnetic force microscopes would be a significant advance

Also there was no mention of endurance, write limits or other characteristics we have learned to love with Flash storage. But the assumption is that it can be written multiple times and that the pattern stays around for some amount of time.

How magnetics generate electronic circuits

Neither Wikipedia page, the public article or the paywall articles’ abstract describes how Magnonics can supply electronic circuitry. However both the abstract and the public article discuss applications for this new technology in hardware based neural networks using arrays of densely packed nanowires.

Presumably, by writing different magnetic patterns in these nanowire metamaterials, such patterns can be used to simulate hardware connected neurons. This means that the magnetic information can be overwritten because it can be trained. Also, such magnetic circuits can be constructed to: a) can create different path for electrons to flow through the material; b) can restrict or enhance this electronic flow, and c) can integrate across a number of inputs and determine how electronic flow will proceed from a simulated neuron.

If magnonics can do all that,  it’s very similar to electronic gates today in CPU, GPUs and other electronic circuitry. Maybe it cannot simulate every gate or electronic device that’s found in todays CPUs but it’s a step in the right direction. And magnonics is relatively new. Silicon transistors are over 70 years old and the integrated circuit is almost 60 years old. So in time, magnonics could very well become the next generation of chip technology.

Writing speed is a problem. Maybe if they spun the nanowire array around the magnetic force microscope…

Comments?

Photo Credits:  Real space observation of emergent magnetic monopoles … Nature article

Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing, Nature article

 

Disk rulz, at least for now

Last week WDC announced their next generation technology for hard drives, MAMR or Microwave Assisted Magnetic Recording. This is in contrast to HAMR, Heat (laser) Assisted Magnetic Recording. Both techniques add energy so that data can be written as smaller bits on a track.

Disk density drivers

Current hard drive technology uses PMR or Perpendicular Magnetic Recording with or without SMR (Shingled Magnetic Recording) and TDMR (Two Dimensional Magnetic Recording), both of which we have discussed before in prior posts.

The problem with PMR-SMR-TDMR is that the max achievable disk density is starting to flat line and approaching the “WriteAbility limit” of the head-media combination.

That is even with TDMR, SMR and PMR heads, the highest density that can be achieved is ~1.1Tb/sq.in. The Writeability limit for the current PMR head-media technology is ~1.4Tb/sq.in. As a result most disk density increases over the past years has been accomplished by adding platters-heads to hard drives.

MAMR and HAMR both seem able to get disk drives to >4.0Tb/sq.in. densities by adding energy to the magnetic recording process, which allows the drive to record more data in the same (grain) area.

There are two factors which drive disk drive density (Tb/sq.in.): Bits per inch (BPI) and Tracks per inch (TPI). Both SMR and TDMR were techniques to add more TPI.

I believe MAMR and HAMR increase BPI beyond whats available today by writing data on smaller magnetic grain sizes (pitch in chart) and thus more bits in the same area. At 7nm grain sizes or below PMR becomes unstable, but HAMR and MAMR can record on grain sizes of 4.5nm which would equate to >4.5Tb/sq.in.

HAMR hurdles

It turns out that HAMR as it uses heat to add energy, heats the media drives to much higher temperatures than what’s normal for a disk drive, something like 400C-700C.  Normal operating temperatures for disk drives is  ~50C.  HAMR heat levels will play havoc with drive reliability. The view from WDC is that HAMR has 100X worse reliability than MAMR.

In order to generate that much heat, HAMR needs a laser to expose the area to be written. Of course the laser has to be in the head to be effective. Having to add a laser and optics will increase the cost of the head, increase the steps to manufacture the head, and require new suppliers/sourcing organizations to supply the componentry.

HAMR also requires a different media substrate. Unclear why, but HAMR seems to require a glass substrate, the magnetic media (many layers) is  deposited ontop of the glass substrate. This requires a new media manufacturing line, probably new suppliers and getting glass to disk drive (flatness-bumpiness, rotational integrity, vibrational integrity) specifications will take time.

Probably more than a half dozen more issues with having laser light inside a hard disk drive but suffice it to say that HAMR was going to be a very difficult transition to perform right and continue to provide today’s drive reliability levels.

MAMR merits

MAMR uses microwaves to add energy to the spot being recorded. The microwaves are generated by a Spin Torque Oscilator, (STO), which is a solid state device, compatible with CMOS fabrication techniques. This means that the MAMR head assembly (PMR & STO) can be fabricated on current head lines and within current head mechanisms.

MAMR doesn’t add heat to the recording area, it uses microwaves to add energy. As such, there’s no temperature change in MAMR recording which means the reliability of MAMR disk drives should be about the same as todays disk drives.

MAMR uses todays aluminum substrates. So, current media manufacturing lines and suppliers can be used and media specifications shouldn’t have to change much (?) to support MAMR.

MAMR has just about the same max recording density as HAMR, so there’s no other benefit to going to HAMR, if MAMR works as expected.

WDC’s technology timeline

WDC says they will have sample MAMR drives out next year and production drives out in 2019. They also predict an enterprise 40TB MAMR drive by 2025. They have high confidence in this schedule because MAMR’s compatabilitiy with  current drive media and head manufacturing processes.

WDC discussed their IP position on HAMR and MAMR. They have 400+ issued HAMR patents with another 100+ pending and 75 issued MAMR patents with 46 more pending. Quantity doesn’t necessarily equate to quality, but their current IP position on both MAMR and HAMR looks solid.

WDC believes that by 2020, ~90% of enterprise data will be stored on hard drives. However, this is predicated on achieving a continuing, 10X cost differential between disk drives and (QLC 3D) flash.

What comes after MAMR is subject of much speculation. I’ve written on one alternative which uses liquid Nitrogen temperatures with molecular magnets, I called CAMR (cold assisted magnetic recording) but it’s way to early to tell.

And we have yet to hear from the other big disk drive leader, Seagate. It will be interesting to hear whether they follow WDC’s lead to MAMR, stick with HAMR, or go off in a different direction.

Comments?

 

Photo Credit(s): WDC presentation

Research reveals ~liquid nitrogen temperature molecular magnets with 100X denser storage


Must be on a materials science binge these days. I read another article this week in Phys.org on “Major leap towards data storage at the molecular level” reporting on a Nature article “Molecular magnetic hysteresis at 60K“, where researchers from University of Manchester, led by Dr David Mills and Dr Nicholas Chilton from the School of Chemistry, have come up with a new material that provides molecular level magnetics at almost liquid nitrogen temperatures.

Previously, molecular magnets only operated at from 4 to 14K (degrees Kelvin) from research done over the last 25 years or so, but this new  research shows similar effects operating at ~60K or close to liquid nitrogen temperatures. Nitrogen freezes at 63K and boils at ~77K, and I would guess, is liquid somewhere between those temperatures.

What new material

The new material, “hexa-tert-butyldysprosocenium complex—[Dy(Cpttt)2][B(C6F5)4], with Cpttt = {C5H2tBu3-1,2,4} and tBu = C(CH3)3“, dysprosocenium for short was designed (?) by the researchers at Manchester and was shown to exhibit magnetism at the molecular level at 60K.

The storage effect is hysteresis, which is a materials ability to remember the last (magnetic/electrical/?) field it was exposed to and the magnetic field is measured in oersteds.

The researchers claim the new material provides magnetic hysteresis at a sweep level of 22 oersteds. Not sure what “sweep level of 22 oersteds” means but I assume a molecule of the material is magnetized with a field strength of 22 oersteds and retains this magnetic field over time.

Reports of disk’s death, have been greatly exaggerated

While there seems to be no end in sight for the densities of flash storage these days with 3D NAND (see my 3D NAND, how high can it go post or listen to our GBoS FMS2017 wrap-up with Jim Handy podcast), the disk industry lives on.

Disk industry researchers have been investigating HAMR, ([laser] heat assisted magnetic recording, see my Disk density hits new record … post) for some time now to increase disk storage density. But to my knowledge HAMR has not come out in any generally available disk device on the market yet. HAMR was supposed to provide the next big increase in disk storage densities.

Maybe they should be looking at CAMMR, or cold assisted magnetic molecular recording (heard it here, 1st).

According to Dr Chilton using the new material at 60K in a disk device would increase capacity by 100X. Western Digital just announced a 20TB MyBook Duo disk system for desktop storage and backup. With this new material, at 100X current densities, we could have 2PB Mybook Duo storage system on your desktop.

That should keep my ever increasing video-photo-music library in fine shape and everything else backed up for a little while longer.

Comments?

Photo Credit(s): Molecular magnetic hysteresis at 60K, Nature article

 

Intel’s Optane (3D Xpoint) SSD specs in the wild

Read an article the other day in Ars Technica (Specs for 1st Intel 3DX SSD…) about a preview of the Intel Octane specs for their 375GB 3D Xpoint (3DX) flash card. The device is NVMe compliant, PCIe Gen3 add in card, that’s in a half height, half length, low profile form factor.

Intel’s Optane SSD vs. the competition

A couple of items from the Intel Optane spec sheet of interest to me as a storage guru:

  • 30 Drive writes per day/12.3 PBW (written) – 3DX, at launch, had advertised that it would have 1000 times the endurance of (2D-MLC?) NAND. Current flash cards (see Samsung SSD PRO NVMe 256GB Flash card specs) offer about 200TBW (for 256GB card) or 400TBW (for 512GB card). The Samsung PRO is based on 3D (V-)NAND, so its endurance is much better than  2D-MLC at these densities. That being said, the Octane drive is still ~40X the write endurance of the PRO 950. Not quite 1000 but certainly significantly better.
  • Sequential (bandwidth) performance (R/W) of 2400/2000 MB/sec – 3DX advertised 1000 times the performance of (2D-MLC,  non-NVMe?) NAND. Current 3D (V-)NAND cards (see Samsung SSD PRO above) above offers (R/W) 2200/900 MB/sec for an NVMe device. The Optane’s read bandwidth is a slight improvement but the write bandwidth is a 2.2X improvement over current competitive devices.
  • Random 4KB IOPs performance (R/W) of 550K/500K – Similar to the previous bulleted item, 3DX advertised 1000 times the performance of (2D-MLC,  non-NVMe?) NAND. Current 3D (V-)NAND cards like the Samsung SSD PRO offer Random 4KB IOPs performance  (R/W) of 270K/85K IOPS (@4 threads). Optane’s read random 4KB IOPs performance is 2X the PRO 950 but its write performance is ~5.9X better.
  • IO latency of <10 µsec. – 3DX advertised 10X better latency than the current (2D-MLC, non-NVMe) flash drives. According to storage review (Samsung 950 Pro M.2), the Samsung PRO 950 had a latency of ~22 µsec. Optane has at least 2X better latency than the current competition.
  • Density 375GB/HH-HL-LP – 3DX advertised 1000X the density of (then current DRAM). Today Micron offers a 4GiB DDR4/288 pin DIMM which is probably 1/2 the size of the HH flash drive. So maybe in the same space this could be 8GiB. This says that the Optane is about 100X denser than today’s DRAM.

Please note, when 3DX was launched, ~2 years ago, the then current NAND technology was 2D-MLC and NVMe was just a dream. So comparing launch claims against today’s current 3D-NAND, NVMe drives is not a fair comparison.

Nevertheless, the Optane SSD performs considerably better than current competitive NVMe drives and has significantly better endurance than current 3D (V-)NAND flash drives. All of which is a great step in the right direction.

What about DRAM replacement?

At launch, 3DX was also touted as a higher density, potential replacement for DRAM. But so far we haven’t seen any specs for what 3DX NVM looks like on a memory bus. It has much better density than DRAM, but we would need to see 3DX memory access times under 50ns to have a future as a DRAM replacement. Optane’s NVMe SSD at 10 µsec. is about 200X too slow, but then again it’s not a memory device configuration nor is it attached to a memory bus.

Comments?

Photo Credit(s):  Intel Optane Spec sheet from Ars Technica Article,  DDR4 DRAM from Wikimedia user:Dsimic

5D storage for humanity’s archive

5D data storage.jpg_SIA_JPG_fit_to_width_INLINEA group of researchers at the University of Southhampton in the UK have  invented a new type of optical recording, based on femto-second laser pulses and silica/quartz media that can store up to 300TB per (1″ diameter) disc platter with thermal stability at up to 1000°C or a media life of up to 13.8B years at room temperature (190°C?). The claim is that the memory device could outlive humanity and maybe the universe.

The new media/recording technique was used recently to create copies of text files (Holy Bible, pictured above). Other significant humanitarian, political and scientific treatise have also been stored on the new media. The new device has been nicknamed “Superman Memory Crystal”, due to the memory glass (quartz) likeness to Superman’s memory crystals.

We have written before on long term archives(See Super Long Term Archive and Today’s data and the 1000 year archive posts) but this one beats them all by many orders of magnitude.
Continue reading “5D storage for humanity’s archive”

(Storage QoW 15-003): SMR disks in GA enterprise storage in 12 months? Yes@.85 probability

Hard Disk by Jeff Kubina (cc) (from Flickr)
Hard Disk by Jeff Kubina (cc) (from Flickr)

(Storage QoW 15-003): Will we see SMR (shingled magnetic recording) disks in GA enterprise storage systems over the next 12 months?

Are there two vendors of SMR?

Yes, both Seagate and HGST have announced and currently shipping (?) SMR drives, HGST has a 10TB drive and Seagate has an 8TB drive on the market since last summer.

One other interesting fact is that SMR will be the common format for all future disk head technologies including HAMR, MAMR, & BPMR (see presentation).

What would storage vendors have to do to support SMR drives?

Because of the nature of SMR disks, writes overlap other tracks so they must be written, at least in part, sequentially (see our original post on Sequential only disks). Another post I did reported on recent work by Garth Gibson at CMU (Shingled Magnetic Recording disks) which showed how multiple bands or zones on an SMR disk could be used some of which could be written randomly and others which could be written sequentially but all could be read randomly. With such an approach you could have a reasonable file system on an SMR device with a metadata partition (randomly writeable) and a data partition (sequentially writeable).

In order to support SMR devices, changes have been requested for the T10 SCSI  & T13 ATA command protocols. Such changes would include:

  • SMR devices support a new write cursor for each SMR sequential band.
  • SMR devices support sequential writes within SMR sequential bands at the write cursor.
  • SMR band write cursors can be read, statused and reset to 0. SMR sequential band LBA writes only occur at the band cursor and for each LBA written, the SMR device increments the band cursor by one.
  • SMR devices can report their band map layout.

The presentation refers to multiple approaches to SMR support or SMR drive modes:

  • Restricted SMR devices – where the device will not accept any random writes, all writes occur at a band cursor, random writes are rejected by the device. But performance would be predictable. 
  • Host Aware SMR devices – where the host using the SMR devices is aware of SMR characteristics and actively manages the device using write cursors and band maps to write the most data to the device. However, the device will accept random writes and will perform them for the host. This will result in sub-optimal and non-predictable drive performance.
  • Drive managed SMR devices – where the SMR devices acts like a randomly accessed disk device but maps random writes to sequential writes internally using virtualization of the drive LBA map, not unlike SSDs do today. These devices would be backward compatible to todays disk devices, but drive performance would be bad and non-predictable.

Unclear which of these drive modes are currently shipping, but I believe Restricted SMR device modes are already available and drive manufacturers would be working on Host Aware and Drive managed to help adoption.

So assuming Restricted SMR device mode availability and prototypes of T10/T13 changes are available, then there are significant but known changes for enterprise storage systems to support SMR devices.

Nevertheless, a number of hybrid storage systems already implement Log Structured File (LSF) systems on their backends, which mostly write sequentially to backend devices, so moving to a SMR restricted device modes would be easier for these systems.

Unclear how many storage systems have such a back end, but NetApp uses it for WAFL and just about every other hybrid startup has a LSF format for their backend layout. So being conservative lets say 50% of enterprise hybrid storage vendors use LSF.

The other 60% would have more of a problem implementing SMR restricted mode devices, but it’s only a matter of time before  all will need to go that way. That is assuming they still use disks. So, we are primarily talking about hybrid storage systems.

All major storage vendors support hybrid storage and about 60% of startups support hybrid storage, so adding these to together, maybe about 75% of enterprise storage vendors have hybrid.

Using analysis on QoW 15-001, about 60% of enterprise storage vendors will probably ship new hardware versions of their systems over the next 12 months. So of the 13 likely new hardware systems over the next 12 months, 75% have hybrid solutions and 50% have LSF, or ~4.9 new hardware systems will be released over the next 12 months that are hybrid and have LSF backends already.

What are the advantages of SMR?

SMR devices will have higher storage densities and lower cost. Today disk drives are running 6-8TB and the SMR devices run 8-10TB so a 25-30% step up in storage capacity is possible with SMR devices.

New drive support has in the past been relatively easy because command sets/formats haven’t changed much over the past 7 years or so, but SMR is different and will take more effort to support. The fact that all new drives will be SMR over time gives more emphasis to get on the band wagon as soon as feasible. So, I would give a storage vendor a 80% likelihood of implementing SMR, assuming they have new systems coming out, are already hybrid and are already using LSF.

So of the ~4.9 systems that are LSF/Hybrid/being released *.8, says ~3.9 systems will introduce SMR devices over the next 12 months.

For non-LSF hybrid systems, the effort seems much harder, so I would give the likelihood of implementing SMR about a 40% chance. So of the ~8.1 systems left that will be introduced in next year, 75% are hybrid or ~6.1 systems and they have a 40% likelihood of implementing SMR so ~2.4 of these non-LSF systems will probably introduce SMR devices.

There’s one other category that we need to consider and that would be startups in stealth. These could have been designing their hybrid storage for SMR from the get go. In QoW 15-001 analysis I assumed another ~1.8 startup vendors would emerge to GA over the next 12 months. And if we assume that 0.75% of these were hybrid then there’s ~1.4 startups vendors that could be using SMR technology in their hybrid storage for a (4.9+2.4+1.4(1.8*.75)= 8.7 systems have a high probability of SMR implementation over the next 12 months in GA enterprise storage products.

Forecast

So my forecast of SMR adoption by enterprise storage is Yes for .85 probability (unclear what the probability should be, but it’s highly probable).

~~~~

Comments?

Microsoft Exchange database backup performance – chart of the month

Microsoft Exchange 1001-5000 mailboxes, top 10 database backup per server
In last month’s Storage Intelligence newsletter we discussed the latest Exchange storage system performance for 1001 to 5000 mailboxes. One  charts we updated was the above Exchange database backup on a per server basis. The were two new submissions for this quarter, and both the Dell PowerEdge R730xd (#2 above) and the HP D3600 drive shelf with P441 storage controller (#10) ranked well on this metric.

This ESRP reported metric only measures backup throughput at a server level. However, because these two new submissions only had one server, it’s not as much of a problem here.

The Dell system had a SAS connected JBOD with 14-4TB 7200RPM disks and the HP system had a SAS connected JBOD with 11-6TB 7200RPM disks. The other major difference is that the HP system had 4GB of “flash backed write cache” and the Dell system only had 2GB of  “flash backed cache”.

As far as I can tell the fact that the Dell storage managed ~2.3GB/sec. and the HP storage only managed ~1.1GB/sec is probably mostly due to their respective drive configurations than anything else.

RAID 0 vs. RAID 1

One surprising characteristic of the HP setup is that they used RAID 0 while the Dell system used RAID1. This would offer a significant benefit to the Dell system during heavy read activity, but as I understand it, the database backup activity is run with a standard email stress environment. So in this case, there is a healthy mix of reads/writes going on at the time the backup activity. So the Dell system would have an advantage for reads and a penalty for writes (writing two copies of all data). So Dell’s RAID advantage is probably a wash.

Whether RAID 0 vs. RAID 1 would have made any difference to other ESRP metrics (database transfers per second, read/write/log access latencies, log processing, etc.) is subject for another post.

Of course,  with Exchange DAG’s there’s built in database redundancy so maybe RAID 0 is an OK configuration for some customers. Software based redundancy does seem to be Microsoft’s direction, at least since Exchange 2010, so maybe I’m the one that’s out of touch.

Still for such a small configuration I’m not sure I would have gone with RAID 0…

Comments?