A new storage benchmark – STAC-M3

A week or so ago I was reading a DDN press release that said they had just published a STAC-M3 benchmark. I had never heard of STAC or their STAC-M3, so I thought I would look them up.

STAC stands for Securities Technology Analysis Center and is an organization dedicated to testing system solutions for the financial services industry.

What does STAC-M3 do?

It turns out that STAC-M3 simulates processing a time-series (ticker tape, or tick) log of security transactions and identifies the maximum and weighted bid, along with various other statistics, for a sample (1%) of the securities over various time periods (year, quarter, week and day) in the dataset. They call it high-speed analytics on time-series data. This is a frequent use case for systems in the securities sector.
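The flavor of query STAC-M3 exercises can be sketched in a few lines. The record layout and field handling here are my assumptions for illustration, not the actual benchmark specification:

```python
# Hypothetical sketch of a STAC-M3-style query: the high bid and a
# size-weighted bid for one symbol over one time window.

def high_and_weighted_bid(ticks):
    """ticks: iterable of (bid_price, bid_size) tuples for one symbol/window."""
    high = 0.0
    notional = 0.0    # running sum of price * size
    size_total = 0.0
    for price, size in ticks:
        high = max(high, price)
        notional += price * size
        size_total += size
    weighted = notional / size_total if size_total else 0.0
    return high, weighted

# Toy tick data for one symbol over one window
ticks = [(10.00, 100), (10.05, 300), (9.95, 200)]
hi, wavg = high_and_weighted_bid(ticks)
```

The benchmark runs queries like this across year, quarter, week and day windows, which is what makes it so read-IO heavy.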

There are two versions of the STAC-M3 benchmark: Antuco and Kanaga. The Antuco version uses a statically sized dataset, while the Kanaga version uses more scalable (varying numbers of client threads) queries over larger datasets. For example, the Antuco version uses 1 or 10 client threads for its test measurements, whereas the Kanaga version scales client threads, in some cases, from 1 to 100 and uses more tick data in 3 different sized datasets.

Good, bad and ugly of STAC-M3

Access to STAC-M3 reports requires a login, but it's available for free. Some details are only available after you request them, which can be cumbersome.

One nice thing about the STAC-M3 benchmark information is that it provides a decent summary of the amount of computational time involved in all the queries it performs. From a storage perspective, one could use this to analyze just the queries with minimal or light computation, which come closer to a pure storage workload than the computationally heavy queries.

Another nice thing about STAC-M3 is that in some cases it provides detailed statistical information about the distribution of metrics, including mean, median, min, max and standard deviation. Unfortunately, the current version of STAC-M3 does not provide these statistics for the computationally light measures that are of primary interest to me as a storage analyst. It would be very nice to see this level of statistical reporting adopted by SPC, SPECsfs or Microsoft ESRP for their benchmark metrics.
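For clarity, the full set of distribution statistics STAC-M3 reports can be computed in a few lines; the response-time values here are made up purely to show what a single mean-only metric hides:

```python
# The per-query distribution statistics STAC-M3 reports (mean, median,
# min, max, standard deviation) over a set of response times.
import statistics

def rt_stats(response_times_ms):
    return {
        "mean":   statistics.mean(response_times_ms),
        "median": statistics.median(response_times_ms),
        "min":    min(response_times_ms),
        "max":    max(response_times_ms),
        "stdev":  statistics.pstdev(response_times_ms),
    }

# One outlier skews the mean badly but barely moves the median,
# which is why median/stdev reporting matters for storage latency.
stats = rt_stats([12.0, 14.0, 13.0, 90.0, 12.5])
```

A benchmark reporting only the mean here would say ~28msec.; the median shows typical latency was really ~13msec.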

STAC-M3 also provides a measure of storage efficiency, or how much storage it took to store the database. This is computed as the reference size of the dataset divided by the amount of storage it took to store the dataset. Although this could be interesting, most of the benchmark reports I examined had similar numbers for storage efficiency: 171% or 172%.
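The efficiency metric is a simple ratio; the dataset sizes below are illustrative values I picked to land near the reported figures, not numbers from any actual submission:

```python
# Storage efficiency as the reports define it: reference dataset size
# divided by the storage actually consumed to hold it.
def storage_efficiency(reference_gb, consumed_gb):
    return reference_gb / consumed_gb

# Illustrative sizes chosen to land near the ~171% most reports show
eff = storage_efficiency(reference_gb=4606, consumed_gb=2690)
```

A value above 1.0 means the database (via compression or its own encoding) stored the data in less space than the reference size.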

The STAC-M3 benchmark is a full stack test. That is, it measures the time from the point a query is issued to the point the query response is completed. Storage is just one part of this activity; computing the various statistics is another part, and the database used to hold the stock tick data is yet another aspect of the test environment. But what is being measured is query elapsed time. SPECsfs2014 has also recently changed over to be a full stack test, so it's not that unusual anymore.

The other problem from a storage perspective (but not a STAC perspective) is that there is minimal write activity during any of the benchmark-specific testing. There's just one query that generates a lot of storage write activity; all the rest are heavy read IO only.

Finally, there’s not a lot of description of the actual storage and server configuration available in the basic report. But this might be further detailed in the Configuration Disclosure report which you have to request permission to see.

STAC-M3 storage submissions

As it's a full stack benchmark, we don't find a lot of storage array submissions. Typical submissions include a database running on some servers with SSD-DAS or, occasionally, a backend storage array. In the case of DDN, it was Kx Systems' kdb+ 3.2 database, running Intel Enterprise Edition for Lustre on 8 Intel-based DDN EXAScaler servers, talking to a DDN SFA12KX-40 storage array. In contrast, a previous submission used an eXtremeDB Financial Edition 6.0 database running on Lucera Compute™ (16 Core SSD V2, SmartOS) nodes.

Looking back over the last couple of years of submissions (STAC-M3 goes back to 2010), for storage arrays, aside from the latest DDN SFA12KX-40, we find an IBM FlashSystem 840, NetApp EF550, IBM FlashSystem 810, a couple of other DDN SFA12K-40 storage solutions, and a Violin V3210 & V3220 submission. Most of the storage array submissions were all-flash arrays, but the DDN SFA12KX-40 is a hybrid (flash-disk) appliance.

Some metrics from recent STAC-M3 storage array runs

STAC-M3-YRHIBID-MAXMBPS

In the above chart, we show the Maximum MBPS achieved in the year long high bid (YRHIBID) extraction testcase. DDN won this one handily with over 10GB/sec for its extraction result.

STAC-M3-YRHIBID-MEANRT

However, in the next chart we show the Mean Response (query elapsed) Time (in msec.) for the year long high bid (YRHIBID) data extraction test. In this case the IBM FlashSystem 810 did much better than the DDN or even the more recent IBM FlashSystem 840.

Unexpectedly, the top MBPS storage didn't achieve the best mean response time for the YRHIBID query. I would have thought the mean response time and the maximum MBPS would show the same rankings. It's not clear to me why they don't, since this is the mean response time, not the minimum or maximum (the maximum response time would presumably track the maximum MBPS rankings). An issue with YRHIBID is that it doesn't report standard deviation, median or minimum response time results. Having these other metrics might have shed some more light on this anomaly, but for now this is all we have.

If anyone knows of other storage (or stack) level benchmarks for other verticals, please let me know and I would be glad to dig into them to see if they provide some interesting views on storage performance.

Comments?

 Photo Credit(s): Stock market quotes in newspaper by Andreas Poike

Max MBPS and Mean RT Charts (c) 2015 Silverton Consulting, Inc., All Rights Reserved

Next generation NVM, 3D XPoint from Intel + Micron

Earlier this week Intel and Micron announced (see webcast here and here) a new, transistor-less NVM with 1000 times the speed of NAND (~10ns [nano-second] access times vs. ~10µsec for NAND) and 10X the density of DRAM (currently 16Gb per DRAM chip). They call the new technology 3D XPoint™ (cross-point) NVM (non-volatile memory).

In addition to the speed and density advantages, 3D XPoint NVM also doesn't have the endurance problems associated with today's NAND. Intel and Micron say that it has 1000 times the endurance of today's NAND (MLC NAND endurance is ~3000 write [P/E] cycles).

At 10X current DRAM density it's roughly equivalent to today's MLC/TLC NAND capacities per chip. And at 1000 times the speed of NAND, it's roughly equivalent in performance to DDR4 DRAM. Of course, because it's non-volatile it should take much less power to use than current DRAM technology, with no need for refresh.
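The announced numbers are easy to sanity-check; these are the figures quoted above, not independent measurements:

```python
# Back-of-envelope check of the Intel-Micron claims as quoted above:
# NAND ~10 usec access vs 3D XPoint ~10 ns, and 10X DRAM density.
nand_access_ns = 10_000    # ~10 usec
xpoint_access_ns = 10      # ~10 ns
speedup = nand_access_ns / xpoint_access_ns    # the claimed 1000X

dram_gb_per_chip = 16 / 8                      # 16Gb DRAM chip = 2GB
xpoint_gb_per_chip = 10 * dram_gb_per_chip     # 10X density claim = 20GB (160Gb)
```

So a 1000X speedup over NAND does indeed put access times in DDR4 DRAM territory, and 10X a 16Gb DRAM chip lands in today's NAND capacity range.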

We have talked about the end of NAND before (see The end of NAND is here, maybe). If this is truly more scalable than NAND, it seems to me that it does signal the end of NAND. It's just a matter of time before endurance and/or density growth of NAND hits a wall, and then 3D XPoint can do everything NAND can do, but better, faster and more reliably.

3D XPoint technology

The technology comes from a dual layer design which is divided into columns; at the top and bottom of the columns are access connections in an orthogonal pattern that together form a grid to access a single bit of memory. This also means that 3D XPoint NVM can be read and written a bit at a time (rather than a "page" at a time as with NAND) and doesn't have to be initialized to 0 to be written, like NAND.
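A cross-point grid means any single cell is selected by one column line and one row line, which is why bit-level access falls out of the design. A hypothetical illustration of the addressing (the line counts are made up):

```python
# Hypothetical cross-point addressing: a cell is selected by one top
# (column) conductor and one bottom (row) conductor, so a single bit
# can be read or written without touching a whole NAND-style page.
def cell_address(bit_index, row_lines):
    # map a linear bit index onto the orthogonal grid of access lines
    return (bit_index // row_lines, bit_index % row_lines)  # (column, row)

# e.g. with an assumed 4096 row lines per layer
col, row = cell_address(bit_index=70_000, row_lines=4096)
```

Contrast this with NAND, where a read or program operation always involves an entire page of cells sharing a word line.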

The 3D nature of the new NVM comes from the fact that you can build up as many layers as you want of these structures to create more and more NVM cells. The microscopic pillars between the two layers of wiring each include a memory cell and a switch component, which allow a bit of data to be accessed (via the switch) and stored/read (memory cell). In the photo above the yellow material is a switch and the green material is a memory cell.

A memory cell operates by using a bulk property change of the material, unlike DRAM (capacitors holding charge) or NAND (floating gates of electrons). As such, it uses all of the material to hold a memory value, which should allow 3D XPoint memory cells to scale downward much better than NAND or DRAM.

Intel and Micron are calling the new 3D XPoint NVM storage AND memory. That is, it's suitable for fast-access, non-volatile data storage and for non-volatile processor memory.

3D XPoint NVM chips in manufacturing today

The first chips with the new technology are being manufactured today at Intel-Micron's joint manufacturing fab in Idaho. These first chips supply 128Gb of NVM and use just two layers of 3D XPoint memory.

Intel and Micron will independently produce system products (read SSDs or NVM memory devices) with the new technology during 2016. They mentioned during the webcast that the technology is expected to be attached (as SSDs) to a PCIe bus and use NVMe as an interface to read and write it. Although if it’s used in a memory application, it might be better attached to the processor memory bus.

The expectation is that the 3D XPoint cost/bit will be somewhere in between NAND and DRAM, i.e. more expensive than NAND but less expensive than DRAM. It’s nice to be the only companies in the world with a new, better storage AND memory technology.

~~~~

Over the last 10 years or so, SSDs (solid state devices) have all used NAND technologies of one form or another, but after today SSDs can be made from NAND or 3D XPoint technology.

Some expected uses for the new NVM are in gaming applications (currently storage-speed and memory constrained) and in-memory databases (which are memory-size constrained). There was mention on the webcast of edge analytics as well.

Welcome to the dawn of a new age of computer storage AND memory.

Photo Credits: (c) 2015 Intel and Micron, from Intel’s 3D XPoint website

VMware VSAN 6.0 all-flash & hybrid IO performance at SFD7

We visited with VMware’s VSAN team during last Storage Field Day (SFD7, session available here). The presentation was wide ranging but the last two segments dealt with recent changes to VSAN and at the end provided some performance results for both a hybrid VSAN and an all-Flash VSAN.

Some new features in VSAN 6.0 include:

  • More scalability, up to 64 hosts in a cluster and up to 200 VMs per host
  • New higher performance snapshots & clones
  • Rack awareness for better availability
  • Hardware based checksum for T10 DIF (data integrity feature)
  • Support for blade servers with JBODs
  • All-flash configurations
  • Higher IO performance

Even in the all-flash configuration there are two tiers of storage: a write cache tier and a capacity tier of SSDs. These are configured with two different classes of SSDs (high-endurance/low-capacity and low-endurance/high-capacity).

At the end of the session, Christos Karamanolis (@Xtosk), Principal Architect for VSAN, showed us some performance charts on VSAN 6.0 hybrid and all-flash configurations.

Hybrid VSAN performance

On the chart we see two plots showing IOmeter performance as VSAN scales across multiple nodes (hosts): on the left a 100% read workload and on the right a 70% read:30% write workload.

The hybrid VSAN configuration has four 10Krpm disks and one 400GB SSD on each host, and ranges from 8 to 64 hosts. The bars on the chart show IOmeter IOPS and the line shows the average response time (or IO latency) for each VSAN host configuration. I am not a big fan of IOmeter, as it's overly simplified, but that's what VMware used.

The results show that in the 100% read case the hybrid, 64-host VSAN 6.0 cluster was able to sustain ~3.8M IOPS, or over 60K IOPS per host. For the mixed 70:30 R:W workload, VSAN 6.0 was able to sustain ~992K IOPS, or ~15.5K IOPS per host.

We see a pretty drastic IOPS degradation (~1/4 the 100% read performance) in the plot on the right, when write activity is added to the mix. But with VSAN's mirrored data protection, each VM write represents at least two VSAN backend writes; at a 70:30 IOmeter R:W mix this would be ~694K read and ~298K write frontend IOPS, and with mirroring those writes represent ~595K writes to backend storage.

Then of course, there's destage activity (data written to SSD needs to be read off SSD and written to HDD), which also multiplies internal IO operations for every external write IOP. Let's say all that activity multiplies each external write by 6 (3 for each mirror: 1 to the write cache SSD, 1 to read it back and 1 to write to HDD). Multiplying that by the ~298K external write IOPS adds up to ~1.8M write-derived IOPS, plus ~0.7M read-derived IOPS, for a total of ~2.5M IOPS, which is still far from the ~3.8M IOPS for 100% read activity. I am probably missing another IO or two in the write path (maybe virtual-to-physical mapping structures need to be updated), or have failed to account for more inter-cluster IO activity in support of the writes.
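The backend-IO arithmetic above can be laid out explicitly. The 6X multiplier per external write is my assumption, as stated, not a figure from VMware:

```python
# Reconstructing the backend-IO estimate: 70:30 R:W front-end IOPS,
# 2-way mirroring, and an assumed 3 internal ops per mirrored write
# (SSD cache write, SSD read-back, HDD write) -- the 6X multiplier.
frontend_iops = 992_000
reads  = int(frontend_iops * 0.70)   # ~694K front-end reads
writes = frontend_iops - reads       # ~298K front-end writes

mirror_copies = 2
ops_per_copy  = 3                    # cache write + read-back + destage write
write_derived = writes * mirror_copies * ops_per_copy   # ~1.8M backend ops
total_backend = reads + write_derived                   # ~2.5M backend ops
```

Even with this generous multiplier, the estimated ~2.5M backend operations fall short of the ~3.8M IOPS the same cluster sustained at 100% read, which is what suggests additional unaccounted write-path IO.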

In addition, we see the IO latency was roughly flat across the 100% read workload at ~2.25msec., and got slightly worse over the 70:30 R:W workload, ranging from ~2.5msec. at 4 hosts to a little over 3.0msec. with 64 hosts. I'm not sure why this got worse, but as hosts are scaled up it could induce more inter-cluster overhead.


In the chart to the right, we can see similar performance data for systems with one or two disk-groups. The message here is that with two disk groups on a host (2X the disk and SSD resources per host) one can potentially double the performance of the systems, to 116K IOPS/host on 100% read and 31K IOPS/host on a 70:30 R:W workload.

All-flash VSAN performance


Here we can see performance data for an 8-host, all-flash VSAN configuration. In this case the chart on the left was a single "disk" group and the chart on the right was a dual disk group, all-flash configuration on each of the 8 hosts. The hosts were configured with one 400GB and three 800GB SSDs per disk group.

The various bars on the charts represent different VM working set sizes, 100, 200, 400 & 600GB for the single disk group chart and 100, 200, 400, 800 & 1200GB for dual disk group configurations. For the dual disk group, the 1200GB working set size is much bigger than a cache tier on each host.

The chart text is a bit confusing: the title of each plot says 70% read but the text under the two plots says 100% read. I must assume these were 70:30 R:W workloads. If we just look at the 8 hosts using a 400GB VM working set size, the all-flash VSAN 6.0 single disk group cluster was able to achieve ~37.5K IOPS/host, and with two disk groups ~68.75K IOPS/host. Both roughly double the hybrid performance.

Response times degrade for both the single and dual disk groups as we increase the working set sizes. It's pretty hard to see on the two charts, but latency seems to range from 1.8msec. to 2.2msec. for the single disk group and 1.8msec. to 2.5msec. for the dual disk group. The two charts are somewhat misleading because they didn't use the exact same working set sizes for the two performance runs, but taking just the 100|200|400GB working set sizes, for the single disk group it looks like latency went from ~1.8msec. to ~2.0msec. and for the dual disk group from ~1.8msec. to ~2.3msec. Why the degradation is higher for the dual disk group is anyone's guess.

The other thing that doesn't make much sense is that as you increase the working set size, the number of IOPS goes down, and worse for the dual disk group than the single. Once again taking just the 100|200|400GB working set sizes, IOPS ranges from ~350K to ~300K (~15% drop) for the single disk group and from ~700K to ~550K (~22% drop) for the dual disk group.

Increasing working set size should cause additional backend IO, as cache effectiveness should be proportionately less as the working set grows, which I think goes a long way toward explaining the degradation in IOPS. But I would have thought the degradation would be proportionally similar for both the single and dual disk groups. The fact that the dual disk group did 7 percentage points worse seems to indicate more overhead associated with dual disk groups, or perhaps they were running up against some host controller limits (a single controller supporting both disk groups).
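The cache-effectiveness argument can be made concrete with a toy model. All numbers and the hit-rate formula below are my simplifying assumptions, not VSAN internals:

```python
# Toy model of why IOPS falls as working set grows: assume hit rate
# ~ cache_size / working_set (capped at 1), with fast cache hits and
# slow capacity-tier misses. Numbers are illustrative only.
def effective_iops(cache_gb, working_set_gb, hit_iops, miss_iops):
    hit_rate = min(1.0, cache_gb / working_set_gb)
    # average service time blends the hit and miss paths
    avg_time = hit_rate / hit_iops + (1 - hit_rate) / miss_iops
    return 1 / avg_time

# Working set within cache vs. 3X the cache size
iops_small = effective_iops(400, 400,  hit_iops=350_000, miss_iops=100_000)
iops_large = effective_iops(400, 1200, hit_iops=350_000, miss_iops=100_000)
```

The model reproduces the qualitative behavior (IOPS drops as the working set outgrows cache), though it predicts a similar drop for single and dual disk groups, so it can't by itself explain why the dual group degraded more.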

~~~~

At the time (3 months ago) this was the first worldwide look at all-flash VSAN 6.0 performance. The charts are a bit more visible in the video than in my photos, and if you want to see and hear Christos's performance discussion, check out ~11 minutes into the final video segment.

For more information you can also read these other SFD7 blogger posts on VMware’s session:

 

DDN unchains Wolfcreek, unleashes IME and updates WOS

It's not every day that we get a vendor claiming 2.5X the top SPC-1 IOPS (currently held by the Hitachi G1000 VSP all-flash array at ~2M IOPS), as DataDirect Networks (DDN) has claimed for an all-flash version of their new Wolfcreek hyper converged appliance. DDN says their new 4U appliance is capable of 60GB/sec of throughput and over 5M IOPS. (See their press release for more information.) It's unclear if these are SPC-1 IOPS or not; I haven't seen any SPC-1 report on it yet.

In addition to the new Wolfcreek appliance, DDN announced their new Infinite Memory Engine™ (IME) flash caching software and WOS® 360 V2.0, an enhanced version of their object storage.

DDN, if you haven't heard of them, has done well in Web 2.0 environments and is a leading supplier to high performance computing (HPC) sites. They have an object storage system (WOS), all-flash block storage (SFA12KXi), hybrid (disk-SSD) block storage (SFA7700X™ & SFA12KX™), a Lustre file appliance (EXAScaler), an IBM GPFS™ NAS appliance (GRIDScaler), a media server appliance (MEDIAScaler™) and software defined storage (Storage Fusion Accelerator [SFX™] flash caching software).

Wolfcreek hyper converged appliance

The converged solution comes in a 4U appliance using dual Haswell Intel microprocessors (with up to 18 cores each) and includes a PCIe fabric which supports 48 NVMe flash cards or 72 SFF SSDs. With the NVMe cards or SSDs, Wolfcreek will use their new IME software to accelerate IO activity.

Wolfcreek IME software supports either a burst-mode IO caching cluster or a storage cluster of nodes. I assume burst mode is a storage caching layer for backend file system storage. As a storage cluster, I assume this would include some of their scale-out file system software on the nodes. The Wolfcreek cluster interconnect is 40Gb InfiniBand or 10/40Gb Ethernet, and it will also support Intel's Omni-Path. The Wolfcreek appliance is compatible with HPC Lustre and IBM GPFS scale-out file systems.

The Wolfcreek appliance can be a great platform for OpenStack and Hadoop environments. But it also supports virtual machine hypervisors from VMware, Citrix and Microsoft. DDN says the Wolfcreek appliance can scale up to support 100K VMs. I've been told that IME will not be targeted at hypervisors in the first release.

Recall that with a hyper converged appliance, some portion of the system resources (memory and CPU cores) must be devoted to server and VM application activities and the remainder to storage activity. How this is divided up and whether this split is dynamic (changes over time) or static (fixed over time) in the Wolfcreek appliance is not indicated.

The hyper converged field is getting crowded of late, what with VMware EVO:RAIL, Nutanix, Scale Computing, SimpliVity and others coming out with solutions. But there aren't many that support all-flash storage, and it seems unusual that hyper converged customers would need that much IO performance. But I could be wrong, especially for HPC customers.

There's much more to hyper convergence than just having storage and compute in the same node. The software that links it all together and manages, monitors and deploys these combined hypervisor, storage and server systems is almost as important as any of the hardware. There wasn't much talk about the software DDN is putting together for Wolfcreek, but it's still early yet. With their roots in HPC, it's likely that any DDN hyper converged solution will target this market first and broaden out from there.

Infinite Memory Engine (IME)

IME is an outgrowth of DDN's SFX software and seems to act as a caching layer for parallel file system IO. It makes use of NVMe or SSDs for its IO caching and, according to DDN, can offer up to 1000X IO acceleration to storage or 100X file system acceleration.

It does this primarily by providing an application aware IO caching layer and supplying more effective IO to the file system layer using PCIe NVMe or SSD flash storage for hardware IO acceleration. According to the information provided by DDN, IME can provide 50 GB/sec bandwidth to a host compute cluster while only doing 4GB/sec of throughput to a backend file system, presumably by better caching of file IO.
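The quoted bandwidth figures let us infer how much IO the cache absorbs; the inference (and treating the ratio as a simple hit fraction) is my own back-of-envelope, not DDN's:

```python
# What the IME numbers imply: 50 GB/s served to the host compute
# cluster while only 4 GB/s reaches the backend file system, so most
# file IO bandwidth is absorbed by the flash caching layer.
host_gbps, backend_gbps = 50, 4
absorbed  = 1 - backend_gbps / host_gbps  # fraction absorbed by cache (~0.92)
reduction = host_gbps / backend_gbps      # 12.5X less backend traffic
```

In other words, roughly 92% of the file IO bandwidth never touches the backend file system, which is where the claimed file system acceleration would come from.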

WOS 360 V2.0

The new WOS 360 V2.0 object storage system features include

  • Higher density storage packaging (98 8TB SATA drives, or 768TB raw capacity, in 4U), supporting 8B objects each and over 100B objects in a cluster.
  • Native Swift API support for OpenStack environments, which includes gateway or embedded deployments, up to 5000 concurrent users and 5B objects per namespace.
  • Global ObjectAssure data encoding with lower storage overhead (1.5X, or a 20% reduction from their previous encoding option) for highly durable and available object storage, using a two-level hierarchical erasure code.
  • Enhanced network security with SSL, which provides end-to-end SSL network data transport between clients and WOS and between WOS storage nodes.
  • Simplified cluster installation, deployment and maintenance, which can now deploy a WOS cluster in minutes with a simple point-and-click GUI, plus automated non-disruptive software upgrades.
  • Performance improvements for better video streaming, content distribution and large file transfers with improved QoS for latency sensitive applications.

~~~~

There's probably more going on with DDN than covered here, but this hits the highlights. I wish there were more on the Wolfcreek appliance's various configurations and performance benchmarks, but there's not.

Comments?

 Photo Credits: wolf-63503+1920 by _Liquid

 

Springpath SDS springs forth

Springpath presented at SFD7 and has a new Software Defined Storage (SDS) that attempts to provide the richness of enterprise storage in a SDS solution running on commodity hardware. I would encourage you to watch the SFD7 video stream if you want to learn more about them.

HALO software

Their core storage architecture is called HALO, which stands for Hardware Agnostic Log-structured Object store. We have discussed log-structured file systems before. They are essentially a sequential file structure that can be randomly read but is sequentially written. Springpath's HALO was written from scratch, operates in user space and, unlike many SDS solutions, has no dependencies on Linux file systems.

HALO supports both data deduplication and compression to reduce storage footprint. The other unusual feature is that they support both blade servers and standalone (rack) servers as storage/compute nodes.

Tiers of storage

Each storage node can optionally have SSDs as a persistent cache, holding write data and metadata log. Storage nodes can also hold disk drives used as a persistent final tier of storage. For blade servers, with limited drive slots, one can configure blades as part of a caching tier by using SSDs or PCIe Flash.

All data is written to the (replicated) caching tier before the host is signaled that the operation is complete. Write data is destaged from the caching tier to the capacity tier over time, as the caching tier fills up. Data reduction (compression/deduplication) is done at destage.

The caching tier also holds frequently read data in a read cache, and it has a non-persistent segment in server RAM.

Write data is distributed across caching nodes via a hashing mechanism which allocates portions of an address space across nodes. But during cache destage, the data can be independently spread and replicated across any capacity node, based on available free space. This is made possible by their file system metadata.
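A minimal sketch of that hash-based write placement, assuming (my assumption, not Springpath's documented design) that the address space is carved into fixed-size extents and each extent hashes to an owning cache node:

```python
# Sketch of hash-based write placement across caching nodes: a fixed-
# size extent of the address space deterministically maps to one node,
# so any client can compute the owner without coordination. Destage
# can later place the data on any capacity node independently.
import hashlib

def cache_node_for(lba, extent_size, nodes):
    extent = lba // extent_size
    digest = hashlib.sha256(extent.to_bytes(8, "little")).digest()
    return nodes[int.from_bytes(digest[:8], "little") % len(nodes)]

nodes = ["cache-a", "cache-b", "cache-c"]
owner = cache_node_for(lba=1_234_567, extent_size=4096, nodes=nodes)
```

The key property is determinism: every writer computes the same owner for a given address, while the destage path is free to choose capacity nodes by free space instead.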

The capacity tier is split into data and metadata partitions. Metadata is also present in the caching tier. Data is deduplicated and compressed at destage, but when read back into cache it's only decompressed. Both capacity tier and caching tier nodes can have different capacities.

HALO has some specific optimizations for flash writing, which include always writing a full SSD/NAND page and using TRIM commands to free up flash pages that are no longer being used.
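The full-page-write optimization fits naturally with a log-structured layout: small writes accumulate in a buffer and only page-sized chunks go to flash. A sketch under assumed interfaces (the page size and device call are hypothetical, not HALO's actual code):

```python
# Sketch of full-page flash writing: buffer log appends so the device
# only ever sees whole NAND-page writes; partial pages stay buffered.
PAGE = 16 * 1024  # assumed NAND page size

class PageAlignedLog:
    def __init__(self):
        self.buf = bytearray()
        self.flushed_pages = 0

    def append(self, data: bytes):
        self.buf += data
        while len(self.buf) >= PAGE:       # only full pages go to flash
            self._write_page(bytes(self.buf[:PAGE]))
            del self.buf[:PAGE]

    def _write_page(self, page: bytes):
        self.flushed_pages += 1            # stand-in for the device write

log = PageAlignedLog()
log.append(b"x" * (3 * PAGE + 100))        # 3 full pages flushed, 100B buffered
```

TRIM then complements this: once the log garbage-collects a segment, the pages it occupied can be handed back to the SSD so its own garbage collector has free blocks to work with.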

HALO SDS packaging under different Hypervisors

In Linux and OpenStack environments they run the whole storage stack in Docker containers, primarily for image management/deployment, including rolling upgrade management.

In VMware and Hyper-V, Springpath runs as a VM and uses direct-path IO to access the storage. For VMware, Springpath looks like an NFSv3 datastore with VAAI and VVOL support. In Hyper-V, Springpath's SDS is an SMB storage device.

For KVM it's NFS storage; for OpenStack one can use NFS or their Cinder plugin for volume support.

The nice thing about Springpath is you can build a cluster of storage nodes consisting of VMware, Hyper-V and bare-metal Linux nodes that supports all of them. (Does this mean it's multi-protocol, supporting SMB for Hyper-V and NFSv3 for VMware?)

HALO internals

Springpath supports (mostly) file, block (via a Cinder driver) and object access protocols. The backend caching and capacity tiers both use a log-structured file structure internally to stripe data across all the capacity and caching nodes. Data compression works very well with log-structured file systems.

All customer data is supported internally as objects. HALO has a write-log which is spread across their caching tier and a capacity-log which is spread across the capacity tier.

Data is automatically re-balanced across nodes when new nodes are added or old nodes deleted from the cluster.

Data is protected via replication. The system requires a minimum of 3 SSD (caching) nodes and 3 drive (capacity) nodes to be fully operational, but these can reside on the same servers. However, the replication factor can be configured to be less than 3 if you're willing to live with the potential loss of data.

Their system supports both snapshots (2**64 per object) and storage clones for test/dev and backup requirements.

Springpath seems to have quite a lot of functionality for an SDS, although native FC & iSCSI support is lacking. For a file-based SDS for hypervisors, it seems to have a lot of the bases covered.

Comments?

Other SFD7 blogger posts on Springpath:

Picture credit(s): Architectural layout (from SpringpathInc.com) 

Nantero emerges from stealth with CNT based NRAM

Nantero just came out of stealth this week and bagged $31.5M in a Series E funding round. Apparently, Nantero has been developing a new form of non-volatile RAM (NRAM), based on carbon nanotubes (CNT), which seems to work like an old T-bar switch, only at the nm scale and using CNT for the wiring.

They were founded in 2001, and are finally ready to emerge from stealth. Nantero already has 175+ issued patents, with another 200 patents pending. The NRAM is already in production at 7 CMOS fabs, and they are sampling 4Mb NRAM chips to a number of customers.

NRAM vs. NAND

Performance of the NRAM is on a par with DRAM (~100 times faster than NAND); it can be configured in 3D and supports MLC (multi-bit per cell) configurations. NRAM also supports orders of magnitude more accesses (I assume they mean writes) and stores data much longer than NAND.

The only question is capacity: with shipping NAND on the order of 200Gb per chip, the 4Mb NRAM is about 50,000X (roughly 2**15.6) behind NAND. Nantero claims that their CNT-NRAM CMOS process can be scaled down below 5nm, which is one or two generations beyond the current NAND scale factor; assuming they can pack as many bits in the same area, they should be able to compete well with NAND. They claim that their NRAM technology is capable of terabit capacities (assumed to be at the 5nm node).
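The size of that capacity gap is worth putting in powers of two, since each chip generation roughly doubles density; the chip capacities are the figures quoted above:

```python
# The capacity gap in doublings: shipping NAND ~200Gb per chip vs the
# 4Mb NRAM samples (Mb/Gb here are in bits, as quoted in the text).
import math

nand_bits = 200 * 2**30   # ~200Gb NAND chip
nram_bits = 4 * 2**20     # 4Mb NRAM sample
gap_doublings = math.log2(nand_bits / nram_bits)   # ~15.6 doublings
```

At one density doubling per generation, that's on the order of fifteen generations of scaling to close, which is why the ramp rate from 4Mb is the significant question.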

The other nice thing is that Nantero says the new NRAM uses less power than DRAM, which means that in addition to attaining higher capacities and DRAM-like access times, it will also reduce power consumption.

It seems a natural for mobile applications. The press release claims it was already tested in space, and there are customers looking at the technology for automobiles. The company claims the total addressable market is ~$170B USD, which probably includes DRAM and NAND together.

CNT in CMOS chips?

Key to Nantero's technology was incorporating CNT into CMOS processes, so that chips can be manufactured on current fab lines. It's probably just the start of the use of CNT in electronic chips, but it's one that could potentially pay for the technology development many times over. CNT has a number of characteristics which would be beneficial to other electronic circuitry beyond NRAM.

How quickly they can ramp the capacity up from 4Mb seems to be a significant factor. Which is no doubt, why they went out for Series E funding.

So we have another new non-volatile memory technology. On the other hand, these guys seem to be well beyond the lab, with something that works today and the potential to go all the way down to 5nm.

It should be interesting, as the other NV technologies start to emerge, to see which one generates sufficient market traction to succeed in the long run, especially as NAND doesn't seem to be slowing down much.

Comments?

Picture Credits: Wikimedia.com

EMCWorld2015 day 1 news

We are at EMCWorld2015 in Vegas this week. Day 1 was great, with new XtremIO 4.0 "The Beast", enhanced Data Protection, and a new VCE VxRACK converged infrastructure solution announced. Somewhere in all the hoopla I saw an all-flash VNXe appliance and a VMAX3 with a cloud storage tier, but these seemed to be just teasers.

XtremIO 4.0

The new hardware provides 40TB per X-Brick, and with compression/dedupe the new 8-X-Brick cluster provides 320TB raw or 1.9PB effective capacity. As XtremIO supports 150K mixed IOPS per X-Brick, an 8-X-Brick cluster could do 1.2M IOPS, or at 250K read IOPS per X-Brick, 2.0M read IOPS.
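The cluster arithmetic checks out, and it also reveals the data-reduction ratio the effective-capacity claim implies. All inputs are the announced figures; the reduction ratio is my derivation, not an EMC-stated number:

```python
# Scaling the announced per-X-Brick figures to the new 8-brick limit.
bricks = 8
raw_tb     = 40 * bricks          # 320TB raw
mixed_iops = 150_000 * bricks     # 1.2M mixed IOPS
read_iops  = 250_000 * bricks     # 2.0M read IOPS

# 1.9PB effective over 320TB raw implies the dedupe+compression ratio
reduction = 1900 / raw_tb         # ~5.9:1
```

So the 1.9PB effective figure assumes roughly 6:1 combined dedupe and compression across the cluster.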

XtremIO 4.0 now also includes RecoverPoint integration. (I assume this means they have integrated the write splitter directly into XtremIO, so you don't need the host or switch version of the write splitter.)

The other thing XtremIO 4.0 introduces is non-disruptive upgrades. This means that they can expand or contract the cluster without taking down IO activity.

There was also some mention of better application consistent snapshots, which I suspect means Microsoft VSS integration.

XtremIO 4.0 is a free software upgrade, so the ability to scale up to 8 X-Bricks, non-disruptive cluster changes and RecoverPoint integration can all be added to current XtremIO systems.

Data Protection

EMC introduced a new top-end DataDomain hardware appliance, the DataDomain 9500, which has 1.5X the performance (58.7TB/hr) and 4X the capacity (1.7PB) of their nearest competitor's solution.

They also added a new software feature (from the Maginatics acquisition) called CloudBoost™. CloudBoost allows NetWorker and Avamar to back up to cloud storage. EMC also added Microsoft Office 365 cloud backup to Spanning's previous Google Apps and Salesforce cloud backups.

VMAX3 ProtectPoint was also enhanced to provide native backup for Oracle, Microsoft SQL Server, and IBM DB2 application environments. ProtectPoint offers a direct path between VMAX3 and DataDomain appliances and can speed up backup performance by 20X.

EMC also announced Project Falcon, which is a virtual appliance version of DataDomain software.

VCE VxRACK

This is a rack-sized stack of VSPEX Blue appliances (a VMware EVO:RAIL solution) with new software that brings VCE usability and data-center-scale services to a hyper-converged solution. Each appliance is a 2U rack-mounted compute-intensive or storage-intensive unit. The Blue appliances are configured in a rack for VxRACK, and with version 1 you can choose your own hypervisor (VMware or KVM). Version 2 will come out later this year and will be based on a complete VMware stack known as EVO: RACK.

Storage services are supplied by EMC ScaleIO. You can purchase a 1/4 rack, 1/2 rack or full rack, which includes top-of-rack networking. You can also scale out by adding more full racks to the system. EMC said it can technically support 1000s of racks of VSPEX Blue appliances, for up to ~38PB of storage.

The significant thing is that VCE VxRACK supplies the VCE customer experience in a hyper-converged solution. However, the focus for VxRACK is tier 2 applications that don't need the extremely high availability, low response times and high performance of the tier 1 applications that run on their VBLOCK solutions (with VNX, VMAX or XtremIO storage).

VMAX3

They had a 5th grader provision a VMAX3 gold storage LUN and convert it to a diamond storage LUN in 20.48 seconds. It seemed pretty simple to me, but the kid blazed through the screens a bit too fast for me to see what was going on. It wasn't nearly as complex as it used to be.

VMAX3 also introduces CloudArray™, which uses FAST.X storage tiering to cloud storage (using onboard TwinStrata software). This could be used as tier 3 or 4 storage. EMC also mentioned that you can have an XtremIO (maybe an X-Brick) behind a VMAX3 storage system. VMAX3's software rewrite has separated data services from backend storage, and one can see EMC rolling out different backend storage (like cloud storage or XtremIO) in future offerings.

Other Notes

There was a lot of discussion about the “Information Generation” a new customer for IT services. This is tied to the 3rd platform transformation that’s happening in the industry today. To address this new world IT needs to have 5 attributes:

  1. Predictively spot new opportunities for services/products
  2. Deliver a personalized experience
  3. Innovate in an agile way
  4. Demonstrate transparency & trust in their programs/apps
  5. Operate in real time

David Goulden talked a lot about what this all means and I encourage you to take a look at the video stream to learn more.

Speaking of video, last year was the first year there were more online viewers of EMCWorld than actual attendees. So this year EMC upped their game with more entertainment value. The opening dance sequence was pretty impressive.

A lot of talk today was about the 3rd platform and the transition from the 2nd platform. EMC says their new products are Platform 2.5, which are enablers for the 3rd platform. I asked what the 3rd platform storage environment looks like, and they said a scale-out (read ScaleIO), converged storage environment with flash for meta-data/indexing.

As the 3rd platform transforms IT there will be some customers that will want to own the infrastructure, some that will want to use service providers and some that will use public cloud services. EMC’s hope is to capture those customers that want to own it or use service providers.

Tomorrow the focus will be on the Federation with Pivotal and VMware being up for keynotes and other sessions. Stay tuned.


Object store and hybrid clouds at Cloudian

Out of Japan comes another object storage provider called Cloudian. We talked with Cloudian at Storage Field Day 7 (SFD7) last month in San Jose (see the videos of their presentations here).

Cloudian history

Cloudian has been on the market since March of 2011, but we haven't heard much about them, probably because their focus has been East Asia. The same day the Tōhoku earthquake and tsunami hit, the company announced Cloudian, an Amazon S3-compliant, multi-tenant cloud storage solution.

Their timing couldn't have been better. Over the next two years, Japanese IT organizations were beating down their door for a usable and (earthquake and tsunami) resilient storage solution.

Cloudian spent the next 2 years hardening their object storage system, HyperStore, and now they are ready to take on the rest of the world.

Currently Cloudian has about 20PB of storage under management and ships HyperStore as either an appliance or a software-only distribution. Cloudian's solutions support S3 and NFS access protocols.

Their solution uses Cassandra, the highly scalable, distributed NoSQL database that came out of Facebook, for their metadata database. This provides a scalable, shared-nothing repository for object metadata storage and lookup.

Cloudian creates virtual storage pools on backend storage which can be optimized for small objects, replication or erasure coding and can include automatic tiering to any Amazon S3 and Glacier compatible cloud storage. I would guess this is how they qualify for Hybrid Cloud status.

The HyperStore appliance

Cloudian organizes HyperStore nodes into a P2P ring structure. Each appliance runs the Cloudian management console services as well as the HyperStore engine, which supports three different data stores: Cassandra, replicas, and erasure coding. Unlike Scality, it appears that only one HyperStore ring can exist per region, though it can be split across data centers. It's unclear what their definition of a "region" is.

HyperStore hardware comes in entry-level (HSA-2024: 24TB/1U), capacity-optimized (HSA-2048: 48TB/1U), and performance-optimized (HSA-2060: all flash, 60TB/2U) models.

Replication with Dynamic Consistency

The other thing that Cloudian supports is different levels of consistency for replicated data. Most object stores support eventual consistency (see Eventual Data Consistency and Cloud Storage post).  HyperStore supports 3 (well maybe 5) different levels of consistency:

  1. One – the object is written and committed to one replica before responding to the client
  2. Quorum – the object is written to N/2+1 replicas before responding to the client
    1. Local Quorum – replicas are written to N/2+1 nodes in the same data center before responding to the client
    2. Each Quorum – replicas are written to N/2+1 nodes in each data center before responding to the client
  3. All – all replicas must have received and committed the object write before responding to the client

There are corresponding read consistency levels as well. Object writes have a "coordinator" node which enforces this consistency. The implication is that consistency could be set on a per-object basis. It's unclear to me whether read and write consistency levels can differ.
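The consistency levels above boil down to how many replica acknowledgments the coordinator waits for before responding. Here's a minimal sketch of that counting logic; this is my own illustration of the general quorum scheme (as in Cassandra-style systems), not Cloudian's actual implementation, and the function and level names are assumptions:

```python
# Hypothetical sketch of a write coordinator's acknowledgment counting.
# Names and structure are illustrative, not Cloudian's implementation.
def required_acks(level, replicas, replicas_per_dc=None):
    """Return how many replica acks the coordinator must collect
    before responding to the client, for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replicas // 2 + 1  # N/2 + 1, integer division
    if level == "ALL":
        return replicas
    if level == "EACH_QUORUM" and replicas_per_dc:
        # a quorum of the replicas within *every* data center
        return {dc: n // 2 + 1 for dc, n in replicas_per_dc.items()}
    raise ValueError("unknown consistency level: %s" % level)

# With 3 replicas: ONE needs 1 ack, QUORUM needs 2, ALL needs 3.
```

With 3 replicas, a quorum of 2 lets the system tolerate one slow or failed node on writes while still guaranteeing that any quorum read overlaps the latest quorum write.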

Apparently small objects are also stored in the Cassandra datastore; that way HyperStore optimizes for object size. Also, HyperStore nodes can be added to a ring, and the system will automatically rebalance data across the old and new nodes.

Cloudian also supports object versioning, ACLs, and QoS services.

~~~

I was a bit surprised by Cloudian. I thought I knew all the object storage solutions on the market. But then again, they made their major strides in Asia, and as an on-premises Amazon S3 solution rather than a generic object store.

For more on Cloudian from SFD7 see:

Cloudian – Storage Field Day 7 Preview by @VirtualizedGeek (Keith Townsend)