IBM® has recently announced an addition to its popular XIV system that uses an SSD as a caching layer for the storage.
IBM XIV business
Over 5200 storage systems have been shipped to date, and XIV has brought over 1300 new clients to IBM storage. Adoption of the latest version has been rapid, with Gen3 systems constituting 80% of all XIV capacity sold in 4Q11. There are now 59 clients with over 1PB of usable XIV storage and 16 clients with over 2PB. In all, IBM has sold ~350PB of XIV storage since acquiring the technology.
XIV’s new SSD caching
The design goals for XIV SSD caching were that it had to be:
- Invisible, allowing wide deployment without increasing complexity, so customers can focus on capacity and services, not disk technologies;
- Affordable, supplying cost-effective SSD acceleration for the data center; and
- Reliable, providing 99.999% reliability.
The Gen3 XIV storage systems shipped last fall contain an externally accessible PCIe slot, which in Gen3.1 holds a 400GB SSD. SSDs must be added to all nodes in the XIV system, providing up to 6TB of SSD caching. XIV SSDs can be ordered direct from the factory or added non-disruptively on customer premises.
The new XIV SSD is used as a read cache only. Specifically,
- For read requests: XIV’s DRAM cache is checked first, and if the data is not there, the SSD cache is checked. If the data is present in SSD (an SSD read hit), the IO request is served directly from SSD to the front end. If the data is not in SSD (an SSD read miss), the IO request is routed to disk, but a copy of the data is placed in an internal (DRAM) buffer. When this buffer reaches 512KB, it is copied to SSD. Thus the SSD is written in large contiguous blocks of data.
- For write requests: all XIV write data goes directly to DRAM; SSDs are not used during write IO. But as data is destaged from DRAM to disk, it is also copied to SSD.
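The read and write paths above can be sketched in Python. This is a simplified illustration under stated assumptions (the class and method names are hypothetical, and IBM's actual implementation is not public); it shows DRAM checked before SSD, misses staged into a 512KB buffer that is flushed to SSD in one large write, and writes landing in DRAM only until destage:

```python
# Hypothetical sketch of the XIV SSD read-cache path described above.
# Names (SSDReadCache, Disk) are invented for illustration only.

BUFFER_FLUSH_SIZE = 512 * 1024  # stage 512KB before writing to SSD

class Disk:
    """Trivial backing-store stand-in."""
    def __init__(self):
        self.blocks = {}
    def read(self, block_id):
        return self.blocks.get(block_id, b"\x00" * 4096)
    def write(self, block_id, data):
        self.blocks[block_id] = data

class SSDReadCache:
    def __init__(self):
        self.dram = {}          # DRAM cache: block_id -> data
        self.ssd = {}           # SSD read cache: block_id -> data
        self.staging = []       # (block_id, data) awaiting SSD write
        self.staged_bytes = 0

    def read(self, block_id, disk):
        # 1. DRAM cache is checked first.
        if block_id in self.dram:
            return self.dram[block_id]
        # 2. Then the SSD cache (SSD read hit).
        if block_id in self.ssd:
            return self.ssd[block_id]
        # 3. SSD read miss: go to disk, stage a copy in a DRAM buffer.
        data = disk.read(block_id)
        self.staging.append((block_id, data))
        self.staged_bytes += len(data)
        # 4. Once 512KB accumulates, copy it to SSD as one big
        #    contiguous write.
        if self.staged_bytes >= BUFFER_FLUSH_SIZE:
            for bid, d in self.staging:
                self.ssd[bid] = d
            self.staging.clear()
            self.staged_bytes = 0
        return data

    def write(self, block_id, data):
        # Writes go to DRAM only; the SSD is not in the write path.
        self.dram[block_id] = data

    def destage(self, block_id, disk):
        # When DRAM destages to disk, the data is also copied to SSD.
        data = self.dram.pop(block_id)
        disk.write(block_id, data)
        self.ssd[block_id] = data
```

Note that because misses are batched into 512KB flushes, a single small read miss does not immediately populate the SSD, which is what keeps the SSD write pattern large and sequential.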
Recall that XIV uses a RAID-1 implementation, so all data is replicated across two XIV storage nodes. However, SSD caching is only used for the primary copy, on the node that services IO operations for that data. Thus, data is never held in two SSD caches.
In addition, large sequential read IO (>64KB) bypasses the SSDs, as it is more efficient to read large amounts of sequential data directly off disk. As such, large sequential read requests never consume SSD space.
Also, SSD space is allocated in 4K slots and written as a log-structured device. Thus, SSD data is never overwritten in place, which eliminates the wear-leveling logic required by other SSDs and minimizes garbage collection. All this extends NAND lifetime, which has allowed IBM to use lower-cost MLC NAND SSDs rather than the more expensive SLC NAND SSDs used by other enterprise storage systems.
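The log-structured, append-only allocation described above can be sketched as follows. This is a minimal illustration of the general technique, not IBM's code: 4K slots are filled in order at the head of a circular log, so an updated block gets a fresh slot rather than being rewritten in place, and the oldest slots are simply reclaimed as the log wraps:

```python
# Simplified sketch of log-structured SSD allocation in 4K slots.
# Illustrative only; the XIV internals are not public.

SLOT_SIZE = 4096

class LogStructuredSSD:
    def __init__(self, num_slots):
        self.slots = [None] * num_slots   # physical 4K slots
        self.head = 0                     # next slot to append into
        self.index = {}                   # block_id -> current slot

    def put(self, block_id, data):
        assert len(data) <= SLOT_SIZE
        slot = self.head
        # Reclaim whatever occupied this slot as the log wraps, but
        # only drop its index entry if it is still the live copy
        # (a stale, superseded slot no longer owns an index entry).
        old = self.slots[slot]
        if old is not None and self.index.get(old[0]) == slot:
            self.index.pop(old[0])
        # Append at the log head; updates land in a NEW slot, so
        # existing SSD data is never overwritten in place.
        self.slots[slot] = (block_id, data)
        self.index[block_id] = slot
        self.head = (slot + 1) % len(self.slots)

    def get(self, block_id):
        slot = self.index.get(block_id)
        return self.slots[slot][1] if slot is not None else None
```

Because every write is an append, the flash device sees only sequential writes and never a read-modify-write of a live slot, which is the property that reduces wear and garbage-collection pressure on MLC NAND.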
As for the user interface, the customer can enable or disable SSD caching on a volume-by-volume basis with a simple click in the easy-to-use XIV GUI.
Other Gen 3.1 enhancements
XIV now provides inter-generational replication, i.e., XIV Gen 2 can replicate to XIV Gen 3 or vice versa.
Gen3 had provided an iPad mobile dashboard with limited administration capabilities, and Gen3.1 expands this to support the iPhone. XIV’s mobile dashboard uses a secure SSL connection (potentially over a VPN) and supports real-time monitoring of system capacity, health and performance.
- Capacity information includes soft, hard and used storage utilization.
- Performance data includes IOPS, latency and bandwidth for each volume and host, graphed over the last 2 minutes.
- Health information supplies overall health status for the XIV system.
Both iOS apps are available free of charge from Apple’s App Store.
XIV is a bit late to the SSD party, but by using SSD as a cache, IBM has taken steps to make it easy and economical to deploy. However, this approach offers little tuning. That may be by design, but some cache-unfriendly applications still need quick response times yet may not benefit from SSD caching.
On the other hand, it’s unclear why XIV went with an SSD rather than a PCIe NAND card. It may not make much difference, as both plug directly into a PCIe slot. However, with an SSD, something must generate IO requests to the device rather than accessing it as a memory device.
Better late than never. SSD caching is a proven technology used by NetApp, EMC and others. The advantage of caching is that performance boosts can be readily applied across all the data and there is no need to support auto-tiering. But caching alone may not suffice for all data, which is why NetApp and EMC also support SSDs as data drives.
[This storage announcement dispatch was originally sent out to our newsletter subscribers in February of 2012. If you would like to receive this information via email please consider signing up for our free monthly newsletter (see subscription request, above right) and we will send our current issue along with download instructions for this and other reports.]