IO500 performance report as of September 2020

This Storage Intelligence (StorInt™) dispatch covers the IO500 benchmark, a first for us. The IO500 is a relatively new benchmark focused on HPC (high performance computing) application use of file systems. Many HPC file systems lack support for NFS or SMB and instead use POSIX client software to access data residing on their file systems. The IO500 maintains two lists of submissions: one that allows systems with any (client) node count and another that limits client node counts to 10.

IO500 ranks submissions using a composite score that is a function of 4 bandwidth intensive tests (IOR easy read, easy write, hard read, and hard write) and 8 IOPS intensive tests (MD easy write, stat, and delete; MD hard write, stat, delete, and read; and easy find). IOR workloads simulate big block, bandwidth intensive IO activity, while MD (and find) workloads simulate small block, IOPS intensive IO activity. Both are used to score systems.

The IO500 reports and ranks systems by computing a composite (geomean) of system scores on bandwidth and IOPS intensive workloads. The bandwidth intensive composite score is a geomean of all 4 bandwidth intensive benchmarks and the IOPS composite score is a geomean of all 8 IOPS intensive benchmarks.
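To make the scoring arithmetic concrete, below is a minimal Python sketch of the two-level geomean calculation. The test names follow the IO500 phases, but the per-test values are made-up illustrations, and the official io500 tooling computes these scores itself:

```python
from math import prod

# Hypothetical bandwidth results in GiB/s for the 4 IOR tests
bw_tests = {
    "ior_easy_write": 40.0,
    "ior_easy_read": 45.0,
    "ior_hard_write": 2.0,
    "ior_hard_read": 4.0,
}

# Hypothetical metadata results in kIOPS for the 8 IOPS intensive tests
md_tests = {
    "mdtest_easy_write": 100.0,
    "mdtest_easy_stat": 250.0,
    "mdtest_easy_delete": 90.0,
    "mdtest_hard_write": 20.0,
    "mdtest_hard_stat": 180.0,
    "mdtest_hard_delete": 15.0,
    "mdtest_hard_read": 60.0,
    "find": 500.0,
}

def geomean(values):
    """Geometric mean: the nth root of the product of n values."""
    vals = list(values)
    return prod(vals) ** (1.0 / len(vals))

bw_score = geomean(bw_tests.values())  # bandwidth composite (GiB/s)
md_score = geomean(md_tests.values())  # IOPS composite (kIOPS)

# The overall score is the geomean of the two composite scores
io500_score = (bw_score * md_score) ** 0.5

print(f"BW composite:   {bw_score:7.2f} GiB/s")
print(f"IOPS composite: {md_score:7.2f} kIOPS")
print(f"IO500 score:    {io500_score:7.2f}")
```

Because geomeans multiply, one very poor result on any single test can drag down an otherwise strong composite score, so the rankings tend to reward balanced systems.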

IO500 any client node count results

We start our discussion with the overall composite scores as reported by IO500.

Figure 1 Top 10 IO500 any client node count composite score results

In Figure 1, Intel’s DAOS (Distributed Asynchronous Object Storage) file system came in 1st, 3rd and 4th with 52, 60 and 16 client nodes, using Omnipath, Infiniband and 100GbE networking, respectively. All these systems use Intel’s Optane PMEM for metadata and small file storage. DAOS is an Intel-developed, open source file system designed to take advantage of Intel’s latest technologies. DAOS only supports SSD and PMEM storage; these three systems had 60, 24 and 12 SSDs and were submitted by Intel (Wolf), TACC (Texas Advanced Computing Center, Frontera) and Argonne National Laboratory (ANL, Presque).

A WekaIO-submitted system running the WekaIO file system on AWS came in at #2. It used 345 client nodes, Ethernet networking and 960 NVMe SSDs. Running on AWS infrastructure probably had some (negative) impact on its performance.

Coming in at #5 was a Tianhe-2E system submitted by the National Supercomputing Center in Changsha, running the Tianhe-FS (Lustre-based) file system with 480 client nodes, Omnipath networking and 572 SSDs.

Next, in Figure 2, we drill down into one of the two components of the IO500 composite score, the composite score for all the bandwidth (BW) intensive workloads.

Figure 2 IO500 any node count bandwidth only composite score results

In Figure 2, we show the top 10 bandwidth composite score results. Recall that the IO500 score is made up of two composites: the bandwidth intensive composite score (shown here) and the IOPS composite score (not shown).

For bandwidth workloads, the #1 system is the KISTI Nurion DDN IME storage system, which achieved 516 GiB/sec using 2048 client nodes, Omnipath networking and 768 SSDs. Another DDN file system came in at #3 with 349 GiB/sec using 512 client nodes, Omnipath networking and 1200 SSDs.

The Wolf-Intel DAOS solution ranked #2 on bandwidth with 372 GiB/sec. We discussed its configuration above.

The #4 ranked system, with 293 GiB/sec, was a BeeGFS-submitted system running the BeeGFS file system on Oracle cloud infrastructure with 270 client nodes, Ethernet networking and (we believe) 270 SSDs.

The #5 system, with 209 GiB/sec in bandwidth, was the Tianhe-2E Tianhe-FS (Lustre-based) file system discussed previously.

We would show the MD (IOPS) composite score rankings, but they would show the exact same rankings as Figure 1, just with different numbers.

IO500 10 client node results

Figure 3 IO500 10 node count composite score results

In Figure 3, we show the 10-client node IO500 composite score rankings. This time, Intel DAOS systems took the #1, #2 and #3 rankings, all with 10 client nodes, submitted by Intel (Wolf), TACC (Frontera) and ANL (Presque), respectively.

The #4 10-node score was from an NVIDIA DGX-2H SuperPOD with a DDN Lustre file system and 10 client nodes (we have no other configuration information).

The #5 ranked system is a WekaIO file system with 10 client nodes (we believe not running on AWS).

Significance

IO500 represents a unique set of benchmarks for HPC file systems that differs substantially from the SPEC sfs benchmarks we have been tracking for 13 years now. The problem with SPEC sfs is that there have been few new submissions each quarter since the update to SPEC sfs2014. The IO500 updates its lists twice a year (during the ISC and SC conferences), so we will be adding IO500 to the standard round of benchmarks we report on and analyze.

It’s unclear why the IO500 uses so many workloads (4 bandwidth and 8 IOPS); SPEC sfs seems to get by with just 5. It’s quite possible that HPC has a more diverse workload environment than standard IT. Nonetheless, we like to see storage performance results that measure both bandwidth and IOPS. It’s almost like a combined SPC-1 and SPC-2 benchmark that uses the same hardware.

This is our first attempt at analyzing IO500 results. We thought it best to report on standard IO500 metrics. We have identified a few different ways to cut their data but decided to save those for a later report.

We found the IO500 performance data fairly detailed but would like to see more consistent configuration information. For some submissions it was impossible to determine the file system, SSD/disk drive counts, and other details. We believe this will be corrected in time.

[This storage performance dispatch was originally sent out to our newsletter subscribers in September of 2020. If you would like to receive this information via email, please consider signing up for our free monthly newsletter (see subscription request, above right) and we will send our current issue along with download instructions for this and other reports. Dispatches are posted to our website at least a quarter or more after they are sent to our subscribers.]

Silverton Consulting, Inc., is a U.S.-based Storage, Strategy & Systems consulting firm offering products and services to the data storage community.
