This Storage Intelligence (StorInt™) dispatch covers the IO500 benchmark. The IO500 is a benchmark focused on file systems for HPC (high performance computing) workloads. Many HPC file systems lack support for NFS or SMB and instead use POSIX to access file data. The IO500 maintains two lists of submissions: one that allows systems with any number of client nodes and another that limits systems to only 10 client nodes.
The Virtual Institute for IO (VI4IO), the group that organizes the IO500 benchmarks, ranks submissions using a composite score that is a function of 4 bandwidth-intensive (IOR) workloads (easy read, easy write, hard read, and hard write) and 8 metadata-intensive (mdtest) workloads (easy write, stat, and delete; hard write, stat, delete, and read; and easy find). The IOR benchmark simulates big block, bandwidth-intensive traditional HPC file IO activity, and mdtest simulates small block, metadata-intensive IO activity. Both are used to score systems.
The IO500’s composite score is a geomean (geometric mean) of a system’s IOR bandwidth composite score and its mdtest metadata (MD) composite score. The IOR bandwidth composite score is a geomean of all 4 bandwidth-intensive benchmarks and the MD composite score is a geomean of all 8 metadata-intensive benchmarks.
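The scoring described above can be sketched in a few lines of Python. This is only an illustration of the geomean arithmetic, not the IO500's own scoring code. The function name `geomean` and all of the per-workload numbers are made up for the example and do not come from any submission.

```python
from math import prod


def geomean(values):
    """Return the geometric mean of a non-empty list of positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geomean needs a non-empty list of positive values")
    return prod(values) ** (1.0 / len(values))


# Hypothetical per-workload results, NOT taken from any IO500 submission.
ior_results = [100.0, 80.0, 20.0, 10.0]  # 4 IOR bandwidth workloads
mdtest_results = [200.0, 400.0, 150.0, 50.0, 100.0, 60.0, 40.0, 300.0]  # 8 mdtest workloads

bw_score = geomean(ior_results)      # bandwidth composite
md_score = geomean(mdtest_results)   # metadata composite
composite = geomean([bw_score, md_score])  # overall composite

print(f"BW composite: {bw_score:.1f}")
print(f"MD composite: {md_score:.1f}")
print(f"Overall composite: {composite:.1f}")
```

Because a geomean multiplies its inputs, one very weak workload drags a score down more than it would in an arithmetic average, so a system has to do reasonably well on every workload to rank highly.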
IO500 performance results
We start our discussion with the overall composite scores as reported by IO500 for submissions with any number of client nodes, in Figure 1.
Figure 1 Top 10 IO500 any client node count composite score results
In Figure 1, Pengcheng CloudBrain-II on Atlas 900 using MadFS, the new number 1, clobbered all the competition by almost 4X with a composite score of 7044. It ran 255 client nodes and 254 storage nodes with 6 2TB NVMe SSDs each (1524 SSDs total), using 100GbE networking.
Intel’s DAOS (Distributed Asynchronous Object Storage) file system, running at a number of labs/locations, placed at #2, 3, 5 and 7, with 52, 60, 16 and 10 client nodes respectively, using various networking technologies. DAOS’s 10 client node submission was new. The only other recent submission was the #10 Oakforest-PACS JCAHPC DDN IME submission, with a composite score of 254 using 2048 client nodes.
There’s not much information about Pengcheng’s CloudBrain-II and its MadFS file system. MadFS was developed by Tsinghua University in Beijing, and their website describes it as “a supercomputing burst buffer file system”. Wikipedia defines a burst buffer as an intermediate layer used in HPC between front-end processes and back-end storage systems. It almost sounds like MadFS is using the NVMe SSDs or DRAM as a caching layer for some other back-end file storage.
Pengcheng’s submission didn’t appear to include any information on metadata servers, so our assumption is that it didn’t have any separate metadata systems.
Next, in Figure 2, we drill down into one of the two components of the IO500 composite score, the composite score for all 4 IOR bandwidth (BW) intensive workloads.
Figure 2 IO500 any node count bandwidth only composite score results
In Figure 2, we show the top 10 bandwidth composite score results, one of the two components of the IO500 score. For IOR bandwidth workloads, the #1 system is once again the Pengcheng CloudBrain-II MadFS system, which achieved 1476 GiB/sec. The new Oakforest DDN IME system came in at #2 with 697 GiB/sec. Although the Pengcheng system was once again dominant in the IOR bandwidth workloads, it achieved only a little over 2X the bandwidth of its nearest competitor.
The only other new submission that appears on the chart in Figure 2 is the #9 ranked hyperwall4 NASA Intel, SK Hynix SSD system, which was running the BeeGFS file system across 128 client nodes and 132 storage nodes with 2 NVMe SSDs each (264 SSDs total), using InfiniBand networking. They also supplied no information on metadata nodes, so we conclude they didn’t have any separate metadata nodes.
Although we do not show the chart, one might guess that the Pengcheng CloudBrain-II MadFS system also did extremely well on the mdtest workloads, and that would be correct. It achieved another #1 ranking, with a composite score of over 33.6K on the 8 mdtest benchmarks. The nearest competitor (the previously tested #2 Wolf Intel DAOS) only managed a score of 8.7K on mdtest, or ~4X lower than the Pengcheng system.
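As a quick sanity check on how the two component scores combine, the sketch below takes the geomean of the Pengcheng bandwidth and mdtest figures quoted in this dispatch and compares it with the reported composite score. The variable names are ours, and because the mdtest score is only known as "over 33.6K", the estimate should come out slightly under the reported number.

```python
from math import sqrt

# Figures quoted in this dispatch for the Pengcheng CloudBrain-II MadFS submission.
bw_score = 1476.0            # IOR bandwidth composite, GiB/sec
md_score = 33_600.0          # mdtest composite, quoted as "over 33.6K" (a lower bound)
reported_composite = 7044.0  # overall composite score from Figure 1

# The geomean of two values is the square root of their product.
estimate = sqrt(bw_score * md_score)
gap_pct = 100.0 * (reported_composite - estimate) / reported_composite

print(f"estimated composite: {estimate:.0f}")
print(f"reported composite:  {reported_composite:.0f}")
print(f"gap: {gap_pct:.2f}%")
```

The estimate lands within a fraction of a percent of the reported score, which is what we would expect given the rounding in the mdtest figure.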
IO500 10 client node results
Figure 3 IO500 10 node count composite score results
In Figure 3, we show the 10 client node IO500 composite score rankings. Again the Pengcheng CloudBrain-II MadFS system came in first, with a composite score of 1130. Aside from the 10 client nodes, the Pengcheng submission was running 50 storage nodes with 6 2TB NVMe SSDs each (300 SSDs total) and again used 100GbE networking. The #2 Wolf Intel DAOS 10 client node system achieved a score of 759. Note that at 10 client nodes the Pengcheng system was not as dominant, achieving only a little under 1.5X the score of its best competitor.
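To put the Pengcheng leads side by side, the short sketch below computes the ratio of the #1 score to the #2 score for each comparison where this dispatch quotes both numbers. The dictionary layout and labels are ours, and the mdtest leader's score is entered as 33.6K even though the text only says "over 33.6K".

```python
# (leader score, runner-up score) pairs quoted in this dispatch.
leads = {
    "IOR bandwidth, any node count": (1476.0, 697.0),
    "mdtest, any node count": (33_600.0, 8_700.0),
    "composite, 10 client nodes": (1130.0, 759.0),
}

# Ratio of the #1 score to the #2 score for each comparison.
ratios = {name: first / second for name, (first, second) in leads.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}X")
```

The ratios show the lead is largest on metadata workloads and shrinks considerably once every system is held to 10 client nodes.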
There were 4 other new submissions here: the previously mentioned #2 ranked Intel DAOS 10 client node system with a score of 759; two new GekkoFS systems ranked at #7 and #8 (NextGENIO EPCC BSC & JGU, and MOGONII Johannes Gutenberg University Mainz JGU & BSC [NextGENIO]) with scores of 239 and 168 respectively; and, ranked at #9, another new DIME DDN IME system with a score of 162.
Both NextGENIO GekkoFS systems had separate metadata servers, and GekkoFS is also described as a burst buffer system. The #7 ranked system used NVMe SSDs in its metadata and data storage servers, while the #8 ranked system used SATA SSDs for both metadata and data storage. Both used Omni-Path networking.
The IO500’s unique benchmarks for HPC differ substantially from the other file system benchmarks we report on. The IO500 updates its lists twice a year (during the ISC and SC conferences).
This is only our second attempt at analyzing IO500 results, so we thought it best to report on standard IO500 metrics. We have identified a few different ways to cut the data to create SCI-computed metrics but decided to save those for a later report.
[This storage performance dispatch was originally sent out to our newsletter subscribers in November of 2020. If you would like to receive this information via email, please consider signing up for our free monthly newsletter (see subscription request, above right) and we will send our current issue along with download instructions for this and other reports. Dispatches are posted to our website at least a quarter after they are sent to our subscribers.]
Silverton Consulting, Inc., is a U.S.-based Storage, Strategy & Systems consulting firm offering products and services to the data storage community.