A week or so ago I was reading a DDN press release that said they had just published a STAC-M3 benchmark. I had never heard of STAC or their STAC-M3, so I thought I would look them up.
STAC stands for Securities Technology Analysis Center, an organization dedicated to testing system solutions for the financial services industry.
What does STAC-M3 do?
It turns out that STAC-M3 simulates processing a time-series (ticker tape, or tick) log of security transactions, identifying the maximum bid and weighted bid, along with various other statistics, for a subset (1%) of the securities over various time periods (year, quarter, week, and day) in the dataset. STAC calls it high-speed analytics on time-series data, a frequent use case for systems in the securities sector.
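To make the workload concrete, here is a minimal sketch of the kind of query STAC-M3 simulates: finding the maximum bid and the size-weighted average bid for one security over a period. The tick schema and field names here are my own assumptions for illustration, not the actual STAC-M3 layout.

```python
from collections import defaultdict

# Hypothetical tick records: (symbol, date, bid, bid_size).
# Schema and values are illustrative assumptions only.
ticks = [
    ("AAPL", "2014-03-03", 527.0, 100),
    ("AAPL", "2014-03-04", 531.5, 300),
    ("AAPL", "2014-06-10", 529.0, 200),
]

def high_and_weighted_bid(ticks, symbol):
    """Return the maximum bid and the size-weighted average bid for one symbol."""
    bids = [(bid, size) for sym, _, bid, size in ticks if sym == symbol]
    high = max(bid for bid, _ in bids)
    weighted = (sum(bid * size for bid, size in bids)
                / sum(size for _, size in bids))
    return high, weighted

high, weighted = high_and_weighted_bid(ticks, "AAPL")
```

The real benchmark, of course, runs such queries against billions of ticks, which is where the storage subsystem starts to matter.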
There are two versions of the STAC-M3 benchmark: Antuco and Kanaga. The Antuco version uses a statically sized dataset, while the Kanaga version runs more scalable queries (varying the number of client threads) over larger datasets. For example, the Antuco version uses 1 or 10 client threads for its test measurements, whereas the Kanaga version scales client threads, in some cases from 1 to 100, and uses more tick data in 3 different-sized datasets.
Good, bad and ugly of STAC-M3
Access to STAC-M3 reports requires a login, but it's free. Some details are only available on request, which can be cumbersome.
One nice thing about the STAC-M3 benchmark information is that it provides a decent summary of the amount of computational time involved in all the queries it performs. From a storage perspective, one could use this to analyze just the queries with minimal or light computation, which come closer to a pure storage workload than the computationally heavy queries.
Another nice thing about STAC-M3 is that in some cases it provides detailed statistical information about the distribution of its metrics, including mean, median, min, max and standard deviation. Unfortunately, the current version of STAC-M3 does not provide these statistics for the computationally light measures that are of primary interest to me as a storage analyst. It would be very nice to see this level of statistical reporting adopted by SPC, SPECsfs or Microsoft ESRP for their benchmark metrics.
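Why does this level of reporting matter? A quick sketch with made-up response times (not STAC data) shows how a single slow run pulls the mean well above the median, something a mean-only metric would hide:

```python
import statistics

# Hypothetical per-run query elapsed times in msec (illustrative values only).
response_times_ms = [41.2, 39.8, 44.1, 40.5, 97.3, 40.9]

summary = {
    "mean":   statistics.mean(response_times_ms),
    "median": statistics.median(response_times_ms),
    "min":    min(response_times_ms),
    "max":    max(response_times_ms),
    "stdev":  statistics.stdev(response_times_ms),
}
# The one 97.3 msec outlier drags the mean (~50.6) far above the median (~41.0),
# which is exactly the kind of behavior only the full set of statistics reveals.
```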
STAC-M3 also provides a measure of storage efficiency, or how much storage it took to store the database. This is computed as the reference size of the dataset divided by the amount of storage it took to store the dataset. Although this could be interesting, most of the benchmark reports I examined had similar storage efficiency numbers, either 171% or 172%.
The STAC-M3 benchmark is a full stack test. That is, it measures the time from the point a query is issued to the point the query response is completed. Storage is just one part of this activity; computing the various statistics is another part, and the database used to hold the stock tick data is yet another aspect of the test environment. But what is being measured is the query elapsed time. SPECsfs2014 has also recently changed over to being a full stack test, so this approach is not that unusual anymore.
The other problem from a storage perspective (but not a STAC perspective) is that there is minimal write activity during any of the benchmark-specific testing. There's just one query that generates a lot of storage write activity; all the rest are heavy read-IO only.
Finally, there’s not a lot of description of the actual storage and server configuration available in the basic report. But this might be further detailed in the Configuration Disclosure report which you have to request permission to see.
STAC-M3 storage submissions
As it's a full stack benchmark, we don't find a lot of storage array submissions. Typical submissions include a database running on some servers with SSD-DAS or, occasionally, a backend storage array. In the case of DDN, it was Kx Systems' kdb 3.2 database, with Intel Enterprise Edition servers for Lustre, with 8 Intel-based DDN EXAScaler servers, talking to a DDN SFA12KX-40 storage array. In contrast, a previous submission used an eXtremeDB Financial Edition 6.0 database running on Lucera Compute™ (16 Core SSD V2, SmartOS) nodes.
Looking back over the last couple of years of submissions (STAC-M3 goes back to 2010), for storage arrays, aside from the latest DDN SFA12KX-40, we find an IBM FlashSystem 840, a NetApp EF550, an IBM FlashSystem 810, a couple of other DDN SFA12K-40 storage solutions, and Violin V3210 & V3220 submissions. Most of the storage array submissions were all-flash arrays, but the DDN SFA12KX-40 is a hybrid (flash-disk) appliance.
Some metrics from recent STAC-M3 storage array runs
In the above chart, we show the maximum MBPS achieved in the year-long high bid (YRHIBID) extraction test case. DDN won this one handily, with over 10GB/sec for its extraction result.
However, in the next chart, we show the mean response (query elapsed) time (in msec.) for the year-long high bid (YRHIBID) data extraction test. In this case the IBM FlashSystem 810 did much better than the DDN or even the more recent IBM FlashSystem 840.
Unexpectedly, the top MBPS storage didn't achieve the best mean response time for the YRHIBID query. I would have thought the mean response time and the maximum MBPS would show the same rankings. It's not clear to me why they don't, given that this is the mean response time, not the minimum or maximum; the minimum response time presumably would show the same rankings as maximum MBPS. One issue with the YRHIBID results is that they don't report standard deviation, median or minimum response times. Having these other metrics might have shed more light on this anomaly, but for now this is all we have.
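One way such a divergence could arise (a made-up illustration, not STAC data): a system whose best run has the highest throughput can still post a worse mean elapsed time if its response-time distribution has a heavy tail. Peak MBPS tracks the fastest run; the mean is dragged by every run.

```python
import statistics

# Hypothetical per-run YRHIBID elapsed times (msec) for two systems.
# System A bursts fast (best single run) but has one long-tail outlier;
# System B is steadier but never as fast. Values are invented for illustration.
system_a = [200, 210, 205, 1500]   # one slow outlier run
system_b = [400, 410, 395, 405]

# Maximum MBPS corresponds to the *fastest* (minimum elapsed time) run;
# the mean response time reflects all runs, outliers included.
fastest_a, fastest_b = min(system_a), min(system_b)
mean_a, mean_b = statistics.mean(system_a), statistics.mean(system_b)
# A wins on best-run speed, yet B wins on mean elapsed time.
```

If this is what happened, the missing standard deviation and minimum figures would have made it visible immediately.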
If anyone knows of other storage (or stack) level benchmarks for other verticals, please let me know and I would be glad to dig into them to see if they provide some interesting views on storage performance.
Photo Credit(s): Stock market quotes in newspaper by Andreas Poike
Max MBPS and Mean RT Charts (c) 2015 Silverton Consulting, Inc., All Rights Reserved