I have been analyzing SPECsfs results now for almost 7 years now and I feel that maybe it’s time for me to discuss some of the t problems with SPECsfs2008 today that should be fixed in the next SPECsfs20xx whenever that comes out.
First and foremost, for CIFS SMB 1 is no longer pertinent to today’s data center. The world of Microsoft has moved on to SMB 2 mostly and are currently migrating to SMB 3. There were plenty of performance fixes in the last years SMB 3.0 release which would be useful to test with current storage systems. But I would be even be somewhat happy with SMB2 if that’s all I can hope for.
My friends at Microsoft would consider me remiss if I didn’t mention that since SMB 2 they no longer call it CIFS and have moved to SMB. SPECsfs should follow this trend. I have tried to use CIFS/SMB in my blog posts/dispatches as a step in this direction mainly because SPEC continues to use CIFS and Microsoft wants me to use SMB.
In my continuing quest to better compare different protocol performance I believe it would be useful to insure that the same file size distributions are used for both CIFS and NFS benchmarks. Although the current Users Guide discusses some file size information for NFS it is silent when it comes to CIFS. I have been assuming that they were the same because of lack of information but this would be worthy to have confirmed in documentation.
Finally for CIFS, it would be very useful if there could be a closer approximation of the same amount of data transfers that are done for NFS. This is a nit but when I compare CIFS to NFS storage system results there is a slight advantage to NFS because NFS’s workload definition doesn’t do as much reading as CIFS. In contrast, CIFS has slightly less file data write activity than the NFS benchmark workload. Having them be exactly the same would help in any (unsanctioned) comparisons.
As for NFSv3, although NFSv4 has been out for more than 3 years now, it has taken a long time to be widely adopted. However, these days there seems to be more client and storage support coming online every day and maybe this would be a good time to move on to NFSv4.
The current NFS workloads, while great for the normal file server activities, have not kept pace with much of how NFS is used today especially in virtualized environments. As far as I can tell under VMware NFS data stores don’t do a lot of meta-data operations and do an awful lot more data transfers than normal file servers do. Similar concerns apply to NFS used for Oracle or other databases. Unclear how one could incorporate a more data intensive workload mix into the standard SPECsfs NFS benchmark but it’s worthy of some thought. Perhaps we could create a SPECvms20xx benchmark that would test these types of more data intensive workloads.
For both NFSv3 and CIFs benchmarks
Both the NFSv3 and CIFS benchmarks typically report [throughput] ops/sec. These are a mix of all the meta-data activities and the data transfer activities. However, I think many storage customers and users would like a finer view of system performance. .
I have often been asked just how many files a storage system actually support. This depends of course on the workload and file size distributions but SPECsfs already defines this. As a storage performance expert, I would also like to know how much data transfer can a storage system support in MB/sec read and written. I believe both of these metrics can be extracted from the current benchmark programs with a little additional effort. Probably another half dozen metrics that would be useful maybe we could sit down and have an open discussion of what these might be.
Also the world has changed significantly over the last 6 years and SSD and flash has become much more prevalent. Some of your standard configuration tables could be better laid out to insure that readers understand just how much DRAM, flash, SSDs and disk drives are in a configuration.
Beyond file NAS
Going beyond SPECsfs there is a whole new class of storage, namely object storage where there are no benchmarks available. I would think now that Amazon S3 and Openstack Cinder are well defined and available that maybe a new set of SPECobj20xx benchmarks would be warranted. I believe with the adoption of software defined data centers, object storage may become the storage of choice over the next decade or so. If that’s the case then having some a benchmark to measure object storage performance would help in its adoption. Much like the original SPECsfs did for NFS.
Then there’s the whole realm of server SAN or (hyper-)converged storage which uses DAS inside a cluster of compute servers to support block and file services. Not sure exactly where this belongs but NFS is typically the first protocol of choice for these systems and having some sort of benchmark configuration that supports converged storage would help adoption of this new type of storage as well.
I think thats about it for now but there’s probably a whole bunch more that I am missing out here.