I attended Cloud Field Day 23 (#CFD23) a couple of weeks back and MinIO presented on their AIStor for one of the CFD sessions (see videos here). During the session the speaker mentioned that object storage gateways were not strongly consistent.

Object storage has been around a long time, but AWS S3 took it from a niche solution to a major storage type. Originally, S3 was “eventually consistent”; that is, if you wrote an object and the system returned to you, that object wouldn’t necessarily be all there (that is, stored in the object storage system) until sometime “later”.
It took AWS from March 2006 until December 2020 to offer “strong consistency” for objects, meaning that when the system returned to a user after a new object was created, that object was guaranteed to have been written to the object store before the return happened.
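To make the distinction concrete, here’s a toy Python sketch of the two consistency models. This is purely illustrative (it’s not any vendor’s implementation): in the eventual model a PUT is acknowledged before the data is durable, so an immediate GET can miss the new object; in the strong model the data is durable before the PUT returns.

```python
# Toy model of read-after-write consistency (illustrative only,
# not any real object store's implementation).

class ToyObjectStore:
    def __init__(self, strong: bool):
        self.strong = strong
        self.durable = {}   # objects fully written to backing storage
        self.pending = {}   # objects acknowledged but not yet durable

    def put(self, key: str, data: bytes) -> None:
        if self.strong:
            # Strong consistency: data is durable before PUT returns.
            self.durable[key] = data
        else:
            # Eventual consistency: PUT returns before data is durable.
            self.pending[key] = data

    def flush(self) -> None:
        # "Sometime later": pending writes become durable.
        self.durable.update(self.pending)
        self.pending.clear()

    def get(self, key: str):
        # Readers only see durable data.
        return self.durable.get(key)

strong = ToyObjectStore(strong=True)
strong.put("obj", b"hello")
assert strong.get("obj") == b"hello"    # read-after-write always succeeds

eventual = ToyObjectStore(strong=False)
eventual.put("obj", b"hello")
assert eventual.get("obj") is None      # an immediate read can miss the object
eventual.flush()
assert eventual.get("obj") == b"hello"  # visible only "later"
```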
What the MinIO presenter was saying is that object storage gateways don’t offer strong consistency for objects being written to them. Most object storage gateways are built on top of enterprise-class file systems. For these systems not to be strongly consistent seemed pretty unusual to me, so I asked them to clarify why they said that.
The speaker turned to Anand Babu Periasamy (AB), CEO & Co-Founder of MinIO, to answer my question. Essentially, AB answered that it was all about the size of the volumes (or LUNs) that underpin the objects and buckets in the file-based object storage gateway system.

For this post we will set aside the few object storage systems based solely on top of block storage (are there any of these?) and will just consider the object storage gateways built on top of file systems.
In S3, there are single (object) PUT requests and multi-part (object) uploads to create objects. According to MinIO, as objects and their buckets get bigger, object gateways don’t maintain strong consistency. Bucket sizes for S3 object storage can be on the order of many PBs.
As for the objects themselves, AWS S3 supports a maximum object size of 5TiB using multi-part uploads, and a single PUT maximum of 5GiB. According to AWS, during an S3 multi-part upload the object is not readable until after the last multi-part upload request is replied to; only then is the full (up to 5TiB) object available, thereby offering strong consistency over multi-part uploads. For single PUT object creations, the object is strongly consistent after the PUT returns.
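The arithmetic behind those AWS limits is worth checking: a maximum-size 5TiB object uploaded in maximum-size 5GiB parts needs 1,024 parts, comfortably under S3’s documented 10,000-part limit per multi-part upload.

```python
# Back-of-the-envelope part counts for an S3 multi-part upload.
GiB = 2**30
TiB = 2**40

max_object = 5 * TiB   # AWS S3 maximum object size
part_size = 5 * GiB    # AWS S3 maximum part (and single PUT) size

parts_needed = max_object // part_size
print(parts_needed)            # 1024 parts for a 5TiB object in 5GiB parts
assert parts_needed <= 10_000  # well under S3's 10,000-part limit
```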
Here are single PUT and multi-part upload object size constraints for select object storage gateways:
- NetApp ONTAP offers object storage protocols for their Data ONTAP storage systems. The maximum NetApp S3 PUT size is 5TiB, but anything over 5GiB will be flagged to encourage multi-part uploads.
- Hitachi VSP One Object storage maximum S3 PUT size is 5GiB, and an object is limited to a maximum of 5TiB.
- Dell PowerScale ObjectScale maximum S3 PUT size is 5GiB, and the maximum S3 object size is 4.4TiB.
- VAST Data supports S3 objects; its maximum S3 PUT size is 5GiB, and the maximum object size is 5TiB.
- WekaFS maximum S3 PUT size is 5GiB and maximum S3 object size is 5TiB.
- Scality (also presented at CFD23, videos here) RING and Artesca maximum S3 PUT size is 5GiB, and their documentation implies support for 10K parts in a multi-part upload, so a maximum object size of ~50TiB.
- Qumulo (also presented at CFD23, videos here) maximum S3 PUT size is 5GiB, and for S3 compatibility their maximum object size is 5TiB. But they also state that they can actually support an object of ~50TiB (10K multi-part parts × 5GiB).
- MinIO S3 maximum object size is 50TiB and their maximum S3 PUT size is 5TiB.
I’m probably missing a couple of other storage systems that support S3 objects, but you get the drift. It’s important to note that just about every object storage gateway offers the full S3-compatible single PUT size of 5GiB (NetApp & MinIO offer 5TiB though!). And they all seem to offer 5TiB object sizes (with a few exceptions: Dell PowerScale ObjectScale’s 4.4TiB, and Scality, Qumulo & MinIO’s ~50TiB).
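The “~50TiB” figure quoted by Scality, Qumulo & MinIO above falls straight out of the part-count math: 10,000 parts at the 5GiB part-size limit comes to just under 49TiB, which vendors round to 50TiB.

```python
# Where the "~50TiB" maximum object size comes from.
GiB = 2**30
TiB = 2**40

# 10,000 parts at the 5GiB part-size limit:
max_multipart_object = 10_000 * 5 * GiB
print(max_multipart_object / TiB)   # 48.828125 TiB, i.e. the "~50TiB" figure
```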
The question that needs to be asked is: if a customer uses multi-part uploads of 5GiB parts for a 5TiB object, are they guaranteed strong consistency in an object storage gateway?
According to everything I see, the answer seems to be yes, although outside of AWS S3 and MinIO I don’t see anything that specifically states they are strongly consistent.
However, most of these object storage gateways map objects to files, and single file creates are strongly consistent for these systems. So for a single (5GiB) PUT the answer would obviously be YES. And most of these systems also support much larger files than 5GiB, so I would suspect that multi-part uploads would also be strongly consistent, as they would be mapped to multiple writes to a single file in these systems. So my answer to the question of whether multi-part uploads are strongly consistent for object storage gateways is a guarded YES.
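One plausible way a file-backed gateway could make that multi-write/single-file mapping strongly consistent (this is my assumption about implementation strategy, not any vendor’s actual code) is to assemble the parts in a hidden temporary file, then atomically rename it into place when the multi-part upload completes, so readers never see a partially written object:

```python
# Sketch: assemble multi-part upload parts in a temp file, then atomically
# rename into place on completion (assumed strategy, not vendor code).
import os
import tempfile

def complete_multipart(bucket_dir: str, key: str, parts: list[bytes]) -> str:
    os.makedirs(bucket_dir, exist_ok=True)
    # Temp file is invisible to GETs until renamed into place.
    fd, tmp_path = tempfile.mkstemp(dir=bucket_dir)
    with os.fdopen(fd, "wb") as f:
        for part in parts:   # each uploaded part is appended in order
            f.write(part)
    final_path = os.path.join(bucket_dir, key)
    # os.replace is atomic on POSIX: the object appears all at once.
    os.replace(tmp_path, final_path)
    return final_path

path = complete_multipart("/tmp/demo-bucket", "obj", [b"part1-", b"part2"])
with open(path, "rb") as f:
    assert f.read() == b"part1-part2"
```

The key property is that the rename happens only after the last part is written, which matches the AWS behavior described above: the object is not readable until the multi-part upload completes.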

Perhaps what MinIO was meaning to say was that if you do a 5TiB part for a 50TiB multi-part upload, will it be strongly consistent? I would have to answer NO for any of these systems other than Qumulo, Scality & MinIO, because the others don’t support anything that large. But this is way (10X for object sizes & 1000X for multi-part upload part sizes) beyond AWS S3’s current capabilities, so I don’t see how this is relevant for those customers wishing to retain AWS S3 compatibility.
And I guess I don’t see the relevance of bucket size to strong consistency for an object. I suppose there may be problems as you populate a bucket with big objects and run out of room during a (multi-part) upload/PUT. But it’s clear to me that when that occurred, the object storage gateway would indicate that the object PUT failed, and then consistency doesn’t matter.
So I have to disagree with my friends at MinIO regarding the lack of strong consistency, with the proviso that all these systems support strong consistency for AWS S3 object sizes.
Comments?