Addressing S3 put request inconsistencies at Transloadit
Amazon S3 is great. Except when it isn't. When you are using it a lot - we have stored around one million customer objects so far - you begin to see some odd things that look more like "eventual madness" than "eventual consistency".
One of the things we learned right away is that S3's error codes can't be trusted. Sometimes you will see a 403 Permission Denied, but when you execute the exact same request again, it miraculously works. For that reason, you generally have to retry whenever you receive an error message. Only when the same error occurs several times in a row should you start paying attention to it. We are currently retrying up to 6 times with increasing timeouts.
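To illustrate the idea, here is a minimal Python sketch of retrying with increasing timeouts. It is not our production code: the function name `retry_with_backoff`, the doubling delay schedule, and the 0.5 second starting delay are assumptions made for the example, and "up to 6 times" is read here as six retries after the first attempt.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 6,          # retries after the first attempt (assumed reading)
    base_delay: float = 0.5,       # hypothetical starting delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation() until it succeeds or the retries are used up.

    The delay doubles after every failure (0.5s, 1s, 2s, ...). The doubling
    schedule is an assumption; any increasing schedule fits the description.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= max_retries:
                # The error kept coming back: time to take it seriously.
                raise
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

In practice `operation` would wrap a single S3 request, and you would probably only catch the exception types your S3 client raises for request failures rather than every `Exception`.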
The latest thing we ran into is that S3's confirmations for PUT requests are not to be trusted either. The symptom we saw was that some of the S3 URLs we returned to our customers were simply not working. Reproducing this has been a huge challenge. At one point we were able to see the problem after ~25 PUTs, another time nothing happened after 2500 PUTs (10 GB total). So there is still a tiny chance that the bug is somewhere within our own system, but at this point we are pretty convinced that this is an Amazon issue that occurs infrequently.
Enter crazy land. When dealing with an eventually consistent system, where "eventually" may sometimes mean "never", how do you verify the success of a write operation? You can, of course, check the bucket to see whether the object exists after each write, but what if it doesn't? When does this become an error condition? Moreover, when the object does exist, what does that mean? It is safe to assume this means that at least one client is able to see the object, but not that it has finished replicating and is therefore available to all clients.
While there is no perfect answer, we have decided to verify the presence of a stored file with up to eleven checks over a period of two minutes. If the file does not exist by that time, we will retry the job up to six times. While this is far from ideal, it will probably reduce the chance that we hand out invalid S3 URLs to a similar level as winning a decent prize in the lottery.
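The verification scheme above can be sketched roughly as follows. This is a hedged illustration rather than our actual implementation: `wait_until_visible` and `store_with_verification` are hypothetical names, `put` and `exists` are callables you would supply (for example, a PUT request and a HEAD request on the same key), and spacing the eleven checks evenly, 12 seconds apart, is an assumption, since only the count and the two-minute window are fixed.

```python
import time
from typing import Callable


def wait_until_visible(
    exists: Callable[[], bool],
    checks: int = 11,
    period: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll exists() up to `checks` times, spread evenly over `period` seconds."""
    # 11 checks over 120 seconds means 10 gaps of 12 seconds (assumed spacing).
    interval = period / (checks - 1) if checks > 1 else 0.0
    for i in range(checks):
        if exists():
            return True
        if i < checks - 1:
            sleep(interval)
    return False


def store_with_verification(
    put: Callable[[], None],
    exists: Callable[[], bool],
    max_retries: int = 6,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Write the object, verify it shows up, and redo the job if it does not."""
    for _ in range(1 + max_retries):
        put()
        if wait_until_visible(exists, sleep=sleep):
            return True
    # Still not visible after all retries: treat this as a real failure.
    return False
```

Note that a `True` result only tells you that one client saw the object at one point in time, which is exactly the limitation described above.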
Another thing that struck us as interesting is that S3's consistency model is not the same in all regions. While "US Default" employs the traditional "eventual consistency" model, all other regions provide "read-after-write" consistency. While incredibly confusing from a customer standpoint, the reason for this seems to be that "US Default" spans from the east to the west coast, so the speed of light is getting in the way of providing the same guarantees.
We are in the process of collecting better data on our S3 experience, which we will share in the future. S3 is definitely working great for us most of the time, and if it turns out that we were using it wrong, we will follow up on that as well.