Running tus in production
In 2013, we set out to make the world a slightly better place by fixing file uploading for everyone. We did so by developing an open protocol for reliable file uploading, called tus. The road was long and there were many setbacks and milestones along the way, but today, we celebrate that Transloadit's API now fully supports tus. We are already making good progress in updating all of our SDKs to add support for tus as well. We are very excited to be able to offer its robust content ingestion capabilities to all of our customers.
Not everybody around the world benefits from fast and steady internet. Even in first world countries, people often suffer from bad connections, for example during train rides, on trips down to the basement, or when living in more rural areas. Flaky Wi-Fi is also a common source of trouble. And when your app's user is uploading a 100MB video and their connection goes sour after 90MB, they will probably be too frustrated to retry the upload from scratch.
Poor connection quality reflects badly on your app, and you might find yourself at the receiving end of frustration that should really be directed at network operators or sources of Wi-Fi interference instead.
This problem is getting worse every day: better cameras keep being released that dramatically increase file sizes, while the reliability of network connections remains unchanged.
The solution to a common problem
With tus, this is a problem of the past. While we can't fix poor network connections, we are able to make it so that uploads auto-resume exactly where they broke. If all goes well, the end-user will probably not even notice the interruption. The website or app immediately uploads the remaining bytes and the user stays in their flow.
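The resume step can be sketched as follows. In tus 1.0, the client asks the server how many bytes it already holds with a `HEAD` request (the answer comes back in the `Upload-Offset` header), then sends only the remainder in a `PATCH` request. The helper name `build_resume_request` is an illustrative assumption and not part of any SDK, and no network call is made here.

```python
from typing import Dict, Tuple

TUS_VERSION = "1.0.0"  # protocol version sent in the Tus-Resumable header


def build_resume_request(data: bytes, server_offset: int) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for the PATCH request that resumes an upload.

    `server_offset` is the Upload-Offset value returned by a HEAD request
    to the upload URL, i.e. the number of bytes the server already has.
    """
    if not 0 <= server_offset <= len(data):
        raise ValueError("server offset lies outside the file")
    headers = {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Offset": str(server_offset),
        "Content-Type": "application/offset+octet-stream",
    }
    # Only the bytes the server has not yet received are sent.
    return headers, data[server_offset:]


# Example: a 100-byte file whose upload broke after 90 bytes.
headers, body = build_resume_request(b"x" * 100, server_offset=90)
print(headers["Upload-Offset"], len(body))  # prints "90 10"
```

A real client would wrap this in a retry loop: on any connection error, repeat the `HEAD` request and build a fresh `PATCH` from the offset the server reports.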
Resumable file uploading has been possible for some time, but there never was an open standard for it, leading to the birth of a thousand incompatible and barely working one-week projects. When we set out to do something about the reliability of file uploading, we didn't want to become one of those projects. While we knew that designing a new protocol wasn't going to be easy, we felt it was the only right way forward – and so we got started.
Struggles along the way
We poured our hearts into this protocol and worked hard. We hit walls, we got demoralized, and we even completely stopped working on it at times. But we always regrouped and tried new approaches. Marius in particular deserves a lot of credit for getting us out of a tough spot. As the project leader, he oversaw feedback from engineers at Vimeo, Google, Yahoo, ZeroMQ, and Node.js, and for the most part decided what should go in and what should be left out.
With all these struggles, it was a huge relief when we were finally ready for the release of tus 1.0. We immediately saw an uptick in community involvement and people implementing tus in their favorite programming language. With 1.0, they could now trust that their component would work reliably with other tus components for years to come. The best community implementations have since become officially supported implementations, hosted under our shared roof over at https://github.com/tus/. These are libraries that any developer can now just plug into their app to increase transmission speed and reliability.
Seeing the tus community grow over time has been a true pleasure, and having Vimeo adopt tus, for instance, was a big win for the project. Their support has been nothing less than formidable, both in the form of ideas and code contributions.
Today, then, we celebrate that we've also equipped our own API with tus' resumable file uploading capabilities. In fact, we opted for a soft launch and have gradually moved more people onto tus where we could. A few months and half a petabyte (or 500 million pictures) later, end-users around the world are enjoying reliable tus-powered file uploading without even noticing. Things 'just work': even the biggest cat video uploads survive trips down to the basement. And that's how it should be: if you notice, we failed.
If you are already a Transloadit customer, chances are your software is not talking to our API directly, but instead you use one of our SDKs. With an SDK, you can enjoy our best practices right out of the box, and save yourself from writing boilerplate in your programming language. The good news here is that nearly half of our SDKs are already updated to support tus. Here's a list that gets updated automatically (so feel free to check back anytime):
|iOS & macOS||TransloaditKit|
If you use one of the SDKs that already supports tus, you're in luck! All you need to do is upgrade to the latest major version and resumable file uploads will be enabled for you by default. As far as our other SDKs are concerned, we're working to gradually roll out support for tus across the board, so stay tuned. If you're reading this as a developer who would like to get involved with tus or improving our SDKs, we welcome you to reach out because we'd love to talk about your ideas for improvement.
When talking about the client/browser side, a special mention should be made of the sound reliability work that our Uppy team is doing. Besides fully supporting tus, they have also created a plugin that lets you resume file uploads after browser crashes. We're proud to say that both are industry firsts!
The nuts & bolts
Most of the work went into the protocol itself and hammering out its implementations together with the community. Once those were in a good place, it was pretty straightforward to hook up tus to our existing encoding & uploading service.
We are running a stock tusd server on all of our upload-handling machines, alongside our existing receiving software. HAProxy directs any request pointed at /resumable/* to the tusd instance running on that machine.
Any tus-enabled SDK will create an Assembly as usual, but instead of uploading immediately, the Assembly returns a tus /resumable/ endpoint that welcomes uploads. When uploading to this tus endpoint, the SDK attaches the unique location of the Assembly as meta-data. Each time an upload completes, a tusd hook fires and interprets this meta-data to find the Assembly and inform it of a finished upload. The encoding of this file can then already start, even though other uploads are still rolling in. The SDK also discloses how many uploads to expect, so the Assembly knows when it's done and should ping you with the encoding results.
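The flow above can be sketched in two halves. On the client side, tus 1.0 carries meta-data in the `Upload-Metadata` header as comma-separated `key base64(value)` pairs. On the server side, a hook handler decodes it and tells the Assembly about the finished file. The key `assembly_url`, the `Assembly` class, the `on_upload_finished` function, and the example URL are hypothetical names chosen for illustration; the keys and hook payload that Transloadit actually uses are not spelled out in this post.

```python
import base64
from typing import Dict, List


def encode_metadata(meta: Dict[str, str]) -> str:
    """Build an Upload-Metadata header value: 'key b64value,key b64value'."""
    return ",".join(
        f"{key} {base64.b64encode(value.encode('utf-8')).decode('ascii')}"
        for key, value in meta.items()
    )


def decode_metadata(header: str) -> Dict[str, str]:
    """Inverse of encode_metadata; a key without a value maps to ''."""
    meta: Dict[str, str] = {}
    for pair in filter(None, (p.strip() for p in header.split(","))):
        key, _, b64 = pair.partition(" ")
        meta[key] = base64.b64decode(b64).decode("utf-8")
    return meta


class Assembly:
    """Hypothetical stand-in for an Assembly that waits for N uploads."""

    def __init__(self, expected_uploads: int) -> None:
        self.expected_uploads = expected_uploads
        self.encoding_started: List[str] = []

    def upload_finished(self, upload_id: str) -> bool:
        # Encoding of this file can start while other uploads continue.
        self.encoding_started.append(upload_id)
        # True once every expected upload has arrived.
        return len(self.encoding_started) >= self.expected_uploads


def on_upload_finished(upload_id: str, metadata_header: str,
                       assemblies: Dict[str, Assembly]) -> bool:
    """Hook body: find the Assembly named in the meta-data and notify it."""
    meta = decode_metadata(metadata_header)
    return assemblies[meta["assembly_url"]].upload_finished(upload_id)


url = "https://example.invalid/assemblies/abc"  # placeholder location
assemblies = {url: Assembly(expected_uploads=2)}
header = encode_metadata({"assembly_url": url})
print(on_upload_finished("upload-1", header, assemblies))  # False: one more to come
print(on_upload_finished("upload-2", header, assemblies))  # True: Assembly is complete
```

Keeping the Assembly location in the upload's own meta-data means the upload server needs no prior knowledge of Assemblies; everything the hook needs travels with the file.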
To keep things running smoothly, we rely on good old Upstart (not much different from how we did it back in 2009) as well as the /metrics endpoint that tusd exposes. This endpoint contains the numbers that our autoscaler uses to launch or terminate EC2 instances as load goes up or down. We also feed these metrics into Librato so we can keep a close watch on the health of our processes and get informed of any problems.
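As a rough sketch of how an autoscaler might consume such an endpoint, the snippet below parses Prometheus-style text (`name value` lines, with `#` for comments) and turns one number into a desired instance count. The metric name `uploads_in_progress`, the per-instance capacity, and both function names are assumptions made for illustration; they are not tusd's actual metric names or our real scaling policy.

```python
import math
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse 'name value' lines, skipping blanks and '#' comment lines.

    Simplification: optional trailing timestamps are not handled.
    """
    metrics: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


def desired_instances(active_uploads: float, per_instance: int, minimum: int = 1) -> int:
    """Instances needed to serve the load, never fewer than `minimum`."""
    return max(minimum, math.ceil(active_uploads / per_instance))


sample = """# HELP uploads_in_progress Hypothetical gauge of running uploads.
# TYPE uploads_in_progress gauge
uploads_in_progress 230
"""
load = parse_metrics(sample)["uploads_in_progress"]
print(desired_instances(load, per_instance=100))  # prints 3
```

Comparing the result with the number of running instances tells the autoscaler whether to launch or terminate machines.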
With plans to move good parts of our infrastructure onto Kubernetes, it's encouraging to know that tusd's /metrics endpoint is already compatible with their go-to monitoring solution: Prometheus. Another thing that will help people who are looking to deploy state-of-the-art content ingestion networks to Kubernetes or similar Docker-friendly platforms (like Nomad, Swarm, Mesos, etc.) is that we're working to get an official _/tusd Docker image published. We're contemplating whether a Kubernetes Operator for running a Content Distribution and Ingestion Network could make for a third big complementary open source project after tus and Uppy. Leave a comment if you'd like to exchange ideas about this.
If you're looking to run tus servers yourself, but all this container talk and running dedicated servers are not for you, know that there are also many other implementations, such as those for Node.js, Ruby, PHP, and Python, that you can use in conjunction with your existing webserver. It's as easy as adding some code to your existing app. All implementations are MIT licensed (i.e. free in all conceivable ways), so it will only cost a few minutes of your time.
When we first spoke about tus in 2013, we expected to either fail hard immediately, or to succeed within a year. As it turns out, trying to do things right isn't necessarily the shortest road or the path of least resistance.
“It's been a long time coming, but I know a change is gonna come”
— Sam Cooke
You can probably imagine how thrilled we are to finally see these long-term open source investments come to life, matured to a point that they're production worthy and ready for use not only to benefit our customers, but also to redefine an ecosystem that we are just a tiny part of.
Thanks for bearing with us until the finish. We hope you'll find it was worth the wait!
If you would like to learn more or want to help us improve file uploading for everyone around the world – head over to tus.io and uppy.io. As always, your thoughts are also very welcome as comments below or on HN/Reddit.