In the recent past we've experienced some slowdowns, elevated queue times, and error rates. Apart from the Route 53, RDS, and LAN outages we experienced on EC2, some of these were caused by customers sending too many requests at once, for example when importing a large library of videos or images. Today we are fixing this by introducing rate limiting.
Generally speaking, one customer's behavior should not affect anyone else's. Without rate limits we cannot guarantee this, as our customers still share one job queue. We have already tackled most of these problems with priority queues (once a customer has a certain number of jobs in the queue, their next jobs are appended to the back, while other customers' jobs are prepended to the front) and by reducing scale-up times to two minutes per octa-core machine.
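To make the priority-queue idea above concrete, here is a minimal sketch of that behavior. The threshold value, function names, and job shape are illustrative assumptions, not our actual implementation:

```python
from collections import Counter, deque

# Once a customer already has FAIRNESS_THRESHOLD jobs waiting, their further
# jobs go to the back of the shared queue, while jobs from lighter users are
# prepended so they are processed first. The threshold of 5 is made up here.
FAIRNESS_THRESHOLD = 5

queue = deque()
queued_per_customer = Counter()

def enqueue(customer_id, job):
    if queued_per_customer[customer_id] >= FAIRNESS_THRESHOLD:
        queue.append((customer_id, job))      # heavy user: back of the line
    else:
        queue.appendleft((customer_id, job))  # light user: jumps ahead
    queued_per_customer[customer_id] += 1

def dequeue():
    customer_id, job = queue.popleft()
    queued_per_customer[customer_id] -= 1
    return customer_id, job
```

With this scheme, a customer dumping thousands of jobs into the queue no longer starves everyone else: a lighter user's single job is picked up ahead of the backlog.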
But no matter how fast we can scale, there are always limits in our physical world. In the past we have seen large batch imports, with people importing thousands of videos or tens of thousands of images at once. This caused our platform to scale up to 100 machines, but even that could not always keep queue times low for everybody.
Besides valid mass imports, buggy integration code can also send too many requests in a very short period of time. This can lead to large bills at the end of the month that the customer did not plan for.
By limiting the number of requests you can send, we address both of these problems.
For now, we only limit the number of jobs you can create.
Should you ever hit the rate limit, you will receive a 413 RATE_LIMIT_REACHED error. The error JSON has a subkey info.retryIn, which tells you the number of seconds you need to wait before you can create your next job.
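The 413 status and the info.retryIn field come straight from the error response described above; everything else in this sketch (the function name, the shape of send_request, the retry count) is an assumption for illustration:

```python
import time

def submit_with_retry(send_request, payload, max_attempts=3):
    """Call a hypothetical send_request(payload) -> (status, body) and honor
    the rate limit: on a 413 RATE_LIMIT_REACHED response, sleep for
    body["info"]["retryIn"] seconds before trying again."""
    for _ in range(max_attempts):
        status, body = send_request(payload)
        if status != 413:
            return status, body
        # The API tells us how many seconds to wait before the next attempt;
        # fall back to 1 second if the field is missing.
        time.sleep(body.get("info", {}).get("retryIn", 1))
    return status, body
```

In other words, a well-behaved client should read info.retryIn and back off for exactly that long, rather than hammering the API in a tight retry loop.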
Remember that request limits are a feature to protect our platform from (mostly unintended) excessive use, ensuring a smoother ride for everybody. We also support different rate limits on a per-customer basis, so if you feel you need a higher rate limit, please create a support ticket and we will happily raise it.