Last updated: November 29, 2024

Major performance improvements for audio and video concatenation

Tim Koschützki

Co-founder · Berlin, Germany

We have recently updated our audio and video concatenation Robots to use FFmpeg's concat demuxer, which significantly improves performance for applicable Assemblies.

For an Assembly to take advantage of the concat demuxer, its input files need to meet the following conditions:

Be encoded in the same format (e.g. all MP4s).
Use the same codecs for their streams (e.g. H.264 for video and AAC for audio).
Have the same codec parameters (such as resolution and frame rate for video, or sample rate for audio).
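
As a rough illustration of the conditions above, the following sketch checks whether a set of files shares one "profile". The metadata keys (container, video_codec, and so on) are hypothetical names chosen for this example; in practice the values might come from a probing tool such as ffprobe. This is not how the Robots perform the check internally, only a minimal model of the rule.

```python
from typing import Iterable, Mapping, Tuple

# Hypothetical metadata keys used for illustration only.
PROFILE_KEYS = (
    "container",
    "video_codec",
    "audio_codec",
    "width",
    "height",
    "frame_rate",
    "sample_rate",
)


def profile(meta: Mapping[str, object]) -> Tuple[object, ...]:
    """Reduce a file's metadata to the fields that must match."""
    return tuple(meta.get(key) for key in PROFILE_KEYS)


def can_stream_copy(files: Iterable[Mapping[str, object]]) -> bool:
    """Return True only if every file shares exactly one profile.

    An empty input yields False, since there is nothing to concatenate.
    """
    return len({profile(meta) for meta in files}) == 1
```

If the function returns False for a batch, at least one file would need to be transcoded to the common profile before a stream-copy concatenation could apply.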


Why it's much faster

Concatenating with the concat demuxer is much faster than using complex filters and re-encoding, for several reasons:

No re-encoding required

When the concat demuxer is used with -c copy, FFmpeg doesn't need to decode and then re-encode the streams. Re-encoding is CPU-intensive and can significantly increase Assembly execution time, especially for high-resolution video. Skipping it lets the Assembly complete much faster.
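
For readers who want to try this locally, here is a minimal sketch of how a stream-copy concatenation can be set up with FFmpeg's concat demuxer (the `-f concat` input format). The helper names are made up for this example, and the post does not show the exact command the Robots run, so treat this as an approximation rather than the actual implementation. The list-file syntax (`file '<path>'`) and the `-safe 0` and `-c copy` options are standard FFmpeg features.

```python
from typing import List, Sequence


def concat_list_text(paths: Sequence[str]) -> str:
    """Build the text of a concat demuxer list file, one `file` line per input.

    Single quotes inside a path are escaped as '\\'' as the list-file
    syntax requires.
    """
    lines = []
    for path in paths:
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_copy_command(list_file: str, output: str) -> List[str]:
    """Build an ffmpeg argument list that concatenates without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",   # read the inputs through the concat demuxer
        "-safe", "0",     # allow absolute or otherwise "unsafe" paths
        "-i", list_file,
        "-c", "copy",     # copy streams as-is instead of re-encoding
        output,
    ]
```

Writing the result of concat_list_text to a file and passing concat_copy_command(...) to subprocess.run would perform the concatenation, assuming ffmpeg is installed and the inputs are compatible.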

Direct data copying

The concat demuxer essentially lets FFmpeg copy the media data directly from the source files to the output file. Since there's minimal processing involved (other than parsing the file headers and adjusting timestamps), the operation is limited primarily by the speed of the storage media.

Preserves quality

Since there's no re-encoding, there's also no quality loss from compression artifacts that might be introduced during the encoding process. This is an added benefit beyond the speed improvement.

In contrast, using complex filters or re-encoding requires FFmpeg to fully decode all streams, process them (which might include scaling, filtering, or other transformations), and then re-encode them into the target format. This not only takes more time but can also degrade quality and increase file size.
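
For comparison, the filter-based route looks roughly like the sketch below, which builds a command around FFmpeg's concat filter. The helper name is hypothetical, and the sketch assumes every input has exactly one video and one audio stream. Because the filter works on decoded frames, the output is always re-encoded, which is where the extra CPU time goes.

```python
from typing import List, Sequence


def concat_filter_command(inputs: Sequence[str], output: str) -> List[str]:
    """Build an ffmpeg argument list that joins inputs via the concat filter.

    Assumes each input carries one video and one audio stream.
    """
    args: List[str] = ["ffmpeg"]
    for path in inputs:
        args += ["-i", path]

    # Feed the first video and audio stream of each input into the filter.
    pads = "".join(f"[{i}:v:0][{i}:a:0]" for i in range(len(inputs)))
    graph = f"{pads}concat=n={len(inputs)}:v=1:a=1[v][a]"

    # Map the filter outputs; ffmpeg re-encodes them with default encoders.
    args += ["-filter_complex", graph, "-map", "[v]", "-map", "[a]", output]
    return args
```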

This approach is similar to LosslessCut, an open-source video editor created by our own Transloadian Mikael Finstad. LosslessCut also uses FFmpeg's concat demuxer for fast, lossless video editing.

To make the most of this change, make sure to first transcode differing input files to the same format, codec and resolution. For best performance here, use turbo: true and one of our presets. With both of these, you can greatly improve execution time on your existing Assemblies.

If your input files already share the same format, codec and resolution, there is no need to add any encoding Steps: you should be seeing the performance improvements by default. However, if your files may come in different formats, codecs or resolutions, we recommend adding a /file/filter Step that passes outlier files to an encoding Step, so that your later concatenation Steps can take the fast path.
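
The routing idea can be modelled in a few lines. The sketch below is not /file/filter configuration; it only illustrates the decision such a Step would make, using hypothetical metadata keys and treating the most common profile in the batch as the target. Files that deviate from the target are the ones that would be sent to an encoding Step first.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Hypothetical metadata keys covering format, codec and resolution.
ROUTING_KEYS = ("container", "video_codec", "width", "height")

FileMeta = Dict[str, object]


def split_outliers(files: Sequence[FileMeta]) -> Tuple[List[FileMeta], List[FileMeta]]:
    """Split files into (conforming, outliers) relative to the most common profile."""
    if not files:
        return [], []

    profiles = [tuple(f.get(key) for key in ROUTING_KEYS) for f in files]
    target, _count = Counter(profiles).most_common(1)[0]

    conforming = [f for f, p in zip(files, profiles) if p == target]
    outliers = [f for f, p in zip(files, profiles) if p != target]
    return conforming, outliers
```

Choosing the most common profile as the target keeps the number of files that need transcoding as small as possible; a fixed target (for example, a specific preset) would work just as well.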

#api #assembly #workflow #file-filter-robot

Co-written by Joseph Grabski
