Creating audio waveform videos with FFmpeg & Node.js
It's no secret that we love open source projects here at Transloadit. We have released several ambitious projects such as Uppy and Tus, and our founders even met as a result of contributing to an open source project. We also use many open source tools under the hood such as FFmpeg, which cuts right to the heart of our video encoding.
FFmpeg (Fast Forward MPEG) is a command line utility for transcoding audio and video files: converting between formats, altering sample rates and resolutions, and much more. FFmpeg powers our Video and Audio Encoding services, each of which has several Robots at its disposal.
By default, Transloadit will configure and scale FFmpeg under the hood, offering a powerful abstraction to cater to common use cases. But if you want to stray off the paved paths, you can. Every Robot in both of these categories has an ffmpeg parameter that allows you to directly interface with FFmpeg and create functionality outside the bounds of our default features.
Today's goal
In this post, we want to draw some attention to the more creative transcoding options Transloadit has to offer by using our ffmpeg parameter and our Node.js SDK. We already have demos showing how FFmpeg can be used to remove a video stream, change a sample rate, and create custom encoding presets. Today, we'll show how to make an Assembly that generates an audio waveform image and scrolls it horizontally in time with the input audio track.
The recipe to achieve this will be written as a JSON document. We save this as a Template for later use.
Let's create our Template. Head over to the Templates section of your Transloadit account, click "New Template", and save the following JSON in it.
{
  "steps": {
    ":original": {
      "robot": "/upload/handle"
    },
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "result": true,
      "width": 8000,
      "height": 600,
      "background_color": "ffffffff",
      "outer_color": "1974c7aa",
      "center_color": "1974c7aa",
      "format": "image"
    },
    "merged": {
      "robot": "/video/merge",
      "use": {
        "steps": [
          {
            "name": "waveformed",
            "as": "image"
          },
          {
            "name": ":original",
            "as": "audio"
          }
        ]
      },
      "result": true,
      "ffmpeg_stack": "v3.3.3",
      "preset": "ipad-high",
      "width": "${file.meta.width}",
      "height": "${file.meta.height}"
    },
    "encode_effect": {
      "robot": "/video/encode",
      "use": "merged",
      "ffmpeg_stack": "v3.3.3",
      "preset": "empty",
      "ffmpeg": {
        "vf": "crop=w=800:h=${file.meta.height}:x='(${file.meta.width}-800)*t/${file.meta.duration}':y=0",
        "t": "${file.meta.duration}"
      }
    },
    "exported": {
      "use": "encode_effect",
      "robot": "/s3/store",
      "credentials": "s3_cred"
    }
  }
}
Now, let's break down what's going on. Our first Step, named ":original" above, receives the file that we'll submit, in this case with the Node.js SDK; no surprises there.
Our second Step, "waveformed", is where we generate our waveform image. We can set options like background_color and outer_color (the exterior color of the waveform). Let's keep things simple and stick with a white background and a blue waveform. You may notice that we have set the width of our generated waveform to be much larger than you would typically expect. We will explain this in more detail below. The key takeaway, however, is that as the width increases, the scrolling effect becomes smoother.
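To get a rough feel for why a wider image helps, here is a minimal sketch of the arithmetic. The 800 pixel crop window comes from the Template above, while the 180 second track length and the 25 frames per second are assumptions made purely for illustration, and pixelsPerFrame is a hypothetical helper rather than part of any SDK.

```javascript
// Horizontal distance the crop window travels per video frame.
// All arguments are illustrative assumptions, not values read from Transloadit.
function pixelsPerFrame(imageWidth, windowWidth, durationSeconds, fps) {
  const travel = imageWidth - windowWidth // total distance the window moves
  return travel / (durationSeconds * fps)
}

// A hypothetical 180-second track rendered at 25 frames per second.
for (const imageWidth of [1600, 4000, 8000]) {
  const step = pixelsPerFrame(imageWidth, 800, 180, 25)
  console.log(`${imageWidth}px wide: ${step.toFixed(2)}px per frame`)
}
```

With these assumed numbers, the 8000 pixel image moves the window 1.6 pixels per frame, while the 1600 pixel image moves it by less than a fifth of a pixel. Since a crop offset lands on whole pixels, the narrower image would likely sit still for several frames at a time, which can make the scroll look choppy.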
Onto our third Step, "merged". As the Step name implies, we merge the newly generated waveform image and our original input audio file to generate a new MP4 video file. Use the "name" and "as" naming convention within the use parameter, and you'll be good to go. We then use Assembly Variables to set the width and height parameters to the dimensions of our input image.
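The wiring between Steps is easy to get wrong once a Template grows, so a small check can help. The sketch below walks a Template object and reports any use reference that points at a Step that isn't defined. Both inputsOf and findMissingSteps are hypothetical helpers written for this post, and the handling of an array-style use value is an assumption about how you might write your own Templates.

```javascript
// Returns the Step names that a given Step reads its input from.
function inputsOf(step) {
  const use = step.use
  if (use === undefined) return []
  // Accept a single name, a list, or the { steps: [{ name, as }] } form.
  const entries = typeof use === 'string' ? [use] : Array.isArray(use) ? use : use.steps || []
  return entries.map((entry) => (typeof entry === 'string' ? entry : entry.name))
}

// Lists every "step -> input" reference whose input is not defined in the Template.
function findMissingSteps(template) {
  const defined = new Set(Object.keys(template.steps))
  const missing = []
  for (const [name, step] of Object.entries(template.steps)) {
    for (const input of inputsOf(step)) {
      if (!defined.has(input)) missing.push(`${name} -> ${input}`)
    }
  }
  return missing
}

// Trimmed-down copy of the Template above, keeping only the wiring.
const template = {
  steps: {
    ':original': { robot: '/upload/handle' },
    waveformed: { robot: '/audio/waveform', use: ':original' },
    merged: {
      robot: '/video/merge',
      use: {
        steps: [
          { name: 'waveformed', as: 'image' },
          { name: ':original', as: 'audio' },
        ],
      },
    },
    encode_effect: { robot: '/video/encode', use: 'merged' },
    exported: { robot: '/s3/store', use: 'encode_effect' },
  },
}

console.log(findMissingSteps(template)) // an empty array when every reference resolves
```

A misspelled Step name, such as "waveform" instead of "waveformed", would show up in the returned list as "merged -> waveform".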
Now, onto the fun part. Our "encode_effect" Step is where things get a bit unusual. Here we pipe our fully merged waveform video file to a /video/encode Robot. Using the ffmpeg parameter, we can interface directly with the FFmpeg command line utility. We'll be using the "vf" flag to generate our video effect. Let's take a closer look!
"crop=w=800:h=${file.meta.height}:x='(${file.meta.width}-800)*t/${file.meta.duration}':y=0"
We use the crop filter here to adjust the size of our final result. In our waveformed Step, we set the total width to 8000, and we want to show only 10% of the entire image at any point while scrolling, so we use 800 as our target width. Again, we use an Assembly Variable to set the target height to the height of the input file. Since we are only moving from left to right, we set y to 0.
For x, our expression governs the horizontal position of our crop window and starts our video at the image's left edge. Here, t is the timestamp of the current frame in seconds. Dividing it by the duration tells us how far along we are, so the crop window reaches the opposite edge of the image exactly when the audio ends. The duration will be different for each audio file we throw at our Assembly, which is why we use an Assembly Variable again, so we can consistently get the correct value from the metadata of our input file.
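To make the expression concrete, the sketch below evaluates it in plain JavaScript for a few timestamps. cropX is a hypothetical helper that mirrors the x expression, and the 200 second duration is an assumed value standing in for ${file.meta.duration}.

```javascript
// Mirrors the x expression in the vf flag: (width - 800) * t / duration.
function cropX(t, imageWidth, windowWidth, durationSeconds) {
  return ((imageWidth - windowWidth) * t) / durationSeconds
}

const imageWidth = 8000 // width set in the "waveformed" Step
const windowWidth = 800 // width of the crop window
const duration = 200 // assumed track length in seconds

console.log(cropX(0, imageWidth, windowWidth, duration)) // 0, the left edge
console.log(cropX(duration / 2, imageWidth, windowWidth, duration)) // 3600, halfway
console.log(cropX(duration, imageWidth, windowWidth, duration)) // 7200, window ends at pixel 8000
```

The offset grows linearly with t, and at the final timestamp the 800 pixel window covers pixels 7200 to 8000, which is the right edge of the image.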
But this isn't the only flag we use within our ffmpeg parameter. Additionally, we use the t flag. All this does is set the length of the resulting file. You'll notice that we use the same duration value for this flag as we did in the x expression of our vf flag above.
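If you want to experiment with different window widths, it may help to generate both flags from one place. The following is only a sketch: buildFfmpegParams is a hypothetical helper, and its output is meant to be pasted into the ffmpeg parameter of the Template, where Transloadit fills in the Assembly Variables.

```javascript
// Builds the value of the "ffmpeg" parameter for a given crop window width.
// The ${file.meta.*} placeholders are left untouched for Transloadit to fill in,
// which is why they are escaped inside the template literals.
function buildFfmpegParams(windowWidth) {
  const x = `(\${file.meta.width}-${windowWidth})*t/\${file.meta.duration}`
  return {
    vf: `crop=w=${windowWidth}:h=\${file.meta.height}:x='${x}':y=0`,
    t: '${file.meta.duration}',
  }
}

console.log(JSON.stringify(buildFfmpegParams(800), null, 2))
```

Calling the helper with 800 reproduces the vf and t values used in the Template, so any other width only changes the two places where 800 appears.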
Finally, we export our result using the /s3/store Robot and then that's us all done!
Getting started
Now that we have our Template, let's set up our SDK integration. We have opted to use Node.js for this demonstration, but this could easily be translated to our other available SDKs. Before we start, we need to install Transloadit's Node.js SDK. After initializing npm in your project folder, run the following command.
npm i transloadit --save
Below, we have broken our code into a series of steps. So that you don't have to tediously copy each one, there is a copy-pastable version at the end.
First we import the Transloadit library.
const Transloadit = require('transloadit')
Next, we set up our Transloadit client. Both values needed for this can be found in your Transloadit account's console, under the "Credentials" section.
const transloadit = new Transloadit({
  authKey: 'YOUR_AUTH_KEY',
  authSecret: 'YOUR_AUTH_SECRET',
})
Now we define the audio file we wish to use in our Assembly. Place any audio file you wish to use in the working directory, declare both a filePath and a fieldName variable, and pass them to our Transloadit client's addFile method.
const filePath = './joakim_karud-rock_angel.mp3'
const fieldName = 'my_file'
transloadit.addFile(fieldName, filePath)
From here, we write the final piece of code, which sends our Assembly request. We pass in our Template ID, call createAssembly, and use the callback to find out whether the request was successful.
const opts = {
  params: {
    template_id: 'YOUR_TEMPLATE_ID',
  },
}
transloadit.createAssembly(opts, (err, result) => {
  if (err) {
    console.error(err)
    return
  }
  console.log('success')
  console.log(result)
})
With all of our code assembled, we can run our program. Here is the full program again:
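const Transloadit = require('transloadit')

// Credentials from the "Credentials" section of your Transloadit account.
const transloadit = new Transloadit({
  authKey: 'YOUR_AUTH_KEY',
  authSecret: 'YOUR_AUTH_SECRET',
})

// The audio file to upload and the field name it is sent under.
const filePath = './joakim_karud-rock_angel.mp3'
const fieldName = 'my_file'
transloadit.addFile(fieldName, filePath)

// Point the Assembly at the Template we saved earlier.
const opts = {
  params: {
    template_id: 'YOUR_TEMPLATE_ID',
  },
}

// Create the Assembly and report whether the request succeeded.
transloadit.createAssembly(opts, (waveformErr, result) => {
  if (waveformErr) {
    console.error({ waveformErr })
    return
  }
  console.log('success')
  console.log(result)
})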
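If you prefer promises over callbacks, the callback-style call can be wrapped with Node's built-in util.promisify. This is a sketch that assumes the client exposes createAssembly(opts, callback) exactly as used above. createAssemblyAsync is a hypothetical helper, and fakeClient is a stand-in invented for this example so the snippet runs without credentials; the object it returns is not a real Assembly response.

```javascript
const { promisify } = require('util')

// Wraps a callback-style client so that createAssembly can be awaited.
// `client` is assumed to expose createAssembly(opts, callback).
function createAssemblyAsync(client, opts) {
  return promisify(client.createAssembly.bind(client))(opts)
}

// Stand-in client for illustration only; swap in the real `transloadit` instance.
const fakeClient = {
  createAssembly(opts, callback) {
    // Echo the Template ID back asynchronously, without any network call.
    setImmediate(() => callback(null, { template_id: opts.params.template_id }))
  },
}

async function main() {
  try {
    const opts = { params: { template_id: 'YOUR_TEMPLATE_ID' } }
    const result = await createAssemblyAsync(fakeClient, opts)
    console.log('success')
    console.log(result)
  } catch (err) {
    console.error(err)
  }
}

main()
```

The try/catch block plays the same role as the if (err) check in the callback version, so error handling stays in one place.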
For the audio file, we have opted to use the track "Rock Angel" by Joakim Karud, which we have used in many of our demos. If you wish to use the same file, it can be found in the GitHub repo below.
Once our program has finished running, we can log into S3 and look at our final result.
As you can see, everything has gone to plan. But here's where things can get more interesting. With FFmpeg, the options for your Assemblies are virtually endless. Why don't you try scrolling the waveform vertically, or transcoding the video to play backwards? Or use the ffmpeg parameter to apply wildly different filters.
We hope this post has succeeded in showing you how versatile the Transloadit API can be, and just how useful the ffmpeg parameter can be when creating a range of creative and fun projects.
If you would like to download any of the assets found in this project, go to the following GitHub repo!