Digital creators are constantly seeking new ways to engage their audience. With audio content on the rise again, there's also a growing demand for tools that can enhance the audio-visual experience. Inspired by Spearhead, we decided to attempt to automatically generate a moving waveform ourselves. Whether you're a podcaster, musician, or content creator, this guide will help you transform your audio into a captivating visual experience, elevating your content strategy.


Template

Without skipping a beat, let's move on to the Template!

{
  "steps": {
    ":original": {
      "robot": "/upload/handle"
    },
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "result": true,
      "width": 8000,
      "height": 600,
      "background_color": "00000000",
      "outer_color": "ffffffff",
      "center_color": "ffffffff",
      "format": "image"
    },
    "merged": {
      "robot": "/video/merge",
      "use": {
        "steps": [
          {
            "name": "waveformed",
            "as": "image"
          },
          {
            "name": ":original",
            "as": "audio"
          }
        ]
      },
      "result": true,
      "ffmpeg_stack": "v5.0.0",
      "preset": "ipad-high",
      "width": "${file.meta.width}",
      "height": "${file.meta.height}"
    },
    "scroll_waveform": {
      "robot": "/video/encode",
      "use": "merged",
      "ffmpeg_stack": "v5.0.0",
      "preset": "empty",
      "ffmpeg": {
        "vf": "crop=w=800:h=(${file.meta.height}):x='(${file.meta.width}-800)*t/${file.meta.duration}':y=0",
        "t": "${file.meta.duration}"
      }
    },
    "transcribed": {
      "use": ":original",
      "robot": "/speech/transcribe",
      "provider": "aws",
      "format": "webvtt",
      "result": true
    },
    "subtitle": {
      "use": {
        "bundle_steps": true,
        "steps": [
          {
            "name": "scroll_waveform",
            "as": "video"
          },
          {
            "name": "transcribed",
            "as": "subtitles"
          }
        ]
      },
      "robot": "/video/subtitle",
      "ffmpeg_stack": "v5.0.0",
      "subtitles_type": "burned",
      "preset": "ipad-high",
      "result": true
    }
  }
}

That's probably a lot to absorb all at once, so let's break it down and examine each Step by itself.

First of all, we want to convert our audio to a waveform image. Thankfully, we can use our /audio/waveform Robot to create an 8000px wide image of the waveform from the audio clip. Making it this wide means we can scroll a virtual viewfinder along the image's length, creating the illusion of the waveform moving to match the spoken words.
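To get a feel for the numbers, here is a minimal Python sketch. The function name and the 60-second clip are our own illustrative choices, while the 8000px image width and the 800px viewport come from the Template. It also assumes the waveform image covers the whole clip evenly from left to right.

```python
def waveform_scale(duration_s: float, image_width: int = 8000, viewport_width: int = 800):
    """Return (seconds of audio per pixel, seconds of audio visible in the viewport).

    Assumes the waveform image spans the full clip evenly across its width.
    """
    if duration_s <= 0 or image_width <= 0 or viewport_width <= 0:
        raise ValueError("duration and widths must be positive")
    seconds_per_pixel = duration_s / image_width
    visible_seconds = seconds_per_pixel * viewport_width
    return seconds_per_pixel, visible_seconds


# Hypothetical 60-second clip, used purely for illustration.
per_px, visible = waveform_scale(60.0)
print(f"{per_px * 1000:.1f} ms per pixel, {visible:.1f} s visible at once")
```

For that hypothetical one-minute clip, each pixel column stands for 7.5 ms of audio and the viewport shows roughly six seconds at a time. Longer clips squeeze more audio into the same width, so you may want a wider image for long recordings.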

Next, the /video/merge Robot adds the original audio to this image, producing a video.

Now it's time to put our /video/encode Robot to work. Using custom FFmpeg parameters, we crop the frame to a window only 800px wide, while keeping the original height. We then set the x position of this viewport to the width of the full image, minus the viewport width, multiplied by our time parameter t over the total duration of the video. The t parameter is the current timestamp in the video, meaning that at the start of the video, the x position will be 0, steadily increasing to 7200 at the end (since 8000px - 800px = 7200px).
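If the crop expression feels abstract, the same arithmetic can be sketched in a few lines of Python. This is only a mirror of the formula for experimenting, not something FFmpeg or Transloadit runs. The function name and the 60-second duration are assumptions for illustration.

```python
def crop_x(t: float, duration: float, full_width: int = 8000, viewport_width: int = 800) -> float:
    """Mirror the crop expression: x = (full_width - viewport_width) * t / duration."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return (full_width - viewport_width) * t / duration


# Hypothetical 60-second video, sampled at a few timestamps.
duration = 60.0
for t in (0.0, 15.0, 30.0, 60.0):
    print(f"t={t:5.1f}s -> x={crop_x(t, duration):6.1f}px")
```

The viewport starts at x = 0, sits at x = 3600 halfway through, and ends at x = 7200, so its right edge lines up with the end of the 8000px image exactly when the audio finishes.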

Phew! That was a lot to take in. To close out our Template, we use two closely related Robots. The /speech/transcribe Robot generates a subtitle file for us, which is then passed to the /video/subtitle Robot.
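For readers who haven't seen a WebVTT file before, the sketch below builds a tiny one in Python so you know what the subtitle Step is working with. The helper names and the cue text are made up for illustration, and the real transcription output will differ in timing and content.

```python
def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def vtt_document(cues) -> str:
    """Build a minimal WebVTT document from (start, end, text) tuples."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Invented cues, purely illustrative.
print(vtt_document([
    (0.0, 2.5, "Hello and welcome."),
    (2.5, 5.0, "Today we look at waveforms."),
]))
```

Each cue is a time range followed by the text to show during that range, which is the information a subtitle Step needs to render the words onto the video at the right moment.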

Results

Let's take a look at the results our Template produced.

The result of passing a sample file into our Template

From start to finish, our Assembly took approximately ten seconds. Pretty impressive!

That's it for today's blog. If you followed along with this guide and got some cool results, please let us know on Twitter. We always love seeing the creative ways people come up with to use Transloadit. If you're looking for other ways to generate audio waveforms and visualise them, check out this demo!