Generate waveform images from audio

🤖/audio/waveform generates waveform images for your audio files and allows you to change their colors and dimensions.

We recommend adding a 🤖/audio/encode Step before your waveform Step to convert audio files to MP3. This guarantees that 🤖/audio/waveform accepts your audio file, and it also lets you down-sample large audio files to save some money.

Similarly, if you need the output image in a different format, please pipe the result of this Robot into 🤖/image/resize.
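
The chaining described above could be sketched as follows. This is a minimal illustration, not a verified recipe: the Step names "encoded", "waveformed", and "converted" are arbitrary, and the "preset": "mp3" and "format": "jpg" values are assumed parameters of 🤖/audio/encode and 🤖/image/resize that should be checked against those Robots' own documentation.

```json
{
  "steps": {
    "encoded": {
      "robot": "/audio/encode",
      "use": ":original",
      "preset": "mp3"
    },
    "waveformed": {
      "robot": "/audio/waveform",
      "use": "encoded",
      "width": 400,
      "height": 200
    },
    "converted": {
      "robot": "/image/resize",
      "use": "waveformed",
      "format": "jpg"
    }
  }
}
```

Each Step names the previous one in its "use" parameter, so the uploaded file flows from encoding to waveform generation to image conversion.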

Usage example

Generate a 400×200 waveform in #0099cc color from an uploaded audio file:

{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "width": 400,
      "height": 200,
      "outer_color": "0099ccff",
      "center_color": "0099ccff"
    }
  }
}

Parameters

  • output_meta

    Record<string, boolean> | boolean | Array<string>

    Allows you to specify a set of metadata that is more CPU-intensive to calculate and is therefore disabled by default to keep your Assemblies processing fast.

    For images, you can add "has_transparency": true in this object to detect whether the image contains transparent parts and "dominant_colors": true to extract an array of hexadecimal color codes from the image.

    For videos, you can add "colorspace": true to extract the colorspace of the output video.

    For audio, you can add "mean_volume": true to get a single value representing the mean average volume of the audio file.

    You can also set this to false to skip metadata extraction and speed up transcoding.
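
    As a rough sketch, the object form might look like this for a waveform Step. It assumes the image-related keys apply here because the default output is a PNG image; the Step name is arbitrary.

    ```json
    {
      "steps": {
        "waveformed": {
          "robot": "/audio/waveform",
          "use": ":original",
          "output_meta": {
            "has_transparency": true
          }
        }
      }
    }
    ```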

  • result

    boolean (default: false)

    Whether the results of this Step should be present in the Assembly Status JSON.

  • queue

    batch

    Setting the queue to "batch" manually downgrades the priority of jobs for this Step, so that jobs that don't need zero queue waiting times don't consume Priority job slots.

  • force_accept

    boolean (default: false)

    Force a Robot to accept a file type it would have ignored.

    By default, Robots ignore files they are not familiar with. 🤖/video/encode, for example, will happily ignore input images.

    With the force_accept parameter set to true, you can force Robots to accept all files thrown at them. This will typically lead to errors and should only be used for debugging or combatting edge cases.

  • ignore_errors

    boolean | Array<meta | execute> (default: [])

    Ignore errors during specific phases of processing.

    Setting this to ["meta"] will cause the Robot to ignore errors during metadata extraction.

    Setting this to ["execute"] will cause the Robot to ignore errors during the main execution phase.

    Setting this to true is equivalent to ["meta", "execute"] and will ignore errors in both phases.
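
    For illustration, a Step that tolerates metadata extraction failures but still reports errors from the main execution phase could be written as below. The Step name is arbitrary.

    ```json
    {
      "steps": {
        "waveformed": {
          "robot": "/audio/waveform",
          "use": ":original",
          "ignore_errors": ["meta"]
        }
      }
    }
    ```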

  • use

    string | Array<string> | Array<object> | object

    Specifies which Step(s) to use as input.

    • You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)
    • You can provide several Steps as input with arrays:
      {
        "use": [
          ":original",
          "encoded",
          "resized"
        ]
      }
      
  • ffmpeg

    object

    A parameter object to be passed to FFmpeg. If a preset is used, the options specified here are merged on top of the preset's options and take precedence over them. For available options, see the FFmpeg documentation.

  • ffmpeg_stack

    v5 | v6 | v7 | string (default: "v5.0.0")

    Selects the FFmpeg stack version to use for encoding. These versions reflect real FFmpeg versions. We currently recommend using "v6.0.0".

  • format

    image | json (default: "image")

    The format of the result file. Can be "image" or "json". If "image" is supplied, a PNG image will be created, otherwise a JSON file.
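
    A minimal sketch of requesting waveform data instead of an image follows. The Step name is arbitrary, and the structure of the resulting JSON file is not described here, so inspect an actual result before relying on it.

    ```json
    {
      "steps": {
        "waveform_data": {
          "robot": "/audio/waveform",
          "use": ":original",
          "format": "json"
        }
      }
    }
    ```

    The width and height parameters are documented as applying to the "image" format, so they are omitted here.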

  • width

    string | number (default: 256)

    The width of the resulting image if the format "image" was selected.

  • height

    string | number (default: 64)

    The height of the resulting image if the format "image" was selected.

  • antialiasing

    0 | 1 | boolean (default: 0)

    Set to 1 or true to enable antialiasing, which gives the waveform graph smoother edges. Set to 0 or false to disable it.

  • background_color

    string (default: "#00000000")

    The background color of the resulting image in the "rrggbbaa" format (red, green, blue, alpha), if the format "image" was selected.

  • center_color

    string (default: "000000ff")

    The color used in the center of the gradient. The format is "rrggbbaa" (red, green, blue, alpha).

  • outer_color

    string (default: "000000ff")

    The color used in the outer parts of the gradient. The format is "rrggbbaa" (red, green, blue, alpha).
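
    The three color parameters can be combined as in this sketch, which assumes a light blue center fading to a darker blue on an opaque black background. The color values and Step name are illustrative only.

    ```json
    {
      "steps": {
        "waveformed": {
          "robot": "/audio/waveform",
          "use": ":original",
          "width": 400,
          "height": 200,
          "background_color": "000000ff",
          "center_color": "00ccffff",
          "outer_color": "0033ccff"
        }
      }
    }
    ```

    The last two hex digits of each value are the alpha channel, so "ff" is fully opaque and "00" is fully transparent.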

  • style

    v0 | v1

    Waveform style version.

    • "v0": Legacy waveform generation (default).
    • "v1": Advanced waveform generation with additional parameters.

    For backwards compatibility, numeric values 0, 1, 2 are also accepted and mapped to "v0" (0) and "v1" (1/2).

  • split_channels

    boolean

    Available when style is "v1". If set to true, outputs multi-channel waveform data or image files, one per channel.

  • zoom

    string | number

    Available when style is "v1". Zoom level in samples per pixel. This parameter cannot be used together with pixels_per_second.

  • pixels_per_second

    string | number

    Available when style is "v1". Zoom level in pixels per second. This parameter cannot be used together with zoom.

  • bits

    8 | 16

    Available when style is "v1". Bit depth for waveform data. Can be 8 or 16.

  • start

    string | number

    Available when style is "v1". Start time in seconds.

  • end

    string | number

    Available when style is "v1". End time in seconds (0 means end of audio).

  • colors

    audition | audacity

    Available when style is "v1". Color scheme to use. Can be "audition" or "audacity".

  • border_color

    string

    Available when style is "v1". Border color in "rrggbbaa" format.

  • waveform_style

    normal | bars

    Available when style is "v1". Waveform style. Can be "normal" or "bars".

  • bar_width

    string | number

    Available when style is "v1". Width of bars in pixels when waveform_style is "bars".

  • bar_gap

    string | number

    Available when style is "v1". Gap between bars in pixels when waveform_style is "bars".

  • bar_style

    square | rounded

    Available when style is "v1". Bar style when waveform_style is "bars".

  • axis_label_color

    string

    Available when style is "v1". Color for axis labels in "rrggbbaa" format.

  • no_axis_labels

    boolean

    Available when style is "v1". If set to true, renders waveform image without axis labels.

  • with_axis_labels

    boolean

    Available when style is "v1". If set to true, renders waveform image with axis labels.

  • amplitude_scale

    string | number

    Available when style is "v1". Amplitude scale factor.

  • compression

    string | number

    Available when style is "v1". PNG compression level: 0 (none) to 9 (best), or -1 (default). Only applicable when format is "image".
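
The "v1" parameters listed above could be combined along these lines. This is a hedged sketch rather than a tested configuration: the values are illustrative, the Step name is arbitrary, and how the individual options interact is not specified in this reference.

```json
{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "style": "v1",
      "width": 800,
      "height": 200,
      "waveform_style": "bars",
      "bar_width": 4,
      "bar_gap": 2,
      "bar_style": "rounded",
      "start": 0,
      "end": 30,
      "no_axis_labels": true,
      "compression": 9
    }
  }
}
```

This requests a bar-style rendering of the first 30 seconds without axis labels, using the highest PNG compression level. Neither zoom nor pixels_per_second is set, since those two cannot be used together.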

Demos