Generate waveform images from audio
🤖/audio/waveform generates waveform images for your audio files and allows you to change their colors and dimensions.

We recommend running an 🤖/audio/encode Step before your waveform Step to convert audio files to MP3. This guarantees that 🤖/audio/waveform accepts your audio file, and it lets you down-sample large audio files to save money.
Similarly, if you need the output image in a different format, please pipe the result of this Robot into 🤖/image/resize.
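The chain recommended above could look like the following sketch. It assumes the "mp3" preset of 🤖/audio/encode and the "format" parameter of 🤖/image/resize, neither of which is documented on this page, and the Step names are arbitrary. Treat it as a starting point and check both Robots' documentation before relying on these values.

```json
{
  "steps": {
    "mp3_encoded": {
      "robot": "/audio/encode",
      "use": ":original",
      "preset": "mp3"
    },
    "waveformed": {
      "robot": "/audio/waveform",
      "use": "mp3_encoded",
      "width": 400,
      "height": 200
    },
    "jpg_converted": {
      "robot": "/image/resize",
      "use": "waveformed",
      "format": "jpg"
    }
  }
}
```

Each Step names the previous one in its use parameter, so the audio flows from the upload through encoding and waveform generation to the final image conversion.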
Usage example
Generate a 400×200 waveform in #0099cc color from an uploaded audio file:
{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "width": 400,
      "height": 200,
      "outer_color": "0099ccff",
      "center_color": "0099ccff"
    }
  }
}

Parameters
output_meta (Record<string, boolean> | boolean | Array<string>)
Allows you to specify a set of metadata that is more expensive on CPU power to calculate, and thus is disabled by default to keep your Assemblies processing fast.
For images, you can add "has_transparency": true in this object to extract whether the image contains transparent parts, and "dominant_colors": true to extract an array of hexadecimal color codes from the image.
For images, you can also add "blurhash": true to extract a BlurHash string, a compact representation of a placeholder for the image that is useful for showing a blurred preview while the full image loads.
For videos, you can add "colorspace": true to extract the colorspace of the output video.
For audio, you can add "mean_volume": true to get a single value representing the mean volume of the audio file.
You can also set this to false to skip metadata extraction and speed up transcoding.

result (boolean, default: false)
Whether the results of this Step should be present in the Assembly Status JSON.
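As a sketch of how these two parameters combine, the Step below asks for the extra image metadata on the generated waveform and includes the result in the Assembly Status JSON. The Step name is arbitrary, and whether these particular metadata keys are useful for a waveform image depends on your use case.

```json
{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "result": true,
      "output_meta": {
        "has_transparency": true,
        "dominant_colors": true
      }
    }
  }
}
```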

queue (batch)
Setting the queue to "batch" manually downgrades the priority of jobs for this Step, to avoid consuming Priority job slots for jobs that don't need zero queue waiting times.

force_accept (boolean, default: false)
Force a Robot to accept a file type it would have ignored.
By default, Robots ignore files they are not familiar with. 🤖/video/encode, for example, will happily ignore input images.
With the force_accept parameter set to true, you can force Robots to accept all files thrown at them. This will typically lead to errors and should only be used for debugging or combatting edge cases.

ignore_errors (boolean | Array<meta | execute>, default: [])
Ignore errors during specific phases of processing.
Setting this to ["meta"] will cause the Robot to ignore errors during metadata extraction.
Setting this to ["execute"] will cause the Robot to ignore errors during the main execution phase.
Setting this to true is equivalent to ["meta", "execute"] and will ignore errors in both phases.

use (string | Array<string> | Array<object> | object)
Specifies which Step(s) to use as input.
- You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)
- You can provide several Steps as input with arrays:
  { "use": [ ":original", "encoded", "resized" ] }
Tip
That's likely all you need to know about use, but you can view Advanced use cases.
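To illustrate ignore_errors, here is a minimal sketch of a waveform Step that keeps going when metadata extraction fails but still reports errors from the main execution phase. The Step name is arbitrary.

```json
{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "ignore_errors": ["meta"]
    }
  }
}
```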

ffmpeg (object)
A parameter object to be passed to FFmpeg. If a preset is used, the options specified here are merged on top of the preset's options and take precedence over them. For available options, see the FFmpeg documentation.

ffmpeg_stack (v5 | v6 | v7 | string, default: "v5.0.0")
Selects the FFmpeg stack version to use for encoding. These versions reflect real FFmpeg versions. We currently recommend using "v6.0.0".
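Because the default stack differs from the recommended one, you may want to set the version explicitly. A minimal sketch, with an arbitrary Step name:

```json
{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "ffmpeg_stack": "v6.0.0"
    }
  }
}
```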

format (image | json, default: "image")
The format of the result file. Can be "image" or "json". If "image" is supplied, a PNG image will be created, otherwise a JSON file. When style is "spectrogram", only "image" is supported.

width (string | number, default: 256)
The width of the resulting image if the format "image" was selected.

height (string | number, default: 64)
The height of the resulting image if the format "image" was selected.

antialiasing (0 | 1 | boolean, default: 0)
Either 0 or 1, or true/false, indicating whether to enable antialiasing for smoother edges in the waveform graph.

background_color (string, default: "#00000000")
The background color of the resulting image in the "rrggbbaa" format (red, green, blue, alpha), if the format "image" was selected.

center_color (string, default: "000000ff")
The color used in the center of the gradient. The format is "rrggbbaa" (red, green, blue, alpha).

outer_color (string, default: "000000ff")
The color used in the outer parts of the gradient. The format is "rrggbbaa" (red, green, blue, alpha).
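The format, size, and color parameters above can be combined as in the sketch below, which produces one antialiased PNG with a gradient on a white background and one JSON file from the same upload. The Step names and the specific color and size values are illustrative choices, not recommendations.

```json
{
  "steps": {
    "waveform_image": {
      "robot": "/audio/waveform",
      "use": ":original",
      "format": "image",
      "width": 800,
      "height": 120,
      "antialiasing": true,
      "background_color": "ffffffff",
      "center_color": "0099ccff",
      "outer_color": "003366ff"
    },
    "waveform_data": {
      "robot": "/audio/waveform",
      "use": ":original",
      "format": "json"
    }
  }
}
```

Setting center_color and outer_color to different values yields a gradient, while giving both the same value produces a solid color.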

style (v0 | v1 | spectrogram)
Waveform style version.
- "v0": Legacy waveform generation (default).
- "v1": Advanced waveform generation with additional parameters.
- "spectrogram": Spectrogram visualization showing frequency content over time.
For backwards compatibility, numeric values 0 and 1 are also accepted and mapped to "v0" and "v1".

split_channels (boolean)
Available when style is "v1". If set to true, outputs multi-channel waveform data or image files, one per channel.

zoom (string | number)
Available when style is "v1". Zoom level in samples per pixel. This parameter cannot be used together with pixels_per_second.

pixels_per_second (string | number)
Available when style is "v1". Zoom level in pixels per second. This parameter cannot be used together with zoom.

bits (8 | 16)
Available when style is "v1". Bit depth for waveform data. Can be 8 or 16.

start (string | number)
Available when style is "v1". Start time in seconds.

end (string | number)
Available when style is "v1". End time in seconds (0 means end of audio).

colors (audition | audacity)
Available when style is "v1". Color scheme to use. Can be "audition" or "audacity".

border_color (string)
Available when style is "v1". Border color in "rrggbbaa" format.

waveform_style (normal | bars)
Available when style is "v1". Waveform style. Can be "normal" or "bars".

bar_width (string | number)
Available when style is "v1". Width of bars in pixels when waveform_style is "bars".

bar_gap (string | number)
Available when style is "v1". Gap between bars in pixels when waveform_style is "bars".

bar_style (square | rounded)
Available when style is "v1". Bar style when waveform_style is "bars".

axis_label_color (string)
Available when style is "v1". Color for axis labels in "rrggbbaa" format.

no_axis_labels (boolean)
Available when style is "v1". If set to true, renders waveform image without axis labels.

with_axis_labels (boolean)
Available when style is "v1". If set to true, renders waveform image with axis labels.

amplitude_scale (string | number)
Available when style is "v1". Amplitude scale factor.

compression (string | number)
Available when style is "v1". PNG compression level: 0 (none) to 9 (best), or -1 (default). Only applicable when format is "image".

color_map (viridis | plasma | magma | cividis | cool | rainbow | moreland |)
Available when style is "spectrogram". Color scheme for the spectrogram visualization. Defaults to "viridis".

frequency_scale (linear | logarithmic)
Available when style is "spectrogram". Frequency scale for the spectrogram. "linear" shows frequencies evenly spaced, "logarithmic" emphasizes lower frequencies. Defaults to "logarithmic".

frequency_min (string | number)
Available when style is "spectrogram". Minimum frequency in Hz to display. Defaults to 0.

frequency_max (string | number)
Available when style is "spectrogram". Maximum frequency in Hz to display. Defaults to half the sample rate (Nyquist frequency).

legend (boolean)
Available when style is "spectrogram". Whether to include a legend showing the frequency and time scales. Defaults to false.

gain (string | number)
Available when style is "spectrogram". Linear gain factor for spectrogram intensity. Defaults to 1.

orientation (vertical | horizontal)
Available when style is "spectrogram". Orientation of the spectrogram. "horizontal" shows time on the x-axis (default), "vertical" shows time on the y-axis.
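
The style-specific parameters only take effect with the matching style value. The sketch below, with arbitrary Step names and illustrative values, renders the first 30 seconds of the upload as a "v1" bar waveform without axis labels, and renders the same file as a spectrogram limited to the 20 Hz to 8000 Hz range.

```json
{
  "steps": {
    "waveform_bars": {
      "robot": "/audio/waveform",
      "use": ":original",
      "style": "v1",
      "width": 800,
      "height": 200,
      "waveform_style": "bars",
      "bar_width": 4,
      "bar_gap": 2,
      "bar_style": "rounded",
      "start": 0,
      "end": 30,
      "no_axis_labels": true
    },
    "spectrogram": {
      "robot": "/audio/waveform",
      "use": ":original",
      "style": "spectrogram",
      "format": "image",
      "color_map": "magma",
      "frequency_scale": "logarithmic",
      "frequency_min": 20,
      "frequency_max": 8000,
      "legend": true
    }
  }
}
```

The bar Step leaves out zoom and pixels_per_second, since the two cannot be combined, and the spectrogram Step keeps format set to "image" because "json" is not supported for that style.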