Generate waveform images from audio

🤖/audio/waveform generates waveform images for your audio files and allows you to change their colors and dimensions.

We recommend placing an 🤖/audio/encode Step before your waveform Step to convert audio files to MP3. This guarantees that 🤖/audio/waveform accepts your audio file, and it lets you down-sample large audio files to save money.

Similarly, if you need the output image in a different format, please pipe the result of this Robot into 🤖/image/resize.
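
The chaining described above could look roughly like the following sketch. The Step names "mp3" and "jpg" are arbitrary, and the "preset" value for 🤖/audio/encode and the "format" value for 🤖/image/resize are assumptions taken from those Robots' own parameters rather than from this page, so please check their documentation before relying on them:

```json
{
  "steps": {
    ":original": {
      "robot": "/upload/handle"
    },
    "mp3": {
      "robot": "/audio/encode",
      "use": ":original",
      "preset": "mp3",
      "ffmpeg_stack": "v6.0.0"
    },
    "waveformed": {
      "robot": "/audio/waveform",
      "use": "mp3",
      "width": 400,
      "height": 200,
      "ffmpeg_stack": "v6.0.0"
    },
    "jpg": {
      "robot": "/image/resize",
      "use": "waveformed",
      "format": "jpg"
    }
  }
}
```

Each Step consumes the output of the previous one through "use", so the waveform Robot only ever sees MP3 input and the final Step only ever sees the PNG it produced.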

Usage example

Generate a 400×200 waveform in #0099cc color from an uploaded audio file:

{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "width": 400,
      "height": 200,
      "outer_color": "0099ccff",
      "center_color": "0099ccff"
    }
  }
}
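
As a variation on the example above, the following sketch uses different center and outer colors to produce a gradient, draws it on an opaque white background, and enables antialiasing. The color values and dimensions are arbitrary illustrations. Whether "dominant_colors" is useful for a waveform image is an assumption; it is included only to show the shape of "output_meta":

```json
{
  "steps": {
    "waveformed": {
      "robot": "/audio/waveform",
      "use": ":original",
      "result": true,
      "width": 800,
      "height": 160,
      "antialiasing": true,
      "background_color": "ffffffff",
      "center_color": "0099ccff",
      "outer_color": "003344ff",
      "output_meta": {
        "dominant_colors": true
      },
      "ignore_errors": ["meta"]
    }
  }
}
```

All colors use the "rrggbbaa" format, so the trailing "ff" makes each color fully opaque.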

Parameters

  • output_meta

    Record<string, boolean> | boolean | Array<string>

    Allows you to request metadata that is more CPU-intensive to calculate and is therefore disabled by default to keep your Assemblies processing fast.

    For images, you can add "has_transparency": true to this object to detect whether the image contains transparent parts, and "dominant_colors": true to extract an array of hexadecimal color codes from the image.

    For images, you can also add "blurhash": true to extract a BlurHash string, a compact representation of a placeholder for the image that is useful for showing a blurred preview while the full image loads.

    For videos, you can add "colorspace": true to extract the colorspace of the output video.

    For audio, you can add "mean_volume": true to get a single value representing the mean volume of the audio file.

    You can also set this to false to skip metadata extraction and speed up transcoding.

  • result

    boolean (default: false)

    Whether the results of this Step should be present in the Assembly Status JSON.

  • queue

    batch

    Setting the queue to "batch" manually downgrades the priority of jobs for this Step, so that jobs that don't need zero queue waiting times don't consume Priority job slots.

  • force_accept

    boolean (default: false)

    Force a Robot to accept a file type it would have ignored.

    By default, Robots ignore files they are not familiar with. 🤖/video/encode, for example, will happily ignore input images.

    With the force_accept parameter set to true, you can force Robots to accept all files thrown at them. This will typically lead to errors and should only be used for debugging or combatting edge cases.

  • ignore_errors

    boolean | Array<meta | execute> (default: [])

    Ignore errors during specific phases of processing.

    Setting this to ["meta"] will cause the Robot to ignore errors during metadata extraction.

    Setting this to ["execute"] will cause the Robot to ignore errors during the main execution phase.

    Setting this to true is equivalent to ["meta", "execute"] and will ignore errors in both phases.

  • use

    string | Array<string> | Array<object> | object

    Specifies which Step(s) to use as input.

    • You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)
    • You can provide several Steps as input with arrays:
      {
        "use": [
          ":original",
          "encoded",
          "resized"
        ]
      }
      
    Tip

    That's likely all you need to know about use, but you can view Advanced use cases.

  • ffmpeg

    object

    A parameter object to be passed to FFmpeg. If a preset is used, the options specified here are merged on top of the preset's options and take precedence over them. For available options, see the FFmpeg documentation.

  • ffmpeg_stack

    v5 | v6 | v7 | string (default: "v5.0.0")

    Selects the FFmpeg stack version to use for encoding. These versions reflect real FFmpeg versions. We currently recommend using "v6.0.0".

  • format

    image | json (default: "image")

    The format of the result file. Can be "image" or "json". If "image" is supplied, a PNG image will be created, otherwise a JSON file. When style is "spectrogram", only "image" is supported.

  • width

    string | number (default: 256)

    The width of the resulting image if the format "image" was selected.

  • height

    string | number (default: 64)

    The height of the resulting image if the format "image" was selected.

  • antialiasing

    0 | 1 | boolean (default: 0)

    Set to 1 or true to enable antialiasing, which produces smoother edges in the waveform graph, or to 0 or false to disable it.

  • background_color

    string (default: "#00000000")

    The background color of the resulting image in the "rrggbbaa" format (red, green, blue, alpha), if the format "image" was selected.

  • center_color

    string (default: "000000ff")

    The color used in the center of the gradient. The format is "rrggbbaa" (red, green, blue, alpha).

  • outer_color

    string (default: "000000ff")

    The color used in the outer parts of the gradient. The format is "rrggbbaa" (red, green, blue, alpha).

  • style

    v0 | v1 | spectrogram

    Waveform style version.

    • "v0": Legacy waveform generation (default).
    • "v1": Advanced waveform generation with additional parameters.
    • "spectrogram": Spectrogram visualization showing frequency content over time.

    For backwards compatibility, numeric values 0 and 1 are also accepted and mapped to "v0" and "v1".

  • split_channels

    boolean

    Available when style is "v1". If set to true, outputs multi-channel waveform data or image files, one per channel.

  • zoom

    string | number

    Available when style is "v1". Zoom level in samples per pixel. This parameter cannot be used together with pixels_per_second.

  • pixels_per_second

    string | number

    Available when style is "v1". Zoom level in pixels per second. This parameter cannot be used together with zoom.

  • bits

    8 | 16

    Available when style is "v1". Bit depth for waveform data. Can be 8 or 16.

  • start

    string | number

    Available when style is "v1". Start time in seconds.

  • end

    string | number

    Available when style is "v1". End time in seconds (0 means end of audio).

  • colors

    audition | audacity

    Available when style is "v1". Color scheme to use. Can be "audition" or "audacity".

  • border_color

    string

    Available when style is "v1". Border color in "rrggbbaa" format.

  • waveform_style

    normal | bars

    Available when style is "v1". Waveform style. Can be "normal" or "bars".

  • bar_width

    string | number

    Available when style is "v1". Width of bars in pixels when waveform_style is "bars".

  • bar_gap

    string | number

    Available when style is "v1". Gap between bars in pixels when waveform_style is "bars".

  • bar_style

    square | rounded

    Available when style is "v1". Bar style when waveform_style is "bars".

  • axis_label_color

    string

    Available when style is "v1". Color for axis labels in "rrggbbaa" format.

  • no_axis_labels

    boolean

    Available when style is "v1". If set to true, renders waveform image without axis labels.

  • with_axis_labels

    boolean

    Available when style is "v1". If set to true, renders waveform image with axis labels.

  • amplitude_scale

    string | number

    Available when style is "v1". Amplitude scale factor.

  • compression

    string | number

    Available when style is "v1". PNG compression level: 0 (none) to 9 (best), or -1 (default). Only applicable when format is "image".

  • color_map

    viridis | plasma | magma | cividis | cool | rainbow | moreland |

    Available when style is "spectrogram". Color scheme for the spectrogram visualization. Defaults to "viridis".

  • frequency_scale

    linear | logarithmic

    Available when style is "spectrogram". Frequency scale for the spectrogram. "linear" shows frequencies evenly spaced, "logarithmic" emphasizes lower frequencies. Defaults to "logarithmic".

  • frequency_min

    string | number

    Available when style is "spectrogram". Minimum frequency in Hz to display. Defaults to 0.

  • frequency_max

    string | number

    Available when style is "spectrogram". Maximum frequency in Hz to display. Defaults to half the sample rate (Nyquist frequency).

  • legend

    boolean

    Available when style is "spectrogram". Whether to include a legend showing the frequency and time scales. Defaults to false.

  • gain

    string | number

    Available when style is "spectrogram". Linear gain factor for spectrogram intensity. Defaults to 1.

  • orientation

    vertical | horizontal

    Available when style is "spectrogram". Orientation of the spectrogram. "horizontal" shows time on the x-axis (default), "vertical" shows time on the y-axis.
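
To illustrate how the style-specific parameters above group together, here is a sketch with three independent Steps that all read the uploaded file: a "v1" bar-style image, a "v1" JSON export of the first 30 seconds, and a spectrogram. The Step names and all numeric values are arbitrary choices for illustration, and combining these particular parameters is an assumption rather than a tested recipe:

```json
{
  "steps": {
    "bars": {
      "robot": "/audio/waveform",
      "use": ":original",
      "style": "v1",
      "width": 800,
      "height": 200,
      "waveform_style": "bars",
      "bar_width": 4,
      "bar_gap": 2,
      "bar_style": "rounded",
      "no_axis_labels": true
    },
    "peaks": {
      "robot": "/audio/waveform",
      "use": ":original",
      "style": "v1",
      "format": "json",
      "bits": 8,
      "pixels_per_second": 20,
      "start": 0,
      "end": 30
    },
    "spectrogram": {
      "robot": "/audio/waveform",
      "use": ":original",
      "style": "spectrogram",
      "format": "image",
      "width": 800,
      "height": 400,
      "color_map": "magma",
      "frequency_scale": "logarithmic",
      "frequency_min": 20,
      "frequency_max": 8000,
      "legend": true
    }
  }
}
```

Note that the "peaks" Step sets only pixels_per_second, since it cannot be combined with zoom, and that the "spectrogram" Step keeps format at "image" because that style does not support "json".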

Demos

  • Service to generate waveform images from audio files

Related blog posts

  • New Robot generates waveform images from audio November 22, 2012
  • New pricing model for future Transloadit customers February 7, 2018
  • Let's Build: music card generator with Transloadit May 5, 2022
  • Creating engaging audio visualizations with Transloadit April 2, 2023
© 2009–2026 Transloadit-II GmbH