Our /document/thumbs Robot

Extract thumbnail images from documents

🤖/document/thumbs generates an image for each page in a PDF file, or an animated GIF file that loops through all pages.

Things to keep in mind

  • If you convert a multi-page PDF file into several images, the result images are sorted in page order: the first image is the thumbnail of the first document page, and so on.
  • You can also check the meta.thumb_index key of each result image to find out which page it corresponds to. Keep in mind that these thumb indices start at 0, not at 1.
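
As a minimal sketch of the index convention described above, the Python helpers below order result images by page and turn the 0-based meta.thumb_index into a 1-based page number. The shape of the example entries (a "name" plus a "meta" dict) is an assumption made for illustration; only the meta.thumb_index key and its 0-based numbering come from the notes above.

```python
def page_number(result):
    """Return the 1-based page number for a result image.

    meta.thumb_index starts at 0, so the first page has thumb_index 0.
    """
    return result["meta"]["thumb_index"] + 1


def sort_by_page(results):
    """Return the results ordered by the page they were generated from."""
    return sorted(results, key=lambda r: r["meta"]["thumb_index"])


# Hypothetical result entries; real results carry more keys than this.
example_results = [
    {"name": "c.png", "meta": {"thumb_index": 2}},
    {"name": "a.png", "meta": {"thumb_index": 0}},
    {"name": "b.png", "meta": {"thumb_index": 1}},
]

pages = [(r["name"], page_number(r)) for r in sort_by_page(example_results)]
```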

Parameters

  • use

    String / Array of Strings / Object ⋅ required

    Specifies which Step(s) to use as input.

    • You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)

    • You can provide several Steps as input with arrays:

      "use": [
        ":original",
        "encoded",
        "resized"
      ]
      

    💡 That’s likely all you need to know about use, but you can view advanced use cases:

    › Advanced use cases
    • Step bundling. Some Robots can gather several Step results for a single invocation. For example, 🤖/file/compress would normally create one archive for each file passed to it. If you set bundle_steps to true, however, it creates one archive containing all the result files from all Steps you give it. To enable bundling, provide an object like the one below to the use parameter:

      "use": {
        "steps": [
          ":original",
          "encoded",
          "resized"
        ],
        "bundle_steps": true
      }
      

      This is also a crucial parameter for 🤖/video/adaptive. Without it, you will generate one playlist for each viewing quality.
      Keep in mind that all input Steps must be present in your Template. If one of them is missing (for instance, because it is rejected by a filter), no result is generated, because the Robot waits indefinitely for all input Steps to finish.

      Here’s a demo that showcases Step bundling.

    • Group by original. Sticking with the 🤖/file/compress example, you can set group_by_original to true in order to create a separate archive for each of your uploaded or imported files, instead of creating one archive containing all originals (or one per resulting file). This is important for 🤖/media/playlist, where you'd typically set:

      "use": {
        "steps": [
          "segmented"
        ],
        "bundle_steps": true,
        "group_by_original": true
      }
      
    • Fields. You can be more selective by only using files that match a field name, by setting the fields property. When this array is specified, the corresponding Step will only be executed for files submitted through one of the given field names. These names correspond to, for instance, the name attribute of the HTML file input tag. When using a back-end SDK, the field name corresponds to myFieldName1 in e.g.: $transloadit->addFile('myFieldName1', './chameleon.jpg').

      This parameter is set to true by default, meaning all fields are accepted.

      Example:

      "use": {
        "steps": [ ":original" ],
        "fields": [ "myFieldName1" ]
      }
      
    • Use as. Sometimes Robots take several inputs. For instance, 🤖/video/merge can create a slideshow from audio and images. You can map different Steps to the appropriate inputs.

      Example:

      "use": {
        "steps": [
          { "name": "audio_encoded", "as": "audio" },
          { "name": "images_resized", "as": "image" }
        ]
      }
      

      Sometimes the ordering is important, for instance with our concat Robots. In these cases, you can append an index that starts at 1 to the as value, such as video_1. You can also optionally filter by the multipart field name, as in this example, where all files come from the same source (end-user uploads) but from <input> tags with different names:

      Example:

      "use": {
        "steps": [
          { "name": ":original", "fields": "myFirstVideo", "as": "video_1" },
          { "name": ":original", "fields": "mySecondVideo", "as": "video_2" },
          { "name": ":original", "fields": "myThirdVideo", "as": "video_3" }
        ]
      }
      

      For times when it is not apparent where we should put the file, you can use Assembly Variables to be specific. For instance, you may want to pass a text file to 🤖/image/resize to burn the text in an image, but you are burning multiple texts, so where do we put the text file? We specify it via ${use.text_1}, to indicate the first text file that was passed.

      Example:

      "watermarked": {
        "robot": "/image/resize",
        "use"  : {
          "steps": [
            { "name": "resized", "as": "base" },
            { "name": "transcribed", "as": "text" }
          ]
        },
        "text": [
          {
            "text"  : "Hi there",
            "valign": "top",
            "align" : "left"
          },
          {
            "text"    : "From the 'transcribed' Step: ${use.text_1}",
            "valign"  : "bottom",
            "align"   : "right",
            "x_offset": 16,
            "y_offset": -10
          }
        ]
      }
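
Since a Robot waits indefinitely when one of its input Steps is missing, it can help to check a Template before submitting it. The Python sketch below collects the Step names referenced by a use value in any of the shapes shown above (a string, an array, or an object with a steps list) and reports the ones that are not defined. The function names and the `steps` dictionary layout (Step name mapped to Step definition) are assumptions for illustration and not part of any Transloadit SDK.

```python
def input_step_names(use):
    """Collect the Step names referenced by a `use` value.

    Handles a string, an array of strings, and an object whose "steps"
    entries are strings or objects with a "name" key.
    """
    if isinstance(use, dict):
        use = use.get("steps", [])
    if isinstance(use, str):
        return [use]
    return [e["name"] if isinstance(e, dict) else e for e in use]


def missing_input_steps(steps, step_name):
    """Return the input Steps that `step_name` uses but `steps` lacks."""
    used = input_step_names(steps[step_name]["use"])
    # ":original" is reserved for uploads, so this sketch does not
    # require it to be defined alongside the other Steps.
    return [n for n in used if n != ":original" and n not in steps]


# Hypothetical Template Steps: "transcribed" is referenced but not defined.
steps = {
    "resized": {"robot": "/image/resize", "use": ":original"},
    "watermarked": {
        "robot": "/image/resize",
        "use": {
            "steps": [
                {"name": "resized", "as": "base"},
                {"name": "transcribed", "as": "text"},
            ]
        },
    },
}

missing = missing_input_steps(steps, "watermarked")
```

Here `missing` lists the Step names that would leave the "watermarked" Step waiting.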
      
  • page

    Integer / Null ⋅ default: null

    The PDF page that you want to convert to an image. By default, the value is null, which means that all pages will be converted into images.

  • format

    String ⋅ default: "png"

    The format of the extracted image(s). Supported values are "jpeg", "jpg", "gif" and "png".

    If you specify the value "gif", then an animated gif cycling through all pages is created. Please check out this demo to learn more about this.

  • delay

    Integer / Null ⋅ default: null

    If your output format is "gif", then this parameter sets the delay, in hundredths of a second, before the next frame of the animation is shown. For example, set this to 100 to allow 1 second to pass between the frames of the animated gif.

    If your output format is not "gif", then this parameter does not have any effect.
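
Because delay is expressed in hundredths of a second, a small conversion helper avoids off-by-a-factor mistakes. The sketch below is illustrative only: the helper name and the Step dictionary are assumptions, while the "format" and "delay" keys and their meaning come from the parameter descriptions above.

```python
def delay_for_seconds(seconds):
    """Convert a pause in seconds to the `delay` unit (hundredths of a second)."""
    return round(seconds * 100)


# Hypothetical Step that builds an animated gif with 1.5 s between pages.
gif_step = {
    "robot": "/document/thumbs",
    "use": ":original",
    "format": "gif",
    "delay": delay_for_seconds(1.5),
}
```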

  • width

    Integer(1-5000) ⋅ default: auto

    Width of the new image, in pixels. If not specified, it defaults to the width of the input image.

  • height

    Integer(1-5000) ⋅ default: auto

    Height of the new image, in pixels. If not specified, it defaults to the height of the input image.

  • resize_strategy

    String ⋅ default: "pad"
  • background

    String ⋅ default: "#FFFFFF"

    Either the hexadecimal code or name of the color used to fill the background (only used for the pad resize strategy).

    By default, the background of transparent images is changed to white. For details about how to preserve transparency across all image types, see this demo.

  • alpha

    String ⋅ default: ""

    Changes how the alpha channel of the resulting image should work. Valid values include "Set" to enable transparency and "Remove" to remove transparency.

    For a list of all valid values please check the ImageMagick documentation here.

  • density

    String / Null ⋅ default: null

    While in-memory quality and file format depth specify the color resolution, the density of an image is its spatial resolution. Density is measured in pixels per inch and defines how far apart (or how big) the individual pixels are. It determines the size of the image in real-world terms when it is displayed on devices or printed.

    You can set this value to a specific width or in the format widthxheight.

    If your converted image has a low resolution, please try using the density parameter to resolve that.
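
To get a feel for how density relates to pixel dimensions, the arithmetic can be sketched in Python. This assumes the general PDF convention of 72 points per inch, which is not stated on this page, and it only illustrates the raster math. The Robot's actual output size may differ, for example when trim_whitespace, width, or height apply.

```python
POINTS_PER_INCH = 72  # Assumption: PDF page sizes are expressed in points.


def rendered_size(width_pt, height_pt, density):
    """Estimate pixel dimensions for a page rendered at `density` pixels per inch."""
    width_px = round(width_pt * density / POINTS_PER_INCH)
    height_px = round(height_pt * density / POINTS_PER_INCH)
    return width_px, height_px


# A US Letter page is 612 x 792 points (8.5 x 11 inches).
low_res = rendered_size(612, 792, 72)
high_res = rendered_size(612, 792, 300)
```

Raising the density from 72 to 300 multiplies each dimension by roughly 4.2, which is why a higher density can help with low-resolution output.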

  • colorspace

    String ⋅ default: ""

    Sets the image colorspace. For details about the available values, see the ImageMagick documentation.

    Please note that if you were using "RGB", we recommend using "sRGB". ImageMagick might try to find the most efficient colorspace based on the color of an image, and default to e.g. "Gray". To force colors, you might then have to use this parameter.

  • trim_whitespace

    Boolean ⋅ default: true

    This determines whether additional whitespace around the PDF should be trimmed away before it is converted to an image. If you set this to true, only the real PDF page contents will be shown in the image.

    If you need to reflect the PDF's dimensions in your image, it is generally a good idea to set this to false.

  • pdf_use_cropbox

    Boolean ⋅ default: true

    Some PDF documents lie about their dimensions. For instance, they'll say they are landscape, but when opened in decent desktop readers, they are really in portrait mode. This can happen if the document has a cropbox defined. When this option is enabled (the default), the cropbox takes precedence in determining the dimensions of the resulting thumbnails.

  • output_meta

    Object / Boolean ⋅ default: {}

    Generally, this parameter allows you to specify a set of metadata that is more expensive in CPU time to calculate, and is thus disabled by default to keep your Assemblies processing fast.

    This Robot only supports the default value of {} (meaning all meta data will be extracted) and false. A value of false means that only width, height, size and thumb_index will be extracted for the result images, which would also provide a great performance boost for documents with many pages.
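
Putting the parameters together, the Python sketch below assembles a Step definition and checks two of the documented constraints before it is sent: the supported format values and the 1-5000 range for width and height. The `build_thumbs_step` helper and the "thumbnailed" Step name are hypothetical and chosen for illustration; the parameter names and limits are the ones listed above.

```python
import json

SUPPORTED_FORMATS = {"jpeg", "jpg", "gif", "png"}


def build_thumbs_step(use=":original", **params):
    """Return a /document/thumbs Step dict after basic validation."""
    fmt = params.get("format", "png")  # "png" is the documented default
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    for key in ("width", "height"):
        value = params.get(key)
        if value is not None and not 1 <= value <= 5000:
            raise ValueError(f"{key} must be between 1 and 5000")
    return {"robot": "/document/thumbs", "use": use, **params}


# Skip the more expensive metadata and keep the page dimensions intact.
step = build_thumbs_step(
    format="jpg",
    width=400,
    trim_whitespace=False,
    output_meta=False,
)

# Serialize under a hypothetical Step name.
step_json = json.dumps({"thumbnailed": step}, indent=2)
```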

ImageMagick parameters

  • imagemagick_stack

    String ⋅ default: "v2.0.9"

    Selects the ImageMagick stack version to use for encoding. These versions do not reflect any real ImageMagick versions. They reflect our own internal (non-semantic) versioning for our custom ImageMagick builds. We currently recommend using "v3.0.0".

    Supported values: "v2.0.9", "v3.0.0".

    A full comparison of supported formats, per stack, can be found here.
