Our /image/generate Robot

Generate images from text prompts

🤖/image/generate generates images from text prompts using AI.

This Robot uses generative AI to create images based on text prompts.

With advanced AI models, you can generate high-quality images by describing what you want to see. The generated images can be customized with various parameters like dimensions, style, and more depending on the model you choose.

Supported models

This Robot supports several AI models for image generation, each with different capabilities:

| Model | Formats | Seed | Aspect ratios | Custom dimensions | Styles | Cost (USD) |
| --- | --- | --- | --- | --- | --- | --- |
| flux-1.1-pro-ultra | PNG, JPG | Yes | 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16, 9:21 | No | No | ~$0.08 per image |
| flux-schnell | PNG, JPG, WebP | Yes | 1:1, 16:9, 21:9, 3:2, 2:3, 4:5, 5:4, 3:4, 4:3, 9:16, 9:21 | No | No | ~$0.004 per image |
| recraft-v3 | SVG | No | No | Yes | No | ~$0.11 per image |
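
To make the table easier to act on, here is a minimal sketch that encodes it as a Python dictionary and checks a Step against it before you submit it. The names `MODELS` and `check_step` are hypothetical and not part of any Transloadit SDK. The lowercase format values (`"png"`, `"jpg"`, `"webp"`, `"svg"`) are an assumption based on the documented default of `"png"`.

```python
# Capabilities copied from the table of supported models above.
# MODELS and check_step are illustrative names, not a Transloadit API.
FLUX_RATIOS = {"21:9", "16:9", "3:2", "4:3", "5:4", "1:1",
               "4:5", "3:4", "2:3", "9:16", "9:21"}

MODELS = {
    "flux-1.1-pro-ultra": {"formats": {"png", "jpg"}, "seed": True,
                           "aspect_ratios": FLUX_RATIOS,
                           "custom_dimensions": False, "styles": False},
    "flux-schnell": {"formats": {"png", "jpg", "webp"}, "seed": True,
                     "aspect_ratios": FLUX_RATIOS,
                     "custom_dimensions": False, "styles": False},
    "recraft-v3": {"formats": {"svg"}, "seed": False,
                   "aspect_ratios": set(),
                   "custom_dimensions": True, "styles": False},
}


def check_step(step):
    """Return a list of conflicts between a Step and the table (empty if none)."""
    caps = MODELS.get(step.get("model"))
    if caps is None:
        return [f"unknown model: {step.get('model')!r}"]
    problems = []
    # Assumption: format values are the lowercase names from the table.
    if "format" in step and step["format"] not in caps["formats"]:
        problems.append(f"format {step['format']!r} is not listed for this model")
    if step.get("seed") is not None and not caps["seed"]:
        problems.append("seed is not listed for this model")
    ratio = step.get("aspect_ratio")
    if ratio is not None and ratio not in caps["aspect_ratios"]:
        problems.append(f"aspect_ratio {ratio!r} is not listed for this model")
    has_size = step.get("width") is not None or step.get("height") is not None
    if has_size and not caps["custom_dimensions"]:
        problems.append("width/height are not listed for this model")
    if step.get("style") is not None and not caps["styles"]:
        problems.append("style is not listed for this model")
    return problems


print(check_step({"model": "flux-schnell", "format": "webp", "seed": 42}))
print(check_step({"model": "recraft-v3", "seed": 1}))
```

An empty list only means the Step does not conflict with the table; it does not guarantee that the API will accept it.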

Pricing

This Robot leverages top-tier AI technologies behind the scenes. For this, we partner with multiple providers and can continually pick the best-performing and most cost-effective option.

The price paid reflects the actual usage fees of these AI services, plus a margin that allows us to maintain and evolve these transformations at scale. This ensures pricing that will only go down as we find cost optimizations, and removes the complexity of managing different AI providers yourself. The cost depends on the selected model and is listed in the table above, though this is subject to change based on provider costs. We will always ensure you get the best AI capabilities in your Transloadit workflows.

The cost in USD is converted to usage in bytes based on your current plan's price per included GB, which is your plan's monthly price divided by its included GB per month. If you are on a free plan or a plan with no included GB, a price per included GB of 2 USD is used. In this sense, your plan's included GB are better regarded as credits than as actual transcoding volume. This was already the case, as we gave discounts for cost-effective workloads and, for instance, counted only every tenth byte. In the generative AI space, this works in the opposite direction: workloads involve relatively small file sizes while consuming GB credits more quickly.
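
As a rough illustration of the conversion described above, the following sketch computes how many GB of plan usage one generated image would consume. The function names are hypothetical, and the paid plan's numbers are made up for the example; only the 2 USD fallback and the per-image cost come from this page. The result is left in GB, since the exact byte count depends on how a GB is defined.

```python
FALLBACK_USD_PER_GB = 2.0  # stated above for free plans or plans without included GB


def price_per_included_gb(monthly_price_usd, included_gb):
    """Monthly plan price divided by included GB, or the fallback if undefined."""
    if monthly_price_usd <= 0 or included_gb <= 0:
        return FALLBACK_USD_PER_GB
    return monthly_price_usd / included_gb


def usage_in_gb(cost_usd, monthly_price_usd, included_gb):
    """GB of plan usage consumed by a generation that costs cost_usd."""
    return cost_usd / price_per_included_gb(monthly_price_usd, included_gb)


# Free plan: one ~0.08 USD image at the 2 USD per GB fallback.
free = usage_in_gb(0.08, monthly_price_usd=0, included_gb=0)

# Hypothetical paid plan (made-up numbers): 100 USD per month, 40 GB included.
paid = usage_in_gb(0.08, monthly_price_usd=100, included_gb=40)

print(f"{free:.3f} GB, {paid:.3f} GB")  # prints "0.040 GB, 0.032 GB"
```

Under these assumptions, a single ~$0.08 image counts as about 0.04 GB of usage on a free plan, far more than the size of the image file itself.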

Usage example

Generate an image based on a text prompt:

{
  "steps": {
    "generate_image": {
      "robot": "/image/generate",
      "prompt": "A serene landscape with mountains and a lake at sunset",
      "model": "flux-schnell"
    }
  }
}

Parameters

  • use

    String / Array of Strings / Object required

    Specifies which Step(s) to use as input.

    • You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)

    • You can provide several Steps as input with arrays:

      "use": [
        ":original",
        "encoded",
        "resized"
      ]
      

    💡 That's likely all you need to know about use, but you can view Advanced use cases.

  • output_meta

    Object / Boolean ⋅ default: {}

    Allows you to specify a set of metadata that is more expensive on CPU power to calculate, and thus is disabled by default to keep your Assemblies processing fast.

    For images, you can add "has_transparency": true in this object to extract if the image contains transparent parts and "dominant_colors": true to extract an array of hexadecimal color codes from the image.

    For videos, you can add "colorspace": true to extract the colorspace of the output video.

    For audio, you can add "mean_volume": true to get a single value representing the mean volume of the audio file.

    You can also set this to false to skip metadata extraction and speed up transcoding.

  • model

    String required

    The AI model to use for image generation. Please see the table of supported models for all available options and their capabilities.

  • prompt

    String required

    The text prompt describing the image you want to generate.

    Be as descriptive as possible for best results. Include details about style, lighting, composition, and subject matter.

  • format

    String ⋅ default: "png"

    The output format for the generated image.

    Please see the table of supported models for the format support per model.

  • seed

    Integer ⋅ default: null

    A seed value for deterministic generation. Using the same seed with the same prompt and model will produce similar results. This allows for reproducible outputs.

    Please see the table of supported models for the seed support per model.

  • aspect_ratio

    String ⋅ default: null

    The aspect ratio of the generated image.

    Please see the table of supported models for the aspect ratio support per model.

  • height

    Integer ⋅ default: null

    The height of the generated image in pixels.

    Please see the table of supported models for the dimensions support per model.

  • width

    Integer ⋅ default: null

    The width of the generated image in pixels.

    Please see the table of supported models for the dimensions support per model.

  • style

    String ⋅ default: null

    The artistic style to apply to the generated image.

    Please see the table of supported models for the style support per model.
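
The parameters above can be combined into Steps programmatically. The sketch below is one way to do that in Python, leaving out any option set to None so the Robot's defaults apply. The helper `build_generate_step`, the Step names, the second prompt, the seed, and the 1024 pixel dimensions are all illustrative assumptions, and the lowercase format values are assumed from the documented default of `"png"`. Like the usage example, it omits `use`.

```python
import json


def build_generate_step(prompt, model, **options):
    """Build an /image/generate Step, leaving out options that are None."""
    step = {"robot": "/image/generate", "prompt": prompt, "model": model}
    step.update({key: value for key, value in options.items() if value is not None})
    return step


steps = {
    # Raster output: the table lists formats, seed, and aspect ratios for flux-schnell.
    "raster": build_generate_step(
        "A serene landscape with mountains and a lake at sunset",
        "flux-schnell",
        format="webp",
        seed=42,  # arbitrary value; reuse it to get similar results
        aspect_ratio="16:9",
        output_meta={"has_transparency": True, "dominant_colors": True},
    ),
    # Vector output: the table lists SVG and custom dimensions for recraft-v3.
    "vector": build_generate_step(
        "A minimalist mountain logo",
        "recraft-v3",
        format="svg",
        width=1024,   # assumed size, not a documented limit
        height=1024,
        style=None,   # None values are dropped from the Step
    ),
}

print(json.dumps({"steps": steps}, indent=2))
```

The printed JSON has the same shape as the usage example, so it can be pasted into a Template or sent as Assembly Instructions.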
