Let's Build: a document intelligence pipeline using Transloadit
The document intelligence market is booming. Companies like Reducto.ai are gaining traction by helping enterprises extract structured data from PDFs and scanned documents. Their value proposition is compelling: upload a document, define a schema, get clean JSON back.
That said, if you want product-level control – custom routing, storage, compliance, and tight integration with the rest of your file workflows – it’s often more powerful to build your own Reducto-style experience on top of primitives.
Transloadit provides exactly those primitives. In this guide, we’ll show you how to combine our document processing Robots with the 🤖 /ai/chat Robot to create a flexible, schema-driven pipeline that you can shape into your own document AI product.
If you want a turnkey product, a dedicated document AI vendor can be a great fit. But if you need to blend document extraction with uploads, conversions, storage, and downstream workflows, building on Transloadit gives you more leverage.
What document intelligence really means
At its core, document intelligence involves three key capabilities:
- Parse – extract text from documents using OCR, preserving layout and structure.
- Split – break multi-page documents into manageable chunks.
- Extract – pull structured data that matches a predefined schema.
Let’s build each of these using Transloadit's versatile toolkit.
Setting up your TypeScript project
First, set up a project using the Transloadit Node SDK:
yarn init -y
yarn add transloadit
Note
The v4 Node SDK requires Node.js 20 or newer. If you are upgrading from v3 or CommonJS, see the migration guide.
Create your client:
import { Transloadit } from 'transloadit'
const transloadit = new Transloadit({
authKey: process.env.TRANSLOADIT_AUTH_KEY!,
authSecret: process.env.TRANSLOADIT_AUTH_SECRET!,
})
Step 1: Document parsing with OCR (optional)
The 🤖 /document/ocr Robot extracts text from PDFs, including scanned PDFs. It supports multiple providers and can return results with layout coordinates or plain text. If your source isn’t a PDF, convert it first using the 🤖 /document/convert Robot.
If you are using a PDF-capable model such as Claude Sonnet 4, you can skip OCR and send the PDF directly to the 🤖 /ai/chat Robot. OCR is still useful when you need layout coordinates or want to normalize non-PDF files first.
const parseResult = await transloadit.createAssembly({
params: {
steps: {
ocr_extract: {
robot: '/document/ocr',
use: ':original',
provider: 'gcp',
format: 'json',
granularity: 'full',
result: true,
},
},
},
files: {
document: './invoice.pdf',
},
waitForCompletion: true,
})
const ocrResults = parseResult.results.ocr_extract
The granularity: 'full' option returns bounding box coordinates for each text block, which is
useful for understanding layout.
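The Step’s output arrives as a result file; the exact JSON shape depends on the provider and granularity you picked. A minimal sketch for fetching and inspecting it:
// Sketch: download the OCR result file and inspect its JSON payload.
const ocrFile = ocrResults[0]
const ocrJson = await fetch(ocrFile.ssl_url).then((response) => response.json())
console.log(JSON.stringify(ocrJson, null, 2))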
Step 2: Document splitting
For large documents, the 🤖 /document/split Robot lets you extract specific pages:
const splitResult = await transloadit.createAssembly({
params: {
steps: {
first_pages: {
robot: '/document/split',
use: ':original',
pages: ['1-5'],
},
remaining_pages: {
robot: '/document/split',
use: ':original',
pages: ['6-'],
},
},
},
files: {
document: './large-report.pdf',
},
waitForCompletion: true,
})
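Each split Step emits its own PDF files under its Step name in the Assembly results, so you can pick up the chunks like this (a minimal sketch):
// Sketch: collect the page-range chunks produced by the two split Steps above.
const chunks = [
  ...(splitResult.results.first_pages ?? []),
  ...(splitResult.results.remaining_pages ?? []),
]
for (const chunk of chunks) {
  console.log(chunk.name, chunk.ssl_url)
}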
Step 3: Schema-driven data extraction with AI
Here’s where the magic happens. The 🤖 /ai/chat Robot can
process documents and return structured JSON that matches your schema. This is directly comparable
to Reducto’s Extract API. Claude Sonnet 4 supports PDFs, so we’ll use
model: 'anthropic/claude-4-sonnet-20250514' below. When you set format: 'json', the output is a
JSON file in the Assembly results.
Zod v4 ships a native z.toJSONSchema() helper. The snippets below use a small helper that calls it
when available and falls back to zod-to-json-schema for Zod v3 projects.
If your account does not have shared AI credentials configured, create AI Template
Credentials in the Transloadit dashboard (for OpenAI, Anthropic, or Google) and reference them
via credentials.
For a quick start, you can omit credentials and set test_credentials: true to use
Transloadit-provided test keys. While this is convenient for demos, shared keys can be rate-limited,
so production workloads should supply their own credentials.
import { z } from 'zod'
import { zodToJsonSchema } from 'zod-to-json-schema'
const toJsonSchema = (schema: z.ZodTypeAny) =>
typeof (z as { toJSONSchema?: (schema: z.ZodTypeAny) => unknown }).toJSONSchema === 'function'
? (z as { toJSONSchema: (schema: z.ZodTypeAny) => unknown }).toJSONSchema(schema)
: zodToJsonSchema(schema)
const invoiceSchema = z.object({
invoice_number: z.string(),
vendor_name: z.string(),
vendor_address: z.string().optional(),
invoice_date: z.string().optional(),
due_date: z.string().optional(),
total_amount: z.number(),
currency: z.string().optional(),
line_items: z
.array(
z.object({
description: z.string(),
quantity: z.number().optional(),
unit_price: z.number().optional(),
total: z.number().optional(),
}),
)
.optional(),
tax_amount: z.number().optional(),
payment_terms: z.string().optional(),
})
const extractionResult = await transloadit.createAssembly({
params: {
steps: {
extract_data: {
robot: '/ai/chat',
use: ':original',
credentials: 'my_ai_credentials',
model: 'anthropic/claude-4-sonnet-20250514',
format: 'json',
schema: JSON.stringify(toJsonSchema(invoiceSchema)),
messages: `Extract all invoice data from this document.
Be precise with amounts and dates.
If a field is not present, omit it from the response.`,
result: true,
},
},
},
files: {
invoice: './invoice.pdf',
},
waitForCompletion: true,
})
const extractedFile = extractionResult.results.extract_data[0]
const invoiceData = await fetch(extractedFile.ssl_url).then((response) => response.json())
console.log(`Invoice #${invoiceData.invoice_number}: $${invoiceData.total_amount}`)
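Because the schema already exists as a Zod object, you can also validate the model’s output before passing it downstream:
// Validate the AI output against the same Zod schema we sent to the model.
const parsed = invoiceSchema.safeParse(invoiceData)
if (!parsed.success) {
  throw new Error(`Extraction did not match the schema: ${parsed.error.message}`)
}
const invoice = parsed.data // typed as z.infer<typeof invoiceSchema>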
Building a complete pipeline
Now let’s combine everything into a production-ready pipeline that:
- optionally extracts text with OCR,
- splits large files,
- extracts structured data with AI, and
- stores results in S3.
import { z } from 'zod'
import { zodToJsonSchema } from 'zod-to-json-schema'
import { Transloadit } from 'transloadit'
const toJsonSchema = (schema: z.ZodTypeAny) =>
typeof (z as { toJSONSchema?: (schema: z.ZodTypeAny) => unknown }).toJSONSchema === 'function'
? (z as { toJSONSchema: (schema: z.ZodTypeAny) => unknown }).toJSONSchema(schema)
: zodToJsonSchema(schema)
const financialDocumentSchema = z.object({
document_type: z.enum(['invoice', 'receipt', 'statement', 'contract']),
document_date: z.string(),
parties: z.array(
z.object({
name: z.string(),
role: z.enum(['vendor', 'customer', 'signatory']),
address: z.string().optional(),
}),
),
amounts: z.array(
z.object({
description: z.string(),
value: z.number(),
currency: z.string(),
}),
),
key_terms: z.array(z.string()).optional(),
summary: z.string(),
})
type FinancialDocument = z.infer<typeof financialDocumentSchema>
const transloadit = new Transloadit({
authKey: process.env.TRANSLOADIT_AUTH_KEY!,
authSecret: process.env.TRANSLOADIT_AUTH_SECRET!,
})
async function processFinancialDocument(filePath: string) {
const result = await transloadit.createAssembly({
params: {
steps: {
pdf_verified: {
robot: '/file/filter',
use: ':original',
accepts: [['${file.mime}', '==', 'application/pdf']],
},
non_pdf: {
robot: '/file/filter',
use: ':original',
accepts: [['${file.mime}', '!=', 'application/pdf']],
},
pdf_converted: {
robot: '/document/convert',
use: 'non_pdf',
format: 'pdf',
},
extract_structured: {
robot: '/ai/chat',
use: ['pdf_verified', 'pdf_converted'],
credentials: 'ai_credentials',
model: 'anthropic/claude-4-sonnet-20250514',
format: 'json',
schema: JSON.stringify(toJsonSchema(financialDocumentSchema)),
messages: 'Extract the structured data from this document.',
result: true,
},
store_results: {
robot: '/s3/store',
use: [':original', 'extract_structured'],
credentials: 's3_credentials',
path: 'documents/${file.name}/',
},
},
},
files: {
document: filePath,
},
waitForCompletion: true,
})
if (result.ok !== 'ASSEMBLY_COMPLETED') {
throw new Error(`Assembly failed: ${result.error}`)
}
const extractedFile = result.results.extract_structured[0]
return fetch(extractedFile.ssl_url).then((response) => response.json())
}
const data = await processFinancialDocument('./contract.pdf')
console.log(`Processed ${data.document_type}: ${data.summary}`)
The /document/convert → PDF path supports the following input types:
- Word: .doc, .docx
- PowerPoint: .ppt, .pptx, .pps, .ppz, .pot
- Excel: .xls, .xlsx, .xla
- OpenDocument: .odt, .ott, .odd, .oda
- Web/markup: .html, .xhtml, .xml, Markdown (.md)
- Text & rich text: .txt, .csv, .rtf, .rtx, .tex/LaTeX
- Images: .jpg, .jpeg, .png, .gif, .svg
- Vector/print: .ai, .eps, .ps
If you need OCR output for layout-aware workflows, insert a /document/ocr Step (named ocr_text below; the name is our choice), point the extraction Step’s use at it, and include it in your storage targets.
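A sketch of the modified Steps in the pipeline above (whether you feed the model the OCR output or keep pointing it at the PDF Steps depends on your model and use case):
// Sketch: add a layout-aware OCR Step and wire it into extraction and storage.
ocr_text: {
  robot: '/document/ocr',
  use: ['pdf_verified', 'pdf_converted'],
  provider: 'gcp',
  format: 'json',
  granularity: 'full',
  result: true,
},
extract_structured: {
  robot: '/ai/chat',
  use: 'ocr_text', // previously ['pdf_verified', 'pdf_converted']
  credentials: 'ai_credentials',
  model: 'anthropic/claude-4-sonnet-20250514',
  format: 'json',
  schema: JSON.stringify(toJsonSchema(financialDocumentSchema)),
  messages: 'Extract the structured data from this document.',
  result: true,
},
store_results: {
  robot: '/s3/store',
  use: [':original', 'extract_structured', 'ocr_text'],
  credentials: 's3_credentials',
  path: 'documents/${file.name}/',
},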
Processing multiple document types
With the 🤖 /file/filter Robot, you can route PDFs
directly to the model while converting everything else to PDF first. Using != makes the non‑PDF
branch explicit:
// Reuse invoiceSchema + toJsonSchema from above.
const multiTypeResult = await transloadit.createAssembly({
params: {
steps: {
pdf_verified: {
robot: '/file/filter',
use: ':original',
accepts: [['${file.mime}', '==', 'application/pdf']],
},
non_pdf: {
robot: '/file/filter',
use: ':original',
accepts: [['${file.mime}', '!=', 'application/pdf']],
},
pdf_converted: {
robot: '/document/convert',
use: 'non_pdf',
format: 'pdf',
},
extract_data: {
robot: '/ai/chat',
use: ['pdf_verified', 'pdf_converted'],
credentials: 'ai_credentials',
model: 'anthropic/claude-4-sonnet-20250514',
format: 'json',
schema: JSON.stringify(toJsonSchema(invoiceSchema)),
messages: 'Extract data from this document.',
result: true,
},
},
},
files: {
doc1: './receipt.pdf',
doc2: './receipt-photo.jpg',
},
waitForCompletion: true,
})
Flow overview:
:original
├─ pdf_verified (file/filter: mime == pdf) ──▶ /ai/chat
└─ non_pdf (file/filter: mime != pdf) ──▶ /document/convert (pdf) ──▶ /ai/chat
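Because Transloadit runs each uploaded file through the pipeline, the extract_data Step emits one JSON result per document. A sketch for collecting them, reusing the fetch pattern from earlier:
// One structured JSON result per uploaded document.
const extractedDocs = await Promise.all(
  (multiTypeResult.results.extract_data ?? []).map((file) =>
    fetch(file.ssl_url).then((response) => response.json()),
  ),
)
console.log(`Extracted ${extractedDocs.length} document(s)`)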
Using Templates for reusability
For production use, save your Assembly Instructions as a Template and reference it by ID:
const result = await transloadit.createAssembly({
params: {
template_id: 'your-invoice-extraction-template',
fields: {
custom_prompt: 'Focus on extracting payment terms and due dates.',
},
},
files: {
document: './invoice.pdf',
},
waitForCompletion: true,
})
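Inside the Template, fields are available as Assembly Variables, so the prompt can be parameterized. A sketch of how the /ai/chat Step in that Template could pick up the custom_prompt field we pass above:
// Sketch: inside the Template's Assembly Instructions, reference the field in the prompt.
extract_data: {
  robot: '/ai/chat',
  use: ':original',
  credentials: 'ai_credentials',
  model: 'anthropic/claude-4-sonnet-20250514',
  format: 'json',
  // schema and the remaining parameters as in the earlier examples
  messages: 'Extract all invoice data from this document. ${fields.custom_prompt}',
  result: true,
},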
Handling large document batches
To process a large batch of documents efficiently, keep concurrency limited:
import pMap from 'p-map'
async function processBatch(files: string[]): Promise<Map<string, FinancialDocument>> {
const concurrency = 5
const batchResults = await pMap(
files,
async (file) => {
const result = await transloadit.createAssembly({
params: {
template_id: 'document-extraction-template',
},
files: { document: file },
waitForCompletion: true,
})
// Assumes the template includes a step named `extract_structured`.
const extractedFile = result.results.extract_structured[0]
const data = await fetch(extractedFile.ssl_url).then((response) => response.json())
return {
file,
data,
}
},
{ concurrency },
)
return new Map(batchResults.map(({ file, data }) => [file, data]))
}
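Usage then looks like this (the file names are just examples):
const processed = await processBatch(['./invoice-jan.pdf', './invoice-feb.pdf'])
for (const [file, doc] of processed) {
  console.log(`${file}: ${doc.document_type} -> ${doc.summary}`)
}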
Why teams choose Transloadit for document AI
- A single API covers ingest, conversion, OCR, AI extraction, and delivery.
- Import/export integrations for cloud storage (S3, Azure, Google Cloud Storage, Dropbox, and more).
- Assembly Instructions let you version, reuse, and branch pipelines without glue code.
- A single vendor for documents and broader file workloads (previews, thumbnails, virus scanning, image/audio/video processing).
Transloadit vs. dedicated document AI platforms
Think of Reducto as a focused product and Transloadit as a composable platform. You can replicate the core extraction flow and then extend it with everything around it.
| Feature | Transloadit | Reducto.ai |
|---|---|---|
| OCR / text extraction | ✅ 🤖 /document/ocr (PDFs) | ✅ Parse API |
| Document splitting | ✅ 🤖 /document/split | ✅ Split API |
| Schema-based extraction | ✅ 🤖 /ai/chat with JSON schema | ✅ Extract API |
| Storage integrations | ✅ Import/export to cloud storage | External tooling |
| Workflow orchestration | ✅ Assembly Instructions + Templates | External tooling |
| AI providers | ✅ OpenAI, Anthropic, Google (with credentials) | — |
| Broader file workloads | ✅ Image/video/audio + previews + security | Document-focused stack |
When to use this approach
This Transloadit-based approach is a strong fit when:
- you want to build your own document AI product or internal platform,
- you want one vendor for ingest, conversion, extraction, and delivery,
- you need to combine document intelligence with broader media workflows, and
- you want flexibility in choosing AI providers and predictable pricing.
Ready to build?
Document intelligence doesn’t have to mean adding another specialized vendor. With the 🤖 /document/ocr, 🤖 /document/split, and 🤖 /ai/chat Robots, you can build sophisticated extraction pipelines that rival dedicated platforms.
Start by exploring our AI Robot docs and see how far you can take your document workflows inside one platform.