
Extract pages from a document
🤖/document/split extracts pages from a document.
This Robot can extract one or more pages from a document and output them as a separate document. This is particularly useful when, for example, only the first few pages of a larger document are needed.
Pages can either be selected for extraction through their page number (e.g. 1) or through a range (e.g. 1-10). All of the selected pages will be put into one combined document, whose file type matches the input document.
Note: Currently, this Robot only supports PDF files. To extract pages from other document formats, such as Microsoft Word (DOCX) or OpenDocument (ODT) files, first use 🤖/document/convert to convert the document into a PDF.
This Robot combines well with 🤖/file/filter. Together they can trim a document to a maximum page count whenever that limit is exceeded. See the demo for more details.
Usage example
Extract the first 10 pages from a PDF document:
{
  "steps": {
    "converted": {
      "robot": "/document/split",
      "use": ":original",
      "pages": "1-10"
    }
  }
}
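The combination with 🤖/file/filter described earlier can be sketched as a small Python helper that builds the Assembly Instructions as a dictionary and prints them as JSON. This is only a sketch under stated assumptions: the Step names (`too_long`, `within_limit`, `trimmed`) and the helper `build_trim_steps` are hypothetical, and the `accepts` condition format for 🤖/file/filter is assumed rather than taken from this page, so check that Robot's documentation before relying on it.

```python
import json


def build_trim_steps(max_pages: int) -> dict:
    """Build Assembly Instructions that cap PDFs at max_pages pages.

    Assumption: /file/filter takes an "accepts" list of
    [value, operator, value] conditions. Step names are hypothetical.
    """
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    limit = str(max_pages)
    page_count = "${file.meta.page_count}"
    return {
        "steps": {
            # Documents over the limit are routed on to the split Step.
            "too_long": {
                "robot": "/file/filter",
                "use": ":original",
                "accepts": [[page_count, ">", limit]],
            },
            # Documents within the limit pass through unchanged.
            "within_limit": {
                "robot": "/file/filter",
                "use": ":original",
                "accepts": [[page_count, "<=", limit]],
            },
            # Keep only the first max_pages pages of oversized documents.
            "trimmed": {
                "robot": "/document/split",
                "use": "too_long",
                "pages": f"1-{max_pages}",
            },
        }
    }


print(json.dumps(build_trim_steps(10), indent=2))
```

Splitting the upload into two filter Steps means a short document is never sent through 🤖/document/split at all, while a long one is trimmed to the same `"1-10"` range used in the example above.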
Parameters
- use
  String / Array of Strings / Object, required
  Specifies which Step(s) to use as input.
  - You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit).
  - You can provide several Steps as input with arrays:
    "use": [ ":original", "encoded", "resized" ]
  💡 That's likely all you need to know about use, but you can view Advanced use cases.
- pages
  String / Array of Strings, required
  This parameter specifies the pages to extract from the document. Pages can be selected either through their page number (starting at 1, e.g. "5") or through an inclusive range (e.g. "1-10").
  Multiple page selections can be expressed through either an array or a comma-separated string. For example, to select the first 5 pages and the 10th page, you can use "1-5,10" or ["1-5", "10"]. All of the selected pages will be put into one combined document.
  To select pages relative to the document's end, use dynamic evaluation. The page count is accessible through the file.meta.page_count variable. To extract the last page of a document, use "${file.meta.page_count}". To extract the last 5 pages of a document, use "${file.meta.page_count-4}-${file.meta.page_count}".
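To make the selection syntax concrete, here is a minimal Python sketch of the rules described for pages. The helpers `expand_pages` and `last_pages` are hypothetical and not part of any Transloadit SDK. `expand_pages` models only the documented syntax (1-based page numbers, inclusive ranges, comma-separated string or array); rejecting reversed ranges is an assumption of this sketch, since the text does not say how the Robot treats them.

```python
from typing import List, Union


def expand_pages(selection: Union[str, List[str]]) -> List[int]:
    """Expand "1-5,10" or ["1-5", "10"] into the selected page numbers."""
    parts = selection.split(",") if isinstance(selection, str) else list(selection)
    pages: List[int] = []
    for part in parts:
        part = part.strip()
        if "-" in part:
            start, end = (int(n) for n in part.split("-", 1))
        else:
            start = end = int(part)
        # Pages are numbered from 1; reversed ranges are rejected here
        # (an assumption of this sketch, not documented Robot behavior).
        if start < 1 or end < start:
            raise ValueError(f"invalid page selection: {part!r}")
        pages.extend(range(start, end + 1))  # ranges are inclusive
    return pages


def last_pages(count: int) -> str:
    """Build the dynamic-evaluation value selecting the last `count` pages."""
    if count < 1:
        raise ValueError("count must be at least 1")
    last = "${file.meta.page_count}"
    if count == 1:
        return last
    # The last N pages start at page_count - (N - 1).
    return f"${{file.meta.page_count-{count - 1}}}-{last}"


print(expand_pages("1-5,10"))           # [1, 2, 3, 4, 5, 10]
print(expand_pages(["1-5", "10"]))      # [1, 2, 3, 4, 5, 10]
print(last_pages(1))                    # ${file.meta.page_count}
print(last_pages(5))                    # ${file.meta.page_count-4}-${file.meta.page_count}
```

Both forms of the selection expand to the same pages, and the offset in `last_pages` is `count - 1` because the range is inclusive at both ends.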