How to Extract Images from PDF Files with cURL

Why Extract PDF Images with cURL?

The pdfRest Extract Images API Tool pulls the images out of a PDF document through an API call. Because the extraction is a plain HTTP request, it can be scripted for batch processing or built into a larger workflow. This tutorial shows how to send that API call with cURL.

Imagine you are working for a company that receives numerous PDF reports daily, each containing valuable images that need to be archived separately. Instead of manually extracting each image, you can use the Extract Images API Tool to automate this task, saving time and reducing the potential for human error. This automation can streamline workflows and ensure that image assets are efficiently managed and stored.

Extract PDF Images with cURL Code Example

curl -X POST "https://api.pdfrest.com/extracted-images" \
  -H "Accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -H "Api-Key: xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" \
  -F "file=@/path/to/file" \
  -F "output=example_out" \
  -F "pages=1-last"

Source: GitHub - pdf-rest-api-samples
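
For the batch scenario described earlier, the same request can be wrapped in a small shell function and run once per file. The following is only a sketch under stated assumptions: the function names, the PDFREST_API_KEY and DRY_RUN variables, and the "_images" output suffix are hypothetical choices for this example and are not part of the pdfRest samples.

```shell
#!/usr/bin/env bash
# Sketch: run the tutorial's request for every PDF in a directory.
# PDFREST_API_KEY, DRY_RUN, and all function names are hypothetical.

# Derive an output prefix from a PDF path: reports/q1.pdf -> q1_images
output_prefix() {
  local name
  name=$(basename "$1")
  printf '%s_images' "${name%.pdf}"
}

# Send one PDF to the endpoint; with DRY_RUN=1, only print the plan.
extract_images() {
  local pdf=$1
  local prefix
  prefix=$(output_prefix "$pdf")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'would upload %s with output=%s\n' "$pdf" "$prefix"
    return 0
  fi
  curl -X POST "https://api.pdfrest.com/extracted-images" \
    -H "Accept: application/json" \
    -H "Content-Type: multipart/form-data" \
    -H "Api-Key: ${PDFREST_API_KEY:?set PDFREST_API_KEY first}" \
    -F "file=@${pdf}" \
    -F "output=${prefix}" \
    -F "pages=1-last"
}

# Process every .pdf file directly inside the given directory.
extract_all() {
  local dir=$1 pdf
  for pdf in "$dir"/*.pdf; do
    [ -e "$pdf" ] || continue   # skip the literal pattern when nothing matches
    extract_images "$pdf"
  done
}
```

Running `DRY_RUN=1 extract_all /path/to/reports` lists what would be uploaded without contacting the API, which is a cheap way to check the file list before spending API calls.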

Breaking Down the Code

Let's break down the provided cURL command to understand each component:

curl -X POST "https://api.pdfrest.com/extracted-images"

This line initiates a POST request to the Extract Images endpoint of the pdfRest API. The POST method is used here because we are sending data to the server to process.

-H "Accept: application/json"

This header indicates that the client expects the server to respond with JSON-formatted data. JSON is a lightweight data interchange format that is easy for humans to read and write.
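
Since the response comes back as JSON, a script will usually want to save the body and check the HTTP status before using it. A minimal sketch, assuming a hypothetical wrapper named post_and_check and a response file named response.json; it makes no assumptions about the fields inside the JSON:

```shell
#!/usr/bin/env bash
# is_success CODE: succeed only for 2xx HTTP status codes.
is_success() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# post_and_check ARGS...: hypothetical wrapper; ARGS are passed to curl as-is.
# -sS hides the progress meter but keeps error messages,
# -o writes the response body to response.json,
# -w '%{http_code}' prints the HTTP status code on stdout.
post_and_check() {
  local http_code
  http_code=$(curl -sS -o response.json -w '%{http_code}' "$@") || return 1
  if ! is_success "$http_code"; then
    echo "request failed with HTTP $http_code; see response.json" >&2
    return 1
  fi
  cat response.json
}
```

The wrapper takes the same arguments as the command in this tutorial, for example `post_and_check -X POST "https://api.pdfrest.com/extracted-images" -H "Accept: application/json" ...`.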

-H "Content-Type: multipart/form-data"

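This header specifies that the request body will be sent as multipart/form-data. This encoding is necessary when uploading files because it allows the file to be sent as binary data alongside the other form fields. cURL's -F option already sends the body as multipart/form-data, so the header here simply makes that explicit.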

-H "Api-Key: xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

The API key is a unique identifier used to authenticate the request. You must replace the placeholder with your actual API key to authorize your request.
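
To avoid pasting the key into every command or committing it to a script, one option is to read it from an environment variable. This is a sketch only: PDFREST_API_KEY is a variable name invented for this example, and the helper functions are hypothetical.

```shell
#!/usr/bin/env bash
# PDFREST_API_KEY is a hypothetical variable name chosen for this sketch.

# Fail early with a readable message if the key has not been provided.
require_api_key() {
  if [ -z "${PDFREST_API_KEY:-}" ]; then
    echo "PDFREST_API_KEY is not set" >&2
    return 1
  fi
}

# Print the full header line so it can be handed to curl with -H.
api_key_header() {
  require_api_key || return 1
  printf 'Api-Key: %s' "$PDFREST_API_KEY"
}
```

With the variable exported in your shell, the header line in the command becomes `-H "$(api_key_header)"`.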

-F "file=@/path/to/file"

This form field specifies the file to be uploaded. The '@' symbol tells cURL to read the file from the specified path. Replace '/path/to/file' with the actual path to your PDF document.
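
A wrong path or a non-PDF file is an easy mistake to make here, so a script may want to check the file before uploading it. The sketch below uses a hypothetical helper, is_pdf, that tests whether the file exists and begins with the "%PDF-" signature; it is a quick sanity check, not full validation of the document.

```shell
#!/usr/bin/env bash
# is_pdf PATH: succeed if PATH is a regular file starting with "%PDF-".
# Hypothetical helper for this sketch; it does not validate the whole file.
is_pdf() {
  [ -f "$1" ] || return 1
  # Read only the first five bytes; drop NUL bytes so the shell can compare them.
  [ "$(head -c 5 "$1" | tr -d '\0')" = "%PDF-" ]
}
```

It can guard the upload, for example `is_pdf /path/to/file || echo "not a PDF" >&2`.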

-F "output=example_out"

This form field sets the name used for the output. In this example, the extracted images are saved with the prefix 'example_out'.

-F "pages=1-last"

This field defines the range of pages from which images will be extracted. '1-last' means that images will be extracted from the first page to the last page of the PDF.

Beyond the Tutorial

In this tutorial, you learned how to use cURL to make an API call to the pdfRest Extract Images endpoint. This process allows you to extract images from PDF files efficiently. To explore more functionalities, consider experimenting with all the pdfRest API Tools in the API Lab. For further details, refer to the API Reference Guide.

Note: This is an example of a multipart API call. Code samples using JSON payloads can be found at GitHub - JSON Payload Examples.