How to Split PDF with Python

Learn how to split a PDF into separate documents in Python with the Split PDF API tool from pdfRest.

Why Would You Split a PDF with Python?

The pdfRest Split PDF API Tool programmatically splits a PDF document into multiple parts. It is particularly useful when you need to extract specific pages from a large document, for example to share only one section of a report with a colleague or to process different parts of a document separately.

In this tutorial, we will demonstrate how to send an API call to the Split PDF endpoint using Python. We will walk through the code that makes the API call, explaining each part of the process.

Code Example for Splitting PDF with Python

from requests_toolbelt import MultipartEncoder
import requests
import json

split_pdf_endpoint_url = 'https://api.pdfrest.com/split-pdf'

split_request_data = []
split_request_data.append(('file',('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf')))
split_request_data.append(('pages', '1,2,5'))
split_request_data.append(('pages', '3,4'))
split_request_data.append(('output', 'example_splitPdf_out'))

mp_encoder_splitPdf = MultipartEncoder(
    fields=split_request_data
)

headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_splitPdf.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
}

print("Sending POST request to split-pdf endpoint...")
response = requests.post(split_pdf_endpoint_url, data=mp_encoder_splitPdf, headers=headers)

print("Response status code: " + str(response.status_code))

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent = 2))
else:
    print(response.text)

The code above is sourced from the pdf-rest-api-samples repository on GitHub.

Breaking Down the Python Code

The code starts by importing the necessary libraries:

from requests_toolbelt import MultipartEncoder
import requests
import json

MultipartEncoder, from the requests_toolbelt package, builds the multipart/form-data payload for the POST request. The requests library makes the HTTP request, and the json module formats the JSON response for printing.

The endpoint URL is defined as a constant:

split_pdf_endpoint_url = 'https://api.pdfrest.com/split-pdf'

Next, we prepare the data for the request:

split_request_data = []
split_request_data.append(('file',('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf')))
split_request_data.append(('pages', '1,2,5'))
split_request_data.append(('pages', '3,4'))
split_request_data.append(('output', 'example_splitPdf_out'))

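This data includes the PDF file to be split, the page selections, and the output file prefix. The 'pages' field appears twice, once for pages 1, 2, and 5 and once for pages 3 and 4, so each selection is sent as its own entry. The file is opened in binary read mode; '/path/to/file' is a placeholder for the path to your own PDF.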
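When the page selections come from other code rather than being typed by hand, the repeated 'pages' fields can be generated from a list. The following is a minimal sketch, not part of the pdfRest sample: the helper name build_split_fields and its arguments are hypothetical, and it only builds the non-file fields in the same (name, value) tuple format used above.

```python
def build_split_fields(page_groups, output_prefix):
    """Return (name, value) tuples for the 'pages' and 'output' form fields.

    page_groups is a list of lists of page numbers. Each inner list becomes
    one 'pages' field, in the order given. (Hypothetical helper.)
    """
    fields = []
    for group in page_groups:
        if not group:
            raise ValueError("each page group needs at least one page")
        # Join the page numbers into the comma-separated form used above.
        fields.append(('pages', ','.join(str(page) for page in group)))
    fields.append(('output', output_prefix))
    return fields


fields = build_split_fields([[1, 2, 5], [3, 4]], 'example_splitPdf_out')
print(fields)
# [('pages', '1,2,5'), ('pages', '3,4'), ('output', 'example_splitPdf_out')]
```

The returned list can be appended to split_request_data after the 'file' entry, for example with split_request_data.extend(fields).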

Then, the MultipartEncoder is used to encode the request data:

mp_encoder_splitPdf = MultipartEncoder(
    fields=split_request_data
)

The headers for the request are set, including the 'Content-Type' from the encoder and the 'Api-Key' for authentication:

headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_splitPdf.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
}
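Hard-coding the API key in the script makes it easy to leak through version control. One common alternative is to read it from an environment variable. The sketch below assumes a variable named PDFREST_API_KEY and a helper named build_headers; both names are hypothetical and chosen only for this illustration.

```python
import os


def build_headers(content_type, api_key=None):
    """Build the request headers, taking the key from the environment if not given.

    PDFREST_API_KEY is an assumed variable name, not one defined by pdfRest.
    """
    key = api_key or os.environ.get('PDFREST_API_KEY')
    if not key:
        raise RuntimeError("No API key: pass api_key or set PDFREST_API_KEY")
    return {
        'Accept': 'application/json',
        'Content-Type': content_type,
        'Api-Key': key,
    }


# Demo with a placeholder key and a made-up boundary string.
demo_headers = build_headers(
    'multipart/form-data; boundary=example',
    api_key='xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx',
)
print(sorted(demo_headers))
```

In the script above, the call would be build_headers(mp_encoder_splitPdf.content_type), with the key supplied through the environment.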

The POST request is sent with the encoded payload and the headers:

response = requests.post(split_pdf_endpoint_url, data=mp_encoder_splitPdf, headers=headers)

If the request is successful, the JSON response is printed; otherwise, the error text is printed:

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent = 2))
else:
    print(response.text)
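Note that response.json() raises an exception if the body is not valid JSON. A slightly more defensive variant can fall back to the raw text in that case. The sketch below is an assumption about how one might structure this, not part of the pdfRest sample; the function name format_response_body is hypothetical, and it takes the values of response.ok and response.text as plain arguments so it can be tried without a network call.

```python
import json


def format_response_body(ok, body_text):
    """Return pretty-printed JSON for a successful response, else the raw text.

    ok and body_text stand in for response.ok and response.text.
    (Hypothetical helper.)
    """
    if not ok:
        return body_text
    try:
        return json.dumps(json.loads(body_text), indent=2)
    except json.JSONDecodeError:
        # The body was not valid JSON, so show it unchanged.
        return body_text


print(format_response_body(True, '{"a": 1}'))
print(format_response_body(False, 'Bad Request'))
```

With a real response object, the call would be print(format_response_body(response.ok, response.text)).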

Summarizing How We Split the PDF with Python

In this tutorial, we have learned how to make an API call to the Split PDF endpoint of the pdfRest API using Python. By sending a POST request with the appropriate headers and multipart/form-data payload, we can split a PDF document into separate files based on the specified page ranges.

To explore and demo all of the pdfRest API Tools, visit the API Lab. For more detailed information, refer to the API Reference documentation.

Note: This is an example of a multipart API call. Code samples using JSON payloads for the Split PDF API can be found in the pdf-rest-api-samples repository on GitHub.
