The pdfRest Split PDF API Tool allows users to programmatically extract pages from PDF documents to separate files. This can be particularly useful in scenarios where you have a large document that needs to be divided into smaller sections, such as when distributing individual chapters of a book to different reviewers, or when extracting specific pages from a report to share with a team.
By using Python, you can automate this process and integrate it into your workflow or application.
The following code is a complete example of how to call the Split PDF API using Python. It was sourced from the pdfRest API samples available on GitHub:
from requests_toolbelt import MultipartEncoder
import requests
import json

split_pdf_endpoint_url = 'https://api.pdfrest.com/split-pdf'

# The /split-pdf endpoint can take one PDF file or id as input.
# This sample takes one PDF file that has at least 5 pages and splits it into two documents when given two page ranges.

# Create a list of tuples for data that will be sent to the request
split_request_data = []
split_request_data.append(('file',('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf')))
split_request_data.append(('pages', '1,2,5'))
split_request_data.append(('pages', '3,4'))
split_request_data.append(('output', 'example_splitPdf_out'))

mp_encoder_splitPdf = MultipartEncoder(
    fields=split_request_data
)

# Let's set the headers that the split-pdf endpoint expects.
# Since MultipartEncoder is used, the 'Content-Type' header gets set to 'multipart/form-data' via the content_type attribute below.
headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_splitPdf.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' # place your api key here
}

print("Sending POST request to split-pdf endpoint...")
response = requests.post(split_pdf_endpoint_url, data=mp_encoder_splitPdf, headers=headers)

print("Response status code: " + str(response.status_code))

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent = 2))
else:
    print(response.text)

# If you would like to download the file instead of getting the JSON response, please see the 'get-resource-id-endpoint.py' sample.
Reference: GitHub Repository
The code snippet above demonstrates how to split a PDF document using the pdfRest API in Python. Let's break it down:
from requests_toolbelt import MultipartEncoder
import requests
import json
This imports the necessary modules. MultipartEncoder is used for creating a multipart/form-data payload, which is required for file uploads. The requests module sends the HTTP request, and json is used to pretty-print the response.
split_pdf_endpoint_url = 'https://api.pdfrest.com/split-pdf'
This sets the API endpoint URL for splitting PDFs.
split_request_data = []
split_request_data.append(('file',('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf')))
split_request_data.append(('pages', '1,2,5'))
split_request_data.append(('pages', '3,4'))
split_request_data.append(('output', 'example_splitPdf_out'))
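Here, we're creating the data to be sent with the request: the PDF file, the pages for each output document, and the output name. Each pages entry describes one output document, so this request splits the input into two files, one built from pages 1, 2, and 5 and the other from pages 3 and 4. In the full example, this list is then passed to MultipartEncoder(fields=split_request_data) to produce the multipart body.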
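When the number of output documents varies, the repeated pages entries can be generated from a list instead of being appended by hand. The following is a minimal sketch of that idea. The helper name build_split_fields and its parameters are hypothetical and are not part of the pdfRest samples; only the field names file, pages, and output come from the sample above.

```python
import io


def build_split_fields(pdf_file, file_name, page_ranges, output_name):
    """Build the list of (name, value) tuples for the multipart body.

    Hypothetical helper: it mirrors the field layout of the sample above
    and adds one 'pages' field per requested output document.
    """
    fields = [("file", (file_name, pdf_file, "application/pdf"))]
    for page_range in page_ranges:
        fields.append(("pages", page_range))
    fields.append(("output", output_name))
    return fields


if __name__ == "__main__":
    # In-memory stand-in for a real PDF opened with open(path, "rb").
    fake_pdf = io.BytesIO(b"%PDF-1.7 placeholder")
    fields = build_split_fields(
        fake_pdf, "file_name.pdf", ["1,2,5", "3,4"], "example_splitPdf_out"
    )
    # Skip the file entry and show only the text fields.
    for name, value in fields[1:]:
        print(name, value)
```

The returned list can be passed to MultipartEncoder(fields=...) in the same way as split_request_data. With a real file, opening it in a with block around the request ensures the handle is closed afterwards, which the original sample leaves to the interpreter.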
headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_splitPdf.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
}
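The headers include the API key, which you need to replace with your own. The Content-Type is taken from the MultipartEncoder's content_type attribute, which supplies the multipart/form-data value along with the boundary string the encoder generated.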
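Rather than pasting the key into the source file, you may prefer to read it from the environment. The sketch below shows one way to do that. The function name build_headers and the environment variable name PDFREST_API_KEY are assumptions made for this example and are not defined by pdfRest; the header names match the sample above.

```python
import os


def build_headers(content_type, api_key=None):
    """Return the request headers used in the sample above.

    PDFREST_API_KEY is a hypothetical environment variable name chosen
    for this sketch; pass api_key explicitly to bypass the lookup.
    """
    if api_key is None:
        api_key = os.environ.get("PDFREST_API_KEY")
    if not api_key:
        raise ValueError(
            "No API key provided; set PDFREST_API_KEY or pass api_key."
        )
    return {
        "Accept": "application/json",
        "Content-Type": content_type,
        "Api-Key": api_key,
    }
```

In the full example this would be called as build_headers(mp_encoder_splitPdf.content_type), so the boundary generated by the encoder still reaches the server.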
response = requests.post(split_pdf_endpoint_url, data=mp_encoder_splitPdf, headers=headers)
This sends the POST request to the API endpoint with the data and headers.
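The full example finishes by printing the status code, followed by the JSON body on success or the raw text on failure. As a rough sketch, that logic can be wrapped in a small function so it is easy to reuse. The name describe_response is hypothetical; the function relies only on the attributes the sample already uses (status_code, ok, json(), and text).

```python
import json


def describe_response(response):
    """Return a printable summary of a requests-style response object.

    Mirrors the output of the sample above: the status code, then the
    pretty-printed JSON body on success or the raw text on failure.
    """
    lines = ["Response status code: " + str(response.status_code)]
    if response.ok:
        lines.append(json.dumps(response.json(), indent=2))
    else:
        lines.append(response.text)
    return "\n".join(lines)
```

After the POST call, print(describe_response(response)) would produce the same output as the final lines of the sample.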
In this tutorial, we've learned how to split a PDF into separate documents using the pdfRest API and Python. You can now use this code as a starting point to integrate PDF splitting functionality into your applications.
I encourage you to demo all of the pdfRest API Tools in the API Lab and refer to the API Reference documentation for further exploration.
Note: This is an example of a multipart API call. Code samples using JSON payloads can be found at GitHub Repository.