Query PDF

Query PDF is a REST API tool that provides a programmatic way to retrieve a wide range of insights about a PDF document. It allows developers to check for conditional properties, metadata, and content details such as forms, fonts, security settings, and digital signatures. This tool is essential for conditional processing, enabling you to automate workflows and trigger subsequent actions based on a file’s unique characteristics.

Key Benefits of Query PDF API

Perform over 25 different queries in a single API call, including checks for document metadata, embedded fonts, and JavaScript, for a comprehensive overview of a PDF's properties.
Validate PDF/A conformance with the industry-standard veraPDF validation engine, returning a simple true or false value for easy programmatic checks without complex reporting.
Automate workflows with conditional processing, saving time and resources by using file properties to determine if you need to apply OCR, convert to PDF/A, or perform other operations.
Identify accessibility features by checking for the presence of structure tags, ensuring your documents meet compliance standards.
Retrieve and leverage custom metadata, returned as a JSON list of key:value pairs, enabling you to extract and use unique data properties that were added to the document by other applications.
Extract key document information, including whether a file contains signatures, passwords, or forms (Acroforms or XFA), to drive secure and specialized workflows.

Try Now with API Lab

Start right from your browser: upload files, choose parameters, generate code, and send API calls directly from API Lab!

Build Your Solution

You have document processing problems; we have solutions. Explore the many ways pdfRest can align your documents with your business objectives.

Browse all solutions

Integrate pdfRest with ChatGPT to Generate PDF Info Summary

Parse PDF Files to Streamline Data Extraction

Integrate pdfRest with Microsoft Power Automate

Ensure GDPR Compliance for PDF Processing with EU-Based Cloud API

Detect and Repair Non-Conformant PDF/A Documents

Add Page Numbers to PDF Files

Why is pdfRest the best API to get info from a PDF?

pdfRest offers the best solution for checking a PDF's metadata and document information because each call supports conditional processing, PDF/A validation, and more than 25 query options.

Drive Conditional Processing

Query PDF pairs well with many other pdfRest API Tools by providing valuable information that can be used to programmatically assess and determine next steps for each document. Some common use cases include:

- Conditionally split PDFs based on number of pages or file size
- Convert PDFs to PDF/A only when they do not already conform to that standard
- Encrypt files that do not already have security measures applied
- Confirm PDFs contain the expected content, such as tags, signatures, or forms, before sending them on to their intended audience
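
The use cases above can be sketched as a small decision function. This is a minimal illustration, assuming the Query PDF response has already been parsed into a Python dict keyed by query name (page_count, pdfa, restrict_permissions_set, and requires_password_to_open, as listed under Queries). The names as_bool and plan_next_steps, the action labels, and the 100-page threshold are hypothetical and are not part of pdfRest.

```python
from typing import Any, Dict, List


def as_bool(value: Any) -> bool:
    """Accept a JSON boolean or the strings "true"/"false".

    Assumption: either form may appear in a parsed response.
    """
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def plan_next_steps(info: Dict[str, Any], max_pages: int = 100) -> List[str]:
    """Map query results to hypothetical follow-up actions."""
    steps: List[str] = []
    # Split only when the document is longer than the chosen threshold.
    if int(info.get("page_count", 0)) > max_pages:
        steps.append("split")
    # Convert only when the file does not already conform to PDF/A.
    if not as_bool(info.get("pdfa", False)):
        steps.append("convert_to_pdfa")
    # Encrypt only when no restrictions or open password are present.
    secured = as_bool(info.get("restrict_permissions_set", False)) or as_bool(
        info.get("requires_password_to_open", False)
    )
    if not secured:
        steps.append("encrypt")
    return steps


# Hand-written example values, not real API output.
example = {"page_count": 240, "pdfa": False, "restrict_permissions_set": True}
print(plan_next_steps(example))  # ['split', 'convert_to_pdfa']
```

Each label would then be mapped to whichever follow-up tool your workflow calls; keeping the decision logic in one pure function makes it easy to test without sending any files.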

Validate PDF/A Conformance

pdfRest's Query PDF validates whether a document conforms to any of the PDF/A conformance levels. Powered by veraPDF, the industry standard for PDF/A conformance validation, Query PDF produces results you can depend on. A simple true/false value in the JSON response provides straightforward, actionable information, so you won't waste time trying to parse results from complex reports.

Learn it All with One Call

Return all of the information you need about a PDF and its contents with one API call. Choose from any of the 25+ query options, and send one API request with your PDF file and a comma-separated list of queries. The response returns everything you requested as easy-to-parse key:value pairs in standard JSON format, so you get the answers you need without complex reports to parse or superfluous data to sift out.

See Customize Your Solution below for more details about all of the supported queries.
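
As a rough sketch of the "comma-separated list of queries" described above, the helper below validates query names and joins them into a single value. It assumes the form field is named queries, in lowercase like the file and id parameters; the helper name build_queries_field is hypothetical. The query names are copied from the lists on this page.

```python
from typing import Dict, Iterable

# Query names copied from the query lists on this page.
SUPPORTED_QUERIES = {
    "tagged", "image_only", "title", "subject", "author", "producer",
    "creator", "creation_date", "modified_date", "keywords",
    "custom_metadata", "doc_language", "page_count", "contains_annotations",
    "contains_signature", "pdf_version", "file_size", "filename",
    "restrict_permissions_set", "contains_xfa", "contains_acroforms",
    "contains_javascript", "contains_transparency", "contains_embedded_file",
    "uses_embedded_fonts", "uses_nonembedded_fonts", "pdfa",
    "requires_password_to_open", "pdfua_claim", "pdfe_claim", "pdfx_claim",
}


def build_queries_field(queries: Iterable[str]) -> str:
    """Return a comma-separated value, rejecting names not listed above."""
    names = [q.strip() for q in queries]
    unknown = [q for q in names if q not in SUPPORTED_QUERIES]
    if unknown:
        raise ValueError(f"Unsupported queries: {', '.join(unknown)}")
    # dict.fromkeys drops duplicates while preserving the requested order.
    return ",".join(dict.fromkeys(names))


# Assumed field name "queries"; check the API reference before relying on it.
form_fields: Dict[str, str] = {
    "queries": build_queries_field(["page_count", "pdfa", "tagged"])
}
print(form_fields)  # {'queries': 'page_count,pdfa,tagged'}
```

The resulting value would be sent in one request together with the PDF (the file parameter) or a resource ID (the id parameter). The endpoint URL and authentication details are left out here on purpose; take them from the pdfRest documentation.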

Start from Code Examples

See more code examples in our GitHub repository

Need more help?

Start with a Tutorial for step-by-step guidance

How to Check PDF Conditions and Metadata in .NET with C#

How to Check PDF Conditions and Metadata with cURL

How to Check PDF Conditions and Metadata with JavaScript in NodeJS

How to Check PDF Conditions and Metadata with PHP

How to Check PDF Conditions and Metadata with Python

How to Validate PDF/A Conformance in .NET with C#

How to Validate PDF/A Conformance with cURL

How to Validate PDF/A Conformance with JavaScript in NodeJS

Customize Your Solution

Learn about the parameters for this tool to create your custom solution.

File

The file parameter allows you to select a local file to be uploaded to pdfRest’s processing server.

See Documentation

ID

The id parameter allows you to submit a resource ID generated by one of our API Tools. Each of our API Tools assigns a unique resource ID to its output file(s), allowing you to chain requests together without having to download intermediate files between requests.

For example, you can submit the output ID you receive from our Merge PDFs tool to query the merged PDF before you download it.

See Documentation

Queries

tagged
- Checks for presence of structure tags in the input document.
- Returns true or false
image_only
- Checks if the document is 'image only', meaning that it consists solely of embedded image files, one per page, and has no text or other features common to PDF documents apart from some metadata.
- Returns true or false
title
- The title of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have a title
subject
- The subject of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have a subject
author
- The author of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have an author
producer
- The producer of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have a producer
creator
- The creator of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have a creator
creation_date
- The creation date of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have a creation date
modified_date
- The most recent modification date of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have a modification date
keywords
- The keywords of the PDF as listed in the metadata.
- Returns a string which may be empty if the document does not have keywords
custom_metadata
- Retrieves custom metadata from the PDF
- Returns a JSON list of key:value pairs, where each pair represents a custom property and its value.
doc_language
- The language that the file claims to be written in.
- Returns a string
page_count
- The number of pages in the PDF document.
- Returns an integer
contains_annotations
- Checks whether the document contains annotations, such as notes, highlighted text, file attachments, crossed out text, and text callout boxes.
- Returns true or false
contains_signature
- Checks if the document contains any digital signatures.
- Returns true or false
pdf_version
- Retrieves the version of the PDF standard that the document was created with.
- Returns a string of the form X.Y.Z where X, Y, and Z are the major, minor, and extension versions respectively
file_size
- Retrieves the size of the input file in bytes.
- Returns an integer
filename
- The name of the input file.
- Returns a string
restrict_permissions_set
- Checks whether the document has permission restrictions set that prevent printing, copying, signing, etc.
- Returns true or false
contains_xfa
- Checks whether the document contains XFA forms.
- Returns true or false
contains_acroforms
- Checks whether the document contains Acroforms.
- Returns true or false
contains_javascript
- Checks whether the document contains JavaScript.
- Returns true or false
contains_transparency
- Checks whether the document contains transparent objects.
- Returns true or false
contains_embedded_file
- Checks whether the document contains one or more embedded files.
- Returns true or false
uses_embedded_fonts
- Checks whether the document contains fully embedded fonts.
- Returns true or false
uses_nonembedded_fonts
- Checks whether the document contains non-embedded fonts.
- Returns true or false
pdfa
- Checks whether the document claims and conforms to a PDF/A standard.
- Returns true or false
requires_password_to_open
- Checks whether the document requires a password to open.
- Returns true or false
- Note: A document that requires a password cannot be opened by this route, so most other queries will not be able to return information for it
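
The return types listed above suggest how a response might be handled in code. The sketch below is only an illustration: it assumes the response body is a flat JSON object keyed by query name, the sample string is hand-written rather than real API output, and the names parse_pdf_version and summarize are hypothetical.

```python
import json
from typing import Any, Dict, Tuple


def parse_pdf_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "X.Y.Z" into a tuple of ints.

    Tuples compare numerically, so (1, 10, 0) sorts after (1, 7, 0).
    """
    return tuple(int(part) for part in version.split("."))


def summarize(response_text: str) -> Dict[str, Any]:
    """Pick a few fields out of a response (assumed flat JSON object)."""
    info = json.loads(response_text)
    # Tolerate a JSON boolean or the string "true" (assumption).
    locked = str(info.get("requires_password_to_open")).lower() == "true"
    if locked:
        # Per the note above, little else can be reported for such a file.
        return {"locked": True}
    summary: Dict[str, Any] = {"locked": False}
    if "pdf_version" in info:
        summary["pdf_version"] = parse_pdf_version(info["pdf_version"])
    if "page_count" in info:
        summary["page_count"] = int(info["page_count"])
    if "title" in info:
        # Metadata strings may be empty; normalize empty to None.
        summary["title"] = info["title"] or None
    return summary


# Hand-written sample, not real API output.
sample = '{"pdf_version": "1.7.0", "page_count": 12, "title": ""}'
print(summarize(sample))
# {'locked': False, 'pdf_version': (1, 7, 0), 'page_count': 12, 'title': None}
```

Checking requires_password_to_open first avoids treating empty or missing values from a locked file as if they described the document.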

See Documentation

Safe & Secure

Confidently process your sensitive data with pdfRest. Our platform is built for robust, enterprise-grade security and compliance. We meet rigorous standards for GDPR and HIPAA, and our controls are independently audited to ensure strict SOC 2 Type 2 compliance. Your data's protection is our commitment.

Visit Our Trust Center to Learn More

Frequently Asked Questions

Need more help? Contact Us or visit our documentation.

What kind of information can I get from a PDF using this tool?

You can get a wide range of information about a PDF's metadata, content, and security settings by specifying any of the following queries:

Metadata Queries:
- title: The title of the PDF.
- subject: The subject of the PDF.
- author: The author of the PDF.
- producer: The producer of the PDF.
- creator: The creator of the PDF.
- creation_date: The creation date of the PDF.
- modified_date: The most recent modification date of the PDF.
- keywords: The keywords of the PDF.
- doc_language: The language that the file claims to be written in.
- custom_metadata: Retrieves any custom metadata from the PDF and presents it as a JSON list of key:value pairs.
Document Properties:
- page_count: The total number of pages in the PDF document.
- pdf_version: The version of the PDF standard the document was created with (e.g., "1.7").
- file_size: The size of the input file in bytes.
- filename: The name of the input file.
- pdfa: Checks whether the document claims and conforms to a PDF/A standard.
- pdfua_claim: Checks whether the document claims to conform to a PDF/UA standard.
- pdfe_claim: Checks whether the document claims to conform to a PDF/E standard.
- pdfx_claim: Checks whether the document claims to conform to a PDF/X standard.
Content & Structure Checks:
- tagged: Checks for the presence of structure tags in the document, which are important for accessibility.
- image_only: Checks if the document is 'image only' and lacks text or other common PDF features.
- contains_annotations: Checks for the presence of annotations, such as notes, highlights, or attachments.
- contains_signature: Checks if the document contains any digital signatures.
- contains_xfa: Checks whether the document contains XFA forms.
- contains_acroforms: Checks whether the document contains Acroforms.
- contains_javascript: Checks whether the document contains JavaScript.
- contains_transparency: Checks whether the document contains transparent objects.
- contains_embedded_file: Checks for one or more embedded files.
- uses_embedded_fonts: Checks whether the document contains fully embedded fonts.
- uses_nonembedded_fonts: Checks whether the document contains non-embedded fonts.
Security & Permissions:
- restrict_permissions_set: Checks if the document has security restrictions applied to prevent actions like printing or copying.
- requires_password_to_open: Checks if the document requires a password to open and view.

Generate a self-service API Key now!

Create your free API Key and start processing PDFs with pdfRest in seconds.
