How to Check PDF Conditions and Metadata in .NET with C#

Learn how to check PDF files for conditions and metadata using the pdfRest Query PDF API Tool with C#.

Why Use Query PDF with C#?

The pdfRest Query PDF API Tool lets developers retrieve information about a PDF document. By sending an API call to Query PDF with C#, you can extract metadata such as the title, page count, and document language, along with conditions such as whether the file is tagged or image-only, without manually opening the document or writing complex parsing logic.

This can be particularly useful in content management systems, where you might need to catalog and organize large numbers of PDF files based on their metadata. For instance, a user might use Query PDF to quickly scan a repository of PDFs to identify and categorize them by author or creation date.

Query PDF with C# Code Example

using System.Text;

// The relative path "pdf-info" below is resolved against this base address.
using (var httpClient = new HttpClient { BaseAddress = new Uri("https://api.pdfrest.com") })
{
    using (var request = new HttpRequestMessage(HttpMethod.Post, "pdf-info"))
    {
        // Authenticate with your pdfRest API key and ask for a JSON response.
        request.Headers.TryAddWithoutValidation("Api-Key", "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");
        request.Headers.Accept.Add(new("application/json"));
        var multipartContent = new MultipartFormDataContent();

        // Attach the PDF as the "file" form field.
        var byteArray = File.ReadAllBytes("/path/to/file.pdf");
        var byteAryContent = new ByteArrayContent(byteArray);
        multipartContent.Add(byteAryContent, "file", "file_name.pdf");
        byteAryContent.Headers.TryAddWithoutValidation("Content-Type", "application/pdf");

        // List the properties to query as a comma-separated "queries" form field.
        var byteArrayOption = new ByteArrayContent(Encoding.UTF8.GetBytes("title, page_count, doc_language, tagged, image_only, author, creation_date, modified_date, producer"));
        multipartContent.Add(byteArrayOption, "queries");

        // Send the request and read the response body as text.
        request.Content = multipartContent;
        var response = await httpClient.SendAsync(request);

        var apiResult = await response.Content.ReadAsStringAsync();

        Console.WriteLine("API response received.");
        Console.WriteLine(apiResult);
    }
}

Reference: pdf-rest-api-samples on GitHub

Breaking Down the Code

The code above is a C# example of how to use the pdfRest API to query information from a PDF document. Let's break it down:

var httpClient = new HttpClient { BaseAddress = new Uri("https://api.pdfrest.com") };

This line initializes a new HttpClient instance with its base address set to the pdfRest API base URL. The endpoint path, "pdf-info", is supplied separately when the HttpRequestMessage is created.
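
To see how the base address and the relative "pdf-info" path combine, the short sketch below resolves them with the Uri constructor. It only illustrates standard relative URI resolution in .NET and sends nothing over the network; the variable names are chosen for this example.

```csharp
using System;

// Base address taken from the sample above.
var baseAddress = new Uri("https://api.pdfrest.com");

// Resolve the relative endpoint path against the base address.
var resolved = new Uri(baseAddress, "pdf-info");

Console.WriteLine(resolved.AbsoluteUri); // prints "https://api.pdfrest.com/pdf-info"
```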

request.Headers.TryAddWithoutValidation("Api-Key", "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx");

This line adds your API key to the request headers. Replace the placeholder with your actual pdfRest API key.
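
A key pasted into source code is easy to commit by accident. One possible alternative, shown here as a minimal sketch, is to read it from an environment variable. The variable name PDFREST_API_KEY and the helper TryResolveApiKey are assumptions made for this example and are not part of the pdfRest samples.

```csharp
using System;

// Hypothetical helper: returns true and a trimmed key when the raw value is usable.
static bool TryResolveApiKey(string? raw, out string apiKey)
{
    apiKey = string.IsNullOrWhiteSpace(raw) ? string.Empty : raw.Trim();
    return apiKey.Length > 0;
}

// ASSUMPTION: PDFREST_API_KEY is a variable name picked for this sketch.
if (TryResolveApiKey(Environment.GetEnvironmentVariable("PDFREST_API_KEY"), out var apiKey))
{
    // The resolved key would replace the placeholder in the Api-Key header.
    Console.WriteLine("API key loaded from the environment.");
}
else
{
    Console.WriteLine("PDFREST_API_KEY is not set.");
}
```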

var byteArray = File.ReadAllBytes("/path/to/file.pdf");
var byteAryContent = new ByteArrayContent(byteArray);
multipartContent.Add(byteAryContent, "file", "file_name.pdf");
byteAryContent.Headers.TryAddWithoutValidation("Content-Type", "application/pdf");

These lines read the PDF file into a byte array, wrap it in a new ByteArrayContent, and add it to the multipart form data content. In the Add call, "file" is the form field name and "file_name.pdf" is the file name sent in the request. The last line sets the part's Content-Type header to application/pdf.
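
As a variation on the file upload step, the sketch below wraps the part construction in a small helper, sets the content type through the typed ContentType property, and derives the uploaded file name from the path. BuildPdfPart is a hypothetical helper introduced for illustration, and the path is the same placeholder used in the sample.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper: wraps PDF bytes in a form part labeled as application/pdf.
static ByteArrayContent BuildPdfPart(byte[] pdfBytes)
{
    var part = new ByteArrayContent(pdfBytes);
    part.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
    return part;
}

var pdfPath = "/path/to/file.pdf"; // placeholder path from the sample
var form = new MultipartFormDataContent();

if (File.Exists(pdfPath))
{
    // "file" is the form field name; the second string is the file name sent with it.
    form.Add(BuildPdfPart(File.ReadAllBytes(pdfPath)), "file", Path.GetFileName(pdfPath));
}
else
{
    Console.WriteLine($"File not found: {pdfPath}");
}
```

Checking that the file exists first avoids an unhandled FileNotFoundException from File.ReadAllBytes when the path is wrong.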

var byteArrayOption = new ByteArrayContent(Encoding.UTF8.GetBytes("title, page_count, doc_language, tagged, image_only, author, creation_date, modified_date, producer"));
multipartContent.Add(byteArrayOption, "queries");

This snippet wraps the comma-separated list of queries in a ByteArrayContent and adds it to the multipart content as the "queries" form field. These queries determine what information is retrieved from the PDF.
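
If the set of queries changes between calls, the string can be assembled from individual names instead of being typed out by hand. The following is a minimal sketch; BuildQueries is a hypothetical helper, and it joins names with ", " only to match the format used in the sample above.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Text;

// Hypothetical helper: trims names, drops blanks and duplicates, and joins
// them in the same comma-separated format as the sample.
static string BuildQueries(params string[] names) =>
    string.Join(", ", names.Select(n => n.Trim()).Where(n => n.Length > 0).Distinct());

var queries = BuildQueries("title", "page_count", "author");
Console.WriteLine(queries); // prints "title, page_count, author"

// The string is then wrapped exactly as in the sample before being added
// to the multipart content under the "queries" field.
var queriesContent = new ByteArrayContent(Encoding.UTF8.GetBytes(queries));
```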

var response = await httpClient.SendAsync(request);
var apiResult = await response.Content.ReadAsStringAsync();

Finally, the request is sent asynchronously and the response body is read as a string. Because the request's Accept header asks for application/json, this string carries the queried PDF information as JSON.
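
Once the response string is available, it can be parsed rather than printed as-is. The sketch below uses System.Text.Json to list the top-level fields of a JSON object without assuming any particular field names. The sample payload is made up for illustration and is not real pdfRest output; check the API Reference for the actual response shape. ReadTopLevelFields is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: maps each top-level property name to its value as text.
static Dictionary<string, string> ReadTopLevelFields(string json)
{
    var fields = new Dictionary<string, string>();
    using var doc = JsonDocument.Parse(json);
    if (doc.RootElement.ValueKind != JsonValueKind.Object)
    {
        return fields; // not a JSON object, so there is nothing to list
    }
    foreach (var property in doc.RootElement.EnumerateObject())
    {
        fields[property.Name] = property.Value.ToString();
    }
    return fields;
}

// ASSUMPTION: invented payload standing in for apiResult from the sample.
var sampleResult = "{\"title\":\"Example\",\"page_count\":3}";

foreach (var (name, value) in ReadTopLevelFields(sampleResult))
{
    Console.WriteLine($"{name}: {value}");
}
```

In the sample above, the same helper would be called with apiResult in place of sampleResult.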

Beyond the Tutorial

In this tutorial, we sent a multipart API call to pdfRest's Query PDF endpoint using C#, which lets us programmatically retrieve specific information from a PDF document. A good next step is to explore the other pdfRest API Tools in the API Lab. You can also refer to the API Reference documentation for more details on the available endpoints and their functionality.

Note: This is an example of a multipart API call. Code samples using JSON payloads can be found at pdf-rest-api-samples on GitHub.
