Introducing PDF Inspector

Over the years, we’ve had our fair share of weird PDF documents with surprising bugs — ones that require you to dig deep into a file to understand what’s wrong. In the past, investigating these issues was a tedious process since there weren’t any great tools available to do the work for us.

This eventually led to the inception of PDF Inspector, a native Mac app that’s built for exactly this use case. Today, we’re releasing this diagnostic tool publicly on the Mac App Store.

PDF Syntax 101

A PDF consists of objects that can have varying types (see PDF Syntax 101). You can open a PDF in your favorite text editor and read it. However, editing the file this way will likely corrupt the document, as PDFs are really binary documents.
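To make the idea of "objects" concrete, here is a minimal Python sketch that lists the indirect objects in a small hand-written fragment of PDF syntax. The fragment, the regular expressions, and the `list_objects` helper are illustrative assumptions, not how PDF Inspector or PSPDFKit parse files. A real parser follows the cross-reference table and also handles objects stored inside compressed object streams, which a plain text scan like this cannot see.

```python
import re

# A tiny hand-written fragment in PDF syntax (illustrative only; it has no
# cross-reference table or trailer, so it is not a complete PDF file).
SAMPLE = b"""%PDF-1.7
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>
endobj
"""

# Matches "<object number> <generation number> obj ... endobj".
OBJECT_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# Matches the /Type entry of a dictionary, e.g. "/Type /Page".
TYPE_RE = re.compile(rb"/Type\s*/(\w+)")


def list_objects(data: bytes) -> list[tuple[int, int, str]]:
    """Return (object number, generation, /Type name) for each object found."""
    found = []
    for match in OBJECT_RE.finditer(data):
        number, generation, body = match.groups()
        type_match = TYPE_RE.search(body)
        type_name = type_match.group(1).decode("ascii") if type_match else "?"
        found.append((int(number), int(generation), type_name))
    return found


for number, generation, type_name in list_objects(SAMPLE):
    print(f"{number} {generation} obj: {type_name}")
```

Each object refers to others by number (for example `2 0 R`), and following those references is what turns the flat list into a tree.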

Content streams define how a page, an annotation, or a form is rendered. They are usually compressed and nested within thousands of other objects. For efficient browsing, it makes sense to present the objects in a tree. This is exactly how we built PDF Inspector.
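The compression mentioned above is the reason a content stream usually looks like noise in a text editor. As a rough sketch, assuming a stream that uses only the common `/FlateDecode` filter with no predictor, Python's standard `zlib` module can recover the readable drawing operators. The sample object and the `decode_flate_stream` helper are made up for illustration.

```python
import re
import zlib

# Hypothetical page content: PDF operators that draw the text "Hello".
content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET"
compressed = zlib.compress(content)

# A hand-built stream object shaped like one you might find in a file.
stream_object = (
    b"4 0 obj\n<< /Length " + str(len(compressed)).encode("ascii")
    + b" /Filter /FlateDecode >>\nstream\n"
    + compressed
    + b"\nendstream\nendobj\n"
)


def decode_flate_stream(obj: bytes) -> bytes:
    """Decompress a stream object that uses a direct /Length and /FlateDecode."""
    length = int(re.search(rb"/Length\s+(\d+)", obj).group(1))
    start = obj.index(b"stream\n") + len(b"stream\n")
    return zlib.decompress(obj[start:start + length])


print(decode_flate_stream(stream_object).decode("ascii"))
```

Real files can chain several filters, or store `/Length` as a reference to another object, so this sketch only covers the simplest case.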

Features

PDF Inspector is a powerful diagnostic tool for reading and analyzing PDF files. It displays a PDF’s objects as a tree and allows you to view, edit, delete, and update arbitrary PDF objects. These are the main features:

We built PDF Inspector to debug and understand PDF documents and to improve the PSPDFKit SDK. It’s an advanced diagnostic tool that can help you understand why a file is corrupted or even fix issues in files. Because it’s built on the strong foundation of PSPDFKit for macOS, it should be able to open any PDF you throw at it.

To demonstrate how PDF Inspector works, let’s explore a couple of use cases.

Use Case: Check the Appearance Stream of an Annotation

Whenever there’s an issue with an annotation, it’s useful to inspect the key/value pairs defined in the object and subobjects. Simply select the page the annotation is on, choose the Annots array, and cycle through the annotations until you find the object you’re interested in. (See also: What Are Annotations?)

In the following example, we’re inspecting a free text annotation with a predefined appearance stream.

PDF Inspector makes it easy to modify any key/value pair and even edit or remove appearance streams. Once a document is edited, PDF Inspector supports both incremental and full save — with the latter you can ensure that deleted objects are actually deleted. (See also: What’s Hiding in Your PDF?)
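The difference between the two save modes matters because an incremental save appends a new revision to the end of the file and leaves the earlier bytes in place. The Python sketch below illustrates this with simplified byte strings that stand in for real revisions (they omit the cross-reference tables a valid file needs). Counting `%%EOF` markers is only a heuristic for spotting appended revisions, since linearized files can contain more than one marker without any incremental update. All names here are hypothetical.

```python
# Simplified stand-ins for file revisions; not valid PDF files.
ORIGINAL = b"%PDF-1.7\n1 0 obj\n(secret note)\nendobj\ntrailer\n<< /Size 2 >>\n%%EOF\n"
UPDATE = b"1 0 obj\n(redacted)\nendobj\ntrailer\n<< /Size 2 >>\n%%EOF\n"


def count_eof_markers(data: bytes) -> int:
    """Count %%EOF markers as a rough indicator of appended revisions."""
    return data.count(b"%%EOF")


# Incremental save: the update is appended, so the old object is still there.
incremental = ORIGINAL + UPDATE
# Full save: the file is rewritten with only the current objects.
full = b"%PDF-1.7\n" + UPDATE

print(count_eof_markers(incremental), b"secret note" in incremental)
print(count_eof_markers(full), b"secret note" in full)
```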

The PDF specification explains what each key is for. See 12.5.2 Annotation Dictionaries, Page 382.

Use Case: See Validation Rules of a PDF Form

PDF forms can be highly complex and even include JavaScript. And with some creativity, you can create surprising things in a PDF! When a form doesn’t behave as expected, the cause is often a validation rule or a script attached to one of its fields. PDF Inspector makes it easy to look at the individual form objects and see which validation rules are active.

Above we have two rules in the AA (additional actions) dictionary, one being F, “a JavaScript action that shall be performed before the field is formatted to display its value.” The other is K, “a JavaScript action that shall be performed when the user modifies a character in a text field or combo box or modifies the selection in a scrollable list box. This action may check the added text for validity and reject or modify it.” (The definitions are from the PDF specification, Table 196 – Entries in a form field’s additional-actions dictionary, Page 416.)

Both use AFDate to format the input to m/d/yy. There’s no other document-level JavaScript set.
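For readers without the screenshot at hand, the following Python sketch shows roughly what such an AA dictionary could look like in raw PDF syntax and how the scripts can be pulled out of it. The fragment is an assumption modeled on the description above: the exact script text in the inspected file is not shown here, and `AFDate_FormatEx` and `AFDate_KeystrokeEx` are the Acrobat helper functions typically used for date fields. The regular expression and the `javascript_actions` helper are illustrative and only handle this simple, uncompressed layout.

```python
import re

# Hypothetical field fragment; the real file's scripts may differ.
FIELD_FRAGMENT = """
/AA <<
  /F << /S /JavaScript /JS (AFDate_FormatEx("m/d/yy");) >>
  /K << /S /JavaScript /JS (AFDate_KeystrokeEx("m/d/yy");) >>
>>
"""

# Trigger keys defined for a form field's additional-actions dictionary.
TRIGGERS = {"K": "keystroke", "F": "format", "V": "validate", "C": "calculate"}

# Captures the trigger key and the literal string stored under /JS.
ACTION_RE = re.compile(
    r"/([KFVC])\s*<<\s*/S\s*/JavaScript\s*/JS\s*\((.*?)\)\s*>>"
)


def javascript_actions(fragment: str) -> dict[str, str]:
    """Map each trigger name to the JavaScript source attached to it."""
    return {TRIGGERS[key]: script for key, script in ACTION_RE.findall(fragment)}


for trigger, script in javascript_actions(FIELD_FRAGMENT).items():
    print(f"{trigger}: {script}")
```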

Conclusion

PDF Inspector is now available on the Mac App Store. If you’re having issues or have feature requests, hit us up on support.
