Image Document Specification

Image Documents

Introduction

Here at PSPDFKit, we want to provide users with an enhanced experience when annotating images. We believe that users should be able to open their images at any time and have no problem editing the annotations that were created in the past, just like with a PDF document.

We have built a standard on top of the open ISO 16684-1:2012 Extensible Metadata Platform (XMP) specification to store the information that PSPDFKit draws upon when opening images. In this guide, we are also opening up our XMP format so that anybody can parse the metadata and create an editable PDF that can be saved and reopened at a later date.

See our announcement blog post to learn how you can annotate PNG and JPG files just like PDFs with Image Documents, or read the detailed Image Document Guide.

XMP Format

The XMP data within an image file holds all the information needed to compile an editable PDF. This PDF represents the original image with annotations overlaid. To compile the editable PDF, we hold four important pieces of information. This information is held in the pspdf namespace, and as such, all attributes have a prefix of pspdf:.

Image Document Version

To ensure that we can update the Image Documents standard in the future, we have included a version number. That way, if we do make changes to the XMP format, we can identify the required parser:

| Attribute | Value |
| --- | --- |
| imageDocumentVersion | Semantic version |

pspdf:imageDocumentVersion="1.0.0"
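As a minimal sketch of how a reader might use this attribute to pick a parser, the following Python splits the semantic version into its components and checks the major version. The names `parse_semantic_version`, `is_supported`, and `SUPPORTED_MAJOR` are hypothetical, and treating a matching major version as "compatible" is an assumption based on common semantic versioning practice, not something this specification states.

```python
from typing import Tuple

# Assumption: this hypothetical parser understands major version 1 only.
SUPPORTED_MAJOR = 1


def parse_semantic_version(value: str) -> Tuple[int, int, int]:
    """Split a "MAJOR.MINOR.PATCH" string into three integers."""
    major, minor, patch = (int(part) for part in value.split("."))
    return major, minor, patch


def is_supported(value: str) -> bool:
    """Return True when the major version matches the one this parser handles."""
    return parse_semantic_version(value)[0] == SUPPORTED_MAJOR
```

A reader could call `is_supported("1.0.0")` before touching the other attributes and fall back to an error or a different parser when it returns `False`.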

Document

The document tag holds a full copy of the PDF document, but it omits one important piece of information: the original image. To retain the quality and size of the input image, it was decided that the original image should not be stored within the PDF data and instead appended upon opening. Additionally, by holding the entire PDF document, the standard has the ability to build upon the many features of the PDF specification.

The PDF document is encoded in base64, which will mitigate any ASCII representation issues in XMP:

| Attribute | Value |
| --- | --- |
| document | A base64-encoded PDF document |

pspdf:document="3fkWTefn33..."
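Decoding this attribute could look roughly like the sketch below, which uses Python's standard `base64` module and then checks for the `%PDF-` header that PDF files start with. The function name `decode_document` is hypothetical, and the strict `validate=True` setting is a design choice of this sketch rather than a requirement of the format.

```python
import base64


def decode_document(attribute_value: str) -> bytes:
    """Decode the pspdf:document attribute value and sanity-check the result."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    pdf_bytes = base64.b64decode(attribute_value, validate=True)
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("decoded pspdf:document does not start with a PDF header")
    return pdf_bytes
```

Note that the decoded PDF does not contain the original image, so it still has to be combined with the image data before it can be displayed as the annotated image.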

Original Image

As we have decided not to hold the original image in the PDF document, and since the image displayed in the main image file will be a render of the current representation with annotations, we have to hold the original image in the metadata for recall at a later date. This data will be held in base64 and will be a direct copy of the original image data. With this data, along with PDF data, we can compile an up-to-date PDF representation of the annotated image:

| Attribute | Value |
| --- | --- |
| originalImage | A base64-encoded image file |

pspdf:originalImage="3fknefn33..."
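Because the attribute holds a direct copy of the original file, a reader can decode it and inspect the leading bytes to tell which image format it is dealing with. The sketch below assumes only PNG and JPEG inputs, as mentioned in the introduction; `decode_original_image` is a hypothetical helper, not part of any PSPDFKit API.

```python
import base64
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"      # start-of-image marker plus next marker byte


def decode_original_image(attribute_value: str) -> Tuple[str, bytes]:
    """Decode pspdf:originalImage and report the detected image format."""
    data = base64.b64decode(attribute_value)
    if data.startswith(PNG_SIGNATURE):
        return "png", data
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg", data
    raise ValueError("original image is neither PNG nor JPEG")
```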

Rendered Image Checksum

We needed a way of checking whether the image has been changed without our knowledge, because it is still possible to edit images externally with an image editor. For this reason, we added a CRC32 checksum of the image shown in a standard image viewer. The checksum can be checked when reopening the file to ensure no edits have been made. If the checksum does not match, the developer can decide either to disregard the XMP data and create a new image document, or to show an error to the user and advise them of what to do in this situation:

| Attribute | Value |
| --- | --- |
| fileImageChecksum | CRC32 checksum |

pspdf:fileImageChecksum="3fknefn33..."
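The check described above might be sketched as follows with `zlib.crc32` from the Python standard library. Two things are assumptions here: that the stored value is a decimal string, and that the caller already has the exact bytes of the rendered image that were checksummed. This guide does not spell out which bytes those are, so `rendered_image_data` and `checksum_matches` are purely illustrative names.

```python
import zlib


def checksum_matches(rendered_image_data: bytes, stored_checksum: str) -> bool:
    """Compare the CRC32 of the rendered image data with the stored value."""
    # Mask to 32 bits so the result is always an unsigned integer.
    actual = zlib.crc32(rendered_image_data) & 0xFFFFFFFF
    # Assumption: the attribute stores the checksum as a decimal string.
    return actual == int(stored_checksum)
```

When this returns `False`, the caller is in the situation described above and can either discard the XMP data or report the mismatch to the user.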

Example File

�PNG
^Z
^@^@^@
IHDR^@^@^Ah^@^@^@�^D^C^@^@^@4vd'^@^@.^RiTXtXML:com.adobe.xmp^@^@^@^@^@
<?xpacket begin=" " id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.6.0">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:pspdf="http://pspdfkit.com/pdf/xmp/1.0/"
   pspdf:imageDocumentVersion="1.0.0"
   pspdf:document="JVBERi0xLj......"
   pspdf:originalImage="iVBORw0KG......"
   pspdf:fileImageChecksum="234097520">
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="r"?>
${PNG_IMAGE_DATA}
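To illustrate how the four attributes could be read back out of a file like the one above, here is a rough Python sketch using only the standard library. It takes a shortcut by searching the raw bytes for the `x:xmpmeta` element rather than walking the PNG chunk structure, and it assumes the XMP text is stored uncompressed. The function name `read_pspdf_attributes` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

PSPDF_NS = "http://pspdfkit.com/pdf/xmp/1.0/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def read_pspdf_attributes(file_bytes: bytes) -> Dict[str, str]:
    """Return the pspdf: attributes found in the first XMP packet of a file."""
    end_tag = b"</x:xmpmeta>"
    start = file_bytes.find(b"<x:xmpmeta")
    if start == -1:
        raise ValueError("no XMP packet found")
    end = file_bytes.find(end_tag, start)
    if end == -1:
        raise ValueError("XMP packet is not terminated")

    # Parse only the x:xmpmeta element, leaving the binary image data aside.
    root = ET.fromstring(file_bytes[start:end + len(end_tag)])
    description = root.find(".//{%s}Description" % RDF_NS)
    if description is None:
        raise ValueError("no rdf:Description element found")

    # ElementTree exposes namespaced attributes as "{namespace}name".
    prefix = "{%s}" % PSPDF_NS
    return {
        name[len(prefix):]: value
        for name, value in description.attrib.items()
        if name.startswith(prefix)
    }
```

The returned dictionary is keyed by the attribute names without the prefix, for example `imageDocumentVersion` and `fileImageChecksum`, so each value can then be handled according to its own section of this guide.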