Extract Data from PDF Form Fields in iOS

The data that users enter into PDF form fields can be extracted programmatically or serialized to the XFDF format or the Instant JSON format.

Extracting Form Data Programmatically

To extract data in a custom format, you can read the values from the form fields, which are accessible using a document’s form parser:

// Try to get a reference to the form fields in the document.
guard let formFields = document.formParser?.formFields else {
    return
}

// Enumerate the form fields.
for formField in formFields {
    // You may want to honor this flag by not exporting the value of this field if the flag is `true`.
    if formField.isNoExport {
        continue
    }

    // Use `fullyQualifiedName` to uniquely identify the field within the document.
    let fieldIdentifier = formField.fullyQualifiedName

    // Read the value as either a string or an array of strings.
    if let stringValue = formField.exportValue as? String {
        // Use the value...
    } else if let arrayValue = formField.exportValue as? [String] {
        // Use the value...
    }
}
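
As one way to turn the loop above into a custom format, the following sketch collects the values into a dictionary keyed by field name and serializes it to JSON. `FieldSnapshot` and `makeJSON(from:)` are hypothetical names introduced for illustration and aren't part of the SDK. The snapshot only stands in for the three properties read above, so the example runs without a document:

```swift
import Foundation

// Hypothetical stand-in for the three properties read from each form field above.
struct FieldSnapshot {
    let fullyQualifiedName: String
    let isNoExport: Bool
    let exportValue: Any?
}

// Hypothetical helper: builds a JSON object that maps field names to values,
// skipping fields flagged as no-export and values of any other type.
func makeJSON(from fields: [FieldSnapshot]) throws -> Data {
    var values: [String: Any] = [:]
    for field in fields where !field.isNoExport {
        if let stringValue = field.exportValue as? String {
            values[field.fullyQualifiedName] = stringValue
        } else if let arrayValue = field.exportValue as? [String] {
            values[field.fullyQualifiedName] = arrayValue
        }
    }
    // Sorted keys keep the output stable between exports.
    return try JSONSerialization.data(withJSONObject: values, options: [.sortedKeys])
}
```

In an app, you'd create one snapshot per form field inside the enumeration and pass the resulting array to the helper.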

Extracting Form Data as XFDF

You can export the values of form fields from a document as an XFDF file using XFDFWriter like this:

let document: Document = ...

// Collect all form elements from the document.
let formElements = document.allAnnotations(of: .widget).values.flatMap { $0 }

// Get a file URL where the XFDF file should be written.
guard let documentsDirectory = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first else {
    return
}
let fileURLToExportTo = documentsDirectory.appendingPathComponent("form-values.xfdf", isDirectory: false)

// Export form values as XFDF to this file.
let dataSink = try FileDataSink(fileURL: fileURLToExportTo)
try XFDFWriter().write(formElements, to: dataSink, documentProvider: document.documentProviders[0])
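
To check what ended up in the exported file, you could read the field values back with Foundation's `XMLParser`. This is a minimal sketch that assumes the file uses the standard XFDF layout, where `<field name="…">` elements may be nested and each holds one or more `<value>` elements. `XFDFFieldReader` and `readXFDFFieldValues(from:)` are hypothetical names and aren't part of the SDK:

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML
#endif

// Hypothetical helper: collects field values from <field>/<value> elements.
// Nested <field> names are joined with "." to form the fully qualified name.
final class XFDFFieldReader: NSObject, XMLParserDelegate {
    private(set) var values: [String: [String]] = [:]
    private var nameStack: [String] = []
    private var currentText: String?

    func parser(_ parser: XMLParser, didStartElement elementName: String, namespaceURI: String?, qualifiedName qName: String?, attributes attributeDict: [String: String] = [:]) {
        if elementName == "field" {
            nameStack.append(attributeDict["name"] ?? "")
        } else if elementName == "value" {
            currentText = ""
        }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // Only accumulate text while inside a <value> element.
        currentText?.append(string)
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String, namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "value", let text = currentText {
            if !nameStack.isEmpty {
                values[nameStack.joined(separator: "."), default: []].append(text)
            }
            currentText = nil
        } else if elementName == "field", !nameStack.isEmpty {
            nameStack.removeLast()
        }
    }
}

func readXFDFFieldValues(from data: Data) -> [String: [String]] {
    let reader = XFDFFieldReader()
    let parser = XMLParser(data: data)
    parser.delegate = reader
    parser.parse()
    return reader.values
}
```

Calling `readXFDFFieldValues(from:)` with the contents of the exported file returns a dictionary from field name to its values.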

Extracting Form Data as Instant JSON

Instant JSON is optimized for annotations. However, generating Instant JSON from a document will also include form field values.

Exporting Instant JSON from a document produces a diff between the in-memory document and the document saved on disk, so changes that have already been saved won’t appear in the export. For this reason, ensure auto-save is disabled and that you’re not saving the document at any other time. Disable auto-save by configuring a PDFViewController like this:

let pdfViewController = PDFViewController(document: document) {
    $0.isAutosaveEnabled = false
}

You can export Instant JSON as a file using generateInstantJSON(from:) like this:

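// Generate Instant JSON for the first document provider.
let instantJSONData = try document.generateInstantJSON(from: document.documentProviders[0])
// Write it to a file URL of your choice, built like the one in the XFDF example.
try instantJSONData.write(to: fileURLToExportTo)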

Note that this Instant JSON will also include the visual properties of the form element, such as text color and border style, as well as other annotations, such as text highlights and drawings.
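
If you only need the form values from that payload, you could filter them out after export. The sketch below rests on an assumption you should verify against the Instant JSON format reference: that the values are stored in a top-level `formFieldValues` array whose entries have `name` and `value` keys. `formFieldValues(inInstantJSON:)` is a hypothetical helper and isn't part of the SDK:

```swift
import Foundation

// Hypothetical helper: returns field name -> value(s) from an Instant JSON payload.
// Assumption: a top-level `formFieldValues` array with `name` and `value` keys.
func formFieldValues(inInstantJSON data: Data) throws -> [String: [String]] {
    guard
        let root = try JSONSerialization.jsonObject(with: data) as? [String: Any],
        let entries = root["formFieldValues"] as? [[String: Any]]
    else {
        return [:]
    }
    var result: [String: [String]] = [:]
    for entry in entries {
        guard let name = entry["name"] as? String else { continue }
        // A value may be a single string or a list of strings.
        if let single = entry["value"] as? String {
            result[name] = [single]
        } else if let multiple = entry["value"] as? [String] {
            result[name] = multiple
        }
    }
    return result
}
```

Passing `instantJSONData` to this helper would leave out annotations and visual properties and keep only the name and value pairs.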

Exporting Instant JSON from an individual Annotation using the generateInstantJSON() function isn’t appropriate for form field values: that output models the properties of the form element (text color, border style, etc.) and doesn’t include the value of the form field associated with the form element.