How to OCR a PDF on iOS

This guide gives an overview of the API needed to perform OCR (Optical Character Recognition) with PSPDFKit for iOS. To gain access to the OCR API and related functionality on iOS, you first need to integrate the PSPDFKitOCR framework.

API Overview

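The main API entry points for working with OCR are the performOCROnPages(at:options:) method on the processor configuration and ProcessorOCROptions, both of which are covered in the next section.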

Performing OCR

To perform OCR on an existing document, create a new processor configuration from the document. Then call the performOCROnPages(at:options:) method on the processor configuration to mark the document for OCR. The OCR itself runs later, when the processor writes its output. The method takes two parameters. The first is a set of page indices that tells the processor which pages of the document to perform OCR on. The second is a ProcessorOCROptions instance that specifies the language in which the text in the document should be detected.

You can use the following snippets to start, first in Swift and then in Objective-C:

let document: Document = ...
guard let processorConfiguration = Processor.Configuration(document: document) else {
    return
}
// Mark the processor to perform OCR on all document pages and detect text in English.
processorConfiguration.performOCROnPages(at: IndexSet(0..<IndexSet.Element(document.pageCount)), options: ProcessorOCROptions(language: .english))

let processor = Processor(configuration: processorConfiguration, securityOptions: nil)
let ocrURL: URL = ... // Writeable URL

DispatchQueue.global(qos: .userInitiated).async {
    do {
        // This performs the actual OCR and generates the new document at the provided URL.
        try processor.write(toFileURL: ocrURL)
    } catch {
        // Handle error, and don't try to load a document that wasn't written.
        return
    }
    DispatchQueue.main.async {
        let ocrDocument = Document(url: ocrURL)
    }
}

PSPDFDocument *document = ...
PSPDFProcessorConfiguration *processorConfiguration = [[PSPDFProcessorConfiguration alloc] initWithDocument:document];

// Mark the processor to perform OCR on all document pages and detect text in English.
[processorConfiguration performOCROnPagesAtIndexes:[NSIndexSet indexSetWithIndexesInRange:NSMakeRange(0, document.pageCount)] options:[[PSPDFProcessorOCROptions alloc] initWithLanguage:PSPDFOCRLanguageEnglish]];

PSPDFProcessor *processor = [[PSPDFProcessor alloc] initWithConfiguration:processorConfiguration securityOptions:nil];
NSURL *ocrURL = ... // Writeable URL

dispatch_async(dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0), ^{
    NSError *error = nil;
    // This performs the actual OCR and generates the new document at the provided URL.
    // Check the return value, not the error pointer, to detect failure.
    if (![processor writeToFileURL:ocrURL error:&error]) {
        // Handle error.
        return;
    }
    dispatch_async(dispatch_get_main_queue(), ^{
        PSPDFDocument *ocrDocument = [[PSPDFDocument alloc] initWithURL:ocrURL];
    });
});
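
The page indices passed to performOCROnPages(at:options:) tell the processor which pages to process, so they don't have to cover the whole document. The following is a minimal sketch, using only Foundation, of one way to build that index set. Note that ocrPageIndexes(pageCount:requested:) is a hypothetical helper and not part of PSPDFKit, and pageCount stands in for the document's page count.

```swift
import Foundation

/// Hypothetical helper (not part of PSPDFKit): builds the zero-based page
/// indices to perform OCR on. With no `requested` pages, all pages are used.
func ocrPageIndexes(pageCount: Int, requested: [Int]? = nil) -> IndexSet {
    let validCount = max(pageCount, 0)
    guard let requested = requested else {
        // Same as the "all pages" case in the snippets above.
        return IndexSet(integersIn: 0..<validCount)
    }
    // Drop any index that falls outside the document before building the set.
    return IndexSet(requested.filter { $0 >= 0 && $0 < validCount })
}
```

The result can be passed as the first argument of performOCROnPages(at:options:), for example with ocrPageIndexes(pageCount: Int(document.pageCount), requested: [0, 2]) to process only the first and third pages.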

Performing OCR works even if a document already has existing real text objects on a page. In that case, the existing content will remain untouched and coexist with the OCR-detected text.

ℹ️ Note: The write call, which also performs OCR, is dispatched to a background thread. We recommend always dispatching processing to a background thread when OCR is involved, since it can take some time to complete.
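
If you process documents in several places, the background dispatch pattern from the note can be wrapped in a small helper. This is only a sketch under stated assumptions: runInBackground(_:completionQueue:completion:) is a hypothetical function that is not part of PSPDFKit, and the work closure stands in for the processor's write call.

```swift
import Foundation

/// Hypothetical helper (not part of PSPDFKit): runs throwing work on a
/// background queue and reports the outcome on `completionQueue`.
func runInBackground(
    _ work: @escaping () throws -> Void,
    completionQueue: DispatchQueue = .main,
    completion: @escaping (Result<Void, Error>) -> Void
) {
    DispatchQueue.global(qos: .userInitiated).async {
        // Capture either success or the thrown error in a single value.
        let result: Result<Void, Error> = Result { try work() }
        completionQueue.async {
            completion(result)
        }
    }
}
```

With this helper, the write call would go in the first closure, and the completion closure would load the new document only in the success case. The completion runs on the main queue by default, which matches the snippets above.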

Language Bundling

For OCR, we ship a separate trained data model for every language. These models are not included in the framework and need to be added to your project separately. This keeps data models you don’t need from adding to your app bundle’s size, so we recommend adding only the language bundles for the languages you actually need to perform OCR on. If you added PSPDFKitOCR manually, you can do this by dragging the .bundle files into the Copy Bundle Resources build phase of your app target. Read more about integrating PSPDFKitOCR and the language packs in our Getting Started guide.