Scan and Convert to Searchable PDFs on iOS

PSPDFKit ships with advanced OCR capabilities. For a list of supported languages, see here.

When working with PDFs, you might encounter documents that contain pages with inaccessible text. This is especially common when dealing with scanned documents or documents that contain photographed pages. With our OCR component, you can add interactive text to those raster and vector PDFs, unlocking PDF text functionality such as text annotations, text selection, text extraction, and search.

PSPDFKit’s OCR framework uses machine learning (ML) to analyze the scanned image in a PDF and convert it into textual information. This textual information is then embedded back into the PDF as invisible text, which makes the text selectable and searchable. To read about how our OCR works in detail, check out this blog post.

OCR is an additional component that can be added to your license. Please reach out to us if you’re interested in adding this to your license, if you want to learn more about the roadmap for OCR, or if you want to provide feedback and feature requests related to your use case.

Before following the next steps, please make sure you’ve set up PSPDFKitOCR correctly, as described in the getting started guide.

API Overview

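These are the main API entry points in PSPDFKitOCR that enable working with OCR: the performOCROnPages(at:options:) method on Processor.Configuration, which marks pages for OCR, and ProcessorOCROptions, which specifies the language to detect.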

Performing OCR

To perform OCR on an existing document, pass the document to a new processor configuration. Then call the performOCROnPages(at:options:) method on that configuration. This call only marks the pages for OCR; the OCR itself runs later, when the processor executes its write call. The method takes two parameters. The first is a set of page indices that tells the processor which pages of the document to perform OCR on. The second is a ProcessorOCROptions instance, which specifies the language in which the text in the document should be detected.

You can use this snippet to start:

let document: Document = ...
guard let processorConfiguration = Processor.Configuration(document: document) else {
    // Handle error.
    return
}
// Mark the processor to perform OCR on all document pages and detect text in English.
processorConfiguration.performOCROnPages(at: IndexSet(0..<IndexSet.Element(document.pageCount)), options: ProcessorOCROptions(language: .english))

let processor = Processor(configuration: processorConfiguration, securityOptions: nil)
let ocrURL: URL = ... // Writeable URL.

DispatchQueue.global(qos: .userInitiated).async {
    do {
        // This performs the actual OCR and generates the new document at the provided URL.
        try processor.write(toFileURL: ocrURL)
    } catch {
        // Handle error.
        return
    }
    DispatchQueue.main.async {
        // Only reached if writing succeeded.
        let ocrDocument = Document(url: ocrURL)
    }
}

The equivalent Objective-C snippet:

PSPDFDocument *document = ...
PSPDFProcessorConfiguration *processorConfiguration = [[PSPDFProcessorConfiguration alloc] initWithDocument:document];

// Mark the processor to perform OCR on all document pages and detect text in English.
// NSMakeRange takes a location and a length, so the length is the full page count.
[processorConfiguration performOCROnPagesAtIndexes:[NSIndexSet indexSetWithIndexesInRange:NSMakeRange(0, document.pageCount)] options:[[PSPDFProcessorOCROptions alloc] initWithLanguage:PSPDFOCRLanguageEnglish]];

PSPDFProcessor *processor = [[PSPDFProcessor alloc] initWithConfiguration:processorConfiguration securityOptions:nil];
NSURL *ocrURL = ... // Writeable URL.

dispatch_async(dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0), ^{
    NSError *error;
    // This performs the actual OCR and generates the new document at the provided URL.
    // Check the return value rather than the error pointer to detect failure.
    if (![processor writeToFileURL:ocrURL error:&error]) {
        // Handle error.
        return;
    }
    dispatch_async(dispatch_get_main_queue(), ^{
        PSPDFDocument *ocrDocument = [[PSPDFDocument alloc] initWithURL:ocrURL];
    });
});

Performing OCR works even if a document already has existing real text objects on a page. In that case, the existing content will remain untouched and coexist with the OCR-detected text.
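
The page index set passed to performOCROnPages(at:options:) doesn't have to cover the whole document. The following is a minimal sketch of building such a set for a subset of pages using only Foundation. The ocrPageIndexes helper is hypothetical and not part of PSPDFKit; it assumes you convert the document's page count to Int before calling it.

```swift
import Foundation

/// Builds the page index set for an OCR pass, dropping any index outside `0..<pageCount`.
/// `ocrPageIndexes` is a hypothetical helper, not part of PSPDFKit.
func ocrPageIndexes(requested: [Int], pageCount: Int) -> IndexSet {
    var indexes = IndexSet()
    for page in requested where page >= 0 && page < pageCount {
        indexes.insert(page)
    }
    return indexes
}

// Pages 0, 2, and 3 of a five-page document; index 7 doesn't exist and is dropped.
let pages = ocrPageIndexes(requested: [0, 2, 3, 7], pageCount: 5)
print(pages.map { $0 }) // prints [0, 2, 3]

// With PSPDFKitOCR set up, the result would be passed as the first argument:
// processorConfiguration.performOCROnPages(at: pages, options: ProcessorOCROptions(language: .english))
```

Limiting OCR to the pages that actually need it can be worthwhile, since processing takes time.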

ℹ️ Note: The write call, which also performs OCR, is dispatched to a background thread. We recommend always dispatching processing to a background thread when OCR is involved, since it can take some time to complete.
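
If you dispatch processing in several places, the pattern from the note can be wrapped in a small helper. This is only a sketch: performInBackground is a hypothetical function, not a PSPDFKit API, and the work closure stands in for a blocking call such as the processor's write call.

```swift
import Foundation

/// Runs blocking, throwing `work` on a background queue and reports the
/// outcome on `completionQueue` (the main queue by default).
/// `performInBackground` is a hypothetical helper, not part of PSPDFKit.
func performInBackground<T>(
    _ work: @escaping () throws -> T,
    completionQueue: DispatchQueue = .main,
    completion: @escaping (Result<T, Error>) -> Void
) {
    DispatchQueue.global(qos: .userInitiated).async {
        // Result(catching:) captures either the return value or the thrown error.
        let result = Result { try work() }
        completionQueue.async { completion(result) }
    }
}

// Usage with the processor and URL from the snippet above (requires PSPDFKitOCR):
// performInBackground({ try processor.write(toFileURL: ocrURL) }) { result in
//     switch result {
//     case .success:
//         let ocrDocument = Document(url: ocrURL)
//     case .failure(let error):
//         print("OCR failed: \(error)")
//     }
// }
```

Delivering a Result keeps the success and failure paths in one place, so the new document is only opened when writing succeeded.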

Language Bundling

For OCR, we ship separate trained data models for every language. These aren’t included in the framework, and they need to be separately added to your project. This ensures that data models you might not need don’t add to your app bundle’s size. Therefore, we recommend only adding the specific language bundles for the languages you actually need to perform OCR for. If you added PSPDFKitOCR manually, this can be done by dragging the .bundle files into the Copy Bundle Resources build phase in your app. Read more about integrating PSPDFKitOCR and the language packs in our getting started guide.
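
During development, it can help to confirm that the language bundles you added really ended up in the built app. Below is a minimal sketch using only Foundation. The missingLanguageBundles helper is hypothetical, and the bundle name in the usage comment is a placeholder: use the file names of the .bundle files you actually added to your project.

```swift
import Foundation

/// Returns the entries of `names` for which no `<name>.bundle` exists in `directory`.
/// Hypothetical helper, not part of PSPDFKit.
func missingLanguageBundles(named names: [String], in directory: URL) -> [String] {
    names.filter { name in
        let url = directory.appendingPathComponent("\(name).bundle")
        return !FileManager.default.fileExists(atPath: url.path)
    }
}

// Usage inside an app, where copied bundle resources live under Bundle.main.resourceURL.
// "your-language-bundle" is a placeholder, not a real PSPDFKit file name.
// if let resources = Bundle.main.resourceURL {
//     let missing = missingLanguageBundles(named: ["your-language-bundle"], in: resources)
//     assert(missing.isEmpty, "Missing OCR language bundles: \(missing)")
// }
```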