OCR

Information

OCR for SharePoint, Microsoft Teams, Microsoft OneDrive, and Salesforce is currently in development. For early access, or to learn more, please get in touch.

PSPDFKit for Web in Server-Backed mode can perform OCR (optical character recognition) on any document.

To do so, open the document from PSPDFKit Server and apply the performOcr document operation with Instance.applyOperations:

await instance.applyOperations([
  { type: "performOcr", language: "english", pageIndexes: "all" }
]);

This will detect all English text in the document and make it available for searching and manual text selection.
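
To check that the recognized text really is searchable, the call above can be wrapped in a small helper that runs OCR and then issues a search. This is a minimal sketch, not an official recipe: runOcrAndSearch is a hypothetical name, instance is assumed to be a PSPDFKit for Web instance for a document opened from PSPDFKit Server, and instance.search is assumed to resolve with a list-like result that exposes size.

```javascript
// Hypothetical helper: runs OCR on every page, then searches the recognized text.
// `instance` is assumed to be a PSPDFKit for Web instance in Server-Backed mode.
async function runOcrAndSearch(instance, language, term) {
  // Same operation shape as in the example above.
  await instance.applyOperations([
    { type: "performOcr", language, pageIndexes: "all" }
  ]);

  // Once OCR has finished, the recognized text should be available to search.
  const results = await instance.search(term);

  // Assumption: the result is list-like and exposes `size` (or `length`).
  return results.size ?? results.length ?? 0;
}

// Usage, inside an async function:
// const hits = await runOcrAndSearch(instance, "english", "invoice");
```

Returning only the number of matches keeps the sketch independent of the exact shape of each search result.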

Other Languages

If your document is written in a language other than English, set the language parameter to that language. For example, to perform OCR in Spanish, run:

await instance.applyOperations([
  { type: "performOcr", language: "spanish", pageIndexes: "all" }
]);

PSPDFKit for Web can perform OCR in the following languages:

  • Croatian
  • Czech
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Indonesian
  • Italian
  • Malay
  • Norwegian
  • Polish
  • Portuguese
  • Serbian
  • Slovak
  • Slovenian
  • Spanish
  • Swedish
  • Turkish
  • Welsh
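
Because the language is passed as a plain string, it can help to validate it before calling Instance.applyOperations. The sketch below is illustrative only: OCR_LANGUAGES and buildOcrOperation are hypothetical names, and the lowercase identifiers are assumed to follow the "english" and "spanish" pattern shown earlier. Only those two spellings appear in this guide, so the others are unconfirmed.

```javascript
// Languages listed in this guide, lowercased to match the "english" and
// "spanish" identifiers used in the examples (the other spellings are assumed).
const OCR_LANGUAGES = [
  "croatian", "czech", "danish", "dutch", "english", "finnish", "french",
  "german", "indonesian", "italian", "malay", "norwegian", "polish",
  "portuguese", "serbian", "slovak", "slovenian", "spanish", "swedish",
  "turkish", "welsh"
];

// Hypothetical helper: builds a performOcr operation, or throws early if the
// language is not in the list above.
function buildOcrOperation(language, pageIndexes = "all") {
  const normalized = String(language).trim().toLowerCase();
  if (!OCR_LANGUAGES.includes(normalized)) {
    throw new RangeError(`Unsupported OCR language: ${language}`);
  }
  return { type: "performOcr", language: normalized, pageIndexes };
}

// Usage, assuming `instance` was opened from PSPDFKit Server:
// await instance.applyOperations([buildOcrOperation("German")]);
```

Failing before the request is sent reports a mistyped language name immediately instead of waiting for the operation to be rejected.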