OCR Supported Languages in Linux


PSPDFKit Processor has been deprecated and replaced by PSPDFKit Document Engine. All PSPDFKit Processor licenses will work as before and be supported until 15 May 2024 (we will contact you about license migration). To start using Document Engine, refer to the migration guide. With Document Engine, you’ll have access to robust new capabilities (read the blog for more information).

The OCR component offered by many of our products supports a wide variety of languages. Identifying text in a document requires knowledge of the specific language, its symbols, and its rules (for example, its use of ligatures and its punctuation rules). Because of this, the OCR component requires you to specify the language of the document being processed as part of its configuration.

Below is the complete list of languages the OCR component supports. The list is the same on every supported platform.

  • Croatian

  • Czech

  • Danish

  • Dutch

  • English

  • Finnish

  • French

  • German

  • Indonesian

  • Italian

  • Malay

  • Norwegian

  • Polish

  • Portuguese

  • Serbian

  • Slovak

  • Slovenian

  • Spanish

  • Swedish

  • Turkish

  • Welsh

Please note that languages aren’t region-specific. For example, you’d use English for both American English and British English.

If you don’t see the language you need in this list, please contact support.
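
The two rules above (a fixed set of supported languages, and no regional variants) can be checked before a document is sent for OCR. The following is a minimal sketch of such a check, not part of any PSPDFKit API: the names `SUPPORTED_OCR_LANGUAGES` and `normalize_ocr_language` are hypothetical, and the assumption that a regional variant is written as "<region> <language>" (for example, "American English") is ours. The exact spelling the OCR configuration expects for a language is also not stated here, so this sketch simply returns the name as it appears in the list.

```python
# Languages copied from the list above, lowercased for comparison.
SUPPORTED_OCR_LANGUAGES = frozenset({
    "croatian", "czech", "danish", "dutch", "english", "finnish", "french",
    "german", "indonesian", "italian", "malay", "norwegian", "polish",
    "portuguese", "serbian", "slovak", "slovenian", "spanish", "swedish",
    "turkish", "welsh",
})


def normalize_ocr_language(name: str) -> str:
    """Map user input to a language name from the supported list.

    Hypothetical helper. Assumes regional variants are written as
    "<region> <language>", so only the last word is kept.
    Raises ValueError if the language is not in the list.
    """
    tokens = name.strip().lower().split()
    if not tokens:
        raise ValueError("OCR language must not be empty")

    # Languages aren't region-specific: "american english" -> "english".
    language = tokens[-1]

    if language not in SUPPORTED_OCR_LANGUAGES:
        raise ValueError(f"Unsupported OCR language: {name!r}")

    # Return the name capitalized as it appears in the list above.
    return language.capitalize()


if __name__ == "__main__":
    print(normalize_ocr_language("British English"))
    print(normalize_ocr_language("welsh"))
```

Rejecting unknown names early gives the caller a clear error instead of a failed or low-quality OCR run, and it is also the point at which a request to support could be triggered.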