Enabling the unicode61 Tokenizer


PSPDFKit uses SQLite to build the full-text index used in PSPDFLibrary and PSPDFDocumentPickerController, and for various other data-saving operations (such as the image cache metadata). The default build of PSPDFKit doesn't ship with its own SQLite version; it uses the one that is already part of iOS. However, PSPDFKit also supports custom SQLite builds.

At the time of writing, the current version of SQLite is 3.12.x, which offers many improvements in both performance and stability. (For example, PSPDFKit switches to memory-mapped I/O, which can be twice as fast, when it detects SQLite 3.7.17 or higher.)
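A version check of this kind can be sketched as follows. This is only an illustration and not PSPDFKit's actual implementation: `supportsMemoryMappedIO` is a hypothetical helper, and it assumes the integer encoding returned by SQLite's `sqlite3_libversion_number()` function (major * 1,000,000 + minor * 1,000 + patch).

```swift
// Hypothetical helper, not part of PSPDFKit: decides whether a SQLite version
// is new enough for memory-mapped I/O (available since SQLite 3.7.17).
// `versionNumber` uses the encoding of sqlite3_libversion_number():
// major * 1_000_000 + minor * 1_000 + patch, e.g. 3007017 for 3.7.17.
func supportsMemoryMappedIO(versionNumber: Int32) -> Bool {
    let minimumVersion: Int32 = 3_007_017 // SQLite 3.7.17
    return versionNumber >= minimumVersion
}

// Example values; in an app you would pass sqlite3_libversion_number()
// after `import SQLite3`.
let olderSQLite = supportsMemoryMappedIO(versionNumber: 3_007_016) // 3.7.16
let newerSQLite = supportsMemoryMappedIO(versionNumber: 3_012_000) // 3.12.0
```

Keeping the comparison in a pure function makes it easy to test without linking any particular SQLite build.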

By default, PSPDFKit uses its own, custom tokenizer which works great for many languages, including CJK. It also enables searching for related words, like finding 'dependencies' when searching for 'depending'.

When should you ship your own build of SQLite?

If you rely heavily on exact word or phrase matches, the tokenizer shipped with PSPDFKit might not be optimal, because it also matches related words. In that case, you should consider switching to a different one.

The default tokenizer PSPDFKit uses for building the FTS index can also deal with CJK characters. If you need exact matches, the unicode61 tokenizer is a good alternative. Note that Apple only started to ship this tokenizer with iOS 9.
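Whether the SQLite build your app links against includes the unicode61 tokenizer can also be checked at runtime. The following is a minimal sketch that talks to SQLite's C API directly rather than to PSPDFKit: `isTokenizerAvailable` and the table name `tokenizer_probe` are hypothetical, and the check assumes the build has FTS4 enabled.

```swift
import SQLite3

// Hypothetical helper: returns true if the linked SQLite can create an FTS4
// table with the given tokenizer. Uses a throwaway in-memory database.
// Only pass trusted, constant tokenizer names, since the name is
// interpolated into the SQL statement.
func isTokenizerAvailable(_ tokenizer: String) -> Bool {
    var db: OpaquePointer?
    guard sqlite3_open(":memory:", &db) == SQLITE_OK else {
        sqlite3_close(db) // Release the handle SQLite may still have allocated
        return false
    }
    defer { sqlite3_close(db) }

    // Creating the virtual table fails if the tokenizer is unknown
    // or if FTS4 is not compiled into this SQLite build.
    let sql = "CREATE VIRTUAL TABLE tokenizer_probe USING fts4(content, tokenize=\(tokenizer))"
    return sqlite3_exec(db, sql, nil, nil, nil) == SQLITE_OK
}

let hasUnicode61 = isTokenizerAvailable("unicode61")
```

If `hasUnicode61` is false, the app is running against a SQLite build without the tokenizer, and shipping your own build is the alternative described on this page.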

If your app only targets iOS 9 and above, you can simply enable the unicode61 tokenizer. All you need to do is create your own library with that tokenizer and initialize your PSPDFDocumentPickerController with it.

Swift:
do {
    let library = try PSPDFLibrary(path: PSPDFLibrary.defaultLibraryPath(), tokenizer: "unicode61")
    let documentPicker = PSPDFDocumentPickerController(directory: "/path/to/files", includeSubdirectories: true, library: library)
} catch {
    // Handle error
}

Objective-C:
NSError *error = nil;
PSPDFLibrary *library = [PSPDFLibrary libraryWithPath:PSPDFLibrary.defaultLibraryPath tokenizer:@"unicode61" error:&error];
if (!library) { /* Handle error */ }
PSPDFDocumentPickerController *documentPicker = [[PSPDFDocumentPickerController alloc] initWithDirectory:@"/path/to/files" includeSubdirectories:YES library:library];

Optionally, you can also ship your own version of SQLite. To do so, follow these steps.

In the PSPDFKit.dmg you downloaded, you will find a current version of SQLite in the "Extras" folder, already prepared to be linked. Add SQLite.xcodeproj to your Xcode project, then add libSQLite.a both as a Target Dependency and under "Link Binary with Libraries". Make sure that you don't also link the libsqlite3.tbd library.

After you set a different tokenizer, the index has to be fully rebuilt. To force this, delete your app or at least the library file.
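Removing the library file could look roughly like the sketch below. `removeLibraryFile(atPath:)` is a hypothetical helper rather than a PSPDFKit API. It assumes you pass the same path you use when creating the library (for example the value of `PSPDFLibrary.defaultLibraryPath()`) and that no library instance is using the file at that moment.

```swift
import Foundation

// Hypothetical helper: deletes the full-text index stored at `path` so that
// it is rebuilt from scratch the next time a library is created there.
// Returns true if something was removed, false if nothing existed at the path.
@discardableResult
func removeLibraryFile(atPath path: String) throws -> Bool {
    let fileManager = FileManager.default
    guard fileManager.fileExists(atPath: path) else { return false }
    try fileManager.removeItem(atPath: path)
    return true
}
```

Call the helper once, before you create the library with the new tokenizer, for example guarded by a flag that records which tokenizer built the existing index.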
