Indexing PDF Documents on Android

PSPDFKit supports fast and efficient full-text search in PDF documents through PdfLibrary. This document describes how to get started with PdfLibrary.

Getting Started

To use PdfLibrary, begin by indexing documents:

// Assume that you have two valid `PdfDocument`s.
val doc1: PdfDocument = ...
val doc2: PdfDocument = ...

// The library will be saved in your application's files directory.
val library = PdfLibrary.get(File(context.filesDir, "library.db").absolutePath)
library.enqueueDocuments(listOf(doc1, doc2))

// Assume that you have two valid `PdfDocument`s.
PdfDocument doc1 = ...;
PdfDocument doc2 = ...;

// The library will be saved in your application's files directory.
PdfLibrary library = PdfLibrary.get(new File(context.getFilesDir(), "library.db").getAbsolutePath());
List<PdfDocument> documentList = new ArrayList<>();
documentList.add(doc1);
documentList.add(doc2);
library.enqueueDocuments(documentList);

PdfLibrary allows you to query for the current indexing state.

To query the library only after all documents have been indexed, check isIndexing() first. You can also check the current status of an individual document by using getIndexStatusForUID().
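As a rough illustration of waiting for indexing to finish, the sketch below polls a boolean check until it reports that indexing has stopped or a timeout elapses. IndexingGate and awaitIdle are hypothetical names, not part of PSPDFKit, and the sketch assumes isIndexing() takes no arguments and returns a boolean. It blocks the calling thread, so it should only be run off the main thread.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of PSPDFKit: polls an "is indexing" check.
final class IndexingGate {
    private IndexingGate() {}

    // Returns true once isIndexing reports false, or false if the timeout
    // elapses (or the thread is interrupted) while indexing is still running.
    static boolean awaitIdle(BooleanSupplier isIndexing, long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (isIndexing.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // Timed out while indexing was still running.
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Preserve the interrupt flag.
                return false;
            }
        }
        return true;
    }
}
```

Under the assumption stated above, a call might look like IndexingGate.awaitIdle(library::isIndexing, 100, 10_000) before running a search.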

To search, call search() with a query string, QueryOptions, and a QueryResultListener. The results are delivered to the onSearchCompleted callback of QueryResultListener as a Map from each document’s UID String to the set of page numbers that contain a match.
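To make the shape of that Map concrete, here is a minimal sketch that flattens a UID-to-pages map into an ordered list of hits, for example to feed a list view. SearchResultFormatter and the "uid:page" string format are hypothetical and chosen only for illustration; the only thing taken from the API is the Map<String, Set<Integer>> type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of PSPDFKit: turns the UID -> pages map
// into a flat, ordered list of "uid:page" strings.
final class SearchResultFormatter {
    private SearchResultFormatter() {}

    static List<String> flatten(Map<String, Set<Integer>> results) {
        List<String> hits = new ArrayList<>();
        // Sort UIDs so the order does not depend on the map implementation.
        for (String uid : new TreeSet<>(results.keySet())) {
            // Sort page numbers within each document.
            for (Integer page : new TreeSet<>(results.get(uid))) {
                hits.add(uid + ":" + page);
            }
        }
        return hits;
    }
}
```

Sorting is a design choice for stable display order; the callback itself makes no ordering promise that this sketch relies on.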

To show preview snippets, enable the generateTextPreviews() query option. The preview text snippets are then delivered to the onSearchPreviewsGenerated method of QueryResultListener as a Map from the document’s UID String to a set of QueryPreviewResult objects.

Example:

// Set up search result options.
val options = QueryOptions.Builder()
    .generateTextPreviews(true)
    .previewRange(20, 120)
    .build()

// Run the search. The search will run on a background thread and the callbacks will be called
// from the background thread as well.
library.search("looking for this text", options, object : QueryResultListener {
    override fun onSearchCompleted(searchString: String, results: Map<String, Set<Int>>) {
        // Results contain UID → set of pages mapping.
    }

    override fun onSearchPreviewsGenerated(searchString: String, previews: Map<String, Set<QueryPreviewResult>>) {
        // Previews contain UID → set of `QueryPreviewResult` mappings.
    }
})

// Set up search result options.
final QueryOptions options = new QueryOptions.Builder()
    .generateTextPreviews(true)
    .previewRange(20, 120)
    .build();

// Run the search. The search will run on a background thread and the callbacks will be called
// from the background thread as well.
library.search("looking for this text", options, new QueryResultListener() {
    @Override
    public void onSearchCompleted(@NonNull String searchString, @NonNull Map<String, Set<Integer>> results) {
        // Results contain UID → set of pages mapping.
    }

    @Override
    public void onSearchPreviewsGenerated(@NonNull String searchString, @NonNull Map<String, Set<QueryPreviewResult>> previews) {
        // Previews contain UID → set of `QueryPreviewResult` mappings.
    }
});
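
Because the callbacks in the example above run on a background thread, results usually need to be handed to another thread before they touch the UI. The following is a minimal sketch of that hand-off using a plain java.util.concurrent.Executor. ResultDispatcher is a hypothetical helper, not part of PSPDFKit, and which executor to pass in is left as an assumption (on Android, a main-thread executor would be a natural choice).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical helper, not part of PSPDFKit: forwards search results
// from the callback thread to a consumer running on another executor.
final class ResultDispatcher {
    private final Executor target;

    ResultDispatcher(Executor target) {
        this.target = target;
    }

    void deliver(Map<String, Set<Integer>> results,
                 Consumer<Map<String, Set<Integer>>> consumer) {
        // Copy the map so the consumer never sees later changes made by the caller.
        Map<String, Set<Integer>> snapshot = new HashMap<>();
        for (Map.Entry<String, Set<Integer>> entry : results.entrySet()) {
            snapshot.put(entry.getKey(),
                    Collections.unmodifiableSet(new HashSet<>(entry.getValue())));
        }
        Map<String, Set<Integer>> readOnly = Collections.unmodifiableMap(snapshot);
        // Run the consumer on the target executor instead of the calling thread.
        target.execute(() -> consumer.accept(readOnly));
    }
}
```

Inside onSearchCompleted, this could be called as dispatcher.deliver(results, this::showResults), where showResults is a hypothetical method that updates the UI.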