Document Processing

PSPDFKit allows editing, splitting, and merging of documents, as well as annotation flattening using the PdfProcessor class.

Important: When using the processor API before loading a document, you must ensure PSPDFKit is fully initialized or processing will fail. Check out our Adding the License Key guide for more information.

Extraction of Pages

PdfProcessor can export pages from one document into another document. You can choose to extract a single page, a range of pages, or even multiple page ranges:

// Kotlin
// Page indexes start at 0, so this set contains only the fifth page of the document.
val keepFifthPage = PdfProcessorTask.fromDocument(document).keepPages(setOf(4))

// Keep pages 5, 6, and 7.
val keepPagesFiveToSeven = PdfProcessorTask.fromDocument(document).keepPages(setOf(4, 5, 6))

// Remove the first page.
val removeFirstPage = PdfProcessorTask.fromDocument(document).removePages(setOf(0))

// Java
// Page indexes start at 0, so this set contains only the fifth page of the document.
PdfProcessorTask keepFifthPage = PdfProcessorTask.fromDocument(document).keepPages(new HashSet<Integer>(Arrays.asList(4)));

// Keep pages 5, 6, and 7.
PdfProcessorTask keepPagesFiveToSeven = PdfProcessorTask.fromDocument(document).keepPages(new HashSet<Integer>(Arrays.asList(4, 5, 6)));

// Remove the first page.
PdfProcessorTask removeFirstPage = PdfProcessorTask.fromDocument(document).removePages(new HashSet<Integer>(Arrays.asList(0)));
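
Because keepPages and removePages take zero-based page indexes, page numbers entered by users (which usually start at 1) need to be shifted by one. The following is a minimal sketch of one way to build such a set from several page ranges. PageRanges and toPageIndexes are hypothetical names used for illustration only and are not part of PSPDFKit:

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration; not part of the PSPDFKit API. */
final class PageRanges {

    private PageRanges() {}

    /**
     * Converts inclusive, one-based page ranges into a sorted set of zero-based page indexes.
     * Each range is a two-element array: {firstPage, lastPage}.
     */
    static Set<Integer> toPageIndexes(int[]... ranges) {
        Set<Integer> indexes = new TreeSet<>();
        for (int[] range : ranges) {
            if (range.length != 2 || range[0] < 1 || range[1] < range[0]) {
                throw new IllegalArgumentException(
                        "Expected {firstPage, lastPage} with 1 <= firstPage <= lastPage");
            }
            for (int page = range[0]; page <= range[1]; page++) {
                indexes.add(page - 1); // Page 1 has index 0.
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Pages 5 to 7 and page 10, numbered the way a user would count them.
        Set<Integer> indexes = toPageIndexes(new int[] {5, 7}, new int[] {10, 10});
        System.out.println(indexes); // prints [4, 5, 6, 9]
    }
}
```

The resulting set could then be passed to keepPages or removePages in place of a hand-written set of indexes.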

After creating a PdfProcessorTask, you can start extracting the pages by calling either the PdfProcessor#processDocumentAsync method or the PdfProcessor#processDocument method. By default, all annotations are preserved. You can queue multiple operations on a document by calling multiple methods on a PdfProcessorTask object before starting processing. The operations are executed in the same order as your method calls:

// Kotlin
val outputFile = File(getFilesDir(), "extracted-pages.pdf")

// Keep pages 5, 6, and 7.
val task = PdfProcessorTask.fromDocument(document).keepPages(setOf(4, 5, 6))

PdfProcessor.processDocumentAsync(task, outputFile)
    // Run processing on a background thread.
    .subscribeOn(Schedulers.io())
    // Publish results on the main thread so the UI can be updated.
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(
        { progress: PdfProcessor.ProcessorProgress -> Toast.makeText(context, "Processing page ${progress.pagesProcessed}/${progress.totalPages}", Toast.LENGTH_SHORT).show() },
        { error: Throwable -> Toast.makeText(context, "Processing has failed: ${error.message}", Toast.LENGTH_SHORT).show() },
        { Toast.makeText(context, "Processing has been completed successfully.", Toast.LENGTH_SHORT).show() }
    )

// Java
final File outputFile = new File(getFilesDir(), "extracted-pages.pdf");

// Keep pages 5, 6, and 7.
PdfProcessorTask task = PdfProcessorTask.fromDocument(document).keepPages(new HashSet<Integer>(Arrays.asList(4, 5, 6)));
PdfProcessor.processDocumentAsync(task, outputFile)
    // Run processing on a background thread.
    .subscribeOn(Schedulers.io())
    // Publish results on the main thread so the UI can be updated.
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(new DefaultSubscriber<PdfProcessor.ProcessorProgress>() {

        @Override
        public void onComplete() {
            Toast.makeText(context, "Processing has been completed successfully.", Toast.LENGTH_SHORT).show();
        }

        @Override
        public void onError(Throwable e) {
            Toast.makeText(context, "Processing has failed: " + e.getMessage(), Toast.LENGTH_SHORT).show();
        }

        @Override
        public void onNext(PdfProcessor.ProcessorProgress processorProgress) {
            Toast.makeText(context, "Processing page " + processorProgress.getPagesProcessed() + "/" + processorProgress.getTotalPages(), Toast.LENGTH_SHORT).show();
        }
    });

Tip: You can use page extraction to merge pages of two or more documents. All you need to do is load a compound PdfDocument — for example, by using PSPDFKit#openDocuments or any of the PdfActivity#showDocuments methods. Have a look at the DocumentProcessingExample inside the Catalog app for a demo of this.

Annotation Flattening

When an annotation is flattened, it is removed from the document while its visual representation is kept intact. A flattened annotation is still visible but can no longer be edited by your users or by your app. This is useful, for example, for permanently fixing annotations onto your document. Unless otherwise specified, the processor keeps all annotations as they are.

To change how annotations are processed, use the PdfProcessorTask#changeAllAnnotations, PdfProcessorTask#changeAnnotationsOfType, or PdfProcessorTask#changeAnnotations method calls:

// Kotlin
// Process all pages of the document, flattening all of its annotations.
val flattenAllTask = PdfProcessorTask.fromDocument(document)
    .changeAllAnnotations(PdfProcessorTask.AnnotationProcessingMode.FLATTEN)
PdfProcessor.processDocumentAsync(...) ...

// Flatten only free text annotations, and keep everything else.
val flattenFreeTextTask = PdfProcessorTask.fromDocument(document)
    .changeAllAnnotations(PdfProcessorTask.AnnotationProcessingMode.KEEP)
    .changeAnnotationsOfType(AnnotationType.FREETEXT, PdfProcessorTask.AnnotationProcessingMode.FLATTEN)
PdfProcessor.processDocumentAsync(...) ...

// Java
// Process all pages of the document, flattening all of its annotations.
PdfProcessorTask flattenAllTask = PdfProcessorTask.fromDocument(document)
    .changeAllAnnotations(PdfProcessorTask.AnnotationProcessingMode.FLATTEN);
PdfProcessor.processDocumentAsync(...) ...

// Flatten only free text annotations, and keep everything else.
PdfProcessorTask flattenFreeTextTask = PdfProcessorTask.fromDocument(document)
    .changeAllAnnotations(PdfProcessorTask.AnnotationProcessingMode.KEEP)
    .changeAnnotationsOfType(AnnotationType.FREETEXT, PdfProcessorTask.AnnotationProcessingMode.FLATTEN);
PdfProcessor.processDocumentAsync(...) ...

Form Flattening

Form elements are a special annotation type, AnnotationType#WIDGET. You can pass this type to the PdfProcessorTask#changeAnnotationsOfType method described above to control flattening for all the form elements inside the document. If you want to flatten only form elements of a specific FormType, you can use PdfProcessorTask#changeFormsOfType instead.

For example, you might not want to flatten a signature form element, as the resulting document would then include only the visual representation of the digital signature and not the actual digital signature.

Rotating Pages

Page rotation is supported for 90, 180, and 270 degrees:

// Kotlin
// Rotate all pages of the document by 90 degrees.
val task = PdfProcessorTask.fromDocument(document)
for (pageIndex in 0 until document.pageCount) {
    task.rotatePage(pageIndex, 90)
}

PdfProcessor.processDocumentAsync(...) ...

// Java
// Rotate all pages of the document by 90 degrees.
final PdfProcessorTask task = PdfProcessorTask.fromDocument(document);
for (int pageIndex = 0, pageCount = document.getPageCount(); pageIndex < pageCount; pageIndex++) {
    task.rotatePage(pageIndex, 90);
}

PdfProcessor.processDocumentAsync(...) ...
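
Since only rotations of 90, 180, and 270 degrees are supported, it can help to validate and normalize an angle that comes from user input or configuration before handing it to rotatePage. The following is a minimal sketch of such a check. PageRotations and normalize are hypothetical names used for illustration only and are not part of PSPDFKit:

```java
/** Hypothetical helper for illustration; not part of the PSPDFKit API. */
final class PageRotations {

    private PageRotations() {}

    /**
     * Maps any multiple of 90 degrees onto 0, 90, 180, or 270.
     * Throws if the angle is not a multiple of 90.
     */
    static int normalize(int degrees) {
        if (degrees % 90 != 0) {
            throw new IllegalArgumentException(
                    "Rotation must be a multiple of 90 degrees: " + degrees);
        }
        // floorMod keeps the result non-negative, so -90 becomes 270.
        return Math.floorMod(degrees, 360);
    }

    public static void main(String[] args) {
        System.out.println(normalize(-90)); // prints 270
        System.out.println(normalize(450)); // prints 90
        System.out.println(normalize(360)); // prints 0
    }
}
```

A result of 0 means the page orientation would not change, so in that case the call to rotatePage can simply be skipped.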

Protecting a Document with a Password

A PDF document can be encrypted to protect its contents from unauthorized access.

PdfProcessor supports creating encrypted password-protected documents by setting a password via DocumentSaveOptions#setPassword:

// Kotlin
val task = PdfProcessorTask.fromDocument(document)

// Create default document save options.
val documentSaveOptions = document.getDefaultDocumentSaveOptions()
// This will create an encrypted password-protected document.
documentSaveOptions.password = "password"

// Use the created save options when processing the document.
PdfProcessor.processDocumentAsync(task, outputFile, documentSaveOptions) ...

// Java
final PdfProcessorTask task = PdfProcessorTask.fromDocument(document);

// Create default document save options.
DocumentSaveOptions documentSaveOptions = document.getDefaultDocumentSaveOptions();
// This will create an encrypted password-protected document.
documentSaveOptions.setPassword("password");

// Use the created save options when processing the document.
PdfProcessor.processDocumentAsync(task, outputFile, documentSaveOptions) ...