Data Providers

In addition to loading documents using a URI, PSPDFKit can load documents using a DataProvider. A data provider defines a common interface for PSPDFKit to load PDF documents from arbitrary sources like cloud hosts, in-memory, content providers, and others.

Launching a PdfActivity from a data provider works just like starting it using a Uri. Here is an example, first in Kotlin and then in Java, that uses the AssetDataProvider to load a PDF document from the assets/ directory of your app.

// Create a data provider that reads directly from the app's assets.
val dataProvider = AssetDataProvider("document.pdf")

// Launch the activity using the data provider.
val intent = PdfActivityIntentBuilder.fromDataProvider(context, dataProvider)
        .configuration(configuration.build())
        .build()
context.startActivity(intent)

// Create a data provider that reads directly from the app's assets.
DataProvider dataProvider = new AssetDataProvider("document.pdf");

// Launch the activity using the data provider.
final Intent intent = PdfActivityIntentBuilder.fromDataProvider(context, dataProvider)
        .configuration(configuration.build())
        .build();
context.startActivity(intent);

Existing Data Provider Classes

PSPDFKit comes with a range of predefined data providers that all implement DataProvider:

  • AssetDataProvider – allows loading of documents directly from the app's assets/ directory. This is useful if you ship PDF documents as part of your APK file. Note that copying assets to the internal device storage may perform better than reading them directly from the assets using this provider.

  • ContentResolverDataProvider – uses Android's content resolver framework for reading documents directly from a ContentProvider specified by a URI using the content:// scheme.

  • InputStreamDataProvider – is an abstract base class that simplifies reading documents from an InputStream. Subclasses have to override the openInputStream() method to provide the ready-to-read stream. Be aware that while an InputStream is convenient to use, it can cause performance issues: PDF documents are read using random access, whereas an InputStream only offers sequential access. Therefore, InputStreamDataProvider will reopen the underlying input stream every time it needs to "seek backwards".

  • AesDataProvider – ships with the catalog app and allows you to open AES256-CTR encrypted files without storing the decrypted blocks anywhere. It supports random seeking and can handle large PDF files without causing an OutOfMemoryError.
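To illustrate why stream-backed reading can be slow, here is a minimal plain-Java sketch of random access emulated on top of a forward-only stream. It is not PSPDFKit code: `ReopeningStreamReader`, its `read(offset, size)` method, and the `openCount` counter are hypothetical names chosen for this illustration, and the real InputStreamDataProvider may be implemented differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.function.Supplier;

/** Hypothetical helper (not part of PSPDFKit): random access on top of a forward-only stream. */
final class ReopeningStreamReader {
    private final Supplier<InputStream> streamFactory;
    private InputStream stream;
    private long position;
    private int openCount;

    ReopeningStreamReader(Supplier<InputStream> streamFactory) {
        this.streamFactory = streamFactory;
    }

    /** Reads up to {@code size} bytes starting at {@code offset}; returns fewer bytes at end of stream. */
    byte[] read(long offset, int size) throws IOException {
        if (stream == null || offset < position) {
            // A stream cannot move backwards, so we have to start again from byte 0.
            if (stream != null) stream.close();
            stream = streamFactory.get();
            position = 0;
            openCount++;
        }
        // Moving forwards means skipping over every byte between the current position and the target.
        while (position < offset) {
            long skipped = stream.skip(offset - position);
            if (skipped <= 0) {
                // skip() may legally return 0, so read a single byte to detect the end of the stream.
                if (stream.read() < 0) return new byte[0];
                skipped = 1;
            }
            position += skipped;
        }
        byte[] buffer = new byte[size];
        int total = 0;
        while (total < size) {
            int n = stream.read(buffer, total, size - total);
            if (n < 0) break;
            total += n;
        }
        position += total;
        return Arrays.copyOf(buffer, total);
    }

    /** Number of times the underlying stream had to be opened, including the first time. */
    int getOpenCount() {
        return openCount;
    }
}
```

In this sketch, reading at increasing offsets reuses the open stream, while any read at an offset before the current position opens the source again and skips forward from the start. For a stream that is expensive to open (for example, a network response), that cost is paid on every backwards seek.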

Custom Data Provider

To create a custom data provider for your application, you have to create a class that implements the DataProvider interface and all of its methods. If you would like to use your data provider with PdfActivity, your class also needs to implement Android's Parcelable interface. If you plan to use the data provider directly with the PdfFragment, you don't need to make it parcelable.

Have a look at the CustomDataProviderExample inside the catalog app, which shows how to create a data provider that can read a PDF document from the app's res/raw/ directory using an InputStream.

Data Provider Progress

If your custom DataProvider needs to prepare a document before it can be shown using PSPDFKit – for example by downloading the PDF file from the web or by decrypting it upfront – you can implement the ProgressDataProvider interface and PSPDFKit will take care of showing a progress UI to your users. The interface defines a single method, #observeProgress, which has to return an RxJava Flowable that emits the current progress as a Double in the range [0, 1].

💡 Tip: If your ProgressDataProvider has no progress to report (e.g. because the file is already available), it can simply return ProgressDataProvider.COMPLETE from #observeProgress.

Here's a small example of a data provider that reports the progress of a file download. The full example can be found in our catalog app as ProgressProviderExample.

RemoteDataProvider.kt
class RemoteDataProvider : InputStreamDataProvider(), ProgressDataProvider {
    /** Internally, the data provider uses a subject to publish any download progress. */
    private val progressSubject = PublishSubject.create<Double>()
    /** Responsible for downloading our PDF.  */
    private var downloadJob: DownloadJob? = null
    /** Used to block document opening until the download is done. */
    private val downloadLatch = CountDownLatch(1)

    override fun observeProgress(): Flowable<Double> {
        // We can just return our PublishSubject.
        return progressSubject.toFlowable(BackpressureStrategy.LATEST)
    }

    override fun openInputStream(): InputStream {
        val job = startDownloadIfNotRunning()
        // Block document opening, until the download is done.
        downloadLatch.await()
        // Once the download is complete, continue with the document opening.
        return FileInputStream(job.outputFile)
    }

    private fun startDownloadIfNotRunning(): DownloadJob {
        var job = downloadJob
        if (job != null) return job

        job = DownloadJob.startDownload(...)
        job.setProgressListener(object : DownloadJob.ProgressListener {
            override fun onProgress(progress: Progress) {
                // Notify our listeners about the download progress.
                progressSubject.onNext(progress.bytesReceived.toDouble() / progress.totalBytes.toDouble())
            }

            override fun onComplete(output: File) {
                progressSubject.onComplete()
                // Unblock the actual document loading.
                downloadLatch.countDown()
            }

            ...
        })

        downloadJob = job
        return job
    }
}

RemoteDataProvider.java
public static class RemoteDataProvider
    extends InputStreamDataProvider implements ProgressDataProvider {

    /** Internally, the data provider uses a subject to publish any download progress. */
    private PublishSubject<Double> progressSubject = PublishSubject.create();
    /** Responsible for downloading our PDF. */
    private DownloadJob downloadJob;
    /** Used to block document opening until the download is done. */
    private CountDownLatch downloadLatch = new CountDownLatch(1);

    @NonNull @Override protected InputStream openInputStream() throws Exception {
        startDownloadIfNotRunning();
        // Block document opening, until the download is done.
        downloadLatch.await();
        // Once the download is complete, continue with the document opening.
        return new FileInputStream(downloadJob.getOutputFile());
    }

    @NonNull @Override public Flowable<Double> observeProgress() {
        // We can just return our PublishSubject.
        return progressSubject.toFlowable(BackpressureStrategy.LATEST);
    }

    private void startDownloadIfNotRunning() {
        if (downloadJob != null) return;

        // In this short example, we use the PSPDFKit DownloadJob to download a PDF from the web.
        downloadJob = DownloadJob.startDownload(...);
        downloadJob.setProgressListener(new DownloadJob.ProgressListener() {
            @Override public void onProgress(@NonNull Progress progress) {
                // Notify our listeners about the download progress.
                progressSubject.onNext((double) progress.bytesReceived / (double) progress.totalBytes);
            }

            @Override public void onComplete(@NonNull File output) {
                progressSubject.onComplete();

                // Unblock the actual document loading.
                downloadLatch.countDown();
            }
            ...
        });
    }
}
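The onProgress callbacks above divide bytesReceived by totalBytes directly. If the total size is zero or negative (for example, because the server did not report a length, which is an assumption about the download source and not documented behavior), the result is NaN or falls outside the [0, 1] range that observeProgress is expected to emit. A minimal sketch of a guard follows; `ProgressFraction` is a hypothetical helper and not part of PSPDFKit.

```java
/** Hypothetical helper (not part of PSPDFKit): maps byte counts to a progress value in [0, 1]. */
final class ProgressFraction {
    private ProgressFraction() {}

    static double of(long bytesReceived, long totalBytes) {
        // Assumption: a non-positive total means the size is unknown, so report no progress yet.
        if (totalBytes <= 0) return 0.0;
        double fraction = (double) bytesReceived / (double) totalBytes;
        // Clamp in case more bytes arrive than were announced.
        return Math.max(0.0, Math.min(1.0, fraction));
    }
}
```

Inside onProgress, the subject could then be fed with `ProgressFraction.of(progress.bytesReceived, progress.totalBytes)` instead of the raw division.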

Avoiding In-memory Data Providers

In most cases, it is best to serve your PDF data from some kind of random access storage – usually a file. Keeping your PDF data solely in memory has two major disadvantages:

  • It can quickly lead to an OutOfMemoryError. This can happen when you don't have full control over the size of your PDFs (e.g. when a user opens a big PDF) or when running your app on a device with less memory than anticipated.

  • An in-memory data provider can't be retained across process recreations. By design, Android can kill your app's process as soon as all activities of your app are in the background. Once the app comes to the foreground your process is recreated – leaving your in-memory data provider without data.

ℹ️ Note: For these reasons, the previously available MemoryDataProvider was removed in PSPDFKit 3.1.1 for Android. Depending on your original usage scenario, you are advised to use one of the other available data provider classes.
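If your PDF data only exists as a byte array, one way to follow this advice is to write it to a file once and read from the file afterwards. The sketch below uses plain Java only; `SpillToFile`, `spill`, and `readAt` are hypothetical names, and the target directory is an assumption (on Android it would typically be a directory such as the app's cache directory). It also assumes that java.nio.file is available on your minimum API level.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of PSPDFKit): moves in-memory PDF bytes to random access storage. */
final class SpillToFile {
    private SpillToFile() {}

    /** Writes the bytes to a new file inside {@code directory} and returns its path. */
    static Path spill(byte[] pdfBytes, Path directory) throws IOException {
        Path file = Files.createTempFile(directory, "document", ".pdf");
        Files.write(file, pdfBytes);
        return file;
    }

    /** Reads up to {@code size} bytes at {@code offset} without loading the whole file into memory. */
    static byte[] readAt(Path file, long offset, int size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long available = Math.max(0, raf.length() - offset);
            byte[] buffer = new byte[(int) Math.min(size, available)];
            // Unlike a stream, a file can jump straight to any offset, forwards or backwards.
            raf.seek(offset);
            raf.readFully(buffer);
            return buffer;
        }
    }
}
```

Once the data is on disk, the byte array can be released, and the file survives process recreation, so the document can be reopened from its path or URI.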