Save PDFs to a Custom Data Provider on iOS

PSPDFKit supports loading data from many different sources. Any object that conforms to the DataProviding protocol can supply that data; such an object is known as a data provider. This is especially helpful if you want to support your own encryption or compression scheme.

A data provider is an object that defines how PDF data is read from and written to a particular source. This makes reading and writing PDF data customizable, and it gives you the freedom to store your data in the exact way you need it.

Existing Data Providers

PSPDFKit ships with several premade data provider classes.

Custom Data Providers

You can also write your own custom data provider and pass it to Document's init(dataProviders:) initializer.

Read Support

To provide read support, you have to implement the DataProviding protocol. It offers methods for reading the data at a specific offset and for uniquely identifying the content. Implementing the more specialized FileDataProviding protocol is preferred if your data provider is backed by a file on disk.

Here’s an example of how to implement the DataProviding protocol to read from a Data instance:

class YourDataProvider: NSObject, NSSecureCoding, DataProviding {

    // MARK: Properties

    var size: UInt64 {
        // Returns the size of the data.
        guard let data = self.data else { return 0 }
        return UInt64(data.count)
    }

    var uid: String {
        // This can be anything that uniquely identifies your data:
        // the resource name from the original data, the UUID — you name it.
        return uniqueIdentifierForYourData
    }

    // [...]

    // MARK: DataProviding

    func readData(withSize size: UInt64, atOffset offset: UInt64) -> Data {
        guard let data = self.data else { return Data() }
        // We have to clamp the given size and offset to make sure we don't try
        // to read data that doesn't exist.
        let length = self.size
        let clampedOffset = min(offset, length)
        let clampedSize = min(size, length - clampedOffset)
        // Actually return the data.
        let range: Range = Int(clampedOffset)..<Int(clampedOffset+clampedSize)
        return data.subdata(in: range)
    }
}

@interface YourDataProvider : NSObject <PSPDFDataProviding> @end

@implementation YourDataProvider

#pragma mark - Properties

- (uint64_t)size {
    // Returns the size of the data.
    return self.data.length;
}

- (NSString *)UID {
    // This can be anything that uniquely identifies your data:
    // the resource name from the original data, the UUID — you name it.
    return uniqueIdentifierForYourData;
}

// [...]

#pragma mark - PSPDFDataProviding

- (NSData *)readDataWithSize:(uint64_t)size atOffset:(uint64_t)offset {
    // We have to clamp the given size and offset to make sure we don't try
    // to read data that doesn't exist.
    const NSUInteger length = self.size;
    NSUInteger clampedOffset = MIN((NSUInteger)offset, length);
    NSUInteger clampedSize =  MIN((NSUInteger)size, length - clampedOffset);
    // Actually return the data.
    return [self.data subdataWithRange:NSMakeRange(clampedOffset, clampedSize)];
}

@end
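
The clamping step is the part most likely to go wrong, so it can help to exercise it in isolation. The following is a minimal sketch that uses only Foundation. `clampedRead(from:size:offset:)` is a hypothetical free function, not part of PSPDFKit, that mirrors the body of `readData(withSize:atOffset:)` shown above.

```swift
import Foundation

/// Hypothetical helper (not PSPDFKit API): returns at most `size` bytes of
/// `data` starting at `offset`, never reading past the end.
func clampedRead(from data: Data, size: UInt64, offset: UInt64) -> Data {
    let length = UInt64(data.count)
    // An offset beyond the end is pulled back to the end, yielding an empty read.
    let clampedOffset = min(offset, length)
    // The size is limited to the bytes that remain after the offset.
    let clampedSize = min(size, length - clampedOffset)
    // Index relative to `startIndex` so the helper also works on `Data` slices.
    let start = data.startIndex + Int(clampedOffset)
    return data.subdata(in: start..<(start + Int(clampedSize)))
}

let sample = Data([0x25, 0x50, 0x44, 0x46, 0x2D]) // The bytes of "%PDF-".
let inRange = clampedRead(from: sample, size: 3, offset: 1)
let truncated = clampedRead(from: sample, size: 10, offset: 3)
let pastEnd = clampedRead(from: sample, size: 4, offset: 99)
print(inRange.count, truncated.count, pastEnd.count) // prints "3 2 0"
```

A request that runs past the end is shortened rather than rejected, and a request that starts past the end returns empty data.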

Write Support

Write support is a little more difficult to implement; you can’t simply offer a write method. The reason for this is that PSPDFKit actually has to be able to read a document while writing it, which means data can’t just be overwritten.

For this reason, we introduced the concept of DataSink. It supports the following options (DataSinkOptions):

  • [] (empty set of DataSinkOptions) — the incoming writes are from the beginning of the file (this is the default option)

  • .append — the incoming writes should be appended to the file
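
How a provider might interpret these two cases can be sketched without PSPDFKit. `SketchSinkOptions` and `merge(existing:written:options:)` below are hypothetical stand-ins that only mirror the option semantics described above; they aren't the framework's types.

```swift
import Foundation

/// Hypothetical mirror of the two cases described above (not PSPDFKit API).
struct SketchSinkOptions: OptionSet {
    let rawValue: UInt
    static let append = SketchSinkOptions(rawValue: 1 << 0)
}

/// Combines the bytes already stored with the bytes a sink collected.
func merge(existing: Data, written: Data, options: SketchSinkOptions) -> Data {
    if options.contains(.append) {
        // Append: the written bytes extend what's already stored.
        var combined = existing
        combined.append(written)
        return combined
    }
    // Empty option set: the written bytes are the complete new content.
    return written
}

let stored = Data("old".utf8)
let incoming = Data("new".utf8)
let overwritten = merge(existing: stored, written: incoming, options: [])
let appended = merge(existing: stored, written: incoming, options: .append)
print(String(decoding: overwritten, as: UTF8.self)) // prints "new"
print(String(decoding: appended, as: UTF8.self))    // prints "oldnew"
```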

To support writing, you first need to write a DataSink:

class YourDataSink: NSObject, DataSink {

    // MARK: Properties

    private(set) var isFinished = false
    let options: DataSinkOptions
    var writtenData = Data()

    // MARK: Lifecycle

    init(options: DataSinkOptions) {
        self.options = options
        super.init()
    }

    // MARK: DataSink

    func write(_ data: Data) -> Bool {
        // We append the passed-in data to our `writtenData`.
        writtenData.append(data)
        return true
    }

    func finish() -> Bool {
        // If you're implementing compression or encryption writing, you might need
        // to tell the compression or encryption library that you're finished
        // writing. You can do this here. For our purposes with the `Data`, we
        // don't need to do anything.
        isFinished = true
        return true
    }
}

@interface YourDataSink : NSObject <PSPDFDataSink>

@property (nonatomic, readonly) PSPDFDataSinkOptions options;
@property (nonatomic, readonly) NSMutableData *writtenData;

@end

@interface YourDataSink ()

@property (nonatomic) BOOL isFinished;

@end

@implementation YourDataSink

- (instancetype)initWithOptions:(PSPDFDataSinkOptions)options {
    if ((self = [super init])) {
        // We initialize `writtenData` with an empty mutable `NSMutableData`.
        _writtenData = [NSMutableData data];
        _options = options;
    }
    return self;
}

- (BOOL)writeData:(NSData *)data {
    // We append the passed-in data to our `writtenData`.
    [self.writtenData appendData:data];
    return YES;
}

- (BOOL)finish {
    // If you're implementing compression or encryption writing, you might need
    // to tell the compression or encryption library that you are finished
    // writing. You can do this here. For our purposes with the `NSData`, we
    // don't need to do anything.
    self.isFinished = YES;
    return YES;
}

@end

This is a basic DataSink implementation that collects the written bytes in an in-memory Data buffer. For the data provider to make use of it, you have to extend the provider just a little:

class YourDataProvider: NSObject, NSSecureCoding, DataProviding {

    // ... your previous implementation ...

    var additionalOperationsSupported: DataProvidingAdditionalOperations {
        // Signal to PSPDFKit that this data provider can support writing by
        // returning the following option:
        return .write
    }

    func createDataSink(options: DataSinkOptions) -> DataSink {
        // When PSPDFKit wants to write to the data provider, it calls this
        // method and passes in whether it wants to overwrite or append to the file.
        return YourDataSink(options: options)
    }

    func replace(with replacementDataSink: DataSink) -> Bool {
        // After PSPDFKit finishes writing, it passes in the data sink
        // that was previously created in `createDataSink(options:)`.
        guard let dataSink = replacementDataSink as? YourDataSink else { return false }

        // We have to check if we have to overwrite or append.
        if dataSink.options.contains(.append) {
            // We have to append the data and store the combined result.
            guard var replacementData = self.data else { return false }
            replacementData.append(dataSink.writtenData)
            self.data = replacementData
        } else {
            // We can simply replace our data.
            self.data = dataSink.writtenData
        }

        return true
    }
}

@implementation YourDataProvider

// ... your previous implementation ...

- (PSPDFDataProvidingAdditionalOperations)additionalOperationsSupported {
    // Signal to PSPDFKit that this data provider can support writing by
    // returning the following option:
    return PSPDFDataProvidingAdditionalOperationWrite;
}

- (id<PSPDFDataSink>)createDataSinkWithOptions:(PSPDFDataSinkOptions)options {
    // When PSPDFKit wants to write to the data provider, it calls this
    // method and passes in whether it wants to overwrite or append to the file.
    return [[YourDataSink alloc] initWithOptions:options];
}

- (BOOL)replaceWithDataSink:(id<PSPDFDataSink>)replacementDataSink {
    // After PSPDFKit finishes writing, it passes in the data sink
    // that was previously created in `-createDataSinkWithOptions:`.
    YourDataSink *dataSink = (YourDataSink *)replacementDataSink;
    // We have to check if we have to overwrite or append.
    if (dataSink.options & PSPDFDataSinkOptionAppend) {
        // We have to append the data.
        NSMutableData *replacementData = [self.data mutableCopy];
        [replacementData appendData:dataSink.writtenData];
        self.data = replacementData;
    } else {
        // We can simply replace our data.
        self.data = dataSink.writtenData;
    }
    return YES;
}

@end

Always remember that even while writing, the data provider must be able to fully read the document.
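
One way to meet this requirement is to stage writes in a separate buffer and swap it in only when the finished sink is handed back. The sketch below shows that pattern with plain Foundation types. `StagingSink`, `StagedStore`, `makeSink()`, and `commit(_:)` are hypothetical names chosen for illustration and don't correspond to PSPDFKit API.

```swift
import Foundation

/// Hypothetical staging buffer that collects writes away from the live data.
final class StagingSink {
    private(set) var staged = Data()
    private(set) var isFinished = false

    func write(_ chunk: Data) -> Bool {
        // Refuse writes after finishing so that a commit sees stable bytes.
        guard !isFinished else { return false }
        staged.append(chunk)
        return true
    }

    func finish() -> Bool {
        isFinished = true
        return true
    }
}

/// Hypothetical store whose contents stay readable while a sink fills up.
final class StagedStore {
    private(set) var contents: Data

    init(contents: Data) { self.contents = contents }

    func makeSink() -> StagingSink { StagingSink() }

    func commit(_ sink: StagingSink) -> Bool {
        // Only a finished sink replaces the live contents.
        guard sink.isFinished else { return false }
        contents = sink.staged
        return true
    }
}

let store = StagedStore(contents: Data("v1".utf8))
let sink = store.makeSink()
_ = sink.write(Data("v".utf8))
_ = sink.write(Data("2".utf8))
let readDuringWrite = store.contents  // Still the old bytes.
let earlyCommit = store.commit(sink)  // Rejected: the sink isn't finished yet.
_ = sink.finish()
let committed = store.commit(sink)
print(String(decoding: readDuringWrite, as: UTF8.self)) // prints "v1"
print(String(decoding: store.contents, as: UTF8.self))  // prints "v2"
```

Because nothing touches the live contents until the commit, a read issued in the middle of a write still sees a complete, consistent document.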