Using RegEx to Redact PDFs on iOS

PSPDFKit enables you to redact text in a PDF document using regular expression patterns. This guide shows how to redact URLs in a document using NSRegularExpression.

First, you need to create the regular expression with the URL pattern:

let urlPattern = #"[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)"#
let urlRegularExpression = try! NSRegularExpression(pattern: urlPattern)
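
Before wiring the pattern into PSPDFKit, it can help to check what it matches using Foundation alone. The following is a minimal sketch: `containsURL(_:)` and the sample strings are illustrative assumptions, not part of PSPDFKit or the Catalog example.

```swift
import Foundation

let urlPattern = #"[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)"#
let urlRegularExpression = try! NSRegularExpression(pattern: urlPattern)

// Hypothetical helper: returns true if the pattern matches anywhere in `text`.
func containsURL(_ text: String) -> Bool {
    // NSRegularExpression works with NSRange (UTF-16 offsets),
    // so convert the Swift string range before matching.
    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    return urlRegularExpression.numberOfMatches(in: text, options: [], range: range) > 0
}

// Print the result for a few sample strings.
for sample in ["https://pspdfkit.com", "example.org/path?x=1", "hello"] {
    print(sample, containsURL(sample))
}
```

The pattern is deliberately permissive, so it is worth trying it against text from your own documents before relying on it for redaction.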

Then, you’ll have to loop through all the pages in the document to find and mark the matching URLs for redaction. To get the text in the document, you’ll need to use TextParser, which you can access via the Document’s textParserForPage(at:) method:

let fileURL = Bundle.main.url(forResource: "Document", withExtension: "pdf")!
let document = Document(url: fileURL)

for pageIndex in 0..<document.pageCount {
    if let textParser = document.textParserForPage(at: pageIndex) {
        textParser.words.forEach { word in
            let wordString = word.stringValue
            let range = NSRange(wordString.startIndex..<wordString.endIndex, in: wordString)

            // Redact all the words that match the regex.
            let isValidURL = urlRegularExpression.numberOfMatches(in: wordString, options: [], range: range) > 0
            if isValidURL {
                // Create a redaction annotation for each URL and add it to the document.
                let redactionRect = word.frame
                let redaction = RedactionAnnotation()
                redaction.boundingBox = redactionRect
                redaction.rects = [redactionRect]
                redaction.color = .orange
                redaction.fillColor = .black
                redaction.overlayText = "REDACTED"
                redaction.pageIndex = pageIndex
                document.add(annotations: [redaction])
            }
        }
    }
}
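
The matching step in the loop above can also be separated from PSPDFKit so that it is easy to test in isolation. The sketch below assumes a hypothetical `WordStub` type standing in for a parsed word (only its string and frame), a hypothetical `framesToRedact(in:matching:)` function, and a simplified domain pattern used purely for illustration.

```swift
import Foundation

// Hypothetical stand-in for a parsed word: its text and its frame on the page.
struct WordStub {
    let stringValue: String
    let frame: CGRect
}

// Returns the frames of all words that contain at least one regex match.
func framesToRedact(in words: [WordStub], matching regex: NSRegularExpression) -> [CGRect] {
    words.compactMap { word -> CGRect? in
        let text = word.stringValue
        let range = NSRange(text.startIndex..<text.endIndex, in: text)
        // Keep the frame only if the regex matches somewhere in the word.
        return regex.firstMatch(in: text, options: [], range: range) != nil ? word.frame : nil
    }
}

// Simplified pattern for illustration only; the guide's URL pattern works the same way.
let sampleRegex = try! NSRegularExpression(pattern: #"[a-z0-9-]+\.[a-z]{2,6}\b"#)
let sampleWords = [
    WordStub(stringValue: "Visit", frame: CGRect(x: 0, y: 0, width: 30, height: 12)),
    WordStub(stringValue: "pspdfkit.com", frame: CGRect(x: 40, y: 0, width: 80, height: 12)),
]
let frames = framesToRedact(in: sampleWords, matching: sampleRegex)
print(frames.count)
```

Each returned frame would then become the `boundingBox` and `rects` of one redaction annotation, as in the loop above.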

Finally, apply the redactions with the Processor API to write a new PDF file. Then instantiate a Document from that file and present it in a PDFViewController, like so:

// Use Processor to create the newly redacted document.
let processorConfiguration = Processor.Configuration(document: document)!

// Apply redactions to permanently redact URLs from the document.
processorConfiguration.applyRedactions()

// `redactedDocumentURL` must be a writable file URL that you define beforehand.
let processor = Processor(configuration: processorConfiguration, securityOptions: nil)
try? processor.write(toFileURL: redactedDocumentURL)

// Instantiate the redacted document and present it.
let redactedDocument = Document(url: redactedDocumentURL)
let pdfController = PDFViewController(document: redactedDocument)
present(UINavigationController(rootViewController: pdfController), animated: true)
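
The snippet above uses `redactedDocumentURL` without defining it. One possible way to build such a URL is sketched below, assuming the temporary directory is an acceptable destination; the helper name `makeRedactedDocumentURL()` and the file name prefix are hypothetical.

```swift
import Foundation

// Hypothetical helper: builds a unique, writable file URL in the temporary directory.
func makeRedactedDocumentURL() -> URL {
    FileManager.default.temporaryDirectory
        .appendingPathComponent("redacted-\(UUID().uuidString)")
        .appendingPathExtension("pdf")
}

let redactedDocumentURL = makeRedactedDocumentURL()
print(redactedDocumentURL.lastPathComponent)
```

Files in the temporary directory can be removed by the system, so a location such as the app's documents directory may be a better fit if the redacted file needs to persist.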

For more details about redacting a document using regular expressions, please refer to RedactTextUsingRegexExample.swift in the PSPDFKit Catalog app.