Extract Images from PDFs in iOS

This guide shows how to programmatically extract bitmap images from a PDF page.

Embedded images in a PDF page are represented by the ImageInfo class and can be retrieved via the images property on TextParser. ImageInfo also provides an assortment of image metadata properties, as well as methods to extract an image from a PDF as a UIImage. To obtain the text parser for a given page, use the Document.textParserForPage(at:) API.

The code below retrieves all bitmap images from the first page of the given PDF document and makes them available for further processing as UIImage instances:

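// This snippet needs UIKit and the PDF SDK module that provides Document and TextParser.
// Place it inside a function or method, because the guard statement below returns early.

// Update to use your document name and location.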
let fileURL = Bundle.main.url(forResource: "Document", withExtension: "pdf")!
let document = Document(url: fileURL)

guard let parser = document.textParserForPage(at: 0) else {
    print("Parsing failed.")
    return
}

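// The images property holds one ImageInfo per embedded image on the page.
let imageInfos = parser.images
let images: [UIImage] = imageInfos.compactMap { imageInfo in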
    do {
        // Some PDF images are in the CMYK color space, which isn't a supported encoding.
        // Using this call converts all images to the RGB color space.
        return try imageInfo.imageInRGBColorSpace()
    } catch {
        print("Image processing failed. Error: \(error.localizedDescription)")
        return nil
    }
}

// Do something with the images...
print("Found \(images.count) images.")
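
As one possible follow-up step, the extracted UIImage instances can be written to disk as PNG files. The sketch below is a minimal example that uses only UIKit and Foundation. The savePNGs helper and the "extracted-image-N.png" file naming scheme are hypothetical names introduced for illustration and aren't part of the PDF SDK:

```swift
import UIKit

/// Writes each image as a PNG file into `directory` and returns the URLs of the files written.
/// Note: `savePNGs` and the file naming scheme are hypothetical, for illustration only.
func savePNGs(_ images: [UIImage], to directory: URL) throws -> [URL] {
    // Create the target directory if it doesn't exist yet.
    try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)

    var savedURLs: [URL] = []
    for (index, image) in images.enumerated() {
        // pngData() returns nil when the image has no encodable bitmap data; skip those images.
        guard let data = image.pngData() else { continue }
        let fileURL = directory.appendingPathComponent("extracted-image-\(index).png")
        try data.write(to: fileURL, options: .atomic)
        savedURLs.append(fileURL)
    }
    return savedURLs
}
```

A call might look like try savePNGs(images, to: FileManager.default.temporaryDirectory.appendingPathComponent("ExtractedImages")). The helper skips images that can't be encoded instead of throwing, so one problematic image doesn't abort the whole export. Whether that is the right behavior depends on your use case.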

The TextParser API also offers access to the page text. To learn more about it, refer to the Parsing and Text Extraction guides.