Extract Images from PDFs in C#
This guide explains how to extract images from PDF documents using C#. Images can be added to a PDF document in the following ways:
- Embedded in the internal structure of the PDF document.
- Added to the PDF document as an image annotation.
GdPicture.NET currently enables you to extract images embedded in a PDF document. Extracting images from image annotations isn’t supported.
To extract images embedded in a PDF document, follow these steps:
- Create a GdPicturePDF object and a GdPictureImaging object.
- Select the source document by passing its path to the LoadFromFile method of the GdPicturePDF object.
- Determine the number of pages with the GetPageCount method of the GdPicturePDF object and loop through them, selecting each page with the SelectPage method.
- Determine the number of images on the selected page with the GetPageImageCount method of the GdPicturePDF object and loop through them.
- Extract the image by passing the index of the image to the ExtractPageImage method of the GdPicturePDF object. The method returns an image ID.
- Save the output in a new image file by passing the image ID and the output path to the SaveAsPNG method of the GdPictureImaging object.
- Release each extracted image with the ReleaseGdPictureImage method of the GdPictureImaging object once it has been saved.
The example below extracts all embedded images from a PDF document, first in C# and then in VB.NET:
// Assumption: GdPicture.NET 14, whose types live in the GdPicture14 namespace.
// Adjust the namespace to match your installed version.
using GdPicture14;

using GdPicturePDF gdpicturePDF = new GdPicturePDF();
using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Select the source document.
gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
// Determine the number of pages and loop through them.
int pageCount = gdpicturePDF.GetPageCount();
for (int page = 1; page <= pageCount; page++)
{
    gdpicturePDF.SelectPage(page);
    // Determine the number of images on the page and loop through them.
    int imageCount = gdpicturePDF.GetPageImageCount();
    for (int imageIndex = 0; imageIndex < imageCount; imageIndex++)
    {
        // Extract the image.
        int imageId = gdpicturePDF.ExtractPageImage(imageIndex);
        // Save the output in a new image file.
        gdpictureImaging.SaveAsPNG(imageId, @"C:\temp\page-" + page + "-image-" + imageIndex + ".png");
        // Release unnecessary resources.
        gdpictureImaging.ReleaseGdPictureImage(imageId);
    }
}
Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
    ' Select the source document.
    gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
    ' Determine the number of pages and loop through them.
    Dim pageCount As Integer = gdpicturePDF.GetPageCount()
    For page = 1 To pageCount
        gdpicturePDF.SelectPage(page)
        ' Determine the number of images on the page and loop through them.
        Dim imageCount As Integer = gdpicturePDF.GetPageImageCount()
        For imageIndex = 0 To imageCount - 1
            ' Extract the image.
            Dim imageId As Integer = gdpicturePDF.ExtractPageImage(imageIndex)
            ' Save the output in a new image file.
            gdpictureImaging.SaveAsPNG(imageId, "C:\temp\page-" & page & "-image-" & imageIndex & ".png")
            ' Release unnecessary resources.
            gdpictureImaging.ReleaseGdPictureImage(imageId)
        Next
    Next
End Using
End Using
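The examples above build each output path by string concatenation and assume that the C:\temp folder already exists. As a minimal sketch, the hypothetical helper below creates the output folder if needed and builds the same file names with Path.Combine. BuildImagePath and the extracted-images folder name are illustrative assumptions, not part of GdPicture.NET, and the code uses only the .NET base class library.

```csharp
using System;
using System.IO;

// Example call: build the path for the first image on page 1
// inside a folder under the system temp directory.
string examplePath = BuildImagePath(
    Path.Combine(Path.GetTempPath(), "extracted-images"), 1, 0);
Console.WriteLine(examplePath);

// Hypothetical helper, not part of GdPicture.NET: builds the output path for
// one extracted image and makes sure the target folder exists.
static string BuildImagePath(string outputDirectory, int page, int imageIndex)
{
    // CreateDirectory does nothing if the folder already exists.
    Directory.CreateDirectory(outputDirectory);

    // Same naming scheme as the examples above: page-<page>-image-<index>.png.
    return Path.Combine(outputDirectory, $"page-{page}-image-{imageIndex}.png");
}
```

In the C# example, the second argument of the SaveAsPNG call could then be replaced with BuildImagePath(@"C:\temp", page, imageIndex).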