Convert PDF to PDF/A in C#

To preserve electronic documents over the long term, convert them to PDF/A, the PDF variant intended for archiving. PSPDFKit GdPicture.NET supports conversion to all PDF/A versions and conformance levels:

  • PDF/A-1a, PDF/A-1b

  • PDF/A-2a, PDF/A-2u, PDF/A-2b

  • PDF/A-3a, PDF/A-3u, PDF/A-3b

  • PDF/A-4, PDF/A-4e, PDF/A-4f

For more information on the long-term preservation of documents, see our complete guide to PDF/A.

Converting PDF to PDF/A

To convert a PDF to a PDF/A document, follow these steps:

  1. Create a GdPicturePDF object.

  2. Load the source document by passing its path to the LoadFromFile method.

  3. Convert the source document to PDF/A by calling the ConvertToPDFA method. This method takes the following parameters:

    • The path to the output document.

    • A member of the PdfConversionConformance enumeration that specifies the PDF/A conformance level of the output document.

    • A Boolean value that specifies whether to convert page elements to vector-based graphics when direct conversion isn’t possible. Recommended: Set this value to true.

    • A Boolean value that specifies whether to convert pages to raster images when neither direct conversion nor vectorization is possible. Recommended: Set this value to true. For more information, see Configuring PDF/A Conversion.

  4. Release unnecessary resources by calling the CloseDocument method.

C#

using GdPicturePDF gdpicturePDF = new GdPicturePDF();
// Load the source document.
gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
// Convert to a document with PDF/A-2a conformance level.
gdpicturePDF.ConvertToPDFA(@"C:\temp\output.pdf", PdfConversionConformance.PDF_A_2a, true, true);
// Release unnecessary resources.
gdpicturePDF.CloseDocument();

VB.NET

Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
    ' Load the source document.
    gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
    ' Convert to a document with PDF/A-2a conformance level.
    gdpicturePDF.ConvertToPDFA("C:\temp\output.pdf", PdfConversionConformance.PDF_A_2a, True, True)
    ' Release unnecessary resources.
    gdpicturePDF.CloseDocument()
End Using
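
The samples above hard-code PdfConversionConformance.PDF_A_2a. If the conformance level comes from a configuration file or user input as a label such as "PDF/A-3b", a small helper can turn that label into an identifier-style name. The following is a minimal sketch, not part of GdPicture.NET: the ConformanceNames class and its ToEnumMemberName method are hypothetical, and the sketch assumes that the other enumeration members follow the same naming pattern as PDF_A_2a, which you should verify against the API reference.

```csharp
using System;

public static class ConformanceNames
{
    // Hypothetical helper: turns a label such as "PDF/A-2a" into an
    // identifier-style name such as "PDF_A_2a".
    // Assumption: enumeration members are named like PDF_A_2a.
    public static string ToEnumMemberName(string label)
    {
        if (string.IsNullOrWhiteSpace(label))
        {
            throw new ArgumentException("Label must not be empty.", nameof(label));
        }

        // Drop surrounding whitespace, then replace the separators
        // that aren't valid in a C# identifier.
        return label.Trim().Replace('/', '_').Replace('-', '_');
    }
}
```

The resulting name could then be passed to Enum.TryParse<PdfConversionConformance>, which returns false for names that don't exist in the enumeration, so an unsupported label is rejected before any conversion starts.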

Configuring PDF/A Conversion

PDF/A documents are intended for long-term preservation, and their structure differs from that of regular PDF documents. To ensure compliance with your chosen conformance level, the conversion process may change the document’s content or appearance. For example, GdPicture.NET might add, edit, or remove document structure elements, or embed fonts.

In some cases, direct conversion isn’t possible. GdPicture.NET then uses other techniques, such as vectorization and rasterization:

  • Vectorization means that if some document elements cannot be used directly in the PDF/A output, they’re embedded in the output document as vector-based graphic elements. This technique is typically used for fonts and paths.

  • Rasterization means that if some document content cannot be used directly in the PDF/A output, it’s embedded in the output document in the form of raster images.

Both approaches result in the loss of fonts and text information because the text is converted into shapes or raster images. Text information can later be recovered using optical character recognition.

To control whether GdPicture.NET falls back to vectorization and rasterization when direct conversion isn’t possible, use the third and fourth parameters of the ConvertToPDFA method, as explained above.
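
Because both fallback techniques lose text information, one possible strategy is to try the least lossy combination of flags first and escalate only when a conversion attempt fails. The following is a minimal sketch of that idea, not a GdPicture.NET feature: the PdfAFallback class, the ConvertWithLeastLoss method, and the tryConvert callback are hypothetical names, and the callback stands in for whatever code performs one conversion attempt with the given flags.

```csharp
using System;

public static class PdfAFallback
{
    // Flag combinations ordered from least to most lossy:
    // direct conversion only, then vectorization, then rasterization as well.
    private static readonly (bool Vectorize, bool Rasterize)[] Attempts =
    {
        (false, false),
        (true, false),
        (true, true),
    };

    // tryConvert is a hypothetical callback. It receives the two flags,
    // runs one conversion attempt, and returns true on success.
    // Returns the first combination that succeeded, or null if all failed.
    public static (bool Vectorize, bool Rasterize)? ConvertWithLeastLoss(
        Func<bool, bool, bool> tryConvert)
    {
        if (tryConvert == null)
        {
            throw new ArgumentNullException(nameof(tryConvert));
        }

        foreach (var attempt in Attempts)
        {
            if (tryConvert(attempt.Vectorize, attempt.Rasterize))
            {
                return attempt;
            }
        }

        return null;
    }
}
```

With GdPicture.NET, the callback would typically load the source file into a fresh GdPicturePDF object, pass the two flags as the third and fourth arguments of ConvertToPDFA, and return whether the call yields GdPictureStatus.OK. Reloading the source for each attempt is a cautious assumption here, since this guide doesn't describe the state of a document after a failed conversion. The returned pair tells you whether the output may have lost text information and is therefore a candidate for optical character recognition.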