Extract Data from PDF Form Fields in C# .NET

To extract data from form fields in a PDF document, use the following steps:

  1. Create a GdPicturePDF object.

  2. Load the PDF file with the LoadFromFile method.

  3. Get the total number of form fields in the PDF document with the GetFormFieldsCount method.

  4. Loop through all the form fields and get the ID of each one with the GetFormFieldId method.

  5. Call the getter method for the data you need, such as GetFormFieldType or GetFormFieldPage, and save the result to a variable. For more information, refer to the read form fields guide.

The following code example, shown first in C# and then in VB.NET, gets the field ID, type, and page location of all form fields and saves them to a CSV file:

using System.IO;
using System.Text;
using GdPicture14;

using GdPicturePDF gdpicturePDF = new GdPicturePDF();

// Create a `StringBuilder` variable to store data.
StringBuilder data = new StringBuilder();
// Add headers to the first line.
String[] headers = {"Field ID", "Field Type", "Field Location Page"};
data.AppendLine(string.Join(";", headers));

gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
// Get the form field count.
int fieldCount = gdpicturePDF.GetFormFieldsCount();
// Loop through all form fields.
for (int i = 0; i < fieldCount; i++)
{
    // Get the field ID, type, and page location of each form field.
    int fieldID = gdpicturePDF.GetFormFieldId(i);
    PdfFormFieldType fieldType = gdpicturePDF.GetFormFieldType(fieldID);
    int location = gdpicturePDF.GetFormFieldPage(fieldID);
    // Add a new line to the `StringBuilder` with the form field data.
    String[] newLine = { fieldID.ToString(), fieldType.ToString(), location.ToString() };
    data.AppendLine(string.Join(";", newLine));
}
// Save the collected data to a CSV file.
String formData = @"C:\temp\output.csv";
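// Overwrite any existing file so that repeated runs don't duplicate the header row.
File.WriteAllText(formData, data.ToString());

Imports System.IO
Imports System.Text
Imports GdPicture14

Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()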
    ' Create a `StringBuilder` variable to store data.
    Dim data As StringBuilder = New StringBuilder()
    ' Add headers to the first line.
    Dim headers = {"Field ID", "Field Type", "Field Location Page"}
    data.AppendLine(String.Join(";", headers))

    gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
    ' Get the form field count.
    Dim fieldCount As Integer = gdpicturePDF.GetFormFieldsCount()
    ' Loop through all form fields.
    For i = 0 To fieldCount - 1
        ' Get the field ID, type, and page location of each form field.
        Dim fieldID As Integer = gdpicturePDF.GetFormFieldId(i)
        Dim fieldType As PdfFormFieldType = gdpicturePDF.GetFormFieldType(fieldID)
        Dim location As Integer = gdpicturePDF.GetFormFieldPage(fieldID)
        ' Add a new line to the `StringBuilder` with the form field data.
        Dim newLine As String() = {fieldID.ToString(), fieldType.ToString(), location.ToString()}
        data.AppendLine(String.Join(";", newLine))
    Next
    ' Save the collected data to a CSV file.
    Dim formData = "C:\temp\output.csv"
    ' Overwrite any existing file so that repeated runs don't duplicate the header row.
    File.WriteAllText(formData, data.ToString())
End Using
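
The example above joins the columns with a semicolon and no quoting. That is safe for numeric IDs, type names, and page numbers, but if you extend the export to free-text data, a value containing a semicolon, a double quote, or a line break would corrupt the row. The following is a minimal sketch of a quoting helper, assuming RFC 4180-style quoting with a semicolon delimiter. `CsvRow`, `Escape`, and `Join` are hypothetical names used for illustration and are not part of the GdPicture API.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the GdPicture API.
public static class CsvRow
{
    // Characters that force a field to be quoted when ';' is the delimiter.
    private static readonly char[] Special = { ';', '"', '\r', '\n' };

    // Wraps the field in double quotes and doubles any embedded quotes
    // when it contains the delimiter, a quote, or a line break.
    public static string Escape(string field)
    {
        if (string.IsNullOrEmpty(field)) return string.Empty;
        if (field.IndexOfAny(Special) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Joins already-stringified columns into one semicolon-delimited row.
    public static string Join(params string[] fields) =>
        string.Join(";", fields.Select(Escape));
}
```

With this helper, the line that builds each row in the C# loop could become `data.AppendLine(CsvRow.Join(fieldID.ToString(), fieldType.ToString(), location.ToString()));`, and the output stays unchanged for values that need no quoting.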