Working with PDFs in ASP.NET

ASP.NET has a long history, and its makeup has changed drastically over the years. Today, with the latest iteration, ASP.NET Core, we have a highly flexible framework for creating web applications, RESTful APIs, microservices, and more. To top it off, it’s now cross-platform. That’s right, Linux: you’ve got some official .NET love.

In this post, I want to introduce you to working with PDFs in ASP.NET using the PSPDFKit .NET Library. The post focuses on a simple mechanism for reading and writing PDF form fields in a web application, but there are many other features we could expose, such as document editing, redaction, and annotation manipulation.

Introducing the Use Case

In this use case, there are two goals we want to achieve.

First, the user should be able to upload a PDF and have the application extract all of its form fields, returning a list of the field names and their current values. Second, the user should be able to upload a PDF together with values for named form fields; the application applies those values, saves the resulting document, and offers it for download to the user’s file system.

These will show us how we can open a document sent from the user and perform analysis and manipulation on the document. It will also introduce the saving mechanism and the options we have when saving.

The code in this post is written in C#, but note that .NET supports other languages as well, such as F# and Visual Basic.

For the remainder of the post, I’ll walk through some of what is required to set up a project and what source code to add. Alternatively, the project setup and code described below are available to clone from our PSPDFKit-Labs repository. Doing so lets you jump straight to the Running the Application step.

Setting Up the Project

Lucky for us, ASP.NET is supported by the cross-platform dotnet command line application, which means it doesn’t matter which operating system we are working on. Win number one for ASP.NET.

You’ll find all the ways to download dotnet on the Microsoft .NET website.

In our use case, we’re going to create a web application. To do so, we use the following dotnet command in the terminal:

dotnet new WebApp -o MyWebApp --no-https

Next, from inside the newly created MyWebApp directory, we’ll add the PSPDFKit .NET Library as a NuGet dependency:

dotnet add package PSPDFKit.NET

Now we’re ready to jump into the code.

Initializing the PSPDFKit .NET Library

In order to initialize the PSPDFKit .NET Library, you’ll need to obtain a license key. If you’re already a customer, you can go to the Customer Portal to obtain your license key. If you are not yet a customer, please head over to the trial page to request a license key.

All you’ll need to do is call the initialization method with the license key, and your PSPDFKit instance will be validated and ready to use:

PSPDFKit.Sdk.Initialize("YOUR_LICENSE_KEY_GOES_HERE");

Replace YOUR_LICENSE_KEY_GOES_HERE with your license key.

I placed the above call in the ConfigureServices method of the Startup.cs file, but all that’s important is that it’s called once per instance and that it’s called before any other PSPDFKit API is called.
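The “once per instance, before anything else” requirement can be made explicit with a small guard. The following is only a sketch: `PdfSdkBootstrap` and `EnsureInitialized` are hypothetical names, not part of the PSPDFKit API, and the real initialization call is passed in as a delegate so the snippet compiles without the library.

```csharp
using System;

// Hypothetical helper (not part of PSPDFKit): runs an initialization
// delegate at most once, even if several threads call it concurrently.
public static class PdfSdkBootstrap
{
    private static readonly object Gate = new object();
    private static bool _initialized;

    // Returns true if this call performed the initialization,
    // false if it had already been done.
    public static bool EnsureInitialized(Action initialize)
    {
        lock (Gate)
        {
            if (_initialized) return false;

            initialize();
            _initialized = true;
            return true;
        }
    }
}
```

From ConfigureServices, this could be called as `PdfSdkBootstrap.EnsureInitialized(() => PSPDFKit.Sdk.Initialize("YOUR_LICENSE_KEY_GOES_HERE"));`. The lock keeps other callers waiting until initialization has finished, which is one way to guarantee that no other API call slips in first.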

Calling the PSPDFKit API

From here on out, we are going to be working on a new webpage. The new page is aptly named Read, as we are going to read the form field values. The page will have a Read.cshtml template which describes the HTML that will be passed to the browser, and a Read.cshtml.cs model C# file to handle the actions involved with the page.

As we’ve already set up the PSPDFKit .NET Library, we can now call the API to start operating on the PDF.

In the example I’ve laid out, I set up a small web form that can take in a file with a .pdf extension and save it to a temporary location. The file upload step is not important to this post, although if you’d like to see the code for it, please refer to the repository accompanying the blog post.
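For orientation, the save-to-a-temporary-location step might look roughly like the sketch below. This is not the code from the repository: `UploadStorage` and `SaveToTempPdf` are hypothetical names, and the helper works on a plain `Stream` so that it has no ASP.NET dependency.

```csharp
using System;
using System.IO;

// Hypothetical helper (not taken from the accompanying repository).
public static class UploadStorage
{
    // Copies an uploaded stream to a uniquely named .pdf file in the
    // temp directory and returns the path of the new file.
    public static string SaveToTempPdf(Stream upload, string originalFileName)
    {
        // Accept only files whose name ends in .pdf (case-insensitive).
        var extension = Path.GetExtension(originalFileName);
        if (!string.Equals(extension, ".pdf", StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("Only .pdf files are accepted.", nameof(originalFileName));
        }

        // Generate the file name ourselves so the user-supplied name
        // never becomes part of a file system path.
        var filePath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".pdf");
        using (var target = File.Create(filePath))
        {
            upload.CopyTo(target);
        }

        return filePath;
    }
}
```

In a Razor page, the stream and file name would typically come from the uploaded `IFormFile` (its `OpenReadStream()` method and `FileName` property). The returned path plays the role of `filePath` in the code that follows.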

Opening a PDF and Saving the Form Field Values

The following code will show all that is required to add to a Razor page to open a PDF and display the form fields and values found:

// Used to show the form field values on the page.
[BindProperty(SupportsGet = true)]
public IList<FormFieldValue> FormFieldValues { get; } = new List<FormFieldValue>();

// Used to pass the form field values as URL parameters.
[BindProperty(SupportsGet = true)]
public string FormFieldsJson { get; set; } = null;

// Inside the upload handler, once the uploaded file has been saved to filePath:
// open the PDF and retrieve the form field values.
var document = new PSPDFKit.Document(new FileDataProvider(filePath));
var fieldValuesJson = document.GetFormProvider().GetFormFieldValuesJson();

// Refresh the page with the form field data shown.
return RedirectToPage(new {FormFieldsJson = fieldValuesJson.ToString()});

Above, the uploaded document (filePath) is opened and queried for its form field values. As the method name GetFormFieldValuesJson suggests, the values come back as JSON, specifically as a JObject, which is very useful because we can pass it directly to the next step.

By calling RedirectToPage without a page name, we ask the browser to load the same page again. The route values we pass also set FormFieldsJson, so the JSON data travels as a query string parameter of the URL.
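To make the redirect more concrete, the sketch below builds the kind of URL it results in. ASP.NET does this encoding itself, and its exact output may differ in detail; `RedirectUrlSketch` and `BuildReadUrl` are hypothetical names used only for illustration, and the flat name-to-value JSON shape is an assumption.

```csharp
using System;

// Illustration only: ASP.NET generates the real redirect URL.
public static class RedirectUrlSketch
{
    // Percent-encodes the JSON and attaches it to the Read page
    // as the FormFieldsJson query string parameter.
    public static string BuildReadUrl(string fieldValuesJson)
    {
        return "/Read?FormFieldsJson=" + Uri.EscapeDataString(fieldValuesJson);
    }
}
```

For example, `BuildReadUrl("{\"Name\":\"Ada\"}")` yields `/Read?FormFieldsJson=%7B%22Name%22%3A%22Ada%22%7D`. Because the whole payload travels in the URL, this approach is best suited to forms with a modest number of fields.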

Parsing the Newly Loaded JSON Data

In the previous step, we saw the form field values passed as parameter values in a JSON format. On the reload of the page, we can extract these JSON values and display them on the page with the following:

public void OnGet()
{
    if (FormFieldsJson == null) return;

    var formFieldsJson = JObject.Parse(FormFieldsJson);
    foreach (var (key, value) in formFieldsJson)
    {
        FormFieldValues.Add(new FormFieldValue {Name = key, Value = value.ToString()});
    }
}

The OnGet method, found in Read.cshtml.cs, is called on every load of the page. The value bound to FormFieldsJson is whatever is passed in the URL parameters: when localhost:5000/Read is requested on its own, FormFieldsJson is null, whereas a request to localhost:5000/Read?FormFieldsJson=my_fields_are_here binds my_fields_are_here to FormFieldsJson.

Another way of binding variables to the URL call was seen in the previous code block. By calling RedirectToPage with route values of new {FormFieldsJson = fieldValuesJson.ToString()}, the contents of fieldValuesJson.ToString() will be assigned to FormFieldsJson upon the next load.

Now the OnGet method will pass the null test and proceed to parse the JSON into our FormFieldValues objects.
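The FormFieldValue class itself is not shown in this post, so here is a minimal sketch of what it and the parsing step could look like in isolation. The class shape is an assumption based on how it is used above, `FormFieldParser` is a hypothetical name, and the snippet uses System.Text.Json instead of the JObject used in the post purely so that it needs no extra package.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Assumed shape of the class used by the page model.
public class FormFieldValue
{
    public string Name { get; set; }
    public string Value { get; set; }
}

// Hypothetical helper mirroring the OnGet logic with System.Text.Json.
public static class FormFieldParser
{
    // Turns a flat JSON object of field names and values into a list.
    // Returns an empty list when no JSON was supplied.
    public static IList<FormFieldValue> Parse(string formFieldsJson)
    {
        var result = new List<FormFieldValue>();
        if (string.IsNullOrEmpty(formFieldsJson)) return result;

        using (var document = JsonDocument.Parse(formFieldsJson))
        {
            // Each top-level property is one form field.
            foreach (var property in document.RootElement.EnumerateObject())
            {
                result.Add(new FormFieldValue
                {
                    Name = property.Name,
                    Value = property.Value.ToString()
                });
            }
        }

        return result;
    }
}
```

Calling `FormFieldParser.Parse("{\"Name\":\"Ada\"}")` gives a single entry with Name set to `Name` and Value set to `Ada`, which is what the table on the page then renders.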

Displaying the Form Field Values on the Loaded Page

We have populated FormFieldValues, so all we need now is to generate the HTML to represent the form fields in Read.cshtml:

<table class="table">
    <thead>
        <tr>
            <th>
                @Html.DisplayNameFor(model => model.FormFieldValues[0].Name)
            </th>
            <th>
                @Html.DisplayNameFor(model => model.FormFieldValues[0].Value)
            </th>
        </tr>
    </thead>

    <tbody>
    @foreach (var item in Model.FormFieldValues)
    {
        <tr>
            <td>
                @Html.DisplayTextFor(modelItem => item.Name)
            </td>
            <td>
                @Html.DisplayTextFor(modelItem => item.Value)
            </td>
        </tr>
    }
    </tbody>
</table>

The slightly funky syntax, called Razor syntax, allows us to dynamically generate our HTML based on data bound in the Read.cshtml.cs model. The data here comes from the FormFieldValues property, which you saw declared with the BindProperty attribute in the first code block:

[BindProperty(SupportsGet = true)]
public IList<FormFieldValue> FormFieldValues { get; } = new List<FormFieldValue>();

We take the values populated in FormFieldValues and create a table whose headings are generated from the property names of the FormFieldValue class (Name and Value). Then we iterate over the list, adding one table row for each form field found in the document.

To find out more about Razor syntax and how to dynamically generate HTML, please follow the guides in the Microsoft documentation.

Running the Application

The instructions above have outlined areas of interest relating to PDF operations. If you’d like to see and run the full source, please clone the complete example from our PSPDFKit-Labs repository. Please do not forget to replace YOUR_LICENSE_KEY_GOES_HERE with the license key you obtained from your customer account or trial license.

From within the repository directory that you cloned, you can run the following command:

dotnet run

The command will launch the web server locally, with the web application up and running at localhost:5000. If your default browser doesn’t open the page automatically, just head to localhost:5000 in a browser on your machine.

Try to upload a document with a form field.

Conclusion

This post set out to introduce you to a simple ASP.NET web application and show how we can operate on PDFs with the PSPDFKit .NET Library. If you explore the ASP.NET application repository, you’ll find another page with an example of how to fill out form fields.

We could take the application further and implement more features like document redaction (where we can irrecoverably remove data from a document), or document editing (where we can add, remove, and move pages, as well as merge one or more documents together).

To find out more about form filling with the PSPDFKit .NET Library or the many other features supported by the library, please request a trial.
