Extract Selected Text from PDFs Programmatically

PSPDFKit for Web comes with a reliable, cross-browser API to access text selection and retrieve both selected text and text lines located within a given selection range.

Each PSPDFKit instance can have only one selection at a time. The current text selection for a document has the shape of a PSPDFKit.TextSelection object.

Getting Text Selection Programmatically

Text selection for the current PDF document can be retrieved at any time using the Instance#getTextSelection method, which returns an instance of TextSelection:

const textSelection = instance.getTextSelection();
console.log(textSelection instanceof PSPDFKit.TextSelection);
// > true

Since TextSelection is an immutable object, its value won’t change over time, for example when the current text is deselected or some other text is selected. To get the updated text selection, invoke Instance#getTextSelection again.

Once a TextSelection exists, the selected text can be retrieved with getText. This method is asynchronous, so it returns a Promise:

const textSelection = instance.getTextSelection();
const text = await textSelection.getText();
console.log(text);
// Equivalent without async/await:
instance.getTextSelection().getText().then(function (text) {
  console.log(text);
});
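
The two points above (a TextSelection is a snapshot, and getText is asynchronous) can be combined into a small helper. This is only a sketch: getSelectedTextOrNull is a hypothetical name, not part of the PSPDFKit API, and it assumes that getTextSelection yields null when nothing is selected.

```javascript
// Hypothetical helper, not part of the PSPDFKit API.
// `instance` is assumed to be a loaded PSPDFKit for Web instance.
async function getSelectedTextOrNull(instance) {
  // Query the instance on every call: a stored TextSelection is an
  // immutable snapshot and does not follow later selection changes.
  const textSelection = instance.getTextSelection();

  // Assumption: with nothing selected there is no TextSelection to read.
  if (!textSelection) {
    return null;
  }

  // getText() returns a Promise that resolves to the selected text.
  return textSelection.getText();
}
```

Calling getSelectedTextOrNull(instance) each time the text is needed avoids holding on to a stale selection.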

TextSelection objects expose two other methods:

Other useful properties, such as the selection’s startNode and endNode, are available on TextSelection objects. Please refer to the API docs for further details.

Subscribing to Change Events

Text selection updates are dispatched every time text is selected or deselected. Subscribing to the textSelection.change event ensures a notification whenever the selection changes:

instance.addEventListener("textSelection.change", textSelection => {
  if (textSelection) {
    textSelection.getText().then(text => {
      console.log(text);
    });
  } else {
    console.log("no text is selected");
  }
});
// ES5 equivalent of the listener above:
instance.addEventListener("textSelection.change", function (textSelection) {
  if (textSelection) {
    textSelection.getText().then(function (text) {
      console.log(text);
    });
  } else {
    console.log("no text is selected");
  }
});

Notice that when the text is deselected, the textSelection argument is null.
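
Because getText is asynchronous, a result for an older selection can arrive after a newer change event. The following sketch shows one way to guard against that and to stop listening later. watchSelectedText is a hypothetical helper, not part of the PSPDFKit API, and it assumes that Instance#removeEventListener is available as the counterpart of addEventListener.

```javascript
// Hypothetical helper, not part of the PSPDFKit API.
// Calls onText(text) with the selected text, or onText(null) when the
// selection is cleared. Returns a function that removes the listener.
function watchSelectedText(instance, onText) {
  let latest = 0; // id of the most recent change event

  function handleChange(textSelection) {
    const id = ++latest;

    // A null argument means the text was deselected.
    if (!textSelection) {
      onText(null);
      return;
    }

    textSelection.getText().then((text) => {
      // Skip results that belong to an older selection.
      if (id === latest) {
        onText(text);
      }
    });
  }

  instance.addEventListener("textSelection.change", handleChange);

  return function stop() {
    latest++; // invalidates any getText() call still in flight
    // Assumption: removeEventListener mirrors addEventListener.
    instance.removeEventListener("textSelection.change", handleChange);
  };
}
```

A caller might keep the returned function, for example const stop = watchSelectedText(instance, console.log), and invoke stop() when the viewer is torn down.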