Convert MS Office to Image Files in Linux

Information

PSPDFKit Processor has been deprecated and replaced by PSPDFKit Document Engine. All PSPDFKit Processor licenses will work as before and be supported until 15 May 2024 (we will contact you about license migration). To start using Document Engine, refer to the migration guide. With Document Engine, you’ll have access to robust new capabilities (read the blog for more information).

To convert an Office document to an image, post a multipart request to the /build endpoint, including both the Office file as the input and the instructions JSON. In response, you’ll receive a ZIP archive containing all of the document’s pages as images.

Converting an Office document to an image requires you to provide dimensions for the resulting rendered pages via a width, height, or dpi option.

Only one of these options (width, height, or dpi) can be specified per request. The remaining dimensions are calculated before rendering so that the aspect ratio of the page is preserved in the rendered image.
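As a rough illustration of how the remaining dimensions follow from a single option, the sketch below computes pixel sizes for one page. It assumes the page size is known in PDF points (72 points per inch) and that results are rounded to the nearest pixel; both are assumptions made for this example, and Processor's exact rounding may differ. The function names scale_from_width and scale_from_dpi are hypothetical helpers, not part of Processor.

```shell
#!/bin/sh
# Hypothetical helpers: estimate rendered pixel dimensions for one page.
# Assumption: page size is given in PDF points (1 pt = 1/72 inch).
# Assumption: values are rounded to the nearest pixel.

# scale_from_width PAGE_W_PT PAGE_H_PT TARGET_WIDTH_PX
# Keeps the aspect ratio: height = width * (page height / page width).
scale_from_width() {
  awk -v w="$1" -v h="$2" -v tw="$3" \
    'BEGIN { printf "%dx%d\n", tw, int(tw * h / w + 0.5) }'
}

# scale_from_dpi PAGE_W_PT PAGE_H_PT DPI
# Converts points to inches (divide by 72), then inches to pixels.
scale_from_dpi() {
  awk -v w="$1" -v h="$2" -v dpi="$3" \
    'BEGIN { printf "%dx%d\n", int(w * dpi / 72 + 0.5), int(h * dpi / 72 + 0.5) }'
}

# US Letter page (612 x 792 points):
scale_from_width 612 792 1024   # prints 1024x1325
scale_from_dpi 612 792 500      # prints 4250x5500
```

Under these assumptions, a US Letter page rendered with "dpi": 500, as in the examples in this guide, comes out at 4250 by 5500 pixels, so a lower dpi or an explicit width may be preferable when file size matters.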

The format of the rendered images can be controlled via a format option. Supported image formats are PNG, JPEG, WEBP, and TIFF.
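To show how the format option combines with a single dimension option, here is a minimal sketch of a shell helper that assembles the instructions JSON. The name build_instructions is hypothetical, and the lowercase format value "png" is an assumption modeled on the "jpg" value used in this guide's examples; consult the API Reference for the exact accepted values.

```shell
#!/bin/sh
# Hypothetical helper: build the instructions JSON for an image output.
# Usage: build_instructions FORMAT DIMENSION VALUE
#   FORMAT    e.g. jpg (used in this guide) or png (assumed spelling)
#   DIMENSION exactly one of: width, height, dpi
#   VALUE     numeric value for that dimension
build_instructions() {
  format=$1
  key=$2
  value=$3
  # Reject anything other than the three dimension options named in the guide.
  case "$key" in
    width|height|dpi) ;;
    *) echo "dimension must be width, height, or dpi" >&2; return 1 ;;
  esac
  printf '{"parts":[{"file":"document"}],"output":{"type":"image","format":"%s","%s":%s}}\n' \
    "$format" "$key" "$value"
}

# Example: PNG pages rendered 1024 pixels wide.
build_instructions png width 1024
```

The output can then be passed to curl as the instructions part, for example with -F instructions="$(build_instructions png width 1024)".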

Before you get started, make sure Processor is up and running.

You’ll be sending multipart POST requests with instructions to Processor’s /build endpoint. To learn more about multipart requests, refer to our blog post on the topic, A Brief Tour of Multipart Requests.

Check out the API Reference to learn more about the /build endpoint and all the actions you can perform on PDFs with PSPDFKit Processor.

Converting an Office File on Disk to an Image

Send a multipart request to the /build endpoint, attaching an input file and the instructions JSON:

curl -X POST http://localhost:5000/build \
  -F document=@/path/to/example-document.docx \
  -F instructions='{
  "parts": [
    {
      "file": "document"
    }
  ],
  "output": {
    "type": "image",
    "format": "jpg",
    "dpi": 500
  }
}' \
  -o result.zip

The same request as raw HTTP:

POST /build HTTP/1.1
Content-Type: multipart/form-data; boundary=customboundary

--customboundary
Content-Disposition: form-data; name="document"; filename="example-document.docx"
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document

<DOCX data>
--customboundary
Content-Disposition: form-data; name="instructions"
Content-Type: application/json

{
  "parts": [
    {
      "file": "document"
    }
  ],
  "output": {
    "type": "image",
    "format": "jpg",
    "dpi": 500
  }
}
--customboundary--
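
Because the response is a ZIP archive, the rendered pages need to be unpacked before use. The following is a minimal sketch assuming the archive was saved as result.zip by the curl command above and that the unzip utility is installed. The helper name extract_pages and the pages output directory are hypothetical, and the names of the image files inside the archive are not specified in this guide.

```shell
#!/bin/sh
# Hypothetical helper: unpack the ZIP archive returned by /build.
# Usage: extract_pages ARCHIVE DEST_DIR
extract_pages() {
  archive=$1
  dest=$2
  # Fail early if the request never produced an output file.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # -o overwrites existing files, -q keeps output quiet, -d sets the target directory.
  mkdir -p "$dest" && unzip -o -q "$archive" -d "$dest"
}

# Example (run after the curl command has written result.zip):
#   extract_pages result.zip pages && ls pages
```

If the request failed, the saved file is unlikely to be a valid ZIP archive, in which case unzip exits with a non-zero status and the function reports failure.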

Converting an Office File from a URL to an Image

Send a multipart request to the /build endpoint containing only the instructions JSON. No file is attached; the Office document is referenced by URL in the file part:

curl -X POST http://localhost:5000/build \
  -F instructions='{
  "parts": [
    {
      "file": {
        "url": "https://pspdfkit.com/downloads/examples/paper.docx"
      }
    }
  ],
  "output": {
    "type": "image",
    "format": "jpg",
    "dpi": 500
  }
}' \
  -o result.zip

The same request as raw HTTP:

POST /build HTTP/1.1
Content-Type: multipart/form-data; boundary=customboundary

--customboundary
Content-Disposition: form-data; name="instructions"
Content-Type: application/json

{
  "parts": [
    {
      "file": {
        "url": "https://pspdfkit.com/downloads/examples/paper.docx"
      }
    }
  ],
  "output": {
    "type": "image",
    "format": "jpg",
    "dpi": 500
  }
}
--customboundary--