Metrics Integration

PSPDFKit Server sends metrics using the open DogStatsD protocol, which is an extension of the popular StatsD protocol. Any metric collection engine that understands this protocol can ingest metrics exported by Server, including Telegraf, AWS CloudWatch Agent, and DogStatsD itself.
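To make the wire format concrete, here is a minimal Python sketch of how a single DogStatsD datagram line is laid out: `<name>:<value>|<type>`, optionally followed by `|#tag:value,...` for tags. The metric name and tag names below are assumptions chosen for illustration; they are not taken from the Server metrics reference.

```python
from typing import Mapping, Optional


def format_dogstatsd(name: str, value: float, metric_type: str,
                     tags: Optional[Mapping[str, str]] = None) -> str:
    """Render one metric as a DogStatsD datagram line.

    Plain StatsD stops at "<name>:<value>|<type>"; the DogStatsD
    extension appends "|#key:value,..." to carry tags.
    """
    line = f"{name}:{value}|{metric_type}"
    if tags:
        rendered = ",".join(f"{key}:{val}" for key, val in tags.items())
        line += f"|#{rendered}"
    return line


# Hypothetical metric name and tags, used only to show the layout.
example = format_dogstatsd("http_server.req_end", 12, "ms",
                           {"method": "GET", "status": "200"})
print(example)
```

The part after `|#` is what a plain StatsD collector would not understand, which is why the collectors in this guide need their DogStatsD support enabled.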

To enable exporting metrics, set the STATSD_HOST and STATSD_PORT configuration variables to point to the hostname and port where the collection engine is running.
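Before wiring up a full collection engine, it can help to look at the raw datagrams. The following Python sketch is a throwaway UDP listener, not part of any PSPDFKit tooling; it binds a socket, reads datagrams, and splits them into metric lines. The demo at the bottom sends a made-up payload to itself over the loopback interface so that it runs without Server.

```python
import socket
from typing import List


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind a UDP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(5.0)  # avoid blocking forever if nothing arrives
    return sock


def receive_lines(sock: socket.socket, datagrams: int = 1) -> List[str]:
    """Read a fixed number of datagrams and split them into metric lines."""
    lines: List[str] = []
    for _ in range(datagrams):
        payload, _addr = sock.recvfrom(65535)
        lines.extend(payload.decode("utf-8").splitlines())
    return lines


def send_datagram(host: str, port: int, payload: str) -> None:
    """Send one datagram, the way a StatsD client would."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.sendto(payload.encode("utf-8"), (host, port))


# Loopback demo with a made-up payload (two metrics in one datagram).
listener = open_listener()
host, port = listener.getsockname()
send_datagram(host, port, "example.metric:1|c\nexample.timer:12|ms")
received = receive_lines(listener)
listener.close()
print(received)
```

To inspect real traffic, one option is to bind the listener to a fixed port such as 8125 on an address Server can reach, and then point STATSD_HOST and STATSD_PORT at it.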

This guide covers how to integrate Server metrics with the following metric collection systems:

  • Docker-Based Setup with Telegraf, InfluxDB, and Grafana
  • AWS CloudWatch
  • Google Cloud Monitoring

Docker-Based Setup with Telegraf, InfluxDB, and Grafana

This deployment scenario integrates Server with Telegraf, InfluxDB, and Grafana via docker-compose. This approach is a great fit when you want to maintain complete control over the entire environment, you can’t deploy in the cloud, or you just want to try things out. It can also be adapted to other deployment orchestration tools based on Docker containers, such as Kubernetes.

In this setup, Server sends metrics to Telegraf, which aggregates them over fixed-length time buckets. After metrics are aggregated, they’re sent to InfluxDB for storage. Finally, the operator can view and analyze metrics in the Grafana dashboard by writing queries for InfluxDB.
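The bucketing step can be pictured with a short Python sketch. This is a simplified illustration of fixed-period aggregation, not Telegraf's actual implementation: each data point is assigned to the bucket its timestamp falls into, and every bucket is reduced to a few summary values. The sample data and the choice of summary fields are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def aggregate(points: Iterable[Tuple[float, float]],
              interval_seconds: int) -> Dict[int, Dict[str, float]]:
    """Group (timestamp, value) points into fixed-length buckets."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp, value in points:
        # Round the timestamp down to the start of its bucket.
        bucket_start = int(timestamp // interval_seconds) * interval_seconds
        buckets[bucket_start].append(value)
    return {
        start: {"count": len(values), "mean": mean(values), "upper": max(values)}
        for start, values in sorted(buckets.items())
    }


# Made-up timings (seconds since start, milliseconds), 5-second buckets.
samples = [(0.5, 10.0), (3.2, 30.0), (5.1, 20.0), (9.9, 40.0), (10.0, 5.0)]
summary = aggregate(samples, interval_seconds=5)
for start, stats in summary.items():
    print(start, stats)
```

Only the per-bucket summaries are forwarded for storage in this model, which is why a shorter interval gives finer resolution at the cost of more stored points.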

Prerequisites

In order to complete this section, you’ll need to have Docker with docker-compose installed. You’ll also need a PSPDFKit Server activation key or a trial license key; please refer to the Product Activation guide.

Setting Up

To get started, clone the repository with the configuration files: https://github.com/PSPDFKit/pspdfkit-server-example-metrics. In the root of the repository, you’ll find the following docker-compose.yml file:

version: "3.8"

services:
  grafana:
    image: grafana/grafana:7.1.5
    ports:
      - 3000:3000
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=secret
    depends_on:
      - influxdb
    volumes:
      - ./grafana/provisioning/datasources:/etc/grafana/provisioning/datasources:ro
      - ./grafana/provisioning/dashboards:/etc/grafana/provisioning/dashboards:ro
      - ./grafana/dashboards:/var/lib/grafana/dashboards:ro
  influxdb:
    image: influxdb:1.8.2
  telegraf:
    image: telegraf:1.14.5-alpine
    depends_on:
      - influxdb
    volumes:
      - ./telegraf/telegraf.conf:/etc/telegraf/telegraf.conf:ro
  db:
    image: postgres:11.6
    environment:
      POSTGRES_USER: pspdfkit
      POSTGRES_PASSWORD: password
      POSTGRES_DB: pspdfkit
      PGDATA: /var/lib/postgresql/data/pgdata
    volumes:
      - pgdata:/var/lib/postgresql/data
  pspdfkit:
    image: pspdfkit/pspdfkit:latest
    environment:
      STATSD_HOST: telegraf
      STATSD_PORT: 8125

      ACTIVATION_KEY: <YOUR_ACTIVATION_KEY>
      DASHBOARD_USERNAME: dashboard
      DASHBOARD_PASSWORD: secret

      PGUSER: pspdfkit
      PGPASSWORD: password
      PGDATABASE: pspdfkit
      PGHOST: db
      PGPORT: 5432
      API_AUTH_TOKEN: secret
      SECRET_KEY_BASE: secret-key-base
      JWT_PUBLIC_KEY: |
        -----BEGIN PUBLIC KEY-----
        MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALd41vG5rMzG26hhVxE65kzWC+bYQ94t
        OxsSxIQZMOc1GY8ubuqu2iku5/5isaFfG44e+VAe+YIdVeQY7cUkaaUCAwEAAQ==
        -----END PUBLIC KEY-----
      JWT_ALGORITHM: RS256
    ports:
      - 5000:5000
    depends_on:
      - db
      - telegraf

volumes:
  pgdata:

This configuration file defines all services required to deploy Server and observe the metrics it reports. Remember to replace the <YOUR_ACTIVATION_KEY> placeholder with your actual PSPDFKit Server activation key. Note that Server is configured to send metrics to Telegraf: STATSD_HOST points to the telegraf service, and STATSD_PORT is set to port 8125.

In the telegraf/ subdirectory, you’ll find the configuration for the Telegraf agent, telegraf.conf:

[agent]
  interval = "5s"

[[inputs.statsd]]
  service_address = ":8125"

  metric_separator = "."
  datadog_extensions = true

  templates = [
      "*.* measurement.field",
      "*.*.* measurement.measurement.field",
  ]

[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]
  database = "pspdfkit"

Let’s go through the options defined here:

  • interval specifies how often Telegraf takes all the received data points and aggregates them into metrics.
  • inputs.statsd.service_address defines the address of the UDP listener. The port number here must match the STATSD_PORT configuration for Server.
  • inputs.statsd.metric_separator needs to be set to . when used with PSPDFKit Server.
  • inputs.statsd.datadog_extensions needs to be set to true so that metric tags sent by Server are parsed correctly.
  • inputs.statsd.templates defines how names of metrics sent by Server are mapped to Telegraf’s metric representation. This needs to be set to the exact value shown in the configuration file above.
  • outputs.influxdb.urls lists the URLs of the InfluxDB instance where data will be stored.
  • outputs.influxdb.database is the name of the database in InfluxDB where metrics will be saved. You’ll need to use the same name when querying data in Grafana.
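To give a feel for what the two templates do, the following Python sketch models the mapping in a deliberately simplified way: it picks the template whose filter has the same number of dot-separated parts as the metric name, then joins the parts marked measurement into the measurement name and uses the part marked field as the field name. This is an assumption-laden approximation, not Telegraf's real matching logic, and the three-part metric name is hypothetical.

```python
from typing import List, Optional, Tuple

TEMPLATES = [
    "*.* measurement.field",
    "*.*.* measurement.measurement.field",
]


def apply_template(metric_name: str,
                   templates: List[str] = TEMPLATES,
                   separator: str = ".") -> Optional[Tuple[str, str]]:
    """Return (measurement, field) for a metric name, or None if no
    template has a filter with the same number of parts (simplified)."""
    parts = metric_name.split(separator)
    for entry in templates:
        pattern, template = entry.split()
        if len(pattern.split(".")) != len(parts):
            continue
        roles = template.split(".")
        measurement = separator.join(
            part for part, role in zip(parts, roles) if role == "measurement")
        field = separator.join(
            part for part, role in zip(parts, roles) if role == "field")
        return measurement, field
    return None


print(apply_template("http_server.req_end"))    # two-part name
print(apply_template("component.subsystem.op"))  # hypothetical three-part name
```

In this model, the last part of the name always becomes the field, and everything before it forms the measurement, which is the name you would then query in InfluxDB.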

The above is just an example configuration file, and it’s by no means a complete configuration suitable for production deployments; please refer to the Telegraf documentation for all the available options.

In the grafana/ subdirectory, you’ll find definitions of data sources, along with an example dashboard. You can provide all this data through Grafana UI; this is just an example to get you up and running quickly. Refer to the Grafana documentation for more information.

Now, when you run docker-compose up from the directory where the docker-compose.yml file is located, all the components will be started.

Viewing Metrics

Open http://localhost:3000 in your browser and log in to Grafana with the admin/secret credentials. Then open the PSPDFKit Server dashboard. You’ll see a dashboard like the one in the image below, but most likely with different data points.

ℹ️ Note: If no data is shown in the dashboard, open the Server dashboard at <SERVER_URL>/dashboard, upload a couple of documents, and perform some operations on them.

The dashboard shows a few crucial Server metrics — you can see their definitions by selecting a panel and choosing Edit. For a complete reference of available metrics, see this guide.

AWS CloudWatch

If you’re deploying your services to AWS, chances are you’re already monitoring them using AWS CloudWatch. If you followed our AWS deployment guide and host Server on AWS ECS, system-level metrics like CPU and memory utilization are automatically collected for you. This section shows how you can integrate PSPDFKit Server with the CloudWatch Agent to export internal Server metrics to CloudWatch.

Prerequisites

This section assumes you followed the Server AWS deployment guide or have deployed PSPDFKit Server to AWS ECS backed by an EC2 instance yourself.

Setting Up

First, you’ll need to install the CloudWatch agent on a host that’s reachable by PSPDFKit Server. Please refer to the CloudWatch agent installation guide for specific instructions.

The next step is agent configuration:

{
  "metrics": {
    "namespace": "PSPDFKitServer",
    "metrics_collected": {
      "statsd": {
        "service_address": ":8125"
      }
    }
  }
}

This configuration specifies that the agent should start a StatsD-compatible listener on port 8125. In addition, all metrics collected by the agent will be placed under the PSPDFKitServer namespace. Save the configuration in the cwagent.json file on the host where you installed the agent and run:

/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -c file:cwagent.json -s

ℹ️ Note: The command above requires root privileges.
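Because the agent's listener port and Server's STATSD_PORT have to agree, a small check along the following lines can catch a mismatch early. This Python sketch parses the agent configuration shown earlier and extracts the StatsD listener port; the helper name and the idea of comparing it against an expected port are illustrative assumptions, not part of the CloudWatch agent tooling.

```python
import json

# Same content as the cwagent.json file described in this section.
AGENT_CONFIG = """
{
  "metrics": {
    "namespace": "PSPDFKitServer",
    "metrics_collected": {
      "statsd": {
        "service_address": ":8125"
      }
    }
  }
}
"""


def statsd_listener_port(config_text: str) -> int:
    """Extract the port from metrics.metrics_collected.statsd.service_address."""
    config = json.loads(config_text)
    address = config["metrics"]["metrics_collected"]["statsd"]["service_address"]
    # The address has the form "<host>:<port>", with the host optionally empty.
    return int(address.rsplit(":", 1)[1])


expected_statsd_port = 8125  # the value you plan to use for STATSD_PORT
listener_port = statsd_listener_port(AGENT_CONFIG)
print("ports match:", listener_port == expected_statsd_port)
```

In practice you would read the text from the cwagent.json file on the agent host instead of embedding it in the script.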

You can see that the agent is up by running the following:

systemctl status amazon-cloudwatch-agent.service

The only thing that’s left is to configure PSPDFKit Server to send metrics to the agent. You’ll need the IP address or the DNS entry for the host — this depends on where you deployed the agent. Make sure Server can reach the agent on port 8125 (e.g. if you deployed it to AWS EC2, configure the security group to allow incoming UDP traffic on port 8125).

Now head to the ECS task definitions page and modify the task definition for PSPDFKit Server. Click on the most recent revision, select Create new revision, and scroll down to edit the Server container. Set the STATSD_HOST environment variable to the CloudWatch agent’s IP address or DNS entry, and set STATSD_PORT to 8125.

Save the changes and create a new revision. Now update the ECS service that’s running the PSPDFKit Server task and change the task definition revision to the one you just created. Then wait until the task restarts.

Viewing Metrics

You can now go to the AWS CloudWatch console to view the metrics exported by Server. (Note that it takes a while until the agent sends the collected metrics upstream.) Use this link to go straight to the PSPDFKitServer namespace in the metrics browser: https://console.aws.amazon.com/cloudwatch/home#metricsV2:graph=~();namespace=~'PSPDFKitServer’.

As an example, to view the HTTP response time metric, search for http_server_req_end in the search box and tick all the checkboxes. You’ll see a graph of HTTP timing metrics grouped by the HTTP method and response status code.

The data points you’ll see will most likely be different than what’s shown in the screenshot above.

ℹ️ Note: If there’s no data shown in the dashboard, open the Server dashboard at <SERVER_URL>/dashboard, upload a couple of documents, and perform some operations on them.

You can use the CloudWatch console to browse the available metrics. For more advanced use, see the CloudWatch search expressions and metric math guides. Check out the complete list of metrics exported by PSPDFKit Server on the metrics reference page.

Google Cloud Monitoring

Google Cloud Monitoring (formerly known as Stackdriver) is a monitoring service provided by the Google Cloud Platform. It’s a great choice when you deploy your services on the Google Kubernetes Engine (GKE) stack, since it provides metrics for all the Kubernetes resources out of the box. To forward metrics from PSPDFKit Server to Google Cloud Monitoring, you’ll use Telegraf.

Prerequisites

To follow this section, make sure you’ve deployed PSPDFKit Server to GKE as outlined in our deployment guide.

Setting Up

To get started, first you need to expose the Telegraf agent configuration as a ConfigMap in the GKE cluster. Save the following ConfigMap definition in the telegraf-config.yml file:

apiVersion: v1
kind: ConfigMap
metadata:
  name: telegraf-config
data:
  telegraf.conf: |
    [agent]
      interval = "90s"
      flush_interval = "90s"


    [[inputs.statsd]]
      service_address = ":8125"

      metric_separator = "."
      datadog_extensions = true

      templates = [
          "*.* measurement.field",
          "*.*.* measurement.measurement.field",
      ]

    [[outputs.stackdriver]]
      project = "<GCP PROJECT ID>"
      namespace = "pspdfkit"

This ConfigMap embeds the Telegraf configuration file directly. Let’s go through the options defined here:

  • interval and flush_interval specify how often the metrics are aggregated and how often they’re pushed to Google Cloud. Make sure flush_interval isn’t less than the interval, and that interval is set to at least 60 seconds. This is because the Google Cloud Monitoring API allows you to create, at most, one data point per minute.
  • inputs.statsd.service_address defines the address of the UDP listener. Use the port number defined here when creating a Service for the Telegraf agents.
  • inputs.statsd.metric_separator needs to be set to . when used with PSPDFKit Server.
  • inputs.statsd.datadog_extensions needs to be set to true so that metric tags sent by Server are parsed correctly.
  • inputs.statsd.templates defines how names of metrics sent by Server are mapped to Telegraf’s metric representation. This needs to be set to the exact value shown in the configuration file above.
  • outputs.stackdriver configures the Google Cloud Monitoring output plugin. Make sure to replace the <GCP PROJECT ID> placeholder with your actual GCP project ID.
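The two timing constraints described above (interval of at least 60 seconds, and flush_interval not shorter than interval) are easy to get wrong when tuning the configuration. The following Python sketch checks a pair of values before you apply the ConfigMap. It is a hypothetical helper, and its duration parser is an assumption that only handles simple values such as "90s" or "2m", not every duration format Telegraf accepts.

```python
import re
from typing import List

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a simple duration such as "90s" or "2m" to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def check_intervals(interval: str, flush_interval: str,
                    minimum_seconds: int = 60) -> List[str]:
    """Return a list of problems; an empty list means both rules hold."""
    problems: List[str] = []
    interval_s = parse_duration(interval)
    flush_s = parse_duration(flush_interval)
    if interval_s < minimum_seconds:
        problems.append(f"interval must be at least {minimum_seconds}s")
    if flush_s < interval_s:
        problems.append("flush_interval must not be less than interval")
    return problems


print(check_intervals("90s", "90s"))  # values from the ConfigMap above
print(check_intervals("5s", "5s"))    # values from the Docker-based setup
```

The second call shows why the 5-second interval from the Docker-based setup can't be reused here unchanged.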

The above is just an example configuration file, and it’s by no means a complete configuration suitable for production deployments; please refer to the Telegraf documentation for all the available options.

You can create the ConfigMap in the cluster by running:

kubectl apply -f telegraf-config.yml

Now that the configuration is available, deploy Telegraf itself:

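# DaemonSet is served from apps/v1; extensions/v1beta1 was removed in Kubernetes 1.16.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: telegraf
spec:
  # apps/v1 requires a selector that matches the pod template labels.
  selector:
    matchLabels:
      app: telegraf
  template: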
    metadata:
      labels:
        app: telegraf
    spec:
      containers:
        - name: telegraf
          image: telegraf
          volumeMounts:
            - name: telegraf-config
              mountPath: /etc/telegraf/
              readOnly: true
      volumes:
        - name: telegraf-config
          configMap:
            name: telegraf-config
            items:
              - key: telegraf.conf
                path: telegraf.conf
---
apiVersion: v1
kind: Service
metadata:
  name: telegraf
spec:
  clusterIP: None
  selector:
    app: telegraf
  ports:
    - protocol: UDP
      port: 8125
      targetPort: 8125

This YAML file defines the Telegraf DaemonSet, which deploys a single Telegraf agent to every node in the GKE cluster, along with a headless Service that allows Server to reach Telegraf using a domain name. Notice that you’re mounting the previously defined Telegraf configuration as a file in the Telegraf container. Save that definition in telegraf.yml and run:

kubectl apply -f telegraf.yml

The last thing you need to do is configure PSPDFKit Server to send metrics to Telegraf. Modify the Server resource definition file by adding these two environment variables:

env:
  ...
  - name: STATSD_HOST
    value: telegraf
  - name: STATSD_PORT
    value: "8125"

Make sure to recreate the Server deployment by running kubectl apply -f on the file where it’s defined.

Viewing Metrics

To view metrics, head over to Monitoring in the Google Cloud console. If you follow that link, you’ll land on the Metrics Explorer page, where you can search for all the PSPDFKit Server metrics. Note that the PSPDFKit Server metrics are available under the Global resource.

For example, search for pspdfkit/http_server/req_end_mean to view the average HTTP response time of PSPDFKit Server. Apply a filter to only view metrics with a standard group, and group them by the HTTP method.

The data points you’ll see will most likely be different than what’s shown in the screenshot above.

ℹ️ Note: If there’s no data shown in the dashboard, open the Server dashboard at <SERVER_URL>/dashboard, upload a couple of documents, and perform some operations on them.

You can now add the chart to the dashboard, or continue exploring available metrics. Learn more about how to build an advanced monitoring solution on top of Google Cloud Monitoring by following the relevant guides.