Consuming MicroProfile Metrics with Prometheus


In a distributed microservices architecture, it is important to have an overview of your systems in terms of CPU, memory usage and other important metrics.

This is called observability: measuring the internal state of a system, in this case the microservice instances.

 

 

These metrics are gathered centrally, so that you instantly have an overview of all the values and the health of the instances can be determined (mostly automatically, based on a set of rules).

 

MicroProfile Metrics allows you to expose your own custom-defined metrics, but it also exposes system metrics related to CPU and memory, for example.

 

MicroProfile Metrics

The goal of MicroProfile Metrics is to expose monitoring data from the implementation in a unified way. It also defines a Java API so that developers can define and supply their own values.

 

Exposing a value is very easy. The only thing that needs to be done is to annotate a field or method (returning a primitive value) with @Gauge:

 

  @Gauge(unit = MetricUnits.NONE, name = "inventorySize", description = "Number of items")
  public int getTotal() {
    return items.size();
  }

 

MicroProfile Metrics also has other annotations to capture, for example, timing information on JAX-RS endpoints (a short sketch follows the list):

  • @Metered: A meter measures the rate at which the endpoint is called.
  • @Timed: It is a timer that tracks the duration of the request.
  • @Counted: A counter is a simple incrementing value.
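
As an illustration, a minimal and purely hypothetical JAX-RS resource combining these annotations could look like the snippet below; the class name, path and metric names are made up for this example, and the monotonic attribute only exists in MicroProfile Metrics 1.x:

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import org.eclipse.microprofile.metrics.annotation.Counted;
import org.eclipse.microprofile.metrics.annotation.Metered;
import org.eclipse.microprofile.metrics.annotation.Timed;

@Path("/items")
public class ItemResource {

  // Measures the rate at which this endpoint is called.
  @Metered(name = "itemsMeter")
  // Tracks how long each request to this endpoint takes.
  @Timed(name = "itemsTimer")
  // A simple incrementing count of the invocations (monotonic is MP Metrics 1.x only).
  @Counted(name = "itemsCounter", monotonic = true)
  @GET
  public String listItems() {
    return "[]";
  }
}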

By default, it also exposes some system values related to:

  • CPU usage
  • Heap memory
  • Garbage collection
  • Java Threads
  • etc …

 

MicroProfile Fault Tolerance

MicroProfile Fault Tolerance has a lot of nice features to improve the fault tolerance and resilience of applications. The concepts included are (a sketch combining some of them follows the list):

  • Timeout: Define a maximum duration for an execution before it times out.
  • RetryPolicy: Define criteria on when to retry.
  • Fallback: Provide an alternative solution for a failed execution.
  • Bulkhead: Isolate failures in part of the system so that the rest of the system can still function.
  • CircuitBreaker: Offer a way to fail fast by automatically failing execution, preventing overload of the system and indefinite waits or timeouts by the clients.
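
A minimal sketch of how some of these annotations are combined on a CDI bean; the class and method names here are invented for this example:

import javax.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.faulttolerance.Fallback;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Timeout;

@ApplicationScoped
public class RemoteInventoryClient {

  // Fail the call when it takes longer than 500 ms.
  @Timeout(500)
  // Retry up to 3 times before giving up.
  @Retry(maxRetries = 3)
  // When all retries fail, use the result of the fallback method instead.
  @Fallback(fallbackMethod = "inventoryFallback")
  public int remoteInventorySize() {
    // A real implementation would call the remote service here.
    throw new RuntimeException("Remote service not reachable");
  }

  public int inventoryFallback() {
    return 0;
  }
}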

Since version 1.1 (contained in MicroProfile 1.4 and 2.0), it also has an integration with MicroProfile Metrics.

 

Interesting statistics, like the number of retries, the number of times the circuit breaker was open, the number of calls to the fallback method, etc., are gathered and exposed.
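
For the hypothetical RemoteInventoryClient sketched above, guarded by @Retry and @Fallback, you can expect metric names roughly along these lines; the exact names are defined by the Fault Tolerance specification version you are using:

ft.<fully qualified method name>.invocations.total
ft.<fully qualified method name>.retry.retries.total
ft.<fully qualified method name>.fallback.calls.total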

 

Prometheus

Prometheus is the most popular open-source product for gathering metrics. It was started in 2012 at SoundCloud (an online audio distribution and music sharing platform) and graduated from the Cloud Native Computing Foundation in 2018.

 

You can see it as a database for storing time series, but it has many more features:

  • Multi-dimensional data model with time series
  • Query Language
  • Pull data from metric sources
  • Alert Manager

MicroProfile Metrics therefore exposes the values in the Prometheus format by default, so they can easily be consumed by scrapers.
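
For instance, a scrape of the /metrics endpoint returns plain-text lines roughly like the ones below; the exact metric names depend on the implementation and spec version, and application:inventory_size is simply how the inventorySize gauge from earlier might be rendered:

# TYPE base:memory_used_heap_bytes gauge
base:memory_used_heap_bytes 1.39460608E8
# TYPE vendor:system_cpu_load gauge
vendor:system_cpu_load 0.24
# TYPE application:inventory_size gauge
application:inventory_size 12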

 

Integrate

The following steps describe how you can deploy Payara Server, which contains all the MicroProfile implementations, including those for Metrics and Fault Tolerance, into a Docker environment together with a Prometheus instance to gather the metrics.

As an example application, you can use the demo code from the presentation “Deploy, monitor, and take control of your Micro-Services with MicroProfile”, which explores the same topic. You can find the source in this GitHub repository.

 

Create Network

To make it easier to connect the different Docker instances, create a specific network in Docker with the following command:

docker network create demo-net

Create Application Image

For our tests, we create a specific image based on the official Payara Docker image, to which we add the WAR file with our application.

 

FROM payara/server-full
COPY ./target/monitoring.war $DEPLOY_DIR

 

You can execute the following commands in a terminal from the root of the Maven project (which also contains the Dockerfile).

mvn clean package
 

And then create the image with:

docker build -t demo/service .

Startup the Image with the Application

Now that we have the Docker image, let's start up a container with this image:

docker run -d -p 8080:8080 --name service --net demo-net demo/service

The name service is important here, as it is used in the Prometheus configuration file. We have defined there that Prometheus looks up the application through DNS by using the name service.

 

You can verify that everything is OK by calling the following URL in your browser: http://localhost:8080/monitoring/rest/hello
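
From a terminal, assuming curl is available, you can also check both the application endpoint and the metrics endpoint that Prometheus will scrape later on:

curl http://localhost:8080/monitoring/rest/hello
curl http://localhost:8080/metrics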

 

Create Prometheus Image

We are creating a special Docker image containing the Prometheus server (from Adam Bien's Dockland), which also contains our Prometheus configuration.

 

The Dockerfile content:

 

FROM airhacks/prometheus
COPY prometheus.yml .

 

The configuration file contains the following lines:

 

global:
  scrape_interval: 15s
  external_labels:
    monitor: 'payara5-monitor'

scrape_configs:
  - job_name: 'payara5'
    scrape_interval: 2s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['service:8080']

 

Run the docker build command from the directory containing both files:

docker build -t demo/prometheus .

Startup the Image with Prometheus

Now that we have the Prometheus Docker image, let's start the container with this image:

docker run -d -p 9090:9090 --name prometheus --net demo-net demo/prometheus

You can verify that the connection with the application works with the following steps:

 

1. Open the browser with the URL http://localhost:9090.

2. Select the metric vendor:system_cpu_load in the drop-down.

3. Put some load on your machine with the (purposely) badly written multi-threaded prime number check at http://localhost:8080/monitoring/rest/prime.

4. You should see a spike in CPU usage after you have pressed the Execute button.
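
Instead of picking a metric from the drop-down, you can also type a PromQL expression yourself. For example, this query (just an illustration) averages the vendor:system_cpu_load metric over the last minute:

avg_over_time(vendor:system_cpu_load[1m])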

 

Gain Better Insight Into Application Resilience

The metrics exposed by MicroProfile Metrics are in the Prometheus format by default. This makes it very easy to gather all your custom and system-defined values within Prometheus. You just need to point a scraper to the running Payara instance.

 

MicroProfile Fault Tolerance also has support for Metrics, so that we can gain better insight into the resilience of our applications.

 

 
