GCP Goodies Part 5: Stackdriver logging and log-based alerting

Krzysztof Grajek
SoftwareMill Tech Blog
Sep 26, 2019

Jon Sullivan @ Flickr CC 2.0

Among many different types of data that Google Stackdriver can gather about your infrastructure are logs. Stackdriver allows you to store, search, analyze, monitor, and alert with logs coming from GCP or even AWS.

When you run a Kubernetes cluster, it’s nice to have a persistent logging solution so you can browse through the logs across pod restarts and new releases. Even nicer is the ability to create custom metrics from the log data, such as watching for exceptions or counting occurrences of exceptional behavior over time. Once the metrics are defined, it’s just a step away from creating alerts like we did in the previous part.

In this part of the series, we are going to create a Kubernetes cluster with a single pod deployed. The pod itself will be a simple Scala Play application with some endpoints allowing you to throw some exceptions and log some entries with INFO or ERROR levels. This way we can query them on the Stackdriver Logging console and create custom metrics.

Initial Setup

This time we will take a slightly different approach: we won’t use the Google Cloud Shell console but will work straight from our own terminal. The only difference, compared to the previous parts, is that you have to authenticate with the gcloud command so that all subsequent commands run in the context of an authorized user and the project you are working on. To do that, fire up your terminal and execute:

NAME=stackdriver-test2
ZONE=us-west2-a
gcloud auth login
gcloud config set compute/zone $ZONE
gcloud config set project softwaremill-playground-2

Change the values like NAME, ZONE and project name in the snippet above according to your needs.

Web Application

Fork the gcp-goodies repository, clone it, and navigate to part-5. There you will find a ready-made web application that you can release as a Docker container. You can first try it out locally with docker:publishLocal, or release it straight to Google Container Registry (GCR) with the sbt release plugin/command.

sbt clean docker:publishLocal
docker run -p 9000:9000 --name playapp -it eu.gcr.io/softwaremill-playground-2/play-scala-stackdriver-logging:1.0

The application itself provides the following endpoints:

GET /donothing  -- logs INFO messages to System.out
GET /thrownpe   -- throws NullPointerException
GET /throwiae   -- throws IllegalArgumentException
GET /logerrors  -- logs ERROR messages to System.err

All the exceptions thrown by our application are sent to System.err. System.out and System.err were chosen deliberately as log destinations because Stackdriver, by default, reads from those two sources and marks the messages on the Stackdriver logging console as INFO and ERROR respectively.
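The stream split is easy to see outside of the application as well. The shell sketch below is only an illustration and is not part of the repository: log_info and log_error are hypothetical stand-ins for what the Play app does with System.out and System.err.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the app's logging: INFO goes to stdout, ERROR to stderr.
log_info()  { printf 'INFO %s\n' "$1"; }
log_error() { printf 'ERROR %s\n' "$1" >&2; }

emit_logs() {
  log_info "request handled"
  log_error "request failed"
}

# Capture each stream separately to see which message lands where.
stdout_only=$(emit_logs 2>/dev/null)
stderr_only=$(emit_logs 2>&1 >/dev/null)

printf 'stdout: %s\nstderr: %s\n' "$stdout_only" "$stderr_only"
```

Anything a container writes to the first stream shows up as INFO and anything on the second as ERROR, which is why the choice of stream matters more here than the level name inside the message.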

In our logback configuration ( conf/logback.xml ) we need to filter out the messages we are not interested in:

Please note that we have reduced the INFO and WARN levels to a single INFO level on the Stackdriver side. This is OK for the purposes of this tutorial, but it may not be for a production setup. If you want full support for your logging level information, you can either check out the setup described in ‘Setting Up Stackdriver Logging for Java’ or write your own PatternLayout like here.

EDIT:

A new project has been added to the GCP Goodies repository under the name part-5-1. It uses Google’s library to make Logback-based logging work with Stackdriver. Please have a look if you are interested.

Once you stop playing around with our Play application locally, you can remove it from the local docker with:

docker rm playapp

The best way to use our application in Google’s Kubernetes cluster is to dockerize it and send it over to Google Container Registry so it will be available to our cluster without the need for additional authentication etc.

Navigate to build.sbt and change the address to your GCR appropriately:

dockerRepository := Some("eu.gcr.io/softwaremill-playground-2")

then start the release process:

sbt clean release

After you pick your version numbers, the release process starts and puts your newly created Docker image in GCR.

Setting up the cluster

Now it’s time to set up the cluster where we will deploy the previously built web application. This time, again, we will do it a bit differently and we won’t be using Google Deployment Manager. Simply execute the following commands to set up a 2-node GKE cluster:

NAME=stackdriver-test2
ZONE=us-west2-a
gcloud config set compute/zone $ZONE
gcloud container clusters create $NAME --num-nodes=2

Check that everything is in place with the following commands:

gcloud compute instances list
gcloud container clusters get-credentials $NAME
kubectl get po

You should see two nodes of your cluster listed and no pods deployed.
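If you would rather have a script confirm that state than read the listings yourself, something along these lines could work. It is a minimal sketch with hypothetical helper names, and it assumes kubectl already points at the new cluster (that is, get-credentials has been run).

```shell
#!/usr/bin/env bash
# Count the nodes registered with the cluster the current kubectl context points at.
node_count() {
  kubectl get nodes --no-headers | wc -l | tr -d ' '
}

# Count pods in the default namespace. stderr is dropped because recent kubectl
# versions print their "No resources found" notice there when nothing is deployed.
pod_count() {
  kubectl get pods --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Succeeds only for the state described above: two nodes and no pods.
cluster_is_fresh() {
  [ "$(node_count)" = "2" ] && [ "$(pod_count)" = "0" ]
}

# Usage: cluster_is_fresh && echo "cluster ready for deployment"
```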

Deploying Play Server application

Assuming you have your dockerized application on GCR available you can simply deploy it to our cluster directly with kubectl:

kubectl create deployment stackdriver-test3-logging --image=eu.gcr.io/softwaremill-playground-2/play-scala-stackdriver-logging:1.1

If you ever need to update it (e.g. with a new version tag):

kubectl set image deployment/stackdriver-test3-logging play-scala-stackdriver-logging=eu.gcr.io/softwaremill-playground-2/play-scala-stackdriver-logging:1.2

To make it easier to call the different endpoints of our web application, we will expose its port to the world:

kubectl expose deployment stackdriver-test3-logging --type=LoadBalancer --port 80 --target-port 9000
kubectl get service

External IP: 35.235.124.94
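The external IP usually takes a little while to be assigned, so kubectl get service may show it as pending at first. The sketch below polls for it. The function names, attempt count, and delay are arbitrary choices made for illustration; only the service name comes from the expose command above.

```shell
#!/usr/bin/env bash
# Assumed name: the service created by `kubectl expose` above.
SERVICE=${SERVICE:-stackdriver-test3-logging}

# Print the load balancer IP of the service, or nothing if it is not assigned yet.
external_ip() {
  kubectl get service "$1" -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Poll until an IP appears; gives up with a non-zero status after `attempts` tries.
wait_for_external_ip() {
  local service=$1 attempts=${2:-30} delay=${3:-5} tries=0 ip
  while [ "$tries" -lt "$attempts" ]; do
    ip=$(external_ip "$service")
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    sleep "$delay"
    tries=$((tries + 1))
  done
  return 1
}

# Usage: EXTERNAL_IP=$(wait_for_external_ip "$SERVICE")
```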

Enable Stackdriver

Stackdriver for a Kubernetes cluster on Google Cloud is not enabled by default. You can either deploy a new cluster with the --enable-stackdriver-kubernetes flag specified (which we haven’t done) or update an existing cluster with the command below:

gcloud beta container clusters update ${NAME} --enable-stackdriver-kubernetes --zone ${ZONE}

Navigate to Stackdriver -> Logging. If you don’t see your cluster under Resources -> Kubernetes Cluster, wait a minute or two and refresh the page. Once you find it, you can navigate to your pod under the Services tab:

If you play around with the external IP and the endpoints listed at the beginning of this post, you should be able to see some logs with ERROR levels indicated by Stackdriver.
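Clicking through the endpoints by hand gets tedious when you want a steady stream of log entries. A small loop like the one below can do it for you. This is a sketch rather than part of the repository: the function names are made up, and the IP in the usage line is the example address shown earlier, so substitute your own.

```shell
#!/usr/bin/env bash
# The four endpoints listed at the beginning of this post.
ENDPOINTS="donothing thrownpe throwiae logerrors"

# Build the URL for one endpoint; the service was exposed on port 80.
endpoint_url() {
  printf 'http://%s/%s\n' "$1" "$2"
}

# Call every endpoint `rounds` times and print the HTTP status of each call.
# The exception-throwing endpoints will most likely answer with an error status.
hit_endpoints() {
  local ip=$1 rounds=${2:-3} round=0 path url status
  while [ "$round" -lt "$rounds" ]; do
    for path in $ENDPOINTS; do
      url=$(endpoint_url "$ip" "$path")
      status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
      printf '%s %s\n' "$status" "$url"
    done
    round=$((round + 1))
  done
}

# Usage: hit_endpoints 35.235.124.94 5
```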

Log based Metrics

OK, so we have our application up and running on the GKE cluster and throwing some exceptions, which is good :). Now we can configure a metric that gathers the particular errors we are interested in into one dataset we can analyse. Let’s say we want to create a metric for all NullPointerExceptions happening in our application. You can first search the logs for this particular exception on our cluster by specifying the filters:

The fastest way to create a metric is to simply specify the name for your new metric, save it, and then click View Logs for Metric:

You can then update the filtering to suit your needs, like in the example below:

resource.type="k8s_container"
resource.labels.cluster_name="stackdriver-test2"
resource.labels.namespace_name="default"
resource.labels.container_name="play-scala-stackdriver-logging"
severity>=ERROR
"NullPointerException"

and update the metric in the Metric editor on the right side of the page.
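The same metric can also be created from the terminal instead of the console. The sketch below assumes the cluster and container names used in this post; the metric name npe-count, the description text, and the helper function names are made up for illustration. It relies on the gcloud logging metrics create and gcloud logging read commands, and on the fact that filter expressions separated by whitespace are combined with AND.

```shell
#!/usr/bin/env bash
# Build the log filter shown above for a given cluster, container and search term.
build_filter() {
  local cluster=$1 container=$2 term=$3
  printf 'resource.type="k8s_container" '
  printf 'resource.labels.cluster_name="%s" ' "$cluster"
  printf 'resource.labels.namespace_name="default" '
  printf 'resource.labels.container_name="%s" ' "$container"
  printf 'severity>=ERROR "%s"\n' "$term"
}

# Create a log-based counter metric; the default metric name is a made-up example.
create_npe_metric() {
  gcloud logging metrics create "${1:-npe-count}" \
    --description="NullPointerExceptions logged by the Play app" \
    --log-filter="$(build_filter stackdriver-test2 play-scala-stackdriver-logging NullPointerException)"
}

# Preview a few of the entries the metric would count before creating it.
preview_npe_logs() {
  gcloud logging read \
    "$(build_filter stackdriver-test2 play-scala-stackdriver-logging NullPointerException)" \
    --limit=5 --format='value(timestamp, severity)'
}

# Usage: preview_npe_logs && create_npe_metric
```

Keeping the filter in one function means the preview and the metric are guaranteed to use the same expression.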

Log based Alerting

By navigating to Create alert from metric in the log-based metrics explorer, you can define a new alert that is triggered when certain conditions are met. Of course there are many options to choose from, but for the purposes of this tutorial we are going to create an alert that fires whenever a NullPointerException occurs.

The resource type and metric will already be filled in for you. Select an aggregator and a condition in the Configuration section to choose when the alert should be triggered:

After saving the new alerting policy, you can go back to the application endpoint and hit it to generate some more exceptions. After a minute or two the policy fires and the notification is sent.

You can browse incident details on the Stackdriver panel where you will be able to see recent history of your metric:

The policy we created was a trivial example to show how log-based alerting can be used. Be aware that for errors and exceptions happening within your cluster there is a dedicated page called Stackdriver Error Reporting, where you can browse the exceptions found in your application and set up basic email notifications to be told when they occur.

Watch this space for more Stackdriver and Google Cloud Platform topics in general. Stackdriver itself has a lot to offer and, although clunky at times, it can be a valuable tool in the toolbox of every developer working with Kubernetes on GCP or other Google computing resources.
