When I first started working with Kubernetes, I'd heard that it was an incredible tool that eases the development and deployment process. It comes with a bunch of solutions that make it easier to build a reliable and scalable application. Over the past three years, I have come to the conclusion that I know everything about scaling in k8s. Actually, I've come to that conclusion several times, which brought me to the point where I want to share all the checkpoints I've reached so far. I wonder how many are still to be discovered…

The basics

Running your application in Kubernetes should be pretty easy, as long as you've already built it into a Docker image and pushed it to some registry. All you need to do at this point is create a manifest file for K8s and apply it to your cluster. This will trigger pulling the Docker image from wherever it is stored (it may also be available locally) and actually running it.

Your sample deployment manifest can look as follows:

# manifest.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-scale-app
spec:
  selector:
    matchLabels:
      app: kube-scale-app
  template:
    metadata:
      labels:
        app: kube-scale-app
    spec:
      containers:
      - name: kube-scale-app
        image: gcr.io/kubernetes-scaling-268016/worker-app:deed569  

You can spin it up by running:

$ kubectl apply -f manifest.yaml

As you can see, the deployment definition contains a description of the containers (a Docker image), as well as the names and labels that indicate which pods belong to this deployment.

$ kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
kube-scale-app-7f6db7d4b4-5xk4w   1/1     Running   0          1m
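
As a small aside, those labels are also what you can filter on: since every pod created from this template carries app=kube-scale-app, a label selector lists just the pods of this deployment. Purely for illustration:

$ kubectl get pods -l app=kube-scale-app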

While it's very cool to have the app up and running, it's not very scalable at the moment.

Setting up more

The first and most obvious way of scaling a Kubernetes application is to use the replicas property on the deployment's spec:

# manifest.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-scale-app
spec:
  replicas: 2
  ...

Once you run kubectl apply -f manifest.yaml again, you can see that another pod is being created.

$ kubectl get pods
NAME                              READY   STATUS              RESTARTS   AGE
kube-scale-app-7f6db7d4b4-5xk4w   1/1     Running             0          19m
kube-scale-app-7f6db7d4b4-8wx4q   0/1     ContainerCreating   0          2s

That wasn't that hard, right? But there is one problem with this solution: it requires manual action from a human being. You, as a maintainer, need to know when the app should be scaled up or down, and how many replicas are enough at a particular moment. Needless to say, it is not a good idea to have to worry about your app 24/7, especially since the middle of the night for you might be peak hours for clients across the globe. Spoiler alert: some ways of automating this will be described in the advanced part of our guide.
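
As a side note, when you do decide to scale by hand, you don't even have to edit the manifest. kubectl can change the replica count of an existing deployment directly; a minimal sketch, using the deployment name from the manifest above and an arbitrary target of 4 replicas:

$ kubectl scale deployment/kube-scale-app --replicas=4

Keep in mind that the next kubectl apply of a manifest with replicas: 2 will bring the count back down to 2.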

But first, how will you be able to update the app without causing downtime for your precious users?

Rolling, rolling…

One of the most popular features of Kubernetes deployments is the rolling update. The idea behind it is fairly simple: when you want to roll out a new version of, say, a two-pod service, you do it gradually: first start one pod with the new version, then kill one of the old ones, then create the second new-version pod, and finally kill the last old-versioned one. By doing so you achieve two things: you always have at least two pods ready to serve the traffic, and along the way you can observe whether the new version is broken (allowing you to stop and revert the deploy before all the old, valid pods are deleted). In Kubernetes jargon, we say that the deployment uses the RollingUpdate strategy. The following output of kubectl get pods --watch shows exactly how the process takes place:

$ kubectl get pods --watch
NAME                              READY   STATUS    RESTARTS   AGE
kube-scale-app-86878d9c85-qfm89   1/1     Running   0          8s
kube-scale-app-86878d9c85-tsz5p   1/1     Running   0          9s
kube-scale-app-fb58cc9f-f9zgv     0/1     Pending   0          0s
kube-scale-app-fb58cc9f-f9zgv     0/1     Pending   0          0s
kube-scale-app-fb58cc9f-f9zgv     0/1     ContainerCreating   0          0s
kube-scale-app-fb58cc9f-f9zgv     1/1     Running             0          2s
kube-scale-app-86878d9c85-qfm89   1/1     Terminating         0          26s
kube-scale-app-fb58cc9f-x6hcl     0/1     Pending             0          0s
kube-scale-app-fb58cc9f-x6hcl     0/1     Pending             0          0s
kube-scale-app-fb58cc9f-x6hcl     0/1     ContainerCreating   0          0s
kube-scale-app-fb58cc9f-x6hcl     1/1     Running             0          1s
kube-scale-app-86878d9c85-tsz5p   1/1     Terminating         0          28s
kube-scale-app-86878d9c85-qfm89   0/1     Terminating         0          27s
kube-scale-app-86878d9c85-tsz5p   0/1     Terminating         0          29s
kube-scale-app-86878d9c85-tsz5p   0/1     Terminating         0          37s
kube-scale-app-86878d9c85-tsz5p   0/1     Terminating         0          37s
kube-scale-app-86878d9c85-qfm89   0/1     Terminating         0          36s
kube-scale-app-86878d9c85-qfm89   0/1     Terminating         0          36s
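
How many extra pods may be created above the desired count during such a rollout, and how many may be unavailable at a time, can also be set explicitly instead of relying on the defaults. The fragment below is only a sketch with example values; maxSurge and maxUnavailable are the standard knobs of the RollingUpdate strategy:

# manifest.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-scale-app
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most one extra pod above the desired count during the rollout
      maxUnavailable: 0  # never drop below the desired count of available pods
  ...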

Rolling updates are also the default behavior for all deployments. If, for some reason, you don't want that and would rather kill all the old pods first and only then create the new ones, you should look into the Recreate strategy, which can be defined in the manifest as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-scale-app
spec:
  replicas: 2
  strategy:
    type: Recreate
  ...

After applying this manifest, the rollout looks as follows:

$ kubectl get pods --watch
NAME                            READY   STATUS    RESTARTS   AGE
kube-scale-app-fb58cc9f-f9zgv   1/1     Running   0          2m20s
kube-scale-app-fb58cc9f-x6hcl   1/1     Running   0          2m18s
kube-scale-app-fb58cc9f-f9zgv   1/1     Terminating   0          2m36s
kube-scale-app-fb58cc9f-x6hcl   1/1     Terminating   0          2m34s
kube-scale-app-fb58cc9f-f9zgv   0/1     Terminating   0          2m37s
kube-scale-app-fb58cc9f-x6hcl   0/1     Terminating   0          2m35s
kube-scale-app-fb58cc9f-x6hcl   0/1     Terminating   0          2m35s
kube-scale-app-fb58cc9f-f9zgv   0/1     Terminating   0          2m42s
kube-scale-app-fb58cc9f-f9zgv   0/1     Terminating   0          2m42s
kube-scale-app-fb58cc9f-x6hcl   0/1     Terminating   0          2m40s
kube-scale-app-fb58cc9f-x6hcl   0/1     Terminating   0          2m40s
kube-scale-app-86878d9c85-kfvqq   0/1     Pending       0          0s
kube-scale-app-86878d9c85-mrlbb   0/1     Pending       0          0s
kube-scale-app-86878d9c85-kfvqq   0/1     Pending       0          0s
kube-scale-app-86878d9c85-mrlbb   0/1     Pending       0          0s
kube-scale-app-86878d9c85-kfvqq   0/1     ContainerCreating   0          0s
kube-scale-app-86878d9c85-mrlbb   0/1     ContainerCreating   0          0s
kube-scale-app-86878d9c85-kfvqq   1/1     Running             0          2s
kube-scale-app-86878d9c85-mrlbb   1/1     Running             0          2s

Before we move forward, there is one more thing worth mentioning about the rolling update strategy. Or one thing that is missing, actually. Imagine having a service that fetches some configuration from an external source at startup. Then the data in that source changes, and you want the service to read it again. In this case you need to restart all the pods, but how should you do this? While there is no command like kubectl rolling restart (newer kubectl versions do offer kubectl rollout restart, though), you can use a workaround that I've been relying on for some time now. What you need to do is make a change to the deployment definition that is small enough not to alter the application's behavior, but big enough to force Kubernetes to restart it. I use the patch subcommand for this and change one of the annotations. You can set some arbitrary value there, but you could also set it to something that is constantly changing, like a timestamp; this would allow you to run the restart consistently, e.g. in a script (a sketch of that follows below). The command for patching a deployment looks as follows:

$ kubectl patch deployments/kube-scale-app --patch '{"spec": {"template": {"metadata": {"annotations": {"restart": "2"}}}}}'
deployment.apps/kube-scale-app patched

And this is enough to perform a poor man's rolling restart on a deployment.
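
And for the script-friendly, timestamp-based variant mentioned above, a sketch could look like this (the annotation key restarted-at is just an arbitrary name picked for this example):

$ kubectl patch deployments/kube-scale-app --patch \
    "{\"spec\": {\"template\": {\"metadata\": {\"annotations\": {\"restarted-at\": \"$(date +%s)\"}}}}}"

Because the timestamp is different on every run, each invocation changes the pod template and therefore triggers a fresh rolling update.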

To be continued

This wraps up the basics part of the guide. Stay tuned for the second part, which will also cover making your applications more resilient to various operations and accidents at the cluster level.