Dockerless, part 2: How to build a container image for a Rails application without Docker and a Dockerfile

Illustration of a person holding their nose and disposing of a full trash bag into a bin, with a displeased cat and scattered litter nearby, indicating a smelly situation.

Disclaimer: at the moment of writing this article mkdev is not running containers in production. The images built below are only used for development, tests and our CI system, and are never run on production servers. Once mkdev decides to use containers in production, the contents and setup of our container images will change to be actually suitable for prod. Keep this in mind when reading this post.

UPDATE: After writing this series, I also made a video in which I show how to use Podman and Systemd to build and run containers. Check it out


In the previous article we looked at all the reasons why you would want to taste a Dockerless life. We decided to try two new tools that will replace Docker: Buildah and Podman. In this article we will learn what Buildah is and how to use it to put your Ruby on Rails application into a container.

What is a container image?

Before we learn the tool, let's first learn what a container image is by reading the article A sysadmin's guide to containers. From there we learn that a container image is a TAR file of two things:

  1. Container root filesystem. Simply put, it's a directory with all the regular directories you would expect to find inside the container, like /usr, /home etc.
  2. JSON file, a config file that defines how to run this root filesystem -- which commands to execute, which environment variables to set and so on.

The contents of a container image are defined in the OCI image spec, your go-to destination if you want to learn more about the structure of container images. It might sound surprising, but the image-spec is not limited to container images; you can use it for other things too.

What is Buildah?

Buildah is a container image builder tool that produces OCI-compliant images. It is distributed as a single binary and is written in Go. Buildah is available as a package in most modern Linux distributions; just follow the official installation instructions.

Buildah can only be used to manipulate images. Its job is to build container images and push them to registries. There is no daemon involved, and Buildah does not require root privileges to build images. This makes Buildah especially handy as part of a CI/CD pipeline -- you can easily run Buildah inside a container without granting this container any root rights.

To me personally, the whole Docker-in-Docker setup required on container-based CI systems (GitLab CI with the Docker executor, for example) just to be able to build a new container image always felt like overkill. With Buildah there is no need for this, thanks to its narrow focus on the things it needs to do well and the things it should not do at all.
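For illustration, here is roughly what such a CI job could execute inside an unprivileged container (a sketch only: the registry URL and credential variables are placeholders, and an image like quay.io/buildah/stable is a common choice for the job image):

# Build the image from the Dockerfile in the current directory and push it,
# without a Docker daemon and without privileged mode
buildah bud -t registry.example.com/myteam/app:latest .
buildah push --creds "$REGISTRY_USER:$REGISTRY_PASSWORD" \
  registry.example.com/myteam/app:latest \
  docker://registry.example.com/myteam/app:latest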

One place where Buildah appears to be very useful is BuildConfigurations in OpenShift. Starting from OpenShift 4.0, BuildConfigs will rely on Buildah instead of Docker, thus removing the need to share any sockets or to have privileged containers inside the OpenShift platform. Needless to say, this results in a more secure and cleaner way to build container images inside one of the most popular container platforms out there.

Images built by Buildah can be used by Docker without any issues. They are not "Buildah Images", but rather just "Container Images": they follow the OCI specification, which is understood by Docker as well. So how do we build an image with Buildah?

With Buildahfile

Just kidding, there is actually no Buildahfile involved. Instead, Buildah can simply read Dockerfiles, making the transition from Docker to Buildah as easy as it can get.

At mkdev we used to have Mattermost at the core of our messaging platform. It is important that we are able to run Mattermost locally, so that we can easily develop integrations between the primary web application and the messaging system.

Even though Mattermost already provides official Docker images, we had to build our own due to the way we prefer to configure it, and also to make it easier to run ephemeral test instances of Mattermost. We also want to pre-install certain Mattermost plugins that our mentors rely on. So we took the official Dockerfile, modified it a bit and fed it to Buildah:

FROM alpine:3.9


# Some ENV variables
ENV PATH="/opt/mattermost/bin:${PATH}"
ENV MM_VERSION=5.8.0
# Set defaults for the config
ENV MMDBCON=localhost:5432 \
    MMDBKEY=XXXXXXXXXXXX \
    MMSMTPUSERNAME=postfix \
    MMSMTPPASSWORD=secrets \
    MMSMTPSALT=XXXXXXXXXXXX \
    MMGITHUBSECRET=secret \
    MMGITHUBHOOK=localhost
# Build arguments for the user and group IDs
ARG PUID=2000
ARG PGID=2000
# Install some needed packages
RUN apk add --no-cache \
 ca-certificates \
 curl \
 jq \
 libc6-compat \
 libffi-dev \
 linux-headers \
 mailcap \
 netcat-openbsd \
 xmlsec-dev \
 && rm -rf /tmp/*
## Get Mattermost
RUN mkdir -p /opt/mattermost/data /opt/mattermost/plugins /opt/mattermost/client/plugins \
    && cd /opt \
    && curl https://releases.mattermost.com/$MM_VERSION/mattermost-team-$MM_VERSION-linux-amd64.tar.gz | tar -xvz \
    && curl -L https://github.com/mattermost/mattermost-plugin-github/releases/download/v0.7.1/github-0.7.1.tar.gz -o /tmp/github.tar.gz \
    && cd /opt/mattermost/plugins \
    && tar -xvf /tmp/github.tar.gz
COPY files/entrypoint.sh /
COPY files/mattermost.json /opt/mattermost/config/config.json
RUN chmod +x /entrypoint.sh \
    && addgroup -g ${PGID} mattermost \
    && adduser -D -u ${PUID} -G mattermost -h /mattermost mattermost \
    && chown -R mattermost:mattermost /opt/mattermost /opt/mattermost/plugins /opt/mattermost/client/plugins
USER mattermost
# Configure entrypoint and command
ENTRYPOINT ["/entrypoint.sh"]
WORKDIR /opt/mattermost
CMD ["mattermost"]
# Expose Mattermost port 8065
EXPOSE 8065
# Declare volumes for mount point directories
VOLUME ["/opt/mattermost/data", "/opt/mattermost/logs", "/opt/mattermost/config", "/opt/mattermost/plugins", "/opt/mattermost/client/plugins"]

If it looks to you just like any other regular Dockerfile, that's because it is, in fact, just a regular Dockerfile. Let's run Buildah:

buildah bud -t docker.io/mkdevme/mattermost:5.8.0 .

The output that follows is similar to what you see when you run the docker build . command. The resulting image is stored locally; you can see it when you run the buildah images command. A nice little feature of Buildah is that your images are user-specific, meaning that only the user who built an image is able to see and use it. If you run buildah images as any other system user, you won't see anything. This is different from Docker, where docker images lists the same set of images for all users.

Once you have built the image, you can push it to a registry. Buildah supports multiple transports to push your image. Some transport examples: docker-daemon -- if you still have Docker running locally and want this image to be visible to Docker, and docker -- if you want to push the image to a Docker API compatible remote registry. There are other transports that are not Docker-specific: oci, containers-storage, dir etc.
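To make the transports a bit more concrete, here are a few hedged examples of what those push commands could look like, using the Mattermost image we just built (destination names are placeholders):

# Push into a locally running Docker daemon
buildah push docker.io/mkdevme/mattermost:5.8.0 docker-daemon:mkdevme/mattermost:5.8.0
# Push to a Docker API compatible remote registry
buildah push docker.io/mkdevme/mattermost:5.8.0 docker://docker.io/mkdevme/mattermost:5.8.0
# Dump the image as an OCI layout directory
buildah push docker.io/mkdevme/mattermost:5.8.0 oci:./mattermost-oci:5.8.0
# Dump the image into a plain directory (used in the next section)
buildah push docker.io/mkdevme/mattermost:5.8.0 dir:./mattermost-dir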

Nothing stops you from using Buildah to push the image to Docker Hub, if that's your registry of choice. By using Buildah we are no longer thinking in terms of Docker Images. It's more like having a Git repository that we could push to GitHub, GitLab or BitBucket. In the same way, we can push our Container Image to the registry of our choice -- Docker Hub, Quay, AWS ECR and others.

Inspecting the image

One of the transports Buildah supports is dir. When you push your image to dir, which is just a directory on the filesystem, Buildah stores there the tarballs for the layers, the configuration of your image and a JSON manifest file. This is only useful for debugging and perfect for seeing the internals of an image.

Create some directory and run buildah push IMAGE dir:/$(pwd). I don't expect you to actually build a Mattermost image, just use any other image. If you don't have any and don't want to build one, just buildah pull any image from Docker Hub.
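As a minimal sketch, assuming you go with alpine:

mkdir inspect-me && cd inspect-me
buildah pull alpine:3.9            # skip this if you already have a local image
buildah push alpine:3.9 dir:$(pwd)
ls                                 # digest-named tarballs plus manifest.json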

Once finished, you will see files with names like 96c6e3522e18ff696e9c40984a8467ee15c8cf80c2d32ffc184e79cdfd4070f6 -- each of them is actually a tarball. You can untar such a file into a destination of your choice and see all the files inside that image layer. You will also see an image manifest.json file; in the case of Mattermost it looks like this:

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:57ea4e4c7399849779aa80c7f2dd3ce4693a139fff2bd3078f87116948d1991b",
    "size": 1262
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar",
      "digest": "sha256:6bb94ea9af200b01ff2f9dc8ae76e36740961e9a65b6b23f7d918c21129b8775",
      "size": 2832039
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar",
      "digest": "sha256:96c6e3522e18ff696e9c40984a8467ee15c8cf80c2d32ffc184e79cdfd4070f6",
      "size": 162162411
    }
  ]
}

The image manifest is described by the OCI spec. If you look closely at the example above, it defines two layers (vnd.oci.image.layer.v1.tar) and one config file (vnd.oci.image.config.v1+json). We can see that the config has the digest 57ea4e4c7399849779aa80c7f2dd3ce4693a139fff2bd3078f87116948d1991b. We have this file as well, and though it looks just like the layer files, it's actually the config file of the image.
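If you have jq and tar at hand, a quick way to poke around in that directory looks roughly like this (the digests are the ones from the manifest above and will differ for your image):

# The config digest in manifest.json points at the config file, which is named by its hex digest
jq -r '.config.digest' manifest.json
jq . 57ea4e4c7399849779aa80c7f2dd3ce4693a139fff2bd3078f87116948d1991b
# Unpack the bigger layer into a scratch directory and look at its root filesystem
mkdir rootfs
tar -xf 96c6e3522e18ff696e9c40984a8467ee15c8cf80c2d32ffc184e79cdfd4070f6 -C rootfs
ls rootfs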

This might be a bit confusing, but keep in mind that this structure was created for other software to store and process, not for the human eye to read. If you need to quickly figure out which file in the image stores the config, always look at manifest.json first. Here is what the config file of the Mattermost image contains:

{
  "created": "2019-05-12T16:13:28.951120907Z",
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "User": "mattermost",
    "ExposedPorts": {
      "8065/tcp": {}
    },
    "Env": [
      "PATH=/opt/mattermost/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "MM_VERSION=5.8.0",
      "MMDBCON=localhost:5432",
      "MMDBKEY=XXXXXXXXXX",
      "MMSMTPUSERNAME=postfix",
      "MMSMTPPASSWORD=secrets",
      "MMSMTPSALT=XXXXXXXXXX",
      "MMGITHUBSECRET=secret",
      "MMGITHUBHOOK=localhost"
    ],
    "Entrypoint": [
      "/entrypoint.sh"
    ],
    "Cmd": [
      "mattermost"
    ],
    "Volumes": {
      "/opt/mattermost/client/plugins": {},
      "/opt/mattermost/config": {},
      "/opt/mattermost/data": {},
      "/opt/mattermost/logs": {},
      "/opt/mattermost/plugins": {}
    },
    "WorkingDir": "/opt/mattermost"
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:f1b5933fe4b5f49bbe8258745cf396afe07e625bdab3168e364daf7c956b6b81",
      "sha256:462e838baed1292fb825d078667b126433674cdc18c1ba9232e2fb8361fc8ac2"
    ]
  },
  "history": [
    {
      "created": "2019-05-11T00:07:03.358250803Z",
      "created_by": "/bin/sh -c #(nop) ADD file:a86aea1f3a7d68f6ae03397b99ea77f2e9ee901c5c59e59f76f93adbb4035913 in / "
    },
    {
      "created": "2019-05-11T00:07:03.510395965Z",
      "created_by": "/bin/sh -c #(nop) CMD [\"/bin/sh\"]",
      "empty_layer": true
    },
    {
      "created": "2019-05-12T16:13:28.951120907Z"
    }
  ]
}

So, just a bunch of tarballs and json files -- that's the whole container image!

You say Dockerless but you still rely on Dockerfile!

The creators of Buildah intentionally decided not to introduce a new DSL for defining container images. Buildah gives you two ways to define an image: a Dockerfile or a sequence of buildah commands. We will learn the second way shortly, but I must warn you that I don't think Dockerfiles will disappear anytime soon. And there is probably nothing wrong with them, except the name itself. Imagine investing into going Dockerless only to find yourself still writing Dockerfiles!

I wish they were called Containerfiles or Imagefiles. That would be much less awkward for the community. But as of now, the convention is to name this file Dockerfile and we simply have to deal with it.

Building images with Buildah directly

The second way to build an image with Buildah is by using buildah commands directly. The way Buildah builds images is by creating a new container from a base image and then running commands inside this container. After all the commands have run, you can commit this container to become an image. Let's build an image this way and then discuss if and when this is better than writing a Dockerfile.

We first need to start a new container from the existing image:

buildah from centos:7

If the image doesn't exist locally yet, it will be pulled from the registry, just like when you use Docker. The buildah from command returns the name of the container that was started; normally it's "IMAGE_NAME-working-container", and in our case it's centos-working-container. We need to remember to use this name in all of the following commands.

We can run commands inside this container with the buildah run command:

buildah run centos-working-container -- yum install unzip -y

And we can configure various OCI-compliant options for the future image with the buildah config command, for example an environment variable:

buildah config -e ENVIRONMENT=test centos-working-container

We can also mount the complete container filesystem on the build host and manipulate it directly with the tools installed on the host. This is useful when we don't want to install certain tools inside the image just to do some build-time manipulations. Keep in mind that in this case you need to make sure all these tools are installed on the machine of anyone who wants to build your image (which then kind of ruins the portability of your build script).

buildah mount centos-working-container

In return Buildah will give you the location of the mounted filesystem, for example /home/fodoj/.local/share/containers/storage/overlay/DIGEST/merged. Just to test, we can then create a file there: touch /home/fodoj/.local/share/containers/storage/overlay/DIGEST/merged/home/hello-from-host.
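A small caveat: when building rootless, buildah mount may refuse to run outside a user namespace. In that case wrapping the commands in buildah unshare should do the trick (a hedged sketch):

# Enter the user namespace, mount the working container and create the test file there
buildah unshare -- sh -c 'mnt=$(buildah mount centos-working-container); touch "$mnt"/home/hello-from-host'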

Once we are happy with the image, we can commit it:

buildah commit centos-working-container my-first-buildah-image

And remove the working container:

buildah rm centos-working-container

Note that even though Buildah does run containers, it provides no way to do so that would be useful for anything but building images. Buildah is not a replacement for a container engine; it only gives you some primitives to debug the process of building an image!

Images built by Buildah are visible to Podman, which will be the topic of the next article. For now, if you want to verify that the file hello-from-host really exists, run this:

image=$(buildah from my-first-buildah-image)
ls $(buildah mount $image)/home
$> hello-from-host

This will create another working container, mount it and show the contents of the /home directory. The approach we just used is actually the way to go if you want to build images with Buildah without a Dockerfile. Instead of a Dockerfile, you write a shell script that invokes all the commands, commits the image and removes the working container. That's how the "Buildahfile" (which is really just a shell script) for mkdev looks:

#!/bin/bash
set -x
mkdev=$(buildah from centos:7)
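# Download EPEL, wkhtmltopdf and the AWS CLI bundle inside the working container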
buildah run "$mkdev" -- curl -L http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm -o epel-release-latest-7.noarch.rpm
buildah run "$mkdev" -- curl -L https://downloads.wkhtmltopdf.org/0.12/0.12.5/wkhtmltox-0.12.5-1.centos7.x86_64.rpm -o wkhtmltopdf.rpm
buildah run "$mkdev" -- curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip"
buildah run "$mkdev" -- rpm -ivh epel-release-latest-7.noarch.rpm
buildah run "$mkdev" -- yum install centos-release-scl -y
buildah run "$mkdev" -- yum install unzip postgresql-libs postgresql-devel ImageMagick \
                       autoconf bison flex gcc gcc-c++ gettext kernel-devel make m4 ncurses-devel patch \
                       rh-ruby25 rh-ruby25-ruby-devel rh-ruby25-rubygem-bundler rh-ruby25-rubygem-rake \
                       rh-postgresql96-postgresql openssl-devel libyaml-devel libffi-devel readline-devel zlib-devel \
                       gdbm-devel ncurses-devel gcc72-c++ \
                       python-devel git cmake python2-pip chromium chromedriver which -y
buildah run "$mkdev" -- pip install ansible boto3 botocore
buildah run "$mkdev" -- yum install wkhtmltopdf.rpm -y
buildah run "$mkdev" -- ln -s /usr/local/bin/wkhtmltopdf /bin/wkhtmltopdf
buildah run "$mkdev" -- unzip awscli-bundle.zip
buildah run "$mkdev" -- ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
buildah run "$mkdev" -- yum clean all && rm -rf /var/cache/yum
git archive -o app.tar.gz --format=tar.gz HEAD
buildah add "$mkdev" app.tar.gz /app/
buildah add "$mkdev" infra/app/build/entrypoint.sh /entrypoint.sh
buildah config --workingdir /app "$mkdev"
buildah run "$mkdev" -- scl enable rh-ruby25 "bundle install"
rm app.tar.gz
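# Configure the image: exposed port, entrypoint, default command and locale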
buildah config --port 3000 "$mkdev"
buildah config --entrypoint '[ "/entrypoint.sh" ]' "$mkdev"
buildah run "$mkdev" -- chmod +x /entrypoint.sh
buildah config --cmd "bundle exec rails s -b '0.0.0.0' -P /tmp/mkdev.pid" "$mkdev"
buildah config --env LC_ALL="en_US.UTF-8" "$mkdev"
buildah run "$mkdev" -- rm -rf /app/
buildah commit "$mkdev" "docker.io/mkdevme/app:dev"
buildah rm "$mkdev"

This script probably looks extremely stupid to you if you ever produced a good container image in your life. Let me explain some of the things that are happening there:

  1. We use CentOS 7 as a base image because in production we run on CentOS 7. Even if we don't run containers in production just yet, it makes sense to keep the development environment as close to the production one as possible.
  2. We install a ridiculous number of packages, including the AWS CLI, Chromium, Software Collections and what not. We do it because we use the resulting image both in the development environment and in our CI system. Both of these places require extra tooling to run integration tests (Chromium) or to perform packaging and deployment tasks (the AWS CLI and Ansible). Software Collections are used in our production environment, and it's important that we use the same Ruby version in all other environments as well.
  3. We remove the code of the application itself at the very end. For this use case, we don't really need the code to be in the image: in both the development environment and CI we need the latest version of the code, not something baked into the image.

We store this script inside the application repo, just like we would keep the Dockerfile there. Once we decide we want to run mkdev in containers in production, we can modify this script to do different things depending on the environment.

You can use this approach only if your build machine is able to run the shell script (and Buildah itself). This is not a problem even on Windows, thanks to WSL, for example. Your host system doesn't have to be Linux-based as long as it is able to run some kind of Linux inside! Will it work one day for macOS users without an extra Linux VM? Who knows; let's hope the Buildah developers are working on it.

How does Buildah work internally?

Both Podman and Buildah work quite similarly internally. They both make use of Linux kernel features, specifically user namespaces and network namespaces, to make it possible to run containers without any root privileges. I won't talk about it in this article, but if you can't wait, a good starting point is the documentation on user namespaces and rootless containers.
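As a small taste of what user namespaces do, you can peek at the UID mapping a rootless Buildah session gets (the exact ranges come from /etc/subuid and will differ on your machine):

buildah unshare cat /proc/self/uid_map
# Output similar to:
#          0       1000          1
#          1     100000      65536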

What's next

I hope you've learned a lot about container images today. Buildah is a great tool not only for local development, but for any kind of automation around building container images. It's not the only one available -- Kaniko from Google is another example, though Kaniko is a bit more focused on Kubernetes environments.

Now that we have an image in place, it's time to run it. In the next article I will show you how to use Podman to completely automate the local development environment for a Ruby on Rails application. We will learn how to use the Kube YAML feature of Podman to describe all the services in a Kubernetes-compliant YAML definition, how to run a Rails application in a container and how to run the tests of the Rails application in this container. Containers, and Podman in particular, will become really handy when we start creating ephemeral Mattermost instances just for integration testing.

Feel free to ask any questions in the comments below, I will make sure to reply to them directly or extend this article!


Use promo code MKDOCKERLESS for 10% discount on the new Dockerless course by Kirill Shirinkin START COURSE