Containerizing Ruby on Rails Applications

Five Improvements when Deploying Ruby on Rails Apps to Kubernetes

Apr 28, 2021 | Michael Orr

As the world moved rapidly towards container technology, we took our time. We only recently started migrating our applications to our Kubernetes platform. To migrate those applications, we had to containerize them. Containerization includes the technical step of packaging an application’s code with its dependencies, as well as changes to the application itself to make it container-friendly. Applications that were previously built and run on standard cloud servers usually need a few modifications to make them work well in a container.

As we prepared our applications to run in containers, the line between build-time and run-time became less blurry and much more strict. This distinction was less critical with our previous platform and as such, our existing Rails applications did not need to worry about what was available at build-time versus run-time. To transition these applications to our new platform, we had to make a few modifications and we’ll walk through the top 5 here.

  1. Logging to STDOUT
  2. Gems with JS Runtime Dependencies
  3. Skipping Initializers at Build-Time
  4. Chef Linked Files
  5. Optimizing the Web Server

1. Logging to STDOUT

Typically, Rails applications that are deployed to standard cloud servers are set up to log to a file. Unlike standard cloud servers, the expectation with containers is that they are ephemeral and the platform will treat them as such - containers can be destroyed, restarted, or rescheduled at any point in time. For this reason, the application should not be logging to a file in the container or those logs will be lost. The container runtime expects logs to be sent to standard output so that they can be collected for storage. We don't have to worry about losing the log files when containers are scaled down and destroyed because the data is being shipped out of the container to some external storage.

We build our application container images using Paketo Buildpacks. In our CI, we run the pack CLI to execute the buildpacks against a given application’s source code and publish the container image to our registry. When the image is built, a custom environment variable called RAILS_LOG_TO_STDOUT is enabled, and an application can use it to determine whether it should log to standard output. An application would use it in its environment configuration like this:

# config/environments/production.rb

config.log_formatter = ::Logger::Formatter.new

if ENV["RAILS_LOG_TO_STDOUT"].present?
  logger           = ActiveSupport::Logger.new(STDOUT)
  logger.formatter = config.log_formatter
  config.logger    = ActiveSupport::TaggedLogging.new(logger)
end
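
The same switch can be illustrated outside of Rails. The following is a minimal sketch using only Ruby's standard Logger, not our actual configuration; the helper names log_destination and build_logger and the log/production.log fallback path are assumptions made for illustration.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: decide where logs should go based on the environment.
# Mirrors the `present?` check above: unset or blank means "log to a file".
def log_destination(env, file_path: "log/production.log")
  env.fetch("RAILS_LOG_TO_STDOUT", "").strip.empty? ? file_path : $stdout
end

# Hypothetical helper: build a logger that writes formatted lines to `io`.
def build_logger(io)
  logger = Logger.new(io)
  logger.formatter = Logger::Formatter.new
  logger
end

# Demonstrate with an in-memory IO so the example has no side effects.
buffer = StringIO.new
build_logger(buffer).info("container started")
```

Passing the environment in as an argument (ENV in a real application, a plain Hash here) keeps the decision easy to check without touching the process environment.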

Once we’ve ensured that application logs are being safely redirected to an external location, we can proceed with building the application container image.

2. Gems with JS Runtime Dependencies

For security and performance reasons, a majority of Rails application container images should not contain NodeJS at run-time. Not having NodeJS installed reduces the attack surface, because a program cannot be exploited if it isn't there. Not having it installed also results in a smaller container image. Smaller container images can be fetched more quickly than larger ones and any time saved during an autoscaling deploy is a win.

This means that the application needs to ensure that it's not requiring gems in production that rely on NodeJS. Such gems are often necessary for development, testing, or asset compilation at build-time. If the application is requiring gems in production at run-time that rely on NodeJS, you'll probably see a failure that looks like:

/layers/paketo-buildpacks_bundle-install/gems/ruby/2.6.0/gems/execjs-2.7.0/lib/execjs/runtimes.rb:58:in `autodetect': Could not find a JavaScript runtime. See https://github.com/rails/execjs for a list of available runtimes. (ExecJS::RuntimeUnavailable)

The uglifier gem is an example of this type of issue. Uglifier compresses assets at build-time but should not be required at run-time. If you simply have the gem in your Gemfile then it is required at run-time because all gems have a default require value of true. We can set this to require: false and then require the gem during precompilation.

We provide another custom environment variable, PRECOMPILING_ASSETS, which is set to true when we are building an application’s container image in CI. This environment variable indicates that we are at a stage where the Rails Asset Pipeline will be executed and any gems required at build-time should be included. When this variable is set to false, these gems are not loaded, because they are not needed to run the application.

The uglifier entry in the Gemfile would look like:

gem "uglifier", require: false

And the assets initializer would only require the gem during asset precompilation:

# config/initializers/assets.rb
if ENV["PRECOMPILING_ASSETS"] == "true"
  require "uglifier"
end
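
If several gems are only needed during precompilation, the check can be pulled into a small helper. This is a sketch under assumptions rather than code from our applications: the names BUILD_TIME_GEMS, precompiling_assets?, and build_time_gems are hypothetical, and the list contains only the uglifier gem discussed above.

```ruby
# Hypothetical list of gems that are only needed while assets are precompiled.
BUILD_TIME_GEMS = %w[uglifier].freeze

# True only when the build has explicitly set PRECOMPILING_ASSETS=true.
def precompiling_assets?(env = ENV)
  env["PRECOMPILING_ASSETS"] == "true"
end

# Names of the gems that should be required in the current phase.
def build_time_gems(env = ENV)
  precompiling_assets?(env) ? BUILD_TIME_GEMS : []
end

# In an initializer this could be used as:
#   build_time_gems.each { |name| require name }
```

Comparing against the exact string "true" means an unset variable and a value of "false" behave the same way, which matches the initializer above.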

In some cases, your application may genuinely need a JavaScript runtime at run-time. In that case, we use and recommend the mini_racer gem instead of NodeJS.

3. Skipping Initializers at Build-Time

We found that we needed to stop some initializers from running unnecessarily when we are building the container image and precompiling assets, such as those that connect to external services that are unavailable at build-time. We use the same PRECOMPILING_ASSETS environment variable to control this behavior as well. We added the following code to the top of several initializers.

return if ENV["PRECOMPILING_ASSETS"] == "true"
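
The guard relies on a top-level return ending the file early when it is loaded. The sketch below demonstrates that behavior in isolation; the temporary file stands in for a hypothetical initializer, and the $service_connected flag is an assumption used only to make the effect observable.

```ruby
require "tempfile"

# Stand-in for an initializer that talks to an external service.
# The global flag is only here so the effect is observable.
$service_connected = false

initializer = Tempfile.new(["external_service", ".rb"])
initializer.write(<<~'RUBY')
  return if ENV["PRECOMPILING_ASSETS"] == "true"
  $service_connected = true # the real connection would be opened here
RUBY
initializer.flush

# Build-time: the guard returns before any connection is attempted.
ENV["PRECOMPILING_ASSETS"] = "true"
load initializer.path
connected_at_build_time = $service_connected

# Run-time: the guard falls through and the rest of the file runs.
ENV["PRECOMPILING_ASSETS"] = "false"
load initializer.path
connected_at_run_time = $service_connected
```

Because the guard is the first line of the file, nothing below it is evaluated during the image build, so the initializer never tries to reach a service that is unavailable there.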

4. Chef Linked Files

Prior to migrating to a container architecture, we were using Chef to copy configuration files into some applications. For example, Scout APM is a part of our monitoring solution and it needs a configuration file called scout_apm.yml to exist in the application. With Chef, we did not have to keep the file in our source tree because Chef put the file onto the server for us. When our migration to Kubernetes is complete, we will no longer be using Chef to manage the state of servers. With Kubernetes, we do not keep container images around and make changes to them, so we can't have Chef run and change things on them. Instead, we build new container images each time a change is made. Each code change or change to the server environment, such as a new package being installed, results in a new container image. This means these configuration files need to be on the container image from the beginning.

There are a few options for handling these kinds of files. The easiest way to address this issue is to check each configuration file into the source tree for your application. Another solution would be to have the image building process copy that configuration file from a centralized location and bundle it into the image at build-time.
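
When a configuration file is checked into the source tree, values that differ between environments can still be supplied when the container starts. The sketch below shows one common way to do that with a YAML template rendered through ERB. It is an illustration under assumptions, not the format of any particular tool: the keys, the MONITORING_KEY variable, and the load_config helper are all hypothetical.

```ruby
require "erb"
require "yaml"

# Hypothetical checked-in template. The secret is read from the environment
# at load time instead of being committed to the repository.
TEMPLATE = <<~YAML
  production:
    name: example-app
    key: <%= ENV.fetch("MONITORING_KEY", "") %>
    monitor: true
YAML

# Hypothetical helper: render the ERB tags, parse the YAML, and return the
# settings for one environment.
def load_config(template, environment)
  YAML.safe_load(ERB.new(template).result).fetch(environment)
end

ENV["MONITORING_KEY"] = "abc123" # placeholder value for the example
config = load_config(TEMPLATE, "production")
```

With this approach the file ships in the image from the beginning, while the sensitive value stays outside both the source tree and the image.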

5. Optimizing the Web Server

Our Rails applications use Puma as the web server. Puma has two basic modes: single mode and cluster mode. Single mode runs one Puma process that can have many threads. Cluster mode runs multiple processes, each with its own thread pool. We determined that it was best to run Puma in single mode for our metrics collection with Prometheus.

The Prometheus Ruby client has three built-in data stores. SingleThreaded is the fastest, but it is only suitable for single-threaded scenarios. We knew we wanted to run Puma with multiple threads, so that was out. DirectFileStore is another option. It is designed to work across multiple processes and threads, so it would work in cluster mode. However, the statistics for each process are stored separately because the processes can’t share memory. Prometheus would then have to run an additional web server per Puma process and use that to collect the metrics and then aggregate them. That strategy can have a noticeable impact on memory usage, so we looked at the remaining option, the default data store, Synchronized. It is thread-safe but only works for single-process scenarios. We didn’t want to take the memory hit associated with DirectFileStore, so we went with Synchronized and put Puma into single mode to use it.
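
To make the trade-off concrete, here is a simplified sketch of what a synchronized in-memory store does. It is not the Prometheus client's implementation; the SynchronizedCounter class and the metric name are hypothetical. A mutex keeps concurrent updates from threads consistent, but the values live in one process's memory, so a second Puma worker process would have its own separate copy.

```ruby
# Simplified, hypothetical stand-in for a thread-safe in-memory metric store.
class SynchronizedCounter
  def initialize
    @mutex = Mutex.new
    @values = Hash.new(0)
  end

  # Increment the counter stored under `key` while holding the lock.
  def increment(key, by = 1)
    @mutex.synchronize { @values[key] += by }
  end

  # Read the current value under the same lock.
  def get(key)
    @mutex.synchronize { @values[key] }
  end
end

counter = SynchronizedCounter.new

# Eight threads update the same metric, as request threads would in single mode.
threads = 8.times.map do
  Thread.new { 1_000.times { counter.increment("http_requests_total") } }
end
threads.each(&:join)

total = counter.get("http_requests_total")
```

Every increment is counted because all threads share the same object and lock, which is exactly the property that is lost once requests are spread across multiple processes.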

You can configure Puma to run in single mode by adding this line to config/puma.rb (or editing the value if the workers line is already present in your file).

workers 0
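
For context, a fuller config/puma.rb might look like the sketch below. Only the workers line comes from our setup as described here; the thread count and the RAILS_MAX_THREADS variable follow a common Rails convention and are assumptions, so adjust them to your own application.

```ruby
# config/puma.rb (illustrative sketch)

# Single mode: no forked worker processes, so all threads share one
# in-memory metrics store.
workers 0

# Assumed thread pool size, read from the conventional RAILS_MAX_THREADS
# variable with a fallback of 5.
max_threads = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads max_threads, max_threads
```

In single mode, concurrency comes from threads alone, so the thread pool size is the main setting to tune within each container.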

Summary

The process of transitioning to a container-based platform at Doximity is ongoing. As new applications migrate, we continue to find changes that are needed, and learn about how best to implement them across our many applications. We continue to iterate on our existing solutions to make the transition as seamless as possible for our development teams. Hopefully our running list of changes will help you as you build your Rails applications as container images.

