Read-Only Mode For Better Rails Downtime

Recently I was looking to upgrade the Postgres version on an application I’ve been working on. This would require a small amount of downtime, likely about 10 minutes.

The default solution I’d reach for in these cases would be to go into Heroku’s maintenance mode, which serves an HTML maintenance page with a 503 Service Unavailable status code. This works but makes the application entirely unusable during the upgrade, and I was hoping to find a better solution. In this particular case, I also wanted to be able to provide JSON responses as the application mainly provides an API for a mobile app.

After exploring a handful of half-baked options, I settled on using a read-only connection to the database to still allow reads but prevent any writes from occurring. While using the read-only connection, the Postgres adapter will raise an error any time we attempt to change data in the database, but we can easily rescue this specific error and convert it to a user-facing notice. I felt a bit odd using exceptions as the core of this workflow, but in the end, it worked out really well, so I wanted to share the specifics.

It’s worth noting that this solution is particularly well suited to this specific application, which only provides an API and has very read-heavy usage, but I imagine it could be extended to work with other styles of app as well.

Configuring Rails to Use the Read-Only Connection

If present, Rails will use the connection string in a DATABASE_URL env var to connect to the database. Following the Connection Preference notes in the Rails guides, I realized that I could make this DATABASE_URL usage explicit and allow for a temporary override. To do this, I added an explicit url property for the production environment with the desired connection preference:

# config/database.yml

production:
  <<: *default
  url: <%= ENV["DATABASE_URL_READ_ONLY"] || ENV["DATABASE_URL"] %>

With this in place, I can enable the read-only mode simply by setting the DATABASE_URL_READ_ONLY env var:

heroku config:set \
  DATABASE_URL_READ_ONLY='postgres://read_only_user:abc123...' \
  --remote production

Likewise, to disable the read-only mode, I can use:

heroku config:unset DATABASE_URL_READ_ONLY --remote production

Note: I was able to use Heroku’s Postgres Credentials interface to create the read-only user, but if you’re not working with Heroku you should be able to use these instructions to create your read-only user.
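
To make the precedence in that url line concrete, here is a minimal sketch that renders the same ERB expression against a plain hash standing in for ENV. The resolved_url helper and the sample connection strings are hypothetical and exist only for illustration; Rails does its own ERB and YAML processing when it loads config/database.yml.

```ruby
require "erb"
require "yaml"

# Same expression as the production entry in config/database.yml,
# but reading from a local `env` hash instead of the real ENV.
TEMPLATE = <<~YAML
  production:
    url: <%= env["DATABASE_URL_READ_ONLY"] || env["DATABASE_URL"] %>
YAML

# Hypothetical helper: renders the template and returns the resolved URL.
def resolved_url(env)
  rendered = ERB.new(TEMPLATE).result_with_hash(env: env)
  YAML.safe_load(rendered).dig("production", "url")
end

primary = { "DATABASE_URL" => "postgres://primary/app" }
read_only = primary.merge("DATABASE_URL_READ_ONLY" => "postgres://readonly/app")

puts resolved_url(primary)   # falls back to DATABASE_URL
puts resolved_url(read_only) # the read-only override wins
```

One thing this makes visible: an empty string is truthy in Ruby, so a DATABASE_URL_READ_ONLY that is set but empty would still win over DATABASE_URL. Removing the variable entirely, as heroku config:unset does, avoids that.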

Error Handling

With the other approaches I considered, I found that I had to close off several different ways of issuing writes to the database, whereas the read-only connection cut everything off in one change. That said, it was only half the solution, as I certainly didn’t want the errors making it to users.

Thankfully it was relatively straightforward to provide a centralized rescue that would allow me to handle all the errors. First, I created a module using Rails’s ActiveSupport::Concern functionality:

# app/controllers/concerns/read_only_controller_support.rb
module ReadOnlyControllerSupport
  extend ActiveSupport::Concern

  included do
    if ENV["DATABASE_URL_READ_ONLY"].present?
      rescue_from ActiveRecord::StatementInvalid do |error|
        if error.message.match?(/PG::InsufficientPrivilege/i)
          render(
            status: :service_unavailable,
            json: {
              info: "The app is currently in read-only maintenance mode. Please try again later.",
            },
          )
        else
          raise error
        end
      end
    end
  end
end

When included, this module will use Rails’s rescue_from method to capture potentially relevant errors, and then we do a quick check within that block to make sure we’re only capturing the relevant errors.

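Note that the rescue_from logic is only enabled when DATABASE_URL_READ_ONLY is set, so we’re able to reuse the existence of that variable as a way to scope this behavior. The check runs once, when the module is included as the controller class loads, so the app needs to restart to pick up a change. Heroku restarts the app whenever a config var changes, which covers that here.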

I was then able to include that module in any relevant base controller:

# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  include ReadOnlyControllerSupport
end

# app/controllers/api/base_controller.rb
class Api::BaseController < ActionController::Base
  include ReadOnlyControllerSupport
end
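
The message check inside the rescue_from block can also be pulled out and exercised on its own. The sketch below is one way to do that in plain Ruby. The read_only_violation? helper is hypothetical, the ActiveRecord::StatementInvalid class is a stand-in defined only when Rails isn't loaded, and the sample messages are illustrative guesses at what the Postgres adapter produces rather than captured output.

```ruby
# Stand-in so the sketch runs without Rails; a real app uses the framework's class.
unless defined?(ActiveRecord::StatementInvalid)
  module ActiveRecord
    class StatementInvalid < StandardError; end
  end
end

# Same pattern the rescue_from block matches against.
READ_ONLY_ERROR = /PG::InsufficientPrivilege/i

# Hypothetical helper: true only for statement errors caused by missing privileges.
def read_only_violation?(error)
  error.is_a?(ActiveRecord::StatementInvalid) && error.message.match?(READ_ONLY_ERROR)
end

# Illustrative messages, assumed to resemble what the adapter raises.
denied = ActiveRecord::StatementInvalid.new(
  "PG::InsufficientPrivilege: ERROR:  permission denied for table users"
)
unrelated = ActiveRecord::StatementInvalid.new(
  "PG::UndefinedTable: ERROR:  relation \"widgets\" does not exist"
)

puts read_only_violation?(denied)    # should be handled as read-only mode
puts read_only_violation?(unrelated) # should be re-raised
```

Keeping the check in a small predicate like this would make it easy to cover with a unit test, so unrelated StatementInvalid errors keep surfacing as real failures.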

Non-API Error Handling

My initial use case for this read-only mode only needed to support API requests, but I could imagine extending it to HTML and form-based interfaces.

The first thing I would consider would be adding a sitewide banner that stated that we were in a read-only maintenance mode to alert users to the current status.

With that in place, I think we could extend the error handling in the ReadOnlyControllerSupport module to redirect the user back and display a relevant message:

rescue_from ActiveRecord::StatementInvalid do |error|
  if error.message.match?(/PG::InsufficientPrivilege/i)
    respond_to do |format|
      format.json do
        # JSON error message as shown above
      end

      format.html do
        redirect_back(
          fallback_location: root_path,
          alert: "The app is currently in read-only maintenance mode. Please try again later.",
        )
      end
    end
  else
    raise error
  end
end
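
For the sitewide banner, a small helper could reuse the same env var as its switch. This is only a sketch: the ReadOnlyMode module, its method names, and the banner wording are all hypothetical, and how the text gets rendered would depend on the app's layout.

```ruby
# Hypothetical helper; names and wording are illustrative only.
module ReadOnlyMode
  BANNER = "The app is currently in read-only maintenance mode. " \
           "Some actions are temporarily unavailable."

  # True when the override variable is set to a non-blank value.
  # `env` defaults to the process environment but accepts any hash-like object.
  def self.enabled?(env = ENV)
    !env["DATABASE_URL_READ_ONLY"].to_s.strip.empty?
  end

  # Banner text for the layout, or nil outside read-only mode.
  def self.banner(env = ENV)
    BANNER if enabled?(env)
  end
end

puts ReadOnlyMode.banner({ "DATABASE_URL_READ_ONLY" => "postgres://readonly/app" })
puts ReadOnlyMode.banner({}).inspect
```

A layout could then render the banner only when ReadOnlyMode.banner returns a value, which keeps the read-only switch in a single place.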

Scheduler and Background Jobs

One additional consideration here would be around background jobs and scheduler processes. For background jobs things are relatively straightforward – we just need to scale our worker pool down to zero for the read-only period.

Scheduler processes are a little trickier as I didn’t have a mechanism for globally enabling or disabling them. With that in mind, I think the ideal solution would be to only ever have scheduler processes enqueue jobs but not actually do any work beyond that.

Migrations

The final sticking point we ran into was migrations. We have a release command defined in our Procfile that was configured to run rake db:migrate. Unfortunately, it turns out that even if no migrations run, Rails will still attempt to write to the ar_internal_metadata table as part of the db:migrate command, and Heroku runs the release command any time we change an env var. In my initial attempt, setting DATABASE_URL_READ_ONLY failed because the associated release command hit the read-only error when running rake db:migrate.

To work around this I wrote a small script that first checks if there are any migrations that need to be run, and only if there are, then runs rake db:migrate:

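#!/bin/bash

# Exit immediately if a command fails (the `if` condition below is exempt).
set -e

# Pending migrations show up as "down" in the status column.
if bin/rails db:migrate:status | grep '^\s\+down\s'; then
  bin/rails db:migrate
fi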

This script was added to the repo as bin/migrate-if-needed, and then we replaced our call to rake db:migrate in the release command with bin/migrate-if-needed.
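
The grep pattern does the real work here, so it may help to see the same check in Ruby against some sample output. This is a sketch only: pending_migrations? is a hypothetical helper, and the status text is an illustrative approximation of what bin/rails db:migrate:status prints, not output captured from the app.

```ruby
# Mirrors the grep pattern: leading whitespace, then "down", then whitespace.
PENDING_LINE = /^\s+down\s/

# Hypothetical helper: true if any line of the status output reports a "down" migration.
def pending_migrations?(status_output)
  status_output.each_line.any? { |line| line.match?(PENDING_LINE) }
end

# Illustrative status output with one applied and one pending migration.
with_pending = <<~STATUS
  database: app_production

   Status   Migration ID    Migration Name
  --------------------------------------------------
     up     20200101000000  Create users
    down    20200102000000  Add email to users
STATUS

# The same output once everything has been applied.
all_up = with_pending.sub("  down  ", "   up   ")

puts pending_migrations?(with_pending)
puts pending_migrations?(all_up)
```

Because the pattern requires leading whitespace before "down", a migration whose name merely contains the word "down" would not trigger a run on its own.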

Update (Oct 14, 2020)

After sharing this post, a commenter on Hacker News pointed out the rails_failover gem that their team at Discourse maintains. It seems to offer similar functionality, but in a more robust and fully thought out way. Looks like a great option to implement this sort of system.

Chris Toomey