Rails is multi-threaded, but can your Redis connection handle it?

TLDR: Use the connection_pool gem.

ActiveRecord, Rails' database access library, comes with a built-in connection pool. We can change the pool size via config/database.yml:

production:
  adapter: postgresql
  pool: 20

We use Redis for a variety of purposes: caching, queuing, pub/sub, and so on. But when it comes to connecting to Redis, there is no built-in connection pool, so we generally end up using a single direct connection, usually via a configuration or an application-level global variable.

# config/initializers/redis.rb

REDIS = Redis.new(host: "10.0.1.1", port: 6380, db: 15)

# accessing data
REDIS.get(<key>)
REDIS.set(<key>, <value>)

Thanks to the ludicrous speed of Redis, most queries return in milliseconds, and we don't see any issues even with many parallel requests. But note that this single connection is a shared, blocking resource: shared because every application thread uses the same connection, and blocking because Redis serves commands on a connection serially, so the next query waits until the previous one returns. This means delays can cascade.

Problem

Let's do some benchmarking with that single connection:

require 'benchmark'

# simulate blocking shared resource
REDIS = Mutex.new

# blocks for 10 milliseconds
incr = -> { sleep(0.010) }

Benchmark.bm do |x|
  x.report("single") { incr.call } # returns in 10 milliseconds

  threads = []
  x.report("multi-threaded") do
    100.times do
      threads << Thread.new {
        REDIS.synchronize { incr.call }
      }
    end

    # join to depict how much time the last waiting thread takes
    threads.each { |thr| thr.join }
  end
end

And we are in for a surprise!

                     user     system      total        real
single           0.000055   0.000018   0.000073 (  0.012773)
multi-threaded   0.014485   0.023569   0.038054 (  1.197625)

With 100 parallel queries, the time for the last thread jumped from 10ms to about 1.2 seconds. This is the cascading effect: each query waits behind every query queued ahead of it, and a 1-second wait for a Redis call is just plain bad.
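The arithmetic checks out: on a single blocking connection the 100 queries serialize, so the last thread waits roughly the sum of all the round trips ahead of it.

```ruby
queries = 100
rtt_ms  = 10                 # round-trip time per query
total   = queries * rtt_ms   # the last thread waits behind all 100
puts "#{total} ms"           # roughly 1 second
```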

Of course this is hypothetical, but note that I have assumed an RTT (round-trip time) of 10ms per request, which is typical of a local connection. For remote connections over the network, this can go up to 250-300ms for a single query.

Solution

Use the connection_pool gem. It comes from the creator of Sidekiq and is pretty easy to use.

With Rails, we need something like this:

# config/initializers/redis.rb
pool_size = 20
REDIS = ConnectionPool.new(size: pool_size) do
  Redis.new(host: <host>, port: <port>)
end

# accessing data
REDIS.with do |conn|
  conn.get(<key>)
  conn.set(<key>, <value>)
end

Here's how this code (and connection pools in general) work:

  • .with checks out a connection from the pool, creating a new one if none is free and the pool is not yet at capacity. The connection is passed to the block and returned to the pool when the block completes.
  • pool_size is the maximum number of connections the pool will establish at any point in time. Once this limit is reached, threads calling .with wait for a connection to be freed.
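Under the hood, a connection pool is essentially a thread-safe queue of connections: checkout pops one (blocking while the pool is empty), checkin pushes it back. Here is a minimal sketch of that cycle using only Ruby's stdlib Queue; TinyPool is a hypothetical name, not the gem's actual internals.

```ruby
# Toy connection pool: a thread-safe queue of pre-built connections.
# Not the connection_pool gem's real implementation, just the idea.
class TinyPool
  def initialize(size, &block)
    @queue = Queue.new
    size.times { @queue.push(block.call) }
  end

  # checkout a connection, yield it, always check it back in
  def with
    conn = @queue.pop   # blocks while all connections are in use
    yield conn
  ensure
    @queue.push(conn) if conn
  end
end

pool = TinyPool.new(2) { Object.new }  # two "connections"
pool.with { |conn| conn }              # checkout, use, checkin
```

Because Queue#pop blocks when the queue is empty, threads beyond the pool size simply wait their turn instead of sharing one connection.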

It seems reasonable to keep pool_size equal to the maximum number of threads our application server runs.
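For example, Puma's generated config reads its thread count from the RAILS_MAX_THREADS environment variable, so the Redis pool size can be derived from the same source (the fallback of 5 below matches Puma's default and is an assumption about your setup):

```ruby
# config/initializers/redis.rb
# Size the pool to match the app server's thread count.
# RAILS_MAX_THREADS is the env var Puma's default config reads.
pool_size = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
```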

Let's check benchmarks with connection_pool:

require 'benchmark'
require 'connection_pool'

# simulate blocking shared resource - but now with a pool of them
REDIS = ConnectionPool.new(size: 100) do
  Mutex.new
end

# blocks for 10 milliseconds
incr = -> { sleep(0.010) }

Benchmark.bm do |x|
  x.report("single") { incr.call } # returns in 10 milliseconds

  threads = []
  x.report("multi-threaded") do
    100.times do
      threads << Thread.new {
        REDIS.with { |conn|
          conn.synchronize { incr.call }
        }
      }
    end

    # join to depict how much time the last waiting thread takes
    threads.each { |thr| thr.join }
  end
end

And here are the results:

                     user     system      total        real
single           0.000048   0.000040   0.000088 (  0.010704)
multi-threaded   0.005984   0.006500   0.012484 (  0.017754)

Much better!

And this is why ActiveRecord also ships with a connection pool by default.