Learn Where The Usage Of Memoization In Ruby On Rails Is Limited

Memoization is a wonderful concept in the programming world. It helps us write clean code that executes faster.

Example:

def slow_method
  @result ||= perform_slow_method
end

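In the above code, slow_method caches the result of perform_slow_method in the @result variable, so perform_slow_method executes only once (as long as its result is not nil or false, because ||= re-evaluates the right-hand side for falsy values).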
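To make the "executes only once" claim concrete, here is a minimal sketch. The Report class and its call_count counter are hypothetical names added purely for illustration; they are not part of the original example.

```ruby
# Hypothetical class for illustration only.
class Report
  attr_reader :call_count

  def initialize
    @call_count = 0
  end

  def slow_method
    @result ||= perform_slow_method
  end

  private

  # Stand-in for an expensive computation; counts how often it runs.
  def perform_slow_method
    @call_count += 1
    sleep 0.01
    42
  end
end

report = Report.new
3.times { report.slow_method }
puts report.call_count # the expensive method ran once, not three times
```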

So, if memoization is so wonderful, why not use it all the time? That's the question I am going to answer in this post.

Memoization should be avoided if the result of the memoized function is going to change over time and your business logic relies on the latest value. For example, if perform_slow_method makes a DB call to fetch data and your business logic needs the latest data, memoization may hand you a stale cached value, which is not what you want.
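The stale-data problem can be sketched without a real database. FakeDatabase and PriceReader below are hypothetical stand-ins invented for this illustration, with a plain attribute playing the role of a database row.

```ruby
# Hypothetical in-memory stand-in for a database row.
class FakeDatabase
  attr_accessor :price

  def initialize
    @price = 100
  end
end

class PriceReader
  def initialize(db)
    @db = db
  end

  # Memoized: the first value read is kept for the life of this object.
  def cached_price
    @cached_price ||= @db.price
  end

  # Not memoized: always reflects the current value.
  def fresh_price
    @db.price
  end
end

db = FakeDatabase.new
reader = PriceReader.new(db)
reader.cached_price        # first read caches 100
db.price = 120             # the underlying data changes
puts reader.cached_price   # still 100, which is stale
puts reader.fresh_price    # 120
```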

Is it safe to use memoization?

I have read multiple times that the ||= operator is not thread safe because it performs two operations, which is correct. Does that mean memoization is not thread safe? The answer is yes and no 😉. It depends on the scenario in which we use it.

If you are wondering how performing two operations can make code not thread safe, let me explain that first. Then we will look at the scenarios where memoization is and is not thread safe.

x ||= y will expand to:
x || x = y

Which means:

if !x # nil or false
  x = y
end

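If multiple threads execute this code at the same time, their reads and writes of x can interleave: two threads may both see x as unset, and both then evaluate y and assign it, which can leave x with an inconsistent value. Making the operation atomic solves this.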
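A real race depends on timing, so the sketch below forces the unlucky interleaving on purpose. It splits x ||= y into its check and assign steps and uses Queue objects as hand-rolled barriers. The variable names (checked, go, computations) are illustrative assumptions, not anything from the original post.

```ruby
# Forces the bad interleaving deterministically: both threads perform the
# "check" step before either performs the "assign" step.
x = nil
computations = Queue.new   # thread-safe log of which threads computed a value
checked = Queue.new        # signals that a thread has finished its check
go = Queue.new             # releases a thread to perform the assignment

threads = 2.times.map do |i|
  Thread.new do
    needs_value = !x       # step 1: read x (still nil for both threads)
    checked << true
    go.pop                 # wait until both threads have done their check
    if needs_value
      computations << i
      x = "value from thread #{i}" # step 2: write x
    end
  end
end

2.times { checked.pop }    # both threads have now seen x as nil
2.times { go << true }     # let both proceed to the write
threads.each(&:join)

puts computations.size     # 2: the value was computed and assigned twice
```

With a plain x ||= y the same interleaving can happen whenever the scheduler switches threads between the check and the assignment; the queues here only make it reproducible.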

Luca Guidi has written a nice blog post that explains it:

If we design write operations in a way that while they’re running, other threads can’t read nor alter the state we’re modifying, that change is thread safe.

We can use a Mutex to make the operation atomic:

mutex.synchronize do
  x ||= y
end
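The snippet above leaves out where mutex comes from. A fuller sketch, assuming a hypothetical SafeCache class, might look like this. Note that the Mutex is created eagerly in the constructor: memoizing the mutex itself with ||= would reintroduce the very race it is meant to prevent.

```ruby
# Hypothetical class for illustration only.
class SafeCache
  attr_reader :computations

  def initialize
    @mutex = Mutex.new     # created once, before any thread uses the object
    @computations = 0
  end

  def value
    # Only one thread at a time can run the check-and-assign.
    @mutex.synchronize do
      @value ||= compute
    end
  end

  private

  def compute
    @computations += 1
    sleep 0.01             # widen the window in which a race could occur
    :expensive_result
  end
end

cache = SafeCache.new
results = 10.times.map { Thread.new { cache.value } }.map(&:value)
puts cache.computations    # 1
```

The trade-off is that every read now takes the lock, even after the value has been cached.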

Now, let's look at the scenarios where memoization is and is not thread safe in Rails code.

There are two scenarios in which memoization is not thread safe:

  1. Using class variable or class instance variable to cache result.
  2. Spawning new thread explicitly in your code.

Note: Do not confuse class instance variable with an object instance variable.

Using class variable or class instance variable to cache result

Class variables and class instance variables are shared by all threads, so using them for memoization can produce inconsistent state.

Example:

class GlobalConfig
  def self.config # config is class instance variable
    @config ||= { global: 0 }
  end
end

class Config < GlobalConfig
  def self.config
    @config ||= super.merge({ config: 1 })
  end
end

class App < Config
  def self.config
    @config ||= super
  end
end

2.times.map do |i|
  Thread.new do
    p " #{i} : #{App.config}"
  end
end.each(&:join)

Output of this program is:

" 0 : {:global=>0, :config=>1}"
" 1 : {:global=>0, :config=>1}"

But sometimes the output is:

" 0 : {:global=>0}"
" 1 : {:global=>0, :config=>1}"

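Reason for this behavior: both threads read and write the same @config variable. Because the super calls start from App, self is App in all three methods, so every ||= operates on App's @config. If one thread has already set @config to the intermediate value {:global=>0} but has not yet stored the merged hash, another thread calling App.config sees a truthy @config and returns {:global=>0} without the merge.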
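A single-threaded check makes the shared variable visible. The sketch below repeats the three classes from the example and then inspects each class with instance_variable_get, which is used here only for inspection.

```ruby
class GlobalConfig
  def self.config
    @config ||= { global: 0 }
  end
end

class Config < GlobalConfig
  def self.config
    @config ||= super.merge({ config: 1 })
  end
end

class App < Config
  def self.config
    @config ||= super
  end
end

App.config

# Every ||= above ran with self == App, so only App holds a value.
p App.instance_variable_get(:@config)          # the merged hash
p Config.instance_variable_get(:@config)       # nil
p GlobalConfig.instance_variable_get(:@config) # nil
```

Since one variable is written at three different points, a second thread can observe it between the first write and the last one.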

Rails had a similar issue, which was resolved in this PR

Spawning new thread explicitly in your code

If you spawn threads in your code, you need to be careful when sharing objects between them and using memoization.

Example:

class Config
  attr_reader :connection_count
  
  def initialize
    puts 'initialized'
    @connection_count = 0
  end
  def increment!
    @connection_count += 1
  end
end
class Application
  def connect
    config.increment!
  end
  def config
    @config ||= Config.new
  end
  
  def total_connection
    config.connection_count
  end
end
app = Application.new

5.times.map do |i|
  Thread.new do
    app.connect
  end
end.each(&:join)

puts app.total_connection

Output of this code is:

initialized
5

Sometimes output will be:

initialized
initialized
initialized
initialized
initialized
1

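All the spawned threads share the same app object, and we don't know the order in which they execute. Several threads can see @config as nil and each create its own Config, so increments are applied to objects that are later overwritten, which leads to unexpected results.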
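One possible way to make this example predictable, offered as a sketch and not the only fix, is to create the Config eagerly in the constructor and to guard the counter with a Mutex. Both changes are assumptions added for illustration and differ from the original classes.

```ruby
class Config
  attr_reader :connection_count

  def initialize
    @connection_count = 0
    @mutex = Mutex.new
  end

  def increment!
    # += is a read followed by a write, so it is guarded as well.
    @mutex.synchronize { @connection_count += 1 }
  end
end

class Application
  attr_reader :config

  def initialize
    # Created eagerly, before any thread can call #config.
    @config = Config.new
  end

  def connect
    config.increment!
  end

  def total_connection
    config.connection_count
  end
end

app = Application.new
5.times.map { Thread.new { app.connect } }.each(&:join)
puts app.total_connection # 5
```

Because the shared object exists before the threads start, there is no lazy initialization left to race on.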

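Puma and other multi-threaded application servers process each request in its own thread, and the objects created while handling a request (such as the controller instance) are not shared with other requests. Therefore we don't see this problem when memoizing in object instance variables in a Rails application, provided we are not spawning threads and sharing objects between them ourselves.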

Conclusion

  1. Memoization should be avoided when:
     • The result of the memoized function is going to change over time and your business logic relies on the latest value.
     • You spawn new threads in the application and share the memoizing object between them.

  2. Class variables and class instance variables should not be used for caching.
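As one possible alternative to class-level memoization, the configuration can be built once when the class body is evaluated and then frozen. This is only a sketch: the CONFIG constant is a hypothetical name, and it assumes the classes are loaded before any threads start using them.

```ruby
class GlobalConfig
  # Built once when the class body is evaluated, then never mutated.
  CONFIG = { global: 0 }.freeze

  def self.config
    # self::CONFIG resolves to the nearest CONFIG in the caller's ancestry.
    self::CONFIG
  end
end

class Config < GlobalConfig
  CONFIG = GlobalConfig::CONFIG.merge(config: 1).freeze
end

class App < Config
end

results = 5.times.map { Thread.new { App.config } }.map(&:value)
p results.uniq # a single hash containing both :global and :config
```

Threads only ever read a frozen value here, so no lock is needed.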
