Memoization is a wonderful concept in the programming world. It helps you write clean code that executes faster.
Example:
def slow_method
  @result ||= perform_slow_method
end
In the above code, slow_method caches the result of perform_slow_method in the @result instance variable, so perform_slow_method executes only once (as long as it returns a truthy value).
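To make the "executes only once" claim concrete, here is a minimal sketch. The Report class, the calls counter, and the sleep that stands in for slow work are all hypothetical names added for illustration.

```ruby
# Hypothetical class for illustration only.
class Report
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def slow_method
    @result ||= perform_slow_method
  end

  private

  def perform_slow_method
    @calls += 1 # count how often the expensive work really runs
    sleep 0.01  # stand-in for slow work
    42
  end
end

report = Report.new
3.times { report.slow_method }
puts report.calls # => 1
```

Three calls to slow_method trigger the expensive work once; the other two return the cached @result.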
So, if memoization is so wonderful, why not use it all the time? That's the question I am going to answer in this post.
Memoization should be avoided if the result of the memoized function is going to change over time and your business logic relies on the latest value. For example, if perform_slow_method makes a DB call to fetch data and your business logic needs the latest data, memoization may hand you stale cached data, which is not what you want.
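The stale-data problem can be sketched as follows. FakeDatabase and Dashboard are hypothetical stand-ins; a real application would query an actual database instead of reading an attribute.

```ruby
# Hypothetical in-memory "database" used only for illustration.
class FakeDatabase
  attr_accessor :user_count

  def initialize
    @user_count = 10
  end
end

class Dashboard
  def initialize(db)
    @db = db
  end

  # Memoized: the first value read is kept for the life of this object.
  def cached_user_count
    @cached_user_count ||= @db.user_count
  end

  # Not memoized: every call reads the current value.
  def fresh_user_count
    @db.user_count
  end
end

db = FakeDatabase.new
dashboard = Dashboard.new(db)
dashboard.cached_user_count # first read caches 10
db.user_count = 25          # the data changes after the first read

puts dashboard.cached_user_count # => 10 (stale)
puts dashboard.fresh_user_count  # => 25
```

If the business logic needs the latest count, the memoized reader is the wrong choice here.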
Is it safe to use memoization?
I have read multiple times that the ||= operator is not thread safe because it performs two operations, which is correct. Does that mean memoization is not thread safe? The answer is yes and no 😉. It depends on the scenario in which we use it.
If you are wondering how performing two operations can make code thread-unsafe, let me explain that first, and then we will look at the scenarios in which memoization is and is not thread safe.
x ||= y expands to:
x || x = y
which means:
if !x # x is nil or false
  x = y
end
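Because the expansion tests x for truthiness, a memoized method whose result is legitimately nil or false recomputes on every call. The sketch below shows that effect and one common workaround based on defined?. The FeatureCheck class and its method names are hypothetical.

```ruby
# Hypothetical class for illustration only.
class FeatureCheck
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # ||= treats a cached false as "not cached yet", so lookup runs every time.
  def enabled_naive
    @enabled_naive ||= lookup
  end

  # defined? checks whether the variable was ever assigned, even to false.
  def enabled
    return @enabled if defined?(@enabled)

    @enabled = lookup
  end

  private

  def lookup
    @calls += 1
    false # a legitimate, but falsy, result
  end
end

naive = FeatureCheck.new
3.times { naive.enabled_naive }
puts naive.calls # => 3

guarded = FeatureCheck.new
3.times { guarded.enabled }
puts guarded.calls # => 1
```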
If multiple threads execute this at the same time, one thread can read x between another thread's check and its assignment, so y may be evaluated more than once and x may end up with an inconsistent value. Making the check and the assignment a single atomic operation solves this.
Luca Guidi has written a nice blog post explaining it.
We can use a Mutex to make it atomic:
mutex.synchronize do
  x ||= y
end
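A fuller, runnable sketch of the Mutex approach might look like the following. SafeCache, compute, and the computations counter are hypothetical names. The sketch assumes the Mutex is created eagerly in initialize, before any thread touches the object, since lazily memoizing the Mutex itself would reintroduce the race.

```ruby
# Hypothetical class for illustration only.
class SafeCache
  attr_reader :computations

  def initialize
    @mutex = Mutex.new # created eagerly, before any thread uses it
    @computations = 0
  end

  def value
    # Only one thread at a time can run the check-then-assign sequence.
    @mutex.synchronize do
      @value ||= compute
    end
  end

  private

  def compute
    @computations += 1
    sleep 0.01 # widen the window in which a race could occur
    :expensive_result
  end
end

cache = SafeCache.new
results = 10.times.map { Thread.new { cache.value } }.map(&:value)

puts cache.computations # => 1
p results.uniq          # => [:expensive_result]
```

The trade-off is that every read now takes the lock, even after the value has been cached.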
Now, let's see the scenarios in which memoization is and is not thread safe in Rails code.
There are two scenarios in which memoization is not thread safe:
- Using a class variable or a class instance variable to cache the result.
- Spawning new threads explicitly in your code.
Note: Do not confuse a class instance variable with an object's instance variable.
Using class variable or class instance variable to cache result
Class variables and class instance variables are shared by all threads, so using them for memoization can lead to an inconsistent state.
Example:
class GlobalConfig
  def self.config # @config is a class instance variable
    @config ||= {global: 0}
  end
end

class Config < GlobalConfig
  def self.config
    @config ||= super.merge({config: 1})
  end
end

class App < Config
  def self.config
    @config ||= super
  end
end

2.times.map do |i|
  Thread.new do
    p " #{i} : #{App.config}"
  end
end.each(&:join)
Output of this program is:
" 0 : {:global=>0, :config=>1}"
" 1 : {:global=>0, :config=>1}"
But sometimes the output is:
" 0 : {:global=>0}"
" 1 : {:global=>0, :config=>1}"
Reason for this behavior: both threads read and write the same @config variable. Because super runs with self still set to App, all three config methods operate on App's @config. If one thread starts while the other is still partway through the super chain, @config may already be set to {:global=>0}, so the late thread sees a truthy value, skips the merge, and returns the partial hash.
Rails had a similar issue, which was resolved in this PR.
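One way to sidestep the race in the GlobalConfig example is to build the configuration eagerly instead of memoizing it lazily. This is only a sketch and not necessarily what the Rails fix does. It assumes the configuration is static, and that the class bodies are evaluated before any thread is started. The CONFIG constants are hypothetical additions.

```ruby
class GlobalConfig
  # Built once when the class body is evaluated, then frozen.
  CONFIG = { global: 0 }.freeze

  # self::CONFIG is looked up through the receiver's ancestor chain,
  # so App.config finds Config::CONFIG.
  def self.config
    self::CONFIG
  end
end

class Config < GlobalConfig
  CONFIG = GlobalConfig::CONFIG.merge(config: 1).freeze
end

class App < Config
end

results = 10.times.map { Thread.new { App.config } }.map(&:value)
p results.uniq.size # => 1
p App.config.keys   # => [:global, :config]
```

Since nothing is written after load time, threads only ever read a frozen hash and there is no check-then-assign step left to race on.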
Spawning new thread explicitly in your code
If you spawn threads in your code, you need to be careful when sharing objects between them and using memoization.
Example:
class Config
  attr_reader :connection_count

  def initialize
    puts 'initialized'
    @connection_count = 0
  end

  def increment!
    @connection_count += 1
  end
end

class Application
  def connect
    config.increment!
  end

  def config
    @config ||= Config.new
  end

  def total_connection
    config.connection_count
  end
end

app = Application.new
5.times.map do |i|
  Thread.new do
    app.connect
  end
end.each(&:join)
puts app.total_connection
Output of this code is:
initialized
5
Sometimes the output will be:
initialized
initialized
initialized
initialized
initialized
1
When we spawn new threads, all of them share the same app object. Because we don't know the order in which they execute, several threads can see @config as nil and each create its own Config, which leads to unexpected results.
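A possible fix for the Application example, sketched under the assumption that one Config should be shared by all threads: create it eagerly in initialize instead of memoizing it lazily, and guard the counter with a Mutex. The initialize method on Application and the @mutex variable are additions that are not in the original code.

```ruby
class Config
  attr_reader :connection_count

  def initialize
    @connection_count = 0
    @mutex = Mutex.new
  end

  def increment!
    # += is a read followed by a write, so guard it as one unit.
    @mutex.synchronize { @connection_count += 1 }
  end
end

class Application
  def initialize
    @config = Config.new # built eagerly, before any thread can see it
  end

  def connect
    @config.increment!
  end

  def total_connection
    @config.connection_count
  end
end

app = Application.new
5.times.map { Thread.new { app.connect } }.each(&:join)
puts app.total_connection # => 5
```

With no lazy ||= left in the shared object, every thread increments the same Config, and the count is the same on every run.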
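Puma and other multi-threaded application servers process each request in its own thread, and each request works with its own objects (for example, a new controller instance), so instance-level memoization is not shared across threads. That is why we don't see this problem when using memoization in a Rails application, provided we are not spawning threads and sharing objects between them ourselves.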
Conclusion
1. Memoization should be avoided when:
   - The result of the memoized function changes over time and your business logic relies on the latest value.
   - You spawn new threads in your application and share the memoizing object between them.
2. Class variables and class instance variables should not be used for caching.