Ruby Memoization

Momoization

If you are unfamiliar with memoization, it is a technique for improving a program’s execution time (at times other purposes) by storing the results of expensive function calls and returning the stored value when the function is called with the same inputs. This is a speific form of caching. The technique was first described in 1968 by Donald Michie in the paper Memo Functions and Machine Learning. You can learn more from Wikipedia.

Ruby’s Conditional Assignment Operator

Ruby provides the ||= operator, which is often called the conditional assignment operator, or the “or-equals sign”. It can be thought of as a simple memoization operator as well. Assuming my_shirt ||= get_a_shirt_from_the_closet, if the variable my_shirt points to and object that is “truthy” then sending that object the message ||= causes my_shirt to return iteself. However, if my_shirt is “falsy”, then ||= evaluates the method get_a_shirt_from_the_closet and sets the result to the variable my_shirt in the present context.

Note that a ||= b is not logically equivalent to a = a || b, it can more accurately be described as a || a = b, as long as a is a bound variable. The ||= operator may also be used to bind the result of the right hand side to an unbound variable on the left hand side.

This works quite well when get_a_shirt_from_the_closet takes no arguments and is otherwise idempotent. If however, you expect to pass arguments to the mothod, or get different results at different times, based on the lastest data from a database for example, then ||= will keep returning stale data.

Hash Based Memoization

Another strategy for memoization in ruby is to use a hash to store values, which is useful when you have both expensive computations and wnat to pass arguments to the method that is performing the work. Consider the following code.

  class Fib
    def initialize
      @answers = {}
    end

    def cached(n)
      @answers[n]
    end

    def fibonacci(n)
      return n if n <= 1
      @answers[n] ||= (fibonacci(n - 1) + fibonacci(n - 2))
    end
  end

On initialization, the class creates an empty hash that will be used to store previously computed values. When Fib.fibonacci(n) is called for n greater than the base cases of (0,1), the method first checks @answers[n], to see if fibonacci(n) has already been computed. This is a perfectly good use of memoization, but this implementation is a recursive approach to calculating the fibonacci number, so there’s an added twist. If n hasn’t been pre-calculated, but n - 1 or n - 2 have been, those recursive calls return memoized values. If neither is memoized, then the function keeps recursively doing work until it hits a previously calculated value. This approach has the advantage of becoming faster over time, with the tradeoff that it requires O(n) space.

A Word About Caching

This type of memoization is a simple and unsophisticated approach to caching. If you want to persist data across time, servers, http requests or something similar, you should be using a more robust caching solution. Rails now comes with a really nice built in caching layer, and there are a number of gems availale for caching like Redis-Store.

In Closing

If you need to store the calculations performed by an idempotent method that will get reused within a class, then consider using the ||= operator. If you need to cache for multiple values, then a hash may be appropriate. However, if you have more complicated needs, then use a purpose built caching solution.

Further Reading:

Peter Cooper’s What Ruby’s ||= (Double Pipe / Or Equals) Really Does

Justin Weiss’ 4 Simple Memoization Patterns in Ruby

David Fayram’s excellent * Rubyists Already Use Monadic Patterns *


If you enjoyed this post, follow me on twitter @ChaseGilliam, sometimes I'm funny. You can also find me on Github.