Luca Gesmundo

How to do caching, when Redis is just too much.

TL;DR
There will be times when you need some caching, but the overhead of setting up a dedicated DB might not be worth it.
I made Ricordo, a micro caching / memoization library to address this problem.

https://github.com/lucagez/ricordo

I had to work on an API. A pretty basic one: when you hit the route belonging to a specific user, you get a JSON response containing all their products.
When you hit the route belonging to a product, you get a JSON response with its info.

I started noticing that just a few products were requested way more often than others.
So I started thinking about some sort of caching.

Well, this API is hosted on a $5 DigitalOcean droplet, which also hosts:

  • NGINX, as a reverse proxy and for serving some static assets as well.
  • Postgres, the DB.
  • my Node API.

So, for caching just a handful of JSON responses (totaling around 50 MB), the added overhead of Redis on this $5 droplet... was just too much.

Then I started thinking about memoization as a viable technique for a little caching.
Memoization is a technique that consists of storing the result of an expensive computation and returning the same result whenever the computation is invoked again with the same inputs.

A micro example:

const memo = func => {
  // cache object
  const cache = new Map();

  // Return a function that looks up the cache.
  // If nothing is found => a new computation is fired.
  return arg => {
    if (cache.has(arg)) return cache.get(arg);

    // New computation
    const result = func(arg);
    cache.set(arg, result);
    return result;
  };
};

// Super mega expensive function (:
const hello = a => `hello ${a}`;

const cached = memo(hello);

cached('world'); // `hello world` => cache insertion.
cached('world'); // `hello world` => retrieved from cache.

But, what if we start using it for storing DB responses? 🤔
That would behave exactly like a caching system.
This way we could start thinking about caching on a per-function basis.
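
For instance, here is a minimal sketch of what that could look like, assuming an Express route backed by node-postgres (`pg`). The route, table and column names are made up for illustration. Note that with an async function the cache ends up storing the returned Promise, so a rejected query would be cached too unless you handle that case.

const express = require('express');
const { Pool } = require('pg');

const app = express();
const pool = new Pool(); // connection settings taken from env vars

// Hypothetical query returning all products belonging to a user.
const getProducts = userId =>
  pool
    .query('SELECT * FROM products WHERE user_id = $1', [userId])
    .then(res => res.rows);

// Caching on a function basis: wrap the query with the memo above.
const cachedProducts = memo(getProducts);

app.get('/users/:id/products', async (req, res) => {
  // First call for an id hits Postgres, subsequent calls hit the cache.
  res.json(await cachedProducts(req.params.id));
});

app.listen(3000);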

We have some problems though. Right now, our primitive cache implementation
stores everything we throw at it, keeping keys and results forever.
So we will run out of memory pretty quickly, and that memory will only be freed when our process ends.
And if we want a highly available service, this is not what we want.

So, we should adjust our implementation with some real caching behavior, e.g. a TTL.

Time-to-live is the lifespan of a cached result. When the lifespan ends, the key is deleted from our cache.


const memo = (func, ttl) => {
  const cache = new Map();
  return arg => {
    if (cache.has(arg)) return cache.get(arg);

    // Spawning timeout on new insertion
    // => delete key / result after lifespan 
    setTimeout(() => cache.delete(arg), ttl);

    const result = func(arg);
    cache.set(arg, result);
    return result;
  };
};

A little bit better: now we won't have trillions of keys stored forever.
But unfortunately, we have yet another problem 😫
In the realm of high-level languages we don't have full control over how memory is allocated. So, after the deletion of a key, we can't be sure that Node.js
has decided this is the right time to deallocate some memory.
There is nothing preventing our cheap $5 droplet from running out of memory again.

Unfortunately, in JS we have no way to determine how much space an object is holding in RAM (at least no way that doesn't waste roughly the same amount of memory, which is pretty useless in this context).
So we have to rely on an estimate of how much memory a stored result will eat, and set a limit on the number of stored keys to prevent further insertions once that limit is hit.


const memo = (func, ttl, limit) => {
  const cache = new Map();
  return arg => {
    if (cache.has(arg)) return cache.get(arg);
    const result = func(arg);

    // Now we are waiting for some key deletions before inserting other results.
    if (cache.size < limit) {
      setTimeout(() => cache.delete(arg), ttl);
      cache.set(arg, result);
    }
    return result;
  };
};

Now, if we have correctly estimated the size of our stored results, our $5 droplet won't run out of memory 🎉🎉🎉
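
As a rough back-of-the-envelope example (the average response size here is an assumption, measure your own payloads): taking the ~50 MB of JSON mentioned above as the memory budget and an average response of around 100 KB, the limit could be picked like this:

const budgetBytes = 50 * 1024 * 1024;  // ~50 MB memory budget
const avgResultBytes = 100 * 1024;     // ~100 KB per cached response (assumed)
const limit = Math.floor(budgetBytes / avgResultBytes); // => 512 keys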

But wait a second, what about our most used keys? How can we keep track of them?
How can we store only the n most used items?
What if we would like to destroy the cache, or a single key, at a given point in time?
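
Just to sketch one possible answer to the "most used keys" question (this is not necessarily how Ricordo handles it): a JavaScript Map preserves insertion order, so re-inserting a key on every hit keeps the least recently used key at the front, ready to be evicted when the limit is reached.

const memoLRU = (func, limit) => {
  const cache = new Map();
  return arg => {
    if (cache.has(arg)) {
      const hit = cache.get(arg);
      // Move the key to the "most recently used" end of the Map.
      cache.delete(arg);
      cache.set(arg, hit);
      return hit;
    }
    const result = func(arg);
    if (cache.size >= limit) {
      // Evict the least recently used key (the first one in the Map).
      cache.delete(cache.keys().next().value);
    }
    cache.set(arg, result);
    return result;
  };
};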

Well, I faced these problems, so I decided to make a tiny npm package to solve them,
hopefully making your micro caching needs a little less painful.

This is the GitHub repo:
https://github.com/lucagez/ricordo

Happy caching to everyone! ✌
