MongoDB configurationIn this post, we will be discussing what to do when you add more memory to your MongoDB deployment, a common practice when you are scaling resources.

Why Might You Need to Add More Memory?

Scaling resources is a way of adding more resources to your environment.  There are two main ways this can be accomplished: vertical scaling and horizontal scaling.

  • Vertical scaling is increasing hardware capacity for a given instance, thus having a more powerful server.
  • Horizontal scaling is when you add more servers to your architecture.   A pretty standard approach for horizontal scaling, especially for databases,  is load balancing and sharding.

As your application grows, working sets are getting bigger, and thus we start to see bottlenecks as data that doesn’t fit into memory has to be retrieved from disk. Reading from disk is a costly operation, even with modern NVME drives, so we will need to deal with either of the scaling solutions we mentioned.

In this case, we will discuss adding more RAM, which is usually the fastest and easiest way to scale hardware vertically, and how having more memory can be a major help for MongoDB performance.

Try Now: Ensure data availability for your applications with Percona Distribution for MongoDB

How to Calculate Memory Utilization in MongoDB

Before we add memory to our MongoDB deployment, we need to understand our current Memory Utilization.  This is best done by querying serverStatus and requesting data on the WiredTiger cache.

Since MongoDB 3.2, MongoDB has used WiredTiger as its default Storage Engine. And by default, MongoDB will reserve 50% of the available memory – 1 GB for the WiredTiger cache or 256 MB whichever is greater.

For example, a system with 16 GB of RAM, would have a WiredTiger cache size of 7.5 GB.

The size of this cache is important to ensure WiredTiger is performant. It’s worth taking a look to see if you should alter it from the default. A good rule is that the size of the cache should be large enough to hold the entire application working set.

How do we know whether to alter it? Let’s look at the cache usage statistics:

 

There’s a lot of data here about WiredTiger’s cache, but we can focus on the following fields:

  • wiredTiger.cache.maximum bytes configured: This is the current maximum cache size.
  • wiredTiger.cache.bytes currently in the cache – This is the size of the data currently in the cache.   This is typically 80% of your cache size plus the amount of “dirty” cache that has not yet been written to disk. This should not be greater than the maximum bytes configured.  Having a value equal to or greater than the maximum bytes configured is a great indicator that you should have already scaled out.
  • wiredTiger.cache.tracked dirty bytes in the cache – This is the size of the dirty data in the cache. This should be less than five percent of your cache size value and can be another indicator that we need to scale out.   Once this goes over five percent of your cache size value WiredTiger will get more aggressive with removing data from your cache and in some cases may force your client to evict data from the cache before it can successfully write to it.
  • wiredTiger.cache.pages read into cache – This is the number of pages that are read into cache and you can use this to judge its per-second average to know what data is coming into your cache.
  • wiredTiger.cache.pages written from cache – This is the number of pages that are written from the cache to disk.   This will be especially heavy before checkpoints have occurred.  If this value continues to increase, then your checkpoints will continue to get longer and longer.

Looking at the above values, we can determine if we need to increase the size of the WiredTiger cache for our instance.  We might also look at the WiredTiger Concurrency Read and Write Ticket usage.  It’s fine that some tickets are used, but if the number continues to grow towards the number of cores then you’re reaching saturation of your CPU.  To check your tickets used you can see this in Percona Monitoring and Management Tool (PMM) or run the following query:

 

 

The wiredTiger.cache.pages read into cache value may also be indicative of an issue for read-heavy applications. If this value is consistently a large part of your cache size, increasing your memory may improve overall read performance.

Example

Using the following numbers as our example starting point, we can see the cache is small and there is definitely memory pressure on the cache:

We also are using the default wiredTiger cache size, so we know we have 16 GB of memory on the system (0.5 * (16-1)) = 7.5 GB.   Based on our knowledge of our (imaginary) application, we know the working set is 16 GB, so we want to be higher than this number.  In order to give us room for additional growth since our working set will only continue to grow, we could resize our server’s RAM from 16 GB to 48 GB.  If we stick with the default settings, this would increase our WiredTiger cache to 23.5 GB. (0.5 * (48-1)) = 23.5 GB.  This would leave 24.5 GB of RAM for the OS and its filesystem cache.  If we wanted to increase the size given to the WiredTiger cache we would set the storage.wiredTiger.engineConfig.cacheSizeGB to the value we wanted.   For example, say we want to allocate 30 GB to the wiredTiger cache to really avoid any reads from disk in the near term, leaving 18 GB for the OS and its filesystem cache.   We would add the following to our mongod.conf file:

For either the default setting or the specific settings to recognize the added memory and take effect, we will need to restart the mongod process.

Also note that unlike a lot of other database systems where the database cache is typically sized closer to 80-90% of system memory, MongoDB’s sweet spot is in the 50-70% range.  This is because MongoDB only uses the WiredTiger cache for uncompressed pages, while the operating system caches the compressed pages and writes them to the database files.  By leaving free memory to the operating system, we increase the likelihood of getting the page from the OS cache instead of needing to do a disk read.

Summary

In this article, we’ve gone over how to update your MongoDB configuration after you’ve upgraded to more memory.   We hope that this helps you tune your MongoDB configuration so that you can get the most out of your increased RAM.   Thanks for reading!

Additional Resources:

MongoDB Best Practices 2020 Edition

Tuning MongoDB for Bulk Loads