Puma 4.2.0 will be out soon. Here are the preliminary release notes.

Building your own Ruby VM dashboard, and how to find transactions allocating lots of objects

Last email we talked about how to read New Relic's Ruby VM tab. It's a great way to discover whether or not your Ruby application has issues with object allocation. This week, we'll talk about figuring out what transactions are causing memory issues, and then discuss how to build similar functionality to New Relic's Ruby VM tab into your own production perf monitoring setup.

New Relic's Ruby VM tab is all about averages: the average number of allocations, GC runs and so on, across all transactions. However, it's usually not every transaction that has a problem. It's usually just a small handful of transactions that push your Ruby VM metrics out of the acceptable range.

Often, it's an N+1 that's only triggered by a particular user or type of user, or by a very-little-used route/controller action.

Unfortunately, New Relic makes it really difficult to figure out which transaction this is, because you can't view memory metrics on a per-transaction or per-endpoint basis. You can only look at averages for the entire server over a period of time.

Among New Relic's competitors, Scout offers these per-transaction metrics, but it doesn't offer the entire-server statistics that New Relic does! So what are we to do?

If you're using New Relic, you can usually get an idea of which transactions are causing a high number of allocations by using time as a proxy for memory use. Sort New Relic's transaction list by "longest response time" and work down from the top. Usually, the slowest-on-average transactions also have really bad N+1s which are causing tens of thousands (or even millions!) of allocations.

If you're using Scout, you can simply sort the transaction list by "max allocations". This stat reports the largest number of objects that a single request to each transaction type has ever allocated. For example, if UsersController#index's max allocations is 10 million, that means at least one request to this endpoint allocated 10 million objects.
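To make "allocations per transaction" concrete, here is a minimal sketch of how such a number could be measured by hand. It is not how Scout collects the stat; count_allocations is a hypothetical helper name, and it relies on Ruby's built-in GC.stat counter. The counter is process-wide, so on a multi-threaded server the result will include allocations from other threads running at the same time.

```ruby
# Hypothetical helper: counts objects allocated while the block runs.
# GC.stat(:total_allocated_objects) is a process-wide counter, so
# allocations made by other threads during the block are included too.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Stand-in for the body of a controller action.
allocated = count_allocations do
  1_000.times.map { |i| "user-#{i}" }
end

puts "allocated #{allocated} objects"
```

Wrapping a suspicious action or query in a block like this in development is a quick way to check whether it really is the one allocating heavily.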

You can "build your own Ruby VM tab" by using some built-in Ruby APIs to get the same information.

GC.stat is a hash that contains some of this info. It's accessible at any time in any MRI Ruby process:

GC.stat[:count] is how many GCs have been performed since this process started. There's also GC.stat[:minor_gc_count] and GC.stat[:major_gc_count].
GC.stat[:total_allocated_objects] is how many objects have been allocated since process start.
GC.stat[:heap_live_slots] is the number of heap slots currently holding live objects.
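As a rough sketch, the keys above can be gathered into a single hash that is easy to log or ship elsewhere. The method name ruby_vm_snapshot and the output key names are made up for illustration; only the GC.stat keys come from Ruby itself.

```ruby
# Hypothetical helper: collects the GC.stat values discussed above
# into one small hash. GC.stat is called once so the numbers all
# come from the same point in time.
def ruby_vm_snapshot
  stats = GC.stat
  {
    gc_count: stats[:count],
    minor_gc_count: stats[:minor_gc_count],
    major_gc_count: stats[:major_gc_count],
    total_allocated_objects: stats[:total_allocated_objects],
    heap_live_slots: stats[:heap_live_slots]
  }
end

p ruby_vm_snapshot
```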

Using this information, you can get a little bit of New Relic's Ruby VM tab anywhere. I suggest you log it every few minutes or so, or send it to an external stats service like Datadog, Librato or statsd.
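One possible way to do the periodic logging is a small background thread. This is a sketch under assumptions, not a recommended production setup: start_gc_reporter is a hypothetical name, the 300-second interval is arbitrary, and the logger call is a placeholder for whatever stats client you use. In a forking server, threads started before the fork don't carry over to the child processes, so you would likely need to start it in each worker.

```ruby
require "logger"

# Hypothetical reporter: logs a few GC.stat values on a fixed interval
# from a background thread. Replace the logger call with your own
# stats client if you send metrics to an external service.
def start_gc_reporter(logger:, interval: 300)
  Thread.new do
    loop do
      stats = GC.stat
      logger.info(
        "gc_count=#{stats[:count]} " \
        "total_allocated_objects=#{stats[:total_allocated_objects]} " \
        "heap_live_slots=#{stats[:heap_live_slots]}"
      )
      sleep interval
    end
  end
end

# Usage, e.g. from an initializer: report every five minutes.
reporter = start_gc_reporter(logger: Logger.new($stdout), interval: 300)
```

Because total_allocated_objects and count only ever grow, the interesting number on a dashboard is usually the difference between two consecutive samples rather than the raw value.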

I hope this has been useful - these are important metrics that you should understand and be aware of, because they're so useful for identifying and tracking down memory usage issues.

You can share this email with this permalink: https://mailchi.mp/railsspeed/understanding-ruby-vm-stats-part-two-now-what-to-do-about-it

Copyright © 2019 Nate Berkopec, All rights reserved.
