Earlier this month, I decided to gemify some code I wrote that makes large object tree comparisons as simple as possible. It's called tree_diff, and lets you compare attributes as call chains, avoiding any knowledge of ActiveRecord relationships. The real-world example is that we needed to keep a document up-to-date when an attribute changed on any record associated with a root object. This process creates a new version of that document and notifies the client of an update. Clearly, we wouldn't want to waste someone's time with an irrelevant update, and we don't want to needlessly consume resources creating the PDF, which is an expensive operation.

The code replaced a previous attempt that leveraged ActiveModel::Dirty, which is an excellent tool for diffing inside of a model, but challenging to use meaningfully across many of them. You also aren't able to see changes involving a deleted child record. Say your root object is an Order and it has many Items. If you are watching Item attributes, for example, and you remove an item, there is no Item instance on which to call comparison methods like saved_changes. Short of making the Order watch its items, you're out of luck.

Though you can call it externally, ActiveModel::Dirty usage tends to live in callbacks, because callbacks are a quick way to react to model changes after a record is created or updated. I think doing this makes it easy to forget the scope of applicability, and leads to executing code in several places that have nothing to do with the original need. Before tree_diff, I was watching all attribute changes in several places that had nothing to do with our comparison, when we really needed it in only one place. Here's an oversimplified example of how this worked:

# Define a list of all "watchers" in the codebase.
class WatchedAttributes
  CLASSES = [SpecialDocumentWatcher]
end

# Define a watcher as a list of all models and their respective attributes we want to track.
class SpecialDocumentWatcher
  # Class-level method, because the callback below calls it on the class itself.
  def self.attributes
    {OurModel => [:some_attribute, :some_other_attribute]}
  end
end

class OurModel < ApplicationRecord
  attr_reader :watched_attributes_flags

  after_update do
    # Iterate all watchers, look up this model within each to get a list of attribute names,
    # and check if any have changed per ActiveModel::Dirty.
    # Results in @watched_attributes_flags = {SpecialDocumentWatcher => true | false}
    @watched_attributes_flags = WatchedAttributes::CLASSES.each_with_object({}) do |klass, h|
      # Fall back to an empty list when a watcher doesn't mention this model.
      attributes_to_watch = klass.attributes.fetch(self.class, [])
      was_changed = attributes_to_watch.any? { |a| saved_changes.key?(a.to_s) }
      h[klass] = was_changed
    end
  end
end

This makes a holding place in a model for keeping track of which groups of attributes saw a change. Apply the after_update block to each associated model involved in the comparison, and it's then possible to reach into all of them from the root model for a final diff decision. The characteristics of ActiveModel::Dirty influenced my not-so-ideal design and prompted a need to express these sets in a way that doesn't scatter the dependency across so many models.

TreeDiff's approach

TreeDiff aims to address each of these points. Watched/observed attributes are expressed as the whole tree for one purpose or use case. Call it from model callbacks, or avoid polluting them by invoking it externally. It sees destroyed associations because it doesn't run inside the destroyed instance: you would get nil on one side. And, for bonus points, it doesn't have any dependency on any Rails libraries.
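To make the idea of comparing attributes as call chains concrete, here is a minimal plain-Ruby sketch. It is not TreeDiff's implementation, and every name in it (walk, snapshot, changes, PATHS, and the Order and Item structs) is hypothetical. It only assumes that a snapshot is taken by following method chains from the root before the update and again afterwards, and that the two snapshots are then compared path by path.

```ruby
# Hypothetical stand-ins for models; plain Structs, no Rails involved.
Item  = Struct.new(:quantity, :description)
Order = Struct.new(:reference_number, :items)

# Follow a chain of method names from a node. Arrays are mapped over, so a
# has-many style collection yields one value per child.
def walk(node, chain)
  return node if chain.empty?
  return nil if node.nil?

  if node.is_a?(Array)
    node.map { |child| walk(child, chain) }
  else
    walk(node.public_send(chain.first), chain.drop(1))
  end
end

# The call chains to observe, each expressed as a path from the root.
PATHS = [[:reference_number], [:items, :quantity], [:items, :description]].freeze

# Record the current value found at the end of every observed path.
def snapshot(root)
  PATHS.to_h { |path| [path, walk(root, path)] }
end

# Keep only the paths whose value differs between the two snapshots.
def changes(before, after)
  PATHS.reject { |path| before[path] == after[path] }
       .map { |path| { path: path, old: before[path], new: after[path] } }
end

order  = Order.new("XYZ-123", [Item.new(3, "Foo"), Item.new(10, "Thing thing")])
before = snapshot(order)

order.items.pop                 # remove a child entirely
order.items.first.quantity = 4  # change an attribute on a remaining child

result = changes(before, snapshot(order))
```

Because the comparison happens from the root rather than inside each child, the removed item still registers: in this sketch it shows up as a shorter array on the new side of both item paths. The gem itself may represent a missing record differently.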

To apply TreeDiff to the common Railsy record update, you could define the comparison class like this. Define each attribute similarly to how strong params handles nested sets.

class SpecialDocumentDiff < TreeDiff
  observe :invoice_date, :business_opens_at, :business_closes_at,
          :reference_number, :instructions, :weight,
          order_details: [:carrier_name, :additional_flags],
          contact_details: [:business_name, :phone_number,
                            address: [:address_1, :city, :state, :zip]],
          items: [:quantity, :description, :weight, :price],
          address: [:address_1, :city, :state, :zip]
end

You can bring in the Diff class in your controller, and republish your PDF or perform whatever expensive operation you need:

class OrdersController
  def update
    order = Order.find(params[:id])
    special_document_diff = SpecialDocumentDiff.new(order)

    if order.update(order_params)
      republish_pdf if special_document_diff.saw_any_change?
      redirect_to order, notice: 'Updated order.'
    end
  end

  private

  def republish_pdf
    SomePdfClass.new.rerender_and_save(params[:id])
  end
end

Finally, call special_document_diff.changes to get the actual changes as an array of hashes with keys path, old, and new:

[{path: [:invoice_date], old: Sun, 07 Jul 2019 11:12:43 PDT -07:00, new: Wed, 10 Jul 2019 00:00:00 PDT -07:00},
 {path: [:reference_number], old: ["XYZ-123"], new: [""]},
 {path: [:items, :quantity], old: [3, 10], new: [4, 10]},
 {path: [:items, :description], old: ["Foo", "Thing thing"], new: ["Bar", "Thing thing"]},
 {path: [:contact_details, :address, :address_1], old: ["123 Test Street"], new: ["321 Test Street"]}]
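Since the result is an ordinary array of hashes, it can be handled with plain Enumerable methods. The sketch below uses a hand-copied subset of the output above as a stand-in for the return value of special_document_diff.changes; the changed_roots and summary names are only illustrative.

```ruby
# Stand-in for special_document_diff.changes, copied from the sample output.
changes = [
  { path: [:reference_number], old: ["XYZ-123"], new: [""] },
  { path: [:items, :quantity], old: [3, 10], new: [4, 10] },
  { path: [:contact_details, :address, :address_1],
    old: ["123 Test Street"], new: ["321 Test Street"] }
]

# Which top-level parts of the tree were touched?
changed_roots = changes.map { |change| change[:path].first }.uniq

# One human-readable line per change, e.g. for a log entry.
summary = changes.map do |change|
  "#{change[:path].join('.')}: #{change[:old].inspect} -> #{change[:new].inspect}"
end
```

This could be used, for instance, to skip a notification when only paths the client doesn't care about appear in changed_roots.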

Hope this finds a use case for you at some point.
