Red Green Repeat Adventures of a Spec Driven Junkie

Understanding Library: ActiveRecord Import

It’s important to use external programming libraries, as it improves your output. At the same time, not fully understanding the library can get you in trouble. This is exactly what happened to me recently. I realize I need to understand a library.

I share an example of how I went through to understanding ActiveRecord Import’s on_duplicate_update function. Sharing the tests I used to validate behavior.

This article will take you less than three minutes to read.

Marble column from the Temple of Artemis at Sardis source and more information

Introduction

When using activerecord-import, I had questions on its behavior. We were using it as:

Model.import(items, on_duplicate_keys_update: { conflict_target: [:column_1, :column_2] })

When asked what exactly was it doing, I realize I didn’t understand how it worked. I didn’t have a razor sharp understanding of: conflict_target and the columns listed.

I read the documentation but it wasn’t clear to me. So, I made tests to prove out what’s going on.

The situation came up when inserting the same item with different values. What would happen?

item = {
  column_1: 'item author,
  column_2: 'item name',
  column_3: 'item description'
}

items = [item]

Model.import(items, on_duplicate_keys_update: { conflict_target: [:column_1, :column_2] })

expect(Model.count).to eq(1)
expect(Model.first.column_1).to eq(item.column_1)

That works, now, what if we import another item with a new value for column_3?

new_item = item.merge({ column_3: 'new item description' })

items = [new_item]

Model.import(items, on_duplicate_keys_update: { conflict_target: [:column_1, :column_2] })

expect(Model.count).to eq(1)
expect(Model.first.column_3).to_not eq(item.column_3)
expect(Model.first.column_3).to     eq(newitem.column_3)

Oh, this didn’t update column_3 to the new value. Why?

Read the Manual!

The documentation for ActiveRecord Import lists out options for on_duplicate_keys_update such as:

  • conflict_target
  • columns

One way to update column_3 to a new value would be:

Model.import(items, on_duplicate_keys_update: { conflict_target: [:column_1, :column_2], columns: [:column_3] })

This will update column_3 values to the value in items.

Just update it :all?

Is there a way to update all columns all the time? In this example, there’s only one, when there are bazillion columns, specifying each one gets annoying.

Yes, there is! The documentation is not explicit in updating all columns on on_duplicate_keys_update.

The syntax to update all column values on import is:

Model.import(items, on_duplicate_key_update: { columns: :all })

This will work, regardless of uniqueness constraints.

conflict_target??

Why is conflict_target used in this case? That seems to be the source of confusion in the first place.

It’s to satisfy any uniqueness constraint setup in the database and specify the columns to use as key identifiers.

In this case:

Model.import(items, on_duplicate_keys_update: { conflict_target: [:column_1, :column_2] })

As the column attribute is not present, no updates would happen, even if there is a conflicting entry. Effectively having a configuration of updating no columns.

To maintain column_1 and column_2 as the key identifiers and have updates done on column_3, the syntax would be:

Model.import(items, on_duplicate_keys_update: {
					  conflict_target: [:column_1, :column_2],
					  columns: [:column_3]
					}
			)

This is an option to control updates to columns and conflict resolution.

(Of course, columns: :all would have the same behavior here too.)

Conclusion

Using external libraries are an effective manner to increase one’s output when programming. Effectively, another team of programmers are working on solving that problem.

When one doesn’t fully understand how external library works, they can get into trouble and make wrong decisions to cause double (or more) work later on.