You're reading the Ruby/Rails performance newsletter by Speedshop.

Looking for a performance audit of your Rails app? I've partnered with Ombu Labs to do just that.

When should you not add a new background job to a controller?

This tweet got a bit of attention:
 
I got two very good replies from Avi Flombaum (founder of the WeWork-acquired Flatiron School) and Ryan Bates (of Railscasts fame):

In particular, Avi's question got me thinking about checkout flows in e-commerce apps, which is where this problem is extremely apparent. One of my first jobs was working at an e-commerce company running on an old version of Spree. Rendering the "checkout success!" page was extremely slow: hundreds of ActiveRecord callbacks would fire, leading to tons and tons of database updates, not to mention that we were waiting inline on the response from the credit card processor.

I have a couple of thoughts regarding Avi and Ryan's questions:

Services And Transactions Are Just Jobs You Run Synchronously

It's often popular to wrap up a set of steps in a "service". Maybe after you complete a checkout, a bunch of services have to be notified and a lot of state needs to be updated. You put all of this logic inside of a class called CheckoutCompleter and call CheckoutCompleter.call(checkout_object) in a controller, or something like that.

Here's the thing: this is almost identical to a background job. Call it CheckoutCompleter.perform() instead and you start to see the similarity. Consider the most basic possible ActiveJob:

class GuestsCleanupJob < ApplicationJob
  queue_as :default

  def perform(*guests)
    # Do something later
  end
end


If you remove the inheritance from ApplicationJob and the queue_as call, it's just a "service object". Jobs are just "service objects" that can optionally be run asynchronously. So, anywhere you would insert a "service object", you can insert a background job.

I'm using the "service object" terminology here because it's easier to see the similarity with a background job, but this is just as applicable to the Vanilla Rails style.
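To make the similarity concrete, here's a minimal sketch of that service object (CheckoutCompleter is the hypothetical name from above, and the body is a stand-in):

```ruby
# A minimal "service object": one class, one public entry point.
class CheckoutCompleter
  def self.call(order_id)
    new(order_id).perform
  end

  def initialize(order_id)
    @order_id = order_id
  end

  # Rename `call` to `perform` and the resemblance to an
  # ActiveJob subclass becomes hard to miss.
  def perform
    "completed checkout #{@order_id}"
  end
end
```

In a Rails app, inherit from ApplicationJob instead and you get both modes for free: CheckoutCompleter.perform_now(id) runs it inline, exactly like the service, while CheckoutCompleter.perform_later(id) queues it.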

There is a Minimum Viable Async Unit

When shouldn't you async things in a background job? 

1. You need the result of the job to render the response (my original tweet)
2. The code chunk you're looking at takes less than ~10 milliseconds to run.
3. Asyncing this would put too much pressure on your background job infrastructure, either memory or load.

If the chunk of code you're considering making async always takes less than 10 milliseconds to run, you're adding cost (complexity, memory usage in Redis or whatever your background job store is, additional workers required to run these jobs) without much benefit.
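The cheapest way to apply that rule of thumb is to actually time the chunk before extracting it. A sketch, using Ruby's built-in Benchmark module (the 10ms threshold is the rule of thumb from above, not a hard limit):

```ruby
require "benchmark"

# Time a candidate chunk of work and compare it to a threshold.
# If it's under the threshold, a job probably isn't worth the cost.
def worth_asyncing?(threshold_ms: 10.0, &work)
  elapsed_ms = Benchmark.realtime(&work) * 1000.0
  elapsed_ms > threshold_ms
end

worth_asyncing? { sleep 0.05 }  # slow work: async it
worth_asyncing? { 2 + 2 }       # trivial work: leave it inline
```

Run this a few times under realistic data, since the first call often pays one-time costs like cache warming.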

There are also some chunks of work that are just too difficult to turn into a job. For example, if a chunk of logic needs a lot of data (say, more than 128kb worth), then the time and memory cost of serializing all that data into the background job store is probably not worth it, and may simply be too slow.
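Since most job backends serialize arguments (commonly as JSON) into the job store, that payload size is easy to estimate before you commit. A sketch (the helper name is my own):

```ruby
require "json"

# Estimate how big a job's arguments would be once serialized
# into the job store as JSON.
def serialized_job_size(args)
  args.to_json.bytesize
end

# Passing an ID is tiny...
serialized_job_size(order_id: 123)

# ...but passing thousands of rows of data inline blows past 128kb.
serialized_job_size(rows: Array.new(5_000) { |i| { id: i, note: "x" * 40 } })
```

This is also why the standard advice is to pass record IDs to jobs and re-fetch the data inside perform, rather than serializing the data itself.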


You Can Async Almost Anything If You Try Hard Enough


Now I want to return to Avi's thoughts about asyncing inventory during a checkout flow. I want to use this as an illustration that almost anything can be done asynchronously if you frame it differently. 

In a checkout flow, we wouldn't want to asynchronously check whether there is enough inventory to complete the transaction. If we did, we could end up showing a user that their checkout was successful, only for the background job that checks inventory to raise an error because inventory was insufficient. What do you do in that case? Yikes.

Instead, flip the problem around. Reserve inventory pessimistically when a user enters the checkout flow, and release that lock when they either abandon the checkout or check out successfully. That way, when you're completing someone's checkout, you're releasing the lock on a piece of inventory rather than trying to acquire it, which is much, much safer to do.

At this point, we can move the "unlocking inventory" into a background job as well, because we don't need to unlock the inventory before we return a response to the user.
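Here's an in-memory sketch of that reserve-on-entry, release-on-exit shape (the class name is my own; in a real app this state would live in the database or Redis, and the release step is exactly what you'd move into a background job):

```ruby
require "monitor"

# Pessimistic inventory reservation, simulated in memory.
class InventoryReservations
  def initialize(stock)
    @stock = stock
    @lock = Monitor.new
  end

  # User enters checkout: claim a unit up front. This is the only
  # step that can fail, and it fails *before* payment, not after.
  def reserve
    @lock.synchronize do
      return false if @stock.zero?
      @stock -= 1
      true
    end
  end

  # Checkout abandoned (or completed, where you'd instead decrement
  # the permanent stock count): give the claimed unit back. This
  # step cannot fail, so it's safe to run asynchronously.
  def release
    @lock.synchronize { @stock += 1 }
  end
end
```

The key property: the operation left in the request path (reserve) fails fast and early, while the operation that can't fail (release) is free to be deferred.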
 

Enter the Danish Scaling Master, Simon:

This particular problem of checkout process scaling is a great example of this issue at work.

I thought this might be a good thing to run by my friend Simon Eskildsen, former Principal Engineer at Shopify and current freelancer and author of the excellent newsletter Napkin Math. Specifically, I asked him about how he would build a scalable checkout process. Here's what he had to say:
 
 
"It's a devilishly complicated problem that looks deceptively simple. It's been a while since I thought about checkout, and fortunately never had to dive too deep into the inventory code at Shopify—so this isn't particularly inspired by how that works. Some loose thoughts!

(1) Failed payments are a problem. You also have to decide if you want reservations, i.e. preventing people from typing in their address to get an inventory error later. I think you should in your flash sale scenario.

Ideally, to implement reservations, you wrap the entire checkout flow in a serializable transaction (or `SELECT FOR UPDATE` at lower isolation levels), then decrement the inventory at the end. This doesn't scale. It means throughput is the average length for a human to complete checkout.

Instead, you could claim inventory at the beginning of checkout. On a relational database, the first thing I can then think of would be to have an inventory table where you do `SELECT FOR UPDATE COUNT(*) FROM inventory_claims WHERE created ≥ NOW() - interval 10 minute`, and if it's below the product's stock level (which is updated when the payment goes through), then you `INSERT` an inventory claim for your session in the same transaction you did the serializable COUNT(*), and go through checkout. I'm probably missing some detail here, but this is the first thing that comes to mind.

After checkout in a job when the payment has been authorized, you remove your inventory claim and update the inventory level of the product. If you need to minimize lock contention further, you could shard the inventory claims table giving each product e.g. 50 claims per table, or move the lock to Redis (which, due to its single-threaded nature and Lua, would probably be more straightforward). Your claim expires after 10 minutes. A background job repeatedly cleans out this table asynchronously. It could even be that job that updates the inventory directly on the product.

This should scale pretty well as long as you aggressively prune the table and keep indexes to a minimum; this should be able to do 1,000s of claims per second which is most likely sufficient in this case. You likely want at least to put the actual processing of the checkout behind a semaphore to limit concurrency slightly so that the database can prioritize the inventory claims. These transactions are generally heavy as they usually create several objects like a customer, order, fulfillment objects, etc. immediately after a payment, which takes a while too.

(2) I think the best customer experience is to push as much of this as possible into a background job. For legitimate customers, their credit cards generally wouldn't get declined. If the credit card gets declined, you could increase their inventory claim time by, e.g. 30 min to allow them to fix it (unless the fraud rate is suspected to be high)."

Well put, Simon.
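His claim-counting idea can be simulated in plain Ruby to see the moving parts (class and field names are my own illustration, not Shopify's code; in a real database, the count-then-insert would run inside one serializable transaction or under SELECT ... FOR UPDATE, and a background job would prune expired claims):

```ruby
# Count live claims, compare against stock, and record a new
# claim only if there's room. Claims expire after a TTL.
class InventoryClaims
  Claim = Struct.new(:session_id, :created_at)

  def initialize(stock:, ttl_seconds: 600)
    @stock = stock
    @ttl = ttl_seconds
    @claims = []
  end

  # Returns true if the claim was recorded, false if sold out.
  # `now` is injectable so expiry is easy to exercise.
  def claim(session_id, now: Time.now)
    prune(now)
    return false if @claims.size >= @stock
    @claims << Claim.new(session_id, now)
    true
  end

  private

  # Stand-in for the background job that repeatedly cleans
  # out expired claims.
  def prune(now)
    @claims.reject! { |c| now - c.created_at > @ttl }
  end
end
```

Note that abandonment needs no explicit handling at all here: an abandoned checkout's claim simply ages out, which is what makes the cleanup safe to run as a lazy, repeated background job.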

I hope this has given you something to think about. Until next week,

-Nate

Copyright © 2022 Nate Berkopec, All rights reserved.

