You're reading the Ruby/Rails performance newsletter by Speedshop.

Looking for an audit of the perf of your Rails app? I've partnered with Ombu Labs to do just that.

I recently spoke at RubyConf Thailand. My slides are here, but the talk will be uploaded to YouTube soon.

Do you speak Japanese? Subscribe to the Japanese edition of this newsletter.

It works on my machine - but how do you test a DDoS attack?

A few weeks back, I was woken up at 4:30am by my wonderful 1-year-old daughter. I went to the kitchen and opened up my laptop to get my day started. I found I had a Twitter DM from my internet-friend Jon Yongfook, and it was sent just a few minutes prior. He said he had a big problem and wanted my help. Jon is also located in Asia, so I knew he was also up extra early - this must be a serious issue.

It turned out Jon was experiencing a DDoS attack. Someone was trying to ransom his site (and unfortunately, a few other indie makers' sites as well). Eventually, he got it under control by migrating to Cloudflare. Jon wrote up the entire experience here.

Moral of the story? If you have a digital asset that can be DDoSed, it eventually will be. So, put your stuff behind Cloudflare or another web application firewall (WAF) before the hackers make you do it.

There's one line in Jon's post I wanted to expand on:

"Bannerbear is built with Ruby on Rails and we use a gem called Rack Attack to apply rate limiting. My first thought, why wasn't this doing its job. I thought maybe I had configured it wrong - shoutout to Nate Berkopec who helped me apply some patches to my existing Rack Attack config."

If you're not familiar with rack-attack, it's a Rack middleware for blocking traffic. Jon had been trying a bunch of rules in there for blocking this DDoS, but it seemed completely ineffective. He confirmed that these rules were working locally, but we were still seeing logs which suggested that this traffic was not being properly rejected by rack-attack.

The rack-attack config was something like:

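# ENV values are strings, so convert the limit to an Integer; period is in seconds.
Rack::Attack.throttle("requests by ip", limit: Integer(ENV.fetch('RACK_ATTACK_THROTTLE_PER_MIN')), period: 60) do |request|
  request.ip # throttle key: one counter per client IP
end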

So, a throttle for more than X requests per minute per IP address. Why wasn't this working at all?

Turns out, if you look at rack-attack's readme, you'll see a line that sets Rack::Attack.cache.store to ActiveSupport::Cache::MemoryStore.new.

And Jon, when he installed rack-attack, had copy-pasted this line either from here or somewhere else. Unfortunately, he also had Rails.cache configured to use a Redis store, and rack-attack defaults to Rails.cache, which means that without this extra line, rack-attack might have worked a bit better here.

Locally, this works fine.

But in production, your application is now a distributed system. You have X number of Ruby application processes, most of them on different machines. It's really hard to test the behavior of a distributed system, and probably impossible for one-man shows like Jon.

ActiveSupport::Cache::MemoryStore is an in-memory store. That is, it stores keys and values just in the RAM allocated to the Ruby process. This means you don't have one cache store, you have one cache store per Ruby process.
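
Here's a minimal sketch of what "one cache store per Ruby process" means in practice. It doesn't use rack-attack or ActiveSupport at all: the STORE hash and the count_in_child helper are hypothetical stand-ins I made up for illustration, and it assumes a platform where fork is available (Linux or macOS).

```ruby
# Stand-in for an in-memory cache: a plain Hash living in this process's RAM.
# (Hypothetical illustration; not rack-attack's or ActiveSupport's actual code.)
STORE = Hash.new(0)

# Fork a child process, record `hits` requests for `ip` there, and
# return the count the child saw, sent back to the parent over a pipe.
def count_in_child(ip, hits)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    hits.times { STORE[ip] += 1 } # only this child's copy of STORE changes
    writer.puts(STORE[ip])
    writer.close
    exit!(0)
  end
  writer.close
  count = reader.read.to_i
  reader.close
  Process.wait(pid)
  count
end

ip = "203.0.113.7" # documentation-range address, used only as an example
3.times { |n| puts "child #{n} saw #{count_in_child(ip, 5)} requests" }
puts "parent saw #{STORE[ip]} requests"
```

Each child reports 5 requests and the parent reports 0: every process counts in its own private copy of the hash, and nobody sees the total of 15.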

An in-memory store/a cache store per process doesn't work very well for a throttler like rack-attack:

The requests-per-minute limit becomes a limit per process per minute, effectively allowing X times as much traffic from each IP address, where X is the number of processes you have deployed.
In a DDoS situation, processes may be restarting frequently due to timeouts. When this happens, the process's cache is wiped, so the replacement process starts up with no throttler information.
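
Both effects can be sketched with a toy simulation. To be clear about the assumptions: AppProcess and allowed_count are hypothetical names of mine, the round-robin load balancing, the limit of 10, and the 4 processes are made-up numbers, and everything happens inside a single throttle window. This is not how rack-attack is implemented, just the arithmetic of the problem.

```ruby
# Hypothetical stand-in for one app process holding a throttle counter.
class AppProcess
  # By default each process gets its own fresh Hash (an in-memory store).
  def initialize(store = Hash.new(0))
    @store = store
  end

  # Count the request, then report whether it is still within the limit.
  def allow?(ip, limit)
    @store[ip] += 1
    @store[ip] <= limit
  end
end

# Send `requests` requests from one IP, round-robin across the processes,
# and return how many were allowed through.
def allowed_count(processes, requests:, limit:, ip: "203.0.113.7")
  requests.times.count { |i| processes[i % processes.size].allow?(ip, limit) }
end

LIMIT = 10

# Four processes, each with its own in-memory counter.
per_process = Array.new(4) { AppProcess.new }
puts allowed_count(per_process, requests: 100, limit: LIMIT) # prints 40

# Four processes sharing one counter (what a shared store gives you).
shared_store = Hash.new(0)
shared = Array.new(4) { AppProcess.new(shared_store) }
puts allowed_count(shared, requests: 100, limit: LIMIT) # prints 10

# A restart swaps in a process with an empty counter, so the same IP
# that was just being blocked is allowed through again.
per_process[0] = AppProcess.new
puts per_process[0].allow?("203.0.113.7", LIMIT) # prints true
```

With per-process counters, the one IP gets 40 requests through instead of 10, which is the "X times as much traffic" problem with X = 4.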

So, we changed that and still had issues. The volume of traffic was just too high. Even with rack-attack quickly rejecting traffic, there were enough IP addresses that it just didn't work. Switching to Cloudflare and blocking above the application layer was the inevitable solution.
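
For reference, here's a sketch of one way to point rack-attack at a store that every process shares. This is not necessarily the exact change Jon shipped. It assumes Rails 5.2 or later (for ActiveSupport::Cache::RedisCacheStore) and a REDIS_URL environment variable, which is a name I'm assuming rather than one from Jon's setup.

```ruby
# config/initializers/rack_attack.rb
# Assumption: one Redis instance reachable by every app process via REDIS_URL.
Rack::Attack.cache.store = ActiveSupport::Cache::RedisCacheStore.new(url: ENV.fetch("REDIS_URL"))
```

If Rails.cache already points at Redis, deleting the MemoryStore line should have a similar effect, since rack-attack falls back to Rails.cache.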

Jon's Rack::Attack config would have passed any test suite. Almost no one tests their apps in such a way as to catch a problem like this. So what are we to do?

I guess I'm not sure. DDoS attacks are rare but costly, and testing infrastructure that can accurately simulate one is not cheap in time or money.

Some things will probably never be testable, and those things may be complicated to reason through, especially if they are the emergent behavior of a complex distributed system.

Important systems - like rack-attack - deserve a careful, watchful eye and a quick pencil-sketch at your desk of the consequences of your configuration decisions. Finally, defense in depth (i.e. a web application firewall _and_ application-level blocking) can catch holes in one layer at another layer in the stack.

Until next week,

-Nate

You can share this email with this permalink: https://mailchi.mp/railsspeed/rack-attack-a-tale-of-complex-systems-and-a-ddos-attack?e=[UNIQID]

Copyright © 2022 Nate Berkopec, All rights reserved.
