I like my code DAMP

Jul 20

“Can you DRY this up, please?”

We’ve all received that comment on our pull requests (aka: Don’t Repeat Yourself). We’ve accepted it industry-wide. Linters try to enforce it. We strive to make what little changes will allow us to cut a single perceived-to-be-useless line out of our code. And yet, I find my subconscious fighting against it.

Its core tenet, as written, might be stated as: The same conceptual task should not be implemented twice in the same codebase.

Its core tenet, in practice: Code should be as concise as humanly possible.

These are different! The DRY principle was written with maintainability in mind— if the logic behind a given task becomes necessary to change, we would want one specific function to implement that change and have it automatically be used by the rest of the codebase. And to be clear, the intended purpose behind DRY is near-undeniable. It is obviously bad if I replace ____@____.com with `arjits_placeholder@example.net` 14 different times in my code without a central utility function. Future changes to this duplicated functionality would be needlessly tough to search and fix.

In practice, though, “DRY” often leads us down paths it didn’t mean to take us. The problem of over-abstraction is well-documented. But there’s another problem too— it enforces our human intuition that redundancy == verbosity == waste. And unfortunately, it leaks into both our subconscious and our linters. So, I propose a new way of thinking: DAMP.

I’m not refuting DRY, but reframing our thinking when it comes to the code we write every day, while adhering to the core tenet that DRY was founded with.

DAMP code should be:

Descriptive: Use clear and meaningful names for variables, functions, and classes. It should aim to read like English, enabling developers to understand the purpose and behavior of the code with minimal effort.
Adaptable: Allow for easy modifications and extensions. It should be designed with flexibility in mind, making it easier to accommodate changes in requirements. However, it's important to note that adaptability does not mean consolidating too much varied functionality in one place through excessive abstraction.
Maintainable: Code should be easy to update, debug, and enhance. It should be structured in a way that minimizes complexity, promotes easy comprehension, and supports the DRY principle of preventing logic replication.
Pragmatic: Focus on practicality and readability. It acknowledges that sometimes it's better to have slight repetition for the sake of clarity and understanding.

Examples

Fighting off the rubocop

I recently ran into the most basic of examples— a nested if/else block that I wrote. It was something like:

object = 
  if A
    X
  else
    if B
      Y
    elsif C
      X
    else
      Z
    end
  end

There’s 2 extremely common and fair suggestions here:

Why not turn the else -> if into an elsif
Why not do if A || C do X

It’d be easy to accept these suggestions, but on this day I had my DAMP hat on. So I considered the context.

object = 
  if connection.is_a? FileSystemConnection
    X
  else
    if some_hash_has_some_value
      Y
    elsif that_hash_has_some_other_value
      X
    else
      raise Z
    end
  end

To me, there’s 2 logical paths here. Either the connection is a FileSystemConnection, or it isn’t. And based on that fact, different things can happen. That logical separation led me to take the pragmatic approach to the suggestions above.

I rejected (1 - turn the else if into an elsif) because I want to make the code as descriptive as possible. Reading my code should feel like English. In this case, keeping the indentation for the if not a FileSystemConnection logic kept its logical path clear.

I rejected (2 - write if A || C do X) for similar reasons. It is much, much less descriptive for me to write a code block like if connection.is_a? FileSystemConnection || that_hash_has_some_other_value. An anonymous reader in the future would have quite the difficult time understanding how some hash and the connection type had any correlation that led to the same outcome. And as much as possible, I want to avoid putting future travelers through that confusion.

So, I added a rubocop:disable, and moved on feeling confident that I was leaving my code as DAMP as possible for the next person.

Builders aka implicit conditionals

A relatively popular coding pattern is builders. I take great pleasure in removing these. Let me explain why.

Builders allow you to construct an object in an extremely pleasant-to-write format. Suppose there were a UserBuilder, defined as

class UserBuilder
  attr_accessor :name, :email, :age, :address

  def initialize
    @user = User.new
  end

  def set_name(name)
    @user.name = name
    self
  end

  def set_email(email)
    @user.email = email
    self
  end

  def set_age(age)
    @user.age = age
    self
  end

  def set_address(address)
    @user.address = address
    self
  end

  def build
    @user
  end
end

This allows you to then build the User piece-by-piece, calling whichever set_ functions are relevant at the time:

user = UserBuilder.new
  .set_name('John')
  .set_email('john@example.com')
  .set_age(30)
  .set_address('123 Main St')
  .build

This seems great! It’s extremely adaptable (you can call whichever initializer functions you want) and during invocation it is descriptive (could not be any more English).

My DAMP hat was on though, and while digging through our codebase, I found that it felt very unmaintainable. Let’s say the UserBuilder was being called in 8 different spots. Maybe it was being called from 5 different controllers, and in each case, it was calling a slightly different combination of set_ functions. This, in essence, is spreading implicit conditionals across the codebase! When I open the builder file, I have no understanding of under which circumstances each set_ will be called, much less the reasoning behind the combinations. In each invocation of the UserBuilder too, there could be a million different ways they decide if they should call each set_ function.

This violates the principles of descriptiveness (inside the builder) and maintainability.

Therefore, I ventured to take out these implicit conditionals scattered across our code. What a builder is trying to accomplish is flexibly supporting multiple ways to build the same object. A couple builder invocations may look like:

# eu_controller.rb

def create_eu_user
  user_builder = UserBuilder.new
    .set_name(given_name)
    .set_email(given_email)

  user_builder.build
end

# arbitrary_api_controller.rb

def create_arbitrary_user
  is_na_user = (given_region == "na")

  user_builder = UserBuilder.new
    .set_name(given_name)
    .set_email(given_email)

  if is_na_user
    user_builder.set_address(given_address)
  end

  user_builder.build
end

The distinction here is that the code tries to only set the address if the user is located in North America. And in just these 2 implementations, I’ve spread that implicit conditional across the code. I have no central definition for it, and nothing in my Builder code hinting at its existence.

The better implementation here, in my opinion, is the Factory pattern. In the factory pattern, I can move this implicit conditional inside the factory. This will make sure that A) every implementation adheres to it (maintainable) and B) I can view the conditional in plain sight when reading the factory code (descriptive).

Here’s what my code would look like in the Factory world:

class UserFactory
  def self.build_user(name, email, region = nil)
    user = User.new
    user.name = name
    user.email = email

    if region == "na"
      user.address = given_address
    end

    user
  end
end

# eu_controller.rb

def create_eu_user
  UserFactory.build_user(given_name, given_email, "eu")
end

# arbitrary_api_controller.rb

def create_arbitrary_user
  UserFactory.build_user(given_name, given_email, given_region)
end

Both the definition of the Factory and the implementations now adhere to the DAMP principles, and I can move on feeling happy that I’ve written some maintainable code!

Blocks aka codepath whiplash

Ruby has graciously given us the yield keyword, which allows the block syntax. I take great pleasure in removing these too. Here’s an example

Definition:

def self.do_some_stuff(default_attributes, workspace, &blk)
    new_thing = CoolObject.new(default_attributes)
    new_thing.workspace = workspace

    # do a bunch of stuff

    yield new_thing

    # do a ton more stuff

    new_thing
end

Caller:

def initialize
  Blah.do_some_stuff(@attributes, @current_workspace) do |blah_stuff_doer|
    blah_stuff_doer.make_sashimi
    blah_stuff_doer.build_a_house
    blah_stuff_doer.crash_my_car(@super_special_attributes, @interesting_attributes)
  end
end

At first glance, this seems pretty neat! What’s the issue? If you can’t tell by now, my M.O. is that I would like to easily understand code that I read. And in this case, I can’t tell what’s happening from neither the caller nor the definition individually.

If I’m reading through a file and happen upon the caller code, I have tons of questions. What’s blah_stuff_doer? What’s its type? What type do I need to be returning from my block? When in the do_some_stuff function does the yield occur? To answer the last question especially, there’s no telling how far I might have to dive; the &blk attribute can be passed many functions deep, and it could take some spelunking to figure out where the yield occurs.

If I’m reading through the definition code, it’s slightly clearer, but I have no clue what’s going on when I yield. I still need to check any callers to figure out what functionality might be occurring within the blocks the function is receiving.

I don’t find this code easy to read, hence my DAMP-alarms go off, highlighting the lack of descriptiveness in this approach.

In most cases, I’ve found these block syntaxes to just be hiding conditionals (similarly to the Builder pattern in the previous section). Let’s look at an example implementation:

def self.do_some_stuff(default_attributes, workspace, &blk)
    new_thing = CoolObject.new(default_attributes)
    new_thing.workspace = workspace

    # do a bunch of stuff

    yield new_thing

    # do a ton more stuff

    new_thing
end

def some_implementing_function
  Blah.do_some_stuff(@attributes, @current_workspace) do |blah_stuff_doer|
    blah_stuff_doer.make_sashimi
  end
end

def some_other_implementing_function_that_hates_sushi
  Blah.do_some_stuff(@attributes, @current_workspace) do |blah_stuff_doer|
    blah_stuff_doer.build_a_house
    blah_stuff_doer.crash_my_car(@super_special_attributes, @interesting_attributes)
  end
end

This is just a hidden conditional!! All this yield is really accomplishing in this case is saying “if sushi, make sashimi. if not, do some other stuff!”. And so, it’s remarkably easy to get rid of this yield / &blk syntax.

def self.do_some_stuff(default_attributes, workspace, likes_sushi)
  new_thing = CoolObject.new(default_attributes)
  new_thing.workspace = workspace

  # do a bunch of stuff

  if likes_sushi
    new_thing.make_sashimi
  else
    new_thing.build_a_house
    new_thing.crash_my_car(@super_special_attributes, @interesting_attributes)
  end

  # do a ton more stuff

  new_thing
end

def some_implementing_function
  Blah.do_some_stuff(@attributes, @current_workspace, true)
end

def some_other_implementing_function_that_hates_sushi
  Blah.do_some_stuff(@attributes, @current_workspace, false)
end

Boom! I’ve made the code much, much more easy to follow by removing the mental overhead of the block syntax. There is no longer a yield that I have to follow through the code, and instead I can explicitly write the if likes_sushi within the function.

Obviously, not all block syntax implementations will be this straightforward & silly. Some are written for a great reason. It’s important, though, to put your DAMP hat on, and just make sure the code you’re writing is as descriptive/adaptable/maintainable/pragmatic as possible.

Splats 👎

When I am reading through code, trying my best to understand it, nothing grinds my gears more than splats. Splats in code are:

Extremely undescriptive. I have no chance of understanding the arguments being passed without diving in to understand the whole function.
Wildly unmaintainable. Future contributors will likely have difficulty tracking what is being splatted into the function. Also, it is rare for developers to strongly type their splatted arguments. For those who do, thank you! But in most cases, you lose important type safety right off the bat.

sig { params(data: T::Hash[Symbol, T.untyped]).void } 
def process_data(**data)
  @some_other_object.set_interval(data[:interval])
  @some_other_object.set_rate_limit_header(data[:rate_limit_header])
  # More code to process the data
end

big_hash = {
  url: "https://api2.frontapp.com",
  permitted_requests: 95,
  interval: 60,
  rate_limit_header: "X-RateLimit-Limit"
}

process_data(**big_hash)

Issues here can be seen right away. Firstly, if I were looking only at the function definition, I would have no idea what might exist inside the data hash. Secondly, I have no safety on the types of data[:interval] or data[:rate_limit_header].

This code would be infinitely easier to read if the arguments process_data wanted to use were explicitly provided, rather than splatted in.

To improve the process_data function and make it more descriptive and maintainable, you can modify it to explicitly take the required arguments instead of using splats. Here's how you can refactor the code:

sig { params(interval: Integer, rate_limit_header: String, permitted_requests: Integer, url: String).void } 
def process_data(interval:, rate_limit_header:, permitted_requests:, url:)
  @some_other_object.set_interval(interval)
  @some_other_object.set_rate_limit_header(rate_limit_header)
  @some_other_object.set_permitted_requests(permitted_requests)
  @some_other_object.set_url(url)
end

big_hash = {
  url: "https://api2.frontapp.com",
  permitted_requests: 95,
  interval: 60,
  rate_limit_header: "X-RateLimit-Limit"
}

process_data(
  interval: big_hash[:interval],
  rate_limit_header: big_hash[:rate_limit_header],
  permitted_requests: big_hash[:permitted_requests],
  url: T.must(big_hash[:url])
)

Now, I know exactly what I’m passing into my function. I can explicitly enforce types on the arguments I’m passing to the process_data function. I can add an individual T.must() to the url argument, to make sure it raises if the value is missing from big_hash. And most importantly, I can read the function signature of process_data and know exactly what’s being passed in. Overall, I’ve made my code much more DAMP than the splatted format.

Conclusion

Embracing the DAMP approach encourages us to prioritize code that is easy to understand, modify, and enhance, while still respecting the core tenets of the DRY principle.

By focusing on code that is descriptive, adaptable, maintainable, and pragmatic, we can strike a healthy balance between reducing redundancy and prioritizing readability.

Put on your swim trunks— it’s time to start getting DAMP.

Arjit Jaiswal

Arjit is a software engineer at Census, and an aspiring jack of all trades. Previously served Autopilot & Vision Automation @ Tesla. Outside of work, he enjoys jetsetting, conjuring sports-related hot-takes, and listening to every genre of music known to humankind.

https://www.linkedin.com/in/arjitj2/