Abstractions Are In The Eye Of The Beholder

One of the most common debates I see is about the right level of abstraction to use when coding. The line between over-engineered and unnecessarily verbose is a very fuzzy one, and it is the source of never-ending arguments.

Unfortunately, this debate is unlikely to ever be resolved, for one simple reason: the correct answer is objective, yet it differs for each person. More specifically, code simplicity depends intrinsically on the reader’s ability to grasp abstractions. What is over-engineered to one person is perfectly concise and clear to another. Both are justified in praising or decrying the code, and any change will come at the expense of one of them.

A Simple Illustration

Some people think that all abstractions detract from simplicity, but this is not true. For example, skim through the following 2 pieces of code:

public void populate(String name, String email) {
  Map<String, Set<String>> nameToEmails = getNamesToEmails();
  Set<String> emails = nameToEmails.get(name);
  if (emails == null) {
    emails = new HashSet<>();
    nameToEmails.put(name, emails);
  }
  emails.add(email);
}

As compared to:

public void populate(String name, String email) {
  SetMultimap<String, String> nameToEmails = getNamesToEmails();
  nameToEmails.put(name, email);
}

The first piece of code doesn’t require you to understand any “Multimap” abstraction at all. All the functionality of the 2nd example is achieved in the 1st using only a simple Map and nothing else. Such simplicity! If performance is not a concern, perhaps we could do away with the Map abstraction as well, and use just primitive arrays instead?

And yet, many people would find the 2nd piece of code simpler and easier to work with, especially anyone who is familiar with Multimaps. Once you’ve understood the abstraction, implementing it once and reusing it everywhere leads to simpler code: a single put() packs a wealth of knowledge into an easily digestible form. In psychology this is referred to as “chunking”, and it is an important tool in our brain’s arsenal for managing complexity.
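
As a rough sketch of what “implementing it once” could look like, here is a minimal set-valued multimap built on a plain Map. The class name SimpleSetMultimap and its two methods are hypothetical and exist only for illustration. This is not the SetMultimap type used in the snippet above, just a stand-in that shows where the null-check logic from the first snippet ends up.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, minimal stand-in for a set-valued multimap (illustration only).
class SimpleSetMultimap<K, V> {
    private final Map<K, Set<V>> backing = new HashMap<>();

    // Adds value under key, creating the underlying set on first use.
    // Returns true if the value was not already present for that key.
    boolean put(K key, V value) {
        return backing.computeIfAbsent(key, k -> new HashSet<>()).add(value);
    }

    // Returns a read-only view of the values for key, or an empty set if there are none.
    Set<V> get(K key) {
        Set<V> values = backing.get(key);
        return values == null
                ? Collections.<V>emptySet()
                : Collections.unmodifiableSet(values);
    }
}
```

With something like this in place, every call site shrinks to a single put(), and the “create the set if it is missing” step is written, and read, exactly once.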

A Real-Life Example

For a perfect illustration of the above, consider the following example, which Linus Torvalds has used to describe his preferred coding practices. See the link for more details, but suppose you have a linked list and you’re trying to implement the remove method. Which of the following options seems simpler?

// Option 1
void remove_cs101(IntList *l, IntListItem *target)
{
    IntListItem *cur = l->head, *prev = NULL;
    while (cur != target) {
        prev = cur;
        cur = cur->next;
    }
    if (prev) {
        prev->next = cur->next;
    } else {
        l->head = cur->next;
    }
}

// Option 2
void remove_elegant(IntList *l, IntListItem *target)
{
    IntListItem **p = &l->head;
    while ((*p) != target) {
        p = &(*p)->next;
    }
    *p = target->next;
}

In the article, Linus and the author argue that the second option is preferable to the first.

“The more elegant version has less code and … there is no need for a special case or branching and a single iterator is sufficient to find and remove the target item.”

In terms of code conciseness or cyclomatic complexity, option #2 is clearly the simpler option. And yet, reading through the discussion threads on Hacker News and Reddit, there are numerous programmers (including myself) who have limited experience with “pointers to pointers” and find option #1 easier to understand and reason about.

Openness to Abstractions

The downside of the 2nd approach in both examples above is that whenever someone who hasn’t grokked a particular abstraction encounters this code, it will appear even more complex than the 1st approach. Not only do they have to understand how the abstraction is being applied, they first have to learn the abstraction itself.

Even worse, they have to go elsewhere to understand this abstraction. In the first example, they have to review a whole other method or class, which may in turn reference yet other methods and classes. In the second example, they have to learn the nuances of language features such as pointers to pointers, and any other language features transitively involved. Once that is done, they have to come back to the original code and understand how the abstraction is being used in this specific context.

And therein lies the rub. What appears very simple to one person, instead seems over-engineered to someone else. And they are both right. It is false to say that the code is universally “simple” or “complex”. It is only “simple” or “complex” to that reader.

The above examples may seem contrived. Especially with the first example, I purposely picked an abstraction that is easy to understand, to illustrate that abstractions can make code more simple. This is a point worth emphasizing, because it’s very easy to fall into the trap of thinking that abstractions always make the code more complex. When it comes to abstractions which we already understand, it’s easy to see that this is not true. The real challenge comes with abstractions that the reader has not seen before.

Suppose in your codebase, there is some… pattern… that repeats itself numerous times.
Person-A recognizes that this pattern can be extracted into a meaningful abstraction, one that can be moved into its own class or utility method so that it can be logically chunked and concisely invoked from many places.
Person-B is able to understand this abstraction relatively quickly, and thinks that the code has become much simpler.
Person-C has a much harder time understanding this abstraction, and thinks the new code is an over-engineered mess.

A huge debate ensues over whether the code is “good” or “bad”. Everyone thinks that they are right and that the code should be updated to reflect their beliefs. Neither side realizes that they are both right: the same code that Person-B finds easier to understand is now harder to understand for Person-C.
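
To make the scenario concrete, here is a small sketch of the kind of extraction Person-A might propose. Everything in it is invented for illustration: the SettingsExample class, the valueOrDefault helper, and the “host” and “port” keys are all hypothetical, and the repeated pattern (look up a value, trim it, fall back to a default) is just one plausible candidate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; all names here are invented for illustration.
class SettingsExample {

    // Before: the lookup-trim-default pattern is spelled out at every use.
    static String describeInline(Map<String, String> settings) {
        String host = settings.get("host");
        if (host == null || host.trim().isEmpty()) {
            host = "localhost";
        } else {
            host = host.trim();
        }
        String port = settings.get("port");
        if (port == null || port.trim().isEmpty()) {
            port = "8080";
        } else {
            port = port.trim();
        }
        return host + ":" + port;
    }

    // After: the pattern gets a name and lives in one place.
    static String valueOrDefault(Map<String, String> settings, String key, String fallback) {
        String raw = settings.get(key);
        if (raw == null) {
            return fallback;
        }
        String trimmed = raw.trim();
        return trimmed.isEmpty() ? fallback : trimmed;
    }

    static String describeExtracted(Map<String, String> settings) {
        return valueOrDefault(settings, "host", "localhost")
                + ":" + valueOrDefault(settings, "port", "8080");
    }
}
```

Both methods compute the same result. Person-B reads describeExtracted as two short lookups, while Person-C has to jump to valueOrDefault, learn what it does, and come back before the same two lines mean anything.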

The Role of Tooling

And that’s only half the story. Tooling plays a very similar role. Abstractions, helper methods, and 3rd party utilities can all boost readability tremendously for developers who use advanced tooling such as IDEs. Viewing or navigating to the utility in question, in order to better understand its nuances, is simply one click away.

In contrast, if a developer prefers coding in Notepad, this same code will take far longer to understand. They have to use tools like grep to first find the specific file containing the code, open that file, navigate to the specific method, and then repeat this entire process recursively for every method or abstraction being referenced.

The former crowd is much more likely to favor conciseness and information-hiding via OOP techniques such as helper classes and methods. The latter crowd tends to favor inlining everything. “Just put everything in one place so I can read it all in one go!” A huge debate again ensues on what’s “good” and what’s “bad”. But neither camp is right or wrong – they are both optimizing for their own productivity, given the constraints of their tooling.

What’s your Abstraction Level?

Which brings me to the inconvenient truth behind abstractions. Being objective and egalitarian engineers, we love to believe that “good” code is universally “good”, and that all programmers will be capable of recognizing and appreciating this “good” code.

Unfortunately, this is simply not true. “Good” code is entirely relative. What is “good” for one person will be an “over-engineered monstrosity” for another, and a “duplicated verbose mess” to a 3rd.

And this isn’t because “good code” is subjective. It isn’t. Good code, by definition, has very objective effects on programmer productivity. Unfortunately, the same code can boost productivity for some developers, while simultaneously reducing productivity for others.

A useful analogy here is reading level. People’s reading levels range widely, and their reading preferences are strongly correlated with their reading level. Some readers are best served by the use of more concise and abstract words, to represent complex ideas. Such readers would appreciate and benefit from the use of words such as “capitulate”, “solidarity” and even “abstract”. In contrast, others would find such words jarring, and are best served by easier alternatives. Alternatives such as “give in to”, “support”, “high level” … or simply replacing the word with an entire phrase that expresses the same idea. Alternatives that are more verbose, or do not represent the same richness of meaning.

The same can be said of code-reading ability as well. Some programmers “read” at a graduate level. They are able to quickly understand abstractions, even multi-layered abstractions, and appreciate the conciseness and logic-chunking they bring. They tend to prioritize principles such as Don’t-Repeat-Yourself, Single-Level-of-Abstraction and Small-Functions. And some other programmers “read” at a grade 6 level. They have a very hard time understanding abstractions, and fitting different levels of abstractions in their head. They prefer fewer levels of abstractions, even if it means verbosely duplicating and inlining all of their functionality.

Both programmers will claim that their goal is simplicity. And they are both right. The real difference is that one uses abstractions as a tool for fighting complexity, whereas the other sees abstractions as a cause of complexity.

Finding the Middle Ground

We spend a lot of time discussing the “right” and “wrong” way to write code, and in many cases, there is a lot of merit to this. Some abstractions provide great conciseness and chunking, while still being simple and easy to understand: abstractions such as ArrayLists, HashMaps, or Heaps, which no one would dream of inlining.

And then there are other abstractions, usually hidden away in poorly maintained code bases, that provide minimal conciseness while still being complex and hard to understand.

In many instances, the latter can be refactored into the former, in such a way that the new code is strictly superior to the old one. Where such opportunities exist, we should definitely make use of them and hone our ability to build well-designed abstractions. Writers are often exhorted to use simpler words whenever possible, and the same is definitely true for programming as well.

However, no matter how much we try, there will always exist a tension between the benefits provided by abstractions, and the effort involved in understanding these abstractions. When such tensions come up, recognize that there is no “right” or “wrong” answer, and you have to play to the abilities of your audience. 

If you’re someone who dislikes the abstractions being employed by your peers, ask yourself if there’s a way to simplify those abstractions, while still preserving their benefits. If so, you have found a strictly superior solution, and should recommend that as an alternative. If not, maybe you should challenge yourself to up your “reading level”. It’s a skill that will serve you well over the course of your career.

If you’re someone whose peers are constantly complaining about your code being “overengineered”, ask yourself if there’s a way to make it simpler, without adding too much verbosity. If so, that is definitely the way to go. If not, accept that your abstractions might simply be too hard for your peers to understand. Meet your audience where they are, not where you would like them to be.

There’s a reason why the NYTimes writes at a 10th grade reading level, while Donald Trump speaks at a 4th grade level. Neither one of them is “right” or “wrong” – they are both best serving their target audience. Aim to do the same with your code.
