Why “Less Code” Matters

…being able to do task X with 50 lines of code is preferable to needing 500 lines of code to do task X. Less code takes less time to write, but the real benefits are around maintenance: less code means less of a chance of bugs, less to keep in your head, less for someone else (or yourself 6 months later) to read through and learn, less to test, and less to modify when you change the rest of the system.

- Alan Keefer, Syntax Matters

I’d like to expand on that. I don’t think it’s clear how important “less code” is, or how harmful more code is. So let’s take an example written in a Blub-y language, and see how well we can refactor it.

(I know this post is kind of long, but it’s mostly Blub code, and it should scan quickly.)

Let’s make a sandwich.

routine makeSandwich
    look for the peanut butter in the cabinet
    if it's not there, look for it in the other cabinet
    put the peanut butter on the counter

    look for the jelly in the fridge
    if it's not there, look for it in the cabinet
    if it's not there, look for it in the other cabinet
    put the jelly on the counter

    find a napkin
    put the napkin on the counter

    find the bread in the bread drawer
    untie the bread bag
    take two pieces of bread from the bag
    close the bread bag
    put the bread back in the bread drawer
    put the two pieces of bread on the napkin

    find a butter knife
    put the butter knife on the napkin

    open the peanut butter jar
    stick the butter knife into the peanut butter jar
    with the butter knife, scoop out some peanut butter
    spread the peanut butter on one piece of bread
    close the peanut butter jar
    put the peanut butter back in the cabinet

    wipe the butter knife on the other piece of bread

    open the jelly jar
    stick the butter knife into the jelly jar
    with the butter knife, scoop out some jelly
    spread the jelly on the other piece of bread
    close the jelly jar
    put the jelly back in the fridge

    put the knife in the sink

So much work! No wonder I seldom cook. Can we improve that at all? Well, the “looking for in 2 cabinets” seems to be a pattern, so let’s Extract Method:

routine lookForInTwoCabinets (lookFor)
    look for the lookFor in the cabinet
    if it's not there, look in the other cabinet
    return it

routine makeSandwich
    lookForInTwoCabinets (peanut butter)
    put the peanut butter on the counter

    look for the jelly in the fridge
    if it's not there, lookForInTwoCabinets(jelly)
    put the jelly on the counter
    ...

Can we move the “put it on the counter” inside lookForInTwoCabinets? I don’t know…it would work for the peanut butter, but what if we find the jelly in the fridge? In that case, we wouldn’t call lookForInTwoCabinets(jelly), so we might never put the jelly on the counter. Besides, the name doesn’t really imply anything about what we do after we find the thing. We should probably leave it outside. Yeah, it’s not so DRY, but let’s move on.

That big block where we look for bread, we can’t really compress it at all…but we can extract it, just to wrap the whole sequence of steps up with a name.

...
routine getBread
    find the bread in the bread drawer
    untie the bread bag
    take two pieces of bread from the bag
    close the bread bag
    put the bread back in the bread drawer
    put the two pieces of bread on the napkin

routine makeSandwich
    ...
    find a napkin
    put the napkin on the counter

    getBread

    find a butter knife
    put the butter knife on the napkin
    ...

Ok, we’re making progress. What about spreading the peanut butter & jelly on the bread? Can we extract another method?

routine spread (topping, breadSlice)
    open the topping jar
    stick the butter knife into the topping jar
    with the butter knife, scoop out some topping
    spread the topping on breadSlice
    close the topping jar
    put the topping back in the cabinet

routine makeSandwich
    ...
    find a butter knife
    put the butter knife on the napkin

    spread (peanut butter, one piece of bread)

    wipe the butter knife on the other piece of bread

    spread (jelly, the other piece of bread)

    put the knife in the sink

Great! Except we just introduced a bug: after closing the topping jar, spread always puts the topping back in the cabinet, but the jelly belongs in the fridge (moldy jelly is a Bad Thing). Introduce Parameter:

routine spread (topping, breadSlice, returnToppingTo)
    open the topping jar
    stick the butter knife into the topping jar
    with the butter knife, scoop out some topping
    spread the topping on breadSlice
    close the topping jar
    put the topping back in returnToppingTo

routine makeSandwich
    ...
    find a butter knife
    put the butter knife on the napkin

    spread (peanut butter, one piece of bread, the cabinet)

    wipe the butter knife on the other piece of bread

    spread (jelly, the other piece of bread, the fridge)

    put the knife in the sink

Ok, I think we’re done. (Does it make sense to send a “return topping to” parameter to a method that’s just spreading? Not now, we’re almost ready to commit…) Let’s step back and admire our craft:

routine lookForInTwoCabinets (lookFor)
    look for the lookFor in the cabinet
    if it's not there, look in the other cabinet
    return it

routine getBread
    find the bread in the bread drawer
    untie the bread bag
    take two pieces of bread from the bag
    close the bread bag
    put the bread back in the bread drawer
    put the two pieces of bread on the napkin

routine spread (topping, breadSlice, returnToppingTo)
    open the topping jar
    stick the butter knife into the topping jar
    with the butter knife, scoop out some topping
    spread the topping on breadSlice
    close the topping jar
    put the topping back in returnToppingTo

routine makeSandwich
    lookForInTwoCabinets (peanut butter)
    put the peanut butter on the counter

    look for the jelly in the fridge
    if it's not there, lookForInTwoCabinets(jelly)
    put the jelly on the counter

    find a napkin
    put the napkin on the counter

    getBread

    find a butter knife
    put the butter knife on the napkin

    spread (peanut butter, one piece of bread, the cabinet)

    wipe the butter knife on the other piece of bread

    spread (jelly, the other piece of bread, the fridge)

    put the knife in the sink

31 lines down to…32 lines. Ok, well, even if it’s longer, is it better? makeSandwich is shorter, that’s good. But it doesn’t feel like we’ve really made the job any easier — we moved stuff around, but it’s still all there. There’s no semantic compression. It’s still 3 + 3 + 3 + 3 + 3 + 3, instead of 3 * 6.

What did we think about? We had to ask ourselves whether to move “put it on the counter” into lookForInTwoCabinets. The value of getBread is questionable. We had the bug with spread putting the jelly in the cabinet, and we had to wonder about its “return topping to” parameter. Every time we consider refactoring, we risk introducing a crappy abstraction that confuses, when it should clarify. Every decision point, we have to think about it, and we might get it wrong. But that’s why they pay us the big bucks, right? Software development is hard, after all!

No. We’re looking at accidental complexity, not essential complexity. Here’s the same code, in a higher-level language, that removes some of the accidental complexity:

put peanut butter on a piece of bread
put jelly on another piece of bread
stick the peanut butter to the jelly

Essential complexity is when you start thinking, why jelly? Why not cinnamon and raisins with the peanut butter? Or currants? What kind of bread? Let’s use multigrain. Would peanut butter with jelly and banana be overkill? What to drink? Essential complexity looks at the problem, not the solution. Accidental complexity is when you say “I really want to do THIS but dammit, my language just won’t let me.” Or, “Gosh, we have so much code to move around, I can barely see what it does.” Or when you just can’t figure out where to put that parameter, or method, or class.

So what does this have to do with “less code”?

This is why we say YAGNI. If you add that method on a hunch that it’ll be helpful, you have more stuff to move around, more accidental complexity, more decisions to make about your housekeeping, all for a speculative benefit. It’s like playing lotto – you pay up front, and if you’re really lucky, you’ll win. But if you lose, you’ve wasted resources, and now you have something you need to throw away.

Each of the possible ways to code and refactor that sandwich code is pretty valid…any of them could be in our source control repository. A new hire is going to have to read through whichever one we coded, and try to mentally get from there, to the 3-liner at the end, before he can really be effective. Why don’t we just start him there?

Let’s take that 3 + 3 + 3 + 3 + 3 + 3 example again. What if we don’t use multiplication? We could still refactor it. The first two threes are kind of together, let’s group them: 6 + 3 + 3 + 3 + 3. And the last one looks kind of bulky, so let’s decompose it: 6 + 3 + 3 + 3 + 1 + 1 + 1. Could we move some of the numericality from the middle 3 to an earlier one? 6 + 3 + 4 + 2 + 1 + 1 + 1. Oh, and let’s sort, so it’s easier to find the numbers you want: 1 + 1 + 1 + 2 + 3 + 4 + 6. There! Is it immediately obvious to you that this is the same as 3 * 6? Of course not. Ralph Johnson calls refactoring “wiping dirt off a window,” and you just put more dirt on.
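For the skeptical, here’s a quick Python check (my own addition, not part of the original example) that every rearrangement above still adds up to 3 * 6. Nothing got broken along the way, and nothing got clearer either:

```python
# Each entry is one step of the rearrangement described above.
steps = [
    3 + 3 + 3 + 3 + 3 + 3,      # where we started
    6 + 3 + 3 + 3 + 3,          # group the first two threes
    6 + 3 + 3 + 3 + 1 + 1 + 1,  # decompose the last three
    6 + 3 + 4 + 2 + 1 + 1 + 1,  # shift one unit to an earlier term
    1 + 1 + 1 + 2 + 3 + 4 + 6,  # sort
]

# Every step preserves the behavior: the total never changes.
assert all(total == 3 * 6 for total in steps)
```

Behavior-preserving is the bare minimum for a refactoring; it says nothing about whether the result is easier to read.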

Why We Abstract, and What To Do When We Can’t

Whenever you see yourself writing the same thing down more than once, there’s something wrong and you shouldn’t be doing it, and the reason is not because it’s a waste of time to write something down more than once. It’s because there’s some idea here, a very simple idea, which has to do with the Sigma notation…not depending upon what it is I’m adding up. And I would like to be able to always…divide the things up into as many pieces as I can, each of which I understand separately. I would like to understand the way of adding things up, independently of what it is I’m adding up.

- Gerald Sussman, SICP Lecture 2a, “Higher-order Procedures” (emphasis added)

The purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise.

- Edsger W. Dijkstra, The Humble Programmer

What Larry Wall said about Perl holds true: “When you say something in a small language, it comes out big. When you say something in a big language, it comes out small.” The same is true for English. The reason that biologist Ernst Haeckel could say “Ontogeny recapitulates phylogeny” in only three words was that he had these powerful words with highly specific meanings at his disposal. We allow inner complexity of the language because it enables us to shift the complexity away from the individual utterance.

- Hal Fulton, The Ruby Way, Introduction (emphasis added)

Programming is our thoughts, and with better ways to express them, we can spend more time thinking them, and less time expressing them.

3 + 3 + 3 + 3 + 3 + 3 is hard…hard to read (how many threes?), hard to get right (I lost count!), hard to reason about (piles of operations!). 3 * 6 is easy, once you learn multiplication. This is a good trade-off. We should look for ways to add abstractions, new semantic levels, to our programs.

If you’re doing the same thing twice, stop, and look for the common idea. Peel the idea away from the context, from the details. Grasp the idea, and then use it over and over. As a bonus, you’ll type less, re-use code, and debug less.
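Here’s a minimal Python sketch of the idea in the Sussman quote: the way of adding things up is written once, and what gets added up is passed in. The names sigma, sum_of_integers, and sum_of_squares are my own choices for illustration, not taken from the lecture.

```python
from typing import Callable


def sigma(term: Callable[[int], int], a: int, b: int) -> int:
    """Add up term(i) for every integer i from a through b, inclusive."""
    return sum(term(i) for i in range(a, b + 1))


# Two different sums, one shared idea of "adding things up".
def sum_of_integers(a: int, b: int) -> int:
    return sigma(lambda i: i, a, b)


def sum_of_squares(a: int, b: int) -> int:
    return sigma(lambda i: i * i, a, b)


print(sum_of_integers(1, 10))  # 55
print(sum_of_squares(1, 3))    # 14
```

Once sigma exists, each new kind of sum is one line, and the loop is understood (and debugged) exactly once.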

“But I can’t find ways to do that!”

When you look at similar bits of code, and can’t find a good way to remove the duplication, you’re hitting the limits of either your language, or your knowledge.

Programming languages put up very real walls; they force you down their paths, often by leaving out features. A language without recursion puts up a wall in front of recursive solutions; a language without first-class functions makes it tough to write higher-order functions. Language limitations are the cause of Greenspun’s Tenth Rule.
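To make the first-class-functions point concrete, here’s a rough Python sketch of the sandwich’s search step. Everything in it is hypothetical: the fetch routine, the kitchen dictionary, and the then callback are names I made up for illustration. Because the “what to do once you find it” step is passed in as a function, the peanut butter search and the jelly search can share one routine, and the earlier question about where “put it on the counter” belongs mostly goes away.

```python
from typing import Callable, Iterable, Optional

# Hypothetical model of the kitchen: place name -> items stored there.
Kitchen = dict[str, set[str]]


def fetch(
    item: str,
    places: Iterable[str],
    kitchen: Kitchen,
    then: Callable[[str], None],
) -> Optional[str]:
    """Look for item in each place, in order.

    On the first hit, take the item out, hand it to `then`, and
    return the place it came from. Return None if it is nowhere.
    """
    for place in places:
        if item in kitchen[place]:
            kitchen[place].remove(item)
            then(item)
            return place
    return None


kitchen: Kitchen = {
    "fridge": {"jelly"},
    "cabinet": set(),
    "other cabinet": {"peanut butter"},
}
counter: list[str] = []

# "Put it on the counter" is just a function we pass along.
pb_home = fetch("peanut butter", ["cabinet", "other cabinet"], kitchen, counter.append)
jelly_home = fetch("jelly", ["fridge", "cabinet", "other cabinet"], kitchen, counter.append)
```

In a language where a function can’t be passed as a value, that then step would have to be hard-coded into the routine or duplicated at every call site, which is the wall in question.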

Sometimes, the language is not the problem. Sometimes you just can’t find your way through. This is why you read Refactoring, and Design Patterns, but really, this is why you learn other programming languages. Think about the right way to factor the problem.

If you can’t remove the duplication, you need to work around your language, or learn some new tricks.