Introduction to the Book “The Art of Readable Code” (2)

This book by Dustin Boswell and Trevor Foucher focuses on simple, practical techniques for writing better code that you can apply every time you write code. The authors introduce two key ideas:

i. Code should be easy to understand.

ii. Code should be written to minimize the time it would take for someone else to understand it.

Using easy-to-digest code examples from different languages, each chapter dives into a different aspect of coding and demonstrates how you can make your code easy to understand:

I. Simplify naming, commenting and formatting with tips that apply to every line of code

II. Refine your program's loops, logic and variables to reduce complexity and confusion

III. Attack problems at the function level, such as reorganizing blocks of code to do one task at a time.

IV. Write effective test code that is thorough and concise, as well as readable.

II. Refine your program’s loops, logic and variables to reduce complexity and confusion

1. Making Control Flow Easy to Read

There are a number of things you can do to make your code’s control flow easier to read. When writing a comparison, it’s better to put the changing value on the left and the more stable value on the right: write while (bytes_received < bytes_expected) rather than while (bytes_expected > bytes_received).

1.2. The Order of if/else Blocks

When writing an if/else statement, you are usually free to swap the order of the blocks. For instance, you can write it as:

if (a == b) {
  // Case One ...
} else {
  // Case Two ...
}

or as:

if (a != b) {
  // Case Two ...
} else {
  // Case One ...
}

You may not have given much thought to this before, but in some cases there are good reasons to prefer one order over the other:

• Prefer dealing with the positive case first instead of the negative—e.g., if (debug) instead of if (!debug).

• Prefer dealing with the simpler case first to get it out of the way. This approach might also allow both the if and the else to be visible on the screen at the same time, which is nice.

• Prefer dealing with the more interesting or conspicuous case first.

Sometimes these preferences conflict, and you have to make a judgment call. But in many cases, there is a clear winner.

1.3. The ?: Conditional Expression (a.k.a. “Ternary Operator”)

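The conditional expression is convenient for simple cases, but it becomes hard to read once the branches get complicated. By default, use if/else and reserve ?: for the simplest cases.

KEY IDEA: Instead of minimizing the number of lines, a better metric is to minimize the time needed for someone to understand it.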
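As a rough illustration of that metric, here is a sketch in Python, whose conditional expression (x if cond else y) plays the role of ?:. The function names and the mantissa/exponent scenario are assumptions chosen for illustration, not taken from the text above.

```python
def meridiem(hour: int) -> str:
    """Simple case: a conditional expression reads well on one line."""
    return "pm" if hour >= 12 else "am"


def scale_compact(mantissa: float, exponent: int) -> float:
    """Compact, but the reader has to untangle both branches at once."""
    return mantissa * (1 << exponent) if exponent >= 0 else mantissa / (1 << -exponent)


def scale_readable(mantissa: float, exponent: int) -> float:
    """Same logic as scale_compact, spelled out with if/else."""
    if exponent >= 0:
        return mantissa * (1 << exponent)
    else:
        return mantissa / (1 << -exponent)
```

The second and third functions compute the same value; the longer one takes more lines but arguably less time to understand.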

1.4. Avoid do/while Loops

What’s weird about a do/while loop is that a block of code may or may not be reexecuted based on a condition underneath it. Typically, logical conditions are above the code they guard—this is the way it works with if, while, and for statements. Because you typically read code from top to bottom, this makes do/while a bit unnatural. Many readers end up reading the code twice. while loops are easier to read because you know the condition for all iterations before you read the block of code inside. But it would be silly to duplicate code just to remove a do/while. Another reason to avoid do/while is that the continue statement can be confusing inside it. For instance, what does this code do?

do {
  continue;
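} while (false);

It runs the body just once: continue jumps to the condition check, which is false, so the loop exits. Most readers have to stop and think to work that out.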

1.5. Returning Early from a Function

Some coders believe that functions should never have multiple return statements. This is nonsense. Returning early from a function is perfectly fine—and often desirable. For example:

public boolean Contains(String str, String substr) {
  if (str == null || substr == null) return false;
  if (substr.equals("")) return true;
  ...
}

Implementing this function without these “guard clauses” would be very unnatural.
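The same guard-clause shape can be sketched in Python. The body after the guards is elided in the snippet above, so the final search line here is an assumption added only to make the example complete, and the name contains is a Python-style rendering of the original.

```python
from typing import Optional


def contains(text: Optional[str], substr: Optional[str]) -> bool:
    """Return True if substr occurs in text, treating None as 'no match'."""
    # Guard clauses: handle the special cases first and return immediately.
    if text is None or substr is None:
        return False
    if substr == "":
        return True

    # Main logic (assumed), reached only with two usable strings.
    return text.find(substr) != -1
```

Once the guards are out of the way, the rest of the function can assume it has valid input.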

1.6. The Infamous goto

The problems can come when there are multiple goto targets, especially when their paths cross. In particular, gotos that go upward can make for real spaghetti code, and they can surely be replaced with structured loops. Most of the time, goto should be avoided.

1.7. Minimize Nesting

Consider the following code:

if (user_result == SUCCESS) {
  if (permission_result != SUCCESS) {
    reply.WriteErrors("error reading permissions");
    reply.Done();
    return;
  }
  reply.WriteErrors("");
} else {
  reply.WriteErrors(user_result);
}
reply.Done();

1.7.1. How Nesting Accumulates

Look at your code from a fresh perspective when you’re making changes. Step back and look at it as a whole.

1.7.2. Removing Nesting by Returning Early

We can improve the code above by handling each failure case and returning early:

if (user_result != SUCCESS) {
  reply.WriteErrors(user_result);
  reply.Done();
  return;
}

if (permission_result != SUCCESS) {
  reply.WriteErrors(permission_result);
  reply.Done();
  return;
}

reply.WriteErrors("");
reply.Done();

This code only has one level of nesting, instead of two. But more importantly, the reader never has to “pop” anything from his mental stack—every if block ends in a return.
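To check that flattening does not change behavior, here is a sketch of both shapes in Python. The Reply class, its method names, and the SUCCESS constant are hypothetical stand-ins, and both versions report the permission result itself so that they can be compared directly.

```python
SUCCESS = "SUCCESS"  # hypothetical status value


class Reply:
    """Hypothetical stand-in for the reply object; it only records calls."""

    def __init__(self):
        self.errors = []
        self.done = False

    def write_errors(self, message):
        self.errors.append(message)

    def finish(self):
        self.done = True


def respond_nested(user_result, permission_result, reply):
    """Two levels of nesting: the reader must track which branch is open."""
    if user_result == SUCCESS:
        if permission_result != SUCCESS:
            reply.write_errors(permission_result)
            reply.finish()
            return
        reply.write_errors("")
    else:
        reply.write_errors(user_result)
    reply.finish()


def respond_flat(user_result, permission_result, reply):
    """Same behavior; each failure case is handled and left immediately."""
    if user_result != SUCCESS:
        reply.write_errors(user_result)
        reply.finish()
        return

    if permission_result != SUCCESS:
        reply.write_errors(permission_result)
        reply.finish()
        return

    reply.write_errors("")
    reply.finish()
```

Running both functions over every combination of success and failure should leave the two Reply objects in the same state.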

1.7.3. Removing Nesting Inside Loops

The technique of returning early isn’t always applicable. For example, here’s a case of code nested in a loop:

for (int i = 0; i < results.size(); i++) {
  if (results[i] != NULL) {
    non_null_count++;
    if (results[i]->name != "") {
      cout << "Considering candidate..." << endl;
      ...
    }
  }
}

Inside a loop, the analogous technique to returning early is to continue:

for (int i = 0; i < results.size(); i++) {
  if (results[i] == NULL) continue;
  non_null_count++;

  if (results[i]->name == "") continue;
  cout << "Considering candidate..." << endl;
  ...
}
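A minimal Python sketch of the same idea follows. The Result type and the candidate_names function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    """Hypothetical record type standing in for the loop's elements."""
    name: str


def candidate_names(results: List[Optional[Result]]) -> List[str]:
    """Collect the names worth considering, skipping unusable entries."""
    candidates = []
    for result in results:
        if result is None:
            continue  # nothing to inspect
        if result.name == "":
            continue  # unnamed results are not candidates
        candidates.append(result.name)
    return candidates
```

Each continue acts like an early return for a single iteration, so the main work stays at one level of indentation.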

1.8. Can You Follow the Flow of Execution?

| Programming construct | How high-level program flow gets obscured |
| --- | --- |
| threading | It’s unclear what code is executed when. |
| signal/interrupt handlers | Certain code might be executed at any time. |
| exceptions | Execution can bubble up through multiple function calls. |
| function pointers & anonymous functions | It’s hard to know exactly what code is going to run because that isn’t known at compile time. |
| virtual methods | object.virtualMethod() might invoke code of an unknown subclass. |

2. Breaking Down Giant Expressions

Giant expressions are hard to think about. This chapter showed a number of ways to break them down so the reader can digest them piece by piece.

One simple technique is to introduce “explaining variables” that capture the value of some large sub-expression. This approach has three benefits:

• It breaks down a giant expression into pieces.

• It documents the code by describing the subexpression with a succinct name.

• It helps the reader identify the main “concepts” in the code.

Another technique is to manipulate your logic using De Morgan’s laws, which can sometimes rewrite a boolean expression in a cleaner way (e.g., if (!(a && !b)) turns into if (!a || b)).

In fact, all of the improved-code examples in this chapter had if statements with no more than two values inside them. This setup is ideal. It may not always seem possible to do this—sometimes it requires “negating” the problem or considering the opposite of your goal.

2.1. Explaining Variables

The simplest way to break down an expression is to introduce an extra variable that captures a smaller sub-expression. This extra variable is sometimes called an “explaining variable” because it helps explain what the subexpression means.

Here is an example:

if line.split(':')[0].strip() == "root":
    ...

Here is the same code, now with an explaining variable:

username = line.split(':')[0].strip()
if username == "root":
    ...

2.2. Summary Variables

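Even when an expression does not need explaining, it can still be useful to capture it in a new variable. The authors call this a summary variable when its purpose is simply to replace a larger chunk of code with a smaller name that is easier to manage and think about. For example, consider the expressions in this code: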

if (request.user.id == document.owner_id) {
  // user can edit this document...
}
...
if (request.user.id != document.owner_id) {
  // document is read-only...
}

The main concept in this code is, “Does the user own the document?” That concept can be stated more clearly by adding a summary variable:

final boolean user_owns_document = (request.user.id == document.owner_id);
if (user_owns_document) {
  // user can edit this document...
}
...
if (!user_owns_document) {
  // document is read-only...
}

2.3. Using De Morgan’s Laws

If your code is:

if (!(file_exists && !is_protected))
  Error("Sorry, could not read file.");

It can be rewritten to:

if (!file_exists || is_protected)
  Error("Sorry, could not read file.");
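One way to convince yourself that such a rewrite is safe is to compare both forms over every input. The sketch below does this in Python; the two function names are hypothetical, and each returns True exactly when the error would be reported.

```python
from itertools import product


def cannot_read_original(file_exists: bool, is_protected: bool) -> bool:
    """The condition as first written: not (A and not B)."""
    return not (file_exists and not is_protected)


def cannot_read_rewritten(file_exists: bool, is_protected: bool) -> bool:
    """After De Morgan's law: (not A) or B."""
    return (not file_exists) or is_protected


# Exhaustively compare the two forms over every combination of inputs.
all_inputs = list(product([False, True], repeat=2))
forms_agree = all(
    cannot_read_original(f, p) == cannot_read_rewritten(f, p)
    for f, p in all_inputs
)
```

With only two boolean inputs there are four cases, so an exhaustive check is cheap and leaves no doubt that the forms are equivalent.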

2.4. Abusing Short-Circuit Logic

Here is an example of a statement once written by one of the authors:

assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());

In English, what this code is saying is, “Get the bucket for this key. If the bucket is not null, then make sure it isn’t occupied.” Even though it’s only one line of code, it really makes most programmers stop and think. Now compare it to this code:

bucket = FindBucket(key);
if (bucket != NULL) assert(!bucket->IsOccupied());

It does exactly the same thing, and even though it’s two lines of code, it’s much easier to understand.

KEY IDEA: Beware of “clever” nuggets of code—they’re often confusing when others read the code later.
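The same contrast can be reproduced in Python 3.8 or later, where an assignment expression (:=) allows a similarly dense one-liner. This is only a sketch: Bucket, BUCKETS, find_bucket, and the two check functions are hypothetical names modeled on the example above.

```python
from typing import Dict, Optional


class Bucket:
    """Hypothetical bucket that only knows whether it is occupied."""

    def __init__(self, occupied: bool) -> None:
        self.occupied = occupied

    def is_occupied(self) -> bool:
        return self.occupied


BUCKETS: Dict[str, Bucket] = {"free": Bucket(False), "taken": Bucket(True)}


def find_bucket(key: str) -> Optional[Bucket]:
    """Return the bucket for key, or None if there is none."""
    return BUCKETS.get(key)


def check_clever(key: str) -> None:
    # One line that leans on short-circuit `or` plus an embedded assignment.
    assert not (bucket := find_bucket(key)) or not bucket.is_occupied()


def check_plain(key: str) -> None:
    # Two steps: fetch the bucket, then check it only if it exists.
    bucket = find_bucket(key)
    if bucket is not None:
        assert not bucket.is_occupied()
```

Both functions accept a missing or free bucket and raise AssertionError for an occupied one, but the second states that rule in the order a reader would say it aloud.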

2.5. Breaking Down Giant Statements

void AddStats(const Stats& add_from, Stats* add_to) {
  add_to->set_total_memory(add_from.total_memory() + add_to->total_memory());
  add_to->set_free_memory(add_from.free_memory() + add_to->free_memory());
  add_to->set_swap_memory(add_from.swap_memory() + add_to->swap_memory());
  add_to->set_status_string(add_from.status_string() + add_to->status_string());
  add_to->set_num_processes(add_from.num_processes() + add_to->num_processes());
  ...
}

Once again, your eyes are faced with code that’s long and similar, but not exactly the same. After ten seconds of careful scrutiny, you might realize that each line is doing the same thing, just to a different field each time:

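add_to->set_XXX(add_from.XXX() + add_to->XXX());

In C++, we can define a macro to implement this:

void AddStats(const Stats& add_from, Stats* add_to) {
  #define ADD_FIELD(field) add_to->set_##field(add_from.field() + add_to->field())

  ADD_FIELD(total_memory);
  ADD_FIELD(free_memory);
  ADD_FIELD(swap_memory);
  ADD_FIELD(status_string);
  ADD_FIELD(num_processes);
  ...
  #undef ADD_FIELD
}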
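In a language without macros, the same repetition can be factored out another way. The sketch below is a different technique from the C++ macro: it loops over field names with getattr and setattr. The Stats dataclass and the ADDED_FIELDS tuple are assumptions that mirror the field names used above.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Hypothetical stats record with the fields named in the example."""
    total_memory: int = 0
    free_memory: int = 0
    swap_memory: int = 0
    status_string: str = ""
    num_processes: int = 0


# Field names to accumulate; listing them once replaces the repeated lines.
ADDED_FIELDS = ("total_memory", "free_memory", "swap_memory",
                "status_string", "num_processes")


def add_stats(add_from: Stats, add_to: Stats) -> None:
    """Add each listed field of add_from into add_to, in place."""
    for field in ADDED_FIELDS:
        combined = getattr(add_from, field) + getattr(add_to, field)
        setattr(add_to, field, combined)
```

The trade-off is similar to the macro's: the pattern is stated once, at the cost of looking fields up by name at run time.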

3. Variables and Readability

This chapter is about how the variables in a program can quickly accumulate and become too much to keep track of. You can make your code easier to read by having fewer variables and making them as “lightweight” as possible. Specifically:

3.1. Eliminate variables that just get in the way.

In particular, the authors show a few examples of how to eliminate “intermediate result” variables by handling the result immediately. For example, this JavaScript function:


var remove_one = function (array, value_to_remove) {
  var index_to_remove = null;
  for (var i = 0; i < array.length; i += 1) {
    if (array[i] === value_to_remove) {
      index_to_remove = i;
      break;
    }
  }
  if (index_to_remove !== null) {
    array.splice(index_to_remove, 1);
  }
};

can be changed to:


var remove_one =  function (array, value_to_remove) {
  for (var i = 0; i < array.length; i += 1) {
    if (array[i] === value_to_remove) {
      array.splice(i, 1);
      return;
    }
  }
};
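The same simplification carries over to Python. This is a sketch with a Python-style rendering of the function name; it acts on the match as soon as it is found, so no index variable has to be carried out of the loop.

```python
def remove_one(items: list, value_to_remove) -> None:
    """Remove the first element equal to value_to_remove, in place."""
    for i, item in enumerate(items):
        if item == value_to_remove:
            del items[i]
            return  # act on the match right away; nothing left to remember
```

In everyday Python the built-in list.remove does the same job, except that it raises ValueError when the value is absent, whereas this version silently does nothing.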

3.2. Reduce the scope of each variable to be as small as possible.

Move each variable to a place where the fewest lines of code can see it. Out of sight is out of mind.

For example, suppose you have a very large class, with a member variable that’s used by only two methods, in the following way:


class LargeClass {
  string str_;
  void Method1() {
    str_ = ...;
    Method2();
  }
  void Method2() {
    // Uses str_
  }
};

can be rewritten as:


class LargeClass {
  void Method1() {
    string str = ...;
    Method2(str);
  }
  void Method2(string str) {
    // Uses str
  }
};
// Now other methods can't see str.

Another way to restrict access to class members is to make as many methods static as possible. Static methods are a great way to let the reader know “these lines of code are isolated from those variables.” Or another approach is to break the large class into smaller classes.

3.3. if Statement Scope in C++

Suppose you have the following C++ code:


PaymentInfo* info = database.ReadPaymentInfo();
if (info) {
  cout << "User paid: " << info->amount() << endl;
}
// Many more lines of code below ...

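Here info stays in scope for the rest of the function, so the reader has to keep it in mind. If info is needed only inside the if statement, C++ lets you define it in the condition itself: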


if (PaymentInfo* info = database.ReadPaymentInfo()) {
  cout << "User paid: " << info->amount() << endl;
}

3.4. Prefer write-once variables.

Variables that are set only once (or const, final, or otherwise immutable) make code easier to understand. Variables that are a “permanent fixture” are easier to think about. Certainly, constants like: static const int NUM_THREADS = 10;

don’t require much thought on the reader’s part. And for the same reason, use of const in C++ (and final in Java) is highly encouraged.

(Continues)