The Art of Readable Code

I. Code should be easy to understand

II. Packing information into names

1. Choosing specific words

Ex: a ‘getPage(url)’ method. The word "get" doesn't really say much. Does this method get a page from a local cache, from a database, or from the Internet? If it is from the Internet, a more specific name might be ‘fetchPage()’ or ‘downloadPage()’.

Finding more colorful words

Key idea: It’s better to be clear and precise than to be cute.

Ex:

- send : deliver, dispatch, route, distribute …

- find : search, extract, locate, …. 

- start : launch, begin, create, open ….

- make :  create, setup, build, new, add …

2. Avoiding generic names like tmp, retval, …

Names like these don’t pack much information. There are, however, some cases where generic names do carry meaning. Let’s take a look at when it makes sense to use them. Ex:

        if (right < left) {
            tmp = right;
            right = left;
            left = tmp;
        }
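
By contrast, here is a case where ‘tmp’ is less suitable: being temporary storage isn’t the most important thing about the variable, so a name like ‘user_info’ would be more descriptive:
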
        String tmp = user.name();
        tmp += " " + user.phone_number();
        tmp += " " + user.email();
        template.set("user_info", tmp);

Advice: the name ‘tmp’ should be used only when being short-lived and temporary is the most important fact about the variable.

Loop Iterators

Names like (i, j, k) are common for loop iterators, but another choice would be (club_i, members_i, users_i) or (ci, mi, ui). For instance, the following loops find which users belong to which clubs:

        for (int i = 0; i < clubs.size(); i++)
            for (int j = 0; j < clubs[i].members.size(); j++)
                for (int k = 0; k < users.size(); k++)
                    if (clubs[i].members[k] == users[j])
                        cout << "user[" << j << "] is in club[" << i << "]" << endl;

In the if statement, members[] and users[] are using the wrong index. Bugs like these are hard to spot because that line of code seems fine in isolation:

        if (clubs[i].members[k] == users[j])

In this case, using more precise names may have helped. Instead of naming the loop indexes (i,j,k), another choice would be (club_i, members_i, users_i) or, more succinctly (ci, mi, ui). This approach would help the bug stand out more:

    if (clubs[ci].members[ui] == users[mi]) // Bug! First letters don't match up.

When used correctly, the first letter of the index would match the first letter of the array:

    if (clubs[ci].members[mi] == users[ui]) // OK. First letters match.
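
The same naming idea can be sketched in Python. This is only an illustration: the `clubs` and `users` data and the `find_memberships` helper are hypothetical and not part of the original example.

```python
# Hypothetical sample data: each club holds a list of member names.
clubs = [
    {"name": "chess", "members": ["ann", "bob"]},
    {"name": "go", "members": ["bob", "cy"]},
]
users = ["ann", "bob", "cy"]


def find_memberships(clubs, users):
    """Return (user index, club index) pairs for every user found in a club."""
    memberships = []
    for ci in range(len(clubs)):
        for mi in range(len(clubs[ci]["members"])):
            for ui in range(len(users)):
                # First letters line up: clubs[ci], members[mi], users[ui].
                if clubs[ci]["members"][mi] == users[ui]:
                    memberships.append((ui, ci))
    return memberships


for ui, ci in find_memberships(clubs, users):
    print(f"user[{ui}] is in club[{ci}]")
```

With these index names, a mismatched subscript such as `members[ui]` is easy to catch while reading.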

3. Using concrete names instead of abstract names

When naming a variable, function, or other element, describe it concretely rather than abstractly. For example, suppose you have an internal method named ServerCanStart(), which tests whether the server can listen on a given TCP/IP port. The name ServerCanStart() is somewhat abstract, though. A more concrete name would be CanListenOnPort(). This name directly describes what the method will do.
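
As a rough sketch of what such a concretely named function could look like in Python (the `host` parameter and the bind-based check are assumptions made for illustration, not the original implementation):

```python
import socket


def can_listen_on_port(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission problem.
            return False
    return True
```

The name tells the caller exactly what is being tested, so nothing has to be inferred from a vaguer notion of whether the server "can start".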

4. Attaching extra information to a name by using a suffix or prefix

If there’s something very important about a variable that the reader must know, it’s worth attaching an extra “word” to the name. For example, suppose you had a variable that contained a hexadecimal string:

        string id; // Example: "af84ef845cd8"

You might want to name it hexid instead, if it’s important for the reader to remember the ID’s format.

Values with Units

If your variable is a measurement (such as an amount of time or a number of bytes), it’s helpful to encode the units into the variable’s name. Ex:

        var start = (new Date()).getTime();  // ‘start’ should be renamed → start_ms (milliseconds)

        createCache(int size)                // size → size_mb

        throttleDownload(float limit)        // limit → max_kbps

Encoding other important attributes

| Situation | Variable name | Better name |
|---|---|---|
| Bytes of html have been converted to UTF-8 | html | html_utf8 |
| Incoming data has been url encoded | data | data_urlenc |
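
A small Python sketch of how unit suffixes read in practice. The helper names `elapsed_ms` and `cache_size_bytes` and the constant `BYTES_PER_MB` are hypothetical, chosen only to illustrate the convention:

```python
BYTES_PER_MB = 1024 * 1024


def elapsed_ms(start_ms, end_ms):
    """Return the elapsed time in milliseconds; both inputs are milliseconds."""
    return end_ms - start_ms


def cache_size_bytes(size_mb):
    """Convert a cache size in megabytes to bytes."""
    return size_mb * BYTES_PER_MB


print(elapsed_ms(start_ms=1000, end_ms=1250))  # 250
print(cache_size_bytes(size_mb=2))             # 2097152
```

Because the unit is part of each name, passing seconds where milliseconds are expected looks wrong at the call site.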

5. How long should a name be?

When picking a good name, there’s an implicit constraint that the name shouldn’t be too long. The longer a name is, the harder it is to remember, and the more space it consumes on the screen, possibly causing extra lines to wrap.

On the other hand, programmers can take this advice too far, using only single-word (or single-letter) names. So how should you manage this trade-off? How do you decide between naming a variable d, days, or days_since_last_update?

This decision is a judgment call whose best answer depends on exactly how that variable is being used. One suggestion: shorter names are okay for shorter scopes.

III. Names that can’t be misconstrued

**Key idea**: actively scrutinize your names by asking yourself, “what other meanings could someone interpret from this name?”

Ex: Filter() method

    results = Database.all_objects.filter("year <= 2017");

Question: What does results now contain?

  • Objects whose year is <= 2017
  • Objects whose year is not <= 2017

It’s unclear whether it means “to pick out” or “to get rid of.” It’s best to avoid the name filter because it’s so easily misconstrued. If you want to pick out, a better name is ‘select()’. If you want to get rid of, a better name is ‘exclude()’.
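
A minimal Python sketch of the two unambiguous alternatives. Both helpers are hypothetical stand-ins written for illustration and are not the API of any particular database library:

```python
def select(objects, predicate):
    """Keep only the objects for which predicate(obj) is true."""
    return [obj for obj in objects if predicate(obj)]


def exclude(objects, predicate):
    """Drop the objects for which predicate(obj) is true."""
    return [obj for obj in objects if not predicate(obj)]


years = [2015, 2017, 2019]
print(select(years, lambda year: year <= 2017))   # [2015, 2017]
print(exclude(years, lambda year: year <= 2017))  # [2019]
```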

Prefer min and max for (Inclusive) Limits

Ex: Let’s say your shopping cart application needs to stop people from buying more than 10 items at once:

        CART_TOO_BIG_LIMIT = 10
        if shopping_cart.number_items() > CART_TOO_BIG_LIMIT:
            Error("Too many items in cart")

The root problem is that CART_TOO_BIG_LIMIT is an ambiguous name: it’s not clear whether you mean “up to” or “up to and including”.

Advice: The clearest way to name a limit is to put max_ or min_ in front of the thing being limited.

In this case, the name should be MAX_ITEMS_IN_CART. The new code is simple and clear:

        MAX_ITEMS_IN_CART = 10
        if shopping_cart.number_items() > MAX_ITEMS_IN_CART:
            Error("Too many items in cart")
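
A runnable Python sketch of the inclusive limit. The `check_cart` function and its use of `ValueError` are assumptions made for illustration:

```python
MAX_ITEMS_IN_CART = 10  # inclusive: a cart may hold exactly this many items


def check_cart(num_items):
    """Raise ValueError if the cart holds more than MAX_ITEMS_IN_CART items."""
    if num_items > MAX_ITEMS_IN_CART:
        raise ValueError("Too many items in cart")


check_cart(10)  # allowed: exactly at the maximum
```

With the max_ prefix, the comparison `>` follows directly from the name, so the boundary case needs no guesswork.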

Prefer begin and end for Inclusive/Exclusive Ranges
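
One way to read this heading, sketched in Python under the assumption that `begin` is inclusive and `end` is exclusive (a half-open range, like Python's own `range()` and slicing). The `events_in_range` helper and the sample events are hypothetical:

```python
def events_in_range(events, begin, end):
    """Return the events whose timestamp t satisfies begin <= t < end."""
    return [(t, name) for (t, name) in events if begin <= t < end]


# Hypothetical (timestamp, name) events.
events = [(1, "login"), (5, "click"), (10, "logout")]
print(events_in_range(events, begin=1, end=10))  # [(1, 'login'), (5, 'click')]
```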

Naming Booleans

When picking a name for a boolean variable or a function that returns a boolean, be sure it’s clear what true and false really mean. Here’s a dangerous example:

        boolean read_password = true;

Depending on how you read it (no pun intended), there are two different interpretations:

  • we need to read the password
  • the password has already been read

In this case, it’s best to avoid the word ‘read’, and name it ‘need_password’ or ‘user_is_authenticated’ instead. In general, adding words like ‘is’, ‘has’, ‘can’, or ‘should’ can make booleans clearer.

Matching Expectations of Users

Some names are misleading because the user has a preconceived idea of what the name means, even though you mean something else.

Ex: get()

Many programmers are used to the convention that methods starting with get are ‘lightweight accessors’ that simply return an internal member. Going against this convention is likely to mislead those users.

Here’s an example, in Java, of what not to do:

        public class StatisticsCollector { 
            public void addSample(double x) { ... } 

            public double getMean() { 
                // Iterate through all samples and return total / num_samples 
            }
            ... 
        }

In this case, the implementation of ‘getMean()’ is to iterate over past data and calculate the mean on the fly. This step might be very expensive if there’s a lot of data. But an unsuspecting programmer might call ‘getMean()’ carelessly, assuming that it’s an inexpensive call.

Instead, the method should be renamed to something like ‘computeMean()’, which sounds more like an expensive operation.
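
A minimal Python sketch of the renamed method. The original example is in Java; the internal `_samples` list and the empty-collector error here are assumptions made for illustration:

```python
class StatisticsCollector:
    """Stores every sample and computes statistics on demand."""

    def __init__(self):
        self._samples = []

    def add_sample(self, x):
        self._samples.append(x)

    def compute_mean(self):
        """Iterate over all samples; the cost grows with the sample count."""
        if not self._samples:
            raise ValueError("no samples collected yet")
        return sum(self._samples) / len(self._samples)


collector = StatisticsCollector()
for sample in (1.0, 2.0, 3.0):
    collector.add_sample(sample)
print(collector.compute_mean())  # 2.0
```

The "compute" prefix signals that each call does real work, which nudges callers to cache the result rather than call it in a tight loop.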