
@viphat
Last active September 7, 2022 05:03
The Art of Readable Code

The goal of this book is to help you make your code better. And when we say "code", we literally mean the lines of code you are staring at in your editor. We’re not talking about the overall architecture of your project, or your choice of design patterns. Those are certainly important, but in our experience most of our day-to-day lives as programmers are spent on the “basic” stuff, like naming variables, writing loops, and attacking problems down at the function level. And a big part of this is reading and editing the code that’s already there.

KEY IDEA 1 - Code should be easy to understand.

KEY IDEA 2 - Code should be written to minimize the time it would take for someone else (maybe you, six months later) to understand it.

Is smaller always better?

The less code you write to solve a problem, the better. It probably takes less time to understand a 2000-line class than a 5000-line class.

But fewer lines isn't always better! There are plenty of times when a one-line expression like:

assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());

takes more time to understand than if it were two lines:

bucket = FindBucket(key);
if (bucket != NULL) assert(!bucket->IsOccupied());

Similarly, a comment can make you understand the code more quickly, even though it 'adds code' to the file:

// Fast version of "hash = (65599 * hash) + c"
hash = (hash << 6) + (hash << 16) - hash + c;

So even though having fewer lines of code is a good goal, minimizing the time-till-understanding is an even better goal.
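The comment in the hash example works because the shifts expand to 64 * hash + 65536 * hash - hash, which is 65599 * hash. As a rough check, the sketch below compares the two forms in Python. The function names slow_hash_step and fast_hash_step are made up for this illustration, and since Python integers never overflow, it only confirms the arithmetic identity, not the behavior of fixed-width integers in C or C++.

```python
def slow_hash_step(hash_value: int, c: int) -> int:
    """The readable form named in the comment."""
    return 65599 * hash_value + c


def fast_hash_step(hash_value: int, c: int) -> int:
    """The shift-based form: (2**6 + 2**16 - 1) * hash_value + c."""
    return (hash_value << 6) + (hash_value << 16) - hash_value + c


# Both forms agree for a range of sample inputs.
for h in range(0, 5000, 37):
    for c in (0, 1, 97, 255):
        assert slow_hash_step(h, c) == fast_hash_step(h, c)
```

Without the comment, a reader would have to redo this expansion by hand to see what the shifts compute.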


What about making code efficient, well-architected, easy to test, and so on? Don't these goals sometimes conflict with wanting to make code easy to understand?

It's always important to step back and ask, "Is this code easy to understand?" If so, it's probably fine to move on to other code.


Although each change may seem small, in aggregate they can make a huge improvement to a codebase. If your code has great names, well-written comments, and clean use of whitespace, your code will be much easier to read.


Naming Convention:

KEY IDEA 3 - Pack information into your names.

  • Choosing specific words
  • Avoiding generic names (or knowing when to use them)
  • Using concrete names instead of abstract names
  • Attaching extra information to a name, by using a suffix or prefix
  • Deciding how long a name should be
  • Using name formatting to pack extra information

Choose Specific Words

Choose words that are very specific and avoid 'empty' words. For example, the word get is very unspecific, as in this example:

def GetPage(url):

The name GetPage() doesn't really say much. Does this method get a page from a local cache, from a database, or from the Internet? If it's from the Internet, a more specific name might be FetchPage() or DownloadPage().

The name Size() doesn't convey much information. A more specific name would be Height(), NumNodes(), MemoryBytes(), etc.
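To make the Size() point concrete, here is a small sketch of a binary tree where "size" could plausibly mean two different things. The TreeNode type and the snake_case names height and num_nodes are assumptions for this illustration (Python spellings of Height() and NumNodes()), not code from the book.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    value: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def height(node: Optional[TreeNode]) -> int:
    """Number of nodes on the longest path from this node down to a leaf."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def num_nodes(node: Optional[TreeNode]) -> int:
    """Total number of nodes in the subtree rooted at this node."""
    if node is None:
        return 0
    return 1 + num_nodes(node.left) + num_nodes(node.right)


tree = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3))
print(height(tree))     # 3
print(num_nodes(tree))  # 4
```

A single size() function would force the reader to open the implementation to learn which of these two numbers it returns.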

The name Stop() is okay, but depending on what exactly it does, there might be a more specific name: Kill() if it's a heavyweight operation that can't be undone, or Pause() if there is a way to Resume() it.

Finding more colorful words

Don’t be afraid to use a thesaurus or ask a friend for better name suggestions. English is a rich language, and there are a lot of words to choose from.

  • send ~ deliver, dispatch, announce, distribute, route
  • find ~ search, extract, locate, recover
  • start ~ launch, create, begin, open
  • make ~ create, set up, build, generate, compose, add, new

KEY IDEA 4 - It's better to be clear and precise than to be cute.

Avoid Generic Names like tmp and retval

Instead of using an empty name like this, pick a name that describes the entity's value or purpose.

Using a descriptive name instead of a generic one will sometimes help you detect a bug.
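As an illustration of how a descriptive name can expose a bug that a generic one hides, consider a function that computes the Euclidean norm of a vector. This is a sketch written for these notes; the function names are assumptions. In the first version, nothing about retval tells the reader that the accumulation is wrong. In the second, the name sum_squares states what is being accumulated, so a line that adds x instead of x * x would look suspicious immediately.

```python
import math
from typing import Sequence


def euclidean_norm_buggy(v: Sequence[float]) -> float:
    retval = 0.0
    for x in v:
        retval += x  # Bug, but nothing about the name 'retval' hints at it.
    return math.sqrt(retval)


def euclidean_norm(v: Sequence[float]) -> float:
    sum_squares = 0.0
    for x in v:
        # With this name, "sum_squares += x" would look wrong right away:
        # where is the square that we are summing?
        sum_squares += x * x
    return math.sqrt(sum_squares)


print(euclidean_norm([3.0, 4.0]))  # 5.0
```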

tmp or temp

if (right < left) {
    tmp = right;
    right = left;
    left = tmp;
}

In cases like these, the name tmp is perfectly fine. The variable's sole purpose is temporary storage, with a lifetime of only a few lines.

But here's a case where tmp is just used out of laziness:

String tmp = user.name();
tmp += " " + user.phone_number();
tmp += " " + user.email();
...
template.set("user_info", tmp);

Even though this variable has a short lifespan, being temporary storage isn't the most important thing about this variable. Instead, a name like user_info would be more descriptive.
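The same snippet with the better name might look like the sketch below. The User dataclass and the format_user_info function are hypothetical stand-ins for whatever the real code uses.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical record type, used only for this illustration.
    name: str
    phone_number: str
    email: str


def format_user_info(user: User) -> str:
    # 'user_info' says what the value is; 'tmp' would only say it is short-lived.
    user_info = user.name
    user_info += " " + user.phone_number
    user_info += " " + user.email
    return user_info


print(format_user_info(User("Ada", "555-0100", "ada@example.com")))
# Ada 555-0100 ada@example.com
```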

In the following case, tmp should be in the name, but just as a part of it:

tmp_file = tempfile.NamedTemporaryFile()
...
SaveData(tmp_file, ...)

Notice that we named the variable tmp_file and not just tmp, because it is a file object. Imagine if we just called it tmp:

SaveData(tmp, ...)

Looking at just this one line of code, it isn’t clear if tmp is a file, a filename, or maybe even the data being written.

Loop Iterators

Names like i, j, iter, and it are commonly used as indices and loop iterators. (In fact, if you used one of these names for some other purpose, it would be confusing, so don't do that.) But sometimes there are better iterator names than i, j, and k:

for (int i = 0; i < clubs.size(); i++)
    for (int j = 0; j < clubs[i].members.size(); j++)
        for (int k = 0; k < users.size(); k++)
            if (clubs[i].members[k] == users[j])
                cout << "user[" << j << "] is in club[" << i << "]" << endl;

In the if statement, members[] and users[] are using the wrong index. Bugs like these are hard to spot because that line of code seems fine in isolation:

if (clubs[i].members[k] == users[j])

In this case, using more precise names may have helped. You could name them club_i, member_i, and user_i, or more succinctly ci, mi, and ui. This approach would help the bug stand out more:

if (clubs[ci].members[ui] == users[mi])  // Bug! First letters don't match up.
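For comparison, here is a sketch of the corrected loop using the ci, mi, ui naming. The Club dataclass and the find_memberships function are assumptions made for this example. Once the first letters line up (members[mi], users[ui]), the indexing can be checked at a glance.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Club:
    # Hypothetical record type for this sketch.
    name: str
    members: List[str]


def find_memberships(clubs, users):
    """Return (user index, club index) pairs for every user found in a club."""
    memberships = []
    for ci, club in enumerate(clubs):
        for mi in range(len(club.members)):
            for ui in range(len(users)):
                # First letters match: members[mi] and users[ui].
                if clubs[ci].members[mi] == users[ui]:
                    memberships.append((ui, ci))
    return memberships


clubs = [Club("chess", ["ann", "bob"]), Club("go", ["bob"])]
users = ["bob", "cat", "ann"]
print(find_memberships(clubs, users))  # [(2, 0), (0, 0), (0, 1)]
```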

As you’ve seen, there are some situations where generic names are useful. A lot of the time, they’re overused out of pure laziness. This is understandable—when nothing better comes to mind, it’s easier to just use a meaningless name like foo and move on. But if you get in the habit of taking an extra few seconds to come up with a good name, you’ll find your naming muscle builds quickly.

Prefer Concrete Names over Abstract Names

For example, suppose you have an internal method named ServerCanStart(), which tests whether the server can listen on a given TCP/IP port. The name ServerCanStart() is somewhat abstract, though. A more concrete name would be CanListenOnPort(). This name directly describes what the method will do.
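A minimal sketch of what a method with the concrete name might do, assuming the check is simply "can a TCP socket be bound to this port right now". The host default and the bind-only approach are assumptions for illustration, not the book's implementation.

```python
import socket


def can_listen_on_port(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permissions error.
            return False
    return True
```

The name matches the body exactly, so a caller does not need to guess what "can start" covers. Note that a check like this is inherently racy: another process may take the port between the check and the real bind.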

Don't try to smash two orthogonal ideas into one name. (Following the Single Responsibility Principle can make it easier to name a method.)

Attaching Extra Information to a Name

Values with Units

var start = (new Date()).getTime();  // top of the page
...
var elapsed = (new Date()).getTime() - start;  // bottom of the page
document.writeln("Load time was: " + elapsed + " seconds");

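This code has a bug that is hard to see: getTime() returns milliseconds, not seconds. Appending _ms to the variable names makes the units explicit: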

var start_ms = (new Date()).getTime();  // top of the page
...
var elapsed_ms = (new Date()).getTime() - start_ms;  // bottom of the page
document.writeln("Load time was: " + elapsed_ms / 1000 + " seconds");

Other function parameters that read better with units attached:

  • Start(int delay): delay -> delay_secs
  • CreateCache(int size): size -> size_mb
  • ThrottleDownload(float limit): limit -> max_kbps
  • Rotate(float angle): angle -> degrees_cw
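The same idea carries over to other languages. Below is a small Python sketch in which every time value carries its unit in the name; format_load_time is a hypothetical helper. Because time.monotonic() returns seconds, the conversion to milliseconds is visible at the point where start_ms is assigned.

```python
import time


def format_load_time(elapsed_ms: float) -> str:
    """Format an elapsed time given in milliseconds as a message in seconds."""
    elapsed_secs = elapsed_ms / 1000
    return f"Load time was: {elapsed_secs} seconds"


start_ms = time.monotonic() * 1000   # monotonic() returns seconds
time.sleep(0.01)
elapsed_ms = time.monotonic() * 1000 - start_ms
print(format_load_time(elapsed_ms))
```

With the unit suffixes in place, an expression that passes elapsed_ms where elapsed_secs is expected stands out.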

Encoding Other Important Attributes

Many security exploits come from not realizing that some data your program receives is not yet in a safe state. For this, you might want to use variable names like untrustedUrl or unsafeMessageBody. After calling functions that cleanse the unsafe input, the resulting variables might be trustedUrl or safeMessageBody.

  • A password is in “plaintext” and should be encrypted before further processing: password -> plaintext_password
  • A user-provided comment needs escaping before being displayed: comment -> unescaped_comment
  • Bytes of html have been converted to UTF-8: html -> html_utf8
  • Incoming data has been "url encoded": data -> data_urlenc

You shouldn’t use attributes like unescaped_ or _utf8 for every variable in your program. They’re most important in places where a bug can easily sneak in if someone mistakes what the variable is, especially if the consequences are dire, as with a security bug. Essentially, if it’s a critical thing to understand, put it in the name.
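As a sketch of how such a prefix can guard against a security bug, the function below takes a parameter named unescaped_comment and escapes it with the standard library's html.escape before embedding it in markup. The render_comment function itself is a made-up example.

```python
import html


def render_comment(unescaped_comment: str) -> str:
    """Wrap a user-provided comment in a paragraph tag, escaping it first."""
    escaped_comment = html.escape(unescaped_comment)
    return f"<p>{escaped_comment}</p>"


print(render_comment("<script>alert('hi')</script>"))
# <p>&lt;script&gt;alert(&#x27;hi&#x27;)&lt;/script&gt;</p>
```

If someone later wrote the unescaped_comment value straight into the output, the name alone would make the mistake easy to catch in review.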

How Long Should a Name Be?

How do you decide between naming a variable d, days, or days_since_last_update?

The answer depends on exactly how the variable is being used.

Shorter Names Are Okay for Shorter Scope

Identifiers that have a small scope (how many other lines of code can "see" this name) don't need to carry as much information.

if (debug) {
    map<string,int> m;
    LookUpNamesNumbers(&m);
    Print(m);
}

You can get away with shorter names here because all the relevant information (what type the variable is, its initial value, how it’s destroyed) is easy to see.

Even though m doesn’t pack any information, it’s not a problem, because the reader already has all the information she needs to understand this code.

Typing Long Names—Not a Problem Anymore

With autocompletion in most editors, typing long names is no longer a problem.

Acronyms and abbreviations can be confusing.

You can use abbreviations, but stick to common ones, and always ask yourself whether a new teammate would understand the name you just chose.

KEY IDEA 5 - Would a new teammate understand what the name means?

Throwing Out Unneeded Words

Some words can be dropped from a name without losing any meaning. For example, ConvertToString() can become ToString(), and DoServeLoop() can become ServeLoop().

@viphat

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics

Personal Style versus Consistency

There are certain aesthetic choices that just boil down to personal style. For instance, where the open brace for a class definition should go:

class Logger {
    ...
};

or

class Logger
{
    ...
};

If one of these styles is chosen over the other, it doesn’t substantially affect the readability of the codebase. But if these two styles are mixed throughout the code, it does affect the readability.

We’ve worked on many projects where we felt like the team was using the “wrong” style, but we followed the project conventions because we knew that consistency is far more important.

KEY IDEA 6 - Consistent style is more important than the “right” style.

SUMMARY CHAPTER 4

Everyone prefers to read code that's aesthetically pleasing. By "formatting" your code in a consistent, meaningful way, you make it easier and faster to read.

Here are specific techniques we discussed:

  • If multiple blocks of code are doing similar things, try to give them the same silhouette.
  • Aligning parts of the code into "columns" can make code easy to skim through.
  • If code mentions A, B, and C in one place, don't say B, C, and A in another. Pick a meaningful order and stick with it.
  • Use empty lines to break apart large blocks into logical "paragraphs"
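Two of these techniques, a fixed field order and logical "paragraphs", can be sketched in a few lines. The function summarize_friends and its dictionary keys are invented for this illustration. The fields always appear in the order name, email, city, and blank lines separate the three steps.

```python
from typing import Dict, List


def summarize_friends(friends: List[Dict[str, str]]) -> str:
    # Paragraph 1: collect the fields, always in the order name, email, city.
    names = [friend["name"] for friend in friends]
    emails = [friend["email"] for friend in friends]
    cities = [friend["city"] for friend in friends]

    # Paragraph 2: build one line per friend, in the same order.
    lines = []
    for name, email, city in zip(names, emails, cities):
        lines.append(f"{name} <{email}> ({city})")

    # Paragraph 3: join everything into the final report.
    return "\n".join(lines)
```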

Chapter 5 - Knowing What to Comment

The goal of this chapter is to help you realize what you should be commenting. You might think the purpose of commenting is to “explain what the code does,” but that is just a small part of it.

KEY IDEA 7 - The purpose of commenting is to help the reader know as much as the writer did.

When you're writing code, you have a lot of valuable information in your head. When other people read your code, that information is lost - all they have is the code in front of them.

  • Knowing what not to comment
  • Recording your thoughts as you code
  • Putting yourself in the readers' shoes, to imagine what they'll need to know

What NOT to Comment

Reading a comment takes time away from reading the actual code, and each comment takes up space on the screen. That is, it better be worth it. So where do you draw the line between a worthless comment and a good one?

All of the comments in this code are worthless:

// The class definition for Account
class Account {
  public:
    // Constructor
    Account();

    // Set the profit member to a new value
    void SetProfit(double profit);

    // Return the profit from this Account
    double GetProfit();
};

These comments are worthless because they don’t provide any new information or help the reader understand the code better.

KEY IDEA 8 - Don’t comment on facts that can be derived quickly from the code itself.

The word “quickly” is an important distinction, though. Consider the comment for this Python code:

# remove everything after the second '*'
name = '*'.join(line.split('*')[:2])

Technically, this comment doesn’t present any “new information” either. If you look at the code itself, you’ll eventually figure out what it’s doing. But for most programmers, reading the commented code is much faster than understanding the code without it.
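To see what the one-liner does, it helps to run it on a sample value. The input string below is made up for this illustration. Note that the second '*' itself is dropped along with everything after it.

```python
line = "alpha*beta*gamma*delta"

# remove everything after the second '*'
name = '*'.join(line.split('*')[:2])

print(name)  # alpha*beta
```

Working this out from split, the slice, and join takes a moment, which is why the short comment earns its place.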

Don’t Comment Just for the Sake of Commenting

Some professors require their students to have a comment for each function in their homework code. As a result, some programmers feel guilty about leaving a function naked without comments and end up rewriting the function’s name and arguments in sentence form:

// Find the Node in the given subtree, with the given name, using the given depth.
Node* FindNodeInSubtree(Node* subtree, string name, int depth);

This one falls into the “worthless comments” category—the function’s declaration and the comment are virtually the same. This comment should be either removed or improved.

If you want to have a comment here, it might as well elaborate on more important details:

// Find a Node with the given 'name' or return NULL.
// If depth <= 0, only 'subtree' is inspected.
// If depth == N, only 'subtree' and N levels below are inspected.
Node* FindNodeInSubtree(Node* subtree, string name, int depth);
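The improved comment pins down behavior precisely enough to implement against. Here is one possible Python sketch that follows it; the Node dataclass with a children list is an assumption, since the original only shows the C++ declaration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def find_node_in_subtree(subtree: Node, name: str, depth: int) -> Optional[Node]:
    """Find a Node with the given 'name' or return None.

    If depth <= 0, only 'subtree' is inspected.
    If depth == N, only 'subtree' and N levels below are inspected.
    """
    if subtree.name == name:
        return subtree
    if depth <= 0:
        return None
    for child in subtree.children:
        found = find_node_in_subtree(child, name, depth - 1)
        if found is not None:
            return found
    return None


root = Node("root", [Node("a", [Node("b")])])
print(find_node_in_subtree(root, "b", 1))  # None: 'b' is two levels down
print(find_node_in_subtree(root, "b", 2).name)  # b
```

The "given name, given depth" comment would not have answered either question the docstring settles: what happens when nothing is found, and how depth is counted.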

Don’t Comment Bad Names—Fix the Names Instead

A comment shouldn’t have to make up for a bad name. For example, here’s an innocent-looking comment for a function named CleanReply():

// Enforce limits on the Reply as stated in the Request,
// such as the number of items returned, or total byte size, etc.
void CleanReply(Request request, Reply reply);

Most of the comment is simply explaining what “clean” means. Instead, the phrase “enforce limits” should be moved into the function name:

// Make sure 'reply' meets the count/byte/etc. limits from the 'request'
void EnforceLimitsFromRequest(Request request, Reply reply);

This function name is more “self-documenting.” A good name is better than a good comment because it will be seen everywhere the function is used.
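A rough sketch of the renamed function in use is shown below. The Request and Reply dataclasses and the single max_items limit are assumptions for this example; the original describes several limits (item count, byte size, and so on) without giving an implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    max_items: int  # hypothetical limit field


@dataclass
class Reply:
    items: List[str]


def enforce_limits_from_request(request: Request, reply: Reply) -> None:
    """Make sure 'reply' meets the item-count limit from 'request'."""
    del reply.items[request.max_items:]


reply = Reply(items=["a", "b", "c", "d"])
enforce_limits_from_request(Request(max_items=2), reply)
print(reply.items)  # ['a', 'b']
```

At the call site, the name alone says what happens to the reply, with no need to look up a comment on the declaration.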

Here is another example of a comment for a poorly named function:

// Releases the handle for this key. This doesn't modify the actual registry.
void DeleteRegistry(RegistryKey* key);

The name DeleteRegistry() sounds like a dangerous function (it deletes the registry?!). The comment “This doesn’t modify the actual registry” is trying to clear up the confusion.

Instead, we could use a more self-documenting name like:

void ReleaseRegistryHandle(RegistryKey* key);

In general, you don’t want “crutch comments” - comments that are trying to make up for the unreadability of the code. Coders often state this rule as good code > bad code + good comments.
