Skip to content

Instantly share code, notes, and snippets.

@viphat
Last active September 7, 2022 05:03
Show Gist options
  • Save viphat/7f883a18733ae5a7ebe45e74c348dbc0 to your computer and use it in GitHub Desktop.
Save viphat/7f883a18733ae5a7ebe45e74c348dbc0 to your computer and use it in GitHub Desktop.
The Art of Readable Code

The goal of this book is help you make your code better. And when we say "code", we literally mean the lines of code you are staring at in your editor. We’re not talking about the overall architecture of your project, or your choice of design patterns. Those are certainly important, but in our experience most of our day-to-day lives as programmers are spent on the “basic” stuff, like naming variables, writing loops, and attacking problems down at the function level. And a big part of this is reading and editing the code that’s already there.

KEY IDEA 1 - Code should be easy to understand.

KEY IDEA 2 - Code should be written to minimize the time it would take for someone else (may be you sixth months later) to understand it.

Is smaller always better?

The less code you write to solve a problem, the better. It probably takes less time to understand a 2000 line class than a 5000 line class

But fewer lines isn't always better! There are plenty of times when a one-line expressions like:

assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());

takes more time to understand than if it were two lines:

bucket = FindBucket(key);
if (bucket != NULL) assert(!bucket->IsOccupied());

Similarly, a comment can make you understand the code more quickly, even though it 'adds code' to the file:

// Fast version of "hash = (65599 * hash) + c"
hash = (hash << 6) + (hash << 16) - hash + c;

So even though having fewer lines of code is a good goal, minimizing the time-till-understanding is an even better goal.


Code efficient, well-architected, easy to test, etc. Don't these sometimes conflict with wanting to make code easy to understand?

Always important to step back and ask, Is this code easy to understand?. If so, it's probably fine to move on to other code.


Although each change may seem small, in aggregate they can make a huge improvement to a codebase. If your code has great names, well-written comments, and clean use of whitespace, your code will be much easier to read.


Naming Convention:

KEY IDEA 3 - Pack information into your names.

  • Choosing specific words
  • Avoiding generic names (or knowing when to use them)
  • Using concrete names instead of abstract names
  • Attaching extra information to a name, by using a suffix or prefix
  • Deciding how long a name should be
  • Using name formatting to pack extra information

Choose Specific Words

You have to choose the words that are very specific and avoiding 'empty' words. For example, the word get is very unspecific, as in this example:

def GetPage(url):

The word GetPage() doesn't really say much. Does this method get a page from a local cache, from a database, or from the Internet? If it's from the Internet, a more specific name might be FetchPage() or DownloadPage()

The name Size() doesn't convey much information. A more specific name would be Height(), NumNodes(), MemoryBytes(), etc.

The name Stop() is okay, but depending on what exactly it does, there might be a more specific name: Kill() if it's a heavyweight operation that can't be undone. Pause() if there is a way to Resume() it.

Finding more colorful words

Don’t be afraid to use a thesaurus or ask a friend for better name suggestions. English is a rich language, and there are a lot of words to choose from.

send ~ deliver, dispatch, announce, distribute, route find ~ search, extract, locate, recover start ~ launch, create, begin, open make ~ create, set up, build, generate, compose, add, new

KEY IDEA 4 - It's better to be clear and precise than to be cute.

Avoid Generic Names like tmp and retval

Instead of using an empty name like this, pick a name that describes the entity's value or purpose

Using a generic name sometimes will help you to detect a bug

tmp or temp

if (right < left) {
    tmp = right;
    right = left;
    left = tmp;
}

In cases like these, the name tmp is perfectly fine. The variable's sole purpose is temporary storage, with a lifetime of only a few lines.

But here's a case where tmp is just used out of laziness:

String tmp = user.name();
tmp += " " + user.phone_number();
tmp += " " + user.email();
...
template.set("user_info", tmp);

Even though this variable has a short lifespan, being temporary storage isn't the most important thing about this variable. Instead, a name like user_info would be more descriptive.

In the following case, tmp should be in the name, but just as a part of it:

tmp_file = tempfile.NamedTemporaryFile()
...
SaveData(tmp_file, ...)

Notice that we named the variable tmp_file and not just tmp, because it is a file object. Imagine if we just called it tmp:

SaveData(tmp, ...)

Looking at just this one line of code, it isn’t clear if tmp is a file, a filename, or maybe even the data being written.

Loop Iterators

i, j, iter, it can be used as indices and loop iterators (In fact, if you used one of these names for some other purpose, it would be confusing - So, Don't do that). But sometimes there are better iterator names than i, j and k

for (int i = 0; i < clubs.size(); i++)
    for (int j = 0; j < clubs[i].members.size(); j++)
        for (int k = 0; k < users.size(); k++)
            if (clubs[i].members[k] == users[j])
                cout << "user[" << j << "] is in club[" << i << "]" << endl;

In the if statement, members[] and users[] are using the wrong index. Bugs like these are hard to spot because that line of code seems fine in isolation:

if (clubs[i].members[k] == users[j])

In this case, using more precise names may have helped. You can naming them as club_i, member_i, user_i or more succinctly (ci,mi,ui). This approach would help the bug stand out more:

if (clubs[ci].members[ui] == users[mi])  # Bug! First letters don't match up.

As you’ve seen, there are some situations where generic names are useful. A lot of the time, they’re overused out of pure laziness. This is understandable—when nothing better comes to mind, it’s easier to just use a meaningless name like foo and move on. But if you get in the habit of taking an extra few seconds to come up with a good name, you’ll find your naming muscle builds quickly.

Prefer Concrete Names over Abstract Names

For example, suppose you have an internal method named ServerCanStart(), which tests whether the server can listen on a given TCP/IP port. The name ServerCanStart() is somewhat abstract, though. A more concrete name would be CanListenOnPort(). This name directly describes what the method will do.

Please don't try to smash two orthogonal ideas into one. (Follow Single Responsibility Rule can help you easier to naming a method)

Attaching Extra Information to a Name

Values with Units

var start = (new Date()).getTime();  // top of the page
...
var elapsed = (new Date()).getTime() - start;  // bottom of the page
document.writeln("Load time was: " + elapsed + " seconds");

More explicit:

var start_ms = (new Date()).getTime();  // top of the page
...
var elapsed_ms = (new Date()).getTime() - start_ms;  // bottom of the page
document.writeln("Load time was: " + elapsed_ms / 1000 + " seconds");

Start(int delay) - delay -> delay_secs CreateCache(int size) - size -> size_mb ThrottleDownload(float limit) - limit -> max_kbps Rotate(float angle) - angle -> degrees_cw

Encoding Other Important Attributes

Many security exploits come from not realizing that some data you program receives is not yet in a safe state. For this, you might want to use variable names like untrustedUrl or unsafeMessageBody. After calling functions that cleanse the unsafe input, the resulting variables might be trustedUrl or safeMessageBody.

A password is in “plaintext” and should be encrypted before further processing - password - better name: plaintext_password

A user-provided comment that needs escaping before being displayed - comment - better name: unescaped_comment

Bytes of html have been converted to UTF-8 - html - better name: html_utf8

Incoming data has been "url encoded" - data - data_urlenc

You shouldn’t use attributes like unescaped_ or _utf8 for every variable in your program. They’re most important in places where a bug can easily sneak in if someone mistakes what the variable is, especially if the consequences are dire, as with a security bug. Essentially, if it’s a critical thing to understand, put it in the name.

How long should a Name be

How do you decide between naming a variable d, days or days_since_last_update

The answer depends on exactly how the variable is being used.

Shorter Names Are Okay for Shorter Scope

Identifiers that have a small scope (how many other lines of code can "see" this name) don't need to carry as much information.

if (debug) {
    map<string,int> m;
    LookUpNamesNumbers(&m);
    Print(m);
}

you can get away with shorter names because all that information (what type the variable is, its initial value, how it’s destroyed) is easy to see.

Even though m doesn’t pack any information, it’s not a problem, because the reader already has all the information she needs to understand this code.

Typing Long Names—Not a Problem Anymore

With auto completion, this is not a problem anymore.

Acronyms and Abbreviations may have the potential confusion.

Bạn có thể sử dụng từ viết tắt, nhưng chỉ nên dùng các từ viết tắt phổ biến và luôn nhớ tự đặt câu hỏi là liệu những người đồng đội mới có thể hiểu ý nghĩa tên bạn vừa đặt không? KEY IDEA - would a new teammate understand what the name means?

Throwing Out Unneeded Words

Có thể bỏ bớt một số từ mà nghĩa của chúng không hề bị mất đi, ví dụ: ConvertToString()ToString() DoServeLoop()ServeLoop

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Summary Chapter 2

Pack information into your names

  • Use specific words - for example, instead of Get, words like Fetch or Download might be better, depending on the context.
  • Avoid generic names like tmp and retval, unless there’s a specific reason to use them.
  • Use concrete names that describe things in more detail - the name ServerCanStart() is vague compared to CanListenOnPort().
  • Attach important details to variable names - for example, append ms to a variable whose value is in milliseconds or prepend raw to an unprocessed variable that needs escaping.
  • Use longer names for larger scopes - don’t use cryptic one- or two-letter names for variables that span multiple screens; shorter names are better for variables that span only a few lines.
  • Use capitalization, underscores, and so on in a meaningful way - for example, you can append “_” to class members to distinguish them from local variables.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 3 - Names That Can’t Be Misconstrued

Example: Clip(text, length)

# Cuts off the end of the text, and appends "..."
def Clip(text, length):    
  ...

There are two ways you can imagine how Clip() behaves:

  • It removes length from the end.
  • It truncates to a maximum length.

The second way (truncation) is most likely, but you never know for sure. Rather than leave your reader with any nagging doubt, It would be better to name the function Truncate(text, length)

However, the parameter name length is also to blame. If it were max_length, that would make it even more clear.

But we're still not done. The name max_length still leaves multiple interpretations:

  • A number of bytes
  • A number of characters
  • A number of words

In this case, we mean “number of characters,” so instead of max_length, it should be max_chars.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 3 - Names That Can’t Be Misconstrued

In the previous chapter, we covered how to put a lot of information into your names. In this chapter, we focus on a different topic: watching out for names that can be misunderstood.

KEY IDEA 5: Actively scrutinize your names by asking yourself, “What other meanings could someone interpret from this name?”

Example: Filter()

results = Database.all_objects.filter("year <= 2011")

What does results now contain?

  • Objects whose year is <= 2011?
  • Objects whose year is not <= 2011?

The problem is that filter is an ambiguous word. It’s unclear whether it means “to pick out” or “to get rid of.” It’s best to avoid the name filter because it’s so easily misconstrued.

If you want “to pick out,” a better name is select(). If you want “to get rid of,” a better name is exclude().

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 2 - Use Name Formatting to Convey Meaning

The way you use underscores, dashes, and capitalization can also pack more information in a name.

For example (Ruby):

  • CamelCase for class names.
  • lower_separated for variable names.
  • CONSTANT_NAME
  • @ for instance variables, @@ for class variables and $ for global variables

Other Formatting Conventions

The "Javascript: the good parts" author suggests that “constructors” (functions intended to be called with new) should be capitalized and that ordinary functions should start with a lowercase letter.

var x = new DatePicker();  // DatePicker() is a "constructor" function
var y = pageHeight();      // pageHeight() is an ordinary function

when calling the jQuery library function (whose name is the single character $), a useful convention is to prefix jQuery results with $ as well:

var $all_images = $("img");  // $all_images is a jQuery object
var height = 250;            // height is not

Throughout the code, it will be clear that $all_images is a jQuery result object.

When giving an HTML tag an id or class attribute, both underscores and dashes are valid characters to use in the value. One possible convention is to use underscores to separate words in IDs and dashes to separate words in classes.

<div id="middle_column" class="main-content"> ...

Whether you decide to use conventions like these is up to you and your team. But whichever system you use, be consistent across your project.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 3 - Names That Can’t Be Misconstrued

"up to" or "up to and including"

CART_TOO_BIG_LIMIT = 10
if shopping_cart.num_items() >= CART_TOO_BIG_LIMIT:
  Error("Too many items in cart.")

But there is a potential bug, >= or > or by redefining CART_TOO_BIG_LIMIT to 11

CART_TOO_BIG_LIMIT is an ambiguous name—it’s not clear whether you mean “up to” or “up to and including.”

ADVICE

The clearest way to name a limit is to put max_ or min_ in front of the thing being limited.

MAX_ITEMS_IN_CART = 10
if shopping_cart.num_items() > MAX_ITEMS_IN_CART:    
  Error("Too many items in cart.")

Prefer min_, max_, first_, last_, begin_, end_ for inclusive range than start_, stop_, etc.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 3 - Names That Can't Be Misconstrued

Naming Booleans

When picking a name for a boolean variable or a function that returns a boolean, be sure it's clear what true and false really mean.

Here's a dangerous example:

bool read_password = true;

Depending on how you read it (no pun intended), there are two very different interpretations:

  • We need to read the password
  • The password has already been read

In this case, it's better to name it need_password or user_is_authenticated instead.

In general, adding words like is, has, can or should can make booleans more clear.

For example, a function named SpaceLeft() sounds like it might return a number. If it were meant to return a boolean, a better name would be HasSpaceLeft().

Finally, it’s best to avoid negated terms in a name. For example, instead of:

bool disable_ssl = false;

it would be easier to read (and more compact) to say:

bool use_ssl = true;

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 3 - Names That Can't Be Misconstrued

Matching Expectations of Users

Some names are misleading because the user has a preconceived idea of what the name means, even though you mean something else. In these cases, it’s best to just “give in” and change the name so that it’s not misleading.

get*() hoặc .size()

public class StatisticsCollector {
    public void addSample(double x) { ... }

    public double getMean() {
        // Iterate through all samples and return total / num_samples
    }
    ...
}

Với từ khóa get thường được dùng để trả về một giá trị nào đó (độ phức tạp O(1)... Tuy nhiên method getMean() lại phải duyệt qia tất cả các phần tử và tính toán trả về giá trị trung bình của các phần tử đó (O(n)). Vì vậy thay vì dùng getMean(), ta nên đặt tên là computeMean() để báo hiệu cho những người khác biết đây là một expensive operation chứ không phải là an inexpensive call như các method get khác.

Tương tự, nếu method size() của bạn có độ phức tạp là O(n) thì nên đổi tên thành countSize() hoặc countElements() sẽ giúp lập trình viên khác cẩn thận hơn khi gọi nó.

WHO’S THE WIZARD?
A while ago, one of the authors was installing the OpenBSD operating system. During the disk formatting step, a complicated menu appeared, asking for disk parameters. One of the options was to go to “Wizard mode.” He was relieved to find this user-friendly option and selected it. To his dismay, it dropped the installer into a low-level prompt waiting for manual disk formatting commands, with no clear way to get out of it. Evidently “wizard” meant you were the wizard!

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 3 - Names That Can't Be Misconstrued

Evaluating Multiple Name Candidates

When deciding on a good name, you might have multiple candidates that you’re considering. It’s common to debate the merits of each name in your head before settling on the final choice.

experiment_id: 101
the_other_experiment_id_I_want_to_reuse: 100

Here are four names to consider:

  1. template
  2. reuse
  3. copy
  4. inherit

-> copy_experiment and inherit_from_experiment_id are the best names, because they most clearly describe what's happening and are least likely to be misunderstood.

Some names that can cause misleading:

copy - copy this experiment 100 times or this is the 100th copy of something.
reuse - This experiment can be reused at most 100 times.
reuse_id - my id for reuse is 100.
template - I am a template or I am using this other template. template is often something abstract. template is just too vague in this situation.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Summary Chapter 3

The best names are ones that can’t be misconstrued - the person reading your code will understand it the way you meant it, and no other way. Unfortunately, a lot of English words are ambiguous when it comes to programming, such as filter, length, and limit.

Before you decide on a name, play devil’s advocate and imagine how your name might be misunderstood. The best names are resistant to misinterpretation.

When it comes to defining an upper or lower limit for a value, max_ and min_ are good prefixes to use. For inclusive ranges, first and last are good. For inclusive/exclusive ranges, begin and end are best because they’re the most idiomatic.

When naming a boolean, use words like is and has to make it clear that it’s a boolean. Avoid negated terms (e.g., disable_ssl).

Beware of users’ expectations about certain words. For example, users may expect get() or size() to be lightweight methods.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics (Tính thẩm mỹ)

Good source code should be just as “easy on the eyes.” In this chapter, we’ll show how good use of spacing, alignment, and ordering can make your code easier to read.

Specifically, there are three principles we use:

  • Use consistent layout, with patterns the reader can get used to.
  • Make similar code look similar.
  • Group related lines of code into blocks.

AESTHETICS VS. DESIGN

In this chapter, we’re concerned only with simple “aesthetic” improvements you can make to your code. These types of changes are easy to make and often improve readability quite a bit. There are times when larger refactoring of your code (such as splitting out new functions or classes) can help even more. Our view is that good aesthetics and good design are independent ideas; ideally you should strive for both.


Obviously it’s easier to work with code that’s aesthetically pleasing. If you think about it, most of your time programming is spent looking at code! The faster you can skim through your code, the easier it is for everyone to use it.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics

Rearrange Line Breaks to Be Consistent and Compact

public class PerformanceTester {
    public static final TcpConnectionSimulator wifi = new TcpConnectionSimulator(
        500, /* Kbps */
        80, /* millisecs latency */
        200, /* jitter */
        1 /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(
            45000, /* Kbps */
            10, /* millisecs latency */
            0, /* jitter */
            0 /* packet loss % */);

    public static final TcpConnectionSimulator cell = new TcpConnectionSimulator(
        100, /* Kbps */
        400, /* millisecs latency */
        250, /* jitter */
        5 /* packet loss % */);
}

To make the code look more consistent, we could introduce extra line breaks

More easier to read:

public class PerformanceTester {
    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(
            500,   /* Kbps */
            80,    /* millisecs latency */
            200,   /* jitter */
            1      /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(
            45000, /* Kbps */
            10,    /* millisecs latency */
            0,     /* jitter */
            0      /* packet loss % */);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(
            100,   /* Kbps */
            400,   /* millisecs latency */
            250,   /* jitter */
            5      /* packet loss % */);
}

Here’s a more compact way to write the class:

public class PerformanceTester {
    // TcpConnectionSimulator(throughput, latency, jitter, packet_loss)
    //                            [Kbps]   [ms]    [ms]    [percent]

    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(500,    80,     200,    1);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(45000,  10,     0,      0);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(100,    400,    250,    5);
}

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics

Use Methods to Clean Up Irregularity

Suppose you had a personnel database that provided the following function:

// Turn a partial_name like "Doug Adams" into "Mr. Douglas Adams".
// If not possible, 'error' is filled with an explanation.
string ExpandFullName(DatabaseConnection dc, string partial_name, string* error);

and this function was tested with a series of examples:

DatabaseConnection database_connection;
string error;
assert(ExpandFullName(database_connection, "Doug Adams", &error)
    == "Mr. Douglas Adams");
assert(error == "");
assert(ExpandFullName(database_connection, " Jake  Brown ", &error)
    == "Mr. Jacob Brown III");
assert(error == "");
assert(ExpandFullName(database_connection, "No Such Guy", &error) == "");
assert(error == "no match found");
assert(ExpandFullName(database_connection, "John", &error) == "");
assert(error == "more than one result");

This code is not aesthetically pleasing. Some of the lines are so long that they wrap to the next line. The silhouette of this code is ugly and there is no consistent pattern.

But this is a case where rearranging the line breaks can only do so much. The bigger problem is that there are a lot of repeated strings like “assert(ExpandFullName(database_connection...,” and “error” that are getting in the way. To really improve this code, we need a helper method so the code can look like this:

CheckFullName("Doug Adams", "Mr. Douglas Adams", "");
CheckFullName(" Jake  Brown ", "Mr. Jake Brown III", "");
CheckFullName("No Such Guy", "", "no match found");
CheckFullName("John", "", "more than one result");

Now it’s more clear that there are four tests happening, each with different parameters. Even though all the “dirty work” is inside CheckFullName(), that function isn’t so bad, either:

void CheckFullName(string partial_name,
                   string expected_full_name,
                   string expected_error) {
    // database_connection is now a class member
    string error;
    string full_name = ExpandFullName(database_connection, partial_name, &error);
    assert(error == expected_error);
    assert(full_name == expected_full_name);
}

Even though our goal was just to make the code more aesthetically pleasing, this change has a number of other side benefits:

  • It eliminates a lot of the duplicated code from before, making the code more compact.
  • The important parts of each test case (the names and error strings) are now by themselves, in plain sight. Before, these strings were interspersed with tokens like database_connection and error, which made it hard to "take in" the code in one eyeful.
  • Adding new tests should be much easier now.

The moral of the story is that making code "look pretty" often results in more than just surface improvements - it might help you structure your code better.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics

Use Column Alignment When Helpful

Straight edges and columns make it easier for readers to scan through text.

CheckFullName("Doug Adams"   , "Mr. Douglas Adams" , "");
CheckFullName(" Jake  Brown ", "Mr. Jake Brown III", "");
CheckFullName("No Such Guy"  , ""                  , "no match found");
CheckFullName("John"         , ""                  , "more than one result");

In this code, it’s easier to distinguish the second and third arguments to CheckFullName().

Here is a simple example with a large group of variable definitions:

# Extract POST parameters to local variables
details  = request.POST.get('details')
location = request.POST.get('location')
phone    = equest.POST.get('phone')
email    = request.POST.get('email')
url      = request.POST.get('url')

As you may have noticed, the third definition has a typo (equest instead of request). Mistakes like these are more pronounced when everything is lined up so neatly.

In the wget codebase, the available command-line options (more than 100 of them) were listed as follows:

commands[] = {
    ...
    { "timeout",          NULL,                   cmd_spec_timeout },
    { "timestamping",     &opt.timestamping,      cmd_boolean },
    { "tries",            &opt.ntry,              cmd_number_inf },
    { "useproxy",         &opt.use_proxy,         cmd_boolean },
    { "useragent",        NULL,                   cmd_spec_useragent },
    ...
};

This approach made the list very easy to skim through and jump from one column to the next.

Should You Use Column Alignment?

Column edges provide “visual handrails” that make it easier to scan through. It’s a good example of “make similar code look similar.”

But some programmers don’t like it. One reason is that it takes more work to set up and maintain the alignment. Another reason is it creates a larger “diff” when making changes—a one-line change might cause five other lines to change (mostly just whitespace).

Our advice is to try it. In our experience, it doesn’t take as much work as programmers fear. And if it does, you can simply stop.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics

Pick a Meaningful Order, and Use It Consistently

There are many cases where the order of code doesn’t affect the correctness. For instance, these five variable definitions could be written in any order:

details  = request.POST.get('details')
location = request.POST.get('location')
phone    = request.POST.get('phone')
email    = request.POST.get('email')
url      = request.POST.get('url')

In situations like this, it’s helpful to put them in some meaningful order, not just random. Here are some ideas:

  • Match the order of the variables to the order of the fields on the corresponding HTML form.
  • Order them from “most important” to “least important.”
  • Order them alphabetically.

Whatever the order, you should use the same order throughout your code. It would be confusing to change the order later on:

if details:  rec.details  = details
if phone:    rec.phone    = phone     # Hey, where did 'location' go?
if email:    rec.email    = email
if url:      rec.url      = url
if location: rec.location = location  # Why is 'location' down here now?

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics

Organize Declarations into Blocks

class FrontendServer {
  public:
    FrontendServer();
    void ViewProfile(HttpRequest* request);
    void OpenDatabase(string location, string user);
    void SaveProfile(HttpRequest* request);
    string ExtractQueryParam(HttpRequest* request, string param);
    void ReplyOK(HttpRequest* request, string html);
    void FindFriends(HttpRequest* request);
    void ReplyNotFound(HttpRequest* request, string error);
    void CloseDatabase(string location);
    ~FrontendServer();
};
class FrontendServer {
  public:
    FrontendServer();
    ~FrontendServer();

    // Handlers
    void ViewProfile(HttpRequest* request);
    void SaveProfile(HttpRequest* request);
    void FindFriends(HttpRequest* request);

    // Request/Reply Utilities
    string ExtractQueryParam(HttpRequest* request, string param);
    void ReplyOK(HttpRequest* request, string html);
    void ReplyNotFound(HttpRequest* request, string error);

    // Database Helpers
    void OpenDatabase(string location, string user);
    void CloseDatabase(string location);
};

Break Code into “Paragraphs”

Written text is broken into paragraphs for a number of reasons:

  • It's a way to group similar ideas together and set them apart from other ideas.
  • It provides a visual 'stepping stone' - without it, it's easy to lose your place on the page.
  • It facilitates navigation from one paragraph to another.
# Import the user's email contacts, and match them to users in our system.
# Then display a list of those users that he/she isn't already friends with.
def suggest_new_friends(user, email_password):
    friends = user.friends()
    friend_emails = set(f.email for f in friends)
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends
    return render("suggested_friends.html", display)

It may not be obvious, but this function goes through a number of distinct steps. So it would be especially useful to break up those lines of code into paragraphs:

def suggest_new_friends(user, email_password):
    # Get the user's friends' email addresses.
    friends = user.friends()
    friend_emails = set(f.email for f in friends)

    # Import all email addresses from this user's email account.
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)

    # Find matching users that they aren't already friends with.
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)

    # Display these lists on the page.
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends

    return render("suggested_friends.html", display)

Notice that we also added a summary comment to each paragraph, which also helps the reader skim through the code.
As with written text, there may be multiple ways to break the code up, and programmers may prefer longer or shorter paragraphs.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 4 - Aesthetics

Personal Style versus Consistency

There are certain aesthetic choices that just boil down to personal style. For instance, where the open brace for a class definition should go:

class Logger {
    ...
};

or

class Logger
{
    ...
};

If one of these styles is chosen over the other, it doesn’t substantially affect the readability of the codebase. But if these two styles are mixed throughout the code, it does affect the readability.

We’ve worked on many projects where we felt like the team was using the “wrong” style, but we followed the project conventions because we knew that consistency is far more important.

KEY IDEA 6 - Consistent style is more important than the “right” style.

@viphat
Copy link
Author

viphat commented Feb 13, 2017

SUMMARY CHAPTER 4

Everyone prefers to read code that's aesthetically pleasing. By "formatting" your code in a consistent, meaningful way, you make it easier and faster to read.

Here are specific techniques we discussed:

  • If multiple blocks of code are doing similar things, try to give them the same silhouette.
  • Aligning parts of the code into "columns" can make code easy to skim through.
  • If code mentions A, B, and C in one place, don't say B, C, and A in another. Pick a meaningful order and stick with it.
  • Use empty lines to break apart large blocks into logical "paragraphs"

@viphat
Copy link
Author

viphat commented Feb 13, 2017

Chapter 5. Knowing What to Comment

The goal of this chapter is to help you realize what you should be commenting. You might think the purpose of commenting is to “explain what the code does,” but that is just a small part of it.

KEY IDEA 7 - The purpose of commenting is to help the reader know as much as the writer did.

When you're writing code, you have a lot of valuable information in your head. When other people read your code, that information is lost - all they have is the code in front of them.

  • Knowing what not to comment
  • Recording your thoughts as you code
  • Putting yourself in the readers' shoes, to imagine what they'll need to know

What NOT to Comment

Reading a comment takes time away from reading the actual code, and each comment takes up space on the screen. That is, it better be worth it. So where do you draw the line between a worthless comment and a good one?

All of the comments in this code are worthless

// The class definition for Account
class Account {
  public:
    // Constructor
    Account();

    // Set the profit member to a new value
    void SetProfit(double profit);

    // Return the profit from this Account
    double GetProfit();
};

These comments are worthless because they don’t provide any new information or help the reader understand the code better.

KEY IDEA 8 - Don’t comment on facts that can be derived quickly from the code itself.

The word “quickly” is an important distinction, though. Consider the comment for this Python code

# remove everything after the second '*'
name = '*'.join(line.split('*')[:2])

Technically, this comment doesn’t present any “new information” either. If you look at the code itself, you’ll eventually figure out what it’s doing. But for most programmers, reading the commented code is much faster than understanding the code without it.

Don’t Comment Just for the Sake of Commenting

Some professors require their students to have a comment for each function in their homework code. As a result, some programmers feel guilty about leaving a function naked without comments and end up rewriting the function’s name and arguments in sentence form:

// Find the Node in the given subtree, with the given name, using the given depth.
Node* FindNodeInSubtree(Node* subtree, string name, int depth);

This one falls into the “worthless comments” category—the function’s declaration and the comment are virtually the same. This comment should be either removed or improved.

If you want to have a comment here, it might as well elaborate on more important details:

// Find a Node with the given 'name' or return NULL.
// If depth <= 0, only 'subtree' is inspected.
// If depth == N, only 'subtree' and N levels below are inspected.
Node* FindNodeInSubtree(Node* subtree, string name, int depth);

Don’t Comment Bad Names—Fix the Names Instead

A comment shouldn’t have to make up for a bad name. For example, here’s an innocent-looking comment for a function named CleanReply():

// Enforce limits on the Reply as stated in the Request,
// such as the number of items returned, or total byte size, etc.
void CleanReply(Request request, Reply reply);

Most of the comment is simply explaining what “clean” means. Instead, the phrase “enforce limits” should be moved into the function name:

// Make sure 'reply' meets the count/byte/etc. limits from the 'request'
void EnforceLimitsFromRequest(Request request, Reply reply);

This function name is more “self-documenting.” A good name is better than a good comment because it will be seen everywhere the function is used.

Here is another example of a comment for a poorly named function:

// Releases the handle for this key. This doesn't modify the actual registry.
void DeleteRegistry(RegistryKey* key);

The name DeleteRegistry() sounds like a dangerous function (it deletes the registry?!). The comment “This doesn’t modify the actual registry” is trying to clear up the confusion.

Instead, we could use a more self-documenting name like:

void ReleaseRegistryHandle(RegistryKey* key);

In general, you don’t want “crutch comments” - comments that are trying to make up for the unreadability of the code. Coders often state this rule as good code > bad code + good comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment