Instantly share code, notes, and snippets.

@rikukissa /POST.md
Last active Nov 15, 2018

Embed
What would you like to do?
Collection of code simplification and refactoring tips #best practices #refactoring #code review

logo


Collection of code simplification and refactoring tips

Want to write clean code? Well here's your chance. The following set of examples are practices that I've come across during my times as a programmer and that I believe in. At least at the moment. Please feel free to add comments, criticise my thinking and contribute to everyone's knowledge. I'm trying to find a better platform for these, but for now a gist should do. I'm doing this solely to document, analyse and share my own thinking, which oftentimes is quite unstructured. These days I'm not too fussed about the syntactical structures of the code, but would rather concentrate on painting the largest guidelines and deciding the direction I want my programming style to move towards.

Contents:

Further reading:

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 8, 2018

Define value boundaries early, keep things flat

I've started moving some of these to their own Gists

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 8, 2018

Know what you're returning and be explicit about it

If your intention is not to return any value from this file, it's important to show it explicitly. With early return, it's always tempting to place the return statement on the same row with the last operation. Don't do this without thinking, because unexpected return values do no good for the person using your function. It harms readability and makes code inconsistent.

Instead of this:

function upload(file) {
  if (file.size >= 9999999) {
    return;
  }
  
  if (file.name.length <= 5) {
    return;
  }

  if (file.format === 'jpeg') {
    return saveFile('jpegs/' + file.name);
  }
  
  return saveFile('others/' + file.name);
}

Do this:

function upload(file) {
  if (file.size >= 9999999) {
    return;
  }
  
  if (file.name.length <= 5) {
    return;
  }

  if (file.format === 'jpeg') {
-   return saveFile('jpegs/' + file.name);
+   saveFile('jpegs/' + file.name);
+   return
  }
- return saveFile('others/' + file.name);  
+ saveFile('others/' + file.name);
+ return
}
@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 9, 2018

If your ternary condition is duplicated in either one of the branches, the whole ternary is useless

Bad code:

object.isHereSomething ? object.isHereSomething  : 'default value';

Good code:

object.isHereSomething || 'default value';

Explanation

The same condition is seen in 2 different places of this expression. This is your first clue, which often means that the expression can be simplified with boolean logic operators.

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 9, 2018

Simplified boolean logic

From this:

function exampleFunction(someConditional) {
  if(someConditional === true /* 1. */) {
    return true /* 2. */
  } else {
    return false
  }
}

To this:

function exampleFunction(someConditional) {
  return someConditional
}

If you find yourself with simplified code that looks as simple as this, please consider removing the function altogether. Indirection is always bad!


Explanation:

1. Never compare 2 booleans

function exampleFunction(someConditional) {
- if(someConditional === true /* 1. */) {
+ if(someConditional) {
    return true /* 2. */
  } else {
    return false
  }
}

2. If you're returning booleans from both branches of an if statement, it's usually not needed at all

There's rarely a need for returning true from one side of an if statement and false from the other.
In many cases the conditional statement can be used as the return value itself.

function exampleFunction(someConditional) {
+  return someConditional
-  if(someConditional) {
-    return true /* 2. */
-  } else {
-    return false
-  }
}
@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 9, 2018

On useless try-catches, being overly defensive, I/O boundaries and variable scope

I've started moving some of these to their own Gists

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 9, 2018

Only one type of return value per function

Even though this might feel like a minor thing, it certainly isn't. As with propagation of thrown exceptions and asynchronous operations, return values also cause an effect greater than what it initially looks like.

Bad

function findUsers(userId) {
   const results = db.select('SELECT * FROM app_users WHERE id = ?', userId);
   if(results.length === 1) {
     return results[0]
   }
   return results
}

Good

function findUsers(userId) {
   return db.select('SELECT * FROM app_users WHERE id = ?', userId);
}

Explanation

By returning values in 2 different formats, you force the caller to check in which type the received value is.

const userOrUsers = findUser(101)
if(Array.isArray(userOrUsers)) {
  userOrUsers.forEach((user) => user.delete())
  return
} 
 
userOrUsers.delete()

Now as you can see, the caller has already taken 2 different routes for doing the same operation. Imagine if the proceeding operation would be even more complex, and would use calls to other functions with side effects. What a mess! Not cool at all, unlike this:

findUser(101).forEach((user) => user.delete())

The example above could also be written as

const userOrUsers = findUser(101)
[].concat(userOrUsers).forEach((user) => user.delete())

Which then again makes readability worse.

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 9, 2018

Don't return null / nil / None / void / undefined

Empty values are the magic sauce for making your developer console go all red.
image

They cause unpleasant surprises. They break up families. They are known to breed fruit flies in your rubbish.

In dynamic programming languages it's often very difficult to keep track of which functions can and which cannot return these buggers and in the worst case you end up complicating your code with null-checks that are not actually needed.

There are few cases however when returning null is ~okay:

  • The value you would otherwise be returning is a singular value:
    One user instance, a number, a string etc. For a function named findUser it can be argued that null is a valid return value when a user is not found. In the case of findUsers that returns a list of found users the correct return value is an empty list. Now the consumer of your function only needs to write handling for a single type: a list. If the list is empty, the same logic still applies.

  • You have a typed programming language where null is treated as a type, that forces you to explicitly define that a function returns either a value or null and makes check the value before being able to manipulate it (TypeScript, Swift...).

    • With languages like Java, returning null can lead into NullPointerExceptions on runtime because the compiler will not let you know about references to values that can potentially be null. With these kinds of languages it's better to use checked exceptions.

But mmmm, I'm using JavaScript / Python / Ruby ... ?

Cool, so am I. What I'd suggest is, that you use a functor data type instead. To summarise, consider what I mentioned before about returning an empty list when there are no values to return. What's so nifty about it? Well, I'd argue it's the fact that the actual value (or the non-value) is wrapped inside a container of sort that help you with manipulating the values inside of it.

Instead of using an array for wrapping a singular value, a more suitable option could be a Maybe type. Or an Optional type if you're using Java (or maybe not..). Sure there are plenty of alternatives for Python and Ruby as well.

Besides making your code more reliable it also helps everyone trying to understand what your code is doing: the reader can now glance over the code without having to dive deep into the actual function definitions. It's immensely tiring having to read through all code line by line just to figure out the possible outcomes of the function.

console.log(findUser('danieljharvey')) // { id: 2, city: 'London' }

You console.log the return value of a function and it gives you a user object? Great, but how do I know it does it every time?

function findUser() {
  if(Math.random() > 0.5) {
    return { id: 2, city: 'London' };
  }
  return null; // lol fooled ya
}

What if it would give you a Maybe<User> instead? At least now you know you have to be prepared for the worst.

console.log(findUser('danieljharvey')) // Maybe({ id: 2, city: 'London' })

It's an abstraction for return values that many programming languages lack for one reason of another.

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 10, 2018

Always prefer naming variables used as conditionals

From this:

if(player.x + player.width > enemy.x && player.x < enemy.x + enemy.width) {
  game.end()
}

To this:

const playerTouchesEnemy = player.x + player.width > enemy.x && player.x < enemy.x + enemy.width
if(playerTouchesEnemy) {
  game.end()
}
@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 10, 2018

IDEA: do not use undefined in your own code

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 10, 2018

IDEA: prefer passing primitives (react-intl generalised)

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 10, 2018

IDEA: backend: no http stuff to service layer, no database stuff to http layer (generalised)

Like "function A calculates api path", "function B does the fetch" rather than everything in a bag.

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 10, 2018

IDEA:

  • no configuration objects to functions
  • max 3 parameters per function
  • calculatePos(x1, y1, x2, y2) can be simplified by creating a Line type or a Point type - telegram convo 4.10
@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 10, 2018

Shorter if branch first

This works in most cases, however I wouldn't only leave it as the end result is, but would combine it with Define value boundaries early, keep things flat.

If statements blocks with more than few lines of code in them are usually a bad sign!

From:

if (something) {
  /*
   *
   *
   * Loads of stuff here so you need to scroll
   *
   *
   * ...
   */
} else {
  // Only one or few lines
}

To:

if (!something) {
  // Only one or few lines
} else {
  /*
   *
   *
   * Loads of stuff here so you need to scroll
   *
   *
   * ...
   */
}
@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 10, 2018

Always avoid unnecessary data structures, stop creating garbage piles for your data!

Consider the following JSON response from an HTTP API:

// GET users/123
{
  data: {
    username: 'rikukissa',
    level: 'noob',
    info: {
      age: 26,
      email: '...'
    }
  }
}

There's at least 2 different things wrong here: First of all the data wrapper is pretty much useless. Of course you could argue that there's need for meta data, error information or what ever, and that's a perfectly valid claim. In this case though, there's no information of that type included in my response. One could also argue, that error information should only be a part of erroneous responses and not valid ones so something like:

// GET users/123
{
  error: false,
  data: {
    username: 'rikukissa',
    level: 'noob',
    info: {
      age: 26,
      email: '...'
    }
  }
}

is completely pointless.

Stick to convention, always

There are even wilder examples like { data: [], errors: [], meta: [] }, and the question is: what kind of information would you store in the meta field? I bet you a pint all suggestions you'll come up with would be better off transferred in some other part of the HTTP protocol like in the response headers. My worry is that these unconventional responses are in reality only used because of the lack of knowledge about best practices and the features HTTP provides by default. There are for instance very few error cases that couldn't be mediated with only HTTP response code.

Don't create places to dump code

Okay, let's move on. This is what we have now:

// GET users/123
{
  username: 'rikukissa',
  level: 'noob',
  info: {
    age: 26,
    email: '...'
  }
}

It looks fairly good, although the need for a separate info section puzzles me. There are many names for these kinds of sections like meta, info, data, recycle bin, garbage, utils, common. Let's face it: it's just a unstructured place where you can dump your new field without re-thinking the overall structure. If we were to model the response as a data type User, we'd most likely not treat "age" any different from the rest of the fields

interface IUser {
  username: string;
  level: string;
  age: number;
  email: string;
}

You may have noticed that the name examples I gave such as utils and common aren't really related to this specific use case. And there's a reason for this: people tend to do this everywhere. A directory in your codebase called common is just a telltale sign of bad design. There are very few modules that are "common" for all other modules of your codebase. I feel like I always say this, but to me it's the same as creating a file called functions.js with all the functions of your app inside of it. Please don't.

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 11, 2018

IDEA: Prefer passing data instead of functions as parameters

  • Special cases: high-level abstractions like map etc functions, callbacks..
  • Logic upwards propagation argument
@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 11, 2018

Example

From this:

getDynamicSelectOptions = (field: IField, values: IFormData) => {
  if (field.dynamicOptions) {
    switch (field.name) {
      case "city":
        return addresses[values.state].cities;
      case "cityPermanent":
        return addresses[values.statePermanent].cities;
      case "addressLine4":
        if (
          addresses[values.state][values.city] &&
          addresses[values.state][values.city].areas
        ) {
          return addresses[values.state][values.city].areas;
        } else {
          return [];
        }
      default:
        return [];
    }
  } else {
    return [];
  }
};

To this:

getDynamicSelectOptions = (field: IField, values: IFormData) => {
  if (!field.dynamicOptions) {
    return [];
  }
  switch (field.name) {
    case "city":
      return addresses[values.state].cities;

    case "cityPermanent":
      return addresses[values.statePermanent].cities;

    case "addressLine4":
      const hasAreas =
        addresses[values.state][values.city] &&
        addresses[values.state][values.city].areas;

      if (!hasAreas) {
        return [];
      }

      return addresses[values.state][values.city].areas;

    default:
      return [];
  }
};

Combination of:
Define value boundaries early, keep things flat
Shorter if branch first

The code's still quite difficult to read, but more about that later

@makker

This comment has been minimized.

makker commented Sep 20, 2018

Thanks for the useluf tips. I especially like the 'Define value boundaries early, keep things flat' tip! I'll try to remember that.

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Sep 20, 2018

Oh, thanks a lot for your feedback and for taking time to read through some of it :)

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Oct 4, 2018

How many levels of module dependency hierarchy is okay?

👍 Avoid creating dependency hierarchies deeper than 4 levels. Aim for 2-3.

Few times during my career I've come across projects with +10 levels of dependant modules. Most of them have been Java projects.
The issue is, code like that is extremely tiring to follow. I remember writing down the path I had taken whilst reading the code just so I won't forget how I ended up to the current module I'm reading. Not good.

I feel like the best way to approach this would be to try and describe the whole process from start to finish on the main level:


function getCake() {
  oven = getPreheatOven()
  pan = getGreasedPan()
  mixture = mixIngredients(getEggs(), getFlour())
  putMixtureIntoPan(pan, mixture)
  return bakeForMinutes(40, oven, pan)
}

As the alternative (bad) approach I see functions that instead of being written to flow from top to bottom,
are written to flow deeper and deeper the module tree

function getCake() {
  return getBakedCake(40)
}
function getBakedCake(minutes) {
  oven = getPreheatOven()
  pan = getPanWithMixture()
  return bakeForMinutes(minutes, oven, pan)
}
function getPanWithMixture() {
  pan = getGreasedPan()
  mixture = mixIngredients(getEggs(), getFlour())
  return putMixtureIntoPan(pan, mixture)
}

... and so on. The reader would have to go through 3 different files just to know what ingredients are used.

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Oct 10, 2018

How to solve cyclic dependencies between modules

AKA "why am I getting undefined when I import something from my module




Hey! A classic situation with a very generic and bad example (TODO). You've written few modules and soon notice that they depend on each other for one reason or another. Albeit being a **very strong indicator of bad design**, this is a common situation and one that's easy to solve:



user.ts

import { calculatePoints } from './points'
  
interface User {
   createdAt: number; // in milliseconds
   level: number;
}

export function updateUserPoints(user) {
  return {
    ...user,
    points: calculatePoints(user) // dependency to points.ts
  };
}

export function getUserRegisteredInYears(user) {
  return Math.floor((Date.now() - user.age) / 1000 / 3600 / 24 / 365); // Please, don't to this in real life
}

points.ts

import getUserAge from './user'
  
export function calculatePoints(user) {
  return getUserRegisteredInYears(user) * 3; // dependency to user.ts
}

There are few reasons why this happened in the first place:

  • Responsibilities of each module are too loosely defined
  • There are more concepts in play that you first imagined

And few solutions:

Responsibilities of each module are too loosely defined

Is there really a need for a point module? If not, the user module could look like this:

interface User {
   createdAt: number; // in milliseconds
   level: number;
}

function calculatePoints(user) {
  return getUserRegisteredInYears(user) * 3;
}

export function updateUserPoints(user) {
  return {
    ...user,
    points: calculatePoints(user)
  };
}

export function getUserRegisteredInYears(user) {
  return getYears(Date.now() - user.age);
}

However, if you answered "yes", could it be that the points module knows a bit too much about the internals of the user module? Could we, for example, turn the signature of calculatePoints into being:

- export function calculatePoints(user) {
+ export function calculatePoints(userAgeInYears) {

which would simplify the function to:

export function calculatePoints(userAgeInYears) {
  return userAgeInYears * 3;
}

There are more concepts in play from what you first imagined

In this very simple example, the only thing that the getUserRegisteredInYears function really provides us with is the conversion from milliseconds since the creation of the user to years. Could we add another module called time.ts? That way we could drop the getUserRegisteredInYears function altogether and refer to this new module from our points module:

points.ts

import getYears from './time'
  
export function calculatePoints(user) {
  return getYears(Date.now() - user.createdAt) * 3;
}
@rikukissa

This comment has been minimized.

Owner

rikukissa commented Oct 10, 2018

Stop repeating yourself!

You repeating yourself

👎

interface User {
 userCreatedAt: number;
 userPoints: number;
}

👍

interface User {
 createdAt: number;
 points: number;
}

Repeating the word user creates extra fuzz around the words that are actually relevant, without adding any value. The reader should be able to infer the context even when it's few lines above.


👎

it('should add two numbers together', ...)
it('should fail if divider is zero', ...)

👍

it('adds two numbers together', ...)
it('fails if divider is zero', ...)

By rephrasing the test case description, we can keep it as descriptive while making it more concise. Someone could even argue, that a test not only verifies that your code should do something, but that it actually does the thing it was written for.

Don't add unneeded context

@rikukissa

This comment has been minimized.

Owner

rikukissa commented Oct 10, 2018

Always name your constants

Got this idea while writing the bit about cyclic dependencies. You might have come across time calculation like this before:

// Returns true if token is older than 3 days
export function hasTokenExpired(token) {
  return (Date.now - token.createdAt) > 86400000 * 3;
}

The comment is pretty spot on, but we can do better:

export function hasTokenExpired(token) {
  const day = 24 * 3600 * 1000;
  return (Date.now - token.createdAt) > day * 3;
}

Not sure if you disagree, but to me it's much easier to read that something's greater than day * 3 (probably means 3 days) than that something's greater than 86400000 (possibly the mass of the Earth)

Great quote from clean-code-javascript:

We will read more code than we will ever write. It's important that the code we do write is readable and searchable. By not naming variables that end up being meaningful for understanding our program, we hurt our readers. Make your names searchable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment