Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Refactoring for maintainability

Refactoring for maintainability

There is no "right" way to do anything. Any best practice exists because it's been found by someone to be better than what they were doing before. Often there's a reason that one way is better... but no way is the perfect way. Everything has pitfalls.

If you've been working on a codebase — any code base — for more than a year, you have likely run into a few pitfalls. The decisions we make today have downstream consequences. It's difficult for young developers to tease out which choices will cause big problems in the future. It's difficult for seasoned developers to distinguish between how they usually do it and "the best way".

Best practices change all the time. Like any other type of technology, people are constantly finding ways to improve code. It can be a full time job keeping up with new developments... and emerging best practice don't always stick around. Anyone invested in the early days of ES6 may remember the hype around decorators. At the time they were the future. Today, they are an anti-pattern.

However, all hope is not lost. I like to think of it as being in the practice of refactoring. Refactoring is the practice of applying best practices. Want to keep your code base fresh? Keep your focus on constant improvement. Want to know which best practices are likely to stick around? Ask yourself if it will be easy to refactor later.

Key idea: Code that is easy to refactor is easy to maintain.

Rules of thumb

I like to keep some rules of thumb to help measure if a change is an improvement. How do I know if my code is refactorable? These are pretty generic rules but they cut to the core of what our code needs to be.

  1. Is it testable?
  2. Is it readable?
  3. Is it traceable?
  4. Is it editable?

Testability

Is it testable?

This is the driving force behind the functional programming revolution in JavaScript. Whether or not you understand the full history or depth of functional programming versus object oriented programming you can plainly see that simple functions are easy to write tests for.

Key idea: easy-to-write tests leads to easy-to-refactor code.

Anyone who is even vaguely familiar with Java knows about object oriented programming. For the past 20 or 30 years, the programming world has been embracing OOP at every turn. One of the biggest improvements in PHP in the early 2000s was the addition of classes. JavaScript was slow to adopt formal classes. In their absence the community invented classical structures on their own. Now JavaScript, with ES6, has them as first-class citizens.

There's only one problem: big classes are difficult to test. Class methods can become tightly interwoven and variables are difficult to trace. To test an individual method in a class might require setting up a complicated test suite to ensure all of the class properties are in the right state. Classes are not always easy to test.

By contrast, a pure function (one that get all variables as arguments and doesn't mutate anything) is very easy to test. This is so revolutionary that a framework like Redux and a template engine like React have taken over the world in a short amount of time. By contrast, Ember, a robust framework that is built using old-school OOP principles, fought for years to reach wide adoption but never really broke through.

The difference between a broad, sprawling framework like Ember and a terse, minimal framework like Redux is the difference between OOP and FP. Classes contain methods — functions that belong to that class. In functional programming, functions stand on their own; any function can use any other function. While OOP has an organizational appeal (it's clear where things go) its rigidity can prohibit sharing. The very notion of private methods and inheretance is the antithesis of functional programming.

What does any of this have to do with writing tests? Well, it's much easier to write tests around stand-alone functions than it is to write them for class methods. The emphasis on pure functions makes it even easier to test. With a pure function we only need to test that the inputs create the desired outputs. With a pure function we have a dramatically reduced testing suface — we only need to consider a single function while writing our test.

This has huge downstream consequequences for refactoring. We can update or eliminate a single function and have a narrow set of tests that need updated as a result. This is not always true in a densely packed class (or a huge, sprawling function), where many methods and properties can interoperate with each other. The perceived headache of reading through a huge 500+ line test file that relates to a huge 500+ line class is enough to discourage an engineer from even trying.

Key idea: When writing tests is easy, rewriting tests is easy too!

Testing a class

This is a simple class with a few setters. You have probably worked with classes that are more complicated than this. You can see that testing it isn't impossible but there is some boilerplate that needs to be set up in order to test the inner method.

class Hard {
  hard() {
    return this.a + this.b
  }
  setA(value) {
    this.a = value
  }
  setB(value) {
    this.b = value
  }
}

describe('hard', () => {
  it('adds two things', () => {
    const hard = new Hard()
    hard.setA(2)
    hard.setB(2)
    const result = hard.hard()
    expect(result).toBe(4)
  })

  it('adds two strings', () => {
    const hard = new Hard()
    hard.setA('two')
    hard.setB('two')
    const result = hard.hard()
    expect(result).toBe('twotwo')
  })
})

Testing a function

At the core of the class above is a simple method for adding two variables. You can see below that the core logic is easy to extract out into a stand-alone pure function. Testing that function is very straight-forward.

Key idea: simple functions are very easy to test.

const simple = (a, b) => a + b

describe('simple', () => {
  it('adds two things', () => {
    const result = simple(2, 2)
    expect(result).toBe(4)
  })

  it('adds two strings', () => {
    const result = simple('two', 'two')
    expect(result).toBe('twotwo')
  })
})

Readability

Is it readable?

One thing that came out of the Ruby world was trick programming. The ideas of chainable code leapt from Ruby into JavaScript in the form of jQuery. These days jQuery is a bit of a joke because its core utility — normalizing the DOM — is no longer necessary. However, jQuery had an out-sized impact on the development of ES6, namely in the realm of chainable functions.

Why am I talking about chainable functions in a section about readability? Because they are a neat trick that programmers love to abuse (see a problem with promises). It's possible to go crazy and make a big chainable mess that is difficult to untangle.

In Ruby, and in trick programming in general, there is a pressure to write everything tersely. In trick programming, a one liner is better. Terse programming has its benefits but overdoing it can lead to unreadable code.

One trick that new programmers love to abuse is ternary operators. They allow you to write an if statement in one line. How cool! This leads to rules like, "never nest a ternary". A few ternaries deep and it's impossible to tell what's going on! (Or, maybe nested ternaries are great!)

You can read more about "callback hell" for some amusing history of why promises are so popular and why chainability is important. You can read more about async/await functions for some even more amusing notes about why chainability is falling out of favor.

One-liners can obscure logic

If statements are a particular issue. Sometimes you need to check for a whole list of conditionals. The pressure to keep things in one line can lead to a huge if statement like this:

// This is hard to refactor.
if ((a !== undefined && typeof b === 'number' && a > b) && (c.contains(d) || e === f || a !== d)) {
  return a + b
}

Note: the code above is totally random and doesn't make any sense.

Self-documenting code expands refactoring surface

There is a notion in programming that your code should be self-documenting. Ever heard of javadocs? The idea was that you'd put a comment at the top of every function to document what variables it accepted and what values it returned. This has largely fallen out of favor because it's totally impractical. If your code is well-written you should be able to see the arguments and return values by reading the code. This is one reason people embrace static typing tools like typescript and flow: it allows you to annotate your code.

To that end, there is a real benefit to pulling out the core concepts of your function into well-named variables. This can help keep your logic terse and also aid in refactoring. As your function matures, you might consider moving some of those variables out into stand-alone functions.

Below we can see that — while it's still difficult to understand what's going on here — there are hints as to how the logic should work. Importantly, we can clearly see that the point of all that logic is to decide if we shouldAdd a and b. When it comes time to refactor this code we'll hopefully have an idea of where to start.

We've also expanded our refactoring surface immensely. Perhaps we don't hacve time to break each of those variables out into their own functions but a future developers could do this easily. Breaking small snippets of logic into small helper functions makes it easier to test each individual piece of logic.

// This is "readable"
const hasValidInputs = a !== undefined && typeof b === 'number' && a > b
const selected = c.contains(d)
const active = e === f
const notCurrent = a !== d
const shouldAdd = hasValidInputs && (selected || active || notCurrent)
if (shouldAdd) {
  return a + b
}

Just use prettier

These days we no longer have to bikeshed about the finer points of code-formatting. In JavaScript there are widely adopted tools like eslint that give you Microsoft Word style underlines in your code that point out formatting mistakes. More recently, tools like prettier have formalized most of the common formatting rules. So, if you want to make your code more readable by making sure it's consistently formatted, just use prettier.

Traceability

Is it traceable?

A sub-set of reading code is tracing the logic in the code. As you read through a file you need to understand the logic. This means following variables as they are passed from function to function. Anything that gets in the way of tracing where things come from is a problem.

Arg spreading is hard to trace

There are times when arg spreading is the best tool for the job. But typically you should avoid it unless you are doing something that specifically requires it.

In the example below we are spreading the arguments in our mapDispatchToProps function. Now I need to look in my component to see when onBlargle is called and look in my actions file to see what args blarge takes. This is another area where our code is no longer self-documenting.

Imagine: what if I was searching for every time we pass filters around? Does blargle deal with filters?

import { blargle } from 'modules/app/actions'

// what arguments blargle accept?
const mapDispatchToProps = (dispatch) => ({
  onBlargle: (...args) => dispatch(blargle(...args))
})

Prop spreading is hard to trace

Prop spreading is an invaluable tool for higher order components but it can seriously hinder traceability. When you spread props you make it nearly impossible to track down where a particular prop is coming from.

Imagine: what if I was searching for every time we pass filters as a prop? Does Thing deal with filters? Does it get filters from My component or somewhere else?

import React from 'react'

import Thing from './Thing'

// What props does My expect?
const My = (props) => (
  // What props does Thing receive?
  <Thing {...props} />
)

Import star is hard to trace

This one is a bit of a 50/50. One one hand it is very convenient to grab all of the exports from a module. And you can provide a kind of namespace to your imports, so that deeper in your code you can have some notion of where that was imported from. However, if we're focusing on readable and maintainable code we're probably trying to keep our file lengths small. In a 50-line file it should be clear where everyhting came from, namespacing or not.

Consider an example of using import *:

// What actions am I using?
import * as Actions from 'modules/app/actions'

// Although, it's clear where blargle comes from
Actions.blargle('hi!')

Versus an example of using a declarative import:

// I'm using blargle!
import { blargle } from 'modules/app/actions'

// But I have to look at the top of the file to figure out where this came from
blargle('hi!')

Large files are hard to trace

  • Over 100 lines is hard to trace
  • Large files tend to have more imports
  • Less than 50 lines is easy to understand
  • Small files have focused imports

Editability

Is it editable?

At the most basic level, refactoring code is simply editing code. It seems too obvious to mention but it becomes really important. Sometimes the reason to avoid a programming trick is enhance readability, but other times it's to enhance editability. When you write code you need to ask yourself, "how hard would it be to add something to this?"

Refactoring surface

What may not be immediately obvious in discussions about readable code is that the whole point to make the code easier to refactor. The idea of a "refactoring surface" is that there is an obvious place to put your cursor in order to make a change. Terse, trick code can mean you have to make a bunch of key strokes to start adding new code. So, a large refactoring surface means you have lots of reasonable places to insert new things without having to change a bunch of code.

This is the main idea behind dangling commas. If you always have dangling commas then it's easier to add new things to the list. And adding new things takes fewer edits. A simple dangling comma increases the refactoring surface.

Key idea: editable code provides a larger refactoring surface.

Premature optimization

It is possible to write very terse code, but it's not always a good idea. Imagine that I need to add a console.log to this code?

This code has no editing surfaces.

// How do I console.log(name)?
const My = ({ name, onBlargle }) => (
  <button onClick={onBlargle}>{name}</button>
)

I can de-optimize the code to get my editing surface back.

// How many lines did I have to change?
const My = ({ name, onBlargle }) => {
  console.log(name)
  return (
    <button onClick={onBlargle}>{name}</button>
  )
}

Leaving an editing surface

I might want to write terse things.

const mapStateToProps = (state) => ({
  name: selectName(state)
})

But I need to make changes over time. What if I need to start selecting the right name given the ID in the URL?

const mapStateToProps = (state, ownProps) => {
  const { match } = ownProps
  const { id } = match.params
  return {
    name: selectName(id)(state)
  }
}

And later I realize that I should let the selector grab the ID from ownProps. Should I kill the return? What if...?

const mapStateToProps = (state, ownProps) => {
  return {
    name: selectName(state, ownProps)
  }
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.