Skip to content

Instantly share code, notes, and snippets.

@arthurnap24
Created February 5, 2023 17:24
Show Gist options
  • Save arthurnap24/8e2d2b3a849c0b73294483f80cc6da62 to your computer and use it in GitHub Desktop.
Save arthurnap24/8e2d2b3a849c0b73294483f80cc6da62 to your computer and use it in GitHub Desktop.

Clean Bash

10/20/2022

Consider this script:

# file name: docker_publish_os.sh

#!/bin/bash

OS=$1
if [ -z $DOCKER_USER ] || [ -z $DOCKER_PASSWORD ]; then
    exit 0;
fi
docker login --username $DOCKER_USER --password $DOCKER_PASSWORD
if [[ $OS == "alpine" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
elif [[ $OS == "ubuntu" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
elif [[ $OS == "all" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
    exit 0
else
    echo "Error"
    exit 1
fi
docker tag $IMAGE $IMAGE_TAG
docker push $IMAGE_TAG

This seems readable enough especially if you have ample experience with Bash (and Docker). However these kinds of scripts overlook some issues:

  • The script name almost describes what the script does, but not really.
  • You have to read the code or external documentation to understand what the script does.
  • The fact that programmers typically specialize in different tech stacks.

Let's tackle the issues from top to bottom and see what we can do to address them.

Naming

I believe that programmers should have an idea of what a script should do, and the name of the script should match what it does. No other questions need to be asked as we see the title other than maybe granular details.

In our example, the file name is docker_publish_os.sh. There are inherently 3 questions that come to mind as we read this title.

  • What is docker?
  • What does it mean to publish something?
  • What does OS (operating system) have to do with publishing?

The first one is easily addressable: just Google what docker is. But then if we Google further we might see that Docker is just a flavor of a containerization tech stack. It might be a bit overkill to consider that we may swap Docker with other Open Container tools. No apparent improvement jumps to my mind after answering this question.

After learning about Docker through Googling and reading the script, I see that the script pushes the image to a repository determined by the operating system type of the container and under some specified account.

Alas, we found a potential improvement to the naming. We might morph the name as follows:

  • docker_publish_os.sh - original
  • docker_image_push_os.sh - let's define what publishing means
  • image_push_os.sh - let's consider that we might swap Docker with something else
  • image_push.sh - hmm, can we make the name sound like a verb?
  • push_image.sh - I like it!

New name is push_image.sh.

What does it do, are there docs?

Let's say we hire a programmer. The new hire:

  • specialize in Python and Java
  • needs to use the script as part of a new CI/CD pipeline

When the script is ran it will first not work. The new hire does not know:

  • DOCKER_USER and DOCKER_PASSWORD have to be defined as environment variables.
  • The script takes a command line argument that specifies the OS to tag the image to publish with.

New hire opens push_image.sh, and the first thing that pops up are Bash specific syntax. A slew of questions come:

  • What is $1?
  • What is -z?
  • Where are DOCKER_USER and DOCKER_PASSWORD defined?

Google, google, google. Takes about 5 mins? How about a junior engineer? 30 minutes? 1 hour?

For this we have an easy fix. A help script flag that shows us how to use the script.

# file name: docker_publish_os.sh

#!/bin/bash

# DOCKER_USER and DOCKER_PASSWORD as environment variable
OS=$1
if [[ $1 == "-h" ]]; then
    print_usage
    exit 0
fi
if [ -z $DOCKER_USER ] || [ -z $DOCKER_PASSWORD ]; then
    exit 0;
fi
docker login --username $DOCKER_USER --password $DOCKER_PASSWORD
if [[ $OS == "alpine" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
elif [[ $OS == "ubuntu" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
elif [[ $OS == "all" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
    exit 0
else
    echo "Error"
    exit 1
fi
docker tag $IMAGE $IMAGE_TAG
docker push $IMAGE_TAG

print_usage() {
    echo -e """Usage:
\tpush_image.sh <operating-system>
Supported operating-system values:
alpine - tags the local alpine-build-machine:latest
ubuntu - tags the local ubuntu-build-machine:latest with \$DOCKER_USER environment variable
all - tags all listed supported images with \$DOCKER_USER environment variable
The Docker commands in this script requires these environment variables:
DOCKER_USER=<your Docker account's username>
DOCKER_PASSWORD=<your Docker account's password>
"""
}

Noted, let's add usage strings triggered by a help flag. Now we don't have to worry about another programmer getting bogged down by trying to understand what the script does.

Not all programmers use Bash

Let's say this same new hire would have really wanted to understand the code. Perhaps take ownership of it as part of maintaining the code base for the team.

Since the new hire in all intent and purposes will have a hard time coding in Bash, I as a programmer writing this code should ensure that readers of my code should not experience this issue. How do we do this? Well let's say the code should read like a general language (English in my locale).

So let's say that the script really does two things:

  • Authenticate to remote Docker repository
  • Push the image to the repository

The first lines I should see in the code should look something like:

if [[ $1 == "-h" ]]; then
    print_usage
    exit 0
fi
OS=$1
docker_authenticate
docker_push $OS

By extracting the main idea of the script. We can see that reading the code bottoms out faster. We can quickly infer that the remaining lines in this script are just the details of what is in the main idea. Furthermore we can reduce the size of the functions we write with this same concept of extracting main ideas from code. Doing this iteratively, we come up with small functions that are easier to name, more expressive, and allows our readers to infer what the functions do instead of reading more code.

This is an idea directly from the Clean Code book by Robert Martin which says that functions should be small and do exactly what it sounds like it does. So let's break down the last part of the script

if [[ $OS == "alpine" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
elif [[ $OS == "ubuntu" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
elif [[ $OS == "all" ]]; then
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
    exit 0
else
    echo "Error"
    exit 1
fi
docker tag $IMAGE $IMAGE_TAG
docker push $IMAGE_TAG

Hmmm..., $IMAGE repeats too many times, all this really does is push images tagged by the OS. How about I define push_image() that takes an argument for the OS. The result is:

push_image() {
    OS=$1
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
}

# logic block:
if [[ $OS == "alpine" ]]; then
    push_image $OS
elif [[ $OS == "ubuntu" ]]; then
    push_image $OS
elif [[ $OS == "all" ]]; then
    push_image "ubuntu"
    push_image "alpine
else
    echo "Error"
    exit 1
fi

Sweet! We reduced the logic block from 22 to 11 lines!

Now this is where I am going to express my own taste as a programmer. There's another idea in the Clean Code book that functions without parameters should be preferred to functions with parameters. Some of you might disagree, but I would sacrifice more lines of code to maintain push_image() and get rid of yet another cognitive load that the function parameter introduces.

docker_push() {
    OS=$1
    if [[ $OS == "alpine" ]]; then
        push_image_alpine
    elif [[ $OS == "ubuntu" ]]; then
        push_image_ubuntu
    elif [[ $OS == "all" ]]; then
        push_all_supported_images
    else
        os_not_supported_error
    fi    
}

push_image_alpine() {
    push_image "alpine"
}

push_image_ubuntu() {
    push_image "ubuntu"
}

push_all_supported_images() {
    push_image_alpine
    push_image_ubuntu
}

push_image() {
    OS=$1
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
}

os_not_supported_error() {
    print_not_supported
    exit 1;
}

print_not_supported() {
    echo """
The operating system is not supported, please run with -h flag to view all supported OS.
"""
}

Notice that more granular definitions appear further down the script. It's a nice analogy with "digging" down the code. We can reorder the calls of Bash functions so we can take advantage of this feature to make sure that whoever reads the script sees the main idea of the script and just dig down the code. Finally bottoming out to the usage string. Let's see my final version of push_image.sh

# file name: push_image.sh

#!/bin/bash

main() {
    if [[ $1 == "-h" ]]; then
        print_usage
        exit 0
    fi
    OS=$1
    docker_authenticate
    docker_push $OS
}

docker_authenticate() {
    check_has_auth_env_vars
    login
}

check_has_auth_env_vars() {
    if [ -z $DOCKER_USER ] || [ -z $DOCKER_PASSWORD ]; then
        print_usage
        exit 0;
    fi
}

login() {
    docker login --username $DOCKER_USER --password $DOCKER_PASSWORD
}

docker_push() {
    OS=$1
    if [[ $OS == "alpine" ]]; then
        push_image_alpine
    elif [[ $OS == "ubuntu" ]]; then
        push_image_ubuntu
    elif [[ $OS == "all" ]]; then
        push_all_supported_images
    else
        os_not_supported_error
    fi    
}

push_image_alpine() {
    push_image "alpine"
}

push_image_ubuntu() {
    push_image "ubuntu"
}

push_all_supported_images() {
    push_image_alpine
    push_image_ubuntu
}

push_image() {
    OS=$1
    IMAGE="${OS}-build-machine:latest"
    IMAGE_TAG="${DOCKER_USER}/${IMAGE}"
    docker tag $IMAGE $IMAGE_TAG
    docker push $IMAGE_TAG
}

os_not_supported_error() {
    print_not_supported
    exit 1;
}

print_not_supported() {
    echo """
The operating system is not supported, please run with -h flag to view all supported OS.
"""
}

print_usage() {
    echo -e """Usage:
\tpush_image.sh <operating-system>
Supported operating-system values:
alpine - tags the local alpine-build-machine:latest
ubuntu - tags the local ubuntu-build-machine:latest with \$DOCKER_USER environment variable
all - tags all listed supported images with \$DOCKER_USER environment variable
The Docker commands in this script requires these environment variables:
DOCKER_USER=<your Docker account's username>
DOCKER_PASSWORD=<your Docker account's password>
"""
}

main $@

After all the improvements, we can see that:

  • Anyone reading this code needs very little knowledge of Bash since most of the logic blocks are contained in functions whose names are straight English.
  • There is very little cognitive load for programmers to understand what each function does since the concepts are so small per code block.
  • The script further documents itself without the need to mention how it works in a README file.
  • The usage of main() should not introduce further complications as it is a very common function in most programming languages as the first lines of code to be executed in the program.

Let's not forget, scripts are also code. Good code needs to be readable and self-documenting. We should also make it easy for other programmers to read an extend our code. This increases productivity in the future and also makes others lives easier.

I hope you enjoyed this article and found it insightful! Thanks! - AN

@bobbravo2
Copy link

bobbravo2 commented Feb 5, 2023

Another thought that may improve readability -- moving from positional to named arguments? OS=$1 might benefit from some DRYing -- I still struggle with remembering BASH code flow: i.e. returns vs. passing by reference 🤣

@arthurnap24
Copy link
Author

Thanks for the comments @bobbravo2!!! Glad you liked it!

Definitely named arguments are better IMO. But for that I usually use this this getopts utility. But it's still pretty heavyweight at least for this tutorial :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment