@frobnitzem
Created September 14, 2023 06:16
Sane Conventions for Developers

These conventions are guidelines to help developers stay productive. Although following them does not guarantee sanity, not following these conventions has been known to produce undefined behavior. Incidentally, they also make a good scoreboard for assessing a project's maintainability.

Usability

  • Do not allow your project to grow beyond a few source files focused on a single goal. Symptoms are packages containing code that (while originally doing real work) now also writes container definitions, arranges shell variables, moves output files around, plays strategy games, integrates with the cloud in any way, or attempts to send/read email. If the real work were done already, some other program could be doing all those other things and calling your program when it gets more work.
  • Document your code before you write the code.
    • Design the central activities and data structures first.
  • Include an installation script. Not everyone knows how to use the ant build tool with a hackage project while writing a site-specific config file for your package. In fact, I don't think anyone does.
  • DRY with a twist: Don't provide two ways to do the same thing. One of them will do it ever so slightly differently (or, more likely, the user will forget how to do it consistently), and all hell will break loose. The support team will be called in. Extra FAQs will be written. New features will be added to address the lack of consistency. The vi versus emacs debates will cause a food fight in the cafeteria. You get the idea.

Developability

  • Do not allow your project to grow beyond a few source files focused on a single goal. Split off helper libraries. Build "high-level" packages on top of your project. Refuse to add functionality that doesn't address your original goal. If all else fails, abandon the project.
  • Use version control, make incremental changes, and run tests before committing.
  • Enable CI where possible.
  • Do spell out your project's goals (and non-goals) clearly so that you find like-minded developers while ensuring folks whose ideas don't mesh with your project don't waste their time on it.
  • Comment your functions and data, especially the more clever ones. Add some references to how the thing is supposed to be used. Doc-tests are amazing. Can you have all your tests just call your documentation? That would be fantastic. In the future, we'll have the AI write code based on our documentation. Spectacular.
  • Never incorporate hardware-specific headers, data or function calls (e.g. non-POSIX interfaces) into the main body of your code. Keep it on the side, in its own file(s). The golang convention here is a good one. The exception to this is when you are building an entire package specific to one hardware type (e.g. programming an Arduino or a USB missile launcher).
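The doc-test idea mentioned above can be sketched in Python, whose standard `doctest` module runs the examples embedded in docstrings as tests. The function `clamp` is a hypothetical example chosen for illustration; it does not come from any project named here.

```python
import doctest


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    # Run every example found in this module's docstrings as a test.
    print(doctest.testmod())
```

With this layout the usage examples in the documentation are the tests, so the two cannot silently drift apart.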

Code Correctness

  • Check types and use linters wherever possible.
  • Save and run your tests as explicit tests and include them with the package.
  • Default to defining variables and data-structures as immutable unless there is a compelling argument that modification is needed. Use as little mutable state as possible. This will decrease the number of code paths exponentially.
  • Define and use high-level data structures, e.g. stacks,
  • Avoid global variables. If you do use them, understand all the import and name clash complications that can occur.
  • Do all validation on all input data as it enters your code. Avoid scattering checks throughout your process. An exception to this rule is data that is correct until proven otherwise by an external system. Use exceptions (or correct chains of return codes) and transactions to roll-back the program state in such cases.
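As a minimal sketch of two of the points above (immutable by default, and validating input once as it enters the code), assuming a hypothetical `Job` record and `parse_job` function invented for this example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Immutable record: fields cannot be reassigned after construction."""
    name: str
    retries: int


def parse_job(raw: dict) -> Job:
    """Validate untrusted input once, at the boundary, and return a Job.

    Code that receives a Job afterwards does not need to re-check it.
    """
    name = raw.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    retries = raw.get("retries", 0)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(retries, bool) or not isinstance(retries, int) or retries < 0:
        raise ValueError("retries must be a non-negative integer")
    return Job(name=name, retries=retries)
```

Attempting `job.retries = 5` on the result raises `dataclasses.FrozenInstanceError`, so any later modification has to be an explicit, visible construction of a new value.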

Paths

Many "helpful" applications attempt to put a boatload of crap in $HOME (old Python, Julia, Spack, etc.), and don't clearly label configuration, cache, output files, or other random run-time information. This leads to an unusable $HOME directory. What was supposed to be a curated location for archival information becomes a dumping ground that is constantly running out of space. Backups are expensive, and backing up a cache makes no sense. Most application teams are unaware that this is a constant source of user frustration, especially when there is no way to change the default. A telltale symptom of this is when developers ask users to work around issues by changing the $HOME environment variable. Innovative approaches like $VIRTUAL_ENV actually line up with the recommendations below.

If you must create persistent file state (config files, accumulated outputs, caches, etc.), then

  1. Have at most one file (the main config file) that defaults to $HOME/.config/program-relevant-name.
  2. Document that file, and allow its location to be set dynamically by a program option (highest priority) or an environment variable (fallback).
  3. For other files, see the general guidance below. It is acceptable to use any path (and sub-paths) in arbitrary ways provided that path appears somewhere in the main config file. This ensures the user can easily find and set these other locations as needed.
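The lookup order in the list above (program option first, then environment variable, then the default under $HOME/.config) can be sketched as follows. The names `config_path`, `MYPROG_CONFIG`, and `myprog` are placeholders invented for this example.

```python
import os
from pathlib import Path


def config_path(cli_value=None, environ=None,
                env_var="MYPROG_CONFIG", name="myprog"):
    """Return the location of the main config file.

    Priority: program option, then environment variable, then the
    default $HOME/.config/<name>.
    """
    environ = os.environ if environ is None else environ
    if cli_value:                      # 1. explicit program option
        return Path(cli_value)
    if environ.get(env_var):           # 2. environment variable
        return Path(environ[env_var])
    home = environ.get("HOME") or str(Path.home())
    return Path(home) / ".config" / name   # 3. documented default
```

Passing `environ` explicitly keeps the function easy to test; a real program would typically feed `cli_value` from its argument parser.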

General guidance - system vs. local directories:

  • system directories are named as /(etc share bin lib include src)/<project>/files (for many files) or /(etc share bin lib include src cache)/<project> (single-file).
  • Note the addition of cache, meant to form a backing store that can be deleted file-by-file without consequence.
  • local directories are named the same, but prefixed with an arbitrary path (e.g. /<prefix>/etc/<project>.config).
  • Fallback search MAY be used to search for files in the system directories if they are not present in the local directories. Cases where fallback search is used MUST be documented.
  • With the exception of files under 'cache', all files under both types of directories MUST NOT be modified during normal program operation, but MAY be modified by installation/upgrade or other package-level maintenance.
  • With the exception of files under 'etc', files in these directories MUST NOT be modified solely to list dependency file locations or configurable program options.
  • Any other kind of run-time program data not mentioned above SHOULD NOT be stored in any of the directories above. It belongs in (a suitable sub-path of) /tmp or in the user's working directory.
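A minimal sketch of the fallback search described above, assuming the many-files layout `/<kind>/<project>/<filename>`. The function name `find_file` and its `root` parameter (which lets the system root be redirected for testing) are assumptions made for this example.

```python
from pathlib import Path


def find_file(kind, project, filename, prefix, root="/"):
    """Look in the local directory first, then fall back to the system one.

    kind is one of etc, share, bin, lib, include, src, cache.
    Returns the first existing path, or None when neither exists.
    """
    candidates = [
        Path(prefix) / kind / project / filename,  # local directory
        Path(root) / kind / project / filename,    # system directory
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Per the guidance above, a program that uses a lookup like this has to document that it does so.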

Environment Variables

Programs MUST document:

  1. what environment variables they read and how they interpret that information
  2. what environment variables they set (for the purpose of passing on to sub-programs)

Programs MAY document what environment variables they unset. If a program unsets all environment variables, the following SHOULD be defined when running sub-programs, since they are conventionally used by many programs:

  • HOME - path to the user's home directory
  • USER - name of the current user
  • PATH - colon-separated list of paths to search for executables
  • SHELL - full path to user's shell
  • EDITOR - default file editing program to use
  • TERM - type of terminal determining what characters are valid
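For a program that clears the environment before launching sub-programs, the list above could be applied roughly as follows. The helper `minimal_env` and the constant `CONVENTIONAL_VARS` are names invented for this sketch.

```python
import os
import subprocess
import sys

# Variables the list above says SHOULD be defined for sub-programs.
CONVENTIONAL_VARS = ("HOME", "USER", "PATH", "SHELL", "EDITOR", "TERM")


def minimal_env(environ=None, extra=None):
    """Build a clean environment holding only the conventional variables
    that are set in `environ`, plus any explicitly documented extras."""
    environ = os.environ if environ is None else environ
    env = {k: environ[k] for k in CONVENTIONAL_VARS if k in environ}
    env.update(extra or {})
    return env


if __name__ == "__main__":
    # The child is started with only the variables from minimal_env();
    # everything else in the parent's environment is dropped.
    child = [sys.executable, "-c", "import os; print(sorted(os.environ))"]
    subprocess.run(child, env=minimal_env(), check=True)
```

Anything passed through `extra` is a variable the program sets for its sub-programs, so it belongs in the documentation required above.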

There's a common practice of using environment variables to store secret data for web applications (or other applications that must work with secret keys, passwords, etc.). Applications doing this are not really secure against a read of /proc/<pid>/environ or similar, which apparently still shows the original values even if the application later updates its environment variables.
