@stettix
Last active March 20, 2024 17:45
Things I believe

This is a collection of the things I believe about software development. I have worked for years building backend and data processing systems, so read the below within that context.

Agree? Disagree? Feel free to let me know at @JanStette. See also my blog at www.janvsmachine.net.

Fundamentals

Keep it simple, stupid. You ain't gonna need it.

You should think about what to do before you do it.

You should try to talk about what you’re planning to do before you do it.

You should think about what you did after you did it.

Be prepared to throw away something you’ve done in order to do something different.

Always look for better ways of doing things.

“Good enough” isn’t good enough.

Code

Code is a liability, not an asset. Aim to have as little of it as possible.

Build programs out of pure functions. This saves you from spending your brain power on tracking side effects, mutated state and actions at a distance.
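
As a rough illustration, here is a minimal Python sketch of keeping the logic pure and pushing side effects to the edge. The names `Order`, `order_total` and `report` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    quantity: int
    unit_price_pence: int


def order_total(order: Order, discount_percent: int) -> int:
    """Pure: the result depends only on the arguments and nothing is mutated."""
    gross = order.quantity * order.unit_price_pence
    return gross * (100 - discount_percent) // 100


def report(order: Order) -> None:
    """Impure edge: the only place in this sketch that performs I/O."""
    print(f"Total: {order_total(order, discount_percent=10)}p")
```

Because `order_total` has no hidden inputs or outputs, it can be tested by calling it with values and comparing the result, with no setup or mocking.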

Use a programming language with a rich type system that lets you describe the parts of your code and checks your program at compile time.

The expressivity of a programming language matters hugely. It’s not just a convenience to save keypresses, it directly influences the way in which you write code.

Choose a programming language that has a good module system, and use it. Be explicit about the public interface of a module, and ensure its internals don't leak out to client code.

Code is a living construct that is never “done”. You need to tend it like a garden, always improving and tidying it, or it withers and dies.

Have the same high standards for all the code you write, from little scripts to the inner loop of your critical system.

Write code that is exception safe and resource safe, always, even in contexts where you think it won’t matter. The code you wrote in a little ad-hoc script will inevitably find its way into more critical or long-running code.
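
One way to get this even in a throwaway script, sketched in Python under the assumption that the resource is a temporary file (`scratch_file` is a hypothetical helper, not a library function):

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def scratch_file(suffix: str = ".tmp") -> Iterator[str]:
    """Yield the path of a temporary file and always delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # we only need the path; callers open the file themselves
    try:
        yield path
    finally:
        # Runs on normal exit and also when the body of the `with` raises.
        if os.path.exists(path):
            os.remove(path)
```

The clean-up lives next to the acquisition, so the code stays safe when it is later copied into something long-running.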

Use the same language for the little tools and scripts in your system too. There are few good reasons to drop down into bash or Python scripts, and some considerable disadvantages.

In code, even the smallest details matter. This includes whitespace and layout!

Design

Modelling - the act of creating models of the world - is a crucial skill, and one that’s been undervalued in recent years.

Model your domain using types.

Model your domain first, using data types and function signatures; pick implementation technologies and physical architecture later.
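
A small sketch of what "types and signatures first" might look like, using an invented money-transfer domain. Every name here (`AccountId`, `Money`, `Transfer`, `validate`) is an assumption made for illustration; nothing about storage or transport appears yet.

```python
from dataclasses import dataclass
from enum import Enum
from typing import NewType

AccountId = NewType("AccountId", str)


class Currency(Enum):
    GBP = "GBP"
    EUR = "EUR"


@dataclass(frozen=True)
class Money:
    amount_minor: int  # smallest unit, e.g. pence or cents
    currency: Currency


@dataclass(frozen=True)
class Transfer:
    source: AccountId
    target: AccountId
    amount: Money


def validate(transfer: Transfer) -> list[str]:
    """Return a list of problems; an empty list means the transfer is acceptable."""
    problems = []
    if transfer.amount.amount_minor <= 0:
        problems.append("amount must be positive")
    if transfer.source == transfer.target:
        problems.append("source and target must differ")
    return problems
```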

Implement functionality in vertical slices that span your whole system, and iterate to grow the system.

Resist the temptation to use your main domain types to describe interfaces or messages exchanged by your system. Use separate types for these, even if it entails some duplication, as these types will evolve differently over time.
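
To make the distinction concrete, here is a hedged sketch with a hypothetical `Customer` domain type and a separate, versioned `CustomerCreatedV1` message type. The field names and the JSON layout are invented for the example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Domain type: free to change as the model evolves."""
    customer_id: str
    full_name: str
    credit_limit_pence: int  # internal detail that must not leak into messages


@dataclass(frozen=True)
class CustomerCreatedV1:
    """Message type: a published contract that changes on its own schedule."""
    id: str
    name: str


def to_message(customer: Customer) -> CustomerCreatedV1:
    # The explicit mapping is the "duplication", and also the decoupling point.
    return CustomerCreatedV1(id=customer.customer_id, name=customer.full_name)


def encode(message: CustomerCreatedV1) -> str:
    payload = {"type": "customer_created", "version": 1,
               "id": message.id, "name": message.name}
    return json.dumps(payload, sort_keys=True)
```

Renaming or adding a field on `Customer` now only affects `to_message`, not every consumer of the message.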

Prefer immutability always. This applies to data storage as well as in-memory data structures.
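
A minimal sketch of the same idea applied to both levels: frozen values in memory, and an append-only history standing in for immutable storage. `Address`, `AddressChanged`, `move` and `current` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Address:
    street: str
    city: str


@dataclass(frozen=True)
class AddressChanged:
    version: int
    address: Address


History = tuple[AddressChanged, ...]


def move(history: History, new_address: Address) -> History:
    """Append a new version; earlier versions are never overwritten."""
    return history + (AddressChanged(len(history) + 1, new_address),)


def current(history: History) -> Address:
    return history[-1].address
```

Usage: `replace(current(h), city="York")` builds a changed copy, and `move(h, ...)` returns a new history while the old one remains valid.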

When building programs that perform actions, model the actions as data, then write an interpreter that performs them. This makes your code much easier to test, monitor, debug, and refactor.
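
The pattern can be sketched as follows, assuming a made-up checkout flow. `SendEmail`, `ChargeCard`, `plan_checkout` and `run_dry` are illustrative names; a production interpreter would call real payment and mail services instead of building strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SendEmail:
    to: str
    subject: str


@dataclass(frozen=True)
class ChargeCard:
    customer_id: str
    amount_pence: int


Action = Union[SendEmail, ChargeCard]


def plan_checkout(customer_id: str, email: str, total_pence: int) -> list[Action]:
    """Pure decision logic: returns what should happen and performs nothing."""
    actions: list[Action] = []
    if total_pence > 0:
        actions.append(ChargeCard(customer_id, total_pence))
    actions.append(SendEmail(email, "Your order confirmation"))
    return actions


def run_dry(actions: list[Action]) -> list[str]:
    """A test/debug interpreter that only describes each action."""
    described = []
    for action in actions:
        if isinstance(action, ChargeCard):
            described.append(f"charge {action.customer_id} {action.amount_pence}p")
        elif isinstance(action, SendEmail):
            described.append(f"email {action.to}: {action.subject}")
        else:
            raise TypeError(f"unknown action: {action!r}")
    return described
```

The plan is plain data, so tests can compare it with `==`, and monitoring or auditing can log it before anything irreversible runs.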

Dependency management is crucial, so do it from day one. The payoff for this mostly comes when your system is bigger, but it’s not expensive to do from the beginning and it saves massive problems later.

Avoid circular dependencies, always.

Quality

I don’t care if you write the tests first, last, or in the middle, but all code must have good tests.

Tests should be performed at different levels of the system. Don’t get hung up on what these different levels of tests are called.

Absolutely all tests should be automated.

Test code should be written and maintained as carefully as production code.

Developers should write the tests.

Run tests on the production system too, to check it’s doing the right thing.

Designing systems

A better system is often a smaller, simpler system.

To design healthy systems, divide and conquer. Split the problem into smaller parts.

Divide and conquer works recursively: divide the system into a hierarchy of simpler sub-systems and components.

Corollary: When designing a system, there are more choices than a monolith vs. a thousand “microservices”.

The interface between parts is crucial. Aim for interfaces that are as small and simple as possible.

Data dependencies are insidious. Take particular care to manage the coupling introduced by such dependencies.

Plan to evolve data definitions over time, as they will inevitably change.
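
One common way to plan for this is to version stored records and upgrade old shapes on read. The sketch below assumes an invented change in which a single `email` field (version 1) became an `emails` list (version 2); the field names and version numbers are hypothetical.

```python
CURRENT_VERSION = 2


def upgrade(record: dict) -> dict:
    """Return the record in the current shape without mutating the input."""
    version = record.get("version", 1)  # records written before versioning count as v1
    if version == 1:
        upgraded = {k: v for k, v in record.items() if k != "email"}
        upgraded["emails"] = [record["email"]] if record.get("email") else []
        upgraded["version"] = 2
        return upgraded
    if version == CURRENT_VERSION:
        return dict(record)
    # Fail loudly rather than guess at a shape this code has never seen.
    raise ValueError(f"unsupported record version: {version}")
```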

Asynchronous interfaces can be useful to remove temporal coupling between parts.

Every inter-process boundary incurs a great cost, losing type safety, and making it much harder to reason about failures. Only introduce such boundaries where absolutely necessary and where the benefits outweigh the cost.

Being able to tell what your system is doing is crucial, so make sure it’s observable.

Telling what your system has done in the past is even more crucial, so make sure it’s auditable.

A modern programming language is the most expressive tool we have for describing all aspects of a system.

This means: write configuration as code, unless it absolutely, definitely has to change at runtime.

Also, write the specification of the system as executable code.

And, use code to describe the infrastructure of your system, in the same language as the rest of the code. Write code that interprets the description of your system to provision actual physical infrastructure.

At the risk of repeating myself: everything is code.

Corollary: if you’re writing JSON or YAML by hand, you’re doing it wrong. These are formats for the machines, not for humans to produce and consume. (Don’t despair though: most people do this, I do too, so you’re not alone! Let's just try to aim for something better).
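
To illustrate the "everything is code" idea, here is a hedged sketch in which a tiny system description is written as typed values and then interpreted into JSON. The output layout is invented for this example and is not the format of CloudFormation, Terraform or any other real tool; `Queue`, `Service`, `worker` and `render` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Queue:
    name: str


@dataclass(frozen=True)
class Service:
    name: str
    replicas: int
    consumes: tuple[Queue, ...] = ()


def worker(name: str, queue: Queue, replicas: int = 2) -> Service:
    """A reusable, typed template: the kind of abstraction plain YAML lacks."""
    return Service(name=name, replicas=replicas, consumes=(queue,))


ORDERS = Queue("orders")
SYSTEM = (worker("invoicer", ORDERS), worker("notifier", ORDERS, replicas=1))


def render(system: tuple[Service, ...]) -> str:
    """Interpret the description into a made-up JSON document for the machines."""
    queue_names = sorted({q.name for s in system for q in s.consumes})
    document = {
        "queues": [{"name": name} for name in queue_names],
        "services": [
            {"name": s.name, "replicas": s.replicas,
             "subscriptions": [q.name for q in s.consumes]}
            for s in system
        ],
    }
    return json.dumps(document, indent=2, sort_keys=True)
```

The JSON still exists, but it is generated, and the description that humans edit can be type-checked and unit-tested like any other code.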

The physical manifestation of your system (e.g. choices of storage, messaging, RPC technology, packaging and scheduling etc) should usually be an implementation detail, not the main aspect of the system that the rest is built around.

It should be easy to change the underlying technologies (e.g. for data storage, messaging, execution environment) used by a component in your system; doing so should not affect large parts of your code base.

You should have at least two physical manifestations of your system: a fully integrated in-memory one for testing, and the real physical deployment. They should be functionally equivalent.
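
A minimal sketch of two interchangeable manifestations, assuming a hypothetical key-value storage interface. SQLite from the Python standard library stands in for "the real thing" here; `KeyValueStore`, `InMemoryStore`, `SqliteStore` and `check_store_contract` are names made up for the example.

```python
import sqlite3
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class SqliteStore:
    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")

    def put(self, key: str, value: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None


def check_store_contract(store: KeyValueStore) -> None:
    """One contract check, run against every implementation."""
    assert store.get("missing") is None
    store.put("a", "1")
    store.put("a", "2")
    assert store.get("a") == "2"
```

Running the same contract check against both implementations is one way to keep them functionally equivalent over time.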

You should be able to run a local version of your system on a developer’s computer with a single command. With the capacity of modern computers, there is absolutely no rational reason why this isn’t feasible, even for big, complex systems.

There is a running theme here: separate the description of what a system does from how it does it. This is probably the single most important consideration when creating a system.

Building systems

For a new system, get a walking skeleton deployed to production as soon as possible.

Your master branch should always be deployable to production.

Use feature branches if you like. Modern version control tools make merging easy enough that it’s not a problem to let these be long-lived in some cases.

Ideally, deploy automatically to production on every update to master. If that’s not feasible, it should be a one-click action to perform the deployment.

Maintain a separate environment for situations when you find it useful to test code separately from production. Avoid more than one such extra environment, as this introduces overheads and cost.

Prefer feature flags and similar mechanisms to control what's enabled in production over separate test/staging environments and manual promotion of releases.
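
As an illustration only, a percentage-based flag can be as small as the sketch below. `Flag` and `is_enabled` are hypothetical names, and real flag systems usually add targeting rules and a way to change flags without a deploy.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    name: str
    rollout_percent: int  # 0 = off for everyone, 100 = on for everyone


def is_enabled(flag: Flag, user_id: str) -> bool:
    """Deterministically place a user in a bucket 0..99 and compare with the rollout."""
    digest = hashlib.sha256(f"{flag.name}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < flag.rollout_percent
```

Hashing the flag name together with the user id keeps each user's answer stable between requests, and raising `rollout_percent` only ever adds users.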

Get in the habit of deploying from master to production from the very beginning of a project. Doing this shapes both your system and how you work with it for the better.

In fact, follow all these practices from the very beginning of a new system. Retrofitting them later is much, much harder.

Technology

Beware of hyped or fashionable technologies. The fundamentals of computer science and engineering don’t change much over time.

Keep up with the latest developments in technology to see how they can help you, but be realistic about what they can do.

Choose your data storage backend according to the shape of data, types of queries needed, patterns of writes vs. reads, performance requirements, and more. Every use case is different.

That said, PostgreSQL should be your default and you should only pick something else if you have a good reason.

Teams

Hiring is the most critical thing you do in a team, so allocate time and effort accordingly.

Use the smallest team possible, but no smaller.

Do everything you can to keep the team size small. Remove distractions for the team, delegate work that isn’t crucial, provide tools that help their productivity. Increasing the team size should be the last resort.

Teams need to be carefully grown, not quickly put together. When companies brag about how many engineers they’re planning to hire in the near future, this is a massive red flag.

New systems are best designed by a small number of minds, not committees. Once the structure of the system is clear and the main decisions made, more people can usefully get involved.

Code ownership may be undesirable, but it’s important to have someone who owns the overall vision and direction for a system. Without it, the architecture will degrade over time as it gets pulled in different directions.

Prefer paying more to get a smaller number of great people instead of hiring more people at a lower salary.

Use contractors to bring in expertise, to ramp up a project quickly, for work on trial projects, or to deal with specialised work of a temporary nature. Don’t use contractors as a substitute for ordinary staff. (I say this as a contractor.)

If you’re hiring contractors because you can’t get permanent staff, you’re not paying your permanent staff enough.

As a team lead, if things go well, give your team the credit. If things go badly, take the blame yourself.

A team member who isn’t a good fit for the team can kill the performance of the whole team.

Inter-personal conflict is likewise poison to a team. While it may be possible to resolve underlying issues, if it proves difficult, it’s usually better for everyone to accept this and move people on.

Many people do “agile” badly, but that doesn’t mean that “agile” is bad.

Out of all the agile practices commonly used, estimating task sizes and trying to measure project velocity is the least useful. (Its utility is often less than zero).

People

Be kind towards others; everyone faces their own battles no matter how they appear externally.

Be kind to yourself. What you’re doing is hard, so accept that it sometimes takes longer than you’d like and things don’t always work out.

Don't let your own desire to get things done quickly turn into undue pressure on your peers.

The more certain people are about their opinions, the more you should take them with a pinch of salt.

Imposter syndrome is a thing, as is the Dunning-Kruger effect. So judge people on their contributions, not on how confident they seem.

Personal development

Always keep learning.

Read papers (of the scientific variety).

Read the classics. So many “new” ideas in software development have actually been around for decades.

Rest. Don’t work more than 8 hours a day. Get enough sleep. Have interests outside your work.

Work efficiently instead of working very long days. Try to remove distractions as much as possible, for example pointless meetings. (As an aside, some meetings are actually useful.)

When looking for a solution to a problem, don’t just pick the first one that comes to mind. Learn to look for alternatives and evaluate them to find the best choice.

Don’t work in isolation. Share ideas, talk new ideas and designs over with your peers.

Don't be afraid to have strong opinions, but listen carefully to the opinions of others too.

@stettix
Author

stettix commented Feb 8, 2020

Out of all the agile practices commonly used, estimating task sizes and trying to measure project velocity is the least useful. (Its utility is often less than zero).

So how do you plan the project? This is usually needed for higher-level stakeholders.

In my experience, gut feel is better than any system of calculating story points and trying to linearly extrapolate that to future work.

But any attempt at estimating software projects is bound to be very inaccurate. So I'd say the best approach is to educate stakeholders to accept this uncertainty, showing them that you can deliver incremental value, and let them steer the direction in choosing the priority of upcoming features.

@stettix
Author

stettix commented Feb 8, 2020

This gist is a great read, and hits close to home! Thanks for putting it together. Regarding the "Design" section, you put emphasis on "modelling", and the following sentence caught my attention:

Modelling - the act of creating models of the world - is a crucial skill, and one that’s been undervalued in recent years.

Do you happen to have some resources of choice for improving this skill?

Thanks for the kind words @primercuervo. And that's a very good question!

The Domain Driven Design community is a good source of material on domain modelling. See for example Scott Wlaschin's Domain Modeling Made Functional.

Otherwise I struggle to find good texts that teach the art of modelling. As I say, I think this is an area that doesn't receive enough attention! I mostly learnt what I know about this the hard way, through work on various projects. I also spent quite a few years working with semantic web technologies. The book Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL is a good one for this area, but is probably too specialised for general purpose modelling.

It's probably worth looking back in time to older sources for inspiration. "Data and Reality" by William Kent (published in 1978!) is a great source of information (if not a how-to book). Books from the database design community may be useful, but I don't know any of these myself.

This isn't a very satisfying answer I'm afraid - If anyone has suggestions here I'd be very interested to know!

@manawardhana

Wow, this is an excellent compilation of a library of books into a couple of pages. Each line could be expanded into its own book. Thanks for sharing. I believe every developer/architect should revisit this from time to time.
On

Resist the temptation to use your main domain types to describe interfaces or messages exchanged by your system. Use separate types for these, even if it entails some duplication, as these types will evolve differently over time.

I know the above point is the accepted norm and it is safe. I don't disagree.

But this is one of the most boring pieces of software development. The alternative is NOT using domain objects for messaging. We have to rethink the relevance of our habits of using types and OOP, which is a separate discussion.

GraphQL might have solved this to some extent (not for everybody nor everything). But I guess there are more alternate approaches.

@erebos12

Something I would add to your list:
Fundamentals: Don't repeat yourself! ;-) Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

@GrayStrider

GrayStrider commented Feb 18, 2020

if you’re writing JSON or YAML by hand, you’re doing it wrong

Can someone comment on this? I have my configs in JSON and YAML, how would one do it better? Or am I misinterpreting the message here?

@stettix
Author

stettix commented Feb 18, 2020

if you’re writing JSON or YAML by hand, you’re doing it wrong

Can someone comment on this? I have my configs in JSON and YAML, how would one do it better? Or am I misinterpreting the message here?

Sure thing @GrayStrider! My view on this is that for genuine configuration, i.e. settings that vary between each installation of a system or service, storing these in config files in whatever format is fine. This really should be a very small number of settings though.

What I see a lot of is people using "configuration" files to do much more than this, for example describing the cloud infrastructure that their system runs on, and how different components inter-connect. For large systems, this can result in many thousands of lines of YAML/JSON or similar (I've seen cases of six figures!). This is when it gets very problematic. JSON and YAML are clearly not very expressive: they have no type system, and very poor mechanisms for sharing common code and building abstractions. This leads to lots and lots of duplication (copy & paste), for example.

Sometimes people try to alleviate these problems by building on top of the JSON/YAML formats, for example using templating engines, or adding basic facilities for variable substitution and so on. These are usually terribly clunky, and a far cry from using a real programming language. And when was the last time you saw unit tests for such configs anyway?

What I would much rather do, is describe all aspects of my system as code, using a high level, functional language with a good type system, in a high level domain model. This description can be interpreted to produce concrete implementation, either in code, or in the config file formats needed to make things happen (e.g. CloudFormation templates, Terraform scripts etc). Apart from saving me the pain of writing JSON or YAML by hand, this also encapsulates implementation details about the infrastructure, which makes for a system that's much easier to maintain and evolve over time. And it can allow multiple implementations, e.g. for running a system locally vs. on a cloud provider, or it can support multiple cloud providers, and so on.

But as I said, most people don't do this (yet), so don't feel too bad if you don't either... :)

@GrayStrider

Would love to take a peek at some open-source examples of this misuse, @stettix, if you happen to know of any. I had no opportunity to work on such large systems yet, but indeed, it seems only reasonable to convert the configs into more dynamic formats once it's clearly needed

@stettix
Author

stettix commented Feb 23, 2020

Would love to take a peek at some open-source examples of this misuse, @stettix, if you happen to know of any. I had no opportunity to work on such large systems yet, but indeed, it seems only reasonable to convert the configs into more dynamic formats once it's clearly needed

@GrayStrider: I don't know of any such application, I'm afraid. This is a common problem: most open source projects are libraries, frameworks and middleware, and I don't know of many examples of actual applications and real systems of the type that exist around every enterprise. So it's difficult to see what such code looks like without actually working for lots of companies!

@bjartek

bjartek commented Dec 17, 2020

Add a section about meetings @stettix? You say some are useful. Some wisdom here would be very nice to have. From my perspective

  • Meetings should have an agenda, with background material sent to participants well before the meeting
  • Set aside time at the start of the meeting for participants to catch up on background material (some people never have time to do this up front)
  • Meetings should be used to decide, not for long discussions among a few participants (those are called workshops)
  • Meetings are not workshops
  • Meetings should have a note-taker who makes a summary of the meeting available in a wiki or similar
  • Writing down where you disagree in meetings is crucial for the paper trail

@stettix
Author

stettix commented Jan 13, 2021

Add a section about meetings @stettix?

That's probably worth a post by itself... maybe you should write it, @bjartek? :)

@bjartek

bjartek commented Jan 13, 2021

Good point, let me see if I can brainstorm something :)

@EugeCos

EugeCos commented Feb 8, 2021

Some very good pieces of advice! Many thanks for compiling and sharing it.

@ondrasek

Quite well written. The only point I disagree with is YAGNI. I believe that people very often misunderstand what YAGNI is about. In my experience, YAGNI should mostly relate to over-engineering, but is very often applied to business requirements. If we have a clear business need which we know we will prioritize 2 months down the road, we should take it into account. Not every system has the same cost of change and not everything is a simple web application which can be easily refactored.

@stettix
Author

stettix commented Feb 11, 2021

@ondrasek

The only point I disagree with is YAGNI. I believe that people very often misunderstand what YAGNI is about. In my experience, YAGNI should mostly relate to over-engineering, but is very often applied to business requirements

That is very much what I mean by it as well, keeping the implementation as simple as possible.

@pbernet

pbernet commented Jan 14, 2022

Thx for sharing. The one I am struggling with is:
...but all code must have good tests.

IMHO this should be:
...but the important parts of the system must have meaningful tests.

This is because we developers have a tendency to see green bars of passing "happy path tests" which were created in a hurry to get a decent score on the test coverage.

@flakey-bit

flakey-bit commented Aug 2, 2022

At the risk of sounding like a total fanboy, I agree with just about everything you've written.

One thing that I noticed was missing - do you believe in the "fail-fast" principle?

Obviously,

Build programs out of pure functions. This saves you from spending your brain power on tracking side effects, mutated state and actions at a distance.

&

Use a programming language with a rich type system that lets you describe the parts of your code and checks your program at compile time.

(which both essentially amount to "leaning on the type system") are my first preference 👍, but if your code finds itself in a situation it never expected to be in (preconditions violated) then it's almost always better to blow up than try to continue?
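
For concreteness, a small sketch of what failing fast on a violated precondition can look like. `allocate` is a hypothetical function invented for this example: it rejects invalid input immediately instead of returning a plausible-looking but wrong result.

```python
def allocate(total_pence: int, parts: int) -> list[int]:
    """Split an amount into `parts` shares that differ by at most one penny."""
    # Preconditions are checked up front; violating them is a caller bug.
    if parts <= 0:
        raise ValueError(f"parts must be positive, got {parts}")
    if total_pence < 0:
        raise ValueError(f"total_pence must not be negative, got {total_pence}")
    base, remainder = divmod(total_pence, parts)
    # The first `remainder` shares each carry one extra penny.
    return [base + 1 if i < remainder else base for i in range(parts)]
```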
