- Abstract
- Do I need this?
- One thing at a time
- Wireframes
- Split changes in parts
- Data
- Functions and Methods
- Naming things
- Abstraction
- Data input, validation and handling
- SOLID
- One source of truth principle
- Design Pattern
- Deployment strategies and application requirements
- Contribution Guidelines
- Standards
- More documentation
- License
This is a loose collection of thoughts, ideas and best practices for software engineers. What is the difference between a developer and an actual engineer? A developer knows a programming language (very) well and is very good at implementing a given task in that language. An engineer typically does the work before the code is actually written: the engineer decides why and how something is implemented. Often, a single person fulfills both roles.
Thinking about ways to resolve an issue with code is different from actually hitting keys on the keyboard to write code. This guide isn't about the details of a specific programming language (even though it contains some code examples). Rather, it aims to teach you the basics of software engineering.
For every new change ask yourself:
- Do we need this?
- Do we already have a suitable lib or existing code we can reuse?
- How do we implement it?
- 80/20 rule: only implement a feature request if at least 80% of the users benefit from it
In a perfect world, all of this is discussed in a short standup meeting and documented in a Jira ticket. Also, ask yourself these questions before you start implementing, not at the end (which sadly happened a lot in the past).
- What is your goal?
- How do you achieve it?
- Implement it
Split every idea into these three parts and think about each question.
Todo: Add a new section here? There are always multiple ways to implement something. Code should not only work, it also needs to be efficient. Asymptotics are important: how does the code scale? How much time should I spend optimizing it if I could solve the problem by throwing more hardware at it? Does it actually matter how fast my code is? If I develop a service that talks to an API that is the limiting factor, the speed of my code doesn't matter (until someone replaces the API, then you're screwed).
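Example for asymptotics:
Accept a positive integer as input and sum up all integers from 1 to the input. Doing this with a loop scales linearly with the input, so it gets slower as the input grows. Computing $input * ($input + 1) / 2 instead takes the same, constant number of operations for any input.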
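A minimal Ruby sketch of this example, assuming the task is to sum all integers from 1 to n. The method names are made up for illustration. The loop does one addition per number, while the closed form (Gauss' formula) needs a fixed number of arithmetic operations no matter how large n is.

```ruby
# Sum of all integers from 1 to n, computed two ways.

# Linear time: one addition per number, so runtime grows with n.
def sum_with_loop(n)
  total = 0
  (1..n).each { |number| total += number }
  total
end

# Constant number of operations: closed form n * (n + 1) / 2.
def sum_with_formula(n)
  n * (n + 1) / 2
end

puts sum_with_loop(100)    # prints 5050
puts sum_with_formula(100) # prints 5050
```

Both methods return the same result. Only the amount of work differs, and that difference only matters once n gets large.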
Note: This is useful for planning new features and for debugging issues
- Draw a diagram/mockup/table of whatever you want to change
- Define which data is needed and modified
- Where does the data come from?
- Who has access to this data?
- In how many small tasks can we split this?
- More/smaller issues are better than a few big ones
- This allows many people to work in parallel on an issue, in short timeframes
An actual mockup works pretty well for graphical UI changes, but also for database migrations. For API/CLI changes, a diagram (flow diagram, sequence diagram) is nice.
Use different commits for changes, enhancements and new features:
- Each change should be categorized into those types
- Their workflow varies, so don't mix them up
- Makes review and reverting easier
- Bad example: https://github.com/voxpupuli/puppet-zabbix/pull/430/commits (+1 for splitting into multiple commits, but check the first one)
- Good example: https://github.com/voxpupuli/puppet-zabbix/pull/409/commits (many small changes, one in each commit)
- Do you generate any kind of output? Provide it in human-readable form if you want, but you have to provide it in machine-readable form!
- Separate data from code. This makes managing the data way easier. Nobody wants to change an API password that's used in 70 bash scripts, but a single JSON/YAML file is easy to parse and update. This also makes the data more reusable.
- Kind of represents business logic vs program logic (compared to the roles and profiles pattern)
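The two data rules above can be sketched in Ruby as follows. This is only an illustration: the settings keys, the example URL and all method names are assumptions, and in a real setup the JSON would be read from a file instead of a constant.

```ruby
require 'json'

# Hypothetical settings that would normally live in a file such as
# settings.json, outside of the code.
SETTINGS_JSON = '{"api_url": "https://api.example.com", "api_user": "monitoring"}'

# Parse the data once; the code only knows the keys, not the values.
def load_settings(raw_json)
  JSON.parse(raw_json)
end

# Build the result as plain data first ...
def build_report(settings)
  {
    'endpoint' => settings.fetch('api_url'),
    'user' => settings.fetch('api_user'),
    'status' => 'ok'
  }
end

# ... then render it for machines (JSON) ...
def render_machine(report)
  JSON.generate(report)
end

# ... or for humans (one "key: value" line per entry).
def render_human(report)
  report.map { |key, value| "#{key}: #{value}" }.join("\n")
end

report = build_report(load_settings(SETTINGS_JSON))
puts render_human(report)
puts render_machine(report)
```

Because the report is built as data before it is rendered, adding another output format later does not touch the logic that produces it.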
- Every function needs a meaningful name
- Every function should only do one thing
- It should be short, around 10-15 lines in Ruby
- Always, always test (function|user) input
- Example: rm -rf $var/ with an empty $var deletes everything below /. Steam on Linux did this once, as did the squid RPM on CentOS 6 or 7
- You are probably doing something wrong if your function takes a boolean as a parameter. Split it up into two functions. Source
- Every function and variable name should be in English, not German
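A minimal sketch of testing input before a destructive operation, written in Ruby and assuming a Unix-like filesystem. The error class and the method name are hypothetical. The method does not delete anything; it only decides whether a path is safe to hand to a delete call.

```ruby
# Raised when a path is not safe to delete.
class UnsafePathError < ArgumentError; end

# Returns the normalized path, or raises if the input is not a String,
# is empty, is relative, or resolves to the filesystem root.
def validate_deletable_path(path)
  raise UnsafePathError, 'path must be a String' unless path.is_a?(String)
  raise UnsafePathError, 'path must not be empty' if path.strip.empty?
  raise UnsafePathError, 'path must be absolute' unless path.start_with?('/')

  # Resolve ".." segments and trailing slashes before the final check.
  expanded = File.expand_path(path)
  raise UnsafePathError, 'refusing to delete /' if expanded == '/'

  expanded
end
```

A caller would run something like FileUtils.rm_rf(validate_deletable_path(cache_dir)), so an empty or unexpected variable raises an error instead of wiping the root directory.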
Thesis: there are only two real problems in computer science
- cache invalidation
- naming things
- off by one errors
https://martinfowler.com/bliki/TwoHardThings.html
Solution:
- You need a proper naming convention for variables and functions
- No matter what you choose, it should be consistent in your code and in your team
- Variable, class and method names should never be generic (like i as a counter), but specific
for i in $(cat bla); do
  echo $i
done
- Which file do we read?
- What is i? What are we iterating over?
for ((i=1; i<=COUNT_DRIVES; i++)); do
  # some code here
done
- What is i again?
- What is COUNT_DRIVES?
Better:
for ((drive_number=1; drive_number<=AMOUNT_OF_HARDDISKS; drive_number++)); do
  # some code here
done
woah woah, fucking complicated topic
- Abstraction is important
- It splits things into multiple pieces
- It allows multiple people to work on it (everybody gets one piece)
- It makes testing way easier
- It adds more boilerplate (looks bad, but that is a good thing)
- It is harder to understand :(
def zbx_check_if_template_is_linked(template_id)
  list = @zbx_api.templates.get_ids_by_host(hostids: @zbx_host_id)
  list.include? template_id
end
- What the hell is list again?
- Maybe we need the list of templates at another place as well?
def zbx_get_linked_templates
  @zbx_api.templates.get_ids_by_host(hostids: @zbx_host_id)
end

def zbx_check_if_template_is_linked(template_id)
  zbx_get_linked_templates.include? template_id
end
- Method name should always indicate what it does
- It should be easy to understand, without inline comments
- A function should not exceed 15 lines
- A class should not exceed 100-150 lines
- Use correct datatypes, not everything is a string
- This brings more performance and more security
Even strings can be "special strings" if you add new datatypes for them, even if it is just a simple regex
- Use correct array functions
https://dev.to/andrew565/which-array-function-when
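The idea of a "special string" with its own datatype can be sketched in Ruby like this. The Hostname class is a hypothetical example, and its regex is a deliberately simplified assumption, not a complete implementation of the hostname rules.

```ruby
# A hostname is more than "some string": it has rules. This hypothetical
# value type enforces a simplified version of them with one regex.
class Hostname
  # Simplified pattern: dot-separated labels made of letters, digits and
  # hyphens, where a label never starts or ends with a hyphen.
  PATTERN = /\A[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*\z/

  attr_reader :value

  def initialize(value)
    normalized = value.to_s.downcase
    raise ArgumentError, "invalid hostname: #{value.inspect}" unless PATTERN.match?(normalized)

    @value = normalized.freeze
  end

  def to_s
    value
  end

  # Two hostnames are equal if their normalized values are equal.
  def ==(other)
    other.is_a?(Hostname) && other.value == value
  end
end
```

Code that receives a Hostname no longer needs to validate it again, because an invalid one cannot be constructed in the first place.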
https://en.wikipedia.org/wiki/SOLID_(object-oriented_design)
- One class is only responsible for a single functionality, not multiple
- https://en.wikipedia.org/wiki/Single_responsibility_principle
- Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification
- Provide a way to extend the code, for example through inheritance
- Allow others to embed your code as a library (composition)
- https://dev.to/ruidfigueiredo/why-composition-is-superior-to-inheritance-as-a-way-of-sharing-code
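The single responsibility and open/closed ideas, combined with composition, might look like this in Ruby. All class names are hypothetical, and the notifiers only collect messages in memory so the sketch stays self-contained.

```ruby
# Each notifier has a single responsibility: deliver a message one way.
class LogNotifier
  attr_reader :messages

  def initialize
    @messages = []
  end

  def deliver(message)
    @messages << "log: #{message}"
  end
end

class ChatNotifier
  attr_reader :messages

  def initialize
    @messages = []
  end

  def deliver(message)
    @messages << "chat: #{message}"
  end
end

# AlertService is composed of notifiers instead of inheriting from one.
# Supporting a new channel means writing a new class with a deliver
# method; this class stays closed for modification.
class AlertService
  def initialize(notifiers)
    @notifiers = notifiers
  end

  def alert(message)
    @notifiers.each { |notifier| notifier.deliver(message) }
  end
end
```

A caller would wire it up with AlertService.new([LogNotifier.new, ChatNotifier.new]) and call alert('disk full'). Each notifier can be tested on its own.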
This is important for two topics: data and documentation. You should not replicate data. For example: you've got two Confluence pages to track hardware. Which of them is correct? Do you always update both? Probably not. Instead of maintaining multiple sources, implement a data flow: save the data to a database and generate the Confluence pages on the fly. Or simply link from one page to the other. Do not replicate data!
The same principle applies to documentation. You maintain a huge, handwritten documentation next to the software. Is the documentation always 100% correct? Do you update it with each software change? Probably not. This is bad. Generate the documentation from the code instead. Examples: swagger-ui for Java, puppet-strings for Puppet, YARD for Ruby, Sphinx for Python.
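A small Ruby sketch of the data-flow idea: one inventory is the only place where hardware data is stored, and every page is generated from it. The inventory records and method names are invented for illustration; a real setup would read from a database and publish the rendered pages.

```ruby
# The single source of truth: hypothetical hardware inventory records.
INVENTORY = [
  { name: 'db01', rack: 'A1', ram_gb: 64 },
  { name: 'web01', rack: 'B2', ram_gb: 16 }
].freeze

# View 1: a table listing every machine.
def render_overview(inventory)
  header = "| Name | Rack | RAM (GB) |\n| --- | --- | --- |"
  rows = inventory.map do |host|
    "| #{host[:name]} | #{host[:rack]} | #{host[:ram_gb]} |"
  end
  ([header] + rows).join("\n")
end

# View 2: the same data, grouped by rack.
def render_rack_list(inventory)
  inventory
    .group_by { |host| host[:rack] }
    .map { |rack, hosts| "#{rack}: #{hosts.map { |host| host[:name] }.join(', ')}" }
    .join("\n")
end

puts render_overview(INVENTORY)
puts render_rack_list(INVENTORY)
```

Both views are regenerated whenever the inventory changes, so they cannot drift apart the way two hand-edited pages do.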
- Observer pattern - notify objects on changes
- Operator pattern - Extend behavior for each inheritance
- Singleton pattern - There can only be one instance of a class
- State pattern - method in an object does something different, depending on the state
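As an illustration of the first pattern in the list, here is a hand-written observer sketch in Ruby. The Thermometer and TemperatureLog classes are hypothetical. The subject keeps a list of observers and notifies each of them when its value changes.

```ruby
# The subject: notifies all subscribed observers when its value changes.
class Thermometer
  def initialize
    @observers = []
    @celsius = nil
  end

  def subscribe(observer)
    @observers << observer
  end

  def celsius=(new_value)
    # Only real changes trigger a notification.
    return if new_value == @celsius

    @celsius = new_value
    @observers.each { |observer| observer.update(new_value) }
  end
end

# An observer: anything that responds to update(value).
class TemperatureLog
  attr_reader :readings

  def initialize
    @readings = []
  end

  def update(celsius)
    @readings << celsius
  end
end
```

The thermometer knows nothing about what its observers do with the value, so new reactions (alerting, graphing) can be added without changing it.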
- Codebase - One codebase tracked in revision control, many deploys
- Dependencies - Explicitly declare and isolate dependencies
- Config - Store config in the environment
- Backing services - Treat backing services as attached resources
- Build, release, run - Strictly separate build and run stages
- Processes - Execute the app as one or more stateless processes
- Port binding - Export services via port binding
- Concurrency - Scale out via the process model
- Disposability - Maximize robustness with fast startup and graceful shutdown
- Dev/prod parity - Keep development, staging, and production as similar as possible
- Logs - Treat logs as event streams
- Admin processes - Run admin/management tasks as one-off processes
- Use standardized formats for admin / internal / metadata APIs
- CloudEvents for event data formats
- Health Check Response Format for HTTP APIs
- OpenMetrics for metrics
- OpenTracing for distributed tracing in microservices
- Deployment rules - From the one and only Igor Galic
The list is based on The Twelve Factors, a community-driven guideline for building proper cloud applications.
Take a look at the VirtAPI project contribution guidelines or the Vox Pupuli guidelines. Define guidelines as early as possible for your project. It doesn't matter if it's a FOSS project or something within your company that will never be released to the public. At some point in the future, somebody will want to interact with your codebase. This will be a lot easier if you have proper documentation already in place.
Many people have worked for years to define standards, RFCs and protocols that unify software, communication across different programs and software development. Instead of reinventing the wheel or falling for the NIH syndrome, use existing standards:
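The "Config" factor from the list above (store config in the environment) can be sketched in Ruby as follows. The variable names DATABASE_URL, PORT and LOG_LEVEL and the default values are assumptions chosen for illustration.

```ruby
# Reads settings from an environment-like object (ENV by default).
# Passing a plain Hash instead makes the method easy to test.
def load_config(env = ENV)
  {
    # Required: fetch raises KeyError if the variable is missing.
    database_url: env.fetch('DATABASE_URL'),
    # Optional with defaults; Integer() rejects non-numeric values.
    port: Integer(env.fetch('PORT', '8080')),
    log_level: env.fetch('LOG_LEVEL', 'info')
  }
end
```

Failing at startup when a required variable is missing is a design choice in this sketch: a deploy with incomplete config stops immediately instead of breaking later at runtime.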
- Specification for building APIs in JSON
- RESTful API Guidelines from Zalando
- OpenID for authentication
- Open Policy Agent for authorization
- OpenTracing/OpenMetrics
A loose collection of useful links:
- My git cheatsheet
- My SysAdmin Manifest
- Google's SRE book
- Short quiz about software engineering
- An introduction to distributed systems
This document is licensed under CC BY-SA 4.0. The document was created by Tim bastelfreak Meusel.