Resiroop Team Handbook

Contents

Preface

Welcome aboard! This handbook is about how we agree to work together. Don't expect it to be complete or correct. Rather, use it as a guide to make your first few weeks as a new joiner a bit easier.

Our main goal is to answer questions that you might have, point you in directions that we've found useful ourselves and highlight some things that are important to us.

Although we want to keep this handbook short, it's actually more comprehensive than it seems. We've tried not to repeat what has already been written elsewhere. Instead, we'll point you to selected literature where appropriate. Consider these references part of this handbook and make sure you read them carefully.

This handbook is organised into two parts. The first part is non-technical and about how we work together as a team. The second is about technical practices we use to make sure that we can ship high-quality software safely and continuously. We expect everyone to read the first part. If you're not in a technical role, however, feel free to skip or skim the second part.

PART 1: How we work

Agile Mindset

We believe in agile software development and are strongly aligned with the values and principles of the agile manifesto as well as ideas of Extreme Programming (XP). Use these values and principles as your north star to guide decisions you make on a daily basis.

A big part of how we work is captured in The Agile Samurai. We assume everyone on the team has read it, so please make sure you've done so before you start working with us!

Roles

Building great software often requires a tremendous amount of collaboration between people with different skill sets. While we try to hire T-shaped people whenever possible, we usually still need a number of people with distinct expert skill sets. Here are some roles that you will typically encounter in our teams and what they usually do:

  • Product Owners (PO) spend a lot of time with stakeholders and use their input to drive the product vision and backlog.

  • Tech Leads (TL) make sure that the team can turn the product backlog into working software - effectively and sustainably.

  • Business Analysts (BA) work closely with the PO, stakeholders and the team to get into the details of what exactly should be built (the emphasis is on what, not how).

  • Agile Coaches help teams to improve the way they work.

  • Software Engineers are responsible for every aspect of turning the product vision into working software. This includes collaborating and helping other roles, test automation, making architectural decisions, infrastructure automation, running the systems they built, being on-call, etc.

  • UX helps the PO and the development team create a great product by driving a great user experience. At the moment UX is a separate team, but we'd like them to be much more integrated into the development teams.

  • QA does exploratory testing, makes the team aware of quality problems and helps the BA define acceptance criteria on user stories.

Development Flow

Our main mechanism for scheduling work is a prioritised backlog of user stories. A user story has three critical aspects: the card, the conversations around it and the acceptance criteria. Generally speaking, a user story goes through the following stages, which we visualise on a Kanban Board:

  1. Story creation: Most stories are born as two or three words on an index card. Many of them are created during an inception or story writing session. Others are created on the fly. In general, they are created by the PO or BA following up from conversations with stakeholders.

  2. Story prioritisation: The PO and tech lead decide what to work on next. This happens on an ongoing basis.

  3. Story analysis: The BA looks into the details, which typically involves talking to a range of people. This potentially means further breaking up (or merging) stories. The output of this process is a user story that is independent, negotiable, valuable, estimable, small and testable (see INVEST). The BA and QA together also make sure that the story has clear acceptance criteria, which we write in Gherkin.

  4. Story review: The BA makes sure that other roles (e.g. UX, QA, Devs, Ops) have had a chance to review the story and give their input. At the end of this process, it should be very clear what the story is about, how large it is (possibly re-estimated) and what the high level tasks are to accomplish it.

  5. Story kick off: Before any engineering work starts, an engineer gets together with whoever wrote the story for a few minutes and makes sure there's clear, shared understanding of what the story is about and when it will be considered done.

  6. Story demo: The engineer(s) who last worked on the story demo it to the BA and QA when they feel they're done. This always happens in an integrated environment. If a story takes longer than one or two days to complete, we recommend doing a mid-story demo to avoid surprises and make sure that everything is still on track.

  7. Exploratory testing: After a successful story demo, the QA may spend some time on exploratory testing. Ideally, no further issues will be found at this point.

  8. Story sign off: One or all of BA/QA/Engineer demo the story to the PO. Again, there shouldn't be any surprises anymore at this point.

Note: It's quite normal that stories go through several iterations of points 2-4.

Other Backlog Items

Other items such as tech tasks or bugs are prioritised alongside user stories and enter the same prioritised backlog. Tech tasks are typically created and prioritised by the tech lead. Small tasks, such as fixing Broken Windows, are usually picked up in between stories and are not separately prioritised.

Rituals

We have quite a few rituals. You'll start picking them up as soon as you start working with us. Until then, here are a few of the more prominent ones:

Standups

We do them daily. They help us organise our work and make sure we're all pulling in the same direction. We try to keep them short and interesting for everyone. If someone goes off on a tangent, everyone is encouraged to pull them back.

Showcases

We do them weekly with our stakeholders. On the one hand, we do this to build trust by providing transparency and insight - primarily by showing signed-off software and project metrics (e.g. burn up charts). On the other hand, it's a great opportunity for both the stakeholders and us to make sure we're heading in the right direction.

Retrospectives

We do them every other week. We do them to reflect and improve the way we work together as a team. Most importantly, we always come up with some action items and follow up on them.

OKR Checkins

We've recently adopted OKRs. Every three months, we try to define a couple of them in line with the company OKRs. On a weekly basis, we have a 15-minute check-in where we review them, update our progress towards them and, when necessary, try to come up with additional ideas or actions to make sure we can achieve them.

Code Reviews

We don't have a formal code review process. Instead, we sit down and talk to other team members when we need help or when we see something that warrants a discussion. Sometimes we do code reviews with the whole team. In general, the metric we keep an eye on with regard to code quality is WTFs/min - no joke 😉

Pair Programming

Most of us are happy to pair-program but it would be a lie to say that we're doing it by default. So, if you're keen to pair program, we'd be very happy for you to do so and help us get better at it. It's hard work but we believe it makes for better teams and is incredibly rewarding personally.

PART 2: Continuous Delivery

To ship high-quality software safely and continuously, we use a number of practices collectively referred to as Continuous Delivery. These practices have been shown to lead to significantly higher team and organisational performance (see State of DevOps Report). Below is an outline of some of the ones you'll find in our team.

Version Control

We keep everything needed to create, test, install and run our system in version control. That means code, tests, db migration scripts, build scripts, deployment scripts, infrastructure and environment configuration, etc.

Moreover, stick to these rules when using GitHub:

  • never ever store unencrypted sensitive data in a repo;
  • try to manage access via teams;
  • repo owner and team maintainers are primarily responsible for managing access to repos;
  • choose meaningful and consistent repo names;
  • make sure every repo has a one line repo description;
  • make sure that all our GitHub profiles include our full name;
  • don't create secret teams.

Build Automation

We automate all the steps necessary to build a deployable artifact. This includes things like linting, compiling, executing tests and ultimately creating a deployable artifact (e.g. a Docker image, minified JS, etc.). We also make sure that:

  • builds can be executed via a single command;
  • builds can be executed from the command line;
  • builds don't require any human intervention;
  • builds stay fast and take no more than two minutes to run before pushing;
  • build statuses are binary, i.e. they either fail or succeed;
  • builds are executed by every team member before pushing to git.
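
To make the "single command" point concrete, here is a minimal sketch of a build script that chains the individual steps and fails fast. It assumes a Node/TypeScript setup; the lint, test and docker build commands are placeholders for whatever your component actually uses, not our real build.

```typescript
// build.ts - sketch of a single-command, non-interactive build (commands are placeholders).
import { execSync } from "node:child_process";

const steps = [
  "npm run lint",                          // static analysis
  "npm test -- --ci",                      // run tests without watch mode
  "docker build -t my-component:local .",  // produce the deployable artifact
];

for (const step of steps) {
  console.log(`\n>>> ${step}`);
  try {
    execSync(step, { stdio: "inherit" }); // stream output; no human intervention required
  } catch {
    console.error(`Build failed at: ${step}`);
    process.exit(1);                      // binary outcome: fail...
  }
}
console.log("\nBuild succeeded.");        // ...or succeed
```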

Continuous Integration

Continuous Integration is one of the core practices that enable Continuous Delivery. Recent research found that merging code into master on a daily basis contributes to higher performance. This practice is known as trunk based development, which is what we do. More specifically:

  • we commit (or merge) our changes to master at least once a day;
  • we link our CI server to master;
  • we treat every commit as a potential release candidate;
  • we don't create long-lived branches;
  • we use feature toggles for work that is not finished or ready for release (see the sketch after this list);
  • we try to keep the build green at all times;
  • we never commit/push on a broken build;
  • we try not to block others when fixing a broken build. Instead, we revert the change and fix it locally;
  • we take responsibility for fixing the build. Typically, the last person who committed will fix it;
  • we always make sure our changes go through successfully - especially before we go home.
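
Since feature toggles are what make daily merges to master possible even when a feature isn't finished, here is a minimal sketch of an environment-driven toggle. The toggle name, environment variable convention and checkout functions are made up for illustration; they are not our actual toggle mechanism.

```typescript
// featureToggles.ts - illustrative sketch of a simple environment-driven feature toggle.
export function isEnabled(toggle: string): boolean {
  // e.g. FEATURE_NEW_CHECKOUT=true enables the "new-checkout" toggle
  const key = `FEATURE_${toggle.toUpperCase().replace(/-/g, "_")}`;
  return process.env[key] === "true";
}

// Call site: unfinished work is merged to master but stays dark until the toggle flips.
export function checkoutHandler(cart: { items: string[] }) {
  if (isEnabled("new-checkout")) {
    return newCheckoutFlow(cart);   // work in progress, committed but not yet released
  }
  return legacyCheckoutFlow(cart);  // current behaviour
}

function newCheckoutFlow(cart: { items: string[] }) { return "new"; }       // placeholder
function legacyCheckoutFlow(cart: { items: string[] }) { return "legacy"; } // placeholder
```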

Test Automation

We build quality into the product, which means that engineers write all tests during development. The goal of our test suite is not to have 100% coverage. The goal is to get the fastest feedback possible to decide if a change can safely be deployed to production.

Testing Strategies

We use different testing strategies to achieve high confidence in our software while keeping cost low and feedback cycles short. We use the testing pyramid as a visual cue to help us maintain a balance between the different tests. We only test at higher levels what absolutely can't be tested at lower levels. We use the same testing framework for all levels of tests. Below is a summary of the different testing strategies we use.

Unit Tests

Unit tests exercise the smallest pieces of testable software in the application to verify they behave as expected. As a rule of thumb, if your test touches a disk, makes a network call or tests more than one class or function at the same time, it's probably not a unit test and you should think about how to isolate the behaviour you want to test. This is important, because we want our unit tests to remain super fast. We also want them to be very focused so that when they fail we know immediately why. Of course, it's entirely possible to achieve 100% unit test coverage without being able to start the app. That's because unit tests can't test all aspects of a system, which is why you need additional levels of testing.
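
As an illustration, a unit test for a small pricing function touches no disk or network and exercises exactly one behaviour per test. The function and the Jest-style assertions below are made up; substitute whatever test framework and domain code you're actually working with.

```typescript
// price.ts - the unit under test (illustrative)
export function applyDiscount(price: number, percent: number): number {
  if (percent < 0 || percent > 100) throw new Error("invalid discount");
  return Math.round(price * (1 - percent / 100) * 100) / 100;
}

// price.test.ts - fast, focused, no I/O
import { applyDiscount } from "./price";

describe("applyDiscount", () => {
  it("reduces the price by the given percentage", () => {
    expect(applyDiscount(200, 25)).toBe(150);
  });

  it("rejects discounts outside 0-100%", () => {
    expect(() => applyDiscount(100, 150)).toThrow("invalid discount");
  });
});
```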

Integration Tests

Integration tests verify the communication paths and interactions between components to detect interface defects. They are focused tests that probe the interaction of our code with an external system (e.g. make a DB query, load something from disk, etc.) or test the interplay between a number of classes to assert behaviour that does not emerge in isolation.
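
A focused integration test might look like the sketch below: one test, one real external boundary (here the file system), asserting behaviour that a fully isolated unit test couldn't catch. The ConfigStore class and file layout are illustrative.

```typescript
// configStore.integration.test.ts - probes a single external boundary (the disk).
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Illustrative class whose whole job is talking to the file system.
class ConfigStore {
  constructor(private readonly path: string) {}
  save(config: Record<string, string>) {
    writeFileSync(this.path, JSON.stringify(config), "utf8");
  }
  load(): Record<string, string> {
    return JSON.parse(readFileSync(this.path, "utf8"));
  }
}

describe("ConfigStore", () => {
  it("round-trips a config through the real file system", () => {
    const dir = mkdtempSync(join(tmpdir(), "configstore-"));
    const store = new ConfigStore(join(dir, "config.json"));

    store.save({ region: "eu-west-1" });

    expect(store.load()).toEqual({ region: "eu-west-1" });
  });
});
```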

Functional Tests (aka Component Tests)

Functional tests run against a component's public interface. This can mean multiple processes, as long as the team has control over the processes (e.g. service + database owned by same team is OK). All external services are stubbed/faked so that we've got complete control over the component under test. This step is crucial so that we don't end up with failing tests that we can't do anything about because the error happens in an external service that is not under our control. Also, it's the only way to coerce our system in a controlled way into a desired state so that we can test, for example, error scenarios.
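
In practice this often means starting a stubbed external service in-process and pointing the component under test at it, so we control every response, including the error cases. The sketch below uses Node's built-in http module; the /rates endpoint and payload are made up for illustration.

```typescript
// fakeRatesService.ts - illustrative stub of an external service for component tests.
import http from "node:http";

export function startFakeRatesService(port: number, failNextCall = false): http.Server {
  const server = http.createServer((req, res) => {
    if (failNextCall) {
      res.writeHead(503);   // coerce the component into its error-handling path
      res.end();
      return;
    }
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ base: "CHF", rates: { EUR: 1.04 } }));
  });
  server.listen(port);
  return server;
}

// In the component test, the component is configured to call http://localhost:<port>
// instead of the real rates service, so both the happy path and the 503 path can be
// exercised deterministically.
```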

Consumer Driven Contracts (CDCs)

CDCs verify that an external service meets the contract expected by a consuming service. They are very focused (i.e. easy to diagnose) and offer fast feedback. The basic idea is to express what a component expects from an external service in a machine readable way so that these expectations can be verified. We also run them against our fake external services so that we know they are in sync with reality.

As a general rule, we design our systems to be robust to changes. It's the second half of Postel's Law: "Be conservative in what you do, be liberal in what you accept from others." As a consequence, we don't depend on structure or elements of a response from an external system that aren't relevant to our component. And, of course, the CDCs reflect this.
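
Without tying this handbook to a particular CDC tool, the idea can be sketched as a check that asserts only the fields our component actually relies on and ignores everything else. The payload shape and field names below are illustrative.

```typescript
// ratesContract.ts - hand-rolled sketch of a consumer driven contract check.
// Only the fields our component depends on are verified; extra fields are ignored
// (the "liberal in what you accept" half of Postel's Law).
export function ratesContractProblems(payload: any): string[] {
  const problems: string[] = [];
  if (typeof payload?.base !== "string") problems.push("'base' should be a string");
  if (typeof payload?.rates?.EUR !== "number") problems.push("'rates.EUR' should be a number");
  return problems; // an empty array means the contract is satisfied
}

// Consumer side: run this against our fake rates service to keep the fake in sync with reality.
// Provider side: run the same check against the real endpoint in the provider's pipeline.
```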

System Tests (aka End-to-End Tests)

System tests verify that a system meets external requirements and achieves its goals, testing the entire system, end-to-end. System tests give us the highest level of confidence.

The problem with system tests is:

  • they are brittle because they rely on many moving parts;
  • they are expensive to run (e.g. spin up a browser, external services, etc.);
  • they are costly to maintain;
  • they are hard to diagnose (errors could be anywhere);
  • they cannot test all code paths (combinatorial explosion, error states).

As a result, we try to have very few of these tests. At this stage of the testing process, we should only verify behaviour that is impossible to verify at earlier stages/lower levels.

TDD

We've agreed to TDD all our production code, which implies absolutely no production code commits without accompanying tests. If this is not your default way of working, we encourage you to give it a try for a few weeks. We think that it will make you a better engineer and that your peers will thank you for it. As a reminder, here are the three rules of TDD:

  1. Don't write any production code unless it is to make a failing unit test pass.

  2. Don't write any more of a unit test than is sufficient to fail.

  3. Don't write any more production code than is sufficient to pass the one failing unit test.
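
As a tiny illustration of the resulting red-green-refactor rhythm, the test below is written first and fails, then just enough production code is added to make it pass. The slugify function is made up for the example.

```typescript
// Step 1 (red): write just enough of a test to fail (slugify doesn't exist yet).
describe("slugify", () => {
  it("lowercases and hyphenates a title", () => {
    expect(slugify("Team Handbook")).toBe("team-handbook");
  });
});

// Step 2 (green): write just enough production code to make that one test pass.
export function slugify(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, "-");
}

// Step 3 (refactor): clean up with the test still green, then repeat with the next test.
```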

BDD / ATDD

We embrace BDD as an outside-in methodology. This starts with identifying business outcomes, then drilling down into features that will achieve those outcomes, capturing features as user stories that define the scope of the feature through acceptance criteria and ultimately turning those acceptance criteria into automated tests. Consequently, the first thing we do before writing any production code is write a failing acceptance test (aka ATDD).
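
For example, the outside-in loop typically starts with an acceptance test derived straight from a story's acceptance criteria ("Given a customer with an empty basket, when they add an item, then the basket total reflects its price"). The sketch below expresses this as a plain Jest-style test rather than in any particular BDD tool, and the Basket API is made up.

```typescript
// Written first, failing, and then driving the implementation inwards.
describe("Adding an item to the basket", () => {
  it("updates the basket total", () => {
    const basket = new Basket();

    basket.add({ sku: "SOFA-1", priceChf: 1200 });

    expect(basket.totalChf()).toBe(1200);
  });
});

// Illustrative implementation that the acceptance test drives out.
class Basket {
  private items: { sku: string; priceChf: number }[] = [];
  add(item: { sku: string; priceChf: number }) { this.items.push(item); }
  totalChf(): number { return this.items.reduce((sum, i) => sum + i.priceChf, 0); }
}
```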

Avoid Fixtures

We try to minimise the use of test fixtures for a number of reasons:

  • the separation of data from tests makes tests less expressive;
  • it's difficult to see which part of the fixture data is relevant for a test;
  • tests lack focus because a lot of irrelevant data is typically loaded;
  • it's difficult to see which tests depend on which fixtures;
  • fixtures are hard to maintain because changes to data often mean updating many fixtures;
  • fixtures are hard to maintain because you often end up with lots of duplication.

Instead, we use Builders or Factories to generate data on the fly.
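
A typical builder gives every test a valid default object and lets each test override only the fields it actually cares about, which keeps the relevant data visible in the test itself. The Customer shape below is made up for illustration.

```typescript
// customerBuilder.ts - illustrative test data builder.
interface Customer {
  id: string;
  name: string;
  email: string;
  vip: boolean;
}

export function aCustomer(overrides: Partial<Customer> = {}): Customer {
  return {
    id: "customer-1",
    name: "Ada Lovelace",
    email: "ada@example.com",
    vip: false,
    ...overrides, // only the fields relevant to a test are spelled out at its call site
  };
}

// In a test, the one field that matters stands out:
//   const vip = aCustomer({ vip: true });
```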

Further Reading

Test automation is a broad topic and creating and maintaining a comprehensive, fast and reliable test suite requires strong software engineering skills. Thankfully, a lot has been written down. Make sure that, at a minimum, you read the following:

Deployment Automation

One of the enablers of Continuous Delivery is to build systems with testability and deployability in mind. As a result, we make sure that our components:

  • are easy to deploy (we typically use Docker);
  • are published to a shared repo so that others can deploy them on their environments;
  • can be deployed without human intervention.

Infrastructure Automation

The state of our infrastructure is all specified in a machine-readable way (at the moment, we use Terraform) and like everything else, version controlled. Again, we make sure no human intervention is needed to create or re-create infrastructure and environments.

Deployment Pipelines

A deployment pipeline is an automated implementation of an application's build, deploy, test and release process - basically every step of the value stream between development and release.

We create a deployment pipeline for every component that can be independently deployed, including its required infrastructure and configuration.

Build artifacts are usually propagated through a number of integrated environments (e.g. DEV, STAGING) before they end up in production (e.g. LIVE).

DevOps

We embrace the DevOps mindset and are also responsible for running the products and services we build. This includes the usual operational responsibilities such as setting up monitoring, alerting, scheduling on-call, responding to incidents etc.

Technical Excellence

We strive for technical excellence in everything we do, because we believe that this is what enables us to sustain a fast pace indefinitely. It's a very broad topic and trying to tackle it is beyond the scope of this handbook. No matter your level of experience, we hope you're as passionate about building software as we are and that you're driven to learn something new every day, to grow professionally and personally and stay abreast of what's happening in our industry.

If you're looking for a place to start, try here:

References

Here's a partial collection of references that this handbook, and the way we work, are based on.

Contributing

Don't like something, or feel something is missing? Please talk to us or submit a PR.
