@postazure
Created April 27, 2018 15:42
Lead Dev NYC 2018 Conference Notes

Revitalizing a cross-functional product organization

Team health, velocity, strategic execution

wherewithall.com

  1. Clarify roles and responsibilities. What is each role responsible for, and how does that fit into the broader org? Ambiguity breeds frustration: inertia, chaos, ambivalence (no one feels empowered).

Discovery (& frame solution) -> Design -> Development -> Testing and Deployment -> Post-deployment Iteration. These never happen in neat chronological steps; it's messy. There is also no single role that handles each step: a functional role takes the lead, but they collaborate.

A Venn diagram that clarifies the responsibilities of each role shows which responsibilities are distinct and which are shared. This helps resolve conflict by making it clear up front which role is responsible.

(outcomes: accountability and velocity)

  2. Living product docs, strategy, and process. Documenting fosters the conversation and makes it repeatable.

Establish which documents, meetings, and milestones are necessary to keep the team on track.

Discovery
  • Documents: narrative doc or spec, with goals, non-goals, and success metrics (what is required and what is optional)
  • Meetings: feature kickoff, to see where there is agreement and where there is disagreement
  • Milestones: Discovery Complete, we are moving forward (late feedback might not be considered)

Design
  • Documents: design brief
  • Meetings: product review, to confirm that the design accurately addresses the original problem
  • Milestones: Design Complete

Development
  • Documents: technical plan
  • Meetings: architecture review, to justify technical requirements
  • Milestones: Code Complete (the bulk is done; we are moving on to testing and shipping)

QA and Deployment
  • Documents: test plan and deployment plan
  • Meetings: pre-deployment sync
  • Milestones: Test Complete and Shipped

Post-Deployment
  • Documents: post-ship feature updates (how is it performing and what might we change?)
  • Meetings: post-ship sync, to gain alignment and report back about the impact

  3. Lead difficult conversations. Ground rules for meetings (rules of engagement) make conversations more predictable:
  • Stay curious
  • Everyone is smart and trying
  • No computers/phones
  • Content stays in the conversation

People don't want to rat someone out because then they won't feel part of the group. Practice difficult conversations: how does it feel to say those words out loud? Other people will approach the conversation differently.

Create an environment where leads can role-play difficult conversations. Use scripts to separate the practice from reality. Everyone should practice, not just managers. Use a conflict mediator (not HR, which can create a power dynamic).

  4. Be mindful when you communicate. "Poor communication will dilute impact, frustrate employees, and result in failure."
  • Be mindful of your audience, what is at stake for the audience members
  • Be aware of medium, tone, and body language. Concise language might make it seem like you are frustrated.
  • Power dynamics
  • Is this person in a position to take the action I'm suggesting?
  • Does what I bring to the conversation actually elevate it? Is this constructive and productive?
  • Meet transparency with responsibility.
  • Assume best intentions (based on the information available to them). Consider what is going on behind the scenes. Practice empathy.
  • Listen to learn. Stay curious; others may have expertise in things you do not. Prepare to be surprised. Be excited to have your mind changed.
  5. Join forces. Disjointed things happen when teams are siloed. Cross-org meetings share important information. Answer questions together, publicly. Roll out information to everyone in tandem. Collaborate on when information will be shared.

Collaborative Debugging

"Debugging is twice as hard as writing the code in the first place." Repeated work between investigations, but context wasn't actually shared. So it was repeated.

Assumption: each person should claim and fix a bug. However, this can lead to repeated work and not actually fixing the bug. Even sharing a failed hypothesis can be helpful, because other devs won't repeat investigating it.

All context and investigative materials should be included in the ticket so the context is shared.

Microservices

  1. Costs
  • Operational Costs
    • Additional points of failure
  • Scaling
    • Also a cost! It requires a lot of infrastructure
  • Build and deploy pipelines
    • Teams should build and maintain their own pipelines
  • Integration costs
    • Make sure that we haven't broken contracts with clients and clients haven't broken contracts with us.
    • Error handling
  • Developer's costs
    • Running several services locally
    • Test Data
    • Network config

Use automation to account for costs.

  • Contract tests that ensure expectations are not broken (see the sketch below)
  • "Every cost that is incurred is compounded by the scope undertaken." Each microservice should provide its own benefit

Pick something that will really benefit or is well encapsulated, or maybe a part of the code that is ripe for refactoring.

  2. Over-centralization: too much shared code, which is a potential source of coupling. It also creates deployment issues because dependencies are shared, producing a "distributed monolith."
  • Avoid "too much shared code"
  • Follow the rules of library design
  • Consensus on standards over shared implementation
  • No business logic in libraries
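
A rough illustration of the "no business logic in libraries" rule (names are made up): generic, domain-free helpers can live in a shared library, while domain rules stay in the owning service so changing them never forces a coordinated multi-service deploy.

```python
# Illustrative sketch only; names are hypothetical.

# Fine to share: a generic, domain-free helper in a library.
def retry(fn, attempts: int = 3):
    """Call fn, retrying up to `attempts` times before giving up."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise

# Keep in the owning service: a domain decision, not shared-library code.
def is_order_refundable(order: dict) -> bool:
    """Business rule owned by the (hypothetical) orders service."""
    return order["status"] == "delivered" and order["days_since_delivery"] <= 30
```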

Organizational over-centralization: microservices require an organizational change, with integration between devs, ops, and QA.

  3. Neglecting the monolith. "The monolith will be gone soon!" is a lie, and it encourages devs to write bad code in the monolith. Write tests when working with your monolith.

"Microservices are a panacea!" - Also a lie. Service definition should start in the monolith. Microservices should own its own data so that it can own its own logic.

Use the Strangler Pattern: continue to extract logic from the monolith until the monolith's side is just a thin wrapper around the service calls.
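
A minimal sketch of that end state, with hypothetical names and URL; the monolith endpoint forwards to the extracted service, keeping the legacy path only until the migration finishes:

```python
# Strangler-pattern sketch (illustrative only): the monolith's invoice endpoint
# becomes a thin wrapper that delegates to the extracted invoices service.
import requests

INVOICE_SERVICE_URL = "http://invoices.internal"  # hypothetical extracted service

def _legacy_load_invoice(invoice_id: int) -> dict:
    # Original monolith logic (stubbed here); deleted once extraction is complete.
    return {"id": invoice_id, "source": "monolith"}

def get_invoice(invoice_id: int, use_service: bool = True) -> dict:
    if use_service:
        resp = requests.get(f"{INVOICE_SERVICE_URL}/invoices/{invoice_id}", timeout=2)
        resp.raise_for_status()
        return resp.json()
    return _legacy_load_invoice(invoice_id)
```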

"Domain Driven Design"

Keep in mind marginal accountability. Make room for autonomy. Own the whole process. Don't wait for microservices (leave it better than you found it).

Error budgeting - SRE

Reliability: we want to be 99% available, but that only accounts for up/down. It should be more about the quality of service. Could there be a degraded state where a microservice is down but only a minority of users are impacted?

Define a more nuanced error budget. Bugs may cut into the budget, so if there has been a lot of downtime, perhaps ship less risky features, or apply more scrutiny.

If the target is too ambitious it might negatively impact users: they won't get benefits like new features, but they will feel the pain of extra scrutiny.

Also, if there is minimal impact to users, perhaps spend more time on the issue instead of seeing everything as a crisis.

SLI (indicator), SLO (objective), SLA (agreement)

The indicator is a defined metric; aggregate it over a longer window to measure whether you are hitting your objective.
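
For example (illustrative numbers, not from the talk), a 99.9% availability objective over a 30-day window leaves roughly 43 minutes of error budget:

```python
# Back-of-envelope error budget derived from an SLO (illustrative numbers).
slo = 0.999                      # objective: 99.9% of the window is "good"
window_minutes = 30 * 24 * 60    # 43,200 minutes in a 30-day window
error_budget_minutes = (1 - slo) * window_minutes
print(round(error_budget_minutes, 1))  # 43.2 minutes of allowed unreliability
```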

Practical SRE

  • Metrics and Monitoring
    • Automatic record systems
    • High quality alerts
    • Only involve humans when the SLO is actually threatened (see the burn-rate sketch after this list)
  • Capacity Planning
  • Efficiency and Performance
  • Change management
    • These are mostly broken by new code and configuration changes
    • Need to be able to detect and automatically roll back
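
One generic way to page humans only when the SLO is actually threatened is an error-budget burn-rate check; this is a common SRE pattern rather than something prescribed in the talk, and the thresholds below are assumptions:

```python
# Burn-rate alert sketch: page only when the error budget is being consumed
# fast enough to threaten the objective, not on every blip.

def should_page(errors_last_hour: int, requests_last_hour: int,
                slo: float = 0.999, burn_rate_threshold: float = 10.0) -> bool:
    if requests_last_hour == 0:
        return False
    error_rate = errors_last_hour / requests_last_hour
    budget = 1 - slo                  # allowed error fraction under the SLO
    burn_rate = error_rate / budget   # 1.0 means spending budget exactly on pace
    return burn_rate >= burn_rate_threshold
```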

As long as the system is reliable enough, then we can experiment more.

Human errors are a system problem. You can't fix a person but you can fix a process.

Fear of modifying large systems

You made a change and something broke. Tomorrow you need to come back and make another change. What can make this more comfortable?

Toolkit for modifying large systems: incremental changes.

  • Measure your code.
  • Ramp up your work incrementally.

How to modify a black box with confidence?

  • Instrument, but return the same thing. So in this case measure performance to get a baseline.
  • Feature flags can be used to make incremental changes.
    • Incremental ramp ups (a% at a time)
    • And a kill switch

You can give other people the feature key, and they can turn the feature off without needing to know anything about the code.

Confidence from knowing where the off switch is.
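
A minimal sketch of such a flag, with a percentage ramp-up and a kill switch (the flag store and names are hypothetical):

```python
# Feature-flag sketch: incremental ramp-up plus a kill switch that anyone with
# access to the flag store can flip without touching code.
import hashlib

FLAGS = {
    "new_search": {"enabled": True, "rollout_percent": 5},  # ramp up a% at a time
}

def flag_is_on(flag_name: str, user_id: str) -> bool:
    flag = FLAGS.get(flag_name)
    if not flag or not flag["enabled"]:   # kill switch: off for everyone
        return False
    # A stable hash keeps a given user in or out as the percentage ramps up.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < flag["rollout_percent"]

if flag_is_on("new_search", user_id="user-123"):
    pass  # new code path
else:
    pass  # existing behavior
```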

New Manager Death Spiral

Your job is to aggressively delegate.

  • Let others change your mind.
  • Augment your obvious and non-obvious weaknesses with a diverse team. (You're going to have tense conversations and that's OK. Ideas do not get better with agreement; they get better with thoughtful disagreement.)
  • Delegate more than is comfortable

Trust: high-trust teams are more successful. They know that you're the boss, but you earn that trust by overcommunicating, delegating, and receiving feedback.

With each act, am I building or eroding trust?

Better Incident Command to Improve MTTR

The incident commander role increases the learning that happens during an incident: an individual who coordinates the incident response but is not responsible for fixing the issue. It is very important that you have a clear, well-documented process.

Roles, severity, life cycle

Tech systems, Human systems, Priorities (Data, UI, etc?)

Regulate the flow of emotion (Fight, Flight, Freeze)

  • Calm down anger
  • Slow down panic
  • Some people are going to get stuck

Regulate the flow of information

  • Who is there, what do they know and what do they care about

Regulate the flow of analysis

  • Ask questions
  • challenge assumptions
  • amplify voices

Continuous Culture

Only 10% of large projects make it to production (succeed); 74% of small projects make it to production.

Large projects have 64% unused features; in small projects only 14% of features are unused. These projects have more features removed from them.

Cynefin framework, on the complexity of software development: it is complicated, and causality is obscured by complexity. We understand that there is complexity, but with larger projects we cannot find causality.

Shorter lead times mean smaller errors and quicker recovery. Smaller means less risk.

Customers don't want continuous deployment; it can break trust. They need to have buy-in, so tell them why they want this.

Problem vs. Process fix (Processes that feel good)

When things go wrong, it's easier to fix the immediate problem than to drill in and fix the core problems.

It feels really good to be the rock star and fix the problem, and if you don't fix the process there will be plenty of problems to fix. It feels very different to be the 'auditor' and fix the process. That makes it hard to spend resources on fixing process over shipping new features.

"Rockstars don't get days off"

When you just patch stuff you're not increasing bus-count.

Rockstars become bottlenecks:

  • they create a dependent team (everyone just defers to them)
  • And they know that Rockstars are just going to come in and say 'Well actually...'
  • If someone asks a question and everyone turns to one person, then you've siloed your knowledge
  • And eventually, the rockstar will leave and the knowledge with them

Process can get rid of the 'Rockstar' mentality

Teams slow down

  • Lack confidence in ability, knowledge or autonomy
  • Lack of clarity about the goal of a given project

A good process is designed to create confidence and clarity.

"Trust your team b/c they are smart, and if they aren't smart then you should look at your hiring process."

Empowerment, no bottlenecks, knowledge is shared and documented.

How do we create a good process:

  • Excellent on-boarding and documentation - can a new dev commit a non-trivial thing on day one?
  • Ongoing internal education and training - when you encounter something new, then create a training (document it)
  • Frequent code reviews and coaching - these should be positive. Praise what was done and talk about the anti-patterns that could have been done. Put your own bad code on display and talk about it.
  • Comprehensive test suites
  • Internally consistent style and quality guidelines

@jlengstorf

But no one listens... Why processes fail: because it's about how people feel, not what they know. (Be right, but be appreciably correct.)

Processes should feel good

  1. Emotional Rewards
  • Less effort required to do things the new way.
  • Make it the default choice.
  • Immediate praise and positive feedback. Validate them.
  • Public praise
  2. Automation
  • Set up CI and CD for tests
  • Run Prettier automatically
  • ESLint...
  • Automatic test coverage count
  • The danger otherwise is that the response is inconsistent: unconscious bias, bad days, and good days all make it inconsistent.
  3. Simplification
  • Consider the cost of onboarding and training, and whether you can do that for every new person
  • Use stable open source tools when possible to reduce the things that you need to maintain.
  • Write code that is small and easy to delete. Build for now, not 5 years from now.
  • "Premature optimization can be a violent source of tech debt."
  4. Yak Shaving
  • Limit the team's exposure to yak shaving
  • Rabbit holes: 8 tasks deep just to get to the initial work.
  • Create zero or low config dev environments

Don't make people do a bunch of work before they can start working!
