@kieranajp
Created February 21, 2017 13:35
Notes from microXchg conf 2017 in Berlin

MicroXchg 2017

## Resilient functional service design

The problem

  • you don't make any money until you go to production
  • you don't make any money unless your software is available & responsive
  • distributed systems change the rules of making it robust - throwing money at hardware is no longer enough
  • Failure is now the norm, it's unpredictable, it's going to get worse. Don't try to avoid failures, accept they'll happen

Design for resilience

  • If you don't get the core resilience right all the monitoring & recovery won't save you
  • Systems should fail in isolation and not cascade
    • However sometimes services depend on each other from a business point of view
    • It doesn't matter how many circuit breakers you have, once you have a dep chain like this you're screwed
    • By trying to avoid this you accidentally build a monolith
  • We learn func decomposition, DRY, layered architecture, design for reusability
    • But this leads to tightly coupled, low cohesion, non-resilient services...
  • Caches to the rescue?
    • "Do you really think that copying stale data all over your system is a suitable measure to fix inherently broken design?"
    • Important, but not a replacement for good design

Re-learning systems design for distributed systems

  • Bulkhead design
  • Of course there's no silver bullet
  • Think about the foundation of design
    • High cohesion, low coupling
    • Separation of concerns
  • Resources
    • 1972 paper: decomposing systems into modules
    • Lean book
  • Do DDD (as opposed to EDD (entity) - ffs ubiquitous language)
    • DO NOT start with the data model
    • You'll find the separation of concerns in the business model, in the dynamic model
  • Short activation paths
    • Minimise the amount of internal remote calls to satisfy one request
  • DISMISS REUSABILITY
    • Reusability == coupling
    • Leads to bad service design & compromises availability
    • Rarely pays off, avg reusability factor of 1.1 or 1.2 - it needs to be 5 to be worth it
    • "Do not strive for reuse, strive for replaceability"
    • If a module should have been made reusable, it will become evident over time
  • Think about your communication paradigm
    • Horizontal (sync) vs. vertical (async / event) slicing
    • Influences overall service design a lot, and the resilience patterns to use
    • Choose carefully, don't limit your design options without understanding the reasons behind and ramifications of your decision
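
The bulkhead idea above (fail in isolation, don't cascade) can be sketched in a few lines. This is a minimal illustration, not something shown in the talk: the `Bulkhead` class, its `call` method and the `BulkheadFullError` name are all made up for this example. Each dependency gets its own capped pool of slots, so one slow service can't eat all the capacity.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a compartment has no free slots (hypothetical name)."""


class Bulkhead:
    """Caps concurrent calls to one dependency so it cannot exhaust shared capacity."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        # Fail fast instead of queueing: a waiting caller still holds resources.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"{self.name} bulkhead is full")
        try:
            return fn(*args, **kwargs)
        finally:
            # Always free the slot, even if the wrapped call raised.
            self._slots.release()
```

Usage would be one `Bulkhead` per downstream dependency, e.g. `payments = Bulkhead("payments", 10)` and then `payments.call(charge, order)`, where `charge` stands in for whatever remote call you make. Rejecting immediately is one possible choice; a short timeout on `acquire` is another.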

## DDD & REST - Domain-Driven APIs for the Web

  • If you accept a core domain object as a string, how do you know it's valid? e.g. a plain string email vs. an Email value object (VO)
  • Your persistence engine choice can change the way you think about your entities and aggregates
  • Ubiquitous language is scoped to a bounded context - e.g. "order" means something different in the PO context vs. the logistics context
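
A rough sketch of the string-vs-VO point, assuming a deliberately loose validation pattern (real rules will differ). The `Email` class and `send_welcome` function are illustrative names, not from the talk. The point is that anything typed as `Email` has already been validated at construction.

```python
import re
from dataclasses import dataclass

# Deliberately loose pattern, an assumption for illustration; real validation rules vary.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass(frozen=True)
class Email:
    """Value object: immutable, compared by value, valid by construction."""

    value: str

    def __post_init__(self) -> None:
        if not _EMAIL_PATTERN.match(self.value):
            raise ValueError(f"not a valid email address: {self.value!r}")


def send_welcome(to: Email) -> str:
    # The parameter type says the address was already checked, so no re-validation here.
    return f"welcome mail queued for {to.value}"
```

With a bare `str` parameter every function along the way has to either re-check the address or trust its caller; with the VO the check happens once.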

### Domain events

  • Level 0 - CRUD

  • Level 1 - Explicit operations

    • In terms of business operations (UL)
  • Level 2 - some operations are events

  • Level 3 - CQRS & ES - event all the things (out of scope of this talk)

  • Prevents feature creep - events help decouple - avoid integration issues

    • Move event creation to the aggregate - e.g. Order::complete
  • Treat events as an explicit concept - same as you're explicit with your types & VOs
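
A sketch of "move event creation to the aggregate", assuming a simple in-memory list of pending events. The notes only mention `Order::complete`; the `OrderCompleted` type and the `pending_events` field are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class OrderCompleted:
    """Explicit event type, treated like any other value object."""

    order_id: str
    occurred_at: datetime


@dataclass
class Order:
    order_id: str
    status: str = "open"
    pending_events: list = field(default_factory=list)

    def complete(self) -> None:
        # The aggregate guards its own invariant...
        if self.status != "open":
            raise ValueError(f"cannot complete an order that is {self.status}")
        self.status = "completed"
        # ...and records the event itself, rather than a service doing it afterwards.
        self.pending_events.append(
            OrderCompleted(self.order_id, datetime.now(timezone.utc))
        )
```

Whatever persists the aggregate can then publish `pending_events` after saving; how that happens (outbox, message broker, etc.) is outside this sketch.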

REST

  • REST is NOT CRUD over HTTP
  • DDD modelling can help a lot here
    • Aggregates - 3 important characteristics - identifiable, referable, scope of consistency
      • Same as REST resources!
      • Look at where your aggregates are, and shape your resources around your aggregate boundaries
      • Representation design matters, you should always take the aggregate into context
        • In REST you can represent this with hypermedia / HATEOAS
          • You can represent for example status with this - e.g. you can cancel an order when a "cancel" link is available, rather than based on status id
          • Reducing the complexity of business decisions that a client has to make
          • Reducing domain knowledge in the client
      • This helps with API evolvability
        • Key in a system of systems - allows deployment without forcing other systems to update as well - Blog post
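
The "cancel link instead of status id" idea can be sketched like this. The URL layout, the `_links` key (borrowed from HAL-style representations) and the rule about which statuses are cancellable are all assumptions for illustration.

```python
def order_representation(order_id: str, status: str) -> dict:
    """Build a resource representation whose links reflect what is currently allowed."""
    links = {"self": {"href": f"/orders/{order_id}"}}
    # Assumed business rule: only orders that have not shipped can be cancelled.
    if status in ("open", "paid"):
        links["cancel"] = {"href": f"/orders/{order_id}/cancellation"}
    return {"id": order_id, "status": status, "_links": links}


def can_cancel(representation: dict) -> bool:
    # Client side: no knowledge of status codes, just "is the link there?"
    return "cancel" in representation["_links"]
```

If the cancellation rule changes later (say, "paid" orders become non-cancellable), only the server changes; clients that follow the link keep working.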

## Microservice Websites

Problem

  • How to develop a website with multiple teams?
  • Different business units making a website that feels like one contiguous experience
  • Frontend as a bottleneck
  • "Decentralised Governance" gives an option for teams to choose different tools (book by fowler)
  • Mobile perf (the thing you were thinking of writing about)

Transclusion

  • Including all or part of an electronic document into one or more other documents by hypertext reference

  • Expose a fragment resource, /shopping-cart, consume declaratively like <img src="">

  • See: Edge side includes <esi:include src=""> - server side, W3C Note

    • Requires transpilation, supported by Akamai, fastly etc
    • Allows you to cache the shit out of most of your page, and just reload dynamic elements e.g notification
  • See: <h-include src=""> - client side library with custom elements, transitive includes, http2, lazy loading!

    • Async but has XHR lag
    • Vanilla JS and polyfills only
  • Can use both together to use best tradeoffs

  • now you have service dependencies - fragment is dependent on its own CSS/JS

  • Need cache busting

  • Server side transclusion works well here

  • Dude wrote a thing
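
To make the mechanism concrete, here is a toy server-side transclusion pass. It is only a sketch: real ESI processing happens in the CDN or proxy, and the `transclude` function and the `fetch` callback are hypothetical names. It only handles the plain `<esi:include src="..."/>` form.

```python
import re
from typing import Callable

# Matches <esi:include src="..."/> and <esi:include src="...">.
INCLUDE_TAG = re.compile(r'<esi:include\s+src="([^"]+)"\s*/?>')


def transclude(page: str, fetch: Callable[[str], str]) -> str:
    """Replace every include tag with the fragment returned by fetch(src)."""
    return INCLUDE_TAG.sub(lambda match: fetch(match.group(1)), page)
```

For example, with `fragments = {"/shopping-cart": "<div>2 items</div>"}` you could call `transclude(page, fragments.__getitem__)`. In a real setup `fetch` would be an HTTP call to the owning team's service, with the cacheable page shell and the dynamic fragment cached separately.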

## Microservices and mobile

  • Fast iterations for everyone (even on mobile)
  • As an industry we need to push to reject manual review (in app stores)
    • This is "mobile waterfall"
  • Couldn't release a feature that was only for Berlin, because reviewers were in US
  • Bugfix? Half a week >:|
  • Canary releases? Forget it
  • Extensive E2E tests or manual QA mitigate bugs, but aren't really a solution, and just make the process slower
  • Beta users (e.g. testflight) are good, but don't let you deploy 10 versions for canary testing
  • web vs native all over again
  • PWAs solve a lot of the cons of web apps :party:
  • Hybrid applications have the cons of both (not counting react native), but slightly quicker deployment cycles as you can update the embedded webapp. Also perf sucks

Roll your own solution

  • Crazy, but let's do it anyway
  • Building your mobile app as a parser for your microservice's data
  • Build it as configuration - when this event is received, perform these actions
    • Ask the server what should happen each time - application logic as a service!
    • This is fucking cool - basically your own DSL for your app
    • You can choose your trade-offs:
      • Performance vs. complexity
      • Quicker iterations
      • Cost efficiency

#### Pseudo-microservice for mobile

  • When your application starts, get a list of services from the backend
  • Register event handlers
  • When the event occurs, query your service and perform the resulting actions
  • This way you can even canary test multiple business logic workflows
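
The steps above might look roughly like this. Everything here is a stand-in: `fake_backend` plays the role of the server, the action names (`show_toast`, `navigate`) are invented, and the "stable" / "canary" variants just show how two workflows could run side by side without an app store release.

```python
def fake_backend(event: str, variant: str) -> list[dict]:
    """Hypothetical server: maps an event to the actions the app should perform."""
    workflows = {
        ("like_tapped", "stable"): [{"action": "show_toast", "text": "Liked!"}],
        ("like_tapped", "canary"): [
            {"action": "show_toast", "text": "Liked!"},
            {"action": "navigate", "screen": "recommendations"},
        ],
    }
    return workflows.get((event, variant), [])


class App:
    """The shipped app only knows how to execute actions, not when to run them."""

    def __init__(self, query_service, variant: str) -> None:
        self.query_service = query_service
        self.variant = variant
        self.log: list[str] = []  # stands in for real UI side effects
        self.executors = {
            "show_toast": lambda a: self.log.append(f"toast:{a['text']}"),
            "navigate": lambda a: self.log.append(f"navigate:{a['screen']}"),
        }

    def on_event(self, event: str) -> None:
        # Ask the server what should happen, then perform the resulting actions.
        for action in self.query_service(event, self.variant):
            executor = self.executors.get(action["action"])
            if executor is not None:  # skip actions this client version doesn't know
                executor(action)
```

The trade-off from the list above shows up directly: every event costs a round trip (or needs caching), in exchange for changing behaviour server-side.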

https://github.com/waterlink/LikeyLikes https://github.com/waterlink/LikeyLikesStatic https://git.io/vDJKG

## LT: Too many microservices

  • MSs are good
    • Iterate quickly
    • Smaller is simpler
    • Better defined responsibilities & teams
  • They mean
    • More services
    • More teams
    • More communication
    • Need for more documentation
  • Problems
    • What is available on the platform?
    • Platform architecture?
    • Who is responsible for a service?
  • Wikis are shit
  • You could try to collect metadata from your microservices
  • Before:
    • YAML files kept up to date
    • Wiki generated from these files
    • But search was limited, no immediate benefit
  • Then came up with pivio.io
    • Who owns what
    • Service registry
    • Fancy diagrams
    • Searchable (elasticsearch)
    • Built-in query language

## The pretty face of your microservice

  • Consider your API as a UI with developers as the users
    • Or machines! :robot_face:
  1. Start with a really good idea
    • A fancy API won't rescue a useless microservice
    • Don't be afraid to throw stuff out
  2. Match your system to the real world
    • Ubiquitous language and DDD
  3. Don't reinvent the wheel
    • Follow standards
    • Share patterns, traits and schemas - so paging, filtering etc. is always done the same way
  4. Internal consistency
    • The same action should work the same way
    • Pattern library
    • Consistent error codes & messages (drink)
    • Naming conventions (drink)
  5. Prevent errors
    • Make it very hard to make mistakes
    • Validation rules should go in API definition
    • Be tolerant when reading input
  6. Minimalist design
    • Don't put more stuff in your API than you need
    • Ask for the bare minimum, and avoid redundancy
    • Use references instead of the full dataset
  7. Help and documentation
    • Ensure reality matches docs
    • Make it easy for people to read (good structure & search, up to date)
    • Explain how to recover from errors
  8. Break the rules
    • When standards and usability disagree, follow usability
    • Remain consistent
    • Know why you are breaking the rules, master the rules first
  9. Don't justify your design
    • If your users don't like it, change it

## Applying runtime configuration to a microservice architecture

  • smartlaw.de
  • White label config (different tenant, same environment, so different config on per-request basis)
  • This one is boring as fuck and the guy is in a suit
  • But at least he has a tag cloud
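
One way to picture per-request, per-tenant config: tenant overrides layered over shared defaults, resolved when the request comes in. The tenant name, the keys and `config_for_request` are all invented here; the talk's actual setup wasn't captured in these notes.

```python
from collections import ChainMap

# Hypothetical values: shared defaults plus per-tenant overrides.
DEFAULTS = {
    "brand_name": "Default",
    "theme": "light",
    "support_email": "support@example.com",
}
TENANT_OVERRIDES = {
    "acme": {"brand_name": "Acme Legal", "theme": "dark"},
}


def config_for_request(tenant_id: str) -> ChainMap:
    """Resolve config for one request: tenant overrides first, then defaults."""
    return ChainMap(TENANT_OVERRIDES.get(tenant_id, {}), DEFAULTS)
```

Because lookup happens per request, the same running service instance can serve several white-label tenants; swapping `TENANT_OVERRIDES` for a config service would make it changeable at runtime.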

## Microservices: the organisational & people impact

  • Most of the problems with microservices are connected with people and organisational systems
  1. Strategy: situational awareness
    • Are microservices a good fit?
      • Middle management are latching onto microservices as a buzzword to brand things
      • Lipstick on the pig
    • Not understanding architecture principles
      • Build around business functionality (DDD)
      • Creating mini monoliths (12 factor!)
    • No well defined devops
      • Deployment / ops free for all
    • Microservices are not a silver bullet if you have these problems
      • Determine business goals, hypothesise, choose tech, and validate
    • So what are our goals?
      • Delivering value
      • Business agility
      • Safer, more rapid changes to software
      • Why jump to microservices? CI/CD/DevOps, value stream should come first. Important foundation
    • Wardley maps: useful technique to understand business and technical landscapes
    • You have to know where you are before you can decide where to go
    • Choose tooling to support your approach - don't change your approach because of your choice of tooling
  2. Define goals
    • SMART goals
      • Get your stakeholders involved
    • Communicate the vision
      • Map strategic goals to architecture goals
      • Map these back to development and practice goals
    • DO UBIQUITOUS LANGUAGE RIGHT ARGH
  3. Technical leadership is vital
    • Promote shared understanding - it's about communication
    • Do proper risk management
    • "Just enough" up-front design - how much architecture is just enough?
    • Conway's law is well accepted, but it's not so clear where architects sit
      • Overarching, consulting, or per team?
      • PO having final say is bad - three amigos pattern
    • Technical insanity antipattern
      • We created a tech mess
      • But no change is required???
      • We need a strong technical lead
      • If you can't create a well-architected monolith what makes you think you can make microservices???!
    • InnerSource
      • Encourages sharing and documentation
      • Reduces tribal knowledge
      • Draw a map, rather than copy the territory
      • Quality guidelines come baked into testing
  4. Evaluating tooling
    • Spine model
    • We get stuck at a dogmatic level with the tooling, the spine model helps you decide what you value rather than making knee jerk reactions
    • Bias is super real
    • https://www.amazon.de/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374533555
    • Antipattern: Technical envy
      • Blindly copying e.g. Netflix, Spotify
      • Learn the context, principles, practices, culture
      • Understand the advantages and drawbacks
  5. Feedback - visibility and constant learning
    • Business, architecture, operations
    • Business:
      • Dashboards and metrics are useful
      • Microservices should be business-driven
      • Validate hypotheses
      • Share metrics regularly
      • Show the benefit and business value the microservices bring
    • Architectural feedback
      • Your code as a crime scene
      • Visualise churn and complexity
      • Work out where the "slums" of your code are
      • Same thing for code quality
    • Antipattern: Trojan monoservice
      • When you accidentally make a monolith
      • Matthew Skelton: Types of software monolith
      • Continually retrospect on technical work using supporting metrics
    • Operational visibility
      • Logging, monitoring, alerting (drink!)
      • When bad things happen, people are always involved
      • Mikey Dickerson & healthcare.gov
      • A little standardisation goes a long way
        • Automation is the goal
        • Understand problems with postmortems, get at the root cause
        • Checklists provide structure
    • When done well, microservices enable agility
      • But if you don't build in signals and metrics, and you don't have the data and adapt to it, then there's limited benefit
  6. Responsibilities
    • Just change to squads, chapters, guilds. Problem solved!
      • Learn from conway, netflix, spotify, but don't blindly cargo cult
    • Devops
      • Antipattern: The fullstack myth
      • Define responsibilities, who owns what (gitlab: nobody owned backups)
      • Focus on what matters wrt microservices, you need to have devops nailed down first
      • Top: CI/CD - how much value is there in non-deployed code?
    • Antipattern: Water-micro-fall
      • The "perfect" microservice
      • Not validating assumptions
      • Change mindset to continuously deliver incremental changes to production asap
      • Dancing skeleton: Get something super simple through a pipeline to production ASAP
    • Change management is essential
    • Transformation is a process, you can't buy devops
    • Leading Change by John P Kotter

Day 2

## Authorization and Authentication in microservice environments

### Problem

  • Log in and see the UI, but the UI might be powered by different microservices
    • How do these know what the user is allowed to do?
  • JWT can help
    • Log into auth service, get JWT, UI sends JWT to microservices
    • Microservices can check token validity themselves (signature) - they don't need the auth service anymore
    • Two types, JWS / JWE
  • JWS
    • Three parts: header, payload (claims), signature
    • Header contains algorithm that was used
    • Payload contains iss, exp, sub... you know this stuff
    • Signed with a shared secret (HMAC) or a private key (RSA / ECDSA)
  • JWE
    • Five parts:
      • Header
      • encrypted key
        • symmetric
        • encrypted with shared secret
      • initialisation vector (salt)
      • cipher text
        • encrypted payload
        • encrypted with enc algorithm
        • encrypted using initialization vector
      • auth tag
        • also a result of enc algo
        • ensures integrity
    • Two additional keys in the header:
      • enc: Encryption algorithm for the cipher text
      • zip: compression algorithm
    • Pro: Everything is unreadable to the user
    • Con: Have to distribute private keys to microservices

## Secure Microservices Adoption

  • By isolating services, we isolate security risks
  • Not every service is equally important, some are higher value targets
  • Isolated services reduce overall security risks
  • Problem: End users need to interact with multiple services
  • Problem: End users frequently include many roles
  • Solution 1: API gateway
  • Solution 2: Backend for frontend for different end users
  • Trust boundary: at this point you need Authentication, Authorization, validation
    • Different use cases fit different authN solutions
    • internal boundaries
      • Maybe your service shouldn't access other, more sensitive services (privilege escalation)
  • Is this client allowed to access this entity?
    • Customer can't modify order for another customer
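
The entity-level check ("is this client allowed to access this entity?") is separate from authentication and from role checks. A minimal sketch, with made-up names (`Order`, `Forbidden`, `update_order_note`):

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Caller is authenticated but not allowed to touch this entity."""


@dataclass
class Order:
    order_id: str
    customer_id: str
    note: str = ""


def update_order_note(order: Order, caller_customer_id: str, note: str) -> Order:
    # Being logged in is not enough: the caller must own this specific order.
    if order.customer_id != caller_customer_id:
        raise Forbidden(f"customer {caller_customer_id} does not own order {order.order_id}")
    order.note = note
    return order
```

`caller_customer_id` would come from the verified token at the trust boundary, never from the request body.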

Secret management

  • Secret management software should:
    • Store & transfer encrypted
    • Audit all access
    • Rotate automatically
    • Fine grained ACL

## Beyond OAuth2: E2E Microservice Security

  • More teams
    • Smaller (2 pizza)
    • More independent
  • More trust boundaries
  • Speed to production
  • Mo' processes mo' problems :meth-mo:
  • OAuth2 to the rescue
  • One token to rule them all
    • Danger!
    • The token is too powerful - can do anything to the system as that user
    • Only limit is expiry
    • Token leakage is a big deal
  • Client credentials grant type can be used internally
    • Don't pass user jwt around, resource can get its own token representation
    • Match tokens to your org's trust boundaries
      • Teams maybe don't fully trust each other, apps perhaps shouldn't either
  • Proposal for new oauth2 grant type: token exchange
    • Given actor + subject + audience, get a new token
      • Policy decision given caller, user and intent
      • New token expresses these
    • Given actor + previous token + audience, get a new token
      • Policy decision based on delegation chain (call stack)
    • Now we can take internal trust boundaries into account
    • Pros:
      • User, client, call stack are part of policy decision
      • can request very limited power tokens (audience & scope)
      • Trust boundaries are unambiguous, all information is present to the auth server
      • Centralised policy management
    • Cons:
      • Network and auth server overhead
      • Security vs. performance tradeoff
        • Token caching & reuse
      • Policy management vs. agility
  • Dude made a thing, it's all java
    • JWT-ception
    • Single use JWT
      • 1 aud, 1 op
    • Embed JWT inside JWT for nested microservice calls
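
A toy model of the token exchange flow described above, to show what "policy decision given caller, user and intent" could mean in code. Tokens here are plain unsigned objects and the `POLICY` table, service names and scope strings are invented; a real auth server would issue signed tokens. The idea has since been standardised as OAuth 2.0 Token Exchange (RFC 8693), which this sketch does not try to follow in detail.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy: which actor may obtain which scopes for which audience.
POLICY = {
    ("checkout-service", "payment-service"): {"payments:create"},
    ("payment-service", "ledger-service"): {"ledger:append"},
}


class ExchangeDenied(Exception):
    pass


@dataclass(frozen=True)
class Token:
    subject: str             # the end user the call is made on behalf of
    audience: str            # the single service this token is valid for
    scopes: frozenset        # narrow set of operations
    delegation_chain: tuple  # services that have passed the request along


def exchange(actor: str, audience: str, requested_scopes: set,
             subject: Optional[str] = None,
             previous: Optional[Token] = None) -> Token:
    """Auth-server side: issue a narrow token if policy allows this actor/audience pair."""
    allowed = POLICY.get((actor, audience), set())
    if not requested_scopes or not set(requested_scopes) <= allowed:
        raise ExchangeDenied(f"{actor} may not request {requested_scopes} for {audience}")
    if previous is not None:
        # actor + previous token + audience: extend the delegation chain.
        if previous.audience != actor:
            raise ExchangeDenied("previous token was not issued for this actor")
        subject = previous.subject
        chain = previous.delegation_chain + (actor,)
    elif subject is not None:
        # actor + subject + audience: start of a chain.
        chain = (actor,)
    else:
        raise ExchangeDenied("need either a subject or a previous token")
    return Token(subject, audience, frozenset(requested_scopes), chain)
```

Each hop gets a token that only works for one audience and a few scopes, so a leaked token is far less powerful than "one token to rule them all". The cost is the extra round trip to the auth server per hop, as noted in the cons.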

## Understand, Automate, Collaborate for Development Speed with Microservices

  • The challenge of a postmodern software developer

  • Engineer of your own problems

  • I'd like to be writing some code, but I have to do other stuff

    • Navigating VCS
    • Clicking things in the browser
    • Jumping back and forth a LOT
    • CI and CD
    • DevOops
  • With microservices this is x100

  • Alt+Tab should not be a key skill!

  • We have great tools out there

    • But they're not really aware of each other
    • They also can't read your mind
    • You're the one who has to map that back to your context
  • Modern software development is a cognitive overhead problem

  • How many microservices do you need to comprehend to do your work?

  • This is the reason for SRP - reduce cognitive overhead!

  • but we keep creating the problem for ourselves

  • So:

    • What is of use?
    • What should I notice and what should I ignore?
    • Where do I need to go to get it?
  • Bring in information at the right time right to your eyeballs, as and when you need it

    • Shouldn't have to go and get it
    • No more "have you seen X" where X is a critical message that's lost in the noise
    • Actionable information at all times
  • Microservice systems become big data problems in their logging alone

    • What's the bare minimum I need to see?
    • What extra information can be used to enrich that?
  • We've been creating highly complex systems that seem to hate human beings for a long time

  • How to solve these problems?

    • Chat? (slack)
      • No wait, far too much chat
      • Noise amplification system
      • Too many integrations, too much giphy
      • Jet cockpit with all the dials
      • Please please no more slack
    • It has to be more than just show
      • Chat can end up a nightmare, but with the right thinking you can turn it into much more
        • It's a lousy dashboard, but a great way to collaborate
        • Get several people aware of something and working on it
        • Let's not turn slack into a shitty dashboard, it's not about pushing information in there. It's about doing stuff.
      • Observe
        • Make me aware
        • Notify me just when I need to know
      • Orient
        • Show me what I need to know
        • Supporting information
        • Who can I talk to? Who made the last commit?
        • Bring them in
      • Decide
        • Help me figure it out
        • Where should I look?
      • Act
        • Help me act
      • Chat is a great system for enabling an OODA loop for your system
        • Forget all the hype, use it for collaboration - what it's designed for
  • Visibility and Control to Automate Software Development

    • Less busywork
    • A key skill in microservices is creating a new project
      • Get it in CI
      • Get it deployed
      • etc.
      • Make this as easy as possible
      • Often you walk into a project that's already there, making something new is a key skill and takes much longer than you think
  • Atomist - make sense of your software development flow

    • Tighten your OODA loop inside a collaborative environment (slack)
    • Integrate all the things so you can get the right information to the right eyeballs and the right actions to get stuck in
    • Less yak shaving
  • Normal chat: STUFF IS GOING ON

    • @atomist: list issues
      • Takes deluge of info and helps you make decisions
    • @atomist: create issue
      • Uses slack thread (neat idea)
      • Gives you buttons for actions
    • Collaborative
    • E.g. travis build.... then release button appears
    • Rug files (dsl) for configuration or typescript
    • Also has editors for rewriting / linting code
      • Done through slack UI again :D

## Conway's Law and the Innovator's Dilemma

  • If the building blocks are already there it's easier to glue them together

  • Pedro got sherlocked

  • Divide and conquer

  • Independent, stable teams building independent services

    • If you want to go fast, go alone
  • Think global, act local

    • Global vision to align
    • Shared values to set focus
    • Local decisions - microservices, not micromanagement
  • Decentralised planning

  • Microteams, microservices, microwins

    • "billing" other teams for their services
    • "Cost" and service on a team level rather than org, so it's proportional
  • Planning, prioritising, and saying no

  • Made a proto-persona

  • Put him on the map from "crossing the chasm" bell curve

  • say no to those too far over the chasm, you can't develop everything at the same time and one sector is more important

  • Build-measure-learn-add more customers

  • Serendipity strikes again when reading the innovator's dilemma

    • Divide and conquer
    • Be friends with your fans
    • Deliberate ignorance
    • Optimised processes
    • Small teams big wins
  • 5 challenges

    • Companies depend on customers and investors for resources
    • Markets that don't exist can't be analysed
    • Technology supply may not equal market demand (minidisc)
    • An organisation's capabilities define its disabilities
    • Small markets don't solve the growth needs of large companies (but divide and conquer, feel more important)

With great power comes great responsibility

## Distributed Scheduler Hell

  • How we moved 100s of VMs into containers

  • How we deploy a distributed database into production (digitalocean/vulcan, port of Prometheus)

  • Requirements:

    • 3Gbps traffic
    • 20TB storage
    • <100ms read times
    • 100k write ops
  • Prior: Everything was a VM

  • Arduino: 1 process -> kernel (processes etc)

  • Distributed scheduler provides:

    • Container deploy
    • v memory deploy
    • memory quota
    • Disk storage
    • Networking
    • CPU scheduling
  • E.g. mesos

  • Deployed everything to mesos, no more devops... everything broke

  • Distributed: different applications will run on different nodes and you don't care

  • Kafka as a custom scheduler on top of marathon on top of mesos on top of linux

    • dafuq
  • Mesos killed kafka master and deleted all the data :D

  • Deleted mesos :D

  • Found hashicorp nomad instead

    • has its own gossip protocol so that there's no master to be down
    • If one datacenter goes off it still works
    • However nobody uses it and they don't handle state
  • kubernetes instead

    • Can hide complexity
    • Kubelets make things more resilient than mesos
    • Kube yml is verbose
    • Really good CLI
  • Upsides to distributed schedulers:

    • No more devops, just throw your containers up there and be done

Alles hat ein Ende, nur die Wurst hat zwei
