@zachm
Last active September 29, 2016 10:56
Automacon

These notes were taken by me; they are mostly factual but slightly opinionated in places. Those places should be (I hope) obvious.

Typos are expected because I can only type so fast.

Fear, Uncertainty, and Continuous Deployment

Eric Sigler @ PagerDuty

They had issues in January related to their internal deployment tool: SOA, 100-ish engineers, etc. Terms:

  • Continuous Integration: validation and test execution
  • Continuous Delivery: creation of artifacts, docker and debs and things
  • Continuous Deployment: autodeploy to prod!

DevOps and agile are just ripping off lean management techniques from the 1950s. We also abuse the term 'code inventory': old pull requests, etc.

Fake it until you make it:

  • "You don't need 100% test coverage"
  • What you gonna do at the zombie apocalypse?
  • Canaries, blue/green, instrumentation
    • don't need it? just pause for 15 minutes, and let your monitoring scream if you have a problem.
    • keep the instrumentation simple: binary is good to start
  • CDep holiday: one day a week, just click and merge!
    • but engineer behind the curtain doing the manual steps
    • next: shell script to be the engineer
  • If you deploy your service, you are auto oncall for 30 minutes!
  • tl;dr keep it simple and gradually get complicated/automated
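The "pause for 15 minutes and let your monitoring scream" idea can be sketched roughly as below. This is my own minimal illustration, not PagerDuty's tooling: `deploy_with_soak` and its `deploy`, `rollback`, and `is_healthy` callbacks are hypothetical names, and the health signal is deliberately binary, as suggested above.

```python
import time
from typing import Callable


def deploy_with_soak(
    deploy: Callable[[], None],
    rollback: Callable[[], None],
    is_healthy: Callable[[], bool],
    soak_seconds: float = 15 * 60,
    poll_seconds: float = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Deploy, then watch a binary health signal for the soak window.

    Returns True if the deploy survived the window, False if it was
    rolled back. All callbacks are hypothetical hooks you would supply.
    """
    deploy()
    waited = 0.0
    while waited < soak_seconds:
        if not is_healthy():
            # Monitoring is screaming: undo the deploy and stop.
            rollback()
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True
```

Passing `sleep` in as a parameter is only there so the loop can be exercised without actually waiting 15 minutes.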

Bots Not Cattle

Josh Berkus @ RedHat

He worked on HA Postgres2.0, and observed how the industry in general feels about automating databases: "That's hard, we'll do it later."

Old metaphor: (Servers are) "Cattle not pets"

But now we have dockerdockerdocker ... kubernetes!

  • Problems: Cattle are dumb, need central management, move in one direction.
    • Central issues: infra scale, people scale, comms lag, app response latency
  • Ooh: FSM with limited state surrounding them!
    • Imagine the bounce service, waiting for checks to pass or fail
    • Transition through replication, startup, leader election, etc.
  • Bots: Automate the app. Do NOT automate the entire system!
    • Describes mesos frameworks as "bot-ish platform". Problems: lagginess, race conditions, resource usage
    • Frameworks for writing state machines? I guess the bounce was ahead of its time!
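To make the "FSM with limited state" idea concrete, here is a small sketch of a per-node bot as a state machine. The states and event names (`started`, `caught_up`, `leader_lost`, `won`, `lost`, `check_failed`) are my own assumptions loosely based on the replication/startup/leader-election transitions mentioned above; they are not taken from the talk.

```python
from enum import Enum


class State(Enum):
    STARTING = "starting"
    REPLICATING = "replicating"
    FOLLOWER = "follower"
    ELECTING = "electing"
    LEADER = "leader"
    FAILED = "failed"


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (State.STARTING, "started"): State.REPLICATING,
    (State.REPLICATING, "caught_up"): State.FOLLOWER,
    (State.FOLLOWER, "leader_lost"): State.ELECTING,
    (State.ELECTING, "won"): State.LEADER,
    (State.ELECTING, "lost"): State.FOLLOWER,
}


def step(state: State, event: str) -> State:
    """Advance one transition; unknown events leave the state unchanged."""
    if event == "check_failed":
        return State.FAILED  # a failed health check is terminal here
    return TRANSITIONS.get((state, event), state)


def run(events, state: State = State.STARTING) -> State:
    """Feed a sequence of events through the machine."""
    for event in events:
        state = step(state, event)
    return state
```

Each bot only needs its own state and the events it sees, which is the point of the metaphor: no central manager has to know the whole picture.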

Being an introvert at a conference is not as hellish as you think it is.

JJ Asghar @ Chef

This is basically what JJ does to both cope and get the most out of conferences.

  • First conference is the hardest, gets easier quickly.
  • Carry a game in your bag, people will just play with you randomly!
  • Sit at table facing people, they will just come to you.
  • "Booze helps, it really does."
  • Don't open laptop or notebook, it makes conversations hard.
  • Look at what people are looking at, you'll learn cool tidbits quickly.
  • Start an open space so crazy that people are curious.
  • Find a conference buddy, not someone you work with. Doors open.
  • Take breaks, you can skip a session if you want to. Recharge if you need to.
  • He made the talk so people would come up to him!

Rust's Community Automation

E. Dunham @ Mozilla

How can you apply the rust community's automation strategies to do a lot more FOSS with a small team?

  • Community: people work to make progress.
  • Automation: Tools to do people things.
  • Moderation: Rules + Enforcement
    • You can automate dissemination of community rules
    • Chat topics are useful, etc.
  • Auto-maintain a green repo that, by definition, passes all tests.
  • This tree should (always) be passing: encourages people to write tests, run tests, maintain the tests.
  • Yessssss Bot to nag the reviewer to review code!
  • Templates for issues and PRs.
  • Tune notifications to work for you:
    • Filtering all Github notifies is bad :/
    • Ugh I totally do this, I need to get better at that.
  • Great issue triage for new contributors: "I'm new, what should I work on?"
    • Reward maintainers for doing this triage by making it very public.
    • Great docs; train folks that the docs actually have answers.
  • Include a CONTRIBUTING.txt
  • There are sites for baby's first PR: codetriage.com, etc.
    • Get your project's bugs on there!
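The "green repo that, by definition, passes all tests" idea from earlier in this list can be sketched as a merge queue that only advances when the tests pass on the result of the merge. This is a toy model under my own assumptions, not the Rust project's actual bot: `process_merge_queue` and `tests_pass` are hypothetical names, and a "tree" is just a list of change ids.

```python
from typing import Callable, List, Sequence, Tuple


def process_merge_queue(
    green: Sequence[str],
    queue: Sequence[str],
    tests_pass: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Advance the green tree one approved change at a time.

    A change is merged only if the tests pass on the tree *with* that
    change applied, so the green tree never contains a failing state.
    Returns (new green tree, rejected changes).
    """
    merged = list(green)
    rejected = []
    for change in queue:
        candidate = merged + [change]
        if tests_pass(candidate):
            merged = candidate
        else:
            rejected.append(change)
    return merged, rejected
```

For example, with a test function that fails whenever a change named `"bad"` is present, a queue of `["a", "bad", "b"]` lands `a` and `b` and bounces `bad` back to its author.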

A culture of safety: minimizing misery in regulated environments

Elliott Murphy @ KindlyOps

"Safety comes from people." People can't be agitated or this goes to hell.

More regulations now, and more coming soon!

  • EU court invalidated Safe Harbor in 2015
  • EU passed data protection regs (GDPR) in 2016
  • US HHS Office for Civil Rights starts HIPAA audit program this year

And developers say: These rules are so lame and a waste of time/money!

Okay, now tell me about your HA load balancers and replicated DBs...

And this is key from my perspective: We all want to bikeshed the things we think are interesting/cool. None of us want to wrap our minds around stuff we didn't invent, don't control, or believe to be limiting our freedom.

Connect Controls to Requirements

  • Engineers should feel comfortable changing controls while still meeting the requirement.
  • A govt client required hiring folks with no misdemeanor convictions: the company was small so they kinda got around it.
  • But in the meantime, they had to add that to the job req :(
  • Once that client's contract was over, they REMOVED that requirement from the req.

Solve for intent

  • Policies go out of date, so retire them when they are no longer needed or make sense.
  • Don't get bogged down in the minutiae, remember your users.

Always be auditing

  • Google: "Racist algorithms" vs "Accountable algorithms"
  • Once again, remember your users
    • But take a proactive approach to how your systems are affecting them.
  • Compliance is not enough.

Workshop: Kubernetes for Sysadmins

Kelsey Hightower @ Google

Relevant repo: https://github.com/kelseyhightower/automacon

Note: These are very high-level, cherry-picked ideas. For this workshop, you'll just have to wait for YouTube to really do it justice.

Tetris bin packing with respect to the scheduler: You can't use the first fit because it's often the worst fit. Have to hold onto the bigger chunks for bigger deploys, otherwise we encounter fragmentation...
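A tiny worked example of why first fit can hurt. This is my own sketch, not how the Kubernetes scheduler actually scores nodes: `first_fit`, `best_fit`, and `place_all` are hypothetical helpers, and best fit is just one simple heuristic that "holds onto the bigger chunks".

```python
from typing import Callable, List, Optional, Sequence


def first_fit(free: Sequence[int], size: int) -> Optional[int]:
    """Index of the first node with enough free capacity, else None."""
    for i, cap in enumerate(free):
        if cap >= size:
            return i
    return None


def best_fit(free: Sequence[int], size: int) -> Optional[int]:
    """Index of the node that would have the least capacity left over."""
    candidates = [(cap - size, i) for i, cap in enumerate(free) if cap >= size]
    return min(candidates)[1] if candidates else None


def place_all(
    free: Sequence[int],
    requests: Sequence[int],
    choose: Callable[[Sequence[int], int], Optional[int]],
) -> List[Optional[int]]:
    """Place requests in order; None marks a request that did not fit."""
    remaining = list(free)
    placements: List[Optional[int]] = []
    for size in requests:
        idx = choose(remaining, size)
        if idx is not None:
            remaining[idx] -= size
        placements.append(idx)
    return placements


# Two nodes with 8 and 4 free units; a small job arrives before a big one.
print(place_all([8, 4], [3, 8], first_fit))  # [0, None]: the big job is stranded
print(place_all([8, 4], [3, 8], best_fit))   # [1, 0]: both jobs fit
```

First fit drops the small job on the big node and fragments it; best fit tucks the small job into the small node and keeps the big chunk free.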

Yeah... stateless webapps you can lift 'n shift, but datastores? Ones that presume a POSIX FS? Oh boy.

Don't deploy anything (ex nginx) with the :latest tag! Makes rollbacks HILARIOUS.

Did you guys test it? Yeah we tested the latest version. Which version is that? Uh the latest one.
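One cheap guard against the "which version? uh, the latest one" conversation is to reject unpinned image references before they get deployed. A minimal sketch, assuming plain Docker-style image strings; `is_pinned` is a hypothetical helper name, and it relies on the fact that an image with no tag is treated as `:latest`.

```python
def is_pinned(image: str) -> bool:
    """True if the image reference has a digest or an explicit non-latest tag."""
    if "@" in image:
        return True  # digest reference, e.g. name@sha256:...
    # Only look at the last path segment so a registry port
    # (localhost:5000/nginx) is not mistaken for a tag.
    last = image.rsplit("/", 1)[-1]
    if ":" not in last:
        return False  # no tag means an implicit "latest"
    tag = last.rsplit(":", 1)[1]
    return tag not in ("", "latest")


for ref in ["nginx", "nginx:latest", "nginx:1.10.1", "localhost:5000/nginx"]:
    print(ref, is_pinned(ref))
```

A check like this could run in CI against your deploy manifests so the rollback target is always a name you can actually point at.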

If you make a custom scheduler how do you deploy it? Chicken and egg. There's no first class scheduler, they're all created equal, so we just create another kube deploy for that scheduler, but schedule it with the default scheduler. (I guess there's... one built in?)

If you can sustain 60% workload, across all your machines, 24/7... you're a genius. Come work at GOOG, we're hiring.

So in the examples he's showing, he made a Lowest Price scheduler. But you can make one in any way you want, and that's pretty powerful: you can map it to your org's desires/requirements.
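The core of a "lowest price" policy is tiny. The sketch below is my own illustration of the idea, not the workshop's code: the `Node` fields (`price_per_hour`, `free_cpu`) and `pick_lowest_price` are hypothetical, and a real scheduler would also have to watch for unscheduled pods and bind them through the Kubernetes API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    price_per_hour: float  # hypothetical cost attribute for the node
    free_cpu: float        # CPU cores still available


def pick_lowest_price(nodes: List[Node], cpu_request: float) -> Optional[Node]:
    """Filter to nodes that fit the request, then pick the cheapest one."""
    fitting = [n for n in nodes if n.free_cpu >= cpu_request]
    if not fitting:
        return None  # nothing fits; the pod stays pending
    return min(fitting, key=lambda n: n.price_per_hour)
```

Swap the `key` function for whatever your org cares about (latency, region, licensing) and you have a different scheduling policy with the same shape.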

Secure HTTP endpoint with TLS cert: The old way... If you have a CA PROVIDER, you definitely work in enterprise. How do you know when the cert expires? LOL YOUR USERS TELL YOU
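Finding out about expiry before your users do can be as simple as a periodic check. A small sketch, assuming you already have the certificate's `notAfter` string in the format Python's `ssl` module reports (for example from `SSLSocket.getpeercert()`); `days_until_expiry`, `needs_renewal`, and the 30-day threshold are my own hypothetical choices.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a cert's notAfter time.

    `not_after` uses the format from ssl certificates,
    e.g. "Jan 11 00:00:00 2018 GMT". `now` must be timezone-aware.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


def needs_renewal(not_after: str, now: datetime, threshold_days: float = 30) -> bool:
    """True if the cert expires within the (assumed) renewal threshold."""
    return days_until_expiry(not_after, now) < threshold_days


now = datetime(2018, 1, 1, tzinfo=timezone.utc)
print(days_until_expiry("Jan 11 00:00:00 2018 GMT", now))
```

Wire the boolean into whatever already pages you, and expiry becomes an alert instead of an outage.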

Kube stores certs as objects that you can mount, and they'll be available to your app. He gives examples, but basically this looks a lot like docker volume mounting, with a vault-like thingy behind them holding the secrets. And a better config format, imho.

Also, reload your damn certs within your app. Why should you have to restart the app just to reload your certs? (I think Kelsey's GOOG is showing here... also pushing through your TLS termination to your app can be a challenge of course.)
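Reloading in-process does not need much machinery when the cert is a mounted file: check whether the file changed and re-read it. A minimal sketch under my own assumptions; `ReloadingFile` is a hypothetical helper, it polls the file's modification time rather than using filesystem notifications, and `load` is whatever turns a path into something your app uses.

```python
import os
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class ReloadingFile(Generic[T]):
    """Re-run `load(path)` whenever the file's mtime changes."""

    def __init__(self, path: str, load: Callable[[str], T]) -> None:
        self._path = path
        self._load = load
        self._mtime: Optional[int] = None
        self._value: Optional[T] = None

    def get(self) -> T:
        mtime = os.stat(self._path).st_mtime_ns
        if mtime != self._mtime:
            # File was replaced or rewritten: reload without a restart.
            self._value = self._load(self._path)
            self._mtime = mtime
        return self._value  # type: ignore[return-value]
```

In a real server the `load` callback might build a fresh `ssl.SSLContext` and call its `load_cert_chain` method; here it is left generic so the reload logic stands on its own.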

Contracts between components of the system. So between the app and cert store, app and DNS, etc.

Workshop: Building atop Kubernetes: from fleet to Kubernetes

Jason Hansen @ DEIS

Once again, high-level, cherry-picked ideas.

DEIS is an open source PaaS that builds on Docker and CoreOS in a Heroku fashion. They're using influxdb as a source of truth for monitoring.

Platform utilizes Kubernetes, etc. for dev-oriented self-service. Startup just got funded.

Said while the live demo was being problematic: "If this starts working we'll know that Kelsey IS Kubernetes!"

Service discovery: they push it down via the Kube model to their service abstraction (IP address, DNS name). They used to just use raw ETCD, or IP+port combos.

Deploying: Used to require 2x capacity; rolling deploys fix that. #kubernetesFixesEverything
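Some quick arithmetic on why rolling deploys need less headroom than standing up a full second copy. This is a toy simulation of my own, not DEIS or Kubernetes code: `rolling_deploy_peak` and its `surge` parameter (how many extra instances may run at once) are hypothetical names.

```python
def rolling_deploy_peak(replicas: int, surge: int = 1) -> int:
    """Peak number of instances running during a rolling deploy.

    Each round starts up to `surge` new instances, then retires the
    same number of old ones, until every replica is on the new version.
    """
    if surge < 1:
        raise ValueError("surge must be at least 1")
    old, new, peak = replicas, 0, replicas
    while new < replicas:
        batch = min(surge, replicas - new)
        new += batch                 # start new instances first
        peak = max(peak, old + new)  # old and new overlap briefly
        old -= batch                 # then retire old instances
    return peak


print(rolling_deploy_peak(10, surge=1))   # 11
print(rolling_deploy_peak(10, surge=10))  # 20, same as a full second copy
```

With 10 replicas and a surge of 1, you peak at 11 instances instead of 20.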

Kube healthchecks: they have liveness and readiness. If liveness fails, restart the container. If readiness fails, don't route traffic there.
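The two checks answer different questions, which a small decision function makes explicit. This is just my restatement of the rule above in code; `reconcile` and the returned keys are hypothetical names, not part of any Kubernetes API.

```python
from typing import Dict


def reconcile(live: bool, ready: bool) -> Dict[str, bool]:
    """Map the two probe results to the two actions described above."""
    return {
        "restart": not live,              # failed liveness: restart it
        "route_traffic": live and ready,  # failed readiness: keep it out of rotation
    }


print(reconcile(live=True, ready=False))
```

A container that is alive but not ready (say, still warming a cache) is left running and simply gets no traffic until readiness passes.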

Workshop: Augmented Reality with JFrog Artifactory : Metadata to automate pipelines from Dev to Ops

Baruch Sadogursky @ JFrog

  • POC using Docker: 90% of audience
  • Docker in production: 5% of audience
  • Docker and nothing else: 0% of audience

... oh wow. This is actually surprisingly low given this audience.

Continuous Integrity: Basically, relatively few changes/builds/etc. make it into final production.

docker build is a powerful, useful hammer... but it makes everything look like a nail. Especially with latest... even saying FROM ubuntu:14.04 is not okay because of security, point releases, etc.

FROM ubuntu:1234567890deadbeef - very solid, but useless!

How to fix? Immutable, standard binaries. Promoted exactly once. When you promote, do so by specifying the digest.
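"Promote by digest, exactly once" can be sketched with plain dictionaries standing in for repositories. This is my own toy model, not Artifactory's API: `digest`, `promote`, and the `repos` layout are all hypothetical, and the digest is a SHA-256 of the artifact bytes.

```python
import hashlib
from typing import Dict


def digest(data: bytes) -> str:
    """Content digest of an artifact, in the familiar sha256:<hex> form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def promote(
    repos: Dict[str, Dict[str, bytes]],
    src: str,
    dst: str,
    name: str,
    expected_digest: str,
) -> None:
    """Copy the exact tested bytes from `src` to `dst`, never rebuilding.

    Refuses if the bytes do not match the digest that was tested, or if
    the artifact was already promoted (artifacts are immutable).
    """
    artifact = repos[src][name]
    if digest(artifact) != expected_digest:
        raise ValueError("digest mismatch: this is not the artifact that was tested")
    if name in repos[dst]:
        raise ValueError("already promoted: artifacts are promoted exactly once")
    repos[dst][name] = artifact
```

The point is that what lands in the next stage is byte-for-byte what QA saw, identified by content rather than by a tag that can move.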

JFrog makes this Xray thing that will tell you where a given dependency is being used in docker, etc.

Why do we do manual QA testing? Because the robots aren't good enough to provide the quality we expect. Also we totally don't trust them.

Build the Docker image only after the WAR process, not before. This doesn't really address the build environment, but it does ensure that you won't accidentally build an evil Docker image.

Overall this JFrog Artifactory thing is a pretty powerful out of the box deployment manager thingy. I don't know that anything about it is super novel, but it does make these hard(er) guarantees about what ends up in what build. Nice UI too.
