Notes from DevOpsDays

DevOpsDays Vancouver Notes

Intro on Friday

Demonware Capacity planning

  • Stories about lessons learned building matchmaking, leaderboards, and large low-latency systems for Activision/Blizzard

Empathy, by Matt Smillie @notmatt

Quick talks

  • A really good one on Mobify building up their Analytics BI tools using RDS.
  • "Agile design is not intelligent design": it got them a powerful service quickly, but no one in their right mind would ever design this. ;)
  • Apply Brendan Gregg's USE method (utilization, saturation, errors) when debugging: https://twitter.com/AlexJHammel/status/721033862453682176
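
As a rough illustration of the USE method, the sketch below checks utilization, saturation, and errors for each resource and reports whatever trips. The `ResourceReading` type, the sample numbers, and the 80% utilization threshold are all assumptions made for the example; they are not from the talk.

```python
from dataclasses import dataclass


@dataclass
class ResourceReading:
    """One snapshot of a resource; every field here is illustrative."""
    name: str
    utilization: float  # fraction of time busy, 0.0 to 1.0
    saturation: int     # queued or waiting work, e.g. run-queue length
    errors: int         # error events seen in the sampling window


def use_findings(readings, util_threshold=0.8):
    """Return one human-readable finding per USE check that trips."""
    findings = []
    for r in readings:
        if r.errors > 0:
            findings.append(f"{r.name}: {r.errors} errors")
        if r.saturation > 0:
            findings.append(f"{r.name}: saturated (queue depth {r.saturation})")
        if r.utilization >= util_threshold:
            findings.append(f"{r.name}: utilization {r.utilization:.0%}")
    return findings


# Hypothetical readings for three resources.
readings = [
    ResourceReading("cpu", utilization=0.95, saturation=4, errors=0),
    ResourceReading("disk", utilization=0.30, saturation=0, errors=2),
    ResourceReading("network", utilization=0.10, saturation=0, errors=0),
]
for line in use_findings(readings):
    print(line)
```

The point is the habit rather than the thresholds: walk every resource and ask the same three questions before guessing at a cause.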

Lunch

Devops teamwork culture

  • ideas for implementing positive change in your org

Terraform at Hootsuite

  • Terraform doesn't know about your existing infrastructure
  • The Ruby gem terraforming can read your existing infrastructure, like ansible-blueprint
  • Built "locking" via Jenkins so that pieces have a state. Trying to DIY Atlas to stay cheap.

Random Docker discussion

  • Popular Docker build service recommendations: Buildkite, CircleCI
  • tangent: Shopify's ejson for secrets

Day 2

Scaling out Continuous Delivery at Shopify

Cognitive science by @DaveMangot of Papertrail

Demonware CI pipeline presentation by @mtomwing and @bobcatwilson

PagerDuty's Ken Rose @klprose

  • "Failure Friday fosters a culture of handling failure" at PagerDuty
  • "Running a service through the Failure Friday gauntlet is a great addition to the release process for new services"
  • Overview:
    • improved resilience by testing in production
    • reliability: PagerDuty can't go down. They are an important backup, so they have to be very resilient.
    • canary deploy
    • bad deploys causing outages
    • postmortem noticed
    • they use GoCD by ThoughtWorks
    • do "end to end provider testing"

They use Twilio, Plivo, and Tropo, with weighted selection across the SMS providers.
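
Weighted selection across providers could look something like the sketch below. The weights are made up for illustration (the talk notes do not record the real split), and `pick_with_failover` is a hypothetical helper showing one way to skip a provider that is currently failing.

```python
import random

# Hypothetical traffic weights; the real split was not recorded in these notes.
PROVIDER_WEIGHTS = {"twilio": 0.6, "plivo": 0.3, "tropo": 0.1}


def pick_provider(weights, rng=random):
    """Pick one provider name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def pick_with_failover(weights, unhealthy, rng=random):
    """Same selection, but ignore providers currently marked unhealthy."""
    healthy = {n: w for n, w in weights.items() if n not in unhealthy}
    if not healthy:
        raise RuntimeError("no healthy SMS provider available")
    return pick_provider(healthy, rng)


print(pick_provider(PROVIDER_WEIGHTS))
print(pick_with_failover(PROVIDER_WEIGHTS, unhealthy={"twilio"}))
```

Keeping some traffic on every provider means each path is exercised continuously, which fits the "end to end provider testing" idea above.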

They test in production:

  • wrote 100 tests/watchdogs/health checks in production
  • short tests every 5
  • long tests every 30
  • examples found:
    • API compatibility broken
    • slow queue
    • LB breakage (HTTP)
    • transient failures
    • tests make load

Failure Friday:

  • Inject failures into the system
  • Gather the company
  • Kinda like a hackathon
  • Watch the Datadog dashboard
  • Stop the service
  • Restarting hosts
  • Network isolate with iptables
  • tc qdisc add on eth0 to make fake latency
  • #FF chat channel
  • Log commands run and what happened
  • Track TODOs, then fix them later
  • Post graphs into logs
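
The latency and isolation steps can be sketched as plain command builders. This is an assumption about how one might do it, not PagerDuty's actual tooling: it uses the standard `tc` netem qdisc and `iptables` DROP rules, the 200 ms delay and the `10.0.0.5` peer address are invented, and the commands are only printed, never executed, so they can be pasted into the chat channel log before anyone runs them as root.

```python
import shlex


def add_latency_cmd(interface, delay_ms):
    """Command that adds a fixed outbound delay on `interface` via netem."""
    return ["tc", "qdisc", "add", "dev", interface,
            "root", "netem", "delay", f"{delay_ms}ms"]


def remove_latency_cmd(interface):
    """Command that removes the root qdisc again, undoing the delay."""
    return ["tc", "qdisc", "del", "dev", interface, "root"]


def isolate_host_cmds(peer_ip):
    """Commands that drop all traffic from and to one peer using iptables."""
    return [
        ["iptables", "-A", "INPUT", "-s", peer_ip, "-j", "DROP"],
        ["iptables", "-A", "OUTPUT", "-d", peer_ip, "-j", "DROP"],
    ]


# Hypothetical drill: 200 ms of latency on eth0, then isolate one peer.
plan = [add_latency_cmd("eth0", 200)]
plan += isolate_host_cmds("10.0.0.5")
plan.append(remove_latency_cmd("eth0"))
for cmd in plan:
    print(shlex.join(cmd))  # log first, run by hand afterwards
```

Printing the exact commands up front also covers the "log commands run and what happened" item from the list.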

Lightning talks:

Watching infrastructure state using Terraform by ACL

  • terraform plan
  • run it continuously to detect state drift events
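
One way to run this on a schedule is to lean on `terraform plan -detailed-exitcode`, which exits 0 when there is nothing to change, 1 on error, and 2 when the plan contains changes. The sketch below is an assumption about how such a check might be wired up, not the speaker's implementation; `check_drift` and the status names are hypothetical. Note that exit code 2 also fires for configuration changes that have not been applied yet, so it signals "state and config disagree" rather than drift specifically.

```python
import subprocess


def classify_plan_exit_code(code):
    """Map terraform's -detailed-exitcode result to a status string."""
    if code == 0:
        return "in_sync"   # plan succeeded with no changes
    if code == 2:
        return "drift"     # plan succeeded and found changes
    return "error"         # 1 (or anything else) means the plan failed


def check_drift(workdir):
    """Run a read-only plan in `workdir` and return (status, plan output)."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode), result.stdout
```

A scheduled job could call `check_drift` for each Terraform directory and raise an alert whenever the status is not `in_sync`.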

Julien from Microsoft

  • Run load tests often

Slalom consulting

  • built a Lambda function from CloudFormation that creates CloudFormation stacks

Two pizza teams by krishworld.com

  • keep it small, keep things fast

Choosing a CI tool by @ecmkcallum from ThoughtWorks

  • They make GoCD
  • Wrote some books on it
  • Wanted to look at alternatives to their own tools
  • Evaluated a large batch and settled on Travis vs GoCD
  • History of people burned by past experience with CI like Jenkins
  • newness is hotness, but shiny ain't always a good thing

Lunch

Boeing on DevOps

  • dream: what the future of connected devices might look like
  • IOT, embracing devops culture, and deployment toolchains

Akamai CDN

  • single page app optimization
  • "Good point about frameworks being heavy b/c they have something for everyone." @mtomwing

Chat ops

  • talked with folks experimenting with deployment and monitoring in their chatrooms
  • Hootsuite, Shopify, Samsung