notes from paris container day 2018

@phrawzty, created June 27, 2018

Plenary

Cloud Native ML with Kubeflow (EN)

David Aronchick

Machine learning is, at its core, just statistics.

The statistics can be really, really complicated though.

  • non linear groups
  • multi-dimensional models

Machine learning is a way to solve problems without explicitly knowing how to create the solution. Machine learning is hard - even for Google.

DIY machine learning is super common

  • setup from scratch
  • migrating between environments is super hard

Hosted ML also exists

  • works immediately
  • but it becomes bespoke quickly
  • vendor lock-in :(

Value of k8s is in the extension model

  • no need to fork code

Need a "cloud native ML"

  • composability
  • portability
  • scalability

Composability

  • every business is different
  • the ability to swap components is key

Portability

  • "multi-cloud is the reality"
  • 81% (!) of enterprises are multi-cloud, with 5 (!) cloud platforms on average
  • Dev is an environment and it's usually totally different from stage/prod

Scalability

  • More resources
  • More humans
  • More problems

Kubeflow (pr. Kûb Flô)

  • Simplifies ML on top of k8s (re: 3 criteria above)

Demo!

  • Sentiment analysis
  • Demo was a video, which is good - but the video didn't work, which is bad.
    • The live demo gods are angry today!
  • Apparently supposed to be a terminal workflow of deploying Kubeflow
  • Kubeflow configs and sets up tensor2tensor (handy)
  • "tpu" makes it all faster?

Contact

  • "kubeflow" on all the things

Room 1

Kubernetes: Should you use it for your next project?

Anthony Seure (eng at Algolia)

~2 years using k8s in prod

(One of the two projectors broke. oh no! not his fault.)

"everything at google runs in a continer" - joe beda, 2014

introduction to k8s

  • software, config, tools
  • architecture
    • config
      • describe services
      • define resource (avail/min/max)

future?

  • dc/os
  • hashi nomad

Algolia (the service) runs on bare metal, highly optimised. 1200+ servers in 70+ DCs

All user-facing stuff (website, dashboard, blog, docs, status page, etc) is in VMs w/ buckets

Backend services (logs, analytics, usage pipeline, monitoring) on VMs & k8s

k8s at Algolia two years ago:

  • single-machine monolith
  • analytics served from ES

in the meantime, tried:

  • SaaS solutions
    • just couldn't deal with the volume OR way too expensive
  • google cloud dataflow
    • also way too expensive

now:

  • migrated(ing) to k8s
  • live in both google and aws at the same time. works but network fees are expensive.

infra (google):

  • k8s = GKE
  • nodes = GCE
  • Ingress = GLB
  • Docker reg = GCR

testing:

  • staging is done on-demand against real-world load (!)

logging, monitoring, alerting:

  • currently: stackdriver and wavefront
  • considering Datadog
    • shout out!

Should you use k8s?

  • That's a real question. Not everybody needs it. It's a lot of work.
  • Still need to pay attention to infra & classic ops concerns.
    • New and exciting things that will break!

Cloud vs on-prem?

  • If you're 100% k8s native, portability is possible
    • Watch out for vendor extensions
  • "If you can afford it, prefer IaaS providers"

Learning curve is steep

  • Can you afford to invest the time and resources? again: do you really need it?
  • share knowledge early and often
    • on-board people from other teams as early as possible

deployment

  • no miracle solutions. terraform, skaffold, gcloud deployment manager, etc
  • all tools are painful. pick one early and deal with it.

logging

  • centralise from day one
  • pick a good tool, you're gonna need it, esp. if you're on-prem

testing

  • load and end-to-end testing
  • blue/green

k8s scales, but does your app?

  • watch out for static deps, threading issues, etc

Buzzwords in the Cloud Native Era

@horgix (WeScale)

Buzzwords!

  • tracing
  • service mesh
  • serverless

micro-services:

  • monoliths are dead, micro-services are the way forward

orchestration

  • allocate resources to jobs
  • reschedule jobs in case of failure
  • bring API-centric infra

observability

  • monitoring is a sub-set

logs: recording events, easy to grep

metrics: data combined from measuring events; can identify trends and context

tracing: recording events w/ causal ordering; can identify causes across services

  • see also: APM
  • solutions: datadog, new relic, others
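
A minimal sketch of the causal-ordering idea, using the opentracing-go API (github.com/opentracing/opentracing-go); no tracer is configured here, so the default no-op global tracer is used, and wiring a real one up is elided:

```go
// Minimal sketch of causal ordering with opentracing-go.
// With no tracer configured, the no-op global tracer is used.
package main

import opentracing "github.com/opentracing/opentracing-go"

func main() {
	// Root span: one unit of work at the edge of the system.
	root := opentracing.StartSpan("http.request")
	defer root.Finish()

	// Child span: ChildOf records the causal link, which lets a tracing
	// backend reconstruct what caused what across services.
	child := opentracing.StartSpan("db.query",
		opentracing.ChildOf(root.Context()))
	child.Finish()
}
```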

service mesh: routing with intelligence

  • linkerd, conduit, istio, etc

serverless: Functions as a Service

  • 5 years ago: run this infra-as-code that installs my app
  • now: run my container
  • serverless: run my code

advantages of serverless

  • only pay for usage
  • pretty easy to deploy
  • "nano-services" ?
  • open faas, openwhisk, kubeless, and the cloud providers
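
To make "run my code" concrete: a sketch shaped like the classic OpenFaaS Go template, where the platform supplies main() and the HTTP plumbing and you only ship the handler (the Handle signature is recalled from the template and should be treated as an assumption):

```go
// OpenFaaS classic Go template shape (details assumed): the platform
// owns the process, HTTP, and scaling; you write only this function.
package function

import "fmt"

// Handle takes the raw request body and returns the response body.
func Handle(req []byte) string {
	return fmt.Sprintf("received %d bytes", len(req))
}
```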

The Cloud Native Way

Ihor Dvoretskyi @idvoretsky - Dev Adv CNCF

Rise of micro-services correlates with rise of containerisation

Docker is fundamentally attractive to application developers because Docker behaves like an app, not an operating system (hmmm)

Docker is great but it doesn't scale by itself

Kubernetes started at Google and has since been "donated" to the Linux Foundation

Kubernetes "graduated" from CNCF earlier in 2018

More than 20 "platinum" companies supporting CNCF; more than 50 companies contributing code

CNCF "trail map": L.cncf.io

Many local meetups; the one here in Paris is one of the largest / most vital.

Room 3

Containers from scratch

Liz Rice @lizrice

Dive into what `docker run <image>` actually does at a code level

  • live coding in Go

Within the app, if called with arg `run`, it must actually do something.

  • i.e. fork/exec a child process

In order to prepare the call for containerisation, we must request new namespaces, starting with UTS (the Unix Timesharing System namespace).

  • syscall.SysProcAttr
  • 🎉 containers!

Use /proc/self/exe to make the current process self-referential

  • This way the app can set environment (namespace) then call itself within that env
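
A rough Go sketch of that run/re-exec flow (the `run`/`child` argument names and the exact flag choices are my assumptions, not from the talk; errors are ignored for brevity):

```go
// Sketch of the re-exec pattern: "run" asks for new namespaces and
// re-invokes this same binary as "child" inside them. Linux only.
package main

import (
	"os"
	"os/exec"
	"syscall"
)

func main() {
	switch os.Args[1] {
	case "run":
		run()
	case "child":
		child()
	}
}

// run re-executes this binary (via /proc/self/exe) as "child",
// asking the kernel for new UTS, PID, and mount namespaces.
func run() {
	cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}
	cmd.Run()
}

// child is now inside the new namespaces; run the requested command.
func child() {
	cmd := exec.Command(os.Args[2], os.Args[3:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.Run()
}
```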

/proc/ has numbered subdirs for each running process

  • Within these dirs there's a tonne of FS-accessible information (in the form of files) about the running process
  • ps uses this to generate output

syscall.Chroot to, well, chroot.

  • Must chdir afterwards or sadness will prevail.

So `docker run` is effectively chrooting somewhere then forking self-referentially within that chroot.

  • Ok yeah it's more than that but that's the basics.

Mounting proc within the chroot allows access to /proc contextually.
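
Extending the `child` step from the sketch above with the chroot and proc mount just described (the rootfs path is made up; errors are ignored for brevity):

```go
// child, fleshed out: jail the process and give it a private /proc.
func child() {
	// New hostname inside the new UTS namespace.
	syscall.Sethostname([]byte("container"))

	// Chroot into the container's root filesystem, then chdir, or
	// sadness will prevail (the cwd would still point outside the jail).
	syscall.Chroot("/home/rootfs")
	syscall.Chdir("/")

	// Fresh proc mount so ps only sees this PID namespace.
	syscall.Mount("proc", "proc", "proc", 0, "")
	defer syscall.Unmount("proc", 0)

	cmd := exec.Command(os.Args[2], os.Args[3:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.Run()
}
```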

summary: Namespaces

  • what you can see
  • created with syscalls

Summary: CGroups

  • what you can use
  • filesystem interface

/sys/fs/cgroup/: cgroups!

  • /sys/fs/cgroup/docker/<id>: cgroup for that container

CGroups can be used to control the number of processes that can be run within the container

  • Useful to prevent fork-bombing.
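
A standalone sketch of that, assuming the cgroups v1 layout under /sys/fs/cgroup (the "pcd" group name is made up; errors ignored):

```go
// Cap how many processes this container can spawn via the pids cgroup
// (v1 filesystem interface): write a limit, then join the cgroup.
package main

import (
	"os"
	"path/filepath"
	"strconv"
)

func main() {
	cg := "/sys/fs/cgroup/pids/pcd"
	os.Mkdir(cg, 0755)

	// Allow at most 20 processes in this cgroup (fork-bomb insurance).
	os.WriteFile(filepath.Join(cg, "pids.max"), []byte("20"), 0700)

	// Add the current process (and hence its children) to the cgroup.
	os.WriteFile(filepath.Join(cg, "cgroup.procs"),
		[]byte(strconv.Itoa(os.Getpid())), 0700)
}
```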

L7 load balancing without a service mesh

Damien Lespiau (Weaveworks)

Opens with a Weaveworks biz & product pitch. blah.

Context: micro-services on k8s

  • traffic management is a problem that increases exponentially with scale

Why load balance at L7 instead of L4?

aside: what does the cli tool hey do?

L4 load-balancing is trickier in the world of HTTP pipelining. HTTP/2 request multiplexing (gRPC) makes this worse.

Goals of load balancing:

  • distribute the load fairly
  • affinity
  • locality
  • circuit-breaking

L4 vs L7

  • L4: connection-level
  • L4: affinity stops at IP/port
  • L7: request-level
    • potentially better load distribution
    • way more affinity criteria
    • "passive circuit breaking"

"Sidecar Proxy" is a thing

  • Damien had a hard time explaining it though

L7 proxies:

  • language agnostic
  • simple clients
  • proxy in the data path - this might also be a disadvantage

Client-side proxies:

  • need lib for each language
  • no extra hop in the data path though
  • full control over the desired behaviour

Look-aside load-balancing:

  • "I've never seen a product implement this."
  • gRPC

"reverse proxy with consistent hashing"

Hash the endpoints and plot them on a circle. Hash incoming requests onto the same circle. Each request goes to the next endpoint clockwise from where it lands.

  • Bounded loads to prevent saturation: upperBound = c * averageLoad, c > 1
    • If a given endpoint is saturated, pass to the next.
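
A toy Go sketch of consistent hashing with bounded loads as described above (no virtual nodes or locking; real implementations add both, and every name here is illustrative):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ring is a toy consistent-hash ring with bounded loads.
type ring struct {
	points    []uint32          // sorted positions on the circle
	endpoints map[uint32]string // position -> endpoint
	load      map[string]int    // in-flight requests per endpoint
	c         float64           // bound factor, c > 1
}

func hashOf(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func newRing(endpoints []string, c float64) *ring {
	r := &ring{endpoints: map[uint32]string{}, load: map[string]int{}, c: c}
	for _, e := range endpoints {
		p := hashOf(e)
		r.points = append(r.points, p)
		r.endpoints[p] = e
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// pick hashes the request key onto the circle and walks clockwise,
// skipping endpoints above upperBound = c * averageLoad.
func (r *ring) pick(key string) string {
	total := 0
	for _, l := range r.load {
		total += l
	}
	bound := r.c * (float64(total)/float64(len(r.endpoints)) + 1) // +1: never zero
	i := sort.Search(len(r.points), func(j int) bool { return r.points[j] >= hashOf(key) })
	for n := 0; n < len(r.points); n++ {
		e := r.endpoints[r.points[(i+n)%len(r.points)]]
		if float64(r.load[e]) < bound {
			r.load[e]++
			return e
		}
	}
	return "" // unreachable while c > 1: some endpoint is always under the bound
}

func main() {
	r := newRing([]string{"10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"}, 1.25)
	for _, k := range []string{"user-1", "user-2", "user-1"} {
		fmt.Println(k, "->", r.pick(k))
	}
}
```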

Circuit breaking

  • relies on k8s readinessProbe
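
Sketch of the app side of that: the readinessProbe polls an HTTP endpoint, and a failing check pulls the pod out of the Service's rotation, which is the passive circuit breaking above (path, port, and the health signal are all illustrative):

```go
// App side of a readinessProbe: k8s polls /ready; a non-200 answer
// removes the pod from the Service's endpoints until it recovers.
package main

import (
	"net/http"
	"sync/atomic"
)

var healthy atomic.Bool // flip to false when a downstream dependency fails

func main() {
	healthy.Store(true)

	http.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		if !healthy.Load() {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	http.ListenAndServe(":8080", nil)
}
```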