@prehnRA
Created June 17, 2018 14:54
My learning kubernetes journal

Learning Kubernetes

Questions / Resource Search

  • What are the good books?
  • Are there good online courses?
  • What about figuring out how things go wrong and how to fix them?
  • Anything for AppEng people?

Learn Docker

Learn Kubernetes

  • https://kubernetes.io/docs/tutorials/kubernetes-basics/#basics-modules
  • https://www.katacoda.com/courses/kubernetes/playground
  • https://labs.play-with-k8s.com/

Kubernetes Basics

  • What is Kubernetes?

    • "Kubernetes coordinates a highly available cluster of computers that are connected to work as a single unit."
    • "Allows you to deploy containerized applications to a cluster without tying them specifically to individual machines." - Ok, that's better.
    • "Kubernetes automates the distribution and scheduling of application containers across a cluster in a more efficient way.
  • What's a master?

    • A master manages the cluster (it coordinates activities)-- scheduling applications, maintaining state, scaling, rolling out updates.
    • QUESTION: What happens if the master dies? What're the signs? How can this be avoided?
      • It's bad.
      • The master is actually multiple parts, so it depends on what part dies.
        • If the API server dies, most kubectl commands won't work, and the kubelets can get confused (they use the API to talk to master).
        • If etcd dies, you could lose cluster configuration / data. BAD.
        • If the scheduler dies, then you probably can't get new pods scheduled on the cluster (deployments fail, apps go offline / get unhealthy as their pods fail and can't be replaced)
        • If the controller manager fails, ReplicationControllers stop working which means you probably won't get the right number of replicas of deployments. Also, services & pods probably won't get joined right, because the endpoint controller is in here.
        • If the cloud controller fails, you won't be able to interface with the cloud provider that you are using, so any operations relying on that will fail (if you use LoadBalancers provided by your cloud for instance-- you won't be able to create new ones)
      • The good news is you can configure a clustered master-- running master components across multiple nodes, monitoring the health of these nodes, and automatically replacing them as needed. https://kubernetes.io/docs/admin/high-availability/building/
  • What's a node?

    • A computer (VM or physical machine) that serves as a worker.
    • The "master" bosses these around.
    • Each one has a "kubelet" which talks to the master.
    • "The node should also have tools for handling container operations, such as Docker or rkt."
      • Docker is a container engine. So is rkt (seems to be CoreOS related and security-minded).
    • "A kubernetes cluster that handles production traffic should have a minimum of three nodes."
    • Nodes communicate with the master via the Kubernetes API.
      • TODO: Check out the kubernetes API and see if there is value in learning that directly too.
  • What's a process?

    • On Kubernetes, I think these are all containers that get "scheduled" onto nodes by the master.
  • What is minikube?

    • A way to run kubernetes directly on your machine with one node, in a VM.
  • If I have an app, and I want to push it to kubernetes: how?

    • Is this even a good question? Is this the right "layer" for that?
      • Kubernetes just cares about making deployments. You can make a deployment with the CLI or via the API. A deployment is basically docker image + config. So, to "push an app to kubernetes":
        • You bundle the app up as a docker image
        • You figure out what the necessary config is
        • You run the kubectl command to make a new deployment from that image and that config.
  • How does "distribution" work?

  • How does "scheduling" work?

  • QUESTION: What is CoreOS?

    • It's an enterprise-y kubernetes/ops thing. Red Hat owns it now.
    • A lot of it is open source, and the CoreOS team contributes to / maintains a lot of kubernetes related things.
      • Container Linux is an example-- a trimmed-down Linux designed to be a minimal base for running containers.
      • Also etcd, dex, a bunch of operators

Installing Hyperkit

Hyperkit is only for Mac. Other platforms can probably use more reasonable hypervisors for their VM.

  • Check it out from source? Run make?

  • TODO: How do we have hyperkit installed? Does this happen automatically with the new docker cask on Mac?

Install Kubectl

  • TODO: How do we have kubectl installed?

Install minikube

  • This is like installing your gentoo from source or whatever. What if this were like software or something?

  • Install hyperkit driver:

curl -LO https://storage.googleapis.com/minikube/releases/latest/docker-machine-driver-hyperkit \
&& chmod +x docker-machine-driver-hyperkit \
&& sudo mv docker-machine-driver-hyperkit /usr/local/bin/ \
&& sudo chown root:wheel /usr/local/bin/docker-machine-driver-hyperkit \
&& sudo chmod u+s /usr/local/bin/docker-machine-driver-hyperkit

Love2curlBinariesOntoMySystemWithSudo.png.exe

Install minikube itself (https://github.com/kubernetes/minikube/releases):

curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.27.0/minikube-darwin-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/

Starting it up:

minikube start --vm-driver hyperkit # otherwise it tries vbox

NOTE: "No space left on device" error-- minikube tries to claim 64 GB of disk, so you need a lot free. However, this can also happen when minikube gets into a weird state. You may need to nuke your ~/.minikube/ directory and start again.

Stopping Minikube

systemctl stop localkube # maybe
minikube stop # maybe
docker rm -f $(docker ps -aq --filter name=k8s)

Interactive Tutorial

Module 1: Verifying Our Setup

minikube version
minikube start
kubectl version
kubectl cluster-info
kubectl get nodes

Starts by checking that we have minikube, then starting minikube. Then we check that we have kubectl, and check what cluster kubectl is pointing at (should be pointed at a 192.168.x.x which is minikube). Further, if we get the nodes (kubectl get nodes) we should see just the one (the minikube master).

Module 2: Creating a Deployment

  • What is a "Deployment" in k8 vocabulary? What are the "pieces" of a deployment?

    • I think a deployment is basically an image + a configuration (particularly how many copies of it to run aka how many pods)
  • How do we make a deployment deploy?

    • kubectl run NAME --image=URL_TO_IMAGE --port=PORT_TO_RUN_ON
kubectl run kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080

Creates a deployment: kubernetes-bootcamp is the deployment name, --image points at a Docker image, and --port tells it which port the container listens on.
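
For reference, here's roughly the same thing written out declaratively-- a sketch of my own (not from the tutorial) of an apps/v1 Deployment you'd save to a file and kubectl apply -f:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubernetes-bootcamp
spec:
  replicas: 1
  selector:
    matchLabels:
      run: kubernetes-bootcamp      # kubectl run labels the pods with run=<name>
  template:
    metadata:
      labels:
        run: kubernetes-bootcamp
    spec:
      containers:
        - name: kubernetes-bootcamp
          image: gcr.io/google-samples/kubernetes-bootcamp:v1
          ports:
            - containerPort: 8080   # same as the --port flag above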

It says it is running, but I don't see it. Oh, I can't reach it on the network maybe? Try kubectl proxy. kubectl get nodes won't change because it is all running on a single node! (that's how minikube rolls!)

"Starting to server on 127.0.0.1:8001. Serve what? Looks like the k8 API?

  • Other things too supposedly, like each pod?
    • It's the k8 API. But there's also a proxy (part of the API?) that can be used to reach pods

Yeah. There are proxy routes for each pod! Example:

http://localhost:8001/api/v1/namespaces/default/pods/$POD_NAME/proxy/

your pod name will vary!

Module 3: Pods

  • Kubernetes automatically made a pod for us before.
  • A Pod is an abstraction for one or more app containers (like Docker) & some shared resources for them (disk store, a network, etc), and config (e.g. which ports).
  • Everything in a Pod is always scheduled / located together (never broken apart on different nodes) and has one IP.
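
To make the "shared resources" part concrete, here's a minimal sketch of a two-container Pod sharing an emptyDir volume (the names and images are mine, made up for illustration):

apiVersion: v1
kind: Pod
metadata:
  name: shared-example              # hypothetical name
spec:
  volumes:
    - name: shared-data
      emptyDir: {}                  # scratch storage shared by both containers
  containers:
    - name: web
      image: nginx:1.15             # illustrative image
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
    - name: content-writer
      image: busybox:1.28           # illustrative image
      command: ["sh", "-c", "echo hello > /data/index.html && sleep 3600"]
      volumeMounts:
        - name: shared-data
          mountPath: /data

Both containers always land on the same Node and share the Pod's single IP, so they can also talk to each other over localhost.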

Nodes

  • Pods run on Nodes. Nodes can have multiple Pods.

    • Master decides which Nodes to run Pods on based on "available resources"
      • QUESTION: How? What does this mean?
        • My impression: there's a bunch of complexity in the way this works by default. One of those "smart people doing lots of math" things.

          The kubernetes scheduler is a policy-rich, topology-aware, workload-specific function that significantly impacts availability, performance, and capacity. The scheduler needs to take into account individual and collective resource requirements, quality of service requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, deadlines, and so on. Workload-specific requirements will be exposed through the API as necessary.

        • There are two policies you can tweak if necessary: FitPredicate and PriorityFunction
          • FitPredicates are REQUIREMENTS: e.g. only run these pods on nodes matching these labels
          • PriorityFunctions determine which node, among the compatible ones (see FitPredicate), the scheduler should prefer (the pod can only run on one node, so which is the best?). The default tries to put the pod on the node with the lowest current resource utilization
        • As far as I can tell, these are changeable at the time the scheduler is compiled. Or at least that's when you have to build in the options. You can change config at runtime though (e.g. you can make the scheduler only schedule a pod on nodes with a certain label-- there's a minimal nodeSelector sketch after this list-- but you couldn't introduce a whole new type of predicate [making this up-- random assignment])
  • Nodes have a kubelet (talks to Master), and a runtime for containers (Docker or rkt)

  • Containers go together in a single pod if they are tightly coupled and need to share resources!
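
As mentioned in the scheduler notes above, a nodeSelector is the simplest "only run on nodes matching these labels" constraint. A minimal sketch (the disktype=ssd label is an invented example):

apiVersion: v1
kind: Pod
metadata:
  name: ssd-only-pod                # hypothetical name
spec:
  nodeSelector:
    disktype: ssd                   # scheduler will only place this pod on nodes labeled disktype=ssd
  containers:
    - name: app
      image: gcr.io/google-samples/kubernetes-bootcamp:v1

The label itself goes onto the node with kubectl label nodes NODE_NAME disktype=ssd.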

Troubleshooting Basics

  • kubectl get - list resources
  • kubectl describe - show detailed information about a resource
  • kubectl logs - print the logs from a container in a pod
  • kubectl exec - execute a command on a container in a pod

For example, your "deis run" function that you copied off of stackoverflow finds a particular container and pod that is running your app in deis, and does kubectl exec on it to execute the command.

  • kubectl get FOO where FOO is a resource type-- so far we've seen pods and nodes as resource types

kubectl describe pods gives a bunch of information about our pods: name, namespace, node, start time, labels, IP, controlled by (?), container(s) [with image, port, etc], conditions, volumes, recent events, "tolerations"

  • "Controlled by" indicates which ReplicationController (older) / ReplicaSet (new preferred) is in charge of making sure the right number of pods are running.

  • "The describe output is designed to be human readable, not to be scripted against."

    • QUESTION: What is to be scripted against? Well, I think you have options:
      • You can get YAML output for just about everything by adding -o yaml
      • You can do go templates for output too, like -o go-template='{{(index .spec.ports 0).nodePort}}'
      • There's an API
  • kubectl logs POD_NAME gives logs of what has happened recently

  • kubectl exec POD_NAME -- COMMAND runs COMMAND in POD_NAME

    • e.g. kubectl exec kubernetes-bootcamp-5c69669756-kg2mk -- ls -la
    • -- is important for complex commands, otherwise kubectl will try to handle all the flags, switches, arguments, etc itself
    • kubectl exec -it kubernetes-bootcamp-5c69669756-kg2mk -- bash gets an interactive shell! (-it seems to let it be interactive, otherwise it returns immediately)

Jazz Break

  • So, for "our" environment to be like heroku, we'd want
    • one namespace per app? that's how deis does it
    • automatic determination of namespace / pod names for the current app
    • porcelain around logs, exec, etc
    • exec to automatically run in a one-off pod? Or at least automatically ID a pod to run in for you.
    • automatic wrapping of an app in a container maybe? Or do we just get over it? (forcing people to use docker is kinda the whole problem)
    • IDEA: can we make a docker image that leverages bin/setup, bin/server to always do the right thing?
    • IDEA: Map heroku's functionality / workflow & translate to k8.

Module 4: Services

  • What are services?

  • Pods die. When a Node dies, pods on it are lost.

    • "ReplicationController might then dynamically drive the cluster back to desired state via creation of new Pods to keep your application running."
    • Each pod has its own IP (which changes when a pod is replaced), so the cluster does need to keep track of what is where.
  • "A Service in Kubernetes is an abstraction which defines a logical set of Pods and a policy by which to access them. Services enable a loose coupling between dependent Pods. A Service is defined using YAML (preferred) or JSON, like all Kubernetes objects. The set of Pods targeted by a Service is usually determined by a LabelSelector (see below for why you might want a Service without including selector in the spec)."

  • Ok, so what's a LabelSelector then?

  • Services are what allow your pods to be reached from outside the cluster (depending on the type)! There are different types of services (there's a hand-written Service example at the end of this list):

    • ClusterIP (default): exposes the service on an internal IP, reachable only from inside the cluster
    • NodePort: exposes the service on the same port of each node using NAT, so it's reachable from outside at NodeIP:NodePort
    • LoadBalancer: creates an external load balancer in the current cloud and gives an external IP to the service
    • ExternalName: exposes the service via a CNAME record (requires kube-dns 1.7+); externalName determines the name
  • Services let your app keep ticking if pods / nodes die [because you can load balance amongst all the others].

  • Labels and Selectors are things that Kubernetes uses to do operations on groups of things.

    • Kind of like how CSS selectors let you do styles on certain nodes?
      • Yeah, ish. A way of querying, or SELECTING certain resources based on labels.
    • Labels are key value pairs, and can be used for lots of stuff, e.g. app=Foo, labeling staging vs. prod, or different versions, or just a tag.
  • NOTE: minikube makes a default service called kubernetes

  • kubectl expose deployment/kubernetes-bootcamp --type="NodePort" --port 8080

  • kubectl get services

  • kubectl get services/kubernetes-bootcamp shows the nodePort (in the PORT(S) column)

  • kubectl get pods -l run=kubernetes-bootcamp lists all pods labeled run=kubernetes-bootcamp

  • kubectl get services -l run=kubernetes-bootcamp lists services from that label

  • kubectl delete service -l run=kubernetes-bootcamp deletes services by that label

  • QUESTION: Does minikube support LoadBalancer yet? How close is that?
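
The kubectl expose command above is just creating a Service object. Here's a hand-written sketch of roughly the same thing-- the ports match the tutorial, but the explicit nodePort is my own pick (leave it out and Kubernetes assigns one from 30000-32767):

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-bootcamp
spec:
  type: NodePort
  selector:
    run: kubernetes-bootcamp        # the LabelSelector: route traffic to pods with this label
  ports:
    - port: 8080                    # port the service listens on inside the cluster
      targetPort: 8080              # container port on the pods
      nodePort: 30080               # port opened on every node (my pick; normally auto-assigned)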

Module 5: Scaling

https://kubernetes.io/docs/tutorials/kubernetes-basics/scale/scale-intro/

"Scaling out a Deployment will ensure new Pods are created and scheduled to Nodes with available resources. "

  • Kubernetes supports autoscaling but it isn't covered in this section.

  • Services provide load balancing, so they can feed traffic to multiple pods.

    • They use endpoints to tell if a pod is up
      • Is this "health checks"?
        • Sort of. I think they are actually called "liveness and readiness probes"
          • https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
          • Liveness probes can be scripts that run (exit condition determines liveness), HTTP requests (HTTP status code determines liveness), or a TCP connection (whether it can connect on a given port determines liveness)
          • LIVENESS probes are used to determine if a pod should be killed and replaced
          • READINESS probes are used to determine if a pod can accept traffic (there's a YAML sketch with both probe types after this list)
            • Maybe a pod is overwhelmed, and you don't want to give it any more requests, but you don't want to kill it either (because it'll recover once it handles its workload)
  • You can do rolling updates without downtime

  • Under kubectl get deployments: "desired" replicas <- how many copies of the deployment's pod you asked for; "current" replicas <- how many are actually running right now

  • Scaling command example: kubectl scale deployments/kubernetes-bootcamp --replicas=4
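
As promised in the probes bullet above, a minimal sketch of liveness and readiness probes on a container. The path and timings here are placeholders I chose, not values from the tutorial:

apiVersion: v1
kind: Pod
metadata:
  name: probed-app                  # hypothetical name
spec:
  containers:
    - name: app
      image: gcr.io/google-samples/kubernetes-bootcamp:v1
      ports:
        - containerPort: 8080
      livenessProbe:                # failing this gets the container killed and restarted
        httpGet:
          path: /                   # assumed health endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10
      readinessProbe:               # failing this only stops traffic; the container keeps running
        httpGet:
          path: /
          port: 8080
        periodSeconds: 5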

Module 6: Updating

https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/

  • aka how to do a rolling update with deployments
  • Kubernetes will try to replace only some of your pods, leaving others running
    • By default, it does one at a time, but you can also do a higher number or a percentage of the pods (see the strategy sketch at the end of this list)
  • The load balancer will not send traffic to the ones that are being changed out
  • You update deployments by changing the image (or config, I suppose)
    • Example: kubectl set image deployments/kubernetes-bootcamp kubernetes-bootcamp=jocatalin/kubernetes-bootcamp:v2
  • Handy: kubectl rollout status deployments/kubernetes-bootcamp tells whether a rollout has succeeded or not.
  • ROLLBACK: kubectl rollout undo deployments/kubernetes-bootcamp
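
The "one at a time vs. a higher number or percentage" knob lives in the Deployment's strategy block. A sketch (illustrative values) of what you'd add under spec in the Deployment from Module 2:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1             # at most 1 pod below the desired count during the update (can also be a percentage like "25%")
      maxSurge: 1                   # at most 1 extra pod above the desired count while new ones spin up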

That's it for the "Kubernetes Basics" tutorial!!

Jazz Break: Questions Again

  • How is our etcd backed up? Are there other places cluster data is stored?
  • Are we running a single master? Or a clustered/distributed master?
  • Thought: we may have to learn some go at some point
  • IDEA: Evaluate tectonic (from CoreOS)-- what's the business model, what's the license, is it completely open source?

What to learn next?

  • Volumes! Persistent storage wasn't covered
  • What are operators?