CKA Preparation
Extra verbosity to see underlying APIs:
kubectl --v=99 get pods busybox
Namespaces:
kubectl get ns
kubectl create ns linux
kubectl get ns/linux -o yaml
kubectl delete ns/linux
Specify the Pod's namespace:
apiVersion: v1
kind: Pod
metadata:
  name: redis
  namespace: linux
The entire Kubernetes API uses the Swagger specification; it is evolving towards OpenAPI.
To access a container from the local machine (for debugging) we can use port-forward:
kubectl port-forward redis-8675df8689-bw8nn 6379:6379
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  name: custom-resource.example.com
description: "A custom resource"
versions:
- name: v1
Will result in creation of CustomResource kind in the example.com API group.
kubectl get thirdpartyresources
Now you can create a custom resource:
apiVersion: example.com/v1
kind: CustomResource
metadata:
  name: crazy
  labels:
    kubernetes: rocks
kubectl create -f crazy.yaml
Lastly, you need a controller to take action upon creation of custom resources, e.g. https://github.com/coreos/etcd-operator http://www.devoperandi.com/kubernetes-automation-with-stackstorm-and-thirdpartyresources/
CustomResourceDefinition is a more up-to-date version of the same concept, as TPR will be deprecated.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: databases.foo.bar
spec:
  group: foo.bar
  version: v1
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: DataBase
    shortNames:
    - db
kubectl create -f database.yaml
apiVersion: foo.bar/v1
kind: DataBase
metadata:
  name: my-new-db
spec:
  type: mysql
kubectl create -f db.yaml
kubectl get db
kubectl get databases
Helm - Package manager for K8S
Helm tries to simplify complex application deployment on Kubernetes. Created by Deis, part of Linux Foundation.
Run the Tiller server inside Kubernetes and the Helm client on your local machine.
A package is called a Chart.
1) Init Tiller
2) Find Chart
3) Install Chart
4) Create resources
https://github.com/kubernetes/charts
Uses Go templating syntax. Variables are defined in Values file.
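As a rough sketch of how this fits together (the chart layout, image name and replica count below are illustrative assumptions, not from a real chart):
# values.yaml
image: redis
replicas: 1

# templates/deployment.yaml - Go templating pulls from .Values and .Release
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-app
spec:
  replicas: {{ .Values.replicas }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
      - name: app
        image: {{ .Values.image }}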
brew install kubernetes-helm
helm init
kubectl get deployments -n=kube-system
Adding repositories to Helm:
helm repo add testing
helm repo list
helm search redis
helm install testing/redis-standalone
helm list
Kubernetes high-level overview
Written in Go.
Master (manager) - scheduler, controller manager (reconcile states), API server and stores state of the cluster (etcd).
Worker (node) - kubelet, service proxy.
Kubelet receives requests to run the containers and watches over node local containers.
Proxy creates and manages network rules to expose container on the network.
Components of functioning cluster:
API Server - REST interface to all of the Kubernetes resources
scheduler - places containers on a node according to metrics, rules and resource availability.
controller manager - runs control loops that bring the cluster's current state to the desired state.
kubelet - interacts with the underlying Docker engine on the node
proxy (aka kube-proxy) - in charge of network connectivity to containers
etcd cluster - uses a leader election algorithm to provide strong consistency of the stored state among the nodes.
Check etcd content with 'etcdctl ls /registry'
Kubernetes Networking Model: https://kubernetes.io/docs/concepts/cluster-administration/networking/
Slide deck on K8 networking: https://speakerdeck.com/thockin/illustrated-guide-to-kubernetes-networking
Networking:
Lowest compute unit is a Pod
1 IP per pod (similar to VM, physical host)
Containers in pod communicate over localhost
Pods have default 'pause' container to get an IP
Containers share network namespace (--net=container:<container name>)
Uses the Container Network Interface (CNI) since 1.6 (helps with intra-pod networking and a single IP per pod) (https://github.com/containernetworking/cni)
K8s network requirements:
All pods can communicate with each other across nodes
All nodes can communicate with all pods
No NAT
All IPs are routable without a NAT.
Achievable at the physical level (e.g. GKE) or via an overlay: flannel, Weave Net, Calico, Romana
HA Master (https://github.com/kelseyhightower/kubernetes-the-hard-way)
Simple way to mimic Pod networking where multiple-containers share the same IP:
docker run -d --name=container-1 busybox sleep 6000
docker run -d --name=container-2 --net=container:container-1 busybox sleep 6000 (container-2 uses container-1's network)
docker exec -it container-1 ifconfig
docker exec -it container-2 ifconfig
Roll your own K8S master using Docker
docker run -d --name=k8s -p 8080:8080 gcr.io/google_containers/etcd:3.1.10 etcd --data-dir /var/lib/data
docker run -d --net=container:k8s gcr.io/google_containers/hyperkube:v1.7.6 /apiserver --etcd-servers=http://127.0.0.1:2379 --service-cluster-ip-range=10.0.0.1/24 --insecure-bind-address=0.0.0.0 --insecure-port=8080 --admission-control=AlwaysAdmit
docker run -d --name=controller-manager --net=container:k8s gcr.io/google_containers/hyperkube:v1.7.6 /controller-manager --master=127.0.0.1:8080
Test your own K8S sandbox:
docker exec -it k8s /bin/sh
export ETCDCTL_API=3
etcdctl get "/registry/api" --prefix=true
curl 127.0.0.1:8080/api/v1
Ingress - collection of rules that allow inbound connections to reach the cluster services.
Ingress Controller is a proxy (nginx, HAproxy) that gets reconfigured, based on the rules you create via Kubernetes API.
https://github.com/kubernetes/ingress-nginx
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ghost
spec:
  rules:
  - host: ghost.<IP>.nip.io
    http:
      paths:
      - backend:
          serviceName: ghost
          servicePort: 2368
kubectl get ingress
kubectl delete ingress <ingress_name>
kubectl edit ingress <ingress_name>
Enable ingress addon on minikube: minikube addons enable ingress
kubectl run ghost --image=ghost
kubectl expose deployments ghost --port=2368
Switch kubectl context:
kubectl config use-context foobar
Demo cluster in GKE:
gcloud container clusters create democontainer
gcloud container clusters list
kubectl get nodes
gcloud container clusters delete democontainer
Demo cluster on your machine:
minikube start
kubectl get nodes
minikube ssh
docker ps
ps -aux | grep localkube
tmux tip:
:setw synchronize-panes
Installing using Kubeadm:
https://kubernetes.io/docs/setup/independent/install-kubeadm/
https://kubernetes.io/docs/tasks/federation/federation-service-discovery/
https://github.com/kelseyhightower/kubernetes-cluster-federation
Manually manage multiple-clusters:
kubectl config use-context london
kubectl get nodes
kubectl config use-context dublin
kubectl get nodes
per $HOME/.kube/config
https://kubernetes.io/docs/tasks/federation/set-up-cluster-federation-kubefed/
kubectl --context=federated-cluster get clusters
Cluster object:
apiVersion: federation/v1beta1
kind: Cluster
metadata:
name: chicago
spec:
serverAddressByClientCIDRs:
- clientCIDR: "0.0.0.0/0"
serverAddress: "${CHICAGO_SERVER_ADDRESS}"
secretRef:
name: chicago
Extract config & secrets:
kubectl config view minikube --minify --flatten
You can adjust pod distribution across clusters using weights in the federation.kubernetes.io/replica-set-preferences annotation.
Summary of federation creation:
1) Pick the cluster where you'll run the federation components (API server and control plane) and create a namespace where you'll run them.
2) Create a federation API server service that you can reach, create a secret containing the credentials for the account you will use on the federation, and launch the API server as a deployment.
3) Create a local context for the Federation API server, so that you can use kubectl to target it. Generate a kubeconfig file for it, store it as a secret, and launch a control plane. This will be used to authenticate with API server using kubeconfig secret.
4) Create a secret for each cluster's kubeconfig. With cluster resource manifest at hand, use kubectl to create them on federation context.
Evolved from Borg - Google's internal orchestration system. Borg brings 15 years of hyper-scale workload management experience.
Whitepaper: https://research.google.com/pubs/pub43438.html
In 2007 Google contributed *cgroups* to the Linux kernel. It limits resources used by a collection of processes.
Kubernetes infographic: https://apprenda.com/why-kubernetes/
K8 case studies: https://kubernetes.io/case-studies/
Source: https://github.com/kubernetes/kubernetes
Not core, but useful features: https://github.com/kubernetes-incubator
Whitepapers: https://research.google.com/pubs/pub44843.html
Youtube: https://www.youtube.com/watch?v=VQAAkO5B5Hg Cluster Management at Google
Useful websites: www.cncf.io
https://stackoverflow.com/search?q=kubernetes
Heapster https://kubernetes.io/docs/tasks/debug-application-cluster/resource-usage-monitoring/
Prometheus https://prometheus.io/
Enable heapster on minikube:
minikube addons enable heapster
Logging:
https://kubernetes.io/docs/concepts/cluster-administration/logging/
Troubleshooting:
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application/
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-pod-replication-controller/
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/
Best troubleshooting commands:
kubectl describe
kubectl logs
kubectl exec
Run a troubleshooting pod:
kubectl run busybox --image=busybox --command -- sleep 3600
kubectl exec -it <busybox_pod> -- /bin/sh
To prevent node from scheduling:
kubectl cordon <node_name>
To evict all pods:
kubectl drain <node_name>
StatefulSets https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
Similar to Deployments, but used for stateful applications where the identity of the pods needs to be maintained between pod restarts.
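A minimal StatefulSet sketch (the nginx image, label and headless service name here are assumptions for illustration):
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx   # assumes a headless Service named nginx exists
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80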
Horizontal Pod Autoscalers (HPA) https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
HPAs automatically scale RCs or Deployments based on CPU usage. Requires the Heapster addon to be running properly (it collects the metrics).
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
kubectl get hpa
kubectl autoscale --help
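For example, assuming an existing nginx Deployment whose containers define CPU requests, something like:
kubectl autoscale deployment nginx --min=2 --max=10 --cpu-percent=80
kubectl get hpa nginx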
Jobs. https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
Part of the Batch API. Runs a set number of pods to completion; failed pods are restarted. A Job is made of tasks that run within pods.
Jobs have parallelism and completions keys; both default to 1. https://kubernetes.io/docs/tasks/job/coarse-parallel-processing-work-queue/
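A minimal Job sketch (the pi-computation container is the usual illustrative example, not from these notes):
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  completions: 1
  parallelism: 1
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never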
Cron Jobs https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/ - alpha in 1.7
Daemon Sets - extensions group. https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/ Ensures that all nodes, or a set of nodes, run a copy of a pod.
Aimed at running daemon-like services on each node.
kubectl get daemonsets
kubectl get ds
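A minimal DaemonSet sketch (the fluentd log-collector image is an assumption for illustration):
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd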
RBAC https://kubernetes.io/docs/admin/authorization/
ClusterRole
Role
ClusterRoleBinding
RoleBinding
https://kubernetes.io/docs/admin/accessing-the-api/
Authentication (Check against configured Authentication modules)
Authorization (ABAC, RBAC) (Check existing authorization policies) Authorized when the user has permissions to perform the requested action.
Admission Control - check contents of an actual object being created and validate them before admitting request.
All of the above is encrypted using TLS and needs proper SSL certificate configuration; kubeadm does it for you.
Authentication in K8S:
Done via certificates, tokens and basic auth
Users are not created by API, should be managed by an external system
System accounts used by processes to access API
Type of authentication is defined in the kube-apiserver
--basic-auth-file
--oidc-issuer-url
--authorization-webhook-config-file
Authorization modes:
ABAC - Attribute based access control
RBAC - Role based access control
WebHook
Can be configured as kube-apiserver options:
--authorization-mode=ABAC
--authorization-mode=RBAC
--authorization-mode=Webhook
--authorization-mode=AlwaysAllow
--authorization-mode=AlwaysDeny
--authorization-policy-file=my_policy.json
{
  "apiVersion": "abac.authorization.kubernetes.io/v1beta1",
  "kind": "Policy",
  "spec": {
    "user": "bob",
    "namespace": "foobar",
    "resource": "pods",
    "readonly": true
  }
}
More examples: https://kubernetes.io/docs/admin/authorization/abac/#examples
RBAC policies can be defined over kubernetes API.
Role
ClusterRole
Binding
ClusterRoleBinding
Examples: https://kubernetes.io/docs/admin/authorization/rbac/#api-overview
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
Kubectl has a built-in manifest generator for RBAC:
kubectl create role pod-reader --verb=get --verb=watch --verb=list --resource=pods
kubectl create rolebinding foo --role=pod-reader --user=minikube
kubectl get rolebinding foo -o yaml
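The output should look roughly like this sketch (exact fields vary by version):
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: foo
  namespace: default
subjects:
- kind: User
  name: minikube
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io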
Admission controllers - it is recommended to configure kube-apiserver with this minimum set of controllers:
--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota
More info on admission controllers https://kubernetes.io/docs/admin/admission-controllers/#resourcequota
SecurityContext
Per-pod / per-container additional restrictions such as process UID, group and file permissions.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  securityContext:
    runAsNonRoot: true
  containers:
  - image: nginx
    name: nginx
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
PodSecurity Policies https://kubernetes.io/docs/concepts/policy/pod-security-policy/
Applies pod policies to every pod in the cluster.
PSP + RBAC example: https://github.com/kubernetes/examples/blob/master/staging/podsecuritypolicy/rbac/README.md
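A minimal PodSecurityPolicy sketch (the specific rules chosen here are assumptions for illustration):
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - configMap
  - secret
  - emptyDir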
Network isolation for pods can be added using annotations:
kubectl annotate ns <namespace> "net.beta.kubernetes.io/network-policy={\"ingress\": {\"isolation\": \"DefaultDeny\"}}"
This denies ingress to all pods in the namespace. To define network rules, follow:
https://kubernetes.io/docs/concepts/services-networking/network-policies/
Only possible with Calico, Weave Net, Romana. Practice with https://kubernetes.io/docs/tasks/administer-cluster/declare-network-policy/
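A minimal NetworkPolicy sketch that only allows pods labelled access: "true" to reach pods labelled app: nginx (both labels are assumptions for illustration):
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: access-nginx
spec:
  podSelector:
    matchLabels:
      app: nginx
  ingress:
  - from:
    - podSelector:
        matchLabels:
          access: "true"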
Replication Controllers (RC) are part of the api/v1 API and are considered stable. A new class of RCs has been introduced - Replica Sets.
You should use Replica Sets instead of RC wherever possible.
Hierarchy:
Replication Controllers
Pods
Container
RCs provide a declarative definition of what a pod should be and how many replicas you want running at any time.
An RC makes sure that a homogeneous set of pods is always up and available.
Documentation: https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/
kubectl get rc redis -o yaml
kubectl scale rc redis --replicas=5
kubectl get pods --watch
apiVersion: v1
kind: ReplicationController
metadata:
  name: redis
spec:
  replicas: 2
  selector:
    app: redis
  template:
    metadata:
      name: redis
      labels:
        app: redis
    spec:
      containers:
      - image: redis
        name: redis
Deployments. Resource for managing rolling updates. extensions/v1beta1 API group.
Allow server-side updates to pods at a specified rate. Used for canary deployments, etc.
Deployments generate RS.
kubectl run nginx --image=nginx
kubectl get deployments,pods
Deployments should be favoured over RC.
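A minimal Deployment manifest sketch, roughly what kubectl run nginx --image=nginx generates (the replica count is an assumption):
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - name: nginx
        image: nginx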
Labels
Select pods by labels
kubectl get pod -l run=nginx
kubectl get pods -Lrun (lists pods with a column for the run label, regardless of its value)
Assign labels on the fly
kubectl label pods nginx-7c87f569d-rfxts foo=bar
kubectl label pods bla foo- (removes label foo)
kubectl get pods --show-labels
e.g. helps with scheduling logic
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  nodeSelector:
    disktype: ssd
A key feature of Deployments is support for multiple ReplicaSets during a rolling/canary deployment, with a unique configuration per RS.
Scaling deployments:
kubectl scale deployment/nginx --replicas=5
kubectl get deployments
Update deployment to a specific image version:
kubectl set image deployment/nginx nginx=nginx:1.10 --all
kubectl get rs --watch
Trigger rolling update:
kubectl edit deployment/nginx
kubectl get rs --watch
With all RS of a deployment kept, you can roll back to a previous revision by scaling up and down replica sets.
Roll backs:
kubectl run ghost --image=ghost --record
kubectl get deployments ghost -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    kubernetes.io/change-cause: kubectl run ghost --image=ghost --record=true
(without --record we would not know why a new revision exists)
Now break update:
kubectl set image deployment/ghost ghost=ghost:dmitri --all
kubectl rollout history deployment/ghost
kubectl get pods
kubectl rollout undo deployment/ghost
kubectl get pods
Roll back to specific version:
kubectl rollout undo deployment/ghost --to-revision=2
Pause a deployment:
kubectl rollout pause deployment/ghost
kubectl rollout resume deployment/ghost
Without a running scheduler your pods will remain in Pending state.
kube-scheduler parameters:
--scheduler-name
--policy-config-file
--leader-elect
To find a node to run a pod, the scheduler goes through a list of filters to find available nodes and ranks them. The highest-ranking node is selected to run the pod.
Filtering is done via a set of policies called predicates.
Ranking is done via a set of priority functions.
Together this forms a scheduling algorithm (https://github.com/kubernetes/community/blob/master/contributors/devel/scheduler_algorithm.md)
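A hedged sketch of a --policy-config-file (the predicates and priorities listed are an assumed selection for illustration):
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsHostPorts"},
    {"name": "PodFitsResources"},
    {"name": "MatchNodeSelector"}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1},
    {"name": "BalancedResourceAllocation", "weight": 1}
  ]
}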
You can affect pod scheduling using pod specification: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeName - target node
nodeSelector - target a set of nodes, e.g. kubectl label node minikube foo=bar
affinity -
schedulerName
tolerations
Taints:
A node with a particular taint will repel pods that don't tolerate that taint.
kubectl taint node master foo=bar:NoSchedule
tolerations:
- key: "foo"
  operator: "Equal"
  value: "bar"
  effect: "NoSchedule"
Affinity: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
node affinity - more advanced ways to express node preferences for a pod.
Rules can be soft or hard, expressing a preference or a requirement.
e.g.
apiVersion: v1
kind: Pod
metadata:
  name: ghost
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: In
            values:
            - bar
  containers:
  - name: ghost
    image: ghost:0.9
podAffinity - advanced ways to express placement of pods in relation to other pods.
podAntiAffinity
apiVersion: v1
kind: Pod
metadata:
  name: ghost
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - frontend
        topologyKey: failure-domain.beta.kubernetes.io/zone
  containers:
  - name: ghost
    image: ghost:0.9
Pod can select scheduler:
apiVersion: v1
kind: Pod
metadata:
  name: ghost
spec:
  schedulerName: foobar
  containers:
  - name: ghost
    image: ghost:0.9
kubectl expose deployment/nginx --port=80 --type=NodePort
kubectl get svc
kubectl get endpoints
kubectl get svc nginx -o yaml
apiVersion: v1
kind: Service
...
spec:
  clusterIP: 10.0.0.112
  ports:
  - nodePort: 31230
curl $(minikube ip):31230
DNS (https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/)
kubectl get rs --all-namespaces | grep dns
Check DNS is running from within a pod:
kubectl exec -it busybox -- nslookup nginx
Server: 10.0.0.10
Address 1: 10.0.0.10
Name: nginx
Address 1: 10.0.0.112
Services (https://kubernetes.io/docs/concepts/services-networking/service/)
Services are abstractions that direct traffic to a set of pods providing a micro-service. Services are key to linking applications together.
Implemented via iptables. Kube-proxy watches the Kubernetes API for new services and endpoints being created. It opens random ports on nodes to listen to the traffic to the ClusterIP:Port, and redirects to random service endpoints.
This used to be a round-robin in the userspace implementation.
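A minimal Service manifest sketch selecting pods by label (the run: nginx label matches what kubectl run/expose creates; the port values are assumptions):
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort
  selector:
    run: nginx
  ports:
  - port: 80
    targetPort: 80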
Service types:
ClusterIP - only internal access. Range is defined via an API server startup option.
NodePort - not recommended for public access. Great for debugging. Needs a hole in the firewall to work. The NodePort range is defined in the cluster configuration.
LoadBalancer - currently only implemented on public cloud providers like GKE and AWS.
You can also run kubectl proxy locally to access a ClusterIP service. This option is great for development.
curl http://localhost:8001/api/v1/namespaces/default/services/ghost:<port_name>
Simplest volume example:
EmptyDir - a volume shared by containers within a pod; it survives container crashes, but not pod deletion.
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - image: busybox
    name: busybox
    command:
      - sleep
      - "3600"
    volumeMounts:
    - mountPath: /scratch
      name: scratch
  - image: busybox
    name: lessbusybox
    command:
      - sleep
      - "3600"
    volumeMounts:
    - mountPath: /scratch
      name: scratch
  volumes:
  - name: scratch
    emptyDir: {}
kubectl exec -it busybox -c busybox -- touch /scratch/bla
kubectl exec -it busybox -c lessbusybox -- ls -l /scratch
Persistent Volume (PV) & Persistent Volume Claim (PVC)
Abstract away underlying storage, provide persistent storage to pods.
kubectl get pv,pvc
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv0001
  labels:
    type: local
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/somepath/data1"
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
spec:
  containers:
  ....
  volumes:
  - name: test-volume
    persistentVolumeClaim:
      claimName: myclaim
Dynamic storage provisioning using StorageClass
AWS Example: https://github.com/kubernetes/examples/blob/master/staging/persistent-volume-provisioning/README.md
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
Secrets:
kubectl get secrets
kubectl create secret generic --help
kubectl create secret generic mysql --from-literal=password=root
By default, secrets are only base64 encoded (https://kubernetes.io/docs/concepts/configuration/secret/#security-properties)
apiVersion: v1
data:
  password: cm9vdA==
kind: Secret
metadata:
  name: mysql
type: Opaque
spec:
  containers:
  - image: mysql:5.5
    name: mysql
    env:
    - name: MYSQL_ROOT_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mysql
          key: password
Secrets are stored in tmpfs on the node and only available to nodes running your pod.
You can also mount secrets as volumes/files.
...
spec:
  containers:
  - image: busybox
    name: busy
    command:
      - sleep
      - "3600"
    volumeMounts:
    - mountPath: /mysqlpassword
      name: mysql
  volumes:
  - name: mysql
    secret:
      secretName: mysql
kubectl exec -it busybox -- cat /mysqlpassword/password
ConfigMaps
Used to pass configuration data to a pod. Can be used along with secrets, environment variables and command line arguments.
kubectl create configmap foobar --from-file=config.js
kubectl get configmap
kubectl get configmap foobar -o yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: foobar
data:
  config.js: |
    {
    ....
Using ConfigMap as an environment variable:
env:
- name: SPECIAL_LEVEL_KEY
  valueFrom:
    configMapKeyRef:
      name: special-config
      key: special.how
Using ConfigMap as a volume:
volumes:
- name: config-volume
  configMap:
    name: special-config