ddelange/kubectl_plug_play.md

## kubectl_plug_play.md

      
Kubectl Plug & Play Extensions


interactive shell -- kubebash <namespace> <deployment>
live colored logs streaming -- kubelogs <namespace> <deployment>
deployments dockertag table -- kubebranch <namespace> [<partial deployment name>]

A customized subset of kubectl commands. See also: official kubectl cheatsheet
tl;dr

I just want plug&play bash functions

Set up kubectl and add these functions to .bashrc:
kubelogs() {
  # View logs as they come in (like in Rancher) using mktemp and less -r +F.
  # Use ctrl+c to detach from stream (enter scrolling mode)
  # Use shift+f to attach to bottom of stream
  # Use ? to perform a backward search (regex possible)
  # Use N or n to jump to the next or previous search match, respectively
  # Set KUBELOGS_MAX to change amount of previous lines to fetch before streaming
  # Set $KUBECONFIG to deviate from "$HOME/.kube/config"
  if [ $# -ne 2 ]; then
      echo "Usage: kubelogs <your-namespace> <podname-prefix>"
      return
  fi
  local namespace=$1
  local pod=$2
  local podname=`kubectl get pods --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | grep Running | grep -o -m 1 "^${pod}[a-zA-Z0-9\-]*\b"`
  if [[ ${podname} != ${pod}* ]]; then
      echo "Pod \"${pod}\" not found in namespace \"${namespace}\""
      return
  fi
  local tmpfile=`mktemp`
  local log_tail_lines=${KUBELOGS_MAX:-10000}
  local sleep_amount=$((7 + log_tail_lines / 20000))
  echo "kubectl logs --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} --since 24h --tail ${log_tail_lines} -f ${podname} > ${tmpfile}"
  kubectl logs --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} --since 24h --tail ${log_tail_lines} -f ${podname} > ${tmpfile} &
  local k8s_log_pid=$!
  echo "Waiting ${sleep_amount}s for logs to download"
  sleep ${sleep_amount}
  less -rf +F "${tmpfile}"
  # stop the background stream and remove the temp file, even if less exits non-zero
  kill ${k8s_log_pid} && echo "kubectl logs pid ${k8s_log_pid} killed"
  rm -f "${tmpfile}"
}

kubebash() {
  # Execute a bash shell in a pod
  # Set $KUBECONFIG to deviate from "$HOME/.kube/config"
  if [ $# -ne 2 ]; then
      echo "Usage: kubebash <your-namespace> <podname-prefix>"
      return
  fi
  local namespace=$1
  local pod=$2
  local podname=`kubectl get pods --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | grep Running | grep -o -m 1 "^${pod}[a-zA-Z0-9\-]*\b"`
  if [[ ${podname} != ${pod}* ]]; then
      echo "Pod \"${pod}\" not found in namespace \"${namespace}\""
      return
  fi
  kubectl exec -ti --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} ${podname} -- bash
}

kubebranch() {
  # View a list of current branch[es] deployed for namespace [+ pod]
  # Set $KUBECONFIG to deviate from "$HOME/.kube/config"
  if [ $# -gt 2 ]; then
    echo "Usage: kubebranch <your-namespace> [<partial-podname>]"
    return
  fi
  if [ $# -lt 1 ]; then
    echo "Usage: kubebranch <your-namespace> [<partial-podname>]"
    return
  fi
  local namespace=$1
  if [ $# -eq 2 ]; then
    local pod=$2
    local podname=`kubectl get pods --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | grep Running | grep -o -m 1 "^${pod}[a-zA-Z0-9\-]*\b"`
    if [[ ${podname} != ${pod}* ]]; then
        echo "Pod \"${pod}\" not found in namespace \"${namespace}\""
        return
    fi
    kubectl get deployments --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | sed -n '1!p' | awk '{print $1 "\t" $8}' | uniq | tr ":" "\t" | column  -t | grep ${pod}
  else
    kubectl get deployments --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | sed -n '1!p' | awk '{print $1 "\t" $8}' | uniq | tr ":" "\t" | column  -t
  fi
}


Set up kubectl

To use kubectl from your machine, write the cluster-specific kubeconfig file to ~/.kube/config:
mkdir -p ~/.kube
nano ~/.kube/config
# paste kubeconfig (see image) and write out


General commands

Some of these commands fail if your account lacks permissions for certain namespaces.
# list full podnames in a namespace
kubectl get pods --namespace <your-namespace> -o wide
# list all namespaces in the cluster (will error if there are missing permissions for some namespace)
kubectl get namespaces
# list all pods in all namespaces in the cluster (will error if there are missing permissions for some namespace)
kubectl get pods --all-namespaces -o wide
# execute e.g. `ls -l /code`
kubectl exec -ti --namespace <your-namespace> <full-podname> -- ls -l /code
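
Since some of the commands above fail without sufficient permissions, it can help to check access to a namespace first. The following is a minimal sketch that relies on the exit code of `kubectl auth can-i`; the function name can_list_pods is hypothetical and not part of the functions in this document.

```shell
# Hypothetical helper: report whether the current user may list pods
# in the given namespace, based on the exit code of kubectl auth can-i.
can_list_pods() {
  local namespace=$1
  if kubectl auth can-i list pods --namespace "${namespace}" >/dev/null 2>&1; then
    echo "${namespace}: allowed"
  else
    echo "${namespace}: denied"
  fi
}

# Usage (replace the argument with your own namespace):
#   can_list_pods my-namespace
```

Note that the sketch also prints "denied" when kubectl itself is missing or the cluster is unreachable, because any non-zero exit code takes the else branch.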


View logs as they come in (like in Rancher)

To view the last 25 lines in the past 24h, plus all new messages, use:
kubectl logs --namespace <your-namespace> --since 24h --tail 25 -f <full-podname> | less +F
To automate finding full podname and viewing logs, add the following function to .bashrc and execute kubelogs <your-namespace> <podname-prefix>
kubelogs() {
  # View logs as they come in (like in Rancher) using mktemp and less -r +F.
  # Use ctrl+c to detach from stream (enter scrolling mode)
  # Use shift+f to attach to bottom of stream
  # Use ? to perform a backward search (regex possible)
  # Use N or n to jump to the next or previous search match, respectively
  # Set KUBELOGS_MAX to change amount of previous lines to fetch before streaming
  # Set $KUBECONFIG to deviate from "$HOME/.kube/config"
  if [ $# -ne 2 ]; then
      echo "Usage: kubelogs <your-namespace> <podname-prefix>"
      return
  fi
  local namespace=$1
  local pod=$2
  local podname=`kubectl get pods --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | grep Running | grep -o -m 1 "^${pod}[a-zA-Z0-9\-]*\b"`
  if [[ ${podname} != ${pod}* ]]; then
      echo "Pod \"${pod}\" not found in namespace \"${namespace}\""
      return
  fi
  local tmpfile=`mktemp`
  local log_tail_lines=${KUBELOGS_MAX:-10000}
  local sleep_amount=$((7 + log_tail_lines / 20000))
  echo "kubectl logs --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} --since 24h --tail ${log_tail_lines} -f ${podname} > ${tmpfile}"
  kubectl logs --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} --since 24h --tail ${log_tail_lines} -f ${podname} > ${tmpfile} &
  local k8s_log_pid=$!
  echo "Waiting ${sleep_amount}s for logs to download"
  sleep ${sleep_amount}
  less -rf +F "${tmpfile}"
  # stop the background stream and remove the temp file, even if less exits non-zero
  kill ${k8s_log_pid} && echo "kubectl logs pid ${k8s_log_pid} killed"
  rm -f "${tmpfile}"
}
Notes:

grep stops after the first matching podname is found (-m 1).
The --timestamps option of kubectl logs prefixes each log line with its timestamp.
Without sleep, even --tail 100 becomes slow when lines are extremely long. Use less -S +F to truncate long lines instead of wrapping them. With mktemp and sleep, the logs are downloaded before less is called, so even --tail 1000 can be viewed 'instantly'.
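
The wait before less opens follows from the integer arithmetic in kubelogs: a base of 7 seconds plus one second per 20000 requested lines. As a small sketch for previewing that wait, the helper below repeats the same arithmetic; the name kubelogs_wait is hypothetical and not part of the functions above.

```shell
# Hypothetical helper: print how many seconds kubelogs would sleep
# for a given number of tail lines (defaults to KUBELOGS_MAX, then 10000).
kubelogs_wait() {
  local log_tail_lines=${1:-${KUBELOGS_MAX:-10000}}
  # same integer arithmetic as in kubelogs: 7s base + 1s per 20000 lines
  echo $((7 + log_tail_lines / 20000))
}

kubelogs_wait 10000    # integer division gives 0 extra seconds
kubelogs_wait 100000   # 100000 / 20000 adds 5 seconds
```

With 10000 lines the integer division yields 0, so the wait stays at 7 seconds; 100000 lines give 12 seconds.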


Execute a bash shell in a pod

To execute a bash shell in a pod, use:
kubectl exec -ti --namespace <your-namespace> <full-podname> -- bash
To automate finding full podname and executing shell, add the following function to .bashrc and execute kubebash <your-namespace> <podname-prefix>
kubebash() {
  # Execute a bash shell in a pod
  # Set $KUBECONFIG to deviate from "$HOME/.kube/config"
  if [ $# -ne 2 ]; then
      echo "Usage: kubebash <your-namespace> <podname-prefix>"
      return
  fi
  local namespace=$1
  local pod=$2
  local podname=`kubectl get pods --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | grep Running | grep -o -m 1 "^${pod}[a-zA-Z0-9\-]*\b"`
  if [[ ${podname} != ${pod}* ]]; then
      echo "Pod \"${pod}\" not found in namespace \"${namespace}\""
      return
  fi
  kubectl exec -ti --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} ${podname} -- bash
}
Notes:

grep stops after first matching podname is found (-m 1)
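
The podname lookup shared by kubelogs, kubebash and kubebranch can be tried without a cluster by feeding it sample `kubectl get pods` output. The sketch below is a variation, not the original grep pipeline: it uses awk to require that the STATUS column (assumed to be the third column) equals Running and that the name starts with the prefix, then stops at the first match like grep -m 1. The function name first_running_pod and the sample pod names are invented for illustration.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# first pod whose name starts with the given prefix and whose STATUS is Running.
first_running_pod() {
  local prefix=$1
  # index($1, p) == 1 means the NAME column starts with the prefix;
  # exit after the first hit, mirroring grep -m 1
  awk -v p="${prefix}" '$3 == "Running" && index($1, p) == 1 { print $1; exit }'
}

# Sample input (invented names), shaped like `kubectl get pods` output
sample_pods='NAME                    READY   STATUS             RESTARTS   AGE
api-7d9f8b6c5d-abcde    0/1     CrashLoopBackOff   12         3h
api-7d9f8b6c5d-fghij    1/1     Running            0          3h
worker-5c6d7e8f9-klmn   1/1     Running            0          1d'

printf '%s\n' "${sample_pods}" | first_running_pod api
```

Against a real cluster, the same helper would be fed by `kubectl get pods --namespace` with your namespace. Matching on the STATUS column avoids picking up a pod that merely has "Running" somewhere else in its row.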


View a list of current branch[es] deployed for namespace [+ pod]

To get an overview of the branches currently deployed in a namespace [for pods containing some string], add the following function to .bashrc and execute kubebranch <your-namespace> [<partial-podname>]
kubebranch() {
  # View a list of current branch[es] deployed for namespace [+ pod]
  # Set $KUBECONFIG to deviate from "$HOME/.kube/config"
  if [ $# -gt 2 ]; then
    echo "Usage: kubebranch <your-namespace> [<partial-podname>]"
    return
  fi
  if [ $# -lt 1 ]; then
    echo "Usage: kubebranch <your-namespace> [<partial-podname>]"
    return
  fi
  local namespace=$1
  if [ $# -eq 2 ]; then
    local pod=$2
    local podname=`kubectl get pods --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | grep Running | grep -o -m 1 "^${pod}[a-zA-Z0-9\-]*\b"`
    if [[ ${podname} != ${pod}* ]]; then
        echo "Pod \"${pod}\" not found in namespace \"${namespace}\""
        return
    fi
    kubectl get deployments --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | sed -n '1!p' | awk '{print $1 "\t" $8}' | uniq | tr ":" "\t" | column  -t | grep ${pod}
  else
    kubectl get deployments --kubeconfig ${KUBECONFIG:-"$HOME/.kube/config"} --namespace ${namespace} -o wide | sed -n '1!p' | awk '{print $1 "\t" $8}' | uniq | tr ":" "\t" | column  -t
  fi
}
Notes:

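Partial podnames can be used.
The awk field $8 assumes that IMAGES is the eighth column of kubectl get deployments -o wide. The column order can differ between kubectl versions, so adjust the field number if the table shows the wrong column.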
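
The table that kubebranch prints comes from plain text processing of `kubectl get deployments -o wide`. As a sketch under assumptions, the same idea can locate the IMAGES column by its header instead of a fixed position and split each image at its last colon. The function name deployment_tags and the sample rows are invented; deployments with several containers (comma-separated images) are not handled.

```shell
# Hypothetical helper: read `kubectl get deployments -o wide` output on stdin
# and print "name<TAB>image<TAB>tag" for every deployment.
deployment_tags() {
  awk '
    NR == 1 {
      # find the IMAGES column by its header name
      for (i = 1; i <= NF; i++) if ($i == "IMAGES") col = i
      next
    }
    col {
      n = split($col, parts, ":")
      # a trailing part containing "/" is a registry port, not a tag
      if (n > 1 && parts[n] !~ /\//) {
        tag = parts[n]
        image = substr($col, 1, length($col) - length(tag) - 1)
      } else {
        tag = "<none>"
        image = $col
      }
      print $1 "\t" image "\t" tag
    }
  '
}

# Sample input (invented rows), shaped like `kubectl get deployments -o wide`
sample_deployments='NAME     READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                  SELECTOR
api      2/2     2            2           3h    api          registry.example.com/api:feature-x      app=api
worker   1/1     1            1           1d    worker       registry.example.com:5000/worker:main   app=worker'

printf '%s\n' "${sample_deployments}" | deployment_tags
```

Piping the result through column -t, as kubebranch does, aligns the three fields. Splitting at the last colon keeps a registry port such as :5000 inside the image name, whereas tr ":" "\t" would break it into an extra column.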


## kubetop.py
# View node and container resource requests, limits, and usage, grouped per node,namespace,pod,container
#
# Installation:
# pip install mapply pandas kubernetes sh
# brew install watch kubectl coreutils
#
# Usage:
# alias kubetop='watch -n4 python ~/Downloads/kubetop.py'
# kubetop

import json
import os
import re
import subprocess

import numpy as np
import pandas as pd
from kubernetes.utils.quantity import parse_quantity
from sh import numfmt

from mapply.parallel import multiprocessing_imap


class Q(dict):
    """Dict with key access via attributes."""

    def __missing__(self, key):
        value = self[key] = type(self)()  # retain local pointer to value
        return value  # faster to return than dict lookup

    def __getattr__(self, name: str):
        return self[name]

    def __setattr__(self, name: str, value):
        self[name] = value


pd.set_option("display.max_rows", None)
pd.set_option("display.width", None)
pd.set_option("display.max_colwidth", None)


def print_node_top():
    nodes = subprocess.check_output(
        "kubectl top node --sort-by memory", shell=True, encoding="utf-8"
    ).strip()
    print(nodes)
    return nodes


def get_nodes():
    nodes = subprocess.check_output(
        "kubectl get nodes -L 'topology.kubernetes.io/zone' -L 'beta.kubernetes.io/arch' -L 'beta.kubernetes.io/instance-type' --no-headers=true",
        shell=True,
        encoding="utf-8",
    ).strip()
    nodes = pd.DataFrame(
        (re.split(" +", x.strip()) for x in nodes.split("\n")),
        columns=[
            "node",
            "status",
            "roles",
            "age",
            "version",
            "zone",
            "arch",
            "instance_type",
        ],
    ).set_index("node")
    nodes["spot"] = nodes["roles"].str.contains("spot-worker")
    return nodes


def get_allocatable():
    allocatable = (
        subprocess.check_output(
            "kubectl get nodes -o jsonpath=\"{.items[*]['metadata.name', 'status.allocatable']}\"",
            shell=True,
            encoding="utf-8",
        )
        .strip()
        .split(" ")
    )
    allocatable = dict(zip(*np.array_split(allocatable, 2)))
    df = pd.Series(allocatable).apply(json.loads).apply(pd.Series)
    df.index.name = "node"
    return df


def get_top():
    top = subprocess.check_output(
        "kubectl top pods --containers=true --all-namespaces --no-headers=true",
        shell=True,
        encoding="utf-8",
    ).strip()
    top = pd.DataFrame(re.split(" +", x.strip()) for x in top.split("\n"))
    top.columns = [
        "namespace",
        "pod",
        "container",
        ("cpu", "usage"),
        ("memory", "usage"),
    ]
    top.reset_index(inplace=True, drop=True)
    top.set_index(["pod", "namespace", "container"], inplace=True)
    top.columns = pd.MultiIndex.from_tuples(top.columns)
    return top


def get_resources():
    pods = subprocess.check_output(
        "kubectl get pods --all-namespaces -o json", shell=True, encoding="utf-8"
    ).strip()
    pods = json.loads(pods)
    per_node = Q()
    for pod in pods["items"]:
        if pod['status']['phase'] in ("Succeeded", "Pending"):
            continue
        for container in pod["spec"]["containers"]:
            if not container["resources"]:
                # np.NaN was removed in NumPy 2.0; np.nan works on all versions
                container["resources"] = {
                    "limits": {"cpu": np.nan},
                    "requests": {"cpu": np.nan},
                }

        for volume in pod["spec"]["volumes"]:
            if claim := volume.get("persistentVolumeClaim"):
                name = volume["name"]
                claim_name = claim["claimName"]
                # match the pvc to the container mount
                for container in pod["spec"]["containers"]:
                    for mount in container.get("volumeMounts", []):
                        if mount["name"] == name:
                            if container["resources"]["requests"].get("pvc"):
                                container["resources"]["requests"][
                                    "pvc"
                                ] += f",{claim_name}"
                            else:
                                container["resources"]["requests"]["pvc"] = claim_name

        data = {
            container["name"]: container["resources"]
            for container in pod["spec"]["containers"]
        }

        per_node[pod["spec"].get("nodeName")][pod["metadata"]["namespace"]][
            pod["metadata"]["name"]
        ] = data

    resources = pd.json_normalize(per_node, sep="|").T
    resources.index = pd.MultiIndex.from_tuples(
        resources.index.str.split("|").tolist(),
        names=["node", "namespace", "pod", "container", "", ""],
    )
    resources = resources.unstack().unstack()[0]
    resources.drop(("pvc", "limits"), axis=1, inplace=True)
    return resources


def merge(top, resources):
    join = resources.join(top, how="outer")
    join = join[sorted(join.columns)]
    join = join.reorder_levels(["node", "namespace", "pod", "container"]).sort_index()
    return join


top, resources, nodes, allocatable = multiprocessing_imap(
    lambda x: x(),
    [get_top, get_resources, get_nodes, get_allocatable],
    progressbar=False,
    n_workers=1,
)
join = merge(top, resources)


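def _parse_quantity(x):
    """Convert Kubernetes quantity strings (e.g. "500m", "2Gi") to numbers."""
    if x != x:  # NaN is the only value that is not equal to itself
        return np.nan
    if x[0].isdigit():
        return parse_quantity(x)
    return x  # non-numeric values such as pvc names are left as-is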


node_totals = join.map(_parse_quantity).groupby("node").sum(0)

allocatable = allocatable.map(_parse_quantity)
allocatable.columns = pd.MultiIndex.from_tuples(
    (x, "allocatable") for x in allocatable.columns
)
node_totals = node_totals.join(allocatable)[["cpu", "memory"]]
node_totals = node_totals[sorted(node_totals.columns)]

nodes.columns = pd.MultiIndex.from_tuples(("meta", x) for x in nodes.columns)
nodes = nodes.join(node_totals)

# most memory usage first
nodes.sort_values(("memory", "usage"), ascending=False, inplace=True)

# sort containers in the same node order as the nodes table
join = join.loc[nodes.index]

nodes["memory"] = (
    nodes["memory"]
    .map(int)
    .map(str)
    .apply(lambda x: numfmt("\n".join(x), to="iec-i", format="%.1f", field="-").split())
)

# if HIDESYS is set, exclude namespaces that have 'system' or 'prometheus' in them
if os.environ.get("HIDESYS"):
    join = join[
        ~join.index.to_frame(index=False)
        .namespace.str.contains("system|prometheus", regex=True)
        .fillna(False)
        .values
    ]

print(nodes)
print(join.fillna(""))

# all containers that do not have mem limits or mem limit != mem request
# join[join.memory.limits.isna() | (join.memory.limits != join.memory.requests)]['memory'].reset_index().fillna('').drop(columns='node').sort_values(['namespace', 'pod', 'container'])