resources:
  limits:
    cpu: 1000m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 50Mi

How to Set Kubernetes Resource Requests and Limits - A Saga to Improve Cluster Stability and Efficiency

A mystery

So, it all started on September 1st, right after our cluster upgrade from 1.11 to 1.12. Almost the very next day, we began to see alerts on kubelet reported by Datadog. On some days we would get a few (3 - 5) of them; on other days, more than 10. The alert monitor is based on the Datadog check kubernetes.kubelet.check, and it's triggered whenever the kubelet process is down on a node.
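Whenever one of these alerts fired, the first step was to confirm the kubelet's state on the affected node. A minimal sketch, assuming SSH access to the node and a systemd-managed kubelet (both assumptions on my part, not something the alert itself tells you):

# Check whether the kubelet service is up (assumes systemd manages it)
systemctl status kubelet

# Tail recent kubelet logs for crash loops or OOM clues
journalctl -u kubelet --since "1 hour ago" --no-pager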

We know kubelet plays an important role in Kubernetes scheduling. Without it running properly, a node is effectively removed from the functional cluster, and the more nodes with a problematic kubelet, the more degraded the cluster becomes. Now, imagine waking up to

extraArgs:
  base-role-arn: "arn:aws:iam::<YOUR AWS ACCOUNT ID>:role/"
host:
  interface: cni0
  iptables: true
rbac:
  create: true
resources:
  limits:
    cpu: 100m
Quick Kubernetes Helm Learnings on Chart Backward Compatibility

At Buffer we utilize kube2iam for AWS permissions. It's deployed, upgraded, and managed with the kube2iam chart from the stable Helm repository.

It had been working great until last week, when an oversight on my part broke it. The situation was resolved within 30 minutes and no users were affected. Nonetheless, I'd love to share my learnings with the community.

To upgrade kube2iam, I have been using this command:

helm upgrade kube2iam --install stable/kube2iam --namespace default -f ./values.yaml
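Note that this always pulls the latest chart version, which is exactly how a backward-incompatible chart release can sneak into an otherwise routine upgrade. A safer variant is to pin the chart version explicitly; --version is a standard Helm flag, though the version number below is just illustrative:

# Pin the chart version so an upgrade never silently pulls a breaking release
# (the version number here is a placeholder, not a recommendation)
helm upgrade kube2iam --install stable/kube2iam --namespace default --version 2.5.1 -f ./values.yaml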

Kubernetes Master Nodes Backup for Kops on AWS - A step-by-step Guide

Those who have been using kops for a while will know that the upgrade from 1.11 to 1.12 poses a greater risk than usual, as it also upgrades etcd2 to etcd3.

Since this upgrade is disruptive to the control plane (master nodes), however briefly, it's something we take very seriously: nearly all of Buffer's production services run on this single cluster. We felt a more thorough backup process was needed than the Heptio Velero setup we already had in place.

To my surprise, my Google searches didn't yield any useful results on how to carry out the backup steps. To be fair, there are a few articles specifically about backing up master nodes created by kubeadm, but nothing too concrete for `kops`.
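For the etcd data itself, one belt-and-braces option is to snapshot the EBS volumes that back etcd on the master nodes. A minimal sketch with the AWS CLI, assuming the volumes carry the k8s.io/etcd/main tag that kops applies (verify the tag key in your own account, and substitute real volume IDs; the one below is a placeholder):

# List the EBS volumes backing the main etcd cluster (tag key assumed)
aws ec2 describe-volumes \
  --filters "Name=tag-key,Values=k8s.io/etcd/main" \
  --query "Volumes[].VolumeId" --output text

# Snapshot each volume before the upgrade (placeholder volume ID)
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "etcd-main backup before the 1.11 to 1.12 upgrade"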

Application Observability in Kubernetes with Datadog APM and Logging - A simple and actionable example

Last year I shared an example of how to set up application tracing in Kubernetes with Istio and Jaeger. Since then, the industry has made substantial headway on this front, and we are seeing more vendor support as a result. At Buffer, since we primarily use Datadog for Kubernetes and application monitoring, it's only fitting to complete the circle with Datadog APM and Logging. I had a chance to create a small example for the team and would very much love to share it with the community.

Okay, without further ado, let's dive in!

Installing Datadog agent

First things first: in order to collect metrics and logs from Kubernetes, a Datadog agent has to be installed in the cluster. The Datadog team ma
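The usual route at the time was the stable/datadog Helm chart. A minimal sketch, assuming Helm 2 and that chart; the value keys datadog.apmEnabled and datadog.logsEnabled are from memory of that chart generation, so check them against the chart's values.yaml before relying on this:

# Install the Datadog agent as a DaemonSet via the stable chart (Helm 2 syntax)
helm install --name datadog-agent stable/datadog \
  --set datadog.apiKey=<YOUR_DD_API_KEY> \
  --set datadog.apmEnabled=true \
  --set datadog.logsEnabled=true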

dd-tracing-logging-examples-nodejs-deplyment.yaml - A quick example for Datadog APM tracing and logging
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: dd-tracing-logging-examples-nodejs
  name: dd-tracing-logging-examples-nodejs
  namespace: dev
spec:
  replicas: 1
  template:
dd-tracing-logging-examples-nodejs.js - A quick example for Datadog APM tracing and logging
// Initialize the Datadog tracer before any other imports so auto-instrumentation works
require('dd-trace').init({
  hostname: process.env.DD_AGENT_HOST,
  port: 8126, // default port of the Datadog trace agent
  env: 'development',
  logInjection: true, // inject trace IDs into logs so traces and logs correlate
  analytics: true,
});
const { createLogger, format, transports } = require('winston');
const addAppNameFormat = format(info => {

Upgrading Kubernetes Cluster with Kops, and Things to Watch Out For

Alright! I'd like to apologize for over a year of inactivity. Very embarrassingly, I totally dropped the good habit. Anyway, today I'd like to share a shorter, less advanced walkthrough on how to upgrade Kubernetes with kops.

At Buffer, we have hosted our own k8s (Kubernetes for short) cluster on AWS EC2 instances since we started our journey, before AWS EKS existed. To do this effectively, we use kops: an amazing tool that manages pretty much every aspect of cluster management, from creation and upgrades to updates and deletion. It has never failed us.
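For reference before we dive in, the core upgrade flow with kops boils down to three commands (this assumes KOPS_STATE_STORE and the cluster name are already configured in your shell):

# Bump the Kubernetes version in the cluster spec
kops upgrade cluster --yes

# Apply the updated spec to the underlying cloud resources
kops update cluster --yes

# Replace instances one at a time so the new version rolls out safely
kops rolling-update cluster --yes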

How to start?

Okay, upgrading a cluster always makes people nervous, especially a production cluster. Trust me, I've been there! There's a saying: hope is not a strategy. So instead of hoping things will go smoothly, I always assume that shit will hit the fan if you skip testing. Plus, good luck explaining to people