Steven Acreman sacreman

  • ThunderOps
  • United Kingdom
sacreman / k8sbackup.sh
Last active Jul 12, 2019
Backup all K8s config to a timestamped location
View k8sbackup.sh
#!/bin/bash -e
BACKUP_DIR="/var/tmp/k8sbackup/$(date +%s)"
echo "Backing up cluster to ${BACKUP_DIR}"
# Quote the jsonpath expression so the shell does not glob or brace-expand it
NAMESPACES=$(kubectl get ns -o jsonpath='{.items[*].metadata.name}')
RESOURCETYPES="${RESOURCETYPES:-"ingress deployment configmap secret svc rc ds networkpolicy statefulset cronjob pvc"}"
GLOBALRESOURCES="${GLOBALRESOURCES:-"namespace storageclass clusterrole clusterrolebinding customresourcedefinition"}"
mkdir -p "${BACKUP_DIR}"
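The preview above stops after creating the backup directory. As a rough sketch of how a script like this could continue (an assumption, not the gist's actual code), the function below dumps each cluster-wide resource type and then each namespaced resource type to a YAML file. The `backup_cluster` function, the `KUBECTL` override variable, and the `global/` plus per-namespace directory layout are all hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical sketch only: the real k8sbackup.sh may be organised differently.
# KUBECTL can be overridden (for example with a stub) to dry-run the loop.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: backup_cluster DIR "NAMESPACES" "RESOURCETYPES" "GLOBALRESOURCES"
# Each list argument is a space-separated string, as in the variables above.
backup_cluster() {
  backup_dir="$1"
  namespaces="$2"
  resourcetypes="$3"
  globalresources="$4"

  # Cluster-wide resources go into an assumed "global" subdirectory.
  mkdir -p "${backup_dir}/global"
  for resource in ${globalresources}; do
    "${KUBECTL}" get "${resource}" -o yaml \
      > "${backup_dir}/global/${resource}.yaml"
  done

  # Namespaced resources go into one subdirectory per namespace.
  for ns in ${namespaces}; do
    mkdir -p "${backup_dir}/${ns}"
    for resource in ${resourcetypes}; do
      "${KUBECTL}" get "${resource}" -n "${ns}" -o yaml \
        > "${backup_dir}/${ns}/${resource}.yaml"
    done
  done
}
```

It could be called as `backup_cluster "${BACKUP_DIR}" "${NAMESPACES}" "${RESOURCETYPES}" "${GLOBALRESOURCES}"`. The unquoted expansions inside the `for` loops are deliberate, since each variable holds a space-separated list.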
sacreman / aks.txt
Created Oct 23, 2018
Dolos Results 2018-10-23
View aks.txt
2018-10-22 08:39:01 INFO Starting test
2018-10-22 08:39:01 INFO Creating Resource Group
2018-10-22 08:39:03 INFO Creating the AKS cluster
2018-10-22 08:52:22 INFO Getting cluster credentials
2018-10-22 08:52:23 INFO Get Nodes
2018-10-22 08:52:26 INFO b'NAME STATUS ROLES AGE VERSION\naks-nodepool1-18093422-0 Ready agent 3m v1.9.11\n'
2018-10-22 08:52:26 INFO Applying Deployment
2018-10-22 08:52:32 INFO b'deployment.apps/azure-vote-back created\nservice/azure-vote-back created\ndeployment.apps/azure-vote-front created\nservice/azure-vote-front created\n'
2018-10-22 08:52:32 INFO Getting external IP
2018-10-22 08:56:18 INFO Getting web contents from 104.41.139.254
View hosts.json
{
  "dashboard": {
    "title": "Hosts",
    "description": "Basic host stats: CPU, Memory Usage, Disk Utilisation, Filesystem usage and Predicted time to filesystems filling",
    "id": null,
    "rows": [{
      "collapse": false,
      "editable": true,
      "height": "250px",
      "panels": [{
sacreman / kubernetes-dashboard.json
Created Aug 6, 2017
Kubernetes Dashboard for Grafana
View kubernetes-dashboard.json
{
  "annotations": {
    "list": []
  },
  "description": "Monitors Kubernetes cluster using Prometheus. Shows overall cluster CPU / Memory / Filesystem usage as well as individual pod, containers, systemd services statistics. Uses cAdvisor metrics only.",
  "editable": true,
  "gnetId": 315,
  "graphTooltip": 0,
  "hideControls": false,
  "id": 2,
sacreman / prometheus.yml
Last active Aug 7, 2020
Prometheus configuration to scrape Kubernetes outside the cluster
View prometheus.yml
# Prometheus configuration to scrape Kubernetes outside the cluster
# Change master_ip and api_password to match your master server address and admin password
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  # metrics for the prometheus server
  - job_name: 'prometheus'
View dalmatiner_prometheus.md

Monitoring Kubernetes with Prometheus + Long Term Storage

View build_vs_buy.md

Running an online service isn't easy. Every day you make complex decisions about how to solve problems, and often there is no right or wrong answer, just different approaches with different results. On the infrastructure side you have to weigh up where everything will be hosted: on a cloud service like AWS, in your own data centres, or any number of other options, perhaps even a mix.

Monitoring choices are equally hard. There are the familiar tools that are a known quantity, some new ones that look interesting from reading blogs, and then the option to buy one of the many SaaS products.

Let's imagine, to keep this blog brief, that you are looking to move from your traditional data centre into AWS and want to upgrade from your Nagios, Graphite and StatsD stack to something a bit newer. This is an incredibly common scenario that we see every day.

The first step is to analyse up front whether to build or buy. To make that decision properly you'll need to

View feature_flags.md
  • aws
  • awsapi
  • metricsBrowser
  • aws-i
View old_api_docs.md

Introduction

Welcome to the Dataloop API documentation!

To use the API you'll need an API key, which can be created in Dataloop under your user account settings. When integrating services you may want to consider creating an application-specific user in Dataloop with access to accounts at the correct role level.

You will also need to know the organisation name and account name that you want to work with. These match the organisation and account names in Dataloop. Use these details where you see <org name> and <account name> in the examples.

View sum base.count metrics
#!/usr/bin/env python
'''
Returns a sum of the number of agents that have returned the base.count metric in the last minute.
You will need to update the TAG, org, account and key variables in the settings below.
'''
import sys
from dlcli import api