Steven Acreman sacreman

  • ThunderOps
  • United Kingdom

Running an online service isn't easy. Every day you make complex decisions about how to solve problems, and often there is no right or wrong answer, just different approaches with different results. On the infrastructure side you have to weigh up where everything will be hosted: on a cloud service like AWS, in your own data centres, or in any number of other options, perhaps even a mix.

Monitoring choices are equally hard. There are the tools that are familiar and a known quantity, some new ones that look interesting from reading blogs, and then the option to buy one of any number of SaaS products.

Let's imagine, to keep this blog brief, that you are looking to move into AWS from your traditional data centre and want to upgrade from your Nagios, Graphite and StatsD stack to something a bit newer. This is an incredibly common scenario that we see every day.

The first decision to make, up front, is whether to build or buy. To properly make that decision you'll need to

Monitoring Kubernetes with Prometheus + Long Term Storage

@sacreman
sacreman / graphite.py
Created July 8, 2015 13:44
Statsite Graphite + Dataloop Sink
"""
Supports flushing metrics to graphite
"""
import sys
import socket
import logging


class GraphiteStore(object):
    def __init__(self, host="localhost", port=2003, prefix="statsite.", attempts=3):
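
The preview above stops at the constructor, so the following is only a minimal sketch of how metrics could be flushed to Graphite, not the gist's actual code. It assumes Graphite's plaintext protocol (one `path value timestamp` line per metric, sent over TCP) and reuses the defaults from the signature above; `format_metric` and `flush` are hypothetical names.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: path, value, timestamp, newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def flush(metrics, host="localhost", port=2003, prefix="statsite."):
    """Send (key, value, timestamp) tuples to Graphite over one TCP connection.

    Hypothetical helper: retries (the `attempts` argument in the gist)
    are deliberately left out of this sketch.
    """
    payload = "".join(
        format_metric(prefix + key, value, ts) for key, value, ts in metrics
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload.encode("ascii"))
```

Calling `flush([("api.requests", 42, 1436363040)])` would need a Graphite listener on the given host and port; `format_metric` can be checked on its own without one.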

My Top10 Open Source Time Series databases blog has been incredibly popular, with over 10,000 views and growing. It sat on the front page of Reddit /r/programming for a day or two, and we got a bunch of traffic from Hacker News and DBWeekly. The raw data in the spreadsheet that accompanies the blog has been constantly updated by a team of volunteers that now includes some of the database authors.

It has quickly become the single point of reference for anyone looking for a new time series database.

insert tweet referencing back blog to us

Someone called my blog biased on Twitter, which I thought was funny. It's true that I am biased towards mostly solving my own problems and, like anyone, I can only draw upon my own experiences. However, I'm fairly impartial when it comes to these topics in general.

{
  "dashboard": {
    "title": "Hosts",
    "description": "Basic host stats: CPU, Memory Usage, Disk Utilisation, Filesystem usage and Predicted time to filesystems filling",
    "id": null,
    "rows": [{
      "collapse": false,
      "editable": true,
      "height": "250px",
      "panels": [{
@sacreman
sacreman / gist:4453493
Created January 4, 2013 15:32
haproxy config
global
    log 127.0.0.1 local2 info
    pidfile /var/run/haproxy.pid
    stats socket /var/run/haproxy.stat mode 600 level admin
    #debug

defaults
    mode http
    log global
@sacreman
sacreman / upgrade_python.sh
Created November 19, 2014 14:34
Centos 6 upgrade to Python 2.7
#!/usr/bin/env bash
# install build tools
sudo yum install make automake gcc gcc-c++ kernel-devel git-core -y
# install python 2.7 and change default python symlink
sudo yum install python27-devel -y
sudo rm /usr/bin/python
sudo ln -s /usr/bin/python2.7 /usr/bin/python
<URL "http://localhost:8098/stats">
  Instance "riak"
  <Key "converge_delay_last">
    Type "gauge"
  </Key>
  <Key "converge_delay_max">
    Type "gauge"
  </Key>
  <Key "converge_delay_mean">
    Type "gauge"
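
The configuration above reads named keys from the JSON document served at the Riak stats URL and reports each one as a gauge. As a rough illustration of that lookup, here is a small Python sketch; it is not how collectd implements it, and `fetch_stats` and `pick_gauges` are hypothetical names. Only the three key names and the URL come from the config.

```python
import json
from urllib.request import urlopen

# Keys listed in the config preview above.
KEYS = ("converge_delay_last", "converge_delay_max", "converge_delay_mean")


def fetch_stats(url="http://localhost:8098/stats"):
    """Download and decode the JSON stats document (needs a running Riak node)."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def pick_gauges(stats, keys=KEYS):
    """Keep only the configured keys that are present in the stats document."""
    return {key: stats[key] for key in keys if key in stats}
```

With a node running locally, `pick_gauges(fetch_stats())` would return a dictionary holding whichever of the three keys the node reports.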

DalmatinerDB Installation Guide (Linux)

These instructions outline how to install DalmatinerDB on a single Linux x86_64 physical server or virtual machine. Scaling out will be covered in a future document. This setup guide also covers configuring cAdvisor and Telegraf to send in monitoring data, and Grafana to build dashboards.

Here's how everything connects together:

dalmatiner architecture

Create a VM

@sacreman
sacreman / aks.txt
Created October 23, 2018 11:56
Dolos Results 2018-10-23
2018-10-22 08:39:01 INFO Starting test
2018-10-22 08:39:01 INFO Creating Resource Group
2018-10-22 08:39:03 INFO Creating the AKS cluster
2018-10-22 08:52:22 INFO Getting cluster credentials
2018-10-22 08:52:23 INFO Get Nodes
2018-10-22 08:52:26 INFO b'NAME STATUS ROLES AGE VERSION\naks-nodepool1-18093422-0 Ready agent 3m v1.9.11\n'
2018-10-22 08:52:26 INFO Applying Deployment
2018-10-22 08:52:32 INFO b'deployment.apps/azure-vote-back created\nservice/azure-vote-back created\ndeployment.apps/azure-vote-front created\nservice/azure-vote-front created\n'
2018-10-22 08:52:32 INFO Getting external IP
2018-10-22 08:56:18 INFO Getting web contents from 104.41.139.254
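
The timestamps in this log can be turned into per-step durations. The sketch below assumes each line has the shape `date time LEVEL message` and measures a step as the gap until the next log line; it uses the first five lines of the log above, and `parse_line` and `step_durations` are hypothetical helper names.

```python
from datetime import datetime

# First five lines of the log above, copied verbatim.
LOG_LINES = [
    "2018-10-22 08:39:01 INFO Starting test",
    "2018-10-22 08:39:01 INFO Creating Resource Group",
    "2018-10-22 08:39:03 INFO Creating the AKS cluster",
    "2018-10-22 08:52:22 INFO Getting cluster credentials",
    "2018-10-22 08:52:23 INFO Get Nodes",
]


def parse_line(line):
    """Split a log line into (timestamp, message), ignoring the log level."""
    date_part, time_part, _level, message = line.split(" ", 3)
    stamp = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    return stamp, message


def step_durations(lines):
    """Return (message, seconds until the next log line) for every line but the last."""
    parsed = [parse_line(line) for line in lines]
    return [
        (message, int((next_stamp - stamp).total_seconds()))
        for (stamp, message), (next_stamp, _) in zip(parsed, parsed[1:])
    ]


for message, seconds in step_durations(LOG_LINES):
    print(f"{seconds:>5}s  {message}")
```

On these lines, the gap after "Creating the AKS cluster" is 799 seconds, a little over 13 minutes.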