Rob Gibbon grobbie

@grobbie
grobbie / charmed-spark-datahub.sh
Last active February 8, 2024 15:50
Deploy an on-premise data hub with Canonical MAAS, Spark, Kubernetes and Ceph
#!/bin/bash
#
# author: Rob Gibbon, Canonical Ltd.
# about: This script deploys a complete, charmed data lake stack on MAAS (https://maas.io)
# prerequisites: * A host computer running Ubuntu
#                * A working MAAS environment, with provisioned
#                  nodes available and configured for KVM/libvirt
#                * A Kaggle account and valid API token configured
#                * Google Chrome browser installation for WUIs
from pyspark.sql import SparkSession
from pyspark import SparkFiles
spark = SparkSession.builder \
    .appName("LinuxDistributionStabilityRanking") \
    .getOrCreate()
spark.conf.set("fs.s3a.attempts.maximum", "1")
spark.conf.set("fs.s3a.connection.establish.timeout", "5000")
spark.conf.set("fs.s3a.connection.timeout", "10000")
import requests
from bs4 import BeautifulSoup
import pandas as pd
def get_stackoverflow_data(num_pages=3):
    base_url = "https://stackoverflow.com/questions/tagged/"
    data = []
    tags = ["ubuntu", "rhel", "arch", "suse"]
    for tag in tags:
import requests
from bs4 import BeautifulSoup
import pandas as pd
def get_stackoverflow_data(tag="linux-distribution", num_pages=3):
    base_url = "https://stackoverflow.com/questions/tagged/"
    data = []
    for page in range(1, num_pages + 1):
        url = f"{base_url}{tag}?tab=votes&page={page}"
grobbie / SSB 100 performance testing on PostgreSQL 12 and Ubuntu 20.04
Last active November 10, 2021 12:47
SSB 100 performance testing on PostgreSQL 12 and Ubuntu 20.04
#pg_bulkload
sudo apt install libcrypto++6 libssl-dev libkrb5-dev libselinux-dev libpam-dev libcrypto++-dev libgssapi-krb5-2 libz-dev libedit-dev
#note - dbgen segfaults on ARM, fixed by using gcc-10
sudo apt install gcc-10
#dexter and build basics
sudo apt-get install gcc make flex bison byacc git ruby ruby-dev postgresql-server-dev-all
pushd /tmp
grobbie / TPC DS 100 performance test on Postgres & Ubuntu
Last active November 10, 2021 17:16
TPC-DS 100 performance testing on PostgreSQL 12 & Ubuntu 20.04 LTS
#dexter and build basics
sudo apt-get install gcc make flex bison byacc git ruby ruby-dev
#pg_bulkload
sudo apt install libcrypto++6 libssl-dev libkrb5-dev libselinux-dev libpam-dev libcrypto++-dev libgssapi-krb5-2 libz-dev libedit-dev
wget https://github.com/ossc-db/pg_bulkload/archive/refs/tags/VERSION3_1_19.tar.gz
tar xzf VERSION3_1_19.tar.gz
pushd pg_bulkload-VERSION3_1_19
make
╔════════════════╦══════════════════╦══════════════════╦══════════╗
║ Gas            ║ 1900 level (ppm) ║ 2015 level (ppm) ║ Increase ║
╠════════════════╬══════════════════╬══════════════════╬══════════╣
║ Carbon dioxide ║ 296              ║ 400              ║ 35%      ║
║ Methane        ║ 0.88             ║ 1.859            ║ 111%     ║
║ Nitrous oxide  ║ 0.2772           ║ 0.328            ║ 18%      ║
╚════════════════╩══════════════════╩══════════════════╩══════════╝
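The Increase column appears to be the relative change between the two concentration columns, (2015 level - 1900 level) / 1900 level, rounded to a whole percent. A minimal Python sketch under that assumption, with the values copied from the table; the names `levels_ppm` and `percent_increase` are illustrative and not from the original gist.

```python
# Concentrations in ppm copied from the table: (1900 level, 2015 level).
levels_ppm = {
    "Carbon dioxide": (296, 400),
    "Methane": (0.88, 1.859),
    "Nitrous oxide": (0.2772, 0.328),
}


def percent_increase(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# Round to whole percentages, as the table does.
increases = {
    gas: round(percent_increase(old, new))
    for gas, (old, new) in levels_ppm.items()
}

for gas, pct in increases.items():
    print(f"{gas}: {pct}%")
```

Rounding to the nearest integer reproduces the 35%, 111% and 18% entries.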
╔════════════════╦══════════════════════╦══════════════════════╦═════════════════════╗
║ Gas            ║ Delta as GWP 20Y     ║ Delta as GWP 100Y    ║ Delta as GWP 500Y   ║
╠════════════════╬══════════════════════╬══════════════════════╬═════════════════════╣
║ Carbon dioxide ║ 120                  ║ 120                  ║ 120                 ║
║ Nitrous oxide  ║ 0.054 x 289 = 15.606 ║ 0.054 x 298 = 16.092 ║ 0.054 x 153 = 8.262 ║
║ Methane        ║ 1.19 x 96 = 114.24   ║ 1.19 x 32 = 38.08    ║ 1.19 x 7.6 = 9.044  ║
╚════════════════╩══════════════════════╩══════════════════════╩═════════════════════╝
╔════════════════╦════════════════════════════════════════╦═══════════════════════════════╦══════════════╗
║ Gas            ║ Preindustrial atmospheric value in ppm ║ 2015 atmospheric value in ppm ║ Delta in ppm ║
╠════════════════╬════════════════════════════════════════╬═══════════════════════════════╬══════════════╣
║ Carbon dioxide ║ 280                                    ║ 400                           ║ 120          ║
║ Nitrous oxide  ║ 0.275                                  ║ 0.329                         ║ 0.054        ║
║ Methane        ║ 0.65                                   ║ 1.84                          ║ 1.19         ║
╚════════════════╩════════════════════════════════════════╩═══════════════════════════════╩══════════════╝
╔══════════════════════════╦═══════════════════╦══════════════╦═══════════════╦═══════════════╗
║ GWP values and lifetimes ║ Lifetime in years ║ GWP 20 years ║ GWP 100 years ║ GWP 500 years ║
╠══════════════════════════╬═══════════════════╬══════════════╬═══════════════╬═══════════════╣
║ Methane                  ║ 12                ║ 96           ║ 32            ║ 7.6           ║
║ Nitrous oxide            ║ 114               ║ 289          ║ 298           ║ 153           ║
║ Carbon dioxide           ║ 95                ║ 1            ║ 1             ║ 1             ║
╚══════════════════════════╩═══════════════════╩══════════════╩═══════════════╩═══════════════╝
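The "Delta as GWP" figures are each gas's ppm delta multiplied by its GWP factor for the chosen time horizon. A minimal Python sketch of that arithmetic, using only numbers copied from the tables; the names `delta_ppm`, `gwp` and `weighted_delta` are illustrative and not from the original gist.

```python
# Delta in ppm (2015 value minus preindustrial value), copied from the tables.
delta_ppm = {"Carbon dioxide": 120, "Nitrous oxide": 0.054, "Methane": 1.19}

# GWP factors keyed by time horizon in years, copied from the GWP table.
gwp = {
    "Carbon dioxide": {20: 1, 100: 1, 500: 1},
    "Nitrous oxide": {20: 289, 100: 298, 500: 153},
    "Methane": {20: 96, 100: 32, 500: 7.6},
}


def weighted_delta(gas, horizon):
    """Return the ppm delta scaled by the gas's GWP for the given horizon."""
    return round(delta_ppm[gas] * gwp[gas][horizon], 3)


for gas in delta_ppm:
    row = ", ".join(f"{h}Y: {weighted_delta(gas, h)}" for h in (20, 100, 500))
    print(f"{gas}: {row}")
```

Keeping the deltas and the GWP factors in separate dictionaries mirrors the layout of the tables, so either input can be updated without touching the other.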