This guide is a general howto based on a server I recently set up:
- You get a cheap server with several IPs
- You can run virtualised hosts with their own IPs
- You can also run containers with public IPs (in this case I'm using LXD/LXC; Docker is an option, or you can even experiment with others inside your KVM guests)
- ZFS integration for your LXC snapshotting and even your KVM guests; I can't stress enough how good ZFS is at this
- Ubuntu 16.04 LTS has security patches until at least 2021
- This server is a pet.. through and through!
- If it dies you have downtime! There is no automatic fail-over
This works for servers with dual disks.
As this server is going to be using ZFS, I used Hetzner's "server bidding" to get a server with ECC RAM (worth it if your data matters to you)..
I've never experienced issues with Hetzner (in 4+ years) but I know people who have.
I believe you should NEVER use a single host for production. This host is for me to run VMs and develop on only.
After ordering your new server it should be booted into the rescue image, and there should be an email in your inbox with the server's IP.
I sincerely hope you are using SSH keys..
Once booted into the rescue image (you can boot this from their Robot panel if required), you drive Hetzner's installimage with a config file.
I include a simple config file below; it will:
- Install Ubuntu 16.04 (xenial) onto a raid1 (mirror) disk set.
- Allocate 100GB to the LVM volume group.
- Only use ~30GB of the LVM (leaving space for LVM snapshots/backups).
The thinking behind not doing a ZFS root is simple: if you need to boot a recovery system to repair the server, ZFS is most likely not supported by your live-rescue image.
OK.. with that out of the way, create a config file:
cat >server.config<<EOF
DRIVE1 /dev/sda
DRIVE2 /dev/sdb
SWRAID 1
SWRAIDLEVEL 1
BOOTLOADER grub
HOSTNAME newhostname
PART /boot ext4 512M
PART lvm vg0 100G
LV vg0 root / xfs 20G
LV vg0 swap swap swap 4G
IMAGE /root/images/Ubuntu-1604-xenial-64-minimal.tar.gz
EOF
Now let Hetzner's automatic installer rip:
installimage -c server.config
I personally now mount the rootfs and set the default password:
mkdir /tmp/root
mount /dev/mapper/vg0-root /tmp/root
chroot /tmp/root
passwd root
#check whether root may log in over SSH (look for PermitRootLogin)
grep RootLogin /etc/ssh/sshd_config
#leave the chroot before rebooting
exit
reboot
This will be slow for the first 10 minutes or so as md is syncing the mirrors in the background. Don't worry, the server will be fast once it's done.
For now kick off the updates:
apt update
apt -y upgrade
apt -y dist-upgrade
We want to ensure that grub is correctly installed to both devices: run the following and select both /dev/sda and /dev/sdb.
dpkg-reconfigure grub-pc
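If you prefer a non-interactive route, reinstalling grub to both MBRs directly should end up with the same boot sectors (my sketch, not from the original run; it won't update the debconf device list):
grub-install /dev/sda
grub-install /dev/sdb
update-grub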
cat /proc/mdstat #look for "[UU]" meaning mirrors are in sync
#TODO: automate this check so you know if there are failures (see the sketch after these commands)
df -h /boot #md0
pvs #should be /dev/md1 and have some disk free
vgs
lvs
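A minimal sketch for the TODO above (my suggestion; assumes a working local MTA so mail to root gets delivered) — drop it into /etc/cron.hourly/ and make it executable:
#!/bin/sh
# warn root if any md array reports a missing member (e.g. [U_] instead of [UU])
if grep -q '\[.*_.*\]' /proc/mdstat; then
    echo "md RAID degraded on $(hostname)" | mail -s "mdstat alert" root
fi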
Now install ZFS support:
apt install -y zfs-initramfs
This makes a new third partition on each disk (good ol' primary MBR style..).
#TODO, maybe use sfdisk.. (a sketch follows the fdisk breakdown below)
echo "n\np\n3\n\n\nt\n3\nbf\nw\n" | fdisk /dev/sda
echo "n\np\n3\n\n\nt\n3\nbf\nw\n" | fdisk /dev/sdb
NB: \n means [enter].
Translating the above command (echo pipes this sequence into fdisk):
- n: new
- p: primary partition
- 3: number 3
- [enter]: accept the default first sector
- [enter]: accept the default last sector (use all remaining space)
- t: change type
- 3: partition number 3
- bf: Solaris partition type
- w: write changes to disk
zfs actually doesn't give a hoot what the partition type is.. but when seeing "Solaris", Linux admins will hopefully not be tempted to mkfs on it!
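If you want to act on the sfdisk TODO above, this should be roughly equivalent (untested sketch; sfdisk's --append script mode needs util-linux 2.26+, which xenial has):
# append one primary partition of type bf (Solaris) using all remaining free space
echo ',,bf' | sfdisk --append /dev/sda
echo ',,bf' | sfdisk --append /dev/sdb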
## ok re-probe partitions
partprobe
## check if detected..
ls -l /dev/disk/by-id/ata* | grep part3
You might need to reboot for the kernel to detect the new partitions..
Once rebooted we should see the new partitions under /dev/disk/by-id.
We want to create the zpool against those by-id device paths; this keeps the disk "set" portable etc..
First find the paths of the devices:
ls -l /dev/disk/by-id | grep ata | egrep "sda3|sdb3"
lrwxrwxrwx 1 root root 10 Aug 1 15:20 ata-WDC_WD2000FYYZ-01UL1B2_WD-WMC1P0DC50SF-part3 -> ../../sda3
lrwxrwxrwx 1 root root 10 Aug 1 15:20 ata-WDC_WD2000FYYZ-01UL1B2_WD-WMC1P0DEUT7R-part3 -> ../../sdb3
OK, let's make a zpool. I'm calling mine tank (homage to The Matrix, funnily enough).
zpool create \
-o ashift=12 \
-O atime=off \
-O compression=lz4 \
-O normalization=formD \
-O mountpoint=/tank \
tank mirror \
/dev/disk/by-id/ata-WDC_WD2000FYYZ-01UL1B2_WD-WMC1P0DC50SF-part3 \
/dev/disk/by-id/ata-WDC_WD2000FYYZ-01UL1B2_WD-WMC1P0DEUT7R-part3
newhostname ~ # zpool status -v
pool: tank
state: ONLINE
scan: none requested
config:
        NAME                                                  STATE     READ WRITE CKSUM
        tank                                                  ONLINE       0     0     0
          mirror-0                                            ONLINE       0     0     0
            ata-WDC_WD2000FYYZ-01UL1B2_WD-WMC1P0DC50SF-part3  ONLINE       0     0     0
            ata-WDC_WD2000FYYZ-01UL1B2_WD-WMC1P0DEUT7R-part3  ONLINE       0     0     0
errors: No known data errors
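One optional bit of housekeeping (my suggestion, not part of the install): scrub the pool periodically so silent corruption is caught and repaired early.
# run a scrub now
zpool scrub tank
# or schedule it monthly via cron (sketch of an /etc/cron.d entry)
# 0 3 1 * * root /sbin/zpool scrub tank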
We shift from eth0 to using a bridge device.
Note the pointopoint option on eth0
apt install -y bridge-utils
My hosts network config looks like this:
cat /etc/network/interfaces
auto lo
iface lo inet loopback
#primary nic
## We keep the IP setup here so that we can use pointopoint
auto eth0
iface eth0 inet static
address XXX.243.6.164
netmask 255.255.255.255
gateway XXX.243.6.129
pointopoint XXX.243.6.129
dns-nameservers 8.8.8.8 8.8.4.4
#VM-Bridge (for kvm)
#TODO: do we need to setup the IP here also?
auto br0
iface br0 inet static
address XXX.243.6.164
netmask 255.255.255.255
bridge_fd 0
bridge_maxwait 0
bridge_ports none
bridge_stp off
##adding in XXX.243.248.145/28 (145-158)
up route add -host XXX.243.248.145 dev br0
up route add -host XXX.243.248.146 dev br0
up route add -host XXX.243.248.147 dev br0
up route add -host XXX.243.248.148 dev br0
up route add -host XXX.243.248.149 dev br0
up route add -host XXX.243.248.150 dev br0
up route add -host XXX.243.248.151 dev br0
up route add -host XXX.243.248.152 dev br0
up route add -host XXX.243.248.153 dev br0
up route add -host XXX.243.248.154 dev br0
up route add -host XXX.243.248.155 dev br0
up route add -host XXX.243.248.156 dev br0
up route add -host XXX.243.248.157 dev br0
up route add -host XXX.243.248.158 dev br0
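If you'd rather not hand-type fourteen route lines, a small shell loop (my sketch; adjust the prefix to your own subnet) will generate them:
for i in $(seq 145 158); do
    echo "up route add -host XXX.243.248.$i dev br0"
done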
Now reboot to ensure the br0 device comes up OK etc..
sysctl -w net.ipv4.ip_forward=1
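Note that sysctl -w only applies to the running kernel; to keep forwarding enabled across reboots, persist it too (a suggested addition):
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf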
Next, move /home onto its own ZFS dataset:
mkdir /tmp/home
rsync -va /home/ /tmp/home/
rm -rf /home
zfs create -o mountpoint=/home tank/home
rsync -va /tmp/home/ /home/
rm -rf /tmp/home
zfs create tank/kvm
# This holds snapshots/nvram/dumps
zfs create -o mountpoint=/var/lib/libvirt/qemu tank/kvm/qemu
# Default storage, I use this for CD images
zfs create -o mountpoint=/var/lib/libvirt/images tank/kvm/images
apt install libvirt-bin qemu-kvm virtinst -y
# reboot to ensure services are OK
reboot
Let's define br0 as a network option in virsh.
KVM's default network list looks like this:
virsh net-list --all
Name State Autostart Persistent
----------------------------------------------------------
default active yes yes
Let's create a network definition (just telling libvirt that br0 is an option).
# This creates a file called br0 which we will import as a network
cat >br0.xml<<EOF
<network>
  <name>br0</name>
  <forward mode="bridge"/>
  <bridge name="br0"/>
</network>
EOF
Import that setting, AKA "define" it:
virsh net-define br0.xml
Then ensure that network definitions start automatically.
virsh net-autostart br0
virsh net-autostart default #virbr0
Network br0 marked as autostarted
# now start it
virsh net-start br0
Result:
virsh net-list --all
Name State Autostart Persistent
----------------------------------------------------------
br0 active yes yes
default active yes yes
cd /var/lib/libvirt/images
wget -c https://stable.release.core-os.net/amd64-usr/current/coreos_production_iso_image.iso
wget -c http://releases.ubuntu.com/14.04/ubuntu-14.04.4-server-amd64.iso
wget -c http://mirrors.ukfast.co.uk/sites/ftp.centos.org/7/isos/x86_64/CentOS-7-x86_64-Minimal-1511.iso
The VM's NIC defaults to an internal NAT device..
This gives them an internal network + DHCP etc..
However, here I attached the NIC to br0 using virtio.
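Something along these lines with virt-install should do it (a sketch; the name, sizes and ISO path are placeholders; --network bridge=br0,model=virtio is the part that attaches the NIC to br0 using virtio):
virt-install \
    --name testvm \
    --ram 2048 \
    --vcpus 2 \
    --disk path=/var/lib/libvirt/images/testvm.qcow2,size=20 \
    --cdrom /var/lib/libvirt/images/ubuntu-14.04.4-server-amd64.iso \
    --network bridge=br0,model=virtio \
    --graphics vnc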
Inside the VM its network settings are:
# The primary network interface
auto eth0
iface eth0 inet static
address XXX.243.248.145
netmask 255.255.255.255
gateway XXX.243.6.164
pointopoint XXX.243.6.164
dns-nameservers 8.8.8.8 8.8.4.4
Think of LXD kind of like a wrapper for LXC; it makes life easier and will let you do things like live-migration later if you decide to cluster.
apt install lxd lxd-tools
#create a dataset for lxd
zfs create tank/lxd
#start initial config
lxd init
Name of the storage backend to use (dir or zfs): zfs
Create a new ZFS pool (yes/no)? no
Name of the existing ZFS pool or dataset: tank/lxd
Would you like LXD to be available over the network (yes/no)? no
Do you want to configure the LXD bridge (yes/no)? no
LXD has been successfully configured.
I'm going to create a unique LXC profile for containers that will have direct public IPs.
I'm calling this profile public-default as it is a copy of the default profile, except containers using it can have a public IP.
Have a look at the profiles already present; there should already be one for docker and one for default.
ProTip: The docker one is pretty cool, you can run a docker host inside LXC.
#See the currently installed profiles
lxc profile list
# copy 'default' to 'public-default'
lxc profile copy default public-default
# lets edit this and attach it to br0
lxc profile edit public-default
This is for "default" images with public IPs (public-docker to come later)
name: public-default
#enhance the config if needed, read: https://github.com/lxc/lxd/blob/master/doc/configuration.md
# EG you may want to limit RAM/CPU/DISK
config: {}
description: directly connected public IPs
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
Init a trusty container called testtainer, attached to the new profile via '-p':
lxc init ubuntu:14.04 testtainer -p public-default
EG: editing an existing container
lxc config edit testtainer
EG: Sample container
name: testtainer
profiles:
- public-default
config:
  volatile.base_image: 628c432840e1aedc44006d3c6f7ace79d50753d2267b159289cd2e7490f2348f
  volatile.eth0.hwaddr: 00:16:3e:7a:80:e8
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
devices:
  root:
    path: /
    type: disk
ephemeral: false
lxc start testtainer
#see: lxc help list
lxc list -c nsP4tSc
Now set up the container's static IP. The first way is directly from your host:
lxc exec testtainer bash
root@testtainer:~# cat /etc/network/interfaces.d/eth0.cfg
auto eth0
iface eth0 inet static
address XXX.243.248.150
netmask 255.255.255.255
gateway XXX.243.6.164
pointopoint XXX.243.6.164
dns-nameservers 8.8.8.8 8.8.4.4
Another way is to install your ssh key:
I use this key for example:
lxc file push ~/.ssh/id_rsa_clovis.pub testtainer/home/ubuntu/.ssh/authorized_keys --mode=0600 --uid=1000
Now log in as the "ubuntu" user (there is no password) and use sudo
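For example (the IP is the one configured in the container above):
ssh ubuntu@XXX.243.248.150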
TODO: ZFS makes this soo soo soo fecking cool
Optionally you can now snapshot the container, including its live RAM (also known as a 'stateful' snapshot).
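For example (snapshot names are mine; a stateful snapshot additionally needs CRIU installed):
# plain (stateless) snapshot of the filesystem
lxc snapshot testtainer snap0
# stateful snapshot that also captures the running state/RAM
lxc snapshot testtainer snap-live --stateful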
lxc info testtainer
Name: testtainer
Architecture: x86_64
Created: 2016/08/01 16:21 UTC
Status: Running
Type: persistent
Profiles: public-default
Pid: 24413
Ips:
  eth0: inet    XXX.243.248.150 veth2S40II
  eth0: inet6   fe80::216:3eff:fe7a:80e8        veth2S40II
  lo:   inet    127.0.0.1
  lo:   inet6   ::1
Resources:
  Processes: 15
  Disk usage:
    root: 181.19MB
  Memory usage:
    Memory (current): 38.39MB
    Memory (peak): 298.52MB
  Network usage:
    eth0:
      Bytes received: 37.05MB
      Bytes sent: 6.35MB
      Packets received: 100559
      Packets sent: 66912
    lo:
      Bytes received: 3.00kB
      Bytes sent: 3.00kB
      Packets received: 40
      Packets sent: 40
Snapshots:
  snap0 (taken at 2016/08/01 16:28 UTC) (stateless)
  snap1 (taken at 2016/08/02 10:15 UTC) (stateless)
  snap2 (taken at 2016/08/02 10:15 UTC) (stateless)
  snap3 (taken at 2016/08/02 11:18 UTC) (stateless)
  snap4 (taken at 2016/08/02 11:20 UTC) (stateless)
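Rolling back to one of those snapshots is a one-liner, for example:
lxc restore testtainer snap0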
Should you need to boot the rescue image, you can access the base OS by:
- "clean" importing the raid1 /dev/md0-1 devices
- activating lvm
- mounting the rootfs
- chroot etc..
EG:
# reconstruct md0-1
# you can use "missing" if there is a dead disk
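# (note, not from the original run: if the array metadata is intact, plain
#  "mdadm --assemble --scan" will usually re-import the existing arrays without
#  recreating them; the --create --assume-clean calls below rewrite the metadata in place)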
root@rescue ~ # mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda1 /dev/sdb1
mdadm: /dev/sda1 appears to be part of a raid array:
level=raid1 devices=2 ctime=Tue Aug 2 15:38:52 2016
mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
mdadm: /dev/sdb1 appears to be part of a raid array:
level=raid1 devices=2 ctime=Tue Aug 2 15:38:52 2016
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
root@rescue ~ # mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md1 /dev/sd[ab]2
mdadm: /dev/sda2 appears to be part of a raid array:
level=raid1 devices=2 ctime=Tue Aug 2 15:38:52 2016
mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
mdadm: /dev/sdb2 appears to be part of a raid array:
level=raid1 devices=2 ctime=Tue Aug 2 15:38:52 2016
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
root@rescue ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
104792064 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdb1[1] sda1[0]
523712 blocks super 1.2 [2/2] [UU]
unused devices: <none>
root@rescue ~ # pvscan
PV /dev/md1 VG vg0 lvm2 [99.93 GiB / 75.93 GiB free]
Total: 1 [99.93 GiB] / in use: 1 [99.93 GiB] / in no VG: 0 [0 ]
root@rescue /dev/mapper # pvs
PV VG Fmt Attr PSize PFree
/dev/md1 vg0 lvm2 a-- 99.93g 75.93g
root@rescue ~ # vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg0" using metadata type lvm2
root@rescue ~ # lvscan
inactive '/dev/vg0/root' [20.00 GiB] inherit
inactive '/dev/vg0/swap' [4.00 GiB] inherit
root@rescue ~ # cd /dev/mapper
root@rescue /dev/mapper # ll
total 0
crw------- 1 root root 10, 236 Aug 2 15:47 control
root@rescue /dev/mapper # vgcfgrestore vg0
Restored volume group vg0
root@rescue /dev/mapper # vgchange -ay vg0
2 logical volume(s) in volume group "vg0" now active
root@rescue /dev/mapper # ll
total 0
crw------- 1 root root 10, 236 Aug 2 15:47 control
lrwxrwxrwx 1 root root 7 Aug 2 16:01 vg0-root -> ../dm-0
lrwxrwxrwx 1 root root 7 Aug 2 16:01 vg0-swap -> ../dm-1
############OK GOOD
root@rescue /dev/mapper # mkdir /tmp/root
root@rescue /dev/mapper # mount /dev/mapper/vg0-root /tmp/root
root@rescue /dev/mapper # chroot /tmp/root
---fix stuff---
############
TODO: document
- http://containerops.org/2013/11/19/lxc-networking/
- https://www.stgraber.org/2016/03/26/lxd-2-0-resource-control-412/
- https://wiki.archlinux.org/index.php/LXD
- https://wiki.debian.org/LXC/SimpleBridge
- http://jotschi.de/2012/04/17/hetzner-lxc-linux-subnet-configuration/
- https://insights.ubuntu.com/2016/04/07/lxd-networking-lxdbr0-explained/
- http://tj.mk/install-proxmox-4-hetzner-debian/
- https://www.sysorchestra.com/2014/11/08/hetzner-root-server-with-kvm-ipv4-and-ipv6-networking/