#!/bin/bash
echo "If you're reading this I hope you are in a vm or I probably just saved your ass!"
exit
## Warning:
#
# This script could break the crap out of your machine, it's only for illustration. That
# said- how you manage your data is your business, but if you're unsure of best practices I'll
# share the one rule I have that is unwavering:
#
# 1) Never have all your data on one machine in any way, shape or form, man. Period.
#    - Do not even have the drives physically connected; doing so exposes them to your
#      system as block devices and makes them vulnerable to human error.
#
# Of my colleagues, friends, Internet trolls and Internet randoms, those who
# have suffered true data loss of value have always violated this rule. The ways
# you can destroy everything on your machine are numerous. If you are a linux novice
# then you are in my opinion less likely to massacre your drives than me or anyone
# who has over a decade (or decades and decades) of experience. Comfort tends to
# be negligence's instigator. I've nuked the shit out of my drives more than once,
# and what saved my ass each time was that I didn't compromise on my main rule.
#
#
## Crypto
# A couple notes on crypto: I encrypt my root partition as my front line of defense
# against physical access, and then protect a second zone of data strictly for protection
# in case someone illegally gained access to my machine. This second encrypted
# zone provides no protection against a warrant. As our government continues to
# overreach you can be certain that you will not have plausible deniability for
# them. If you have data which needs to be protected from law enforcement and extortion
# ^ (not mutually exclusive)
# you will need to educate yourself more deeply on luks containers and crypto. That
# is out of scope for some comments in a bash file and people have different positions
# on the subject. Some are passionately against hidden containers, but I think they
# can be useful as long as you are methodical and properly implement and use them. Spinning
# hard disks, I believe, do not allow detection of hidden regions of data even in
# a well funded lab setting. I don't know enough about SSDs to say the same for
# them.. they are complex under the hood and I don't know if that complexity
# can leak usage patterns. I doubt it, but you should research it if it's a concern.
#
# Hidden volumes are not a direct feature of luks, but the specification is robust
# enough to do so as it supports decoupling the luks header from the device.
# That means you may fill a disk with random data and have different regions of
# the device map to different volumes. A few tips to get you a head start (with a
# small sketch after them):
#
# - Be cautious of your access and the applications you use; be careful not to
#   leak information into system logs and such. Do your research on secure data
#   access, or having secure data is pointless.
# - See the luks documentation for --header, --align-payload and --offset for alignment.
# - Make sure to do your research on your chosen FS's block allocation
#   characteristics for your workload on the primary partition, and after each
#   session of writes assert that the checksum at your offset matches your backup. i.e.:
#     dd if=./luks.img bs=1M skip=<your offset> iflag=skip_bytes | sha1sum -b &
#     dd if=./luks-backup.img bs=1M skip=<your offset> iflag=skip_bytes | sha1sum -b &
#     wait
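#
# To make the decoupled-header idea concrete, here is a minimal sketch. It assumes
# a cryptsetup new enough for LUKS2 and --offset, a blob already filled from
# /dev/urandom, and placeholder paths/offsets -- illustration only, not my exact setup:
#
#   BLOB_DEV=$(losetup --show -f ./blob.img)              # random-filled file as a block device
#   truncate -s 16M ./hidden.hdr                          # pre-create space for the detached header
#   cryptsetup luksFormat --type luks2 --header ./hidden.hdr \
#       --offset 2097152 "${BLOB_DEV}"                    # data region starts 1GiB in (offset is in 512-byte sectors)
#   cryptsetup luksOpen --header ./hidden.hdr "${BLOB_DEV}" hidden_vol
#   mkfs.ext4 -q /dev/mapper/hidden_vol
#   cryptsetup luksClose hidden_vol
#   # keep ./hidden.hdr somewhere else entirely; without it the blob is indistinguishable from noise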
#
#
## Disk Alignment throughout md, lvm, luks
#
# I've left out everything related to alignment to prevent anyone from copy / pasta-ing
# values that would be worse than the defaults (which end up pretty good). It's not worth
# worrying about unless you have a specific reason to. The benefits are negligible under normal load.
#
#
## A little bit about what backup directory structure has worked for me.
#
# This isn't how you set up data retention for a business, and you should always
# take particular care to know precisely what you are doing. This is just how I
# manage my home.
#
# Each device/machine has its own /storage/ directory. This is where I.. store data. The
# origin of /storage/ isn't important. It may be from my file server or it may be local.
# My workstation has its own raid array for /storage/, as does my nuc and mostly anything
# with a keyboard. I have an NFS server for other devices and the very little data
# I need access to from all machines, mostly my password safes and my `one` config dir.
# Those are stored within a luks container inside a regular file on the NFS share that
# is mounted as a block device via losetup on each machine.
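#
# Roughly how each machine attaches that container (paths and names here are
# hypothetical, just to illustrate the losetup + luks dance):
#
#   SAFE_DEV=$(losetup --show -f /mnt/nfs/safes.img)   # regular file on the NFS export
#   cryptsetup luksOpen "${SAFE_DEV}" safes_crypt
#   mount /dev/mapper/safes_crypt /mnt/safes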
#
# For backups I follow a few conventions, but for me simplicity is key. I know
# myself well enough to know that if managing my data responsibly consumes too
# much time, it won't be done.
#
# I have 3 types of backups, I use the words:
# ├─local - Lives on the machine / device
# ├─external - Lives outside of the machine / device
# └─remote - Lives outside of my home
#
# Local backups don't have any requirements, they serve the local machine and so
# I don't follow any rules. They may be lvm snapshots, rsync, etc.
#
# I follow a simple set of conventions which are shared for my external and remote
# backups. They share identical directory structure, but external and remote have
# a different partition of my data. Remote data is only things that I don't want
# to lose in a worst case scenario, which is everything I have except data I can
# recover from other sources. This all fits well within the limits of what can be
# managed via dropbox, google drive, a remote server, etc these days.
#
# The structure is simple, the commands for illustration:
#
# mkdir -p backup-ext/{live,snap,remote,storage,devices}
# mkdir -p backup-ext/live/hostname0{1,2,3}
# mkdir -p backup-ext/snap/hostname0{1,2}-date0{1,2}.tgz
# mkdir -p backup-ext/remote/provider01/{mnt,luks/{img,bin}}
# mkdir -p backup-ext/devices/{nexus{5,7},iphone}
#
# The result:
# tree|awk '{print "# " $0}'
# .
# └── backup-ext
#     ├── storage
#     ├── devices
#     │   ├── iphone
#     │   ├── nexus5
#     │   └── nexus7
#     ├── live
#     │   ├── hostname01
#     │   ├── hostname02
#     │   └── hostname03
#     ├── remote
#     │   └── provider01
#     │       ├── luks
#     │       │   ├── bin
#     │       │   └── img
#     │       └── mnt
#     └── snap
#         ├── hostname01-date01.tgz
#         ├── hostname01-date02.tgz
#         ├── hostname02-date01.tgz
#         └── hostname02-date02.tgz
#
# Storage is all my data. It's a bit of an ambiguous term as each machine has a
# local /storage, but in this context I mean my one global and organized
# collection of all data. Anything that is important is within this, and it must
# be in order to be backed up externally or remotely. Whereas a local machine's storage
# may just have some config to run services etc.
#
# Live contains the most up to date system snapshots; it contains everything to
# restore a box, including its system dirs, to recover from breaking a machine. It
# includes the /storage directory for the machine. My workstation excludes /storage
# because it is my global /storage and where I manage it.
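#
# A live backup is roughly just an rsync of the whole system into the box's
# live/ directory; something along these lines (excludes and paths are placeholders):
#
# rsync -aAXH --delete \
#   --exclude={"/proc/*","/sys/*","/dev/*","/run/*","/tmp/*"} \
#   / /path/to/backup-ext/live/hostname01/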
#
# Snaps are a subset of what is in /live and are taken each time I perform a live
# backup. The snapshot varies by device, but in general /etc /storage /root
# /home go in a tar file. Sometimes I cp -a live folders to snapshots for some
# reason. It's not a big deal, whatever works for my use case.
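#
# i.e. roughly (hostname, date and paths are placeholders):
# tar -czpf /path/to/backup-ext/snap/hostname01-$(date +%Y%m%d).tgz /etc /root /home /storage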
#
# Devices is just a staging area / live backup. I usually go in there and pluck
# the data out and put it in storage to be immortalized. Pictures of my dog, mostly. :)
#
# Remote I separate by provider; e.g. dropbox is what I use currently as they
# provide remote differential synchronization, which is a must for my current
# backup paradigm. How you choose to do this is up to you, but I personally would
# never store unencrypted data on dropbox or any service provider. Seriously. It's
# easy enough to encrypt your data via luks, and it's usable across all platforms
# using virtualbox or docker. I have a pair of dockerfiles; if anyone has any
# interest I can share them.
#
# All I do is create an initial set of files for my data in chunks. These are
# just regular blobs of urandom data. I then set them up as block devices using
# losetup, then use the linux device mapper to map them into a single block
# device suitable for luks formatting. You don't need great performance because
# it's out of band, but I will take a moment to commend the device mapper: having
# tested hundreds of mapped files, the cost is very low. This means you can split
# your remote data into small chunks depending on the amount of data you have,
# and the synchronization process is much less painful for any endpoint which
# supports differential sync. It's also nice because you may grow the storage
# anytime by simply adding some more files. This means you can start at 80%
# capacity today without having to stress about growing later.
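#
# Growing later looks roughly like this (a sketch, assuming the backup_remote
# names from the mount script further down; suspend briefly if the volume is busy):
#
# fallocate -l 100M ./backup_remote.img100                  # add one more chunk
# NEW_DEV=$(losetup --show -f ./backup_remote.img100)
# OLD_SIZE=$(blockdev --getsz /dev/mapper/backup_remote)    # current size in 512-byte sectors
# NEW_SIZE=$(blockdev --getsz "${NEW_DEV}")
# { dmsetup table backup_remote
#   echo "${OLD_SIZE} ${NEW_SIZE} linear ${NEW_DEV} 0"      # append one more linear segment
# } | dmsetup reload backup_remote
# dmsetup suspend backup_remote && dmsetup resume backup_remote   # swap in the bigger table
# cryptsetup resize backup_remote_crypt                     # grow the crypt mapping to match
# resize2fs /dev/mapper/backup_remote_crypt                 # grow ext4 online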
#
# Also for formatting: ext4's block allocation fits the access patterns for
# this, and it's what I suggest using. I use a second key file kept on my storage
# device to encourage more regular backups through automation. Just don't forget
# to have a primary keyslot that is a mental secret. I hope you never have a
# life event that causes you to use your remote backup, but if you do it's quite
# possible you no longer have access to your storage key file either.
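#
# The automation key amounts to roughly this (a sketch; names follow the
# backup_remote mount example below, and the key path on /storage is a placeholder):
#
# dd if=/dev/urandom of=/storage/one/backup_remote.key bs=1024 count=4
# cryptsetup luksAddKey /dev/mapper/backup_remote /storage/one/backup_remote.key
# # keyslot 0 stays the memorized passphrase; the new slot is only for automated mounts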
#
# Below is a slightly modified example script from the dockerfile I use for mounting; it
# would work for testing this if you were curious. I've already set it up to provide
# a 10GB image composed of 100 100MB files.
#
# LOOP_LABEL=backup_remote
# LOOP_DEVICES=$(losetup -anO NAME,BACK-FILE)
# CUR_SECTOR=0
#
# { for PART_NUM in $(seq -f "%02g" 0 99); do
#     IMG_FILE="./${LOOP_LABEL}.img${PART_NUM}"
#     IMG_DEVICE=$(echo "${LOOP_DEVICES}"|grep "${IMG_FILE}"|awk '{print $1}')
#
#     # You should exit instead of fallocate if not testing; also if using it for
#     # actual backups it doesn't hurt to create the files from urandom. Though it's
#     # not really necessary theoretically.. it just feels better.
#     [ -f "${IMG_FILE}" ] || fallocate -l 100M "${IMG_FILE}"
#     [ -n "${IMG_DEVICE}" ] || IMG_DEVICE=$(losetup --show -f "${IMG_FILE}")
#
#     # emit one device-mapper linear table line per chunk: <start> <length> linear <dev> <offset>
#     CUR_SECTOR_SIZE=$(blockdev --getsz "${IMG_DEVICE}")
#     echo "${CUR_SECTOR} ${CUR_SECTOR_SIZE} linear ${IMG_DEVICE} 0"
#     CUR_SECTOR=$((CUR_SECTOR + CUR_SECTOR_SIZE))
# done } | dmsetup create "${LOOP_LABEL}"
#
# [ -e "/dev/mapper/${LOOP_LABEL}_crypt" ] || cryptsetup luksOpen --key-file "${LOOP_LABEL}.key" "/dev/mapper/${LOOP_LABEL}" "${LOOP_LABEL}_crypt"
# mountpoint -q "/your/path/backup-ext/dropbox/mnt" || mount "/dev/mapper/${LOOP_LABEL}_crypt" "/your/path/backup-ext/dropbox/mnt"
# rsync -a /your/ /stuff/
#
#
# Closing is easy:
# umount "/your/path/backup-ext/dropbox/mnt"
# cryptsetup luksClose "${LOOP_LABEL}_crypt"
# dmsetup remove "${LOOP_LABEL}"
#
# for x in $(losetup -al | grep "${LOOP_LABEL}.img" | awk '{print $1}'); do
#     losetup -d "$x"
# done
#
# Below is my provision script, slightly edited, and I wouldn't run it.. it's just
# to share how I configure my machine.
#
DISK_BLKS="${@:-$(lsblk -dpo NAME,SIZE|grep /dev/sd|grep 1.8T|awk '{print $1}'|xargs echo -n)}"
echo "================"
echo "Confirm:"
echo " Disks to be used in raid arrays: ${DISK_BLKS}"
echo
echo "OK? Y/N"
while read -r -n 1 -s CONFIRM; do
  if [[ $CONFIRM = [YyNn] ]]; then
    [[ $CONFIRM = [Nn] ]] && echo "Exiting.." && exit 1
    break
  fi
done
# Clear out all old raids
mdadm --zero-superblock $(for x in `echo $DISK_BLKS`; do echo "${x}*"; done) || {
  printf >&2 "[error] %s: could not zero superblocks for %s\n" "${0}" "${DISK_BLKS}"
  exit 1
}
wipefs --all --force $(for x in `echo $DISK_BLKS`; do echo "${x}*"; done)
# Create disks
for x in $(echo $DISK_BLKS); do
  sfdisk "${x}" <<SFDISK-CONFIG
label: gpt
device: ${x}
unit: sectors
first-lba: 2048
last-lba: 3907029134
${x}1 : start=2048, size=3907027087, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4, name="${x##*/}1"
SFDISK-CONFIG
done
# Create six disk raid 10 array, only change from defaults is layout = far, see the md man page for details
mdadm --verbose --create /dev/md/storage --assume-clean --level=10 --layout=f2 \
  --raid-devices=6 $(for x in `echo $DISK_BLKS`; do echo "${x}1"; done) || {
  printf >&2 "[error] %s: could not create raid 10 array with %s\n" "${0}" "${DISK_BLKS}"
  exit 1
}
# I have redundancy, bitmaps aren't needed for me
mdadm --grow --bitmap=none /dev/md/storage
# Update conf
grep --quiet /dev/md/storage /etc/mdadm/mdadm.conf || mdadm --detail --scan >> /etc/mdadm/mdadm.conf
# Luks format
cryptsetup luksFormat --verbose --verify-passphrase --key-size=512 --align-payload=1024 /dev/md/storage || {
  printf >&2 "[error] %s: could not luks format, bad pass phrase?\n" "${0}"
  exit 1
}
# Make luks mount key
[ -f "/root/md-storage-key" ] || dd if=/dev/urandom of=/root/md-storage-key bs=1024 count=4
# Add the key for local mounting
cryptsetup luksAddKey /dev/md/storage /root/md-storage-key || {
  printf >&2 "[error] %s: luksAddKey failed\n" "${0}"
  exit 1
}
# Backup headers in root (this is for illustration don't lose these, seriously)
[ -f "/root/md-storage-header" ] || cryptsetup luksHeaderBackup --header-backup-file /root/md-storage-header /dev/md/storage
# Open luks container
cryptsetup luksOpen --key-file /root/md-storage-key /dev/md/storage storage_crypt || {
  printf >&2 "[error] %s: luksOpen failed\n" "${0}"
  exit 1
}
# We have the following `lsblk|awk '{print "# " $0}'` with raid + luks setup
# ----
# sd* 8:80 0 1.8T 0 disk
# └─sd** 8:81 0 1.8T 0 part
# └─storage 9:127 0 5.5T 0 raid10
# └─storage_crypt 252:5 0 5.5T 0 crypt
# Create lvm physical volume on top of luks container
pvcreate /dev/mapper/storage_crypt
# Create main volume group
vgcreate storage /dev/mapper/storage_crypt
# Create logical volumes
# - Unencrypted and auto mounted at boot (the underlying physical volume is encrypted, remember)
#   - base - general storage that is not categorized, serves as root mount point
#   - one - my main workstation folder, dev, scripts, workstation stuff
#   - media - non-personal media collection, music, movies, etc, not backed up remotely
#   - remote - contains.. wait for it.. backups!
lvcreate --size 100G --name base storage
lvcreate -L 500G -n one storage
lvcreate -L 1T -n media storage
lvcreate -L 1T -n remote storage
# - Encrypted and not auto mounted
#   - chris - all my personal stuff, financial data, pictures of friends, family, etc
#   - vault - data that I don't want to lose, but don't need regular access to.
#   - archive - where data goes to die, a recycle bin that is not often recycled.
#
lvcreate -L 300G -n chris storage
lvcreate -L 500G -n vault storage
lvcreate -L 2T -n archive storage
# Now `lsblk|awk '{print "# " $0}'` with our volumes:
# ----
# sd* 8:80 0 1.8T 0 disk
# └─sd** 8:81 0 1.8T 0 part
# └─md*** 9:127 0 5.5T 0 raid10
# └─storage_crypt 252:5 0 5.5T 0 crypt
# ├─storage-one 252:6 0 500G 0 lvm
# ├─storage-media 252:7 0 1T 0 lvm
# ├─storage-chris 252:8 0 300G 0 lvm
# ├─storage-vault 252:9 0 500G 0 lvm
# ├─storage-archive 252:10 0 2T 0 lvm
# └─storage-base 252:11 0 100G 0 lvm
# Format the unencrypted logical volumes
# -q --> quiet
mkfs.ext4 -q /dev/storage/base
mkfs.ext4 -q /dev/storage/one
mkfs.ext4 -q /dev/storage/media
mkfs.ext4 -q /dev/storage/remote
# Create the mount points to prep for luks, -p creates parents if they don't exist
mkdir --parents /one /storage/{one,media,remote}
mkdir -p /storage/{chris,vault,archive}
# Mount our public fs's to start syncing data from backups etc
mount /dev/storage/base /storage
mount /dev/storage/one /storage/one
mount /dev/storage/media /storage/media
mount /dev/storage/remote /storage/remote
# My one folder I like to bind mount at root; this is because some programs (atom ide for example)
# seem to behave poorly with symlinked project roots, likely due to fs notify and such. I prefer this
# because it simplifies backups tremendously, i.e. rsync /storage/ /backup/
mount --bind /storage/one /one
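# To get the "auto mounted at boot" behaviour mentioned above, entries roughly like
# these would do it (a sketch, not part of this script; adjust to your distro and
# rebuild the initramfs if it requires it):
#   /etc/crypttab: storage_crypt  /dev/md/storage  /root/md-storage-key  luks
#   /etc/fstab:    /dev/storage/base    /storage         ext4  defaults  0  2
#                  /dev/storage/one     /storage/one     ext4  defaults  0  2
#                  /dev/storage/media   /storage/media   ext4  defaults  0  2
#                  /dev/storage/remote  /storage/remote  ext4  defaults  0  2
#                  /storage/one         /one             none  bind      0  0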
# Now `df -h` with our volumes mounted:
# ----
# /dev/mapper/storage-base 99G 60M 94G 1% /storage
# /dev/mapper/storage-remote 1008G 72M 957G 1% /storage/remote
# /dev/mapper/storage-media 1008G 72M 957G 1% /storage/media
# /dev/mapper/storage-one 493G 2.4G 468G 1% /one
# Luks format storage: chris
cryptsetup luksFormat -vy -s 512 /dev/storage/chris || {
  printf >&2 "[error] %s: could not luks format, bad pass phrase?\n" "${0}"
  exit 1
}
# Luks format storage: vault
cryptsetup luksFormat -vy -s 512 /dev/storage/vault || {
  printf >&2 "[error] %s: could not luks format, bad pass phrase?\n" "${0}"
  exit 1
}
# Luks format storage: archive
cryptsetup luksFormat -vy -s 512 /dev/storage/archive || {
  printf >&2 "[error] %s: could not luks format, bad pass phrase?\n" "${0}"
  exit 1
}
# Handle encrypted vols
for vol_short in chris archive vault; do
  backup_file="/root/lvm-storage-${vol_short}-header"
  key_file="/root/lvm-storage-${vol_short}-key"
  vol_name="/dev/storage/${vol_short}"
  crypt_name="storage_${vol_short}_crypt"
  [ -f "${key_file}" ] || dd if=/dev/urandom of="${key_file}" bs=1024 count=4
  cryptsetup luksDump "${vol_name}" | grep -qs "Slot 1: ENABLED" || {
    cryptsetup luksAddKey "${vol_name}" "${key_file}" || {
      printf >&2 "[error] %s: luksAddKey '%s' '%s' failed\n" "${0}" "${vol_name}" "${key_file}"
      exit 1
    }
  }
  [ -f "${backup_file}" ] || \
    cryptsetup luksHeaderBackup --header-backup-file "${backup_file}" "${vol_name}"
  cryptsetup status "${crypt_name}" > /dev/null 2>&1 || {
    cryptsetup luksOpen -d "${key_file}" "${vol_name}" "${crypt_name}" || {
      printf >&2 "[error] %s: luksOpen -d '%s' '%s' failed\n" "${0}" "${vol_name}" "${key_file}"
      exit 1
    }
  }
done
# Format and mount the encrypted logical volumes
for vol_short in chris archive vault; do
  mkfs.ext4 -q "/dev/mapper/storage_${vol_short}_crypt" && \
    mount "/dev/mapper/storage_${vol_short}_crypt" "/storage/${vol_short}"
done
# Now `df -h` with our volumes mounted:
# ----
# /dev/mapper/storage-base 99G 60M 94G 1% /storage
# /dev/mapper/storage-one 493G 2.4G 468G 1% /one
# /dev/mapper/storage-media 1008G 557G 400G 59% /storage/media
# /dev/mapper/storage-remote 1008G 72M 957G 1% /storage/remote
# /dev/mapper/storage_chris_crypt 296G 63M 281G 1% /storage/chris
# /dev/mapper/storage_archive_crypt 2.0T 71M 1.9T 1% /storage/archive
# /dev/mapper/storage_vault_crypt 493G 70M 467G 1% /storage/vault
# Now `lsblk|awk '{print "# " $0}'` with our volumes:
# ----
# sd* 8:80 0 1.8T 0 disk
# └─sd** 8:81 0 1.8T 0 part
# └─storage 9:127 0 5.5T 0 raid10
# └─storage_crypt 252:5 0 5.5T 0 crypt
# ├─storage-one 252:7 0 500G 0 lvm /one
# ├─storage-remote 252:8 0 1T 0 lvm /storage/remote
# ├─storage-media 252:9 0 1T 0 lvm /storage/media
# ├─storage-chris 252:10 0 300G 0 lvm
# │ └─storage_chris_crypt 252:12 0 300G 0 crypt /storage/chris
# ├─storage-vault 252:11 0 500G 0 lvm
# │ └─storage_vault_crypt 252:16 0 500G 0 crypt /storage/vault
# ├─storage-base 252:13 0 100G 0 lvm /storage
# └─storage-archive 252:14 0 2T 0 lvm
# └─storage_archive_crypt 252:15 0 2T 0 crypt /storage/archive
# Is it slow having a few levels of abstraction before your IO? Not really.
# $ time dd if=/dev/zero of=./8gb.img bs=1M count=8192 conv=fdatasync
# > 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 11.0654 s, 776 MB/s
# $ sync && echo 3 > /proc/sys/vm/drop_caches && free -h
# > total used free shared buff/cache available
# > Mem: 251G 1.7G 249G 80M 470M 249G
# > Swap: 6.4G 0B 6.4G
# $ time dd if=./8gb.img of=/dev/null bs=1M
# > 8192+0 records in
# > 8192+0 records out
# > 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 3.26732 s, 2.6 GB/s
# >
# > real 0m3.270s
# > user 0m0.016s
# > sys 0m2.248s
#
# Your mileage will vary of course; I'm running a supermicro X10DRH-C with 6 SSDs
# in JBOD via an LSI 3108 ROC.