@squarism
Last active April 7, 2017 02:37
File Systems are Databases

THIS IS A MAJOR WIP - SKIP TO TRUSTY SECTION, ARCH DID NOT WORK WITH THE MYSQL VERSIONS. I WILL BE CLEANING THIS UP AS I DEVELOP THE TALK / BLOG POST.

Install OS

Install Arch (this doesn't matter, just get a Linux box) https://wiki.archlinux.org/index.php/Installation_guide

If you are in a VM, add a second 4 GB disk.

Ops Stuff

# if you need to get to root, run: su -
useradd -m you
passwd you
# edit /etc/group and add yourself to wheel

pacman -S sudo
visudo
# in visudo, uncomment: %wheel ALL=(ALL) NOPASSWD: ALL

# on ubuntu it's `%sudo	ALL=(ALL:ALL) NOPASSWD:ALL`

# log out and back in to reload groups.
# sudo -i should get you a root shell.  Do this from now on.

pacman -S openssh
systemctl restart sshd
ip addr
# now ssh to this ip.  Do this from now on.

# Copy your public ssh key so you can ssh to your VM easier.
# From your real machine ...
scp .ssh/id_rsa.pub you@vm_ip:
# scp asks for your password this one time. Then, on the VM ...
mkdir .ssh
mv id_rsa.pub .ssh/authorized_keys
chmod 700 .ssh
chmod 600 .ssh/authorized_keys
# Now you can ssh without a password prompt

# vm networking - check your interface name with `ip link show`
systemctl enable dhcpcd@ens33.service
systemctl enable sshd.service

Other tools:

pacman -S wget base-devel

Take a VM snapshot.

Install MySQL

This is going to represent our storage engine. Super hand-waving cheating mode but imagine that the filesystem details are a database driver.

sudo pacman -S mysql
:: There are 2 providers available for mysql:
:: Repository extra
   1) mariadb
:: Repository community
   2) percona-server

Choose 2

We need some raw devices. Your VM should have a second disk with an unformatted partition. An unformatted disk is called a raw disk or raw device.

$ sudo -i
# sfdisk -l

... snip ...

Disk /dev/sdb: 4 GiB, 4294967296 bytes, 8388608 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

# /dev/sdb should be displayed :)

4 GB is plenty, but there's no partition yet, so let's make one with fdisk.

fdisk /dev/sdb
n
p
1
[enter]
[enter]
w

Now sfdisk -l shows sdb1.
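
The same answers can be piped into fdisk instead of typed. This is only a sketch, assuming the exact keystrokes listed above; fdisk_keys is a made-up helper name, and the real fdisk call is left commented out because it rewrites the partition table.

```shell
#!/bin/sh
# Print the answers typed interactively above, one per line:
# n (new), p (primary), 1 (partition number), two empty lines to accept
# the default first and last sector, then w (write and quit).
fdisk_keys() {
    printf 'n\np\n1\n\n\nw\n'
}

# Show what would be sent; empty lines are displayed as [enter].
fdisk_keys | sed 's/^$/[enter]/'

# To actually partition the disk (destructive, run as root):
# fdisk_keys | fdisk /dev/sdb
```

Handy if you keep rebuilding the VM from a snapshot and don't want to retype the sequence.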

Raw

We need a kernel module next. Run modprobe raw; it shouldn't output anything. Typing this on every boot would be annoying, so let's have it autoload (this is Arch-specific, and Arch is sort of new to me, so send me corrections).

echo raw > /etc/modules-load.d/raw.conf
lsmod | grep raw
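
One catch with grepping lsmod is that any module with "raw" in its name matches too. A small sketch of an exact-name check follows; module_loaded is a hypothetical helper, and the sample listing in the demo is invented so the snippet runs anywhere.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if NAME is listed as a loaded module.
# FILE defaults to /proc/modules (the list lsmod prints); the optional
# argument exists only so the check can be tried against a sample file.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Demo against a sample listing (hypothetical contents, not real output).
sample=$(mktemp)
printf 'raw_diag 16384 0\nraw 16384 0\n' > "$sample"
module_loaded raw "$sample" && echo "raw is loaded"
module_loaded ext4 "$sample" || echo "ext4 is not loaded"
rm -f "$sample"
```

On the VM itself, module_loaded raw with no second argument reads the live module list.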

If you reboot, lsmod should still show raw loaded. Now create a raw disk.

raw /dev/raw/raw1 /dev/sdb1
raw -qa  # shows it

OOoo I don't think Percona 5.6 is going to fix raw device support. Flagged as WONTFIX.

cp /usr/share/mysql/mysql.server /etc/init.d/mysql
chmod +x /etc/init.d/mysql
update-rc.d mysql defaults
/etc/init.d/mysql restart


[fdisk section]

[raw section]
but on Ubuntu the autoload is: echo raw >> /etc/modules
chown mysql /dev/raw/raw1


# /etc/rc.local
# I guess rawdevices isn't a thing in ubuntu
raw /dev/raw/raw1 /dev/sdb1
chown mysql /dev/raw/raw1

Raw DB Init

Edit the config /etc/mysql/my.cnf

# add two lines to the mysqld section
innodb_data_home_dir=
innodb_data_file_path=/dev/raw/raw1:2Gnewraw

Start tailing the log file: tail -f /var/lib/mysql/raw.err

Start mysql: /etc/init.d/mysql start

161005 13:48:16  InnoDB: Setting file /dev/raw/raw1 size to 2048 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000

Stop mysql. Change the config.

innodb_data_home_dir=
innodb_data_file_path=/dev/raw/raw1:2Graw

Fails

14.04 (trusty) 5.1 did not work. It started using MyISAM when I added my.cnf flags. Wtf. No innodb storage engine in SHOW ENGINES. Even with configure flags.

I couldn't get 5.7 (from apt) to use a raw device either.

2016-10-10T23:22:45.384709Z 0 [ERROR] InnoDB: The innodb_system data file '/dev/sdb1' must be writable
2016-10-10T23:22:45.385039Z 0 [ERROR] InnoDB: The innodb_system data file '/dev/sdb1' must be writable
2016-10-10T23:22:45.385221Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2016-10-10T23:22:45.988524Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2016-10-10T23:22:45.988958Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2016-10-10T23:22:45.989249Z 0 [ERROR] Failed to initialize plugins.
2016-10-10T23:22:45.989447Z 0 [ERROR] Aborting

I tried a raw device but that didn't work. Permissions are correct. I confirmed the mysql user could dd garbage to that device.

MariaDB 10.1 seems like it might have a patch applied to it to do this: https://github.com/MariaDB/server/pull/85/commits/ee5633a39e5f82546f720b4c487fec6bf5f5c066 I added the debs.

Damn it. Don't tell me it was apparmor the whole time.

[   66.086476] audit: type=1400 audit(1476137663.796:11): apparmor="DENIED" operation="open" profile="/usr/sbin/mysqld" name="/dev/sdb1" pid=2691 comm="mysqld" requested_mask="wr" denied_mask="wr" fsuid=1001 ouid=0
[ 3539.314099] audit: type=1400 audit(1476141133.220:17): apparmor="DENIED" operation="open" profile="/usr/sbin/mysqld" name="/dev/raw/raw1" pid=6387 comm="mysqld" requested_mask="wr" denied_mask="wr" fsuid=1001 ouid=1001
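
To spot this sooner next time, the kernel log can be filtered for denials against the mysqld profile. A rough sketch that only relies on the fields visible in the two audit lines above; mysqld_denied_paths is a hypothetical helper name.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print the path of every file that
# AppArmor denied to the /usr/sbin/mysqld profile.
mysqld_denied_paths() {
    grep 'apparmor="DENIED"' |
        grep 'profile="/usr/sbin/mysqld"' |
        sed -n 's/.* name="\([^"]*\)".*/\1/p'
}

# Demo on one of the audit lines above.
echo '[ 3539.314099] audit: type=1400 audit(1476141133.220:17): apparmor="DENIED" operation="open" profile="/usr/sbin/mysqld" name="/dev/raw/raw1" pid=6387 comm="mysqld" requested_mask="wr" denied_mask="wr" fsuid=1001 ouid=1001' | mysqld_denied_paths
```

On a live box the input would come from dmesg, e.g. dmesg | mysqld_denied_paths.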

Trusty

Install 14.04.1 from the server ISO. Don't use Easy Install in VMware. In the installer wizard, install the OpenSSH server. Then add your ssh key.

# scp your pub to the VM's ip `ip addr`
mkdir .ssh
chmod 700 .ssh
mv id_rsa.pub .ssh/authorized_keys
chmod 600 .ssh/authorized_keys

Apparmor. Years of this horsecrap.

/etc/init.d/apparmor stop
/etc/init.d/apparmor teardown
update-rc.d -f apparmor remove
apt-get -y remove apparmor
reboot

mysql

# be root already `sudo -i`, just do this always for a VM who cares
apt-get -y install software-properties-common
apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xcbcb082a1bb943db
add-apt-repository 'deb [arch=amd64,i386,ppc64el] http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.1/ubuntu trusty main'

apt-get update
apt-get -y install mariadb-server

Do the same partition dance from above.

YOU ABSOLUTELY HAVE TO GET THE PARTITION THE RIGHT SIZE HERE. Use bc. Take the error message:

[ERROR] InnoDB: Data file /dev/sdb1 is of a different size 131008 pages (rounded down to MB) than specified in the .cnf file 131072 pages!

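Take the first number in that message, the size the partition actually has (131008 pages, not the 131072 from the .cnf), and multiply it by the 16 KB InnoDB page size to get bytes. In my case, I had added a 2 GB disk/partition in VMware.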

echo "16 * 1024  * 131008" | bc
2146435072
mysql_install_db --user=mysql --innodb_data_file_path=/dev/sdb1:2146435072 --bootstrap
# tail -f /var/log/syslog while starting, you'll see it complain
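
The bc step can be folded into a helper that reads the page count straight out of the error text. A sketch only, assuming the message wording shown above and the default 16 KB InnoDB page size; innodb_bytes_from_error is a made-up name.

```shell
#!/bin/sh
# Pull the first page count (the size the device really has) out of the
# InnoDB size-mismatch error and convert it to bytes.
# Assumes 16 KB pages, the same 16 * 1024 used in the bc line.
innodb_bytes_from_error() {
    pages=$(printf '%s\n' "$1" | sed -n 's/.*different size \([0-9][0-9]*\) pages.*/\1/p')
    [ -n "$pages" ] || return 1
    echo $((pages * 16 * 1024))
}

err='[ERROR] InnoDB: Data file /dev/sdb1 is of a different size 131008 pages (rounded down to MB) than specified in the .cnf file 131072 pages!'
innodb_bytes_from_error "$err"   # prints 2146435072
```

The printed number is what goes after the colon in innodb_data_file_path.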

# Edit the innodb path to have newraw
# /etc/mysql/conf.d/mariadb.cnf
innodb_data_file_path=/dev/sdb1:2146435072newraw

# sudo -u mysql /bin/bash and run mysqld --skip-innodb-doublewrite
# It should complain about there being no system tables.  Now you are ready to mysql_install_db
# Edit the newraw to be raw now.
# /etc/mysql/conf.d/mariadb.cnf
innodb_data_file_path=/dev/sdb1:2146435072raw

# If you are mysql already then all you need to do is
mysql_install_db
# Installing MariaDB/MySQL system tables in '/var/lib/mysql' ...
# This is after rm -rf /var/lib/mysql/*
# This should init without error but maybe complain about replication stuff
# TODO: I need to repeat this a few times on a clean VM to figure out this step more smoothly

# now edit /etc/mysql/conf.d/mariadb.cnf
innodb_file_per_table=0
innodb_data_home_dir=
innodb_data_file_path=/dev/sdb1:2146435072raw

# Set a mysql password.
/usr/bin/mysqladmin -u root password something

# Start mysql 
/etc/init.d/mysql start
mysql -u root -p

# Sanity check
show engines;  # should show innodb listed
set default_storage_engine=innodb;

create database foo;
use foo;
create table t1 (i int) engine=innodb;

ls /var/lib/mysql/foo
# t1.frm is there BUT THIS IS OK - this is metadata
# You won't see a data file next to it (MyISAM's t1.MYD or InnoDB's t1.ibd) because our data is on a raw partition!

Regardless of the storage engine you choose, every MySQL table you create is represented on disk by a .frm file that describes the table's format.

We can test this out by dumping a compressed image of the partition before and after inserting into the table.

dd if=/dev/sdb1 bs=64k | gzip -c > image.gz
ls -l image.gz
# -rw-r--r-- 1 root root 2104530 Oct 12 19:46 image.gz

MariaDB [foo]> insert into t1 values (42);
MariaDB [foo]> insert into t1 values (1);
MariaDB [foo]> insert into t1 values (2);
MariaDB [foo]> insert into t1 values (3);
MariaDB [foo]> insert into t1 values (4);
MariaDB [foo]> insert into t1 values (5);

# dump again after the inserts: the compressed image size changes
# -rw-r--r-- 1 root root 2103222 Oct 12 19:46 image.gz
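
Compressed size is a noisy signal since gzip output can move either way. A checksum taken before and after says more directly whether the bytes on the partition changed. A minimal sketch using POSIX cksum; device_sum is a hypothetical helper, and the demo uses a scratch file in place of /dev/sdb1 so it runs without root.

```shell
#!/bin/sh
# device_sum PATH: print a CRC checksum of everything readable from PATH.
# PATH would be /dev/sdb1 on the VM; any regular file works the same way.
device_sum() {
    dd if="$1" bs=64k 2>/dev/null | cksum | cut -d' ' -f1
}

# Demo with a scratch file standing in for the partition.
scratch=$(mktemp)
printf 'empty tablespace' > "$scratch"
before=$(device_sum "$scratch")
printf ' plus a row' >> "$scratch"      # stands in for the INSERTs
after=$(device_sum "$scratch")

if [ "$before" != "$after" ]; then
    echo "contents changed"
fi
rm -f "$scratch"
```

On the VM, run device_sum /dev/sdb1 as root once before the inserts and once after, and compare the two numbers.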

Or search for the strings

create table stuff (id int auto_increment primary key, message varchar(55)) engine=innodb;
insert into stuff (message) values ("i love pie");
insert into stuff (message) values ("i love pants");
insert into stuff (message) values ("i love bacon");

apt-get install binutils
strings /dev/sdb1 | grep bacon
i love bacon
i love bacon

There's the data.
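
If you'd rather not install binutils, tr can do a rough version of the same thing. This is a sketch, not a drop-in replacement: printable_runs is a made-up name, and unlike strings it has no minimum-length filter, so it prints much more noise before grep trims it.

```shell
#!/bin/sh
# printable_runs: read binary data on stdin and put each run of printable
# characters on its own line, roughly what strings(1) does.
printable_runs() {
    LC_ALL=C tr -c '[:print:]' '\n'
}

# Demo: a message surrounded by NUL and control bytes, as it might sit in a page.
printf '\000\001\002i love bacon\000\003\004' | printable_runs | grep bacon
```

On the VM that would be printable_runs < /dev/sdb1 | grep bacon, run as root.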

Raw Partition Ownership

We need the mysql daemon user to own the /dev/sdb1 device just as it would own the /var/lib/mysql data directory.

# /etc/udev/rules.d/99-mysql-raw-permissions.rules
ENV{DEVNAME}=="/dev/sdb1", OWNER="mysql", GROUP="mysql"

Reboot. You'll see:

$ ls -l /dev/sdb
brw-rw---- 1 root disk 8, 16 Apr  6 18:43 /dev/sdb

$ ls -l /dev/sdb1
brw-rw---- 1 mysql mysql 8, 17 Apr  6 18:43 /dev/sdb1
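
That eyeball check can be turned into something a startup script could test before launching mysqld. A small sketch; owned_by is a hypothetical helper that just reads the owner column of ls -l, and the demo runs against a scratch file rather than the device.

```shell
#!/bin/sh
# owned_by PATH USER: succeed if `ls -l` reports USER as the owner of PATH.
owned_by() {
    [ "$(ls -ld "$1" | awk '{ print $3 }')" = "$2" ]
}

# On the VM after the reboot:
#   owned_by /dev/sdb1 mysql && echo "udev rule applied"
# Demo on a scratch file, which belongs to whoever runs this script.
scratch=$(mktemp)
me=$(ls -ld "$scratch" | awk '{ print $3 }')
owned_by "$scratch" "$me" && echo "owner check passed"
rm -f "$scratch"
```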

Other Things

# /etc/mysql/my.cnf
bind-address            = 127.0.0.1

Change it to 0.0.0.0 so mysqld listens on all interfaces, since you are going to connect to your VM from Ruby.
