This article tries to explain how full disk encryption works in Linux.
There are many ways to do it, but we'll cover the LUKS way. LUKS (Linux Unified Key Setup) is a standard specifying how full disk (or full volume) encryption should be implemented.
LUKS is implemented with:
- cryptsetup (frontend, i.e. the binaries) and
- dm-crypt (kernel module, part of the device mapper infrastructure, see diagram)
First, let's start with a great summary from an AskUbuntu user:
Luks is an encryption layer on a block device, so it operates on a particular block device, and exposes a new block device which is the decrypted version. Access to this device will trigger transparent encryption/decryption while it's in use.
It's typically used on either a disk partition, or a LVM physical volume which would allow multiple partitions in the same encrypted container.
LUKs stores a bunch of metadata at the start of the device. It has slots for multiple passphrases. Each slot has a 256 bit salt that is shown in the clear along with an encrypted message. When entering a passphrase LUKS combines it with each of the salts in turn, hashing the result and tries to use the result as keys to decrypt an encrypted message in each slot. This message consists of some known text, and a copy of the master key. If it works for any one of the slots, because the known text matches, the master key is now known and you can decrypt the entire container. The master key must remain unencrypted in RAM while the container is in use.
Knowing the master key allows you access to all the data in the container, but doesn't reveal the passwords in the password slots, so one user cannot see the passwords of other users.
The system is not designed for users to be able to see the master key while in operation, and this key can't be changed without re-encrypting. The use of password slots, however, means that passwords can be changed without re-encrypting the entire container, and allows for use of multiple passwords.
I installed Linux Mint with "disk encryption" ticked, and ended up with this:
$ lsblk -o name,size,fstype,label,mountpoint
NAME                    SIZE FSTYPE      LABEL MOUNTPOINT
sda                   119,2G
├─sda2                  488M ext2              /boot
├─sda3                118,3G crypto_LUKS
│ └─sda3_crypt        118,3G LVM2_member
│   ├─mint--vg-root   110,4G ext4              /
│   └─mint--vg-swap_1   7,9G swap              [SWAP]
└─sda1                  512M vfat              /boot/efi
sda is my only physical disk.
It has three partitions:
- (unencrypted) EFI stuff (never mind this one; it's required by UEFI, a kind of next-gen BIOS)
- (unencrypted) Boot partition
- (encrypted) crypto_LUKS partition, which exposes a virtual block device that can host child partitions (all of which will be encrypted)
In my case the crypto_LUKS partition contains LVM2_member (because I set up LVM), which is another virtual device that in turn contains the concrete filesystems, ext4 (/) and swap. You don't have to use LVM to achieve encryption, but I used it because it lets you do cool stuff like extending a volume across two hard disks, snapshotting etc.
You can think of crypto_LUKS and LVM2_member as filters: they take something as input and give something as output. crypto_LUKS takes the raw encrypted blocks from the hard disk and gives the decrypted blocks as output. LVM is another filter (and another indirection) which gives additional features.
The above textual graph in a more visual way:
               +-----------+
               |           |
               | Hard disk |
               |           |
               +-+---+---+-+
                 |   |   |
        +--------+   |   +--------+
        |            |            |
        |            |            |
+-------v---+  +-----v----+ +-----v------------------+
|           |  |          | |                        |
| Volume 1  |  | Volume 2 | | Volume 3               |
| EFI       |  | /boot    | | crypto_LUKS            |
|           |  |          | | (virtual block device) |
+-----------+  +----------+ |                        |
                            +------+-----------------+
                                   |
                                   |
                            +------v------+
                            |             |
                            | Volume 1    |
                            | LVM2_member |
                            |             |
                            +--+--------+-+
                               |        |
                               |        |
                          +----v-----+ +v-----+
                          |          | |      |
                          | / (root) | | swap |
                          | ext4     | |      |
                          |          | +------+
                          +----------+
Since the encryption is implemented in software (a Linux kernel module), the system has to read some stuff from the hard drive before it can even present a user interface for asking the passphrase that recovers the encryption key. So how can you say my data is secret if the computer can read some data off the disk without my passphrase?
Good question! Basically it boils down to this: the boot partition (/boot) is unencrypted and it contains the kernel and the most important modules that are required before actually booting the machine into a usable state. These things are required pre-boot:
- Kernel (/boot/vmlinuz-4.8.0-53-generic)
- Initial RAM disk (/boot/initrd.img-4.8.0-53-generic)
- Other stuff from /boot like GRUB's (Linux's bootloader) configuration
So the boot process goes like this:
- You press the power button
- The CPU starts, and jumps to a hardcoded address of 0xfffffff0, which is mapped to BIOS ROM by the motherboard/hardware.
- BIOS does a bunch of stuff, but for our purposes it is sufficient to say that it holds a setting from which disk to boot up
- BIOS digs up the MBR (Master Boot Record) from the chosen disk and hands off execution to it. BIOS is not involved from now on.
- The MBR (GRUB's stage 1 loader) is so small that it can't contain the logic to read the /boot filesystem, so it loads GRUB stage 1.5 from the DOS compatibility area of the disk. Stage 1.5 (still too small to contain the entire GRUB) can read /boot and load and execute GRUB proper ("stage 2").
- GRUB now reads the kernel from /boot into RAM and hands off booting to it.
- The kernel starts booting and mounts the initrd so it can access the pre-boot utilities, drivers etc. needed to actually bring the system into a usable state.
- The kernel can now load modules and understand partitions like crypto_LUKS and LVM2_member, but will not be able to read them (because it doesn't know the master key).
- cryptsetup will now ask for your passphrase, which is used to uncover the master key from one of the key slots (explained later).
- If you gave the correct passphrase, the kernel can now decrypt stuff from under the crypto_LUKS partition and start mounting and reading from the LVM volumes!
Let's dive right into this by asking what metadata the LUKS system has on that hard disk:
$ cryptsetup luksDump /dev/sda3
LUKS header information for /dev/sda3
Version: 1
Cipher name: aes
Cipher mode: xts-plain64
Hash spec: sha256
Payload offset: 4096
MK bits: 512
MK digest: 0e 28 66 97 5f e3 57 54 49 e1 92 95 11 f8 13 4f 0a 2d 21 0f
MK salt: b1 2f a4 ad 8a 4c 50 28 e1 b7 30 5b 6e 72 b3 b1
a8 40 0a 59 1f eb 49 8d c4 41 36 e7 21 10 ae 8b
MK iterations: 63125
UUID: c8286408-de03-40f5-93ef-274cf534563f
Key Slot 0: ENABLED
Iterations: 512000
Salt: fc ae 72 8e 9b 71 5c 2e 77 8a 8b 23 da e4 0f 2c
be a0 b9 5a 74 0b 9b d5 5e 67 d5 90 2e f2 7b b8
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
Take note of the following in the output above:
- Lines with "MK" refer to the master key
- There are 8 key slots, of which I use only the first one.
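To make the dump less magical, here is a minimal Python sketch that reads the same fields straight from the first 592 bytes of a LUKS1 device. The field order and sizes follow my reading of the LUKS1 on-disk format; the names `HEADER`, `KEY_SLOT` and `parse_luks1_header` are made up for this example, and it does not handle LUKS2 at all.

```python
import struct

# Fixed-size leading part of a LUKS1 header; all integers are big-endian.
# Layout assumed from the LUKS1 on-disk format: magic, version, cipher name,
# cipher mode, hash spec, payload offset, key bytes, MK digest, MK salt,
# MK iterations, UUID.
HEADER = struct.Struct(">6sH32s32s32sII20s32sI40s")
# One key slot record: active flag, iterations, salt, key material offset, AF stripes.
KEY_SLOT = struct.Struct(">II32sII")
SLOT_ENABLED = 0x00AC71F3  # marker value for an enabled slot
NUM_SLOTS = 8


def _text(field: bytes) -> str:
    """Decode a NUL-padded ASCII field."""
    return field.rstrip(b"\0").decode("ascii")


def parse_luks1_header(raw: bytes) -> dict:
    """Parse the first 592 bytes of a LUKS1 device into a plain dict."""
    (magic, version, cipher_name, cipher_mode, hash_spec, payload_offset,
     key_bytes, mk_digest, mk_salt, mk_iterations, uuid) = HEADER.unpack_from(raw, 0)
    if magic != b"LUKS\xba\xbe" or version != 1:
        raise ValueError("not a LUKS1 header")

    slots = []
    for index in range(NUM_SLOTS):
        offset = HEADER.size + index * KEY_SLOT.size
        active, iterations, salt, km_offset, stripes = KEY_SLOT.unpack_from(raw, offset)
        slots.append({
            "enabled": active == SLOT_ENABLED,
            "iterations": iterations,
            "salt": salt,
            "key_material_offset": km_offset,
            "af_stripes": stripes,
        })

    return {
        "version": version,
        "cipher_name": _text(cipher_name),
        "cipher_mode": _text(cipher_mode),
        "hash_spec": _text(hash_spec),
        "payload_offset": payload_offset,
        "mk_bits": key_bytes * 8,
        "mk_digest": mk_digest,
        "mk_salt": mk_salt,
        "mk_iterations": mk_iterations,
        "uuid": _text(uuid),
        "slots": slots,
    }
```

With root privileges you could feed it `open("/dev/sda3", "rb").read(592)`. Note that nothing here needs a passphrase: the whole header is public metadata.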
Why key slots? They are an additional indirection, but the added complexity pulls its own weight! If the passphrase were the encryption key (or the key were derived directly from it), changing the passphrase would mean re-encrypting the whole disk, because you'd effectively be changing the encryption key. Not good!
Instead, LUKS uses a neat trick: the master key is not something you specify, but it is machine generated and thus contains more entropy and thus is more secure than passphrases that people come up with. The master key is encrypted with your passphrase (know your passphrase => know the encryption key). That means that when you change your passphrase, all LUKS has to do is re-encrypt only the master key with your new passphrase. The master key remains unchanged, but your passphrase (and MK's encrypted form) changes.
Additionally, LUKS supports multiple passphrases (by having multiple "key slots") if you want to have multiple people use the same computer (or possibly for key rotation if two passphrases have to overlap for a short while).
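Here is a small Python sketch of that idea: changing a passphrase only re-wraps the master key, and the master key itself (and therefore the data on disk) stays the same. Everything in it is a simplification. The XOR "cipher" is a toy stand-in I picked so the example runs with the standard library only (real LUKS encrypts the key material with the volume's cipher), and the helper names and the low iteration count are assumptions for illustration.

```python
import hashlib
import os


def slot_key(passphrase: str, salt: bytes, length: int, iterations: int = 1000) -> bytes:
    """Derive a slot key from passphrase and salt with PBKDF2."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations, dklen=length)


def xor(data: bytes, pad: bytes) -> bytes:
    """Toy 'cipher': XOR with the derived key. This is NOT what LUKS uses."""
    return bytes(a ^ b for a, b in zip(data, pad))


def make_slot(master_key: bytes, passphrase: str) -> dict:
    """Wrap the master key under a passphrase, with a fresh random salt."""
    salt = os.urandom(32)
    return {"salt": salt,
            "encrypted_mk": xor(master_key, slot_key(passphrase, salt, len(master_key)))}


def open_slot(slot: dict, passphrase: str) -> bytes:
    """Unwrap the master key (no correctness check in this sketch)."""
    pad = slot_key(passphrase, slot["salt"], len(slot["encrypted_mk"]))
    return xor(slot["encrypted_mk"], pad)


def change_passphrase(slot: dict, old: str, new: str) -> dict:
    """Re-wrap the same master key under a new passphrase."""
    return make_slot(open_slot(slot, old), new)


master_key = os.urandom(64)  # machine generated, 512 bits like "MK bits: 512"
slot0 = make_slot(master_key, "supersecret")
slot1 = make_slot(master_key, "another user's passphrase")  # second key slot

slot0 = change_passphrase(slot0, "supersecret", "evenmoresecret")
print(open_slot(slot0, "evenmoresecret") == master_key)  # same master key as before
```

Only the 64 encrypted bytes in the slot changed; nothing encrypted with the master key had to be touched.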
Moving on! Let's say the master key is hunter2 (in reality it isn't human readable and simple like this, but this is easier to explain with).
The master key is not stored on disk - only its encrypted form is. Let's remember the LUKS metadata we looked at before, specifically:
- MK salt
- MK digest
MK digest could be defined like this (this is all pseudocode, and in reality uses the more secure bruteforce-resistant PBKDF2):
masterKeyDigest = sha256(masterKey + masterKeySalt)
masterKeySalt is not private information (it's stored in plaintext in the unencrypted portion of the disk). Let's say the salt is salty salt.
Therefore, plugging those details in to calculate the digest:
$ echo -n "hunter2salty salt" | sha256sum
ffa479b8cbc9f526c12fb50acadd27d665155fe0b0729440a405a60398da4b65
=> Our digest is the ffa479b8... string.
Now, the volume header would contain these public details:
MK digest: ffa479b8cbc9f526c12fb50acadd27d665155fe0b0729440a405a60398da4b65
MK salt: salty salt
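The plain sha256 above is only a stand-in. As far as I understand the LUKS1 format, the real MK digest is PBKDF2 over the master key, using the MK salt, the MK iteration count and the hash from "Hash spec", truncated to 20 bytes. A minimal Python sketch with the salt and iteration count from my luksDump output (the function name `mk_digest` is made up, and the all-zero candidate key is just a placeholder, since the real master key is secret):

```python
import hashlib

LUKS_DIGEST_SIZE = 20  # bytes; the "MK digest" line in the dump has 20 bytes


def mk_digest(master_key: bytes, mk_salt: bytes, mk_iterations: int,
              hash_spec: str = "sha256") -> bytes:
    """PBKDF2 digest of a (candidate) master key, as stored in the header."""
    return hashlib.pbkdf2_hmac(hash_spec, master_key, mk_salt, mk_iterations,
                               dklen=LUKS_DIGEST_SIZE)


# Public values copied from the luksDump output above.
mk_salt = bytes.fromhex(
    "b1 2f a4 ad 8a 4c 50 28 e1 b7 30 5b 6e 72 b3 b1"
    "a8 40 0a 59 1f eb 49 8d c4 41 36 e7 21 10 ae 8b"
)
mk_iterations = 63125

# Placeholder candidate: 64 zero bytes (512 bits), NOT my real master key.
candidate = bytes(64)
print(mk_digest(candidate, mk_salt, mk_iterations).hex())
```

If the candidate were the real master key, the printed value would equal the "MK digest" line from the dump. The iteration count is what makes guessing candidates expensive.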
Now, how about the key slots? Remember, they store the master key encrypted with the passphrase of your choosing.
Let's say our passphrase for slot 0 is supersecret (and the master key was hunter2).
Salt for key slot 0 will be slot0salt (salt is public knowledge).
Therefore, our key slot 0 encryption key (= slot 0 passphrase and salt concatenated) will be supersecretslot0salt.
Let's now encrypt the master key with that key. The example uses Blowfish via openssl for simplicity; in reality LUKS derives the slot key from the passphrase and salt with PBKDF2 (to slow down brute-forcing) and encrypts the master key with the cipher named in the header:
$ echo 'hunter2' | openssl bf -a -k supersecretslot0salt
U2FsdGVkX1/odbGQLAZ95E8uA+THJLMq0Zy3H5H87R0=
Now, here is all the public metadata needed for the key slot based encryption of the master key to work:
MK digest: ffa479b8cbc9f526c12fb50acadd27d665155fe0b0729440a405a60398da4b65
MK salt: salty salt
Key Slot 0: ENABLED
Salt: slot0salt
Master key encrypted: U2FsdGVkX1/odbGQLAZ95E8uA+THJLMq0Zy3H5H87R0=
Key Slot 1: DISABLED
...
Key Slot 7: DISABLED
So, let's stitch this all together to see how the system boots up!
Your machine can only read the unencrypted /boot, which contains the kernel and supporting stuff to get the booting process started. The system knows that there are encrypted partitions, knows the metadata we listed in the previous section, and will ask you for the passphrase.
You enter the correct passphrase for key slot 0, supersecret.
Now we decrypt the master key from slot 0 by combining the passphrase with the slot's salt:
$ echo U2FsdGVkX1/odbGQLAZ95E8uA+THJLMq0Zy3H5H87R0= | openssl bf -d -a -k supersecretslot0salt
hunter2
But how does the system know whether the master key was decrypted correctly? With a wrong passphrase, decryption can still produce some output, just not the real master key. Let's imagine that we gave the passphrase wrongpw and thus the master key was decoded as wrongmasterkey.
This is where the master key digest and salt step in. Let's try digesting wrongmasterkey:
$ echo -n "wrongmasterkeysalty salt" | sha256sum
c7bbe305ba3092ac42e6503e54022f0102a97141e79b4568690face6dda04150
The digest should've been ffa479b8..., so the passphrase for that slot was wrong.
When you enter the passphrase, the system doesn't know (or ask) which slot it belongs to. Instead it iterates over all the slots (0..7) and tries to decrypt the master key with the process outlined above. If a try is unsuccessful, it moves on to the next slot. If all slot tries are unsuccessful, the passphrase was wrong.
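The slot iteration can be sketched in a few lines of Python. This is a toy model, not cryptsetup's code: the XOR "cipher" stands in for the real cipher, there is no anti-forensic striping, the header is a plain dict, the iteration counts are tiny, and all function names are made up for the example.

```python
import hashlib
import hmac
import os


def pbkdf2(secret: bytes, salt: bytes, iterations: int, length: int) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", secret, salt, iterations, dklen=length)


def xor(data: bytes, pad: bytes) -> bytes:
    """Toy 'cipher': XOR with the derived slot key. NOT what LUKS uses."""
    return bytes(a ^ b for a, b in zip(data, pad))


def add_slot(header: dict, master_key: bytes, passphrase: str, iterations: int = 1000) -> None:
    """Store the master key, wrapped under the passphrase, in a new slot."""
    salt = os.urandom(32)
    pad = pbkdf2(passphrase.encode(), salt, iterations, len(master_key))
    header["slots"].append({"salt": salt, "iterations": iterations,
                            "encrypted_mk": xor(master_key, pad)})


def unlock(header: dict, passphrase: str):
    """Try the passphrase against every slot; return (slot index, master key) or None."""
    for index, slot in enumerate(header["slots"]):
        pad = pbkdf2(passphrase.encode(), slot["salt"], slot["iterations"],
                     len(slot["encrypted_mk"]))
        candidate = xor(slot["encrypted_mk"], pad)
        # The candidate is only accepted if its digest matches the header's MK digest.
        digest = pbkdf2(candidate, header["mk_salt"], header["mk_iterations"], 20)
        if hmac.compare_digest(digest, header["mk_digest"]):
            return index, candidate
    return None  # no slot matched: wrong passphrase


master_key = os.urandom(64)
mk_salt = os.urandom(32)
header = {"mk_salt": mk_salt, "mk_iterations": 1000,
          "mk_digest": pbkdf2(master_key, mk_salt, 1000, 20), "slots": []}
add_slot(header, master_key, "supersecret")
add_slot(header, master_key, "second passphrase")

print(unlock(header, "second passphrase")[0])  # index of the slot that matched
print(unlock(header, "wrongpw"))               # no slot matches
```

Every failed slot costs a full PBKDF2 run, which is one reason a wrong passphrase takes noticeably longer to reject when several slots are in use.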
Now let's try it again with the correct master key decrypted with the correct slot 0 passphrase:
$ echo -n "hunter2salty salt" | sha256sum
ffa479b8cbc9f526c12fb50acadd27d665155fe0b0729440a405a60398da4b65
That matched the metadata (MK digest), so we have now successfully uncovered the master key from slot 0 by knowing slot 0's correct passphrase.
Hopefully I managed to shed some light on how Linux full-disk encryption works (particularly, the LUKS kind with cryptsetup and dm-crypt), and by extension now you know how other systems achieve full disk encryption, because they're somewhat similar anyway.
Even though you can most certainly use encryption without understanding how it's implemented under the hood, you get way more confidence if it's not just "black magic" to you, but rather you actually understand at least the basics of what happens under the hood.
- How does GNU GRUB work (technical internals)
- Inspecting the Content of an Initrd File
- Android disk encryption is similar: Revisiting Android disk encryption
- Dissecting LUKS (from a crypto standpoint)