Backup immutability

Definition and goal

Immutability relies on a trusted third party that prevents us from changing or deleting files. That way, even if our system is compromised, an attacker can't delete or corrupt any of our backups.

It is a key feature in the fight against ransomware.

State of the art

https://www.veeam.com/blog/installing-ubuntu-linux-veeam-hardened-repository.html

Backend

S3

S3 can provide this feature with its 'compliance' object-lock mode, which prevents even the root account from deleting a file during the lock duration.
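For illustration, a minimal sketch with the AWS CLI; the bucket name and key layout are placeholders, not XO's actual layout:

```bash
# object lock can only be enabled when the bucket is created
aws s3api create-bucket --bucket my-backup-bucket --object-lock-enabled-for-bucket

# upload a backup with a compliance-mode lock: not even the account owner
# can delete the locked version before the retain-until date
aws s3api put-object \
  --bucket my-backup-bucket \
  --key xo-vm-backups/vm1/backup.xva \
  --body backup.xva \
  --object-lock-mode COMPLIANCE \
  --object-lock-retain-until-date 2023-07-29T00:00:00Z
```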

Regular FS

Install an agent on the server that (see the sketch after this list):

  • creates an immut_DATE folder in each VM folder used in an immutable backup
  • adds the +a (append-only) attribute to the folder (meaning files can be created and appended, but not deleted or renamed)
  • removes the a attribute from the older folders
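A minimal sketch of such an agent, assuming a daily root cron job; the mount point, layout and 7-day lock duration are assumptions:

```bash
#!/bin/bash
# Hypothetical agent run daily as root on the file server.
REMOTE=/mnt/backups            # assumption: the remote's mount point
TODAY=$(date +%Y%m%d)

for vm in "$REMOTE"/xo-vm-backups/*/; do
  # create today's immutable folder; +a on a directory means entries can be
  # added but not deleted or renamed
  mkdir -p "$vm/immut_$TODAY"
  chattr +a "$vm/immut_$TODAY"

  # lift the attribute from older immut_ folders once the lock duration
  # (here 7 days) has elapsed, so the normal cleanup can delete them
  find "$vm" -maxdepth 1 -name 'immut_*' -ctime +7 -exec chattr -R -a {} \;
done
```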

On the XO side:

  • use the new immut_DATE folder for the xva
  • open files in append-only mode ('a')
  • cleanup should try to remove files, but should handle locked files gracefully (see the session below)
  • backups should show a warning if the immut_DATE folders are missing (meaning the script did not run on the file-server side)
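A shell session illustrating the +a semantics XO has to cope with (paths are hypothetical):

```
# the agent marks the finished backup append-only
$ sudo chattr +a /mnt/backups/vm1/immut_20230629/backup.xva
$ echo extra >> /mnt/backups/vm1/immut_20230629/backup.xva   # appending still works
$ rm /mnt/backups/vm1/immut_20230629/backup.xva              # deletion is refused
rm: cannot remove '/mnt/backups/vm1/immut_20230629/backup.xva': Operation not permitted
$ sudo chattr -a /mnt/backups/vm1/immut_20230629/backup.xva  # only root can lift the attribute
$ rm /mnt/backups/vm1/immut_20230629/backup.xva              # now cleanup can proceed
```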

https://superuser.com/questions/1095043/automatically-apply-chattr-to-new-files-directories

Full VM backup, disk backup, metadata backup

We only need to ensure that the object lock duration is shorter than the retention: for example, with daily backups and a retention of 30, each backup is deleted after 30 days, so any lock duration up to 30 days is safe.

Delta backups

Since we need to be able to modify the merging and merged VHDs, and all the VHDs in the following chain depend on these, the full chain of VHDs must be under object-lock protection to guarantee immutability (the merge will be done on the full/delta once out of object lock). That means the retention must be, at a minimum, equal to the VHD chain length plus the desired immutability duration, which increases the backup size notably. At a minimum there will be two fulls, with one under object lock.

Proposition:

  • upload all the blocks of the delta
  • upload a full BAT, including the VHD UUID and block number, pointing to the inherited blocks of all the ancestors (size: 32-64 MB, depending on the path length, for a 2 TB VHD without any empty block)
  • refresh the object lock on all the blocks of the parent (and their parents), as sketched below
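A sketch of the lock-refresh step with the AWS CLI; the bucket name, key prefix, and 30-day horizon are assumptions:

```bash
#!/bin/bash
# Hypothetical: extend the COMPLIANCE lock on every block of the parent chain.
BUCKET=my-backup-bucket
UNTIL=$(date -u -d '+30 days' +%Y-%m-%dT%H:%M:%SZ)   # GNU date

# enumerate the parent VHD's block keys and push the retention date forward
# (compliance-mode retention can be extended, never shortened)
aws s3api list-objects-v2 --bucket "$BUCKET" \
    --prefix "xo-vm-backups/parent-vhd-uuid/blocks/" \
    --query 'Contents[].Key' --output text |
tr '\t' '\n' |
while read -r key; do
  aws s3api put-object-retention --bucket "$BUCKET" --key "$key" \
      --retention "Mode=COMPLIANCE,RetainUntilDate=$UNTIL"
done
```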

Cleanup:

  • list all the blocks (millions or billions of blocks)
  • delete all the blocks with an expired lock
  • delete the metadata and alias of the VHDs which have at least one block deleted

Having an index of blocks and their usage can speed up the search for expired blocks and for the VHDs that depend on them.
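Without such an index, a naive pass could look like the sketch below; the per-key get-object-retention call is exactly the cost an index would avoid. Bucket and prefix are assumptions:

```bash
#!/bin/bash
# Hypothetical index-less cleanup pass over all block objects.
BUCKET=my-backup-bucket
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)

aws s3api list-objects-v2 --bucket "$BUCKET" --prefix "xo-vm-backups/" \
    --query 'Contents[].Key' --output text |
tr '\t' '\n' |
while read -r key; do
  retain_until=$(aws s3api get-object-retention --bucket "$BUCKET" --key "$key" \
      --query 'Retention.RetainUntilDate' --output text 2>/dev/null)
  # naive lexicographic ISO-date comparison; only delete once the lock expired
  if [[ -n "$retain_until" && "$retain_until" < "$NOW" ]]; then
    aws s3api delete-object --bucket "$BUCKET" --key "$key"
  fi
done
```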

V2

First, if your remote natively supports some sort of immutability, use it: it will be more robust than any user-level layer we add on top of it. Native immutability may be called "object lock" on S3, or a retention policy, depending on the provider. Also, you won't be required to take a proxy license for it, since XO has played nice with locked remotes since 2022.

A proxy installed on the remote should have the storage as local (maybe NFS 4.2+) and run as root.

  • the remote lock duration should be in a config file, only accessible locally to root

  • after a backup on an immutable remote:

    • store a hash of the file in the metadata
    • add 'immutable: true' to the JSON metadata with https://stedolan.github.io/jq/
    • find <remotepath>/vm -ctime -7 \( -iname \*.vhd -o -iname \*.xva -o -iname \*.json \) -exec chattr +a {} \; (using the +a attribute instead of +i, since content appended at the end of a JSON or VHD would make the content invalid anyway, while setting +i on a file being transferred would make the transfer fail; using ctime ensures we don't relock files after merges)

  • before cleanVm: find <remotepath> -ctime +7 \( -iname \*.vhd -o -iname \*.xva -o -iname \*.json \) -exec chattr -R -a {} \; (from https://unix.stackexchange.com/questions/15308/how-to-use-find-command-to-search-for-multiple-extensions)

  • update the UX to show immutable backups

  • handle immutability errors as gracefully as possible (a combined sketch of the lock/unlock passes follows this list)
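A combined sketch of the lock/unlock passes; the config path /etc/xo-immutability.conf, its variable names, and the directory layout are assumptions, not the actual XO implementation:

```bash
#!/bin/bash
# Hypothetical root-only job on the proxy (idempotency and error handling
# are left out of this sketch).
source /etc/xo-immutability.conf   # e.g. LOCK_DURATION_DAYS=7, REMOTE=/mnt/remote
                                   # (file owned by root, chmod 600)

# flag fresh backups as immutable in their JSON metadata (jq), before locking,
# since a locked file can no longer be rewritten in place
find "$REMOTE/xo-vm-backups" -ctime "-$LOCK_DURATION_DAYS" -iname '*.json' -print0 |
while IFS= read -r -d '' meta; do
  jq '. + {immutable: true}' "$meta" > "$meta.tmp" && mv "$meta.tmp" "$meta"
done

# after backup: make recent backup files append-only; +a rather than +i so a
# file still being transferred doesn't fail, while appended bytes would still
# invalidate a JSON or VHD
find "$REMOTE/xo-vm-backups" -ctime "-$LOCK_DURATION_DAYS" \
  \( -iname '*.vhd' -o -iname '*.xva' -o -iname '*.json' \) \
  -exec chattr +a {} \;

# before cleanVm: lift the attribute from files older than the lock duration
# so merges and deletion can proceed
find "$REMOTE" -ctime "+$LOCK_DURATION_DAYS" \
  \( -iname '*.vhd' -o -iname '*.xva' -o -iname '*.json' \) \
  -exec chattr -a {} \;
```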

Caveats

  • files are vulnerable during backup
  • multiple backups in parallel shouldn't lead to race conditions in normal use, since we split by mtime
  • changes in the conf are applied at the end of the next backup
  • maybe we should store a VHD hash in the metadata?
  • the number of files should be limited with VhdFile, so find should be OK; VhdBlock may need to use another approach, with inotify (sketched below)
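A sketch of the inotify approach using inotify-tools, locking each block file as soon as it is fully written instead of scanning millions of files with find; the mount point is an assumption:

```bash
#!/bin/bash
# Hypothetical: watch the remote and apply +a on every freshly written file.
REMOTE=/mnt/remote   # assumption: the remote's mount point

inotifywait -m -r -e close_write --format '%w%f' "$REMOTE/xo-vm-backups" |
while read -r file; do
  case "$file" in
    *.vhd|*.xva|*.json) chattr +a "$file" ;;
  esac
done
```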