The storage requirements for a geth archive node fully synced from genesis exceed 10 TB. The initial chain sync requires the high IOPS typically provided by flash storage, such as SSD or NVMe drives, to process each block. Because an archive node stores every historical state of the chain, its space requirements are drastically higher than those of a regularly-pruned full node.
For many users, provisioning more than 10 TB of SSD or NVMe storage is not feasible. Since the vast majority of state data is at rest on the node, it may benefit from a tiered cache approach: a fast device provides enough IOPS for syncing, with a large HDD backend for bulk data storage.
This method uses lvmcache with two physical disks:

- 1 TB NVMe (`/dev/disk/by-id/nvme-1TB`), used for the cache volume
- 20 TB HDD (`/dev/disk/by-id/ata-HDD`), used for the origin, or data, volume
N.B. To reap the benefit of the NVMe's IOPS during syncing, which includes a significant number of writes to the drive, the node will use the writeback cache mode. This mode delays writes from the cache volume to the origin, so loss of the cache device may result in data loss.
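If the NVMe ever needs to be removed or replaced, the cache can be detached cleanly beforehand; `lvconvert --splitcache` flushes dirty blocks back to the origin before separating the volumes. A sketch, using the volume names created below:

```shell
# Flush dirty cache blocks to the HDD and detach the cache pool,
# leaving rpdata_lv as a plain (uncached) logical volume.
sudo lvconvert --splitcache ethvol/rpdata_lv
```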
Create physical volumes (PV) using each disk, create a volume group (VG) including both disks, and create logical volumes (LV) for the origin and cache volumes, assigning them to their respective disks.
Create PVs:

```
sudo pvcreate /dev/disk/by-id/nvme-1TB
sudo pvcreate /dev/disk/by-id/ata-HDD
```
Create a VG called `ethvol`:

```
sudo vgcreate ethvol /dev/disk/by-id/ata-HDD /dev/disk/by-id/nvme-1TB
```
Create the origin LV called `rpdata_lv` on the HDD using 100% of its capacity:

```
sudo lvcreate -n rpdata_lv -l +100%FREE ethvol /dev/disk/by-id/ata-HDD
```
Create the cache volumes on the NVMe. These consist of a cache data volume and a cache metadata volume, which are then combined into a cache pool. The cache metadata volume will be 500 MB, and the cache data volume will use (almost all of) the remaining space. (The `lvcreate` for `cache_data_lv` uses `-l +99%FREE`, as the subsequent conversion to a cache pool requires some additional space.)

```
sudo lvcreate -n cache_metadata_lv -L 500M ethvol /dev/disk/by-id/nvme-1TB
sudo lvcreate -n cache_data_lv -l +99%FREE ethvol /dev/disk/by-id/nvme-1TB
```
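Before converting anything, it can be worth confirming that each LV landed on the intended disk. The standard LVM reporting commands show this; `-o +devices` adds a column listing the backing device for each LV:

```shell
sudo pvs                      # both disks should appear as PVs
sudo vgs ethvol               # one VG spanning roughly 21 TB
sudo lvs -o +devices ethvol   # each LV with its backing device
```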
Create the cache pool:

```
sudo lvconvert --type cache-pool --poolmetadata ethvol/cache_metadata_lv ethvol/cache_data_lv
```
Add the cache pool to the origin HDD volume:

```
sudo lvconvert --type cache --cachepool ethvol/cache_data_lv ethvol/rpdata_lv
```
The default cache mode is writethrough. Convert to writeback:

```
sudo lvchange --cachemode writeback ethvol/rpdata_lv
```
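To confirm the mode change took effect, and to keep an eye on cache utilization later, `lvs` can report cache-specific fields (exact field names may vary between LVM versions), and `dmsetup status` shows the raw device-mapper cache state, including dirty block counts:

```shell
sudo lvs -o name,cache_policy,cache_settings,cache_dirty_blocks ethvol/rpdata_lv
sudo dmsetup status ethvol-rpdata_lv
```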
Create an ext4 filesystem on the `rpdata_lv` volume, using the Rocket Pool documentation as a guide:

```
sudo mkfs.ext4 -m 0 -L rocketarchive /dev/ethvol/rpdata_lv
```
Grab the UUID of the created filesystem:

```
sudo blkid | grep rpdata_lv
```
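If the next step is being scripted, the UUID can be extracted from the `blkid` output directly. A sketch, with an illustrative sample line (substitute the real output of `sudo blkid | grep rpdata_lv`):

```shell
# The sample line below is illustrative; the device path and UUID are placeholders.
blkid_line='/dev/mapper/ethvol-rpdata_lv: LABEL="rocketarchive" UUID="0f1d2c3b-aaaa-bbbb-cccc-1234567890ab" TYPE="ext4"'

# Isolate the UUID="..." field, then strip the quotes around its value.
uuid=$(echo "$blkid_line" | grep -o 'UUID="[^"]*"' | cut -d'"' -f2)
echo "$uuid"
```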
Edit `/etc/fstab` to include the new volume, to be mounted at `/mnt/rpdata`:

```
### /etc/fstab
/dev/disk/by-uuid/<uuid from blkid> /mnt/rpdata ext4 defaults 0 0
```
Create the mountpoint and mount the drive:

```
sudo mkdir -p /mnt/rpdata
sudo mount /mnt/rpdata
```
Check IOPS. This operation should hit only the cache drive, so the IOPS should be equivalent to testing the NVMe disk directly:

```
cd /mnt/rpdata
sudo fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
```
Results:

```
test: (groupid=0, jobs=1): err= 0: pid=1957977: Sun Aug 21 11:21:10 2022
  read: IOPS=29.4k, BW=115MiB/s (120MB/s)(3070MiB/26731msec)
   bw (  KiB/s): min=   32, max=229088, per=100.00%, avg=117628.15, stdev=63351.55, samples=53
   iops        : min=    8, max=57272, avg=29406.96, stdev=15837.93, samples=53
  write: IOPS=9825, BW=38.4MiB/s (40.2MB/s)(1026MiB/26731msec); 0 zone resets
   bw (  KiB/s): min=    8, max=76816, per=100.00%, avg=39310.42, stdev=21235.16, samples=53
   iops        : min=    2, max=19204, avg=9827.45, stdev=5308.75, samples=53
  cpu          : usr=6.83%, sys=33.93%, ctx=98809, majf=0, minf=8
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64
```
At this point, the instructions simply follow the Rocket Pool documentation for a Docker install. The new mountpoint `/mnt/rpdata` will be used for Docker data and can be set up using the instructions at Configuring Docker's Storage Location.

Edit `/etc/docker/daemon.json` to include the following:
```
{
    "data-root": "/mnt/rpdata/docker"
}
```
Create the Docker dir and restart the Docker daemon:

```
sudo mkdir /mnt/rpdata/docker
sudo systemctl restart docker
```
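After the restart, it is worth confirming that Docker actually picked up the new data root; `docker info` exposes it as a template field:

```shell
# Should print /mnt/rpdata/docker if daemon.json was applied.
docker info --format '{{ .DockerRootDir }}'
```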
Finally, configure the Rocket Pool Smartnode to run geth in archive mode by providing the `--syncmode full --gcmode archive` flags to geth in the Execution Client settings.
At the time of this writing, the archive node sync is in progress.