This document shows Prysm benchmarks of disk & network I/O and CPU & memory usage during initial sync. The total time taken to sync to the head of the chain as of September 9th, slot 2028735, was approximately 24 hours.
We tried a few different settings but overall found that the defaults performed best.
The final state size of the beacon chain was ~25GB.
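For reference, a minimal sketch of checking the database size on disk; the path is an assumption, based on Prysm's default datadir of `~/.eth2`:

```
# Hedged sketch: report the on-disk size of the beacon chain database.
# The path is an assumption; Prysm's default datadir is ~/.eth2.
du -sh ~/.eth2/beaconchaindata
```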
Overall it seemed that `block-batch-limit` had the most effect on sync speed and on the success and error rates. The error rate seemed to increase as we got closer to the head of the chain. We recommend reducing `block-batch-limit` from the default of 64 to 32 near the chain head if the error rate is very high.
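A minimal sketch of restarting the beacon node with the reduced batch size, assuming a Prysm v1-era `beacon-chain` binary; the datadir path is a placeholder and other required flags are omitted:

```
# Hedged sketch: --block-batch-limit is the Prysm flag discussed above.
# The datadir path is a placeholder; add whatever other flags your deployment needs.
beacon-chain --block-batch-limit=32 --datadir=/data/prysm
```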
Benchmarks from two configurations are documented below.
Benchmarks were carried out on a heavily overprovisioned instance.
An AWS EC2 m5.2xlarge was used, with 8 vCPUs and 32GB of RAM.
Additionally, for storage we used an AWS gp3 EBS SSD volume with 8000 IOPS and 500 MB/s throughput.
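A sketch of provisioning a comparable volume with the AWS CLI; the size and availability zone below are placeholders:

```
# Hedged sketch: create a gp3 volume matching the IOPS/throughput above.
# --size (GiB) and --availability-zone are placeholders.
aws ec2 create-volume \
  --volume-type gp3 \
  --iops 8000 \
  --throughput 500 \
  --size 200 \
  --availability-zone us-east-1a
```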
These are the default settings for Prysm.
```
Average:    IFACE  rxpck/s  txpck/s   rxkB/s   txkB/s  rxcmp/s  txcmp/s  rxmcst/s  %ifutil
Average:     ens5   174.07   136.55   143.83    18.41     0.00     0.00      0.00     0.00

Average:       DEV     tps   rkB/s     wkB/s  dkB/s  areq-sz  aqu-sz  await  %util
Average:  dev259-0  406.37    0.00   4136.79   0.00    10.18    0.01   0.92   4.80

  PID   USER   PR  NI   VIRT   RES   SHR  S  %CPU  %MEM    TIME+  COMMAND
17602  ubuntu  20   0  10.6g  2.6g  1.0g  S 354.5   8.3  8:44.38  beacon-chain-v1
```
Observation length: ~1hr
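The averages above look like sysstat and top output; a sketch of commands that produce similar numbers, where the process-name pattern is an assumption:

```
# Hedged sketch: collect network, disk, and process averages (sysstat + procps).
sar -n DEV 10     # per-interface packet and kB/s rates; prints Average: lines on exit
sar -d 10         # per-device tps, wkB/s, await, %util
top -b -n 1 -p "$(pgrep -f beacon-chain)"   # %CPU and RES for the beacon node process
```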
```
[2021-09-08 12:32:08] INFO initial-sync: Processing block batch of size 31 starting from 0x860d9237... 893313/2023358 - estimated time remaining 202h31m1s blocksPerSecond=1.6 peers=9
[2021-09-08 13:27:10] INFO initial-sync: Processing block batch of size 64 starting from 0x6680b10c... 981344/2023633 - estimated time remaining 10h10m35s blocksPerSecond=28.4 peers=47
```
Blocks per second varied from 12 to 32; the cumulative average was around 24 blocks per second.
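As a rough cross-check, the full sync covered 2028735 slots in roughly 24 hours (86400 seconds), which works out to about 23.5 blocks per second, consistent with the cumulative average above, though the full sync mixed several settings.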
This configuration showed the best consistent rate, 50 blocks per second in the streaming output, though the cumulative average was 27. The theory behind it was:
- High `slots-per-archive-point` to reduce disk I/O and storage space requirements, and prevent any halting for disk writes
- Higher `p2p-max-peers` to improve the chances of getting new blocks quickly
- High `block-batch-limit`, which showed a consistent 50 blocks per second being processed, better than a higher limit of 786 and a lower limit of 384
- Increased `max-goroutines` to benefit from any possible concurrency boost
The settings used:

```
block-batch-limit: 512
p2p-max-peers: 500
slots-per-archive-point: 8192
max-goroutines: 40000
```
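For reference, a minimal sketch of passing these settings as flags to the beacon node; other required flags are omitted:

```
# Hedged sketch: the flags below mirror the settings above.
beacon-chain \
  --block-batch-limit=512 \
  --p2p-max-peers=500 \
  --slots-per-archive-point=8192 \
  --max-goroutines=40000
```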
```
Average:    IFACE  rxpck/s  txpck/s   rxkB/s   txkB/s  rxcmp/s  txcmp/s  rxmcst/s  %ifutil
Average:     ens5   932.33  1055.31   107.40   168.37     0.00     0.00      0.00     0.00

Average:       DEV     tps   rkB/s     wkB/s  dkB/s  areq-sz  aqu-sz  await  %util
Average:  dev259-0  451.45    0.00   4678.72   0.00    10.36    0.01   0.95   4.93

  PID   USER   PR  NI   VIRT   RES   SHR  S  %CPU  %MEM      TIME+  COMMAND
15329  ubuntu  20   0  10.6g  6.9g  4.7g  S 223.3  22.4  810:29.98  beacon-chain-v1
```
We saw CPU usage of ~223-400% as reported by top, roughly 28-50% of the 8 vCPUs available. This was more or less the average.
We saw a memory footprint of 6.9GB used on average. This is about 3x what was used with the default configuration.
We observed this setting for around 4 hours. While most streaming outputs claimed that 50 blocks per second were being processed, the real cumulative rate was 27 blocks per second. We attribute this to more errors occurring with the higher `block-batch-limit`, with block batches being discarded instead of processed due to a "no good block in batch" error.
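A sketch of deriving the cumulative rate from two log snapshots, using the slot counters that initial-sync prints; the numbers below are copied from the default-configuration logs earlier, and GNU date is assumed:

```
# Hedged sketch: cumulative blocks/s between two initial-sync log lines.
# Slot counters and timestamps are copied from the logs above (GNU date assumed).
start_slot=893313; start_ts=$(date -d '2021-09-08 12:32:08' +%s)
end_slot=981344;   end_ts=$(date -d '2021-09-08 13:27:10' +%s)
echo "$(( (end_slot - start_slot) / (end_ts - start_ts) )) blocks/s"   # ~26
```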
There were two types of errors that popped up:
- `No good block in batch`: this would make the syncing process discard the batch of blocks it had just fetched and retry with a new batch. This was more visible with a higher `block-batch-limit`.
- `No parent found in DB for node ...`: a slightly more cryptic error. Sometimes blocks would be fetched and then discarded if a parent node was not found in the database. This seemed independent of any settings but may be related to having a high peer count.
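A sketch of gauging how often each error occurs; the log file path is a placeholder, and the patterns should be adjusted to match the exact messages in your logs:

```
# Hedged sketch: count occurrences of each error in the beacon node log.
# beacon.log is a placeholder path; match the messages exactly as they appear.
grep -c 'No good block in batch' beacon.log
grep -c 'No parent found in DB' beacon.log
```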