@yorickdowne
Last active March 13, 2024 01:10
Pruning Geth 1.10.x, 1.11.x, 1.12.x

Note: PBSS in Geth >=1.13.0 removes the need to prune manually.


Old content for reference

Overview

Geth (Go-Ethereum) as of July 2022 takes about 650 GiB of space after a fast/snap sync, and then grows by ~ 14 GiB/week with the default cache, or ~ 8 GiB/week with more cache.

This will fill a 2TB SSD in a year or two, at which point space usage should be brought back down with an offline prune.

Happily, Geth 1.10.x introduced "snapshot offline prune", which brings it back down to about its original size. It takes roughly 4-6 hours to prune the Geth database, and this has to be done while Geth is not running.

A caveat: while several folx have used offline pruning successfully, there is some risk associated with it. The two failure modes we have seen so far are:

  • There is 37 GiB or less of free disk space
  • The pruning process is interrupted partway through.

Prerequisites

  • This is not an archive node. Do not try to prune an archive node.
  • The volume Geth stores its database on has roughly 40 (?) GiB of free space or more. We know 37 GiB is not enough.
  • Geth is fully synced
  • Geth has finished creating a snapshot, and this snapshot is 128 blocks old or older, which takes about 35 minutes. You can tell snapshot creation is done when the logs no longer show "state snapshot generation" messages. Geth generates a snapshot by default, right after it finishes syncing.
  • tmux or similar such as screen installed: sudo apt install tmux. This intro is useful for navigating tmux. tmux just makes sure the prune process continues even when the user is logged out, e.g. because of an idle timer or connection break.
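Before starting, it can help to sanity-check free space on the volume holding Geth's database. A minimal sketch, assuming a datadir of /var/lib/geth (the path from Somer Esat's guide) and a 50 GiB threshold (40 GiB minimum plus margin); both are assumptions you can override via environment variables:

```shell
#!/usr/bin/env bash
# Sketch: warn if the volume holding Geth's datadir has less than
# ~50 GiB free. DATADIR and NEED_GIB are assumptions -- adjust them.
DATADIR="${DATADIR:-/var/lib/geth}"
NEED_GIB="${NEED_GIB:-50}"
# df -k reports 1-KiB blocks; column 4 is available space
avail_kib=$(df -k "$DATADIR" 2>/dev/null | awk 'NR==2 {print $4}')
avail_gib=$(( ${avail_kib:-0} / 1024 / 1024 ))
if [ "$avail_gib" -ge "$NEED_GIB" ]; then
    echo "OK: ${avail_gib} GiB free on the volume holding ${DATADIR}"
else
    echo "WARNING: only ${avail_gib} GiB free; pruning may fail"
fi
```

Run it as `DATADIR=/your/datadir ./check-space.sh`; if it warns, free up space before attempting the prune.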

What you expect to see

Geth will prune in 3 stages: "Iterating state snapshot", "Pruning state data", and "Compacting database". During the "Compacting database" stage, it may not output any log entries for an hour or so, limited by the IOPS of a mainstream SSD. Don't restart it when this happens; let it run!

If you see messages about "state snapshot generation" during the prune, you don't actually have a snapshot yet! Either the --datadir and/or USER aren't right, or Geth just didn't have enough time to complete the snapshot. In that case, stop the process, run Geth normally again, and watch its logs until the snapshot has completed and is 128 blocks old.

When Geth is done pruning, the process will exit and you will see a log line that contains the phrase State pruning successful.

Pruning if you are using systemd to run Geth

systemd will run something like a geth service, with a User specified in the /etc/systemd/system/geth.service file, and an ExecStart in the same file that runs geth, which also specifies the --datadir path.

Stop Geth: sudo systemctl stop geth

If Geth does not have enough time to shut down cleanly, the prune may fail. Depending on your storage hardware, you may need to give it up to 180 seconds to shut down cleanly.
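If shutdown keeps getting cut short, you can raise systemd's stop timeout in the unit file. A sketch of the relevant lines (the 300-second value is an assumption; tune it to your hardware):

```ini
[Service]
# Give Geth up to 300 seconds to flush and close its database cleanly
# before systemd sends SIGKILL. Value is an assumption -- adjust as needed.
TimeoutStopSec=300
```

After editing, run sudo systemctl daemon-reload for the change to take effect.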

You now have two options, choose whichever is easiest for you.

Systemd option A, use sudo

  • First, start tmux or screen. This is so you can get disconnected and the prune will continue running.
  • Then, with the USER and PATH to --datadir from the systemd service file, run sudo -u USER geth --datadir PATH snapshot prune-state. If you set up Geth following Somer Esat's current guide, that's sudo -u geth geth --datadir /var/lib/geth snapshot prune-state, or with Somer's original guide it's sudo -u goeth geth --datadir /var/lib/goethereum snapshot prune-state. If you followed CoinCashew's instructions to set up Geth, it'd just be geth snapshot prune-state.

Note that running the prune as the user Geth usually runs as is critical, so that the Geth service still has permission to access its own DB files when you start it up again.
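Looking up the USER and --datadir from the service file can be sketched as a small helper. The parsing below is a best-effort assumption about the unit file's layout; always eyeball the printed command before running it inside tmux:

```shell
#!/usr/bin/env bash
# Sketch: read User= and --datadir from a systemd unit file and print
# the matching offline-prune command. Parsing is an assumption -- verify
# the output against your actual unit file before running it.
print_prune_cmd() {
    local unit="$1" user datadir
    user=$(sed -n 's/^User=\(.*\)$/\1/p' "$unit")
    datadir=$(sed -n 's/.*--datadir[= ]*\([^ ]*\).*/\1/p' "$unit")
    echo "sudo -u ${user} geth --datadir ${datadir} snapshot prune-state"
}

# Usage: print_prune_cmd /etc/systemd/system/geth.service
```

For a unit with `User=geth` and `--datadir /var/lib/geth`, this prints the same command given above for Somer Esat's guide.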

Once pruning is complete, start Geth again: sudo systemctl start geth

If you don't want to run tmux, you could modify the Geth service instead.

Systemd option B, modify the existing service

  • Edit the existing file: sudo nano /etc/systemd/system/geth.service and add this to the very end of ExecStart: snapshot prune-state

Add this to the existing arguments; do not replace them. Geth still needs to know where its --datadir is.
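For illustration, a unit file modified this way might look like the following sketch. The binary path and datadir are assumptions based on Somer Esat's guide; keep whatever arguments your own ExecStart already has:

```ini
[Service]
User=geth
# Original line, kept here as a comment for reference:
#   ExecStart=/usr/local/bin/geth --datadir /var/lib/geth
# Temporarily append "snapshot prune-state" to the very end:
ExecStart=/usr/local/bin/geth --datadir /var/lib/geth snapshot prune-state
```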

  • Tell systemd you made changes: sudo systemctl daemon-reload
  • Start the Geth service: sudo systemctl start geth
  • You can observe prune progress with journalctl -fu geth

Note: Unless you also change the restart parameter, systemd will restart the prune after it finishes, and that restart will fail. This is harmless: once you restore the service to its previous state, Geth will run successfully again.

Once Geth has finished pruning, undo the changes you made:

  • Edit the existing file: sudo nano /etc/systemd/system/geth.service and remove this from ExecStart: snapshot prune-state
  • Tell systemd you made changes: sudo systemctl daemon-reload
  • Start the Geth service: sudo systemctl start geth
  • You can observe that Geth starts correctly with journalctl -fu geth

Pruning if you are using docker-compose to run Geth

If you are using docker-compose, all you need to do is stop the Geth service, and start it again with pruning parameters.

eth-docker supports ./ethd prune-geth which handles the below steps for you. It also offers an auto-prune.sh script that can kick off pruning when disk space goes below a threshold, or will just output a warning that crontab can email to you if run as auto-prune.sh --dry-run.
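As an illustration, a crontab entry for the dry-run check might look like this. The install path and schedule are assumptions; cron will email the script's warning output if a mailer is configured:

```shell
# Run the eth-docker dry-run space check daily at 06:00.
# The path to eth-docker is an assumption -- adjust to your install.
0 6 * * * /home/youruser/eth-docker/auto-prune.sh --dry-run
```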

Rocketpool uses rocketpool service prune-eth1 to prune Geth.

For generic docker-compose (this won't work in eth-docker):

  • docker-compose stop execution && docker-compose rm execution
  • docker-compose run --rm --name geth_prune -d execution snapshot prune-state
  • Observe pruning progress with: docker logs -f --tail 500 geth_prune
  • When pruning is done: docker-compose up -d execution
  • And observe that Geth is running correctly: docker-compose logs -f execution
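As an alternative to docker-compose run, a temporary override file can swap the command in for the duration of the prune. This is only a sketch: the service name execution is an assumption carried over from the generic example, and you delete the file again once pruning completes:

```yaml
# docker-compose.override.yml -- temporary; remove after pruning completes
services:
  execution:
    # Appends "snapshot prune-state" in place of the usual run arguments
    command: snapshot prune-state
    # Prevent compose from restarting the one-shot prune when it exits
    restart: "no"
```

With this file in place, docker-compose up -d execution starts the prune; afterwards, delete the override and bring the service up normally.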
@yorickdowne

Stop Geth
Run geth --datadir THEDATADIR snapshot prune-state
Start Geth when finished

THEDATADIR is whatever you're currently telling Geth to use.

@mfiumara

mfiumara commented Jan 2, 2022

I got 43GB left. Is this enough for pruning? If not, is it safer to completely remove my geth chain and resync from the start?

@yorickdowne

Should be. Worst case it fails and you start over - it only takes 4-5 hours for the pruning process.

@mfiumara

mfiumara commented Jan 2, 2022

Thought the same. It has been running now for approx. 5 hours and I'm at the compacting stage; monitoring the output of df, it seems my available disk space is increasing now, so I think all's well.

@yorickdowne

What's the lowest it went to? We may be able to dial in the space guidance in this gist a little better.

@anthonyyim

Thanks for writing this @yorickdowne. I followed Somer Esat's guide and then followed your instructions (went with option A) and everything went without a hitch.

It ended up taking me about 6hrs for the pruning to complete. It only freed about 150GB of space (out of 800GB of free space I had), but it's enough runway for me to not have to worry about it for a while.

@salbright2192

salbright2192 commented Jul 24, 2022

Geth version 1.10.20
Prune initiated with tmux
Error message 'ERROR[07-24|15:43:26.259] Error in block freeze operation err="block receipts missing, can't freeze block 14222575"'
Seems to progress up till the compaction phase, then this error pops up over and over.

@yorickdowne

That sounds like a corrupted Geth DB, @salbright2192 . You may need to removedb (or just rm the contents of --datadir) and then sync Geth from scratch.

@joeytwiddle

joeytwiddle commented Aug 3, 2022

I think this gist should mention the disadvantages with pruning. Are there any?

Perhaps we cannot look up old transactions? Or we can, but it's slower to respond, because they are not cached?

@Sajjon

Sajjon commented Aug 13, 2022

How much disk spaces have you managed to get back after successful pruning?

I'm running geth 1.10.21 on a Mac Mini M1 (unable to install more SSD...), geth has eaten up 1.5 TB of my 2 TB and probably need to free some disk space soon.

@yorickdowne

There is no disadvantage to pruning. Lookups aren’t slower, and all lookups still work.

@yorickdowne

Pruning Geth brings it down to around the original size. 650 to 700 GiB.

@daron4ever

When I start prune, should I need to include my custom flags (--syncmode "snap" --cache 32000 --rpc.txfeecap 0........)?

@yorickdowne

Those are not needed; they also shouldn't hurt.

@daron4ever

Those are not needed; they also shouldn't hurt.

Thanks for the info.

@n-kutsev

Can you please explain why I need tmux for option A? If I do this remotely through SSH and don't need to use the machine in the next few hours, may I do it without tmux? Is that right?

@yorickdowne

Screen, tmux, nohup, anything that makes sure that if/when your SSH disconnects, the prune keeps running

@jhfnetboy

Thanks Yorick!
I pruned again today and am waiting for the state improvement: ethereum/go-ethereum#25390.

@hydrodan

For the record, 37 GB is too little space now. Ran a prune and it failed about 5 hours in with "no space left on device".

@yorickdowne

Good to know, thank you!

@nockuno

nockuno commented Apr 23, 2023

Option A with Somer Esat's guide: sudo -u geth geth --datadir /var/lib/geth snapshot prune-state

@alex-miller-0

In case someone is deciding whether or not to prune right now, be advised that it took me much longer than the estimated 4-6 hours. For reference, I am running a NUC 12 with i7 CPU, 32 GB RAM, 2TB SSD mounted with noatime, so not exactly resource constrained here.

  • Time spent: 14.5h
  • Space recovered: 275GB

@yorickdowne

Thanks! Pruning is so rare that it’s absolutely possible this takes longer in 2023 than it did in 2022.

@PanosChtz

Pruning took about 16 hours to complete on my RPi4 (after 5 months of continuous geth running)

@Allen-yan

Pruning took about 16 hours to complete on my RPi4 (after 5 months of continuous geth running)

16 hours..
Maybe it's faster to sync from scratch with snap mode than to prune.

@yorickdowne

3 hours on x64 with "mainstream" NVMe, recently.

But rejoice: Pruning should soon be a thing of the past. PBSS is now in master, and it should release with 1.13.0 in a few months.

@hanik244

Win10, 2 TB drive with only 3 GB left. What should I do?

@chong-he

Win10, 2 TB drive with only 3 GB left. What should I do?

You can remove the consensus client's database, then proceed with pruning, or update to the latest Geth and use the online pruning feature. Then sync the consensus client with checkpoint sync.

@DZDomi

DZDomi commented Jan 17, 2024

Latest Intel NUC 12 with i5-1240P, 32 GB of RAM and 2 TB Samsung 980 Pro NVME:

State pruning successful                 pruned=301.19GiB elapsed=3h42m34.156s

Geth had been running for about half a year; the prune brought it down from 1.2 TB to 905 GB.
