Best practices for maintaining & running Substrate-based (para)chains (WIP)
- Ensure all of your systems are kept up-to-date, especially with security updates.
- Ensure you can fully bootstrap a new system from scratch easily if needed.
- Upload archive chain backups frequently. Potentially make these available to the community down the road.
- Have database recovery methods in place.
- Use infrastructure as code. Never modify anything manually on your servers.
- Ensure you have a monitoring stack set up WITH alerts (Alertmanager, Grafana alerts, bots, etc.).
- Keep alerts actionable, otherwise they become noise.
- If a service crashes, make sure it restarts automatically (e.g. via systemd or Kubernetes).
- Have a dedicated RPC node for developers to use.
- Have a dedicated archive node.
- Test new binaries periodically during development on a testnet.
- Run your own telemetry instance.
- Except on dedicated RPC nodes, bind RPC ports to localhost only! libp2p should be the only thing that needs to listen publicly on your nodes.
- Do not publicly expose the RPC nodes your apps rely on; reach them only over an internal VPN/overlay network.
- Use a VPN/VPC or other overlay network to send metrics from your servers to Prometheus.
- Enforce a strict set of user accounts, and keep their authorized SSH keys at a path only root can modify.
- Deploy a hardened SSH configuration.
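The "RPC on localhost only" rule above is easy to check automatically. Here is a minimal sketch, not a definitive tool: it assumes the RPC ports are 9933 and 9944 and the p2p port is 30333 (common Substrate defaults, so adjust them to your setup), and that you already have the listening sockets as (host, port) pairs, for example parsed from `ss -ltn`. The names `RPC_PORTS` and `find_exposed_rpc` are made up for illustration.

```python
import ipaddress

# Assumption: common Substrate default RPC ports; change to match your nodes.
RPC_PORTS = {9933, 9944}


def is_loopback(host):
    """Return True only for loopback addresses such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Wildcards like "*" or hostnames cannot be verified, so treat as exposed.
        return False


def find_exposed_rpc(listeners, is_rpc_node=False):
    """Return the (host, port) pairs where an RPC port is bound off-localhost.

    listeners: iterable of (host, port) pairs for listening TCP sockets.
    is_rpc_node: dedicated RPC nodes are allowed to bind RPC elsewhere.
    """
    if is_rpc_node:
        return []
    return [
        (host, port)
        for host, port in listeners
        if port in RPC_PORTS and not is_loopback(host)
    ]


if __name__ == "__main__":
    # Hypothetical validator: p2p is public (fine), one RPC port is not (bad).
    sockets = [("0.0.0.0", 30333), ("127.0.0.1", 9944), ("0.0.0.0", 9933)]
    print(find_exposed_rpc(sockets))  # [('0.0.0.0', 9933)]
```

Run from a cron job or as a monitoring check, a non-empty result is an actionable alert: that node needs its RPC binding fixed in your infrastructure code.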
Hey there, thanks for the above checklist and the sub0 talk.
Here are some likely annoying "show me the way" questions!
Do you perhaps have any guides available for any of these points?
On point 2: what's the best way to set up a new system from scratch?
I have a network with 3 validators, all required to finalize blocks. I use Docker, but I find that sometimes they don't all peer, so I either need to modify the Docker network/bridge or, more simply, launch one on a different server, which increases the likelihood that it peers.
Do I need all 3 validators to peer with each other before I can expect blocks to be produced? Or can I go ahead and insert the session keys if 2 of the validators are each peering with only 1 node, and that 1 node is peering with both of them?
Do you have a guide on how to set up a Kubernetes cluster in the Substrate context, if that is what is required to set up and manage a system more easily?
On this point:
Do you have any guides for how to set this up in a Docker overlay setting?