scottsb/resetting-csync2-cluster.md

## resetting-csync2-cluster.md

      
Guide to Resetting a csync2 Cluster

Introduction

These are possible steps to reset a csync2 cluster that has been seriously fubared.
This is an apocalyptic approach and should only be used when more surgical fixes (like
correcting an individual conflict) aren't workable.

Use Cases

This will solve errors like:
    ERROR from peer 10.0.0.1: File is also marked dirty here!

And, depending on the cause, it may fix errors like:
    Database backend is exceedingly busy => Terminating (requesting retry).
    ERROR from peer 10.0.0.1: Connection closed.

Important Notes

References to lsyncd assume you're triggering csync2 via lsyncd
(as with puppet-clustersync), but they can
be safely ignored if that doesn't apply to you.

As written, these commands do not specify a csync2 config file. If you are using csync2
with a named config, each csync2 command should include -C [config-name].
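
If you're using a named config, one way to avoid forgetting the flag is a small shell wrapper. This is only a sketch: the CSYNC2_CONFIG variable and the csync2_cmd function are made-up names, not part of csync2, and the function just prints the command so you can review it before running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of csync2): prints the csync2 command line,
# adding "-C <name>" when CSYNC2_CONFIG is set. Nothing is executed.
csync2_cmd() {
    if [ -n "${CSYNC2_CONFIG:-}" ]; then
        echo "csync2 -C ${CSYNC2_CONFIG} $*"
    else
        echo "csync2 $*"
    fi
}

# Example: print the Phase 1 database update command for a named config.
# "myconfig" is a placeholder name.
CSYNC2_CONFIG="myconfig"
csync2_cmd -cIr /
```

With CSYNC2_CONFIG unset, the same call prints the plain command exactly as written in this guide.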

Instructions

Phase 1


On all servers, forcibly disable all the automatic syncing:
    service lsyncd stop
    killall csync2


Choose one server that's closest to the correct state and manually adjust files as
needed to make it the authoritative master.


On all servers, update the csync2 database to match the filesystem, but without marking
anything as needing to be synced out to other servers:
    csync2 -cIr /
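
Since the Phase 1 commands run on every server, a loop over SSH can save some typing. A rough sketch, assuming you have root SSH access to each node; the host names and the run_on_all function are placeholders, and it defaults to a dry run that only prints what would be executed.

```shell
#!/usr/bin/env bash
# Hypothetical host list; replace with the real cluster members.
SERVERS="web1 web2 web3"
# DRY_RUN=1 (the default here) only prints what would be run.
DRY_RUN="${DRY_RUN:-1}"

# Runs (or, in dry-run mode, prints) the given command on every server.
# Assumes SSH access with enough privileges to run csync2 on each host.
run_on_all() {
    local host
    for host in $SERVERS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ssh $host $*"
        else
            ssh "$host" "$@"
        fi
    done
}

# Example: the database update step from Phase 1.
run_on_all csync2 -cIr /
```

Set DRY_RUN=0 only after the printed commands look right. The manual cleanup of the authoritative server still has to be done by hand.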


Phase 2

Run all commands in this phase from the authoritative server selected in the last phase.


Find all differences between authoritative server and remotes and mark for sync:
    csync2 -TUXI


Reset database to force current server to be winner on any conflicts:
    csync2 -fr /


Run a sync to all other servers:
    csync2 -xr /
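
Because running Phase 2 from the wrong server would push the wrong state everywhere, it may be worth wrapping the three commands in a guard. A minimal sketch: AUTHORITATIVE_HOST, on_authoritative, step, and phase2 are all made-up names, "web1" is a placeholder, and the script defaults to a dry run that only prints the commands. It does not interpret csync2's exit codes.

```shell
#!/usr/bin/env bash
# Hypothetical: short hostname of the server chosen as authoritative in Phase 1.
AUTHORITATIVE_HOST="${AUTHORITATIVE_HOST:-web1}"
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Succeeds only when the given hostname is the authoritative one.
on_authoritative() {
    [ "$1" = "$AUTHORITATIVE_HOST" ]
}

# Prints the command, then runs it unless in dry-run mode.
step() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Runs the three Phase 2 commands in order, but only on the authoritative host.
phase2() {
    if ! on_authoritative "$1"; then
        echo "refusing: $1 is not $AUTHORITATIVE_HOST" >&2
        return 1
    fi
    step csync2 -TUXI   # find differences and mark them for sync
    step csync2 -fr /   # make this server the winner on conflicts
    step csync2 -xr /   # sync out to the other servers
}

phase2 "$(hostname -s)" || true
```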


Phase 3


On all servers, confirm that all files are in sync (if successful, this lists no files):
    csync2 -T


On all servers, restart automatic sync service:
    service lsyncd start
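
The "lists no files" check from the first step of this phase can be scripted, so that lsyncd is only restarted once a server is clean. A minimal sketch, assuming any remaining differences show up on csync2's standard output; in_sync is a made-up helper name, and the demonstration uses canned input rather than a live cluster.

```shell
#!/usr/bin/env bash
# Reads "csync2 -T" output on stdin and succeeds only if it contains no
# non-blank lines (the criterion from this guide: no files listed).
in_sync() {
    ! grep -q '[^[:space:]]'
}

# Demonstration with empty canned input instead of a real csync2 run.
if printf '' | in_sync; then
    echo "in sync"
else
    echo "differences remain"
fi
```

On a real server this could be used as `csync2 -T | in_sync && service lsyncd start`.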