# How To: Clone A Cluster
## Summary

This gist walks through the procedure of altering a secondary cluster to use the ring configuration of a primary cluster. This configuration allows partitions to be transferred between clusters using any file transfer utility.
## Restrictions

- Supported Riak versions: 1.2
- Both clusters must have the same ring size and node count.
- Nodes must be down while data is being transferred.
## Instructions

- Stop all nodes on the secondary cluster.
- Back up the configuration files on each node of the secondary cluster: `mv PATH_TO_ETC PATH_TO_ETC.bak`
- Back up the ring files on each node of the secondary cluster: `mv PATH_TO_DATA/ring PATH_TO_DATA/ring.bak`
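The two backup steps above can be combined into a small helper run on each secondary node. This is a sketch: the `backup_node` function name is hypothetical, and `PATH_TO_ETC`/`PATH_TO_DATA` stand in for your installation's actual etc and data paths.

```shell
# Sketch: back up the configuration directory and ring files on one node.
# The function name and example paths are illustrative, not part of Riak.
backup_node() {
    etc_dir=$1    # e.g. /etc/riak
    data_dir=$2   # e.g. /var/lib/riak
    mv "$etc_dir" "$etc_dir.bak"
    mv "$data_dir/ring" "$data_dir/ring.bak"
}

# Usage (paths are examples only):
# backup_node /etc/riak /var/lib/riak
```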
- Copy the configuration and ring files from one node on the primary cluster to one node on the secondary cluster. Repeat until every node on the secondary cluster has the configuration and ring files from a node on the primary cluster:

  ```
  cp cluster1node1/PATH_TO_ETC cluster2node1/PATH_TO_ETC
  cp cluster1node1/PATH_TO_DATA/ring cluster2node1/PATH_TO_DATA/ring
  cp cluster1node2/PATH_TO_ETC cluster2node2/PATH_TO_ETC
  cp cluster1node2/PATH_TO_DATA/ring cluster2node2/PATH_TO_DATA/ring
  # repeat for all nodes
  ```
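The per-node copies follow a fixed pattern, so the commands can be generated in a loop. A sketch; the `cluster1nodeN`/`cluster2nodeN` host naming and the use of `scp` are assumptions, and `PATH_TO_ETC`/`PATH_TO_DATA` remain placeholders:

```shell
# Sketch: print the copy commands for N node pairs; review, then pipe to sh.
# Hostnames and scp usage are illustrative assumptions.
copy_cmds() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        echo "scp -r cluster1node$i:PATH_TO_ETC cluster2node$i:PATH_TO_ETC"
        echo "scp -r cluster1node$i:PATH_TO_DATA/ring cluster2node$i:PATH_TO_DATA/ring"
        i=$((i + 1))
    done
}

# Example: copy_cmds 3 | sh
```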
- Start all nodes on the secondary cluster.
- Check `riak-admin member_status` on each cluster. The output should match. If the output does not match, restart the procedure from the first step.
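One way to compare the clusters is to capture the `riak-admin member_status` output from a node in each cluster and diff the two. The helper below is a sketch; the function name and file names are hypothetical.

```shell
# Sketch: compare member_status output captured from each cluster, e.g.
#   riak-admin member_status > cluster1.status   (on a primary node)
#   riak-admin member_status > cluster2.status   (on a secondary node)
compare_status() {
    if diff -q "$1" "$2" >/dev/null; then
        echo "member_status matches"
    else
        echo "MISMATCH: restart the procedure"
    fi
}

# Example: compare_status cluster1.status cluster2.status
```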
- Note the claimant node on the secondary cluster using `riak-admin ring_status`. The Ring Ready flag should report `true`.
- Stop all nodes on the secondary cluster except the claimant node.
- From the claimant, run `riak-admin down <nodename>` for each node currently stopped in the cluster.
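Marking each stopped node down can be scripted from the claimant. A sketch with hypothetical node names; setting `DRY_RUN=1` only prints the commands so you can review them before executing.

```shell
# Sketch: run riak-admin down for each stopped node.
# DRY_RUN=1 prints the commands instead of executing them.
down_nodes() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "riak-admin down $node"
        else
            riak-admin down "$node"
        fi
    done
}

# Example (hypothetical node names):
# down_nodes riak@node2.example.com riak@node3.example.com
```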
- Verify the correct topology with `riak-admin member_status`. All nodes except the claimant should be marked down.
- Delete the ring files on all downed nodes: `rm -rf PATH_TO_DATA/ring`
- Restore the original configuration files on all downed nodes: `mv PATH_TO_ETC.bak PATH_TO_ETC`
- Start the downed nodes, then issue `riak-admin cluster join <claimant_nodename>` on every node with a restored configuration.
- Issue `riak-admin cluster force-replace <primary_nodename> <secondary_nodename>` for each newly joining node. Note the replacement mapping: the primary-cluster nodename being replaced maps directly to the secondary-cluster nodename being added. Data can only be transferred across clusters between these paired nodes, as their partition ownership is the same.
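Because the mapping is strictly one-to-one, the force-replace commands can be generated from the paired nodename lists. A sketch with hypothetical nodenames:

```shell
# Sketch: emit a force-replace command for each primary/secondary pair.
# Arguments alternate: primary1 secondary1 primary2 secondary2 ...
force_replace_cmds() {
    while [ "$#" -ge 2 ]; do
        echo "riak-admin cluster force-replace $1 $2"
        shift 2
    done
}

# Example (hypothetical nodenames); review the output, then pipe to sh:
# force_replace_cmds riak@c1n1 riak@c2n1 riak@c1n2 riak@c2n2
```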
- Issue `riak-admin cluster plan` on any node to confirm the ownership changes. All nodenames listed should match secondary-cluster names, except the claimant node.
- Issue `riak-admin cluster commit` to confirm the ownership changes.
- Wait until `riak-admin member_status` reports that all ownership changes are finished. All nodes will be marked valid, and nothing will be displayed in the pending column.
- Issue `riak-admin ring_status` and verify that the Ring Ready field returns `true`.
- Stop the claimant node, then issue `riak-admin down <claimant_nodename>` from any other node.
- Delete the ring file on the claimant node: `rm -rf PATH_TO_DATA/ring`
- Restore its original configuration files: `mv PATH_TO_ETC.bak PATH_TO_ETC`
- Start the node, then issue `riak-admin cluster join <nodename>`, where `<nodename>` is any node in the secondary cluster.
- Issue `riak-admin cluster force-replace <old_claimant_nodename> <new_secondary_nodename>`. Again, note the primary nodename and its corresponding secondary nodename, as these nodes will be paired during file transfer.
- Issue `riak-admin cluster commit`.
- Verify correct ownership with `riak-admin member_status`. All nodenames should match secondary-cluster nodenames. Verify that `riak-admin ring_status` returns Ring Ready `true` and that no ownership changes are pending.
## Transferring Data
Partition ownership on the secondary cluster now maps directly to the primary cluster, based on the primary-to-secondary nodename relationship established by the `riak-admin cluster force-replace <primary_nodename> <secondary_nodename>` commands issued above.
While the secondary cluster is down, we can copy the data directory from each primary node to the corresponding secondary node. If you choose to copy the entire Riak data directory, be sure to exclude the ring directory from the transfer, as we want to preserve the ring files on the secondary cluster.
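One way to copy a data directory while keeping the secondary's ring files intact is to exclude the `ring` subdirectory during the transfer. The sketch below copies between two local paths with `tar`; for a remote transfer you could pipe the first `tar` through `ssh`, or use `rsync` with `--exclude 'ring/'`. The function name and example paths are hypothetical.

```shell
# Sketch: copy SRC to DST, skipping the ring subdirectory so the
# secondary cluster's ring files are preserved.
copy_data_dir() {
    src=$1; dst=$2
    mkdir -p "$dst"
    (cd "$src" && tar --exclude='./ring' -cf - .) | (cd "$dst" && tar -xf -)
}

# Example (hypothetical paths):
# copy_data_dir /mnt/primary/riak/data /var/lib/riak
```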
Once the data transfer is complete, Riak can be restarted on the secondary cluster, and all data should be available via any supported API call. You can also set up Multi-Datacenter Replication between these clusters to reconcile any changes in data via fullsync and real-time replication.