#How To: Clone A Cluster

##Summary

This gist walks through the procedure of rebuilding a secondary cluster using the ring configuration of a primary cluster. Once the two rings match, partitions can be transferred between the clusters using any file transfer utility.

##Restrictions

  1. Supported Riak Versions: 1.2

  2. Both clusters must have the same ring size and node count.

  3. Nodes must be down while data is being transferred.

##Instructions

  1. Stop all nodes on secondary cluster

  2. Backup and move configuration files on secondary cluster

    mv PATH_TO_ETC PATH_TO_ETC.bak

  3. Backup and move ring files on secondary cluster

    mv PATH_TO_DATA/ring PATH_TO_DATA/ring.bak

  4. Copy ring and configuration files from one node on the primary cluster to one node on the secondary cluster. Repeat until every node on the secondary cluster has the configuration and ring files from a node on the primary cluster

    cp cluster1node1/PATH_TO_ETC cluster2node1/PATH_TO_ETC
    cp cluster1node1/PATH_TO_DATA/ring cluster2node1/PATH_TO_DATA/ring
    cp cluster1node2/PATH_TO_ETC cluster2node2/PATH_TO_ETC
    cp cluster1node2/PATH_TO_DATA/ring cluster2node2/PATH_TO_DATA/ring
    # repeat for every remaining node pair
    
  5. Start all nodes on secondary cluster

  6. Check riak-admin member_status on each cluster. The output should match. If the output does not match, revert to step 1.

  7. Note the claimant node on the secondary cluster using riak-admin ring_status. The Ring Ready flag should report true.

  8. Stop all nodes on secondary cluster except the claimant node

  9. From the claimant, run riak-admin down <nodename> for each node currently stopped in the cluster

  10. Verify correct topology with riak-admin member_status. All nodes except the claimant should be marked down.

  11. Delete ring files on all downed nodes

    rm -rf PATH_TO_DATA/ring

  12. Replace configuration files on all downed nodes

    mv PATH_TO_ETC.bak PATH_TO_ETC

  13. Start the nodes. Issue riak-admin cluster join <claimant_nodename> on each node whose configuration was just restored.

  14. Issue riak-admin cluster force-replace <primary_nodename> <secondary_nodename> for each newly joined node. Note each replacement pairing: the primary cluster nodename being replaced maps directly to the secondary cluster nodename being added. Data can only be transferred across clusters between these paired nodes, as their partition ownership is identical (see the consolidated sketch after this list).

  15. Issue riak-admin cluster plan on any node to confirm the ownership changes. All nodenames listed should match secondary cluster nodenames except the claimant node.

  16. Issue riak-admin cluster commit to confirm the ownership changes.

  17. Wait until riak-admin member_status reports that all ownership changes are finished. All nodes will be marked valid, and the pending column will be empty.

  18. Issue riak-admin ring_status and verify that the Ring Ready field reports true.

  19. Stop the claimant node, then issue riak-admin down <claimant_nodename> from any other node.

  20. Delete the ring files on the claimant node

    rm -rf PATH_TO_DATA/ring

  21. Replace the configuration files on the claimant node

    mv PATH_TO_ETC.bak PATH_TO_ETC

  22. Start the node. Issue riak-admin cluster join <nodename>, where <nodename> is any node in the secondary cluster.

  23. Issue riak-admin cluster force-replace <old_claimant_nodename> <new_secondary_nodename>. Again, note the primary nodename and its corresponding secondary nodename, as these nodes will be paired during file transfer.

  24. Issue riak-admin cluster plan to review the change, then riak-admin cluster commit.

  25. Verify correct ownership with riak-admin member_status. All nodenames should match secondary cluster nodenames. Verify that riak-admin ring_status returns Ring Ready true and that no ownership changes are pending.
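
For reference, here is a consolidated sketch of steps 13 through 16 for a single secondary node. The nodenames below are hypothetical placeholders; substitute your own.

    # On the newly restarted secondary node, join it to the claimant:
    riak-admin cluster join riak@claimant.cluster2.example.com

    # Stage the replacement that maps a primary nodename onto this node's name;
    # remember this pairing, as it determines which primary node's data you copy later:
    riak-admin cluster force-replace riak@node1.cluster1.example.com riak@node1.cluster2.example.com

    # Review the staged changes, then commit them:
    riak-admin cluster plan
    riak-admin cluster commit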

##Transferring Data

Partition ownership on the secondary cluster now maps directly to that of the primary cluster, based on the primary-to-secondary nodename pairings established by the riak-admin cluster force-replace <primary_nodename> <secondary_nodename> commands issued above.
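
To sanity-check the pairing before moving any data, compare the ring ownership each cluster reports; the claim percentages should line up one-to-one between paired nodes. The hostnames below are hypothetical and assume SSH access to one node in each cluster.

    ssh cluster1node1 riak-admin member_status
    ssh cluster2node1 riak-admin member_status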

While the secondary cluster is down, we can copy the data directory from each primary node to its corresponding secondary node. If you choose to copy the entire Riak data directory, be sure to exclude the ring directory from the transfer, as we want to preserve the ring files on the secondary cluster.
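
As a minimal sketch, using rsync with the placeholder paths above and assuming SSH access from each primary node to its paired secondary node (hostnames are hypothetical):

    # Run on cluster1node1 while the secondary cluster is stopped.
    # --exclude keeps the secondary cluster's ring directory intact.
    rsync -av --exclude='ring' PATH_TO_DATA/ cluster2node1:PATH_TO_DATA/
    # Repeat on each primary node, pushing to its paired secondary node.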

Once the data transfer is complete, Riak can be restarted on the secondary cluster and all data should be available via any supported API call. You can also set up multi-datacenter replication between these clusters to reconcile any changes in data via fullsync and real-time replication.
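
If you are running Riak Enterprise, the replication setup looks roughly like the sketch below. The IPs, ports, and nodenames are hypothetical placeholders; consult the replication documentation for your release for the exact procedure.

    # On a node in the primary cluster, start a replication listener:
    riak-repl add-listener riak@node1.cluster1.example.com 10.0.1.1 9010

    # On a node in the secondary cluster, add the primary's listener as a site:
    riak-repl add-site 10.0.1.1 9010 cluster1

    # Trigger a full synchronization to reconcile existing data:
    riak-repl start-fullsync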
