@irvingpop
Last active July 16, 2021 10:18
Cloning a Chef Server 12 installation

Customer Scenario

A customer has a Chef Server 12 (HA - DRBD) in Production. They want to test an in-place upgrade (or maintenance) using their current OPC Production data and config. This gives us a good chance to make corrections if we find that their data is too broken for the migrations to handle, and gives the customer experience in managing the upgrade in Production.

The sequence of events will broadly be these:

  • Install the same version of Chef Server on the target HA Test cluster
  • Restore data from Production instance backup (LVM snapshot or full-stop backup)
  • Test

The Process

  1. This process assumes a clean target system: nothing in /etc/opscode, /opt/opscode or /var/opt/opscode, and no running processes that match ps -ef | grep opscode
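The clean-target assumption can be checked before installing anything. The following is a minimal sketch, not part of Chef Server: the function names and the optional ROOT argument (used only to try the check against a scratch directory) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; names and the ROOT argument are illustrative.

# Succeeds only if none of the Chef Server directories under ROOT contain files.
# ROOT defaults to the real filesystem root. Missing or empty directories pass.
target_dirs_clean() {
    root="${1:-}"
    for dir in /etc/opscode /opt/opscode /var/opt/opscode; do
        path="${root}${dir}"
        if [ -d "$path" ] && [ -n "$(ls -A "$path" 2>/dev/null)" ]; then
            echo "not clean: $path contains files" >&2
            return 1
        fi
    done
    return 0
}

# Succeeds only if no running process mentions "opscode".
# The [o] bracket keeps grep from matching its own command line.
no_opscode_processes() {
    if ps -ef | grep '[o]pscode' >/dev/null; then
        echo "not clean: opscode processes are still running" >&2
        return 1
    fi
    return 0
}
```

On the target system this could be run as `target_dirs_clean && no_opscode_processes && echo "target is clean"`.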

Create the backup on the production cluster

  1. Create data backup from Production source (the bootstrap backend, which should also be the ACTIVE backend)
     A. If the server is 12.2.0 or greater, use the chef-server-ctl backup tool and create an OFFLINE backup
     B. Otherwise, you will need to create a manual backup:
        1. On the standby backend: chef-server-ctl stop keepalived
        2. Perform the rest of the steps on the active backend
        3. chef-server-ctl stop # stops all services except keepalived
        4. Ensure keepalived still considers this server to be active and keeps the DRBD volume mounted, but all services except keepalived are stopped: chef-server-ctl ha-status, chef-server-ctl status and mount | grep drbd
        5. tar -czf /tmp/chefbackup-destination.tar.gz /etc/opscode /etc/opscode-reporting /etc/opscode-manage /var/opt/opscode/drbd/data
        6. Copy the tarball off of the system and restart all services that were stopped
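The manual tar step fails noisily if an add-on directory such as /etc/opscode-reporting does not exist on the source. The sketch below is one way to guard against that, assuming it is acceptable to skip missing paths. The function names and the optional ROOT argument are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical helpers for the manual backup; not part of Chef Server.

# Paths from the manual backup step, written relative to /.
backup_paths="etc/opscode etc/opscode-reporting etc/opscode-manage var/opt/opscode/drbd/data"

# Reads `mount` output on stdin and succeeds if a DRBD device appears in it.
drbd_mounted() {
    grep -q drbd
}

# create_backup_tarball OUT [ROOT]
# Archives every backup path that exists under ROOT (default /) into OUT.
create_backup_tarball() {
    out="$1"
    root="${2:-/}"
    existing=""
    for p in $backup_paths; do
        if [ -e "$root/$p" ]; then
            existing="$existing $p"
        else
            echo "skipping missing path: /$p" >&2
        fi
    done
    if [ -z "$existing" ]; then
        echo "nothing to back up" >&2
        return 1
    fi
    # Members are stored relative to ROOT, so the archive restores with -C /.
    # $existing is left unquoted on purpose so it splits into separate paths.
    tar -czf "$out" -C "$root" $existing
}
```

On the active backend, with services stopped, this might be used as `mount | drbd_mounted && create_backup_tarball /tmp/chefbackup-destination.tar.gz`.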

On the target system: Install and Verify CS12

  1. Install CS12 on the backends and frontends as described in the documentation and validate the system is working correctly
  2. Restore the Production cluster data
     A. If your backup data was created by a Chef Server 12.10.0 or greater version using chef-server-ctl backup:
        1. Copy the backup tarball onto the primary/bootstrap backend
        2. chef-server-ctl restore /path/to/backup.tar.gz
     B. If your backup data was created by an older CS12 cluster, prepare the cluster for restore with the same steps as the manual backup (option B in the backup section above):
        1. On the standby backend: chef-server-ctl stop keepalived
        2. Perform the rest of the steps on the active backend
        3. chef-server-ctl stop # stops all services except keepalived
        4. Ensure keepalived still considers this server to be active and keeps the DRBD volume mounted, but all services except keepalived are stopped: chef-server-ctl ha-status, chef-server-ctl status and mount | grep drbd
        5. Remove the DRBD data: rm -rf /var/opt/opscode/drbd/data/*
        6. Restore the backup tarball created on Production onto the current bootstrap Primary target system:
           tar -xzvf chefbackup-destination.tar.gz -C /
        7. chef-server-ctl reconfigure && opscode-manage-ctl reconfigure
        8. chef-server-ctl start
  3. Copy the configuration folders (/etc/opscode, /etc/opscode-manage, /etc/opscode-reporting) to the frontends and secondary backend
  4. Reconfigure the frontends and TEST
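The manual restore on the active backend removes data, so it helps to rehearse the sequence before running it. The following sketch wraps the commands from the steps above in a dry-run helper. The `run` and `restore_manual_backup` names and the `DRY_RUN` variable are assumptions for illustration. The sketch does not cover stopping keepalived on the standby backend or verifying the DRBD mount, which still need to be done by hand first.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper for the manual restore; not part of Chef Server.

# DRY_RUN defaults to 1 so the sketch only prints what it would execute.
DRY_RUN="${DRY_RUN:-1}"

# Prints the command when DRY_RUN=1, otherwise executes it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# restore_manual_backup TARBALL
# Runs the restore steps in order and stops at the first failure.
restore_manual_backup() {
    tarball="$1"
    data_dir="/var/opt/opscode/drbd/data"
    if [ -z "$tarball" ]; then
        echo "usage: restore_manual_backup TARBALL" >&2
        return 1
    fi
    run chef-server-ctl stop || return 1
    # sh -c defers the glob so it expands only when the command really runs.
    run sh -c "rm -rf $data_dir/*" || return 1
    run tar -xzf "$tarball" -C / || return 1
    run chef-server-ctl reconfigure || return 1
    run opscode-manage-ctl reconfigure || return 1
    run chef-server-ctl start
}
```

Calling `restore_manual_backup /tmp/chefbackup-destination.tar.gz` prints the six commands. Setting `DRY_RUN=0` first would execute them for real, which should only happen on the test cluster.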

TEST

  1. On each Chef Server, edit the /etc/hosts file
     1. Determine the api_fqdn value (ex: chef.mycompany.com)
     2. Determine the primary IP address of the given node (ex: 10.10.10.5)
     3. Alias the api_fqdn to the local host by adding an entry to the /etc/hosts file like so:
        10.10.10.5 chef.mycompany.com
     NOTE: It is safe to leave this entry in place permanently; only the test suite relies on it
  2. Test the system:
     1. chef-server-ctl test
     2. curl -k https://localhost/_status
     3. Check DRBD status to be sure we are replicating with the secondary
     4. Test a selection of orgs and operations on those orgs: client lists, node lists, chef-client runs, group memberships, and any other important or desired tests.
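Because the /etc/hosts edit is repeated on every server, it is easy to add the alias twice. A minimal sketch of an idempotent version follows. The `ensure_hosts_entry` name and its optional file argument are hypothetical, and the IP address and FQDN are the example values from the steps above.

```shell
#!/bin/sh
# Hypothetical helper; not part of Chef Server.

# ensure_hosts_entry IP FQDN [HOSTS_FILE]
# Appends "IP FQDN" to HOSTS_FILE (default /etc/hosts) unless FQDN is already
# listed as a hostname on a non-comment line.
ensure_hosts_entry() {
    ip="$1"
    fqdn="$2"
    hosts_file="${3:-/etc/hosts}"
    if awk -v h="$fqdn" '
        !/^[[:space:]]*#/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break      # ignore trailing comments
                if ($i == h) found = 1
            }
        }
        END { exit !found }
    ' "$hosts_file" 2>/dev/null; then
        echo "$fqdn already present in $hosts_file"
        return 0
    fi
    printf '%s %s\n' "$ip" "$fqdn" >> "$hosts_file"
}
```

For example, `ensure_hosts_entry 10.10.10.5 chef.mycompany.com` run as root adds the alias once and leaves the file unchanged on later runs.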
@ShahSonali

I wanted to know what happens to a Chef Server that is not active if I stop the controller process (i.e. chef-server-ctl) and start it again. Will it become active?
