Skip to content

Instantly share code, notes, and snippets.

@sneal
Last active September 8, 2023 18:51
Show Gist options
  • Save sneal/f99832d666555a9f0344f4b52c9b66b1 to your computer and use it in GitHub Desktop.
Save sneal/f99832d666555a9f0344f4b52c9b66b1 to your computer and use it in GitHub Desktop.
TAS External MySQL MIgration -> Internal TAS MySQL

Migration to TAS Internal MySQL

Migrating from an externally hosted MySQL DB server such as RDS or Oracle MySQL to the internally managed TAS HA cluster MySQL requires some extra steps to avoid application downtime. To avoid unnecessary downtime and churn we want to avoid having TAS system components ever see an empty MySQL database.

At a high level the overall procedure is:

  1. create single internal mysql instance
  2. bosh stop cloud_controller
  3. mysqldump
  4. mysql (restore)
  5. reconfigure databases in TAS
  6. apply changes
  7. bosh start cloud_controller

Except for the first step, there will be some downtime during the execution of the migration in the CF control plane (cf push etc). Apps will remain running without issue during the migration, however scaling will be disabled.

Pre-Migration - Create MySQL VMs

This section should be completed safely before the migration window which will incur downtime for the CF control plane (cf push, not apps).

Since we have previously selected to use an external MySQL database for TAS, the internal MySQL resource config is not available in the UI. To get around these restrictions and deploy an internal MySQL cluster we need to go around the UI restrictions by using the OpsMan API to change the MySQL VM instance count and hand edit the extracted TAS tile to force it to deploy the MySQL job.

First we need to modify the extracted TAS tile on the OpsMan VM.

  1. SSH into your OpsMan VM
  2. Change to the tempest-web user: sudo su tempest-web
  3. cd /var/tempest/workspaces/default/metadata
  4. Find the TAS tile yaml file grep -r '^name: cf' . Can there be more than one?
  5. Edit the yaml file and replace pxc_enabled: (( .properties.system_database.selected_option.parsed_manifest(enable_pxc) )) with a hardcoded pxc_enabled: true.
  6. Save the file. Changes will take effect immediately.

With the tile pxc_enabled change complete, we now need to force create an internal TAS MySQL VM. Since our TAS tile is currently configured to use an external MySQL database we can only achieve changing the MySQL VM instance count via the OpsMan API. From your workstation with the om CLI installed, create a file named mysql.yml with the following content:

product-name: cf
resource-config:
  mysql:
    instance_type:
      id: automatic
    instances: 3
    persistent_disk:
      size_mb: automatic
  mysql_proxy:
    instances: 2
  mysql_monitor:
    instances: 1

You may need to change the instance type to a larger size or increase the persistent disk size from the defaults. You will need to inspect your existing external MySQL cluster to make these sizing decisions.

With the mysql.yml created we can apply the new MySQL resource config to OpsMan via the om CLI.

om configure-product -c mysql.yml

Before applying changes, validate the configuration changes are going to add our new MySQL VMs:

om bosh-diff -p cf

If the changes look correct go ahead and apply changes

om apply-changes -n cf

We should now have a cluster of MySQL VMs, MySQL Proxy VMs, and a MySQL Monitor VM deployed which we can verify by running bosh vms. You should see the MySQL VMs in the output similar to this:

mysql/7c5a8c1f-1f87-4479-8fee-cbd4739f0567          running  az1  192.168.2.49  vm-587fed91-c6df-4e37-b125-c218a26fa9b3
mysql/8cee5002-c995-461c-891c-84a4c05b0a30          running  az2  192.168.2.48  vm-c80933b7-2809-4878-9b5b-766c2b62d132
mysql/bd382415-39d7-4e60-b04d-f3d61ea0df12          running  az3  192.168.2.50  vm-84482f1c-926b-446a-b154-36b6a8b7aecd
mysql_monitor/688d8445-b77a-45bb-a6f2-1b2e226f60a2  running  az1  192.168.2.51  vm-2a643ede-5aa4-4eca-8b07-8090d50fe0d5
mysql_proxy/11cb0fb7-b329-408d-bce9-5b144a99acfc    running  az1  192.168.2.47  vm-47ce0437-168f-44ee-926c-adec19474e6d
mysql_proxy/81df4235-95a0-43a1-99f8-a2984b2fc8c8    running  az2  192.168.2.46  vm-b201680f-0ff4-4d22-a0e9-d4141d5b1160

This completes the required pre-migration steps that can be completed before the data migration in the next section.

Data Migration

Stop Cloud Controllers

During the migration we want to shutdown the CF control plane to keep users of the platform from making changes, like pushing apps. This can be accomplished simply by stopping all the cloud controller instances via bosh. Replace the cf-GUID below with your TAS BOSH deployment name:

bosh -d cf-GUID stop cloud_controller

That will leave the cloud controller VMs running in your IaaS but stop all the CF API jobs from running. Developers won't be able to use the CF CLI or Apps Manager.

Export MySQL

Using mysqldump backup or dump all of the TAS DBs from your external DB server. Below you'll find a script you can use as a starting point to export out each TAS DB. At a minimum you'll need to update the host address and set the MYSQL_PASSWORD env var.

#!/usr/bin/env bash

databases=(ccdb notifications autoscale app_usage_service routing diego account nfsvolume networkpolicyserver silk locket uaa credhub)

for db in "${databases[@]}"; do
  echo "Dumping database: $db"
  mysqldump \
    -h mysql01.example.com \
    -P 3306 \
    -u root \
    -p"$MYSQL_PASSWORD" \
    -v \
    --skip-add-locks \
    --single-transaction \
    --result-file="$HOME/mysql-bak/$db.sql" \
    "$db"
done

This will create a .sql file for each database that needs to be backed up and restored.

NOTE - We can't use BBR because the BBR backup scripts interact with the cloud controller which must be up and running for BBR to execute. Remember we stopped the cloud controllers in the previous step.

Restore MySQL

With the MySQL DBs backed up, lets copy the backups to the primary mysql instance (mysql/0) we created at the very beginning of these instructions. Replace the cf-GUID below with your TAS BOSH deployment name.

bosh -d cf-GUID scp -r "$HOME/mysql-bak" mysql/0:/tmp 

The backup sql files should now be in the /tmp directory on the primary TAS MySQL server. Now SSH into the mysql/0 instance, replacing the cf-GUID below with your TAS deployment name:

bosh -d cf-GUID ssh mysql/0 

From the mysql/0 instance, use the below script to restore each of the DBs into the internal TAS MySQL instance.

#!/usr/bin/env bash

# run from bosh ssh session on mysql/0
databases=(ccdb notifications autoscale app_usage_service routing diego account nfsvolume networkpolicyserver silk locket uaa credhub)

for db in "${databases[@]}"; do
  echo "Restoring database: $db"
  sudo mysql \
    --defaults-file=/var/vcap/jobs/pxc-mysql/config/mylogin.cnf \
    -v \
    "$db" < "/tmp/mysql-bak/$db.sql"
done

The DBs are now all restored, time to reconfigure TAS to use them.

Reconfigure Databases in TAS

Login to Operations Manager and open the TAS tile. Go to the databases tab in the TAS tile. Select the Internal databases - MySQL - Percona XtraDB Cluster radio button. Click Save.

Now navigate to the Operations Manager dashboard and click Review Pending Changes. Unselect all tiles except the TAS tile. Click the See Changes link on the TAS tile. Confirm that the only changes you see are to the following instances to reconfigure their connections to the mysql.service.cf.internal MySQL instance:

  • backup_restore
  • diego_database
  • uaa
  • cloud_controller
  • cloud_controller_worker
  • mysql_monitor
  • clock_global
  • credhub

Disable or uncheck all errands. Click Apply Changes.

NOTE - If you don't disable the errands you will end up with a failure because the Cloud Controllers are still stopped.

Once apply changes is complete your TAS instance should now be using the internal MySQL cluster and you can stop the old external database instance(s).

Re-enable Cloud Controller

With the migration effectively done we can re-enable the CF API and run smoke tests to validate our changes.

bosh -d cf-GUID start cloud_controller

Once all cloud controller instances are healthy we should run the TAS smoke tests, replacing the cf-GUID below with your TAS deployment name:

bosh -d cf-GUID run-errand smoke_tests

If the smoke tests pass, congrats! You've successfully migrated from an external MySQL server to the internally hosted TAS MySQL cluster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment