@lilmuckers
Last active September 3, 2019 14:13
AWS S3 bucket region move

AWS S3 Bucket Region Migration

This is a simple script to "automate" the migration of a bucket between regions on AWS.

The way AWS is built makes it annoyingly difficult to move an S3 bucket between regions. This is often necessary when an AWS service that reads from S3 isn't available in the region where your data lives, so moving the bucket is the best way to ensure that the service can access the data quickly and without additional S3 costs.

This is only really an issue for data-heavy applications like EMR, Athena, SageMaker, QuickSight, and suchlike.

This tool was written very quickly so I didn't have to babysit the process overnight, and could verify the results in the morning after the job had completed. It does the following:

  1. Creates a new temporary bucket in the target region
  2. Syncs the source bucket (in the source region) to the temporary bucket
  3. Deletes the source bucket
  4. Creates a new bucket in the target region with the same name as the source bucket
    • Will keep trying until AWS makes the bucket name available again
  5. Copies data from the temporary bucket to the new bucket (very fast, as both are in the target region)
  6. Deletes the temporary bucket
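
To check the outcome the next morning, one option is to ask S3 which region the bucket now lives in. The following is a minimal sketch, assuming the AWS CLI is installed and configured; `bucket_region` is a hypothetical helper name and not part of the script. Buckets in us-east-1 have no location constraint, which the CLI prints as `None` in text output, so the helper maps that case explicitly.

```shell
# Hypothetical helper: print the region a bucket currently lives in.
bucket_region() {
  local region
  region="$(aws s3api get-bucket-location --bucket "$1" \
    --query LocationConstraint --output text)" || return 1
  # Buckets in us-east-1 report no LocationConstraint; the CLI prints "None".
  if [ "$region" = "None" ] || [ -z "$region" ]; then
    region="us-east-1"
  fi
  echo "$region"
}
```

Calling `bucket_region <bucket_name>` after the migration should print the target region that was passed to the script.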

Usage

$ ./migrate.sh <bucket_name> <target region>
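
Since the script takes one bucket per run, moving several buckets means calling it in a loop. A minimal sketch, assuming a text file with one bucket name per line; `migrate_all` and the `MIGRATE_CMD` variable are hypothetical names introduced here. Buckets are processed one after another because every run writes to the same log.txt and error.txt, and the loop stops at the first run that exits non-zero.

```shell
# Hypothetical wrapper: run the migration script once per bucket listed in a file.
# MIGRATE_CMD defaults to ./migrate.sh and can be overridden, e.g. with "echo"
# to preview which commands would run.
migrate_all() {
  local list_file="$1" target_region="$2" bucket
  while IFS= read -r bucket || [ -n "$bucket" ]; do
    # Skip blank lines and lines starting with '#'
    case "$bucket" in
      ''|'#'*) continue ;;
    esac
    # Redirect stdin so the child process cannot consume the bucket list
    "${MIGRATE_CMD:-./migrate.sh}" "$bucket" "$target_region" < /dev/null || return 1
  done < "$list_file"
}
```

With a file such as buckets.txt, `migrate_all buckets.txt <target region>` would migrate each listed bucket in turn.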
#!/bin/bash
#####################################################################
#                 S3 bucket region migration script                 #
#                                                                   #
# This script is a really simple and stupid way to migrate          #
# a bucket across regions. This was (very quickly) developed        #
# due to AWS having no internal utility to do this without          #
# some fragile manual processes. This allowed me to do it           #
# and walk away from the computer for migrating dozens of buckets.  #
#                                                                   #
# For terabytes or petabytes of data, a commercial S3               #
# migration tool should be used. But this works for gigabytes of    #
# data.                                                             #
#                                                                   #
# Requires AWS-CLI to be installed, and configured with credentials #
# to read/write/create buckets/delete buckets                       #
#                                                                   #
# This will delete S3 buckets, so be careful!                       #
#                                                                   #
# I run this on EC2 instances in the region I'm migrating to - but  #
# this will mean you need to remove the S3 endpoint from your VPC   #
#                                                                   #
# USAGE                                                             #
# ./migrate.sh <bucket_name> <destination region>                   #
#                                                                   #
# Will create a new temporary bucket in destination region, sync    #
# the source bucket to that temporary bucket, delete the source     #
# bucket.                                                           #
# Then it will attempt to recreate the source bucket name in the    #
# destination region, on a loop, until AWS lets it do it. Then it   #
# will sync the temporary bucket to the new bucket, and delete the  #
# temporary bucket.                                                 #
#####################################################################
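if [ "$#" -ne 2 ]; then
  echo "Usage: $0 <bucket_name> <destination region>" >&2
  exit 1
fi
BUCKET_TO_MIGRATE="$1"
REGION_TO_MOVE_TO="$2"
# Random lowercase suffix; the combined name must still fit S3's 63-character bucket name limit
RANDOM_KEY=$(LC_ALL=C tr -cd '[:alnum:]' < /dev/urandom | fold -w30 | head -n1 | tr '[:upper:]' '[:lower:]')
TEMPORARY_BUCKET_NAME="$BUCKET_TO_MIGRATE-$RANDOM_KEY"
SCRIPT_START_TIME="$(date -u +%s)"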
# Create temporary bucket
echo "[$(date)] Create the temporary bucket"
echo ">> aws s3 mb s3://$TEMPORARY_BUCKET_NAME --region $REGION_TO_MOVE_TO"
if ! aws s3 mb "s3://$TEMPORARY_BUCKET_NAME" --region "$REGION_TO_MOVE_TO" >> log.txt 2>> error.txt; then
  echo "[$(date)] Could not create the temporary bucket, aborting (see error.txt)" >&2
  exit 1
fi
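# Ensure the data is synced; abort on failure so the source bucket is never deleted without a full copy
echo "[$(date)] Ensure that the buckets are synced before doing the migration process"
echo ">> aws s3 sync s3://$BUCKET_TO_MIGRATE s3://$TEMPORARY_BUCKET_NAME"
if ! aws s3 sync "s3://$BUCKET_TO_MIGRATE" "s3://$TEMPORARY_BUCKET_NAME" >> log.txt 2>> error.txt; then
  echo "[$(date)] Sync failed, aborting before anything is deleted (see error.txt)" >&2
  exit 1
fi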
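# Delete the old bucket from the old region
echo "[$(date)] Deleting the old bucket"
echo ">> aws s3 rb s3://$BUCKET_TO_MIGRATE --force"
if ! aws s3 rb "s3://$BUCKET_TO_MIGRATE" --force >> log.txt 2>> error.txt; then
  echo "[$(date)] Could not delete the old bucket, aborting (see error.txt)" >&2
  exit 1
fi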
BUCKET_CREATION_START_TIME="$(date -u +%s)"
ATTEMPTS=1
echo "[$(date)] Starting creation of replacement bucket"
echo ">> aws s3 mb s3://$BUCKET_TO_MIGRATE --region $REGION_TO_MOVE_TO"
# The name of a deleted bucket is not immediately reusable, so retry once a minute
until aws s3 mb "s3://$BUCKET_TO_MIGRATE" --region "$REGION_TO_MOVE_TO" >> log.txt 2>> error.txt
do
  ATTEMPTS="$((ATTEMPTS+1))"
  sleep 60
done
BUCKET_CREATION_END_TIME="$(date -u +%s)"
BUCKET_CREATION_TIME="$(($BUCKET_CREATION_END_TIME-$BUCKET_CREATION_START_TIME))"
echo "[$(date)] Bucket created in $BUCKET_CREATION_TIME seconds, after $ATTEMPTS attempts"
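echo "[$(date)] Starting sync of data"
BUCKET_SYNC_START_TIME="$(date -u +%s)"
echo ">> aws s3 sync s3://$TEMPORARY_BUCKET_NAME s3://$BUCKET_TO_MIGRATE"
# Abort on failure so the temporary bucket, now the only full copy, is not deleted
if ! aws s3 sync "s3://$TEMPORARY_BUCKET_NAME" "s3://$BUCKET_TO_MIGRATE" >> log.txt 2>> error.txt; then
  echo "[$(date)] Sync failed, keeping temporary bucket $TEMPORARY_BUCKET_NAME (see error.txt)" >&2
  exit 1
fi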
BUCKET_SYNC_END_TIME="$(date -u +%s)"
BUCKET_SYNC_TIME="$(($BUCKET_SYNC_END_TIME-$BUCKET_SYNC_START_TIME))"
echo "[$(date)] Finished sync of data in $BUCKET_SYNC_TIME seconds"
echo "[$(date)] Delete temporary bucket"
echo ">> aws s3 rb s3://$TEMPORARY_BUCKET_NAME --force"
aws s3 rb "s3://$TEMPORARY_BUCKET_NAME" --force >> log.txt 2>> error.txt
SCRIPT_END_TIME="$(date -u +%s)"
SCRIPT_RUN_TIME="$(($SCRIPT_END_TIME-$SCRIPT_START_TIME))"
echo "[$(date)] All done in $SCRIPT_RUN_TIME seconds!"
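
The `until` loop in the script retries forever. That is what lets it run unattended, but it also means the script never gives up if the bucket name is not released. As a variation, a bounded retry could look like the sketch below. `retry_limited` is a hypothetical helper, not part of the original script, and any attempt count or delay passed to it is an arbitrary choice rather than a recommendation from the author.

```shell
# Hypothetical helper: run a command until it succeeds, giving up after a
# fixed number of attempts.
# Usage: retry_limited <max_attempts> <delay_seconds> <command...>
retry_limited() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "Succeeded after $attempt attempts" >&2
}
```

In the script, the loop condition could then become something like `retry_limited 120 60 aws s3 mb "s3://$BUCKET_TO_MIGRATE" --region "$REGION_TO_MOVE_TO"`, with the caller deciding what to do when it returns non-zero. The temporary bucket still holds the data at that point.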