Migration Assistant for Amazon OpenSearch Service

Migration Assistant for Amazon OpenSearch Service Workshop

Prerequisites

  • AWS CLI installed
  • AWS credentials configured
  • Session Manager plugin installed (on macOS with Homebrew):
brew install --cask session-manager-plugin

image

Note: Templates and aliases are migrated only by the fetch command. If you re-run fetch, it checks whether each index already exists on the target and does not copy the ones that do.

Deploying Demo Setup Steps

Step 1: Launch the bootstrap stack

image

Step 2: Setup the bootstrap instance

1. Add below policy to admin user

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": "ssm:StartSession",
			"Resource": [
				"arn:aws:ec2:us-east-1:147228461610:instance/i-07bb142a4e1d1015f",
				"arn:aws:ssm:us-east-1:147228461610:document/SSM-dev-BootstrapShell"
			]
		}
	]
}
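The policy can also be generated and attached from the command line. The following is a minimal sketch, assuming the policy is attached as an inline user policy with `aws iam put-user-policy`; the policy name `AllowBootstrapSsmSession` and the user name in the example are placeholders, not values from the workshop.

```shell
# Print an inline policy that allows ssm:StartSession on one instance and one SSM document.
# Arguments: region, account ID, instance ID, SSM document name.
build_ssm_policy() {
  region=$1 account=$2 instance=$3 document=$4
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:StartSession",
      "Resource": [
        "arn:aws:ec2:${region}:${account}:instance/${instance}",
        "arn:aws:ssm:${region}:${account}:document/${document}"
      ]
    }
  ]
}
EOF
}

# Attach the policy file to an IAM user as an inline policy.
# The policy name is a placeholder (assumption).
attach_ssm_policy() {
  user=$1 policy_file=$2
  aws iam put-user-policy \
    --user-name "$user" \
    --policy-name AllowBootstrapSsmSession \
    --policy-document "file://${policy_file}"
}

# Example (replace the IDs with your own):
#   build_ssm_policy us-east-1 147228461610 i-07bb142a4e1d1015f SSM-dev-BootstrapShell > policy.json
#   attach_ssm_policy admin policy.json
```

Generating the document from parameters avoids hand-editing the account ID and instance ID when the bootstrap stack is redeployed.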

2. Check the EC2 instance (optional)

aws ec2 describe-instances --instance-ids i-07bb142a4e1d1015f --query 'Reservations[].Instances[].[InstanceId, State.Name, PrivateIpAddress, InstanceType, VpcId, SubnetId, IamInstanceProfile.Arn]' --output text

output:

i-07bb142a4e1d1015f     running 10.0.1.233      t2.large        vpc-027d21c529ff47624   subnet-03e74a323742dee98        arn:aws:iam::147228461610:instance-profile/migration-assistant-BootstrapEC2InstanceInstanceProfile987ED32E-8mOHqiIafFF9

3. Access bootstrap instance

aws ssm start-session --document-name SSM-dev-BootstrapShell --target i-07bb142a4e1d1015f --region us-east-1

image

4. Run the script that prepares the bootstrap instance for deploying the migration pieces

Wait 5-10 minutes for it to finish.

./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration

image

This creates a couple of script files, such as accessContainer.sh, which connects to the migration console running in an ECS container.

image

Step 3: Deploy the migration stacks

export AWS_DEFAULT_REGION=us-east-1
# This will take 1-2 mins
cdk bootstrap --c contextId=demo-deploy

image

Check what we are going to deploy in this demo (optional)

cdk ls "*" --c contextId=demo-deploy --require-approval never --concurrency 3

image

Expected output:

App Registry mode is enabled for CFN stack tracking. Will attempt to import the App Registry application from the MIGRATIONS_APP_REGISTRY_ARN env variable of arn:aws:servicecatalog:us-east-1:147228461610:/applications/0a62vr23uwjs7kbbj3q08u4gof and looking in the configured region of us-east-1
Received following context block for deployment: 
{
  stage: 'dev',
  engineVersion: 'OS_2.9',
  domainName: 'demo-opensearch-cluster',
  dataNodeCount: 2,
  availabilityZoneCount: 2,
  openAccessPolicyEnabled: true,
  domainRemovalPolicy: 'DESTROY',
  enableDemoAdmin: true,
  trafficReplayerEnableClusterFGACAuth: true,
  captureProxyESServiceEnabled: true,
  fetchMigrationEnabled: true,
  sourceClusterEndpoint: 'https://capture-proxy-es.migration.dev.local:19200'
}
End of context block.
networkStack-default (OSMigrations-dev-us-east-1-default-NetworkInfra)
openSearchDomainStack-default (OSMigrations-dev-us-east-1-default-OpenSearchDomain)
migrationInfraStack (OSMigrations-dev-us-east-1-MigrationInfra)
mskUtilityStack (OSMigrations-dev-us-east-1-MSKUtility)
analyticsDomainStack (OSMigrations-dev-us-east-1-AnalyticsDomain)
migration-analytics (OSMigrations-dev-us-east-1-MigrationAnalytics)
fetchMigrationStack (OSMigrations-dev-us-east-1-FetchMigration)
capture-proxy-es (OSMigrations-dev-us-east-1-CaptureProxyES)
traffic-replayer-default (OSMigrations-dev-us-east-1-default-TrafficReplayer)
migration-console (OSMigrations-dev-us-east-1-MigrationConsole)

Deploy demo setup:

# Deploy the demo setup. This may take up to 1 hour.
cdk deploy "*" --c contextId=demo-deploy --require-approval never --concurrency 3

image

Once the stacks are deployed, enable monitoring.

Monitoring

image

image

Activate AWS Cost Explorer

Confirm cost tags associated with the solution

Before running this, make sure stack creation has finished (status CREATE_COMPLETE).

aws cloudformation describe-stacks --stack-name OSMigrations-dev-us-east-1-MSKUtility --region us-east-1 --query 'Stacks[0].StackStatus'
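To check every stack in one pass rather than one at a time, the same `describe-stacks` call can be wrapped in a loop. This is a sketch, assuming the stack names match the `cdk ls` output shown earlier; the function name `check_stacks` is illustrative.

```shell
# Stack names as printed by `cdk ls` earlier in this workshop.
STACKS="OSMigrations-dev-us-east-1-default-NetworkInfra
OSMigrations-dev-us-east-1-default-OpenSearchDomain
OSMigrations-dev-us-east-1-MigrationInfra
OSMigrations-dev-us-east-1-MSKUtility
OSMigrations-dev-us-east-1-AnalyticsDomain
OSMigrations-dev-us-east-1-MigrationAnalytics
OSMigrations-dev-us-east-1-FetchMigration
OSMigrations-dev-us-east-1-CaptureProxyES
OSMigrations-dev-us-east-1-default-TrafficReplayer
OSMigrations-dev-us-east-1-MigrationConsole"

# Print "<status> <stack>" for each stack.
# Returns non-zero if any stack is not in CREATE_COMPLETE.
check_stacks() {
  region=${1:-us-east-1}
  rc=0
  for stack in $STACKS; do
    status=$(aws cloudformation describe-stacks --stack-name "$stack" \
      --region "$region" --query 'Stacks[0].StackStatus' --output text 2>/dev/null) \
      || status=UNKNOWN
    echo "$status $stack"
    [ "$status" = "CREATE_COMPLETE" ] || rc=1
  done
  return $rc
}

# Example:
#   check_stacks us-east-1
```

A stack that has been redeployed reports UPDATE_COMPLETE instead, so adjust the comparison if you have run `cdk deploy` more than once.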

image

Now add the tag

image

Activate cost allocation tags associated with the solution

Verify Deployment

This demo deploys 4 ECS services in the migration-dev-ecs-cluster cluster. One of them is migration-dev-capture-proxy-es, which runs an ES 7.10 node that starts with no data in it. The Capture Proxy is deployed and running in the same ECS service. The Capture Proxy listens on port 9200 and ES 7.10 listens on port 19200. This setup is already done for you. If you want to deploy the Capture Proxy on your own self-hosted ES server, follow the steps here.

  1. Check that all 4 ECS services are created in the region where you deployed.

    1. migration-dev-traffic-replayer-default
    2. migration-dev-migration-console
    3. migration-dev-capture-proxy-es
    4. migration-dev-otel-collector

    image

  2. Check that the EC2 bootstrap-dev-instance is created

image

  3. Check that MSK is created

image
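The ECS part of this check can be scripted as well. The sketch below assumes the four names listed above are ECS services in the `migration-dev-ecs-cluster` cluster, which is the cluster name used by the `aws ecs update-service` commands later in this workshop; `missing_services` is an illustrative helper name.

```shell
# Services expected after a successful deployment (names taken from the list above).
EXPECTED_SERVICES="migration-dev-traffic-replayer-default migration-dev-migration-console migration-dev-capture-proxy-es migration-dev-otel-collector"

# Print every expected service that is not reported as ACTIVE.
# Input (stdin): lines of "<serviceName><TAB><status>", for example from:
#   aws ecs describe-services --cluster migration-dev-ecs-cluster \
#     --services $EXPECTED_SERVICES \
#     --query 'services[].[serviceName,status]' --output text
missing_services() {
  input=$(cat)
  for svc in $EXPECTED_SERVICES; do
    echo "$input" \
      | awk -v s="$svc" '$1 == s && $2 == "ACTIVE" { found = 1 } END { exit !found }' \
      || echo "$svc"
  done
}
```

An empty result means all four services were found and are active.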

Running Migration Assistant Tool

We will do 2 demos:

  1. Migrating historical data from ES to AOS
  2. Migrating live data from ES to AOS

Connecting to Migration Console

# 1. Open a local terminal and authenticate:
#    in IAM Identity Center, copy the AWS environment variables and run them in the terminal
# 2. Execute below to connect to the bootstrap instance
aws ssm start-session --document-name SSM-dev-BootstrapShell --target i-07bb142a4e1d1015f --region us-east-1

image

# 3. Navigate to deployment folder to access scripts
cd deployment/cdk/opensearch-service-migration/

image

# 4. From the bootstrap instance, connect to the migration console terminal: ./accessContainer.sh migration-console STAGE REGION
./accessContainer.sh migration-console dev us-east-1

image

Historical Data Migration

Modify the runTestBenchmarks.sh script to target port 19200 and index test data into the ES cluster.

  1. Update the script
vi runTestBenchmarks.sh

image

  2. Run the benchmark script to populate a couple of indices in ES 7.10
./runTestBenchmarks.sh
  3. Check for new indices in the ES 7.10 cluster
vi catIndices.sh

image

Run Fetch command from console terminal

# This will execute the script and print the required ECS run task command
./showFetchMigrationCommand.sh

expected output:

aws ecs run-task --task-definition arn:aws:ecs:us-east-1:147228461610:task-definition/migration-dev-fetch-migration:1 --cluster migration-dev-ecs-cluster --launch-type FARGATE --network-configuration '{"awsvpcConfiguration":{"subnets":["subnet-05643745bbd732274","subnet-03bfe3b54d1f6b4fc"],"securityGroups":["sg-08a49f1b54cb7ec43","sg-079bfe228c2bee4ae"]}}'

Run the command above to initiate the fetch migration, which will launch an ECS task in a new container to execute the data preparation pipeline, transferring data directly from the source ES cluster to Amazon OpenSearch without involving MSK or Capture Proxy.
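If you prefer to block until the fetch migration task finishes instead of watching the console, the run-task call can be wrapped as follows. This is a sketch, not part of the Migration Assistant tooling: `run_fetch_and_wait` is a hypothetical helper, and it assumes the task has a single container whose exit code indicates success.

```shell
# Start an ECS task with the arguments printed by showFetchMigrationCommand.sh,
# wait until the task stops, then print the exit code of its first container.
# Usage: run_fetch_and_wait <cluster> <remaining `aws ecs run-task` arguments...>
run_fetch_and_wait() {
  cluster=$1; shift
  task_arn=$(aws ecs run-task --cluster "$cluster" "$@" \
    --query 'tasks[0].taskArn' --output text) || return 1
  echo "started: $task_arn"
  aws ecs wait tasks-stopped --cluster "$cluster" --tasks "$task_arn" || return 1
  aws ecs describe-tasks --cluster "$cluster" --tasks "$task_arn" \
    --query 'tasks[0].containers[0].exitCode' --output text
}

# Example (copy the remaining arguments from the showFetchMigrationCommand.sh output):
#   run_fetch_and_wait migration-dev-ecs-cluster \
#     --task-definition <task-definition-arn> --launch-type FARGATE \
#     --network-configuration '<json from the printed command>'
```

The `tasks-stopped` waiter gives up after a fixed number of polls, so for a large data set it may time out before the migration completes; in that case re-run the wait step.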

It will take some time to migrate all data, based on the size. Confirm it migrated all indices, templates and aliases.
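One way to confirm that every index arrived is to compare the index names on both clusters. The following sketch assumes you have saved one index name per line from each cluster into a file; `TARGET_ENDPOINT` is a placeholder for your Amazon OpenSearch Service domain endpoint, and the target credentials may differ from `admin:admin`.

```shell
# Print index names that exist on the source but not on the target.
# Arguments: two files with one index name per line, for example produced by:
#   curl -s --insecure -u admin:admin "https://capture-proxy-es:19200/_cat/indices?h=index" > source.txt
#   curl -s --insecure -u admin:admin "$TARGET_ENDPOINT/_cat/indices?h=index" > target.txt
missing_on_target() {
  LC_ALL=C sort -u "$1" > "$1.sorted"
  LC_ALL=C sort -u "$2" > "$2.sorted"
  # comm -23 keeps lines that appear only in the first file.
  LC_ALL=C comm -23 "$1.sorted" "$2.sorted"
  rm -f "$1.sorted" "$2.sorted"
}

# Example:
#   missing_on_target source.txt target.txt
```

System indices (names starting with a dot) can legitimately differ between the two clusters, so filter them out before comparing if they show up. The same comparison works for `_cat/templates?h=name` and `_cat/aliases?h=alias`.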

image

image

Live Data Real-Time Migration

Note: Data in Kafka is not deleted after it has been replayed.

The replayer reproduces the same traffic pattern as the source:

source ---|-------|----|
replayer      |-------|----|

Send real-time data to the ES 7.10 cluster. The data goes to the Capture Proxy, MSK, and ES 7.10, and nowhere else.

Execute below to send real-time data to ES 7.10

# Run in background
nohup python3 ./live-data.py --endpoint https://capture-proxy-es:9200 & 

Execute below to start traffic replayer

aws ecs update-service --cluster migration-dev-ecs-cluster --service migration-dev-traffic-replayer-default --desired-count 1
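Because the replayer is started here and stopped again in the appendix, it can be convenient to wrap the scaling call in a small helper that also waits for the service to settle. This is a sketch; `scale_replayer` is a hypothetical name, and the defaults are the cluster and service names used in this workshop.

```shell
# Set the desired task count of the traffic replayer service,
# print the new desired count, and wait until the service is stable.
# Usage: scale_replayer <count> [cluster] [service]
scale_replayer() {
  count=$1
  cluster=${2:-migration-dev-ecs-cluster}
  service=${3:-migration-dev-traffic-replayer-default}
  aws ecs update-service --cluster "$cluster" --service "$service" \
    --desired-count "$count" --query 'service.desiredCount' --output text || return 1
  aws ecs wait services-stable --cluster "$cluster" --services "$service"
}

# Example:
#   scale_replayer 1   # start the replayer
#   scale_replayer 0   # stop the replayer
```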

Confirm migration is happening

./stats.sh

image

image

At this point the 2024-04-17 index has been created only in ES 7.10, with 259 docs.

image

Execute below to start traffic replayer to send data to AOS:

image

Check the count of the docs in target cluster (AOS).
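The count check can be done from the migration console with the `_cat/count` API on both clusters. The sketch below makes several assumptions: both clusters accept the `admin:admin` basic-auth credentials used in the appendix, and the target endpoint and index name are supplied by you; `doc_count` and `compare_counts` are illustrative names.

```shell
# Print the document count of one index on one cluster.
# Arguments: cluster endpoint (scheme://host:port), index name.
doc_count() {
  endpoint=$1 index=$2
  curl -s --insecure -u admin:admin "${endpoint}/_cat/count/${index}?h=count" \
    | tr -d '[:space:]'
}

# Compare the count of one index on the source and target clusters.
# Prints "match <n>" or "mismatch source=<a> target=<b>" (and returns non-zero).
# Arguments: source endpoint, target endpoint, index name.
compare_counts() {
  src=$(doc_count "$1" "$3")
  dst=$(doc_count "$2" "$3")
  if [ "$src" = "$dst" ]; then
    echo "match $src"
  else
    echo "mismatch source=$src target=$dst"
    return 1
  fi
}

# Example (placeholders for the target endpoint and index name):
#   compare_counts https://capture-proxy-es:19200 "$TARGET_ENDPOINT" "$INDEX_NAME"
```

While the live-data script is still running, the target is expected to trail the source slightly, so a small mismatch is not necessarily a failure.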

image

image

Appendix

Elasticsearch related

curl "https://capture-proxy-es:9200/_cat/templates?v" -u admin:admin --insecure
curl -X PUT "https://capture-proxy-es:19200/_template/template_1?pretty" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["te*", "bar*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "_source": {
      "enabled": false
    },
    "properties": {
      "host_name": {
        "type": "keyword"
      },
      "created_at": {
        "type": "date",
        "format": "EEE MMM dd HH:mm:ss Z yyyy"
      }
    }
  }
}
'



image

curl -X PUT "https://capture-proxy-es:19200/logs-221998/_alias/alias1?pretty" -u admin:admin --insecure 

image

Kafka related

Check the number of events in the Kafka topic, to see whether there are any events in the MSK topic logging-traffic-topic.

./kafka-tools/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --topic logging-traffic-topic --time -1 --command-config kafka-tools/aws/msk-iam-auth.properties

output:

logging-traffic-topic:0:9

The output format is topic:partition:offset, so this shows partition 0 with 9 events.
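When the topic has more than one partition, the per-partition end offsets can be summed to get a total. A minimal sketch, assuming the `topic:partition:offset` line format shown in the output above; `total_events` is an illustrative name.

```shell
# Sum the end offsets over all partitions.
# Input (stdin): GetOffsetShell output, one "topic:partition:offset" line per partition.
total_events() {
  awk -F: '{ sum += $3 } END { print sum + 0 }'
}

# With the sample output from this section:
printf 'logging-traffic-topic:0:9\n' | total_events   # prints 9
```

The end offset equals the number of events only while no records have been removed from the topic by retention; otherwise it is an upper bound.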

Show the progress of the traffic replayer: how far it has gotten in processing the records in MSK, that is, how many records it has consumed from Kafka.

./kafka-tools/kafka/bin/kafka-consumer-groups.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --timeout 100000 --describe --group logging-group-default --command-config kafka-tools/aws/msk-iam-auth.properties

output:

GROUP                 TOPIC                 PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                                           HOST            CLIENT-ID
logging-group-default logging-traffic-topic 0          8               9               1               consumer-logging-group-default-1-261cc380-f44f-45b1-97d0-beb1e1a966d1 /10.0.1.202     consumer-logging-group-default-1
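The LAG column is the number that matters here: it is the difference between LOG-END-OFFSET and CURRENT-OFFSET, and it reaches 0 when the replayer has caught up. The following sketch sums it across partitions, assuming the column layout shown in the output above; `total_lag` is an illustrative name.

```shell
# Sum the LAG column of `kafka-consumer-groups.sh --describe` output.
# The column position is looked up from the header row, and
# non-numeric values such as "-" are skipped.
total_lag() {
  awk '
    $1 == "GROUP" { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
    col && $col ~ /^[0-9]+$/ { sum += $col }
    END { print sum + 0 }
  '
}

# Example:
#   ./kafka-tools/kafka/bin/kafka-consumer-groups.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" \
#     --describe --group logging-group-default \
#     --command-config kafka-tools/aws/msk-iam-auth.properties | total_lag
```

For the sample output above, the result is 1 (offset 8 consumed out of 9).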

Let's stop the traffic replayer and delete the Kafka topic. Next we will index some data on port 9200, which goes through the Capture Proxy.

Execute below to stop the traffic replayer:

aws ecs update-service --cluster migration-dev-ecs-cluster --service migration-dev-traffic-replayer-default --desired-count 0

image

Delete Kafka Topic named "logging-traffic-topic"

./kafka-tools/kafka/bin/kafka-topics.sh --bootstrap-server "$MIGRATION_KAFKA_BROKER_ENDPOINTS" --delete --topic logging-traffic-topic --command-config kafka-tools/aws/msk-iam-auth.properties

Kill the Python process that sends live data

ps aux|grep python

kill -9 <id>

## Example
root@ip-10-0-1-144:~# ps aux|grep python
root      8407  1.4  1.2  29512 24892 pts/0    S    21:36   0:12 python3 ./live-data.py --endpoint https://capture-proxy-es:9200
root      8486  0.0  0.1   4024  1992 pts/0    S+   21:51   0:00 grep --color=auto python
root@ip-10-0-1-144:~# kill -9 8407

If copy-paste is not working in the terminal, turn off bracketed paste mode:

printf '\e[?2004l'

References

  1. https://docs.aws.amazon.com/solutions/latest/migration-assistant-for-amazon-opensearch-service/solution-overview.html

  2. Fetch Command : https://github.com/opensearch-project/opensearch-migrations/blob/main/deployment/cdk/opensearch-service-migration/README.md#kicking-off-fetch-migration
