@harryge00
Created October 18, 2019 03:19

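Backing Up Cassandra to Tencent Cloud

Starting with version 2.5.0-3.11.3, DC/OS Cassandra supports backups to object storage that is compatible with the S3 protocol. Tencent Cloud's object storage, COS, is compatible with the AWS S3 protocol, so the scripts written for S3 also work with Tencent Cloud.

Backing up Cassandra data

Using Tencent Cloud's Shanghai region storage (cos.ap-shanghai.myqcloud.com) as an example, run the following commands to back up the Cassandra data:

export AWS_ACCESS_KEY_ID=#{tencent_cloud_key_id}
export AWS_SECRET_ACCESS_KEY=#{tencent_cloud_secret_key}
export S3_BUCKET_NAME=#{tencent_cloud_cos_bucket_name}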
dcos cassandra plan start backup-s3 \
    -p SNAPSHOT_NAME=Your_SNAPSHOT_NAME \
    -p CASSANDRA_KEYSPACES="space1 space2" \
    -p AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
    -p AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
    -p AWS_REGION=cos.ap-shanghai.myqcloud.com \
    -p S3_BUCKET_NAME=$S3_BUCKET_NAME \
    -p S3_ENDPOINT_URL=http://cos.ap-shanghai.myqcloud.com
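  • CASSANDRA_KEYSPACES: the names of the keyspaces to back up. Separate multiple keyspaces with spaces, e.g. CASSANDRA_KEYSPACES="space1 space2". To back up all keyspaces, leave CASSANDRA_KEYSPACES empty.
  • AWS_REGION: the region of the object storage. For Beijing, use cos.ap-beijing.myqcloud.com.
  • S3_ENDPOINT_URL: the URL of the object storage region, e.g. http://cos.ap-beijing.myqcloud.com
  • SNAPSHOT_NAME: a snapshot name of your choosing; a directory with this name is created in the bucket.
  • AWS_ACCESS_KEY_ID: your Tencent Cloud KEY_ID
  • AWS_SECRET_ACCESS_KEY: your Tencent Cloud secret_key

After the command runs, the framework scheduler restarts and then launches new containers that back up each Cassandra node in turn.

The Cassandra nodes themselves are unaffected and keep running. Once the backup finishes, these short-lived task containers exit: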
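Because AWS_REGION and S3_ENDPOINT_URL are both derived from the same COS region, a small helper can keep the two values consistent. The following is a minimal sketch, not part of the DC/OS tooling: the function names cos_host and cos_endpoint_url are hypothetical, and the host pattern cos.<region>.myqcloud.com is assumed from the Shanghai and Beijing examples above.

```shell
# Hypothetical helpers; the host pattern is assumed from the examples above.

# Print the COS host for a region such as "ap-shanghai".
cos_host() {
    printf 'cos.%s.myqcloud.com\n' "$1"
}

# Print the endpoint URL for the same region (http, as in the examples above).
cos_endpoint_url() {
    printf 'http://%s\n' "$(cos_host "$1")"
}
```

With these defined, -p AWS_REGION=$(cos_host ap-beijing) and -p S3_ENDPOINT_URL=$(cos_endpoint_url ap-beijing) would yield the Beijing values listed above.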

Completed backup

A backup consists of the following steps:

backup-s3 (serial strategy) (WAITING)
├─ backup-schema (serial strategy) (PENDING)
│  ├─ node-0:[backup-schema] (PENDING)
│  ├─ node-1:[backup-schema] (PENDING)
│  └─ node-2:[backup-schema] (PENDING)
├─ create-snapshots (parallel strategy) (PENDING)
│  ├─ node-0:[snapshot] (PENDING)
│  ├─ node-1:[snapshot] (PENDING)
│  └─ node-2:[snapshot] (PENDING)
├─ upload-backups (serial strategy) (PENDING)
│  ├─ node-0:[upload-s3] (PENDING)
│  ├─ node-1:[upload-s3] (PENDING)
│  └─ node-2:[upload-s3] (PENDING)
└─ cleanup-snapshots (serial strategy) (PENDING)
   ├─ node-0:[cleanup-snapshot] (PENDING)
   ├─ node-1:[cleanup-snapshot] (PENDING)
   └─ node-2:[cleanup-snapshot] (PENDING)
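  • backup-schema backs up the Cassandra schema to a local directory in the container.
  • create-snapshots creates the snapshots and stores them in Cassandra's local directory.
  • upload-backups uploads the snapshots to S3 storage.
  • cleanup-snapshots runs after the upload and removes all local snapshots created in the earlier step.

Uploading to Tencent Cloud manually

With Cassandra versions earlier than 2.5.0-3.11.3, the upload-backups step fails and cannot upload to Tencent Cloud: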
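To follow a backup as it moves through these phases, the text printed by dcos cassandra plan status backup-s3 can be filtered. The sketch below assumes the output has the tree layout shown above; plan_top_status and incomplete_steps are hypothetical helper names, and both only read text from standard input.

```shell
# Hypothetical helpers; they assume the plan tree layout shown above.

# Print the overall plan state, i.e. the last "(...)" on the first line.
plan_top_status() {
    head -n 1 | sed -n 's/.*(\([A-Z_]*\))[[:space:]]*$/\1/p'
}

# Print "step STATE" for every node step that is not yet COMPLETE.
incomplete_steps() {
    grep -oE 'node-[0-9]+:\[[^]]+\] \([A-Z_]+\)' \
        | grep -v '(COMPLETE)' \
        | tr -d '()'
}
```

For example, dcos cassandra plan status backup-s3 | incomplete_steps would list only the steps that still need attention, one per line.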

dcos cassandra plan status backup-s3 --name=cassandra2                     
backup-s3 (serial strategy) (IN_PROGRESS)
├─ backup-schema (serial strategy) (COMPLETE)
│  ├─ node-0:[backup-schema] (COMPLETE)
│  ├─ node-1:[backup-schema] (COMPLETE)
│  └─ node-2:[backup-schema] (COMPLETE)
├─ create-snapshots (parallel strategy) (COMPLETE)
│  ├─ node-0:[snapshot] (COMPLETE)
│  ├─ node-1:[snapshot] (COMPLETE)
│  └─ node-2:[snapshot] (COMPLETE)
├─ upload-backups (serial strategy) (STARTING)
│  ├─ node-0:[upload-s3] (STARTING)
│  ├─ node-1:[upload-s3] (PENDING)
│  └─ node-2:[upload-s3] (PENDING)
└─ cleanup-snapshots (serial strategy) (PENDING)
   ├─ node-0:[cleanup-snapshot] (PENDING)
   ├─ node-1:[cleanup-snapshot] (PENDING)
   └─ node-2:[cleanup-snapshot] (PENDING)

As shown, upload-backups stays stuck. At this point you need to enter each container manually with dcos task exec and run the upload command.

  1. First, list all Cassandra nodes:
# dcos task|grep "node-[0-9]\+-server"
node-0-server     10.0.1.234  nobody    R    node-0-server__c7c13d35-d089-457a-99ca-35d99ace3f4a              b32358e8-f3e5-4780-85cf-6debfe8a3c66-S0  aws/us-west-2  aws/us-west-2c  
node-1-server     10.0.1.145  nobody    R    node-1-server__c79885fc-bdde-4f4d-8f14-398e4124b688              b32358e8-f3e5-4780-85cf-6debfe8a3c66-S4  aws/us-west-2  aws/us-west-2c  
node-2-server     10.0.3.125  nobody    R    node-2-server__395ea8f1-d100-40cd-b15f-521ce2e4710e              b32358e8-f3e5-4780-85cf-6debfe8a3c66-S3  aws/us-west-2  aws/us-west-2c  

There are three nodes.

  2. Enter the container and run the upload. The command to enter the container is dcos task exec -ti node-0-server__c7c13d35-d089-457a-99ca-35d99ace3f4a bash, where node-0-server__c7c13d35-d089-457a-99ca-35d99ace3f4a is the task ID listed in step 1.

 export AWS_ACCESS_KEY_ID=XXXXXXXX
 export AWS_SECRET_ACCESS_KEY=XXXXXXXXX
 export PATH=$PATH:$PWD/python-dist/bin/
 export S3_ENDPOINT_URL=http://cos.ap-beijing.myqcloud.com
 export S3_BUCKET_NAME=dcos-s3-1251975970
 export SNAPSHOT_NAME=cassandra-20191017
 aws s3 cp container-path/snapshot/ s3://${S3_BUCKET_NAME}/${SNAPSHOT_NAME}/node-${POD_INSTANCE_INDEX}/ --recursive --endpoint-url=${S3_ENDPOINT_URL}

Run these commands on every node in turn. Replace S3_BUCKET_NAME and SNAPSHOT_NAME with the bucket and snapshot name you want to use.

  3. Finish the backup. Once the upload has completed, end the backup plan:
dcos cassandra plan stop backup-s3   --name=cassandra2 upload-backups node-0:[upload-s3]
dcos cassandra plan stop backup-s3   --name=cassandra2 upload-backups node-1:[upload-s3]
dcos cassandra plan stop backup-s3   --name=cassandra2 upload-backups node-2:[upload-s3]

dcos cassandra plan force-complete backup-s3 --name=cassandra2
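Parts of the manual upload can be scripted. The following sketch assumes the column layout of the dcos task listing shown above (task ID in the fifth column) and the destination layout used by the aws s3 cp command above. node_task_ids and s3_dest are hypothetical names, not part of the DC/OS CLI.

```shell
# Hypothetical helpers for the manual upload; not part of the DC/OS CLI.

# Read `dcos task` output on stdin and print the task ID (fifth column,
# as in the listing above) of every node-N-server task.
node_task_ids() {
    awk '$1 ~ /^node-[0-9]+-server$/ { print $5 }'
}

# Print the upload destination for one node.
# Usage: s3_dest BUCKET SNAPSHOT_NAME NODE_INDEX
s3_dest() {
    printf 's3://%s/%s/node-%s/\n' "$1" "$2" "$3"
}
```

Here dcos task | node_task_ids would print one task ID per line, each of which can be passed to dcos task exec, and s3_dest dcos-s3-1251975970 cassandra-20191017 0 prints s3://dcos-s3-1251975970/cassandra-20191017/node-0/.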

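Restoring Cassandra data

Using Tencent Cloud's Shanghai region storage (cos.ap-shanghai.myqcloud.com) as an example, run the following commands to restore the Cassandra data:

export AWS_ACCESS_KEY_ID=#{tencent_cloud_key_id}
export AWS_SECRET_ACCESS_KEY=#{tencent_cloud_secret_key}
export S3_BUCKET_NAME=#{tencent_cloud_cos_bucket_name}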
dcos cassandra plan start restore-s3 \
    -p SNAPSHOT_NAME=Your_SNAPSHOT_NAME \
    -p CASSANDRA_KEYSPACES="space1 space2" \
    -p AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
    -p AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
    -p AWS_REGION=cos.ap-shanghai.myqcloud.com \
    -p S3_BUCKET_NAME=$S3_BUCKET_NAME \
    -p S3_ENDPOINT_URL=http://cos.ap-shanghai.myqcloud.com

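After the command runs, the framework scheduler restarts and then launches new containers that restore each Cassandra node in turn. The Cassandra nodes themselves are unaffected and keep running.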
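Backup and restore take the same parameters and differ only in the plan name, so the command line can be generated in one place and reviewed before it is run. This is a minimal sketch: print_plan_cmd is a hypothetical helper, it only prints the command rather than executing it, and it assumes the cos.<region>.myqcloud.com host pattern used in the examples above.

```shell
# Hypothetical helper: prints (does not run) a backup-s3 or restore-s3 command.
# Usage: print_plan_cmd PLAN REGION SNAPSHOT_NAME "KEYSPACES"
# The body runs in a subshell so its variables do not leak to the caller.
print_plan_cmd() (
    plan=$1 region=$2 snapshot=$3 keyspaces=$4
    host="cos.${region}.myqcloud.com"   # host pattern assumed from the examples

    case $plan in
        backup-s3|restore-s3) ;;
        *) echo "unknown plan: $plan" >&2; exit 1 ;;   # exits the subshell only
    esac

    # Credentials and bucket stay as variable references so that no secret
    # is printed; they expand when the printed command is actually run.
    printf 'dcos cassandra plan start %s' "$plan"
    printf ' -p SNAPSHOT_NAME=%s' "$snapshot"
    printf ' -p CASSANDRA_KEYSPACES="%s"' "$keyspaces"
    printf ' -p AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID'
    printf ' -p AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY'
    printf ' -p AWS_REGION=%s' "$host"
    printf ' -p S3_BUCKET_NAME=$S3_BUCKET_NAME'
    printf ' -p S3_ENDPOINT_URL=http://%s\n' "$host"
)
```

Calling print_plan_cmd restore-s3 ap-shanghai Your_SNAPSHOT_NAME "space1 space2" prints a single-line equivalent of the restore command above; passing backup-s3 instead gives the backup command.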
