@cm-igarashi-ryosuke
Created January 23, 2017 08:02
A script that safely performs a full scan of a DynamoDB table
require 'optparse'
require 'aws-sdk'
require 'pp'

### Parameters ###
params = ARGV.getopts('l:t:p:', 'limit:', 'table:', 'profile:')
# pp params
# Percentage (1-100) of the provisioned read capacity the scan may consume; defaults to 50
LATE_LIMIT = (params['l'] || params['limit'] || '50').to_f / 100.0
TABLE_NAME = params['t'] || params['table'] || ''
if LATE_LIMIT > 1.0 || LATE_LIMIT < 0.01
  # Print to stderr and exit with a non-zero status
  abort 'Set limit within the range 1-100'
end

### Credentials ###
Aws.config.update({
  region: 'ap-northeast-1',
  profile: params['p'] || params['profile']
})
### To use an IAM user's access keys instead (equivalent to specifying a profile) ###
# Aws.config.update({
#   access_key_id: 'ACCESS_KEY_ID',
#   secret_access_key: 'SECRET_ACCESS_KEY',
# })
$dynamodb = Aws::DynamoDB::Client.new
### To switch to (assume) an IAM role instead, use the following ###
# # AWS access key ID
# access_key_id = ''
# # AWS secret access key
# secret_access_key = ''
# # ARN of the IAM role to assume
# assume_role_arn = 'arn:aws:iam::000000000000:role/role_name' # prd
# # The session name can be any string
# assume_role_session_name = 'session-name'
#
# # Obtain temporary credentials through STS
# role_credentials = Aws::AssumeRoleCredentials.new(
#   client: Aws::STS::Client.new(
#     access_key_id: access_key_id,
#     secret_access_key: secret_access_key,
#   ),
#   role_arn: assume_role_arn,
#   role_session_name: assume_role_session_name
# )
# $dynamodb = Aws::DynamoDB::Client.new(credentials: role_credentials)
### End of credential settings ###

# Sleep for a period derived from the consumed capacity units (CU), the table's
# provisioned CU, and the configured limit on CU consumption
def interval_sleep(used_capacity_units)
  describe_option = {
    table_name: TABLE_NAME, # required
  }
  describe_result = $dynamodb.describe_table(describe_option)
  # Provisioned read CU
  read_capacity_units = describe_result.table.provisioned_throughput.read_capacity_units
  # Choose the sleep time so that consumption averages out to LATE_LIMIT of the provisioned CU
  sleep_time = used_capacity_units / (read_capacity_units * LATE_LIMIT)
  sleep(sleep_time)
end

begin
  scan_option = {
    table_name: TABLE_NAME,
    # limit: 1_000_000,
    # attributes_to_get: ["id", "name"],
    return_consumed_capacity: "TOTAL",
    # exclusive_start_key: last_key
  }
  # Fetch one page per iteration until last_evaluated_key comes back nil
  begin
    scan_result = $dynamodb.scan(scan_option)
    # A) To output only selected attributes
    # scan_result.items.each {|item| puts [item["id"], item["name"]].join("\t")}
    # B) To output every attribute; nested JSON structures are printed as-is
    scan_result.items.each {|item| puts item.flatten.join("\t")}
    interval_sleep(scan_result.consumed_capacity.capacity_units)
  end while scan_option[:exclusive_start_key] = scan_result.last_evaluated_key
# Rescue the more specific error first; ServiceError would otherwise catch it
rescue Aws::DynamoDB::Errors::ConditionalCheckFailedException => e
  # Raised when a condition_expression is not satisfied
  pp e
rescue Aws::DynamoDB::Errors::ServiceError => e
  # Rescues all errors returned by Amazon DynamoDB
  pp e
end
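
The sleep-time formula in interval_sleep can be checked without touching DynamoDB. The following is a minimal sketch under stated assumptions: throttle_sleep_seconds is a hypothetical helper that is not part of the script, and the zero-capacity guard is an added assumption rather than behavior of the original code.

```ruby
# Hypothetical helper (not part of the script above): the same formula as
# interval_sleep, with the provisioned read capacity passed in as an argument
# so that it can be exercised without calling describe_table.
def throttle_sleep_seconds(used_capacity_units, read_capacity_units, rate_limit)
  # Capacity units per second that the scan is allowed to consume
  budget_per_second = read_capacity_units * rate_limit
  # Assumption: if the table reports no provisioned read capacity, skip the
  # sleep instead of dividing by zero.
  return 0.0 unless budget_per_second.positive?

  used_capacity_units / budget_per_second.to_f
end

# A page that consumed 128 CU on a table provisioned at 100 read CU, with the
# limit set to 50%, is followed by a pause of 128 / (100 * 0.5) = 2.56 seconds.
puts throttle_sleep_seconds(128.0, 100, 0.5)
```

Lowering the limit lengthens the pause proportionally: with the same page and a 25% limit, the pause doubles to 5.12 seconds.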