@dollschasingmen
Last active May 26, 2017 19:11
stripped down sync_list w/ timings
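require 'aws-sdk' # assumes the aws-sdk v2 gem; with v3, require 'aws-sdk-s3' instead

# Looks up the load-file row for every S3 object under the prefix,
# prints the row count and the elapsed seconds, and returns the rows.
def sync_list(table, bucket_name, prefix)
  aws_resource = Aws::S3::Resource.new
  t1 = Time.now
  bucket = aws_resource.bucket(bucket_name)
  rows = []
  bucket.objects(prefix: prefix).each do |obj|
    s3_path = "s3://#{bucket_name}/#{obj.key}"
    # One query per S3 object; row is nil when no record matches the file.
    row = LumosEtl::RedshiftIncrementalLoadFile.where(destination_table: table, file_name: s3_path).first
    rows << row
  end
  puts rows.size
  t2 = Time.now
  delta = t2 - t1 # in seconds
  puts delta
  rows # return the rows; without this the method returns nil from puts
end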
adjust_rows = sync_list('adjust.events', 'lumos-adjust', '')
impressions_rows = sync_list('appnexus.impressions', 'lumos-appnexus', 'log-level-data')
visitor_rows = sync_list('events.visitor', 'lumos-partitioned-events-user-production', 'yyyy=2017')
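
The timing logic inside sync_list could also be pulled out into a small reusable helper. The following is a minimal sketch, not part of the original gist: `timed` is a hypothetical name, and it uses the monotonic clock rather than Time.now so that the measurement is unaffected by system clock adjustments. The workload in the block is a stand-in for a sync_list call.

```ruby
# Hypothetical helper (an assumption, not from the original gist):
# runs the given block and returns [result, elapsed_seconds].
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [result, elapsed]
end

# Stand-in workload; in the gist this block would wrap a sync_list call.
rows, seconds = timed { (1..1000).map { |i| i * 2 } }
puts "#{rows.size} rows in #{seconds.round(4)}s"
```

With a helper like this, sync_list would only need to build and return rows, and each call site could decide whether to measure it.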