Christine christine-le

@christine-le
christine-le / aws_s3_multipart_uploads.md
Last active March 14, 2019 16:54
Use AWS multipart upload commands to upload large files to S3

Description

AWS offers two ways to upload to S3: the "Low-Level" "aws s3api" set of commands and the "High-Level" "aws s3 cp" command. This page outlines how to use the "Low-Level" "aws s3api" commands, which allow us to upload very large files.

We have run into a common scenario where S3 uploads of very large files fail because they run past the 1-hour expiration window of the security token. Other workarounds, such as uploading from a Mesos slave, may run into disk space limitations. In these cases, we cannot easily use the typical "High-Level" "aws s3 cp" command. Although this command also performs multipart uploads automatically behind the scenes, any timeout cancels the entire upload, with no way of resuming from where it left off.

On the other hand, although the "Low-Level" "aws s3api" commands can be pretty tedious, any failed part upload can be retried without interrupting or canceling the parts that have already succeeded.
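
As a rough illustration of why the low-level route is resumable, the flow can be sketched as a handful of shell functions. This is a minimal sketch, not the exact procedure from this guide: the function names, the bucket/key arguments, and the parts file are hypothetical placeholders, and only the `aws s3api` subcommands and flags themselves come from the AWS CLI.

```shell
# Sketch of a resumable multipart upload with the low-level s3api commands.
# All function names and arguments here are hypothetical placeholders.

# Number of parts needed for SIZE bytes split into CHUNK-byte parts
# (ceiling division). Usage: part_count SIZE CHUNK
part_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Start the upload and print the UploadId that every later call must reuse.
# Usage: start_upload BUCKET KEY
start_upload() {
  aws s3api create-multipart-upload --bucket "$1" --key "$2" \
    --query UploadId --output text
}

# Upload a single part and print its ETag. If this call fails or times out,
# only this part has to be re-run; parts already uploaded stay in place.
# Usage: upload_part BUCKET KEY UPLOAD_ID PART_NUMBER PART_FILE
upload_part() {
  aws s3api upload-part --bucket "$1" --key "$2" \
    --upload-id "$3" --part-number "$4" --body "$5" \
    --query ETag --output text
}

# List the parts S3 has already received, to see where to resume.
# Usage: list_uploaded_parts BUCKET KEY UPLOAD_ID
list_uploaded_parts() {
  aws s3api list-parts --bucket "$1" --key "$2" --upload-id "$3"
}

# Assemble the parts into the final object. PARTS_JSON is a local file of the
# form {"Parts": [{"ETag": "...", "PartNumber": 1}, ...]}.
# Usage: finish_upload BUCKET KEY UPLOAD_ID PARTS_JSON
finish_upload() {
  aws s3api complete-multipart-upload --bucket "$1" --key "$2" \
    --upload-id "$3" --multipart-upload "file://$4"
}
```

Because each part is a separate call, a part that fails (for example after the token expires) can be retried on its own with the same UploadId once credentials are refreshed, and `list_uploaded_parts` shows which part numbers still need to be sent.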

Step-by-step guide

christine-le / HDFS-Hive-Presto Setups.md
Last active February 27, 2019 18:36
HDFS, Hive, and Presto Setups

Hadoop

Pre-Requisites:

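  # Generate an RSA key pair with an empty passphrase
  $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  # Authorize the new public key for logins to this machine
  $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  # Restrict the file to its owner, as sshd requires
  $ chmod 0600 ~/.ssh/authorized_keys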

Edit two files:

christine-le / cp-7-part2.md
Last active August 7, 2017 00:52
cp-7(part2): Transform a data model into a database

Checkpoint 7 (part 2, Transform a data model into a database): Quiz Questions

**1. What is a database migration?**

a) The process of upgrading the DBMS version

b) The process of making changes to a database's schema ← Correct

c) The process of optimizing existing SQL queries

christine-le / cp-10.md
Last active August 2, 2017 03:58
cp-10

Checkpoint 10 (Subqueries): Quiz Questions

**1. Which is NOT a characteristic of subqueries?**

a) They are nested inside another query.

b) They are usually used in the WHERE clause of the outer query.

c) They are executed last, after the outer query executes. ← Correct

christine-le / cp-7.md
Last active July 25, 2017 22:00
cp-7.md

Checkpoint 7: Quiz Questions

**1. What is a database model?**

a) A representation of the applications that interface with a database

b) A representation of the logical structure of a database ← Correct

c) A representation of unstructured data

christine-le / cp-4-6.md
Last active July 20, 2017 22:17
cp-4-6.md

Checkpoints 4-6: Quiz Questions

Checkpoint 4

1. Which of the following are valid keywords in a basic select statement?

a) SELECT, FROM, and WHERE ← Correct

b) EXTRACT, FROM

c) GET, FROM, and WHERE