@WAlekseev
Last active April 22, 2022 17:20
Storage recipes

AWS S3 Storage definition

Storage classes

Official definitions

  • S3 Standard
    • Milliseconds, high throughput

      Big data analytics, mobile and gaming, CDN

  • S3 Standard-IA
    • Milliseconds, high throughput; lower storage cost than S3 Standard, but with a per-GB retrieval charge

      Disaster recovery, backups

  • S3 One Zone-IA
    • Cheaper than S3 Standard-IA because data is limited to a single AZ (lost if that AZ is destroyed)

      Secondary backups and data you can recreate

  • S3 Intelligent-Tiering
    • Frequent Access tier (automatic): the default tier
    • Infrequent Access tier (automatic): objects not accessed for 30 days
    • Archive Instant Access tier (automatic): objects not accessed for 90 days
    • Archive Access tier (optional): configurable from 90 up to 730 days
    • Deep Archive Access tier (optional): configurable from 180 up to 730 days

      Auto tiering - moves object automatically between access tiers

  • S3 Glacier Instant Retrieval
    • Millisecond retrieval, low storage cost, but charged for both storage and each object retrieval

      Archiving/backup for data accessed once a quarter, minimum storage duration of 90 days

  • S3 Glacier Flexible Retrieval
    • Expedited: 1 to 5 minutes, ~$0.03 per GB and $10 per 1,000 requests
    • Standard: 3 to 5 hours, ~$0.01 per GB and $0.03 per 1,000 requests
    • Bulk: 5 to 12 hours, ~$0.0025 per GB and $0.025 per 1,000 requests

      Archiving/backup for data accessed about once a quarter; minimum storage duration of 90 days. A restore produces a temporary copy that expires after the number of days you request

  • S3 Glacier Deep Archive
    • Standard: 12 hours
    • Bulk: 48 hours

      Long-term archiving/backup for data accessed rarely (e.g. once or twice a year); minimum storage duration of 180 days

  • S3 Outposts
    • S3 object storage on-premises, on AWS Outposts racks
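Objects in the Glacier classes above are restored with `aws s3api restore-object`; a minimal sketch, where the bucket and key are placeholders and the actual call is opt-in via RUN_RESTORE so the snippet is safe to run without credentials:

```shell
#!/usr/bin/env bash
# Request a temporary restore of an archived object using the Bulk tier.
# Bucket and key are hypothetical examples.
bucket="my-backup-bucket"
key="6month/2022-01-01/db.dump"

# The restored copy stays available for 7 days, then expires automatically.
restore_request='{"Days": 7, "GlacierJobParameters": {"Tier": "Bulk"}}'

if [ -n "${RUN_RESTORE:-}" ]; then
  aws s3api restore-object --bucket "$bucket" --key "$key" \
    --restore-request "$restore_request"
fi
```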

Buckets, objects, keys

  • Objects (files) are addressed by a Key
  • The key is the FULL path after the bucket name: s3://bucket/file.txt, s3://bucket/folder/sub_folder/file.txt
  • A key is composed of prefix + object name: in s3://bucket/folder/sub_folder/file.txt the prefix is folder/sub_folder/ and the object name is file.txt
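The prefix/object-name split can be sketched with plain shell parameter expansion (bucket and paths here are illustrative):

```shell
#!/usr/bin/env bash
# Decompose an S3 key into its prefix and object name.
key="folder/sub_folder/file.txt"
prefix="${key%/*}/"        # everything up to the last "/": "folder/sub_folder/"
object_name="${key##*/}"   # everything after the last "/": "file.txt"
echo "s3://my-bucket/${prefix}${object_name}"
```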

Backup concepts

In a versioning-enabled bucket, an object has one current version and zero or more noncurrent versions.

Buckets can be in one of three states:

  • Unversioned (the default)
  • Versioning-enabled
  • Versioning-suspended

A delete marker is a placeholder that marks an object as deleted; the object's versions actually still exist in the bucket and are still billed by AWS. Thus, if you have three versions of an object stored, you are charged for three objects.
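The versioning states and delete-marker behavior can be sketched with the AWS CLI; the bucket name is a placeholder and the calls are opt-in via RUN_AWS so the snippet does nothing without credentials:

```shell
#!/usr/bin/env bash
# Sketch: versioning states and delete markers (bucket is hypothetical).
bucket="my-backup-bucket"

if [ -n "${RUN_AWS:-}" ]; then
  # Switch the bucket from Unversioned (the default) to Versioning-enabled;
  # Status=Suspended would move it to Versioning-suspended instead.
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled

  # A plain delete only inserts a delete marker as the current version...
  aws s3 rm "s3://${bucket}/file.txt"

  # ...while the noncurrent versions (still billed) remain visible here:
  aws s3api list-object-versions --bucket "$bucket" --prefix "file.txt"
fi
```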

For backup case it's important to define:

  • storage class
  • retention period for stored objects
  • expiration policy
  • lifecycle configuration
  • access policy

Concerns and compliance

  1. Upload one file per transaction: cloud cost depends on storage class, and a single-object upload can be precisely classified by lifecycle rules and policies
  2. Define a lifetime for each object, i.e. a storage expiration and deletion policy
  3. A backup file must be uploaded to the S3 bucket immediately after creation (to protect the data from ransomware encryption on the host)
  4. The S3 bucket policy must prevent deleting or overwriting objects from the host side, i.e. a WORM (write once, read many) policy must be defined and tested
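Point 4 can be approximated with a bucket policy that denies the delete actions to the upload principal. A minimal sketch — the account ID, user name, and bucket are placeholders, and with versioning enabled an "overwrite" only adds a new version, so denying deletes protects the old data:

```shell
#!/usr/bin/env bash
# Generate a deny policy for the (hypothetical) backup-uploader principal.
cat > worm-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDeleteFromBackupHost",
      "Effect": "Deny",
      "Principal": { "AWS": "arn:aws:iam::123456789012:user/backup-uploader" },
      "Action": [ "s3:DeleteObject", "s3:DeleteObjectVersion" ],
      "Resource": "arn:aws:s3:::my-backup-bucket/*"
    }
  ]
}
EOF

# Attach it (requires credentials):
# aws s3api put-bucket-policy --bucket my-backup-bucket \
#   --policy file://worm-policy.json
```

For strict WORM guarantees, S3 Object Lock is the dedicated mechanism; a deny policy like this is a lighter-weight approximation.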

AWS S3 backup policy implementation

  1. Define IAM users
  2. Create S3 bucket with defined storage class, lifetime, expiration policy
  3. Setup policy for IAM principal
  4. Setup bucket access policy
  5. Setup AWS cli on access and configure access profile
  6. Check and test upload
  7. Write AWS S3 backup upload script
  8. Setup cron
  9. Setup backup audit and monitoring tools
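Steps 5–7 can be sketched as a minimal /etc/scripts/backup.sh; the bucket, profile, and source path are placeholders, and the actual upload is opt-in via RUN_UPLOAD so the script is safe to dry-run:

```shell
#!/usr/bin/env bash
# Sketch of a backup upload script (names and paths are hypothetical).
set -euo pipefail

tier="${1:-weekly}"              # weekly | 3month | 6month (passed by cron)
bucket="my-backup-bucket"
profile="backup-uploader"
src="/var/backups/db.dump"
stamp="$(date +%Y-%m-%d)"
key="${tier}/${stamp}/$(basename "$src")"

echo "uploading ${src} -> s3://${bucket}/${key}"
if [ -n "${RUN_UPLOAD:-}" ]; then
  aws s3 cp "$src" "s3://${bucket}/${key}" \
    --profile "$profile" --storage-class STANDARD_IA
fi
```

Keying each upload under a tier prefix (weekly/, 3month/, 6month/) is what lets a per-prefix lifecycle rule expire each schedule independently.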

Backup scheduler

Three schedules: twice a month (keeping roughly the last 2 weeks of backups), every 3 months, every 6 months

Cron

# Twice a month, on the 1st and 15th, at 1:30 AM
30 01 1,15 * * root /etc/scripts/backup.sh weekly

# Every 3 months, on the 1st day of the month, at 1:30 AM
30 01 1 */3 * root /etc/scripts/backup.sh 3month

# Every 6 months, on the 1st day of the month, at 1:30 AM
30 01 1 */6 * root /etc/scripts/backup.sh 6month

Then write a script that copies each object to the target bucket prefix, and attach a lifecycle policy that expires objects under a specific prefix after a set number of days.
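A per-prefix expiration rule can be sketched as follows; the rule ID, prefix, and day counts are illustrative:

```shell
#!/usr/bin/env bash
# Expire objects under the "weekly/" prefix after 30 days, and clean up
# noncurrent versions (and their delete markers' cost) on the same schedule.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-weekly-prefix",
      "Filter": { "Prefix": "weekly/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}
EOF

# Attach it (requires credentials):
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket my-backup-bucket --lifecycle-configuration file://lifecycle.json
```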
