@WAlekseev
Last active April 22, 2022 17:20
Storage recipes

AWS S3 Storage definition

Storage classes

Official definitions

  • S3 Standard
    • Milliseconds, high throughput

      Big data analytics, mobile and gaming, CDN

  • S3 Standard-IA
    • Milliseconds, high throughput; lower storage cost than S3 Standard, but with a per-GB retrieval charge

      Disaster recovery, backups

  • S3 One Zone-IA
    • Cheaper than S3 Standard-IA because data is limited to a single AZ (lost if that AZ is destroyed)

      Secondary backups and data you can recreate

  • S3 Intelligent-Tiering
    • Frequent Access tier (automatic): the default tier
    • Infrequent Access tier (automatic): objects not accessed for 30 days
    • Archive Instant Access tier (automatic): objects not accessed for 90 days
    • Archive Access tier (optional): configurable from 90 up to 730 days
    • Deep Archive Access tier (optional): configurable from 180 up to 730 days

      Auto tiering - moves object automatically between access tiers

  • S3 Glacier Instant Retrieval
    • Millisecond retrieval, low storage cost, but charged for both storage and each object retrieval

      Archiving/backup for data accessed once a quarter, minimum storage duration of 90 days

  • S3 Glacier Flexible Retrieval
    • Expedited: 1 to 5 minutes, ~$0.03 per GB and $10 per 1,000 requests
    • Standard: 3 to 5 hours, ~$0.01 per GB and $0.03 per 1,000 requests
    • Bulk: 5 to 12 hours, ~$0.0025 per GB and $0.025 per 1,000 requests

      Archiving/backup for data accessed about once a quarter; minimum storage duration of 90 days. A restore produces a temporary copy that expires after the number of days you request

  • S3 Glacier Deep Archive
    • Standard: 12 hours
    • Bulk: 48 hours

      Long-term archiving/backup for data accessed rarely (e.g. once or twice a year); minimum storage duration of 180 days

  • S3 Outposts
    • S3 object storage on-premises, on AWS Outposts racks
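Objects in the Glacier classes above are restored with `aws s3api restore-object`; a minimal sketch, where the bucket and key are placeholders and the actual call is opt-in via RUN_RESTORE so the snippet is safe to run without credentials:

```shell
#!/usr/bin/env bash
# Request a temporary restore of an archived object using the Bulk tier.
# Bucket and key are hypothetical examples.
bucket="my-backup-bucket"
key="6month/2022-01-01/db.dump"

# The restored copy stays available for 7 days, then expires automatically.
restore_request='{"Days": 7, "GlacierJobParameters": {"Tier": "Bulk"}}'

if [ -n "${RUN_RESTORE:-}" ]; then
  aws s3api restore-object --bucket "$bucket" --key "$key" \
    --restore-request "$restore_request"
fi
```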

Buckets, objects, keys

  • Objects (files) are addressed by a Key
  • The key is the FULL path after the bucket name: s3://bucket/file.txt, s3://bucket/folder/sub_folder/file.txt
  • A key is composed of prefix + object name: in s3://bucket/folder/sub_folder/file.txt the prefix is folder/sub_folder/ and the object name is file.txt
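The prefix/object-name split can be sketched with plain shell parameter expansion (bucket and paths here are illustrative):

```shell
#!/usr/bin/env bash
# Decompose an S3 key into its prefix and object name.
key="folder/sub_folder/file.txt"
prefix="${key%/*}/"        # everything up to the last "/": "folder/sub_folder/"
object_name="${key##*/}"   # everything after the last "/": "file.txt"
echo "s3://my-bucket/${prefix}${object_name}"
```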

Backup concepts

In a versioning-enabled bucket, an object has one current version and zero or more noncurrent versions.

Buckets can be in one of three states:

  • Unversioned (the default)
  • Versioning-enabled
  • Versioning-suspended

A delete marker is a placeholder that marks an object as deleted; the object's versions actually still exist in the bucket and are still billed by AWS. Thus, if you have three versions of an object stored, you are charged for three objects.
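The versioning states and delete-marker behavior can be sketched with the AWS CLI; the bucket name is a placeholder and the calls are opt-in via RUN_AWS so the snippet does nothing without credentials:

```shell
#!/usr/bin/env bash
# Sketch: versioning states and delete markers (bucket is hypothetical).
bucket="my-backup-bucket"

if [ -n "${RUN_AWS:-}" ]; then
  # Switch the bucket from Unversioned (the default) to Versioning-enabled;
  # Status=Suspended would move it to Versioning-suspended instead.
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled

  # A plain delete only inserts a delete marker as the current version...
  aws s3 rm "s3://${bucket}/file.txt"

  # ...while the noncurrent versions (still billed) remain visible here:
  aws s3api list-object-versions --bucket "$bucket" --prefix "file.txt"
fi
```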

For backup case it's important to define:

  • storage class
  • retention period for stored objects
  • expiration policy
  • lifecycle configuration
  • access policy

Concerns and compliance

  1. Upload one file per transaction: cloud cost depends on storage class, and a single-object upload can be precisely classified by lifecycle rules and policies
  2. Define a lifetime for each object, i.e. a storage expiration and deletion policy
  3. A backup file must be uploaded to the S3 bucket immediately after creation (to protect the data from ransomware encryption on the host)
  4. The S3 bucket policy must prevent deleting or overwriting objects from the host side, i.e. a WORM (write once, read many) policy must be defined and tested
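Point 4 can be approximated with a bucket policy that denies the delete actions to the upload principal. A minimal sketch — the account ID, user name, and bucket are placeholders, and with versioning enabled an "overwrite" only adds a new version, so denying deletes protects the old data:

```shell
#!/usr/bin/env bash
# Generate a deny policy for the (hypothetical) backup-uploader principal.
cat > worm-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDeleteFromBackupHost",
      "Effect": "Deny",
      "Principal": { "AWS": "arn:aws:iam::123456789012:user/backup-uploader" },
      "Action": [ "s3:DeleteObject", "s3:DeleteObjectVersion" ],
      "Resource": "arn:aws:s3:::my-backup-bucket/*"
    }
  ]
}
EOF

# Attach it (requires credentials):
# aws s3api put-bucket-policy --bucket my-backup-bucket \
#   --policy file://worm-policy.json
```

For strict WORM guarantees, S3 Object Lock is the dedicated mechanism; a deny policy like this is a lighter-weight approximation.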

AWS S3 backup policy implementation

  1. Define IAM users
  2. Create S3 bucket with defined storage class, lifetime, expiration policy
  3. Setup policy for IAM principal
  4. Setup bucket access policy
  5. Setup AWS cli on access and configure access profile
  6. Check and test upload
  7. Write AWS S3 backup upload script
  8. Setup cron
  9. Setup backup audit and monitoring tools
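Steps 5–7 can be sketched as a minimal /etc/scripts/backup.sh; the bucket, profile, and source path are placeholders, and the actual upload is opt-in via RUN_UPLOAD so the script is safe to dry-run:

```shell
#!/usr/bin/env bash
# Sketch of a backup upload script (names and paths are hypothetical).
set -euo pipefail

tier="${1:-weekly}"              # weekly | 3month | 6month (passed by cron)
bucket="my-backup-bucket"
profile="backup-uploader"
src="/var/backups/db.dump"
stamp="$(date +%Y-%m-%d)"
key="${tier}/${stamp}/$(basename "$src")"

echo "uploading ${src} -> s3://${bucket}/${key}"
if [ -n "${RUN_UPLOAD:-}" ]; then
  aws s3 cp "$src" "s3://${bucket}/${key}" \
    --profile "$profile" --storage-class STANDARD_IA
fi
```

Keying each upload under a tier prefix (weekly/, 3month/, 6month/) is what lets a per-prefix lifecycle rule expire each schedule independently.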

Backup scheduler

Three schedules: twice a month (keeping roughly the last 2 weeks of backups), every 3 months, every 6 months

Cron

# Twice a month, on the 1st and 15th, at 1:30 AM
30 01 1,15 * * root /etc/scripts/backup.sh weekly

# Every 3 months, on the 1st day of the month, at 1:30 AM
30 01 1 */3 * root /etc/scripts/backup.sh 3month

# Every 6 months, on the 1st day of the month, at 1:30 AM
30 01 1 */6 * root /etc/scripts/backup.sh 6month

Then write a script that copies each object to the target bucket prefix, and attach a lifecycle policy that expires objects under a specific prefix after a set number of days.
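A per-prefix expiration rule can be sketched as follows; the rule ID, prefix, and day counts are illustrative:

```shell
#!/usr/bin/env bash
# Expire objects under the "weekly/" prefix after 30 days, and clean up
# noncurrent versions (and their delete markers' cost) on the same schedule.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-weekly-prefix",
      "Filter": { "Prefix": "weekly/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}
EOF

# Attach it (requires credentials):
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket my-backup-bucket --lifecycle-configuration file://lifecycle.json
```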
