Skip to content

Instantly share code, notes, and snippets.

@enoren5
Last active February 4, 2023 15:41
Show Gist options
  • Save enoren5/6c60b71340a93a55a66eecbf07cda245 to your computer and use it in GitHub Desktop.
Save enoren5/6c60b71340a93a55a66eecbf07cda245 to your computer and use it in GitHub Desktop.
Guide to backing up Postgres

How do I download, cultivate, maintain, handle, and restore a manual archive of Postgres backups?

I composed this note to self (gist) on Friday 9 April 2021 with an early morning start with multiple subsequent revisions. The main guide that I am working with is "Heroku PGBackups" The lesser guide I also used was "Heroku Postgres import export".

BACKING UP (DOWNLOADING)

There are two methods of backup data. The easy way is using Heroku's web GUI Dashboard. The hardway is via CLI.

Method #1: Heroku Dashboard (quick)

Visit the location in my Heroku Dashboard for the NAVY Postgres instance. (I keep a copy of this image locally in my /home/<user>/Dropbox/bax/web/tarot_juicer_Heroku_AWS/NAVY-real-prod/ directory.) As you can see, all you need to do is click the purple: "Create Manual Backup" button and then click the white/purple "Download" button to the location on your local machine.

Method #2: CLI (long winded)

To duplicate or clone a Postgres instance, use:

$ heroku pg:backups:capture NAVY -a tarot-prod # grab 'real' content db

NAVY is the name of the silo which contains my ‘real’ prod content as of this writing but it may be different next time, so be careful.

To create a public (but ephemeral - valid for 1 hour) hyperlink to the db where b008 is the number as it appears at the very bottom of the output of the previous command.

$ heroku pg:backups:url b008 -a tarot-prod

Example output can include:

More on downloading your backups can be found be found here.

To check the backup status:

$ heroku pg:backups -a tarot-prod

This should print a list of backups, including the crucial piece of information: “ID” (which is used in the next command). To get more detailed information about a given backup. Here is the info command:

$ heroku pg:backups:info b008 -a tarot-prod

Where b008 is the ID as it appears in the list of the above command and tarot-prod is the name of the app.

That will show the backup details for the silo ID’ed as b008.

The next step is to download the latest dump using this command:

$ heroku pg:backups:download -a tarot-prod

This will download the most recent backup generated using the commands above showing at the very top of the list. The file name will show as: latest.dump which I can then rename and move to a different location on my local machine, such as my Dropbox folder.

RESTORING (UPLOADING)

The advantage of maintaining a local archive is that Heroku’s silo backups are limited to 5 (I think) and aren’t retained past 30 days. So if there is a catastrophe (like there was on April 2nd, 2021), then my data will be lost. The advantage of keeping a list of local backups, I can retain the backups going as far back as I’d like. If this is the case, then the above technique for duplicating databases in the cloud is not the solution I need.

If you aren’t copying one silo to another all in the cloud, and instead are maintaininn an archive of Postgres blobs locally to my home PC’s Dropbox ‘bax’ folder, then the main requirement for Heroku is that the blob must be uploaded to a public web hosting provider such as aws S3.

What follows is the background and full preliminary documentation onslaught required to understand how to complete backups/restores of Postgres DB’s to Heroku for my tarot_juicer project.

Background:

The Heroku Doc titled "Importing and Exporting Heroku Postgres Databases" states that: “In order for PG Backups to access and import your dump file you will need to upload it somewhere with an HTTP-accessible URL. We recommend using Amazon S3 with a signed url.”

When I read this in the Heroku Doc, my first reaction was that setting up an S3 bucket on AWS is a near-impossible task just because AWS is notorious for it's unrealistic learning curve in general. But I conquered it after a consultation on Upwork and with help from a kind soul for free on GitHub. In the Heroku doc above, there is a link to another Heroku guide titled: "Using AWS S3 to Store Static Assets and File Uploads" which makes setting up S3 with AWS look straightforward and easy to follow along.

As this process above, I leveraged "Signed URLs" on Amazon's website.

With my collection of Postgres silos downloaded to Dropbox, to restore them, you need to use Amazon S3.

The Raw Commands

I’m assuming that you have already logged in as an IAM (root user) in AWS and created a bucket based on the guides above.

Next, for your S3 bucket (I called mine postgres-restore-tarot-juicer), you need to add some JSON to the “Bucket Policy” under the bucket’s Permissions tab. Here is the JSON that UmarGit came up with:

{
    "Version": "2012-10-17",
    "Id": "Policy1625513624874",
    "Statement": [
        {
            "Sid": "Stmt1625513623122",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::postgres-restore-tarot-juicer/*"
        }
    ]
}

Next you can generate a signed URL using the aws console:

$ aws s3 presign s3://your-bucket-address/your-object

As of this writing, today on 7 July 2021, this was the command I used:

$ aws s3 presign s3://postgres-restore-tarot-juicer/2021June25_8801def6-27e0-4b88-875b-842be5704f0b

That should generate a long bucket URI such as this (and in this format): https://postgres-restore-tarot-juicer.s3.amazonaws.com/2021June25_8801def6-27e0-4b88-875b-842be5704f0b?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAY6KWYZACZM3OX2O6%2F20210707%2FOhio%2Fs3%2Faws4_request&X-Amz-Date=20210707T000659Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=e722bdbe239d26a66f9bd51817ded67824bc56bbd051129ec09489128db9a0b1

To test whether that will work, you can open it in a web browser. If Chrome prompts you with the download-to-file dialog box, then this will be proof that it’s working and you can proceed to the next step. But if you get an XML error, please head down to the bottom of this gist to the TROUBLESHOOTING heading for workarounds and things you can try.

Once you know that link is valid, the next step is to restore. Be sure to use the presigned URL and enter the right COLOR from Heroku:

$ heroku pg:backups:restore '<working presin’ed URI>' HEROKU_POSTGRESQL_<COLOR> --remote heroku --app tarot-prod --confirm tarot-prod

If you accidentally enter the wrong COLOR, you will overwrite the old one, losing data. So be extra careful!

This concludes how to backup/restore a Heroku Postgres DB.

Troubleshooting Restores with S3

These are some potential errors you could encounter while presign’ing an S3 bucket (described earlier).

Here is the first one:

This XML file does not appear to have any style information associated with it. The document tree is shown below.
<Error>
<Code>AccessDenied</Code>
<Message>Request has expired</Message>
<X-Amz-Expires>3600</X-Amz-Expires>
<Expires>2021-07-07T01:06:59Z</Expires>
<ServerTime>2021-07-07T10:53:35Z</ServerTime>
<RequestId>3WD5533RANDGPJK8</RequestId>
<HostId>AD2cM2WTE171ZEQdPr7ELlZkeesO/XfSaT5pz6iGIFjRkvX++31cRjyl6wWL/skukRuXY3gvotU=</HostId>
</Error>

If you get that specific error, that indicates that your URI has expired (it only lives for 3600 seconds or 1 hour). So if you generated the URI last night and it is 8 hours later, it won’t work. You’ll need to generate another URI.

There are other XML errors that you may encounter such as this one:

This XML file does not appear to have any style information associated with it. The document tree is shown below.
<Error>
<Code>AuthorizationQueryParametersError</Code>
<Message>Error parsing the X-Amz-Credential parameter; the region 'Ohio' is wrong; expecting 'us-east-2'</Message>
<Region>us-east-2</Region>
<RequestId>ZEQYTCQ6BZ0PPPC0</RequestId>
<HostId>I2rlPtFOMWE88MMZG34vFeUd1p3BnyABGBa5P0vXW2tFvbH5H0k+iF/HbsbV4YMsoK/q0jkp/3g=</HostId>
</Error>

If you get this error, this means your local awscli configuration files inside ~/.aws include incorrect entries. To make a correction, execute these commands in your local shell:

$ export AWS_ACCESS_KEY_ID=<your-aws-access-key-id>
$ export AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key>
$ export AWS_DEFAULT_REGION=us-east-2

You can find these unique variables in your AWS Console as described in answers to this Stack Overflow question: https://stackoverflow.com/questions/21440709/how-do-i-get-aws-access-key-id-for-amazon

FURTHER READING:

I may be able to use these Stack Overflow answers on how to restore a Postgres from file, although I'm not sure if this would work with Heroku.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment