@anilktechie
anilktechie / how-to-copy-aws-rds-to-local.md
Created March 8, 2021 15:54 — forked from syafiqfaiz/how-to-copy-aws-rds-to-local.md
How to copy a production database on AWS RDS (PostgreSQL) to a local development database.
  1. Change your RDS database instance's security group to allow your machine to access it.
    • Add your IP to the security group so you can reach the instance on the Postgres port.
  2. Make a copy of the database using pg_dump (a scripted version of steps 2-3 is sketched after this list).
    • $ pg_dump -h <public dns> -U <my username> -f <name of dump file .sql> <name of my database>
    • you will be asked for the PostgreSQL password.
    • a dump file (.sql) will be created
  3. Restore that dump file to your local database.
    • you might need to drop and recreate the database first
    • $ psql -U <postgresql username> -d <database name> -f <dump file that you want to restore>
    • the database is restored
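These steps can also be scripted. A minimal sketch in Python, assuming hypothetical endpoint, user, and database names, that pg_dump/psql are installed locally, and that the security group already allows your IP:

import subprocess

RDS_HOST = "mydb.abc123.us-east-1.rds.amazonaws.com"  # hypothetical endpoint
DB_USER = "myuser"          # hypothetical RDS username
DB_NAME = "mydb"            # hypothetical database name
DUMP_FILE = "mydb_dump.sql"

# Step 2: dump the remote database (pg_dump prompts for the password).
subprocess.run(["pg_dump", "-h", RDS_HOST, "-U", DB_USER,
                "-f", DUMP_FILE, DB_NAME], check=True)

# Step 3: restore the dump into the local database (drop/recreate it first if needed).
subprocess.run(["psql", "-U", "postgres", "-d", DB_NAME,
                "-f", DUMP_FILE], check=True)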
@anilktechie
anilktechie / postgres-cheatsheet.md
Created March 8, 2021 19:07 — forked from Kartones/postgres-cheatsheet.md
PostgreSQL command line cheatsheet

PSQL

Magic words:

psql -U postgres

Some interesting flags (to see all, use -h or --help depending on your psql version):

  • -E: will describe the underlying queries of the \ commands (cool for learning!)
  • -l: psql will list all databases and then exit (useful if the user you connect with doesn't have a default database, like on AWS RDS)
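For example (illustrative invocations of the two flags above):

psql -E -U postgres    # interactive session that echoes the SQL behind \d, \l, etc.
psql -U postgres -l    # list all databases, then exit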
@anilktechie
anilktechie / pg_flatten_json.sql
Created March 23, 2021 23:02 — forked from imamdigmi/pg_flatten_json.sql
Flattening JSON data in PostgreSQL
create or replace function create_jsonb_flat_view
    (table_name text, regular_columns text, json_column text)
returns text language plpgsql as $$
declare
    cols text;
begin
    execute format($ex$
        select string_agg(format('%2$s->>%%1$L "%%1$s"', key), ', ')
        from (
            select distinct key
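            -- (the preview cuts off here; a plausible completion of the usual
            -- jsonb-flattening pattern: gather the distinct keys from the json
            -- column, then build a <table>_view exposing each key as a column)
            from %1$s, jsonb_each(%2$s)
            order by 1
        ) s;
    $ex$, table_name, json_column)
    into cols;
    execute format($ex$
        drop view if exists %1$s_view;
        create view %1$s_view as
            select %2$s, %3$s from %1$s;
    $ex$, table_name, regular_columns, cols);
    return cols;
end $$;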
@anilktechie
anilktechie / list_objects_google_storage_boto3.py
Created March 26, 2021 08:07 — forked from gleicon/list_objects_google_storage_boto3.py
How to use boto3 with Google Cloud Storage and Python to emulate S3 access.
from boto3.session import Session
from botocore.client import Config
from botocore.handlers import set_list_objects_encoding_type_url
import boto3
ACCESS_KEY = "xx"
SECRET_KEY = "yy"
boto3.set_stream_logger('')
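# (the preview cuts off here; a plausible continuation of the usual
# boto3-against-GCS setup: unregister the S3-only ListObjects encoding
# handler and point the endpoint at storage.googleapis.com; the bucket
# name below is hypothetical)
session = Session(aws_access_key_id=ACCESS_KEY,
                  aws_secret_access_key=SECRET_KEY)
session.events.unregister('before-parameter-build.s3.ListObjects',
                          set_list_objects_encoding_type_url)
s3 = session.resource('s3',
                      endpoint_url='https://storage.googleapis.com',
                      config=Config(signature_version='s3v4'))
for obj in s3.Bucket('my-bucket').objects.all():  # hypothetical bucket
    print(obj.key)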
@anilktechie
anilktechie / lambda_function.py
Created May 26, 2021 02:20 — forked from psa-jforestier/lambda_function.py
AWS Lambda function to gzip-compress files uploaded to S3 (replaces the original file with a .gz version)
###
### This gist contains 2 files : settings.json and lambda_function.py
###
### settings.json
{
"extensions" : ["*.hdr", "*.glb", "*.wasm"]
}
### lambda_function.py
{
"StreamName": "$input.params('stream-name')"
}
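The preview shows the settings file but not the actual compression handler. A minimal sketch of the idea in Python (not the gist's exact code): on an S3 put event, gzip the object, upload the .gz copy, and delete the original. All names below are illustrative.

import gzip
import os
import urllib.parse

import boto3

MATCH_EXTENSIONS = (".hdr", ".glb", ".wasm")  # mirrors settings.json above

s3 = boto3.client("s3")

def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if key.endswith(".gz") or not key.endswith(MATCH_EXTENSIONS):
            continue
        local = "/tmp/" + os.path.basename(key)
        s3.download_file(bucket, key, local)
        with open(local, "rb") as src, gzip.open(local + ".gz", "wb") as dst:
            dst.writelines(src)
        s3.upload_file(local + ".gz", bucket, key + ".gz",
                       ExtraArgs={"ContentEncoding": "gzip"})
        s3.delete_object(Bucket=bucket, Key=key)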
@anilktechie
anilktechie / json-split.py
Created June 22, 2021 16:51 — forked from 97-109-107/json-split.py
A tiny Python thing to split big JSON files into smaller chunks.
#!/usr/bin/env python
# based on http://stackoverflow.com/questions/7052947/split-95mb-json-array-into-smaller-chunks
# usage: python json-split.py filename.json
# produces multiple filename_<n>.json files of about 1.49 MB each
import json
import sys

with open(sys.argv[1], 'r') as infile:
    o = json.load(infile)
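# (the preview cuts off here; a plausible completion in the spirit of the
# linked StackOverflow answer: slice the loaded array into fixed-size
# chunks and write each to <name>_<n>.json; CHUNK_SIZE is an assumption)
CHUNK_SIZE = 1000
base = sys.argv[1].rsplit('.json', 1)[0]
for i, start in enumerate(range(0, len(o), CHUNK_SIZE)):
    with open('%s_%d.json' % (base, i), 'w') as outfile:
        json.dump(o[start:start + CHUNK_SIZE], outfile)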
@anilktechie
anilktechie / 13b_explore_source_data.py
Created July 16, 2021 18:59 — forked from lucadefra92/13b_explore_source_data.py
#aws #copy_data_from_s3_to_redshift #redshift #s3
import boto3  # needed by the calls below; not shown in the truncated preview

# Define S3 client (access_key_id / secret_access_key are defined elsewhere
# in the series this file belongs to)
s3 = boto3.client(
    "s3",
    aws_access_key_id=access_key_id,
    aws_secret_access_key=secret_access_key
)

# Get object containing the file to be staged
obj = s3.get_object(
    Bucket="data-to-migrate",
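    # (the preview cuts off mid-call; a plausible completion with a
    # hypothetical object key, reading the body to peek at the first rows)
    Key="source/data.csv"
)
raw = obj["Body"].read().decode("utf-8")
print(raw.splitlines()[:5])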
@anilktechie
anilktechie / 07b_get_cluster_parameters.py
Created July 16, 2021 19:00 — forked from lucadefra92/07b_get_cluster_parameters.py
#aws #copy_data_from_s3_to_redshift #redshift #s3
import configparser
# Read AWS credentials from the config file
cfg_data = configparser.ConfigParser()
cfg_data.read('dl.cfg')
# Save the Redshift cluster parameters
cluster_identifier = cfg_data["Redshift"]["cluster_identifier"]
cluster_type = cfg_data["Redshift"]["cluster_type"]
node_type = cfg_data["Redshift"]["node_type"]
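The reads above imply a dl.cfg of roughly this shape (values are illustrative):

[Redshift]
cluster_identifier = my-redshift-cluster
cluster_type = multi-node
node_type = dc2.large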
###############################################
# Script settings and constants.
###############################################
SCRIPT_PATH = 'script.sql'
DB_CONNECTION = {
    'db_host': 'myhost.redshift.amazonaws.com',
    'db_name': 'somedb',
    'db_username': 'user',
    'db_password': 'pA$$word'