coingraham / generate_presigned.py
Created January 15, 2019 18:28
Presigned URL Generator
"""
This script creates pre-signed URLs for use in AMS CFT deployments.
You can read more about pre-signed URLs here:
https://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html
"""
# You will need Python installed, plus the boto3 package. Once Python is
# installed, you can install boto3 with "pip install boto3".
import boto3
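
The preview ends at the import. As a hedged sketch of what such a generator typically does (the bucket, key, and expiry below are placeholders, not values from the gist), boto3's generate_presigned_url can sign an upload URL:

import boto3

s3 = boto3.client("s3")

# Sign a URL that lets the holder PUT an object without AWS credentials.
# Bucket and key are placeholders for whatever the AMS CFT deployment needs.
url = s3.generate_presigned_url(
    ClientMethod="put_object",
    Params={"Bucket": "my-deployment-bucket", "Key": "templates/stack.yaml"},
    ExpiresIn=3600,  # seconds the URL stays valid
)
print(url)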
coingraham / gist:030b3a3eb525103845246e3321f5aeeb
Created December 12, 2018 14:12
Cloudwatch Filter Port 25
[version, accountid, interfaceid, srcaddr, dstaddr, srcport, dstport=25, protocol, packets, bytes, start, end, action, logstatus]
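
The pattern matches VPC Flow Log events whose destination port is 25 (SMTP). A minimal sketch of applying it with boto3 follows; the log group name is an assumption:

import boto3

logs = boto3.client("logs")

# Pull flow-log events where dstport=25; "/vpc/flow-logs" is a placeholder name.
resp = logs.filter_log_events(
    logGroupName="/vpc/flow-logs",
    filterPattern="[version, accountid, interfaceid, srcaddr, dstaddr, srcport, "
                  "dstport=25, protocol, packets, bytes, start, end, action, logstatus]",
)
for event in resp["events"]:
    print(event["message"])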
coingraham / get_created_date_snippet.py
Created December 5, 2018 16:53
Get created date python snippet
from datetime import datetime

date_created_at = datetime.strptime(image.creation_date, "%Y-%m-%dT%H:%M:%S.000Z")
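
Here `image` is presumably a boto3 EC2 Image resource. A self-contained sketch for context; the AMI id is a placeholder, and "%f" is used so non-zero milliseconds also parse:

from datetime import datetime

import boto3

ec2 = boto3.resource("ec2")
image = ec2.Image("ami-0123456789abcdef0")  # placeholder AMI id
date_created_at = datetime.strptime(image.creation_date, "%Y-%m-%dT%H:%M:%S.%fZ")
print(date_created_at)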
Import-Module ADFS
# Register AWS sign-in as a relying party; metadata is monitored and auto-updated.
Add-ADFSRelyingPartyTrust -Name "Amazon Web Services & AD Groups" -MetadataURL "https://signin.aws.amazon.com/static/saml-metadata.xml" -MonitoringEnabled:$true -AutoUpdateEnabled:$true
# Load issuance transform and authorization rules from local claim-rule files.
$ruleSet = New-AdfsClaimRuleSet -ClaimRuleFile ((pwd).Path + "\claims-AD-Groups.txt")
$authSet = New-AdfsClaimRuleSet -ClaimRuleFile ((pwd).Path + "\auth.txt")
# Attach both rule sets to the relying party trust created above.
Set-AdfsRelyingPartyTrust -TargetName "Amazon Web Services & AD Groups" -IssuanceTransformRules $ruleSet.ClaimRulesString -IssuanceAuthorizationRules $authSet.ClaimRulesString
coingraham / getpresignedurl.py
Created June 28, 2018 15:19
Get a Presigned URL
import boto3
profile = "myprofile"
region = "us-east-1"
expiration = 3600 # one hour in seconds
bucket = "mybucket"
key = "myobjectkey"
session = boto3.session.Session(profile_name=profile, region_name=region)
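
The preview stops at the session. A hedged sketch of the likely remainder, signing a download URL from the variables above:

s3 = session.client("s3")

# Sign a GET for the object; anyone holding the URL can download it until expiry.
url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": bucket, "Key": key},
    ExpiresIn=expiration,
)
print(url)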
coingraham / glue_python_spark_hello_world_dataframe.py
Created June 6, 2018 17:18
Glue Python Spark Hello World Job Dataframe
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
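
The preview ends at argument parsing. What typically follows in a Glue job is the context/job boilerplate plus, for a hello world, a tiny in-memory DataFrame; the continuation below is a sketch, not the gist's actual code:

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Hello world: build a two-row DataFrame in memory and print it.
df = spark.createDataFrame([("hello", 1), ("world", 2)], ["word", "count"])
df.show()

job.commit()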
coingraham / glue_python_spark_hello_world_job.py
Last active June 6, 2018 17:19
Glue Python Hello World Job Dataframe
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.dynamicframe import DynamicFrame
## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
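
This variant also imports DynamicFrame, which suggests a round-trip between Glue's DynamicFrame and a Spark DataFrame; the continuation below is an assumed sketch:

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

df = spark.createDataFrame([("hello", 1)], ["word", "count"])
dyf = DynamicFrame.fromDF(df, glueContext, "hello_world")  # DataFrame -> DynamicFrame
dyf.toDF().show()                                          # and back again

job.commit()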
coingraham / emr_glue_spark_step.py
Created June 6, 2018 17:08
EMR Glue Catalog Python Spark Pyspark Step Example
from pyspark.context import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession
if __name__ == "__main__":
    # Create the Spark session (include Hive support so the Glue Data Catalog is reachable)
    spark = SparkSession\
        .builder\
        .appName("SparkEMRUsingGlueCatalog")\
        .enableHiveSupport()\
        .getOrCreate()
coingraham / emr_spark_step_hello_world.py
Created June 6, 2018 17:03
Hello World for EMR Spark Step Python Pyspark
from pyspark.context import SparkContext
from pyspark.sql import SparkSession
if __name__ == "__main__":
    # Create the spark session
    spark = SparkSession\
        .builder\
        .appName("SparkEMR")\
        .getOrCreate()
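
    # (Sketch, not from the gist.) A trivial action to prove the session works,
    # then a clean shutdown.
    spark.range(5).show()
    spark.stop()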
coingraham / spark_s3_dataframe_gdelt.py
Created June 4, 2018 14:46 — forked from jakechen/spark_s3_dataframe_gdelt.py
Creating PySpark DataFrame from CSV in AWS S3 in EMR
# Example uses GDELT dataset found here: https://aws.amazon.com/public-datasets/gdelt/
# Column headers found here: http://gdeltproject.org/data/lookups/CSV.header.dailyupdates.txt
# Load RDD
lines = sc.textFile("s3://gdelt-open-data/events/2016*") # Loads 73,385,698 records from 2016
# Split lines into columns; change the split() argument to match the delimiter, e.g. '\t'
parts = lines.map(lambda l: l.split('\t'))
# Convert RDD into DataFrame
# urllib.request is the Python 3 location of urlopen (Python 2's "from urllib import urlopen" no longer works)
from urllib.request import urlopen
html = urlopen("http://gdeltproject.org/data/lookups/CSV.header.dailyupdates.txt").read().decode().rstrip()
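
The preview cuts off here. A sketch of the likely remainder, turning the header line into column names and attaching them to the split RDD (assuming the header file is tab-delimited like the data):

# One tab-delimited line of column names
columns = html.split('\t')

# Attach the names to the split RDD and materialize a DataFrame
df = parts.toDF(columns)
df.show(5)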