Skip to content

Instantly share code, notes, and snippets.

@njeirath
Created November 1, 2017 03:29
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save njeirath/2fb9a7d18301350c332d6fc7c5a46954 to your computer and use it in GitHub Desktop.
Save njeirath/2fb9a7d18301350c332d6fc7c5a46954 to your computer and use it in GitHub Desktop.
Parses AWS instance list
# Download https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json to local directory as index.json
#
# Install pandas
import json
import pandas as pd
f = file("index.json", "r")
raw_data = json.loads(f.read())
lumpy_data = [v for k, v in raw_data["products"].iteritems()]
data = []
for ld in lumpy_data:
i = {k: v for k,v in ld["attributes"].iteritems()}
i["sku"] = ld["sku"]
i["productFamily"] = ld["productFamily"]
data.append(i)
df = pd.DataFrame(data)
#[nan, u'AWS GovCloud (US)', u'Asia Pacific (Mumbai)', u'Asia Pacific (Seoul)', u'Asia Pacific (Singapore)',
# u'Asia Pacific (Sydney)', u'Asia Pacific (Tokyo)', u'Canada (Central)', u'EU (Frankfurt)', u'EU (Ireland)',
# u'EU (London)', u'South America (Sao Paulo)', u'US East (N. Virginia)', u'US East (Ohio)',
# u'US West (N. California)', u'US West (Oregon)']
df["location"] = df["location"].map({
"US East (Ohio)": "us-east-2", "US East (N. Virginia)": "us-east-1", "US West (N. California)": "us-west-1",
"US West (Oregon)": "us-west-2", "Asia Pacific (Mumbai)": "ap-south-1", "Asia Pacific (Seoul)": "ap-northeast-2",
"Asia Pacific (Singapore)": "ap-southeast-1", "Asia Pacific (Sydney)": "ap-southeast-2",
"Asia Pacific (Tokyo)": "ap-northeast-1", "Canada (Central)": "ca-central-1", "EU (Frankfurt)": "eu-central-1",
"EU (Ireland)": "eu-west-1", "EU (London)": "eu-west-2", "South America (Sao Paulo)": "sa-east-1"})
print df[(df["operatingSystem"] == "Linux")
& (df["location"].notnull())
& (df["productFamily"] == "Compute Instance")][["location", "instanceType"]].\
drop_duplicates().to_json(orient="records")
@aliasguru
Copy link

aliasguru commented Nov 20, 2017

Hi njeirath,

I've made a few adaptions to this script here, maybe there are a few features which you'd like to implement from that. The changes are:

  • use Python 3 syntax (as I only have that installed)
  • add variables to specify an output file and the minimum number of CPU cores an instance shall have (the file var walks up one dir level because I have that script installed in a brenda-web subfolder. Using this walk I can directly update the instances.json file)
  • implement code to write the file instead of printing it to the command line (no chance for copy&paste on Windows, DOS shell sucks!)

Here's the code so far:

`
# Download https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json to local directory as index.json
#
# Install pandas

import json
import pandas as pd

__output_file__ = '../app/instances.json'
__min_number_of_cores__ = 18

f = open("index.json", "r")

raw_data = json.loads(f.read())

lumpy_data = [v for k, v in raw_data["products"].items()]
data = []

for ld in lumpy_data:
    i = {k: v for k,v in ld["attributes"].items()}
    i["sku"] = ld["sku"]
    i["productFamily"] = ld["productFamily"]
    data.append(i)

df = pd.DataFrame(data)

#[nan, u'AWS GovCloud (US)', u'Asia Pacific (Mumbai)', u'Asia Pacific (Seoul)', u'Asia Pacific (Singapore)',
# u'Asia Pacific (Sydney)', u'Asia Pacific (Tokyo)', u'Canada (Central)', u'EU (Frankfurt)', u'EU (Ireland)',
# u'EU (London)', u'South America (Sao Paulo)', u'US East (N. Virginia)', u'US East (Ohio)',
# u'US West (N. California)', u'US West (Oregon)']

df["location"] = df["location"].map({
    "US East (Ohio)": "us-east-2", "US East (N. Virginia)": "us-east-1", "US West (N. California)": "us-west-1",
    "US West (Oregon)": "us-west-2", "Asia Pacific (Mumbai)": "ap-south-1", "Asia Pacific (Seoul)": "ap-northeast-2",
    "Asia Pacific (Singapore)": "ap-southeast-1", "Asia Pacific (Sydney)": "ap-southeast-2",
    "Asia Pacific (Tokyo)": "ap-northeast-1", "Canada (Central)": "ca-central-1", "EU (Frankfurt)": "eu-central-1",
    "EU (Ireland)": "eu-west-1", "EU (London)": "eu-west-2", "South America (Sao Paulo)": "sa-east-1"})

_instanceData = df[(df["operatingSystem"] == "Linux") 
				& (df["location"].notnull()) 
				& (df["productFamily"] == "Compute Instance") 
				& (df['vcpu'].astype(float) >= __min_number_of_cores__)][["location", "instanceType"]].\
			drop_duplicates().to_json(orient="records")

try:
	_t = open(__output_file__, 'w')
	_t.write(_instanceData)
	print('data written to instances.json')

except Exception as exc:
	print('Failed to write file:', exc)

finally:
	_t.close()

`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment