Skip to content

Instantly share code, notes, and snippets.

View franc3000's full-sized avatar

Franklin franc3000

View GitHub Profile
@franc3000
franc3000 / zipcode_research.sql
Created June 20, 2018 15:07
zipcode research
select a.fips, b.zipcode, count_n, cities, p_50_sqft, p_50_estval
from (
select fips, zipcode, cast(sum(n) as unsigned) count_n -- cast to int, otherwise decimal
from feature_fips_zips_counts
where fips = '48453'
and sfr=1
group by 1,2
) a
left join feature_tax_zipcode_city b
on a.zipcode=b.zipcode
@franc3000
franc3000 / bash_profile
Created April 10, 2018 17:13
bash_profile
export PS1="\T \[\033[36m\]\u\[\033[m\] @ \[\033[32m\]\h:\[\033[33;1m\]\w\[\033[m\]\$ "
alias ll='ls -lahG'
@franc3000
franc3000 / fe.py
Last active April 7, 2018 13:33
Feature engineering automation
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, Binarizer
from sklearn.base import TransformerMixin, BaseEstimator
"""
PyData Chicago
@franc3000
franc3000 / custom_columns.txt
Created January 22, 2018 22:59
columns for custom lists
SitusAddress,
SitusCity,
SitusState,
SitusZip,
SitusZip4,
SitusCounty,
SitusStateCountyFIPS,
PropertyType,
PropertyZoning,
PropertyGroup,
# empty dictionary for storing DataFrames
results = {}
# list of params for query
my_list = ['MX19', 'MX20', 'MX21', 'MX22', 'MX23', 'MX24', 'MX25', 'MX26', 'MX27', 'MX28', 'MX29', 'MX30', 'MX31', 'MX32']
for x in my_list:
ADM = """
SELECT a.SQLDATE,
@franc3000
franc3000 / pandas_subsets.py
Created November 10, 2017 18:36
pandas subsets
# To select rows whose column value equals a scalar, some_value, use ==:
df.loc[df['column_name'] == some_value]
# To select rows whose column value is in an iterable, some_values, use isin:
df.loc[df['column_name'].isin(some_values)]
# Combine multiple conditions with &:
@franc3000
franc3000 / update_inner_join.sql
Created November 3, 2017 19:40
update inner join
update tableA a
inner join tableB b
on a.RTPropertyUniqueIdentifier=b.RTPropertyUniqueIdentifier
SET a.fips=b.SitusStateCountyFIPS,
a.col2=b.col2;
@franc3000
franc3000 / ec2_pricing.txt
Last active October 31, 2017 17:18
ec2 pricing #aws
General Purpose - Current Generation
| vCPU | ECU | Memory (GiB) | Instance Storage (GB) | Linux/UNIX Usage |
============|======|==========|==============|=======================|==================|==========
t2.nano | 1 | Variable | 0.5 | EBS Only | $0.0058 | per Hour
t2.micro | 1 | Variable | 1 | EBS Only | $0.0116 | per Hour
t2.small | 1 | Variable | 2 | EBS Only | $0.023 | per Hour
t2.medium | 2 | Variable | 4 | EBS Only | $0.0464 | per Hour
t2.large | 2 | Variable | 8 | EBS Only | $0.0928 | per Hour
t2.xlarge | 4 | Variable | 16 | EBS Only | $0.1856 | per Hour
@franc3000
franc3000 / update_ubuntu.sh
Created October 31, 2017 17:05
update ubuntu #ubuntu
sudo apt-get update # Fetches the list of available updates
sudo apt-get upgrade # Strictly upgrades the current packages
sudo apt-get dist-upgrade # Installs updates (new ones)
@franc3000
franc3000 / table_size.sql
Last active October 31, 2017 17:05
db table sizes
SELECT
table_schema,
table_name,
round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`,
sum(table_rows) Rows
FROM information_schema.TABLES
WHERE table_schema='paretodb'
GROUP BY 1,2
ORDER BY (data_length + index_length) DESC
LIMIT 100