Chris Herwig hrwgc

  • Google
  • San Francisco
@hrwgc
hrwgc / validate.sh
Created November 13, 2013 19:57
bash wget - check if file exists at url before downloading
#!/bin/bash
# simple function to check http response code before downloading a remote file
# example usage:
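# if validate_url "$url"; then dosomething; else echo "does not exist"; fi
function validate_url(){
  # --spider requests headers only; -S prints them on stderr, hence 2>&1.
  # Exit 0 when any response line reports status 200, whatever the HTTP version.
  wget -S --spider "$1" 2>&1 | grep -qE 'HTTP/[0-9.]+ 200'
}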
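For comparison, the same "does it exist before I download it" check can be sketched in Python with only the standard library. This is an analogue, not the gist's method: the name `url_exists` and the choice to treat any 2xx response as success are assumptions made for illustration.

```python
import urllib.request


def url_exists(url, timeout=10):
    """Return True if a HEAD request to `url` ends in a 2xx response."""
    try:
        # Building the Request can raise ValueError for a malformed URL,
        # so it lives inside the try block as well.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError (4xx/5xx), timeouts and
        # refused connections; all of them mean "do not download".
        return False


if __name__ == "__main__":
    import sys

    for candidate in sys.argv[1:]:
        print(candidate, "exists" if url_exists(candidate) else "does not exist")
```

Returning a boolean, rather than printing "true", lets the caller branch on the result directly.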
@hrwgc
hrwgc / wkt_to_proj4.py
Created January 2, 2013 16:37
WKT to Proj4 string GDAL python
from osgeo import osr
srs = osr.SpatialReference()
wkt_text = 'PROJCS["Transverse Mercator",GEOGCS["GCS_Everest_1830",DATUM["D_Everest_1830",SPHEROID["Everest_1830",6377299.36,300.8017]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500295.0],PARAMETER["False_Northing",-2000090.0],PARAMETER["Central_Meridian",90.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]'
# The osr method name is capitalized: ImportFromWkt, not importFromWkt
srs.ImportFromWkt(wkt_text)
print(srs.ExportToProj4())

perf guide

Profiling tools are critical for trying to understand performance bottlenecks in code.

Without real data about what a program is doing while it runs, detecting bottlenecks is at best a process of trial and error for both users and developers. Thoughtful testing of various program configurations, along with measuring the time elapsed for discrete runs, can often be enough. But why not learn faster and more enjoyable ways to rapidly collect real data about what a program is doing?
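As one illustration of collecting that kind of data, here is a minimal sketch using Python's standard-library `cProfile` and `pstats` modules. The workload functions `slow_sum` and `run_workload` are hypothetical stand-ins for a real program, and the guide itself may well have other profilers in mind.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop standing in for a hot code path (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_workload():
    """Hypothetical program entry point that hits the hot function repeatedly."""
    return [slow_sum(20_000) for _ in range(50)]


def profile_report(func, top=5):
    """Run `func` under cProfile and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(run_workload))
```

The report lists call counts and time per function, which is the sort of output a user can hand to a developer without having to interpret it.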

Actual data about program execution, such as which functions are called while a program is active, helps point to the hot parts of the code where most time is being spent. While users of applications may not easily understand the output of profiling tools, being equipped to generate profiling output can be extremely useful for sharing with developers, since the time it takes developers to set up robust test cases can be greater than the time it takes to understand and optimize slow code paths. Therefore it can be invaluable to get

@hrwgc
hrwgc / README.md
Last active September 4, 2023 11:15
VIIRS Nighttime Lights 2012 processing
@hrwgc
hrwgc / aws-cli-s3cmd-du.sh
Last active June 19, 2023 15:32
aws-cli get total size of all objects within s3 prefix. (mimic behavior of `s3cmd du` with aws-cli)
#!/bin/bash
function s3du(){
  # Split s3://bucket/some/prefix into its bucket and prefix parts.
  local bucket prefix
  bucket=$(cut -d/ -f3 <<< "$1")
  prefix=$(cut -d/ -f4- <<< "$1")
  # Sum object sizes and count objects under the prefix, then label both numbers with jq.
  aws s3api list-objects --bucket "$bucket" --prefix "$prefix" --output json --query '[sum(Contents[].Size), length(Contents[])]' | jq '{size: .[0], num_objects: .[1]}'
}
s3du "$1"
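The two steps the script performs, splitting the URL into bucket and prefix and then aggregating sizes, can be sketched in Python without touching AWS. The helper names `split_s3_url` and `summarize_listing` are hypothetical; the input to `summarize_listing` is assumed to be already-parsed JSON in the shape `aws s3api list-objects` returns (a `Contents` list whose entries carry a `Size`).

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/prefix URL, got {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def summarize_listing(listing):
    """Mirror the JMESPath query: total size and object count of a listing."""
    # An empty prefix yields no 'Contents' key at all, so default to [].
    contents = listing.get("Contents") or []
    return {
        "size": sum(obj["Size"] for obj in contents),
        "num_objects": len(contents),
    }


if __name__ == "__main__":
    # Hand-written sample data for illustration, not real bucket contents.
    sample = {"Contents": [{"Key": "a/1.tif", "Size": 100}, {"Key": "a/2.tif", "Size": 250}]}
    print(split_s3_url("s3://example-bucket/a/"))
    print(summarize_listing(sample))
```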
@hrwgc
hrwgc / evernote_parse.py
Last active January 1, 2022 10:13
Evernote note export wrangling to sqlite -> markdown jekyll data
#!/usr/bin/python
# -*- coding: utf-8 -*-
import sqlite3
import sys
import re
import uuid
from bs4 import *
import lxml
import unicodedata
@hrwgc
hrwgc / README.md
Last active May 3, 2020 01:12
download all of your gists from gist.github.com

gist list

A command line script to retrieve json for all of your gists.

usage:

github.sh [username] [password] [total number of gists] [oauth or user:password]
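The "total number of gists" argument suggests the script pages through results. A rough Python sketch of that paging follows, assuming the public GitHub REST endpoint `GET /users/{username}/gists` with its `per_page` (maximum 100) and `page` query parameters. The function names and the optional `token` argument are hypothetical, and this is not a port of `github.sh`.

```python
import json
import math
import urllib.request
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def gist_page_urls(username, total_gists, per_page=100):
    """Build one listing URL per page needed to cover `total_gists` gists."""
    pages = math.ceil(total_gists / per_page)
    return [
        f"{API_ROOT}/users/{quote(username)}/gists?per_page={per_page}&page={page}"
        for page in range(1, pages + 1)
    ]


def fetch_all_gists(username, total_gists, token=None):
    """Download and concatenate the JSON arrays from every page (needs network)."""
    gists = []
    for url in gist_page_urls(username, total_gists):
        request = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
        if token:
            # Assumption: a personal access token supplied by the caller.
            request.add_header("Authorization", f"Bearer {token}")
        with urllib.request.urlopen(request) as response:
            gists.extend(json.load(response))
    return gists
```

Only `gist_page_urls` is pure; `fetch_all_gists` makes live requests and is subject to GitHub's rate limits.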
@hrwgc
hrwgc / translate.sh
Created December 12, 2012 20:34
Command line translator application using Bing's free Translator service.
#!/bin/bash
if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
echo 'Usage: translate.sh ["Original Text"] ["Source Language"] ["Target Language"]'
echo 'Example: translate.sh "Hello World" "en" "fr"'
declare -a LANG_NAMES=('Arabic' 'Czech' 'Danish' 'German' 'English' 'Estonian' 'Finnish' 'French' 'Dutch' 'Greek' 'Hebrew' 'Haitian Creole' 'Hungarian' 'Indonesian' 'Italian' 'Japanese' 'Korean' 'Lithuanian' 'Latvian' 'Norwegian' 'Polish' 'Portuguese' 'Romanian' 'Spanish' 'Russian' 'Slovak' 'Slovene' 'Swedish' 'Thai' 'Turkish' 'Ukrainian' 'Vietnamese' 'Simplified Chinese' 'Traditional Chinese');
declare -a LANG_CODES=('ar' 'cs' 'da' 'de' 'en' 'et' 'fi' 'fr' 'nl' 'el' 'he' 'ht' 'hu' 'id' 'it' 'ja' 'ko' 'lt' 'lv' 'no' 'pl' 'pt' 'ro' 'es' 'ru' 'sk' 'sl' 'sv' 'th' 'tr' 'uk' 'vi' 'zh-CHS' 'zh-CHT');
for IX in "${!LANG_CODES[@]}"; do
echo "${LANG_CODES[$IX]}: ${LANG_NAMES[$IX]}"
done
@hrwgc
hrwgc / postgresql_quantiles.md
Last active May 31, 2018 13:36
Calculate quantile distributions for PostgreSQL column
SELECT
ntile,
CAST(avg(length) AS INTEGER) AS avgAmount,
CAST(max(length) AS INTEGER)  AS maxAmount,
CAST(min(length) AS INTEGER)  AS minAmount 
FROM (SELECT length, ntile(6) OVER (ORDER BY length) AS ntile FROM countries) x
GROUP BY ntile
ORDER BY ntile;
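To make the bucketing concrete, here is a small Python sketch of what `ntile(n)` does to an ordered column and how the outer query then summarizes each bucket. It follows the standard window-function behavior, where buckets are as equal in size as possible and any extra rows go to the earliest buckets. The function names are hypothetical, and Python's `round` may treat exact .5 ties differently from PostgreSQL's integer cast.

```python
def ntile(values, buckets):
    """Pair each value (in sorted order) with its 1-based bucket number."""
    ordered = sorted(values)
    base, extra = divmod(len(ordered), buckets)
    assigned = []
    start = 0
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets each take one additional row.
        size = base + (1 if bucket <= extra else 0)
        assigned.extend((value, bucket) for value in ordered[start:start + size])
        start += size
    return assigned


def quantile_summary(values, buckets):
    """Return (bucket, avg, max, min) per bucket, like the GROUP BY query."""
    groups = {}
    for value, bucket in ntile(values, buckets):
        groups.setdefault(bucket, []).append(value)
    return [
        (bucket, round(sum(members) / len(members)), max(members), min(members))
        for bucket, members in sorted(groups.items())
    ]


if __name__ == "__main__":
    # Hand-written sample lengths for illustration only.
    for row in quantile_summary([10, 20, 30, 40, 50, 60, 70], 3):
        print(row)
```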
@hrwgc
hrwgc / mapbox_learn.md
Last active September 5, 2016 15:24
Resources for those just starting with Mapbox, TileMill, and Quantum GIS