@willettk
willettk / quench_collate.py
Created September 11, 2013 00:26
Quick Python code to collate the GZ: Quench data. Takes ~15-20 minutes to run on a MacBook Pro laptop.
import re
import csv
import numpy as np
import operator
from astropy.io import fits as pyfits
'''
To create the full collated data for GZ: Quench:
>>> import quench_collate as qc
@willettk
willettk / remove_sdss_duplicates
Last active December 29, 2015 01:09
Remove duplicate images (i.e., the black "file not found" image) from a directory of SDSS images. Cut out the center region (which should be pure black), use ImageMagick's identify to locate duplicates, and then delete them.
#!/bin/sh
for img in *.jpg; do
    filename="${img%.*}"
    newfilename="${filename}_cropped"
    convert "$filename.jpg" -crop 100x100+162+162 "$newfilename.jpg"
done
# Probably needs to be run on subsets of images when using the full GZ set, due to memory limitations
# Include at least one black image (badimage_cropped.jpg) in each subset so there is something to compare against
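The script above only produces the cropped cutouts; the duplicate-detection step named in the description is not shown. As a rough sketch of that step, the Python below compares file bytes by SHA-256 instead of using ImageMagick's identify, which is a swapped-in technique and not the author's method. It assumes the black cutouts are byte-identical, which may not hold for every JPEG encoder. The function names and the default directory are hypothetical; `badimage_cropped.jpg` is the reference image named in the comment above.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_duplicates_of(reference, directory="."):
    """List *_cropped.jpg files whose bytes match the reference image.

    Assumption: byte-identical files are treated as duplicates. This is
    stricter than comparing decoded pixels.
    """
    reference = Path(reference)
    target = file_digest(reference)
    return sorted(
        p for p in Path(directory).glob("*_cropped.jpg")
        if p.resolve() != reference.resolve() and file_digest(p) == target
    )
```

Calling `find_duplicates_of("badimage_cropped.jpg")` returns the matching cutouts; deleting the corresponding full-size images is left to the caller.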
@willettk
willettk / gist:7797723
Created December 4, 2013 23:46
Extract the JPG number from the GZ2 location field in TOPCAT.
parseLong(substring(split($41,"/")[4],toInteger(0),length(split($41,"/")[4])-4))
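For readers working outside TOPCAT, the same extraction can be sketched in Python. This assumes the location field is a URL-like string whose fifth "/"-separated piece is the file name and whose suffix is four characters long (".jpg"), as the expression above implies. The function name `jpg_number` and the example URL are hypothetical.

```python
def jpg_number(location):
    """Return the integer file stem from the fifth '/'-separated field."""
    filename = location.split("/")[4]  # same index as split($41,"/")[4]
    return int(filename[:-4])          # drop the 4-character ".jpg" suffix


if __name__ == "__main__":
    # Hypothetical location value, used only for illustration.
    print(jpg_number("http://example.org/zoo2/12345.jpg"))
```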
@willettk
willettk / gist:7797882
Created December 5, 2013 00:01
Copy list of files with awk
cat filelist.txt | awk '{print "cp source_dir/" $1, " targetdir/"}' | bash
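The one-liner above builds `cp` commands as text and feeds them to bash, so file names containing spaces or shell metacharacters would be misread. A possible alternative, sketched in Python under the assumption that the list holds one file name per line, avoids the shell entirely. The function name `copy_listed_files` is hypothetical.

```python
import shutil
from pathlib import Path


def copy_listed_files(list_path, source_dir, target_dir):
    """Copy each file named in list_path from source_dir into target_dir.

    Only the first whitespace-separated field of each line is used, which
    mirrors awk's $1; blank lines are skipped. The target directory is
    created if it is missing (plain cp would not do this).
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for line in Path(list_path).read_text().splitlines():
        fields = line.split()
        if not fields:
            continue
        copied.append(shutil.copy(Path(source_dir) / fields[0], target))
    return copied
```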
@willettk
willettk / gist:7798902
Created December 5, 2013 01:50
Tar a list of files and redirect the verbose output to a file
tar -cvf target_dir/foo.tar -T source_dir/filelist.txt > target_dir/results.txt
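A comparable result can be sketched with Python's standard `tarfile` module, assuming the list holds one path per line. The function name `tar_listed_files` is hypothetical, and the log here records only the listed names, so it is not a byte-for-byte match for tar's verbose output.

```python
import tarfile
from pathlib import Path


def tar_listed_files(list_path, archive_path, log_path):
    """Archive every path named in list_path and log each name."""
    lines = Path(list_path).read_text().splitlines()
    names = [ln.strip() for ln in lines if ln.strip()]
    with tarfile.open(archive_path, "w") as tar, open(log_path, "w") as log:
        for name in names:
            tar.add(name)           # like tar -T: one path per line
            log.write(name + "\n")  # like -v output redirected to a file
    return names
```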
@willettk
willettk / gist:7919434
Created December 11, 2013 22:14
Perl snippet for extracting matched patterns from a file. Necessary since grep no longer supports the -P option. See http://what-if.xkcd.com/75/
perl -nle'print $& if m{\b([yuiophjklbnm]+( |$)){3}}' chatlines.txt
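The same pattern can be tried in Python's `re` module: it looks for three consecutive words made only of the letters in the character class, each followed by a space or the end of the line. This is a sketch under the assumption that each line is processed with its trailing newline removed, as Perl's `-l` switch does. The function name `first_match` is hypothetical.

```python
import re

# Three consecutive words built only from the listed letters,
# each followed by a space or the end of the line.
PATTERN = re.compile(r"\b([yuiophjklbnm]+( |$)){3}")


def first_match(line):
    """Return the first matching run of three words, or None."""
    match = PATTERN.search(line.rstrip("\n"))
    return match.group(0) if match else None


if __name__ == "__main__":
    with open("chatlines.txt") as handle:
        for line in handle:
            found = first_match(line)
            if found is not None:
                print(found)
```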
# Get the average classification counts for each group of subjects in Galaxy Zoo
db.galaxy_zoo_subjects.aggregate([{$group : {_id : "$metadata.survey", nClass : {$avg : "$classification_count"}}}])
# Export data from MongoDB to a CSV file from the shell command line
mongoexport --db ouroboros --collection galaxy_zoo_subjects --csv --fields classification_count --query '{"metadata.survey":"ukidss"}' --out ukidss_nclass.csv
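To make clear what the `$group` and `$avg` stages in the aggregation above compute, here is a plain-Python sketch that performs the same grouping on in-memory dictionaries. It does not talk to MongoDB, it assumes every document carries both fields, and the function name `average_by_survey` and the sample documents are made up for illustration.

```python
from collections import defaultdict


def average_by_survey(subjects):
    """Average classification_count for each metadata.survey value."""
    totals = defaultdict(lambda: [0, 0])  # survey -> [sum, count]
    for doc in subjects:
        survey = doc["metadata"]["survey"]
        totals[survey][0] += doc["classification_count"]
        totals[survey][1] += 1
    return {survey: total / n for survey, (total, n) in totals.items()}


if __name__ == "__main__":
    # Made-up documents shaped like the fields used in the query.
    sample = [
        {"metadata": {"survey": "ukidss"}, "classification_count": 30},
        {"metadata": {"survey": "ukidss"}, "classification_count": 50},
        {"metadata": {"survey": "sloan"}, "classification_count": 50},
    ]
    print(average_by_survey(sample))
```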
@willettk
willettk / gist:b89da6aace158fc8348e
Created July 18, 2014 10:00
Plot a bar chart of the nationalities of attendees at the AGN clustering conference; Garching, July 2014
import pandas
import numpy
import matplotlib.pyplot as plt
data = pandas.read_csv('/Users/willettk/Astronomy/meetings/garching2014/participants.csv',names=('person','country','institution'))
darkblue = '#00008b'
colordict = {
@willettk
willettk / gist:ff7c5ee7338b4e77d527
Created September 24, 2014 18:30
Missions to Mars success rate
Success of missions to Mars by space agency*
Information from https://en.wikipedia.org/wiki/List_of_missions_to_Mars
China: 0/1
ESA: 1/2
Japan: 0/1
India: 1/1
USA: 17/23
Russia/USSR: 2/21
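The counts above can be turned into success fractions with a few lines of Python. This is only a sketch of the arithmetic: the figures are copied from the list as given, and the names `missions` and `success_rates` are hypothetical.

```python
# (successes, attempts) per agency, copied from the list above.
missions = {
    "China": (0, 1),
    "ESA": (1, 2),
    "Japan": (0, 1),
    "India": (1, 1),
    "USA": (17, 23),
    "Russia/USSR": (2, 21),
}


def success_rates(counts):
    """Return the success fraction for each agency."""
    return {agency: wins / tries for agency, (wins, tries) in counts.items()}


if __name__ == "__main__":
    rates = success_rates(missions)
    for agency in sorted(rates, key=rates.get, reverse=True):
        wins, tries = missions[agency]
        print(f"{agency:12s} {wins:2d}/{tries:<2d} {rates[agency]:6.1%}")
```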
from bs4 import BeautifulSoup as bs
import requests
import io
# Scrape the Nobel website to get data on what was served at the Nobel banquets.
# Can be turned into word clouds via Tagxedo
def get_result(year):
    result = requests.get('http://www.nobelprize.org/ceremonies/menus/menu-%4i.html' % year)