I like Wikipedia. There must be some sort of project I could do with its data.
- Wikipedia: there are publicly accessible dumps of Wikipedia data.
I was listening to NPR today and heard that within the UN, about a dozen different blocs vote together on global warming issues:
Too often, job sites show candidates listings that are far off topic. The job title is often not applicable to the candidate and, less often, the location does not match the candidate's location.
Can we build a better system for users by applying a recommender system to existing public listings?
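One way such a system could start is a content-based ranker that addresses the two complaints above: it filters out listings in the wrong location and scores the rest by how closely the job title matches the candidate's profile. This is only a sketch under stated assumptions. The `profile`, `title`, and `location` fields, the sample listings, and the choice of bag-of-words cosine similarity are all hypothetical, not taken from any real job site's data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(candidate, listings, top_n=3):
    """Rank listings in the candidate's location by title similarity to the profile."""
    profile = Counter(tokenize(candidate["profile"]))
    scored = []
    for listing in listings:
        # Hard filter: skip listings outside the candidate's location.
        if listing["location"].lower() != candidate["location"].lower():
            continue
        score = cosine(profile, Counter(tokenize(listing["title"])))
        # Drop listings that share no words at all with the profile.
        if score > 0:
            scored.append((score, listing["title"]))
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_n]]


# Hypothetical sample data, for illustration only.
candidate = {"profile": "senior python data engineer", "location": "New York"}
listings = [
    {"title": "Python Data Engineer", "location": "New York"},
    {"title": "Registered Nurse", "location": "New York"},
    {"title": "Senior Python Developer", "location": "Austin"},
    {"title": "Data Analyst", "location": "New York"},
]
print(recommend(candidate, listings))
```

A real recommender would likely add collaborative signals (what similar candidates clicked on) and a softer treatment of location, but even this filter-then-rank step removes the off-topic and out-of-area listings.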
HomeAway has data on vacation rentals. The data is not nearly as worked over as Airbnb data. Possibly there is something interesting in there to discover.
Here is how I copied data from one S3 bucket to another:
aws s3 sync s3://bitly-challenges/hdb_sanitized s3://hughdbrown/data-capstone

"""
Python script to backup data in src to dst using sha1 hashes of the files
in a backing directory.
Hugh Brown
hughdbrown@yahoo.com
"""
from hashlib import sha1
import os
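The script above is cut off after its imports, so its actual logic is not shown here. What follows is one plausible reading of its docstring, not the original code: each file under `src` is copied into `dst` under a name equal to its SHA-1 digest, so identical content is stored only once. The helper names `file_sha1` and `backup`, the returned manifest, and the use of `dst` as the backing directory are all assumptions.

```python
from hashlib import sha1
import os
import shutil


def file_sha1(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, read in chunks to bound memory."""
    digest = sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup(src, dst):
    """Copy each file under src into dst, named by its SHA-1 digest.

    Returns a dict mapping each source-relative path to its digest.
    dst is assumed to live outside src.
    """
    os.makedirs(dst, exist_ok=True)
    manifest = {}
    for root, _dirs, files in os.walk(src):
        for name in files:
            path = os.path.join(root, name)
            digest = file_sha1(path)
            target = os.path.join(dst, digest)
            # Identical content hashes to the same name, so it is stored once.
            if not os.path.exists(target):
                shutil.copy2(path, target)
            manifest[os.path.relpath(path, src)] = digest
    return manifest
```

Calling `backup("data", "backups")` (hypothetical paths) would fill `backups` with one file per distinct content hash. The manifest is what makes a restore possible, so a fuller version would need to write it to disk as well.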
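import numpy
import scipy.stats as scs


def a_b_test(new_views, new_clicks, old_views, old_clicks, size=10000):
    """Estimate the probability that the new site's click-through rate beats the old one's."""
    # Posterior under a uniform prior: Beta(clicks + 1, non-clicks + 1).
    new_site = scs.beta(a=new_clicks + 1, b=new_views - new_clicks + 1).rvs(size=size)
    old_site = scs.beta(a=old_clicks + 1, b=old_views - old_clicks + 1).rvs(size=size)
    return (new_site > old_site).mean()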
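The same Bayesian A/B comparison can be sketched with only the standard library, which makes it easy to try without SciPy. This variant assumes that views count every impression and that clicks are a subset of them, so the second Beta parameter is the number of non-clicks (views minus clicks) plus one, the usual Beta-Binomial posterior under a uniform prior. The `seed` parameter is an addition for reproducibility and is not part of the function above.

```python
import random


def a_b_test(new_views, new_clicks, old_views, old_clicks, size=10000, seed=None):
    """Estimate P(new click-through rate > old) by sampling Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(size):
        # Beta(clicks + 1, non-clicks + 1) is the posterior under a uniform prior.
        new_rate = rng.betavariate(new_clicks + 1, new_views - new_clicks + 1)
        old_rate = rng.betavariate(old_clicks + 1, old_views - old_clicks + 1)
        wins += new_rate > old_rate
    return wins / size


# Hypothetical counts: the new site has the higher observed click-through rate.
print(a_b_test(new_views=1000, new_clicks=120, old_views=1000, old_clicks=100, seed=0))
```

The returned value is a Monte Carlo estimate, so it varies slightly between runs unless `seed` is fixed. A larger `size` tightens the estimate.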