Jodie Burchell t-redactyl

  • JetBrains
  • Berlin, Germany
@t-redactyl
t-redactyl / data_extraction.py
Created April 12, 2016 09:50
Companion code for blog post
import urllib2
import json
import math
import numpy as np
from pandas import Series, DataFrame
import pandas as pd
import matplotlib.pyplot as plt
def expired_listings(site, searchterm):
t-redactyl / mysql_setup.sql
Last active December 18, 2015 02:07
MySQL code for the blog post: Finding the highest rated Christmas movies in MovieLens 10M (23/12/2015)
-- Create tables
DROP TABLE IF EXISTS ratingsdata;
CREATE TABLE ratingsdata (
userid INT,
itemid INT,
rating INT,
timestamp INT,
PRIMARY KEY (userid, itemid));
DROP TABLE IF EXISTS movies;
t-redactyl / web-scraping.py
Created December 18, 2015 01:02
Web scraping code for the blog post: Finding the highest rated Christmas movies in MovieLens 10M (23/12/2015)
import lxml.html
from lxml.cssselect import CSSSelector
import requests
def get_title(node):
    '''
    Extracts the movie title from the URL http://www.timeout.com/london/film/the-50-best-christmas-movies
    taking into account that some titles are tagged as h3, and some as h3 a.
    '''
    h3_elem = node.cssselect('div.feature-item__text h3')[0]
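The docstring above notes that some titles sit directly in an `h3` and others in an `a` nested inside the `h3`. The gist preview stops before showing how that case is handled, so the following is only a minimal sketch of one way to do it, not the author's code. It uses the standard library's `xml.etree.ElementTree` in place of `lxml`, and the function name `title_from_item`, the sample markup, and the placeholder titles are all hypothetical.

```python
import xml.etree.ElementTree as ET


def title_from_item(item):
    """Return the title text of the first <h3> under `item`.

    Hypothetical helper: handles both <h3>Title</h3> and
    <h3><a href="...">Title</a></h3>.
    """
    h3 = item.find(".//h3")
    if h3 is None:
        return None
    # Prefer the nested link's text when the <h3> wraps an <a>.
    link = h3.find("a")
    text = link.text if link is not None else h3.text
    return text.strip() if text else None


# Made-up fragments that mimic the two cases; not taken from the real page.
plain = ET.fromstring(
    '<div class="feature-item__text"><h3> Title A </h3></div>'
)
linked = ET.fromstring(
    '<div class="feature-item__text"><h3><a href="#">Title B</a></h3></div>'
)

print(title_from_item(plain))
print(title_from_item(linked))
```

Checking for the nested `a` first keeps the fallback to the `h3` text in a single place, so both markup variants go through the same function.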
t-redactyl / cat_class_1_method.py
Created November 12, 2015 03:02
Code associated with blog post:
def name_print(cat):
    '''Print the name of the cat.'''
    print("The cat is called %s." % cat.name)

name_print(felix)
t-redactyl / cleaning_data.R
Created November 4, 2015 05:52
Code associated with blog post
mtcars$am.f <- as.factor(mtcars$am); levels(mtcars$am.f) <- c("Automatic", "Manual")
mtcars$cyl.f <- as.factor(mtcars$cyl); levels(mtcars$cyl.f) <- c("4 cyl", "6 cyl", "8 cyl")
mtcars$vs.f <- as.factor(mtcars$vs); levels(mtcars$vs.f) <- c("V engine", "Straight engine")
mtcars$gear.f <- as.factor(mtcars$gear); levels(mtcars$gear.f) <- c("3 gears", "4 gears", "5 gears")
mtcars$carb.f <- as.factor(mtcars$carb)
t-redactyl / centred_chart.R
Created October 29, 2015 00:23
Code associated with blog post
library(ggplot2); library(gridExtra)
g1 <- ggplot(data=mtcars, aes(x=wt, y=mpg)) +
    geom_point(alpha = 0.7, colour = "#0971B2") +
    ylab("Miles per gallon") +
    ylim(10, 35) +
    xlab("Weight (`000 lbs)") +
    ggtitle("Untransformed Weight") +
    geom_vline(xintercept = 0) +
    theme_bw()
av_peds_2 <- ddply(p.subset, c("date", "collapsed_sensors_2"), summarise,
n_peds = sum(Hourly_Counts))
# Extract weekday versus weekend
av_peds_2$day <- weekdays(av_peds_2$date, abbreviate = FALSE)
av_peds_2$weekend <- ifelse((av_peds_2$day == "Saturday" | av_peds_2$day == "Sunday"),
"Weekend", "Weekday")
av_peds_2$weekend <- as.factor(av_peds_2$weekend)
# Extract time of day
# Load required packages
require(ggplot2); require(gridExtra)
# Set the colours for the graphs
barfill <- "#4271AE"
barlines <- "#1F3552"
line1 <- "black"
line2 <- "#FF3721"
# Plotting histogram of sample 1
use http://www.ats.ucla.edu/stat/stata/library/depress, clear
reshape long dep, i(subj)
rename _j time
drop pre
# Generate the 95% confidence interval.
lci <- -1 * qt(c(.975), 78)
uci <- qt(c(.975), 78)