@IanHopkinson
IanHopkinson / knitrview.R
Created August 20, 2013 15:41
Some R to knit a view on ScraperWiki
#!/usr/bin/Rscript
# Script to knit a file 2013-08-08
# Ian Hopkinson
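# Add the tool's private library to the search path before loading packages,
# so that packages installed there can be found
.libPaths('/home/tool/R/libraries')
library(knitr)
# Use HTML output hooks, then knit the template into the served page
render_html()
knit("/home/tool/view.Rhtml",output="/home/tool/http/index.html")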
IanHopkinson / code.js
Created August 20, 2013 15:40
A little bit of JavaScript that helps set up a box on ScraperWiki for tools inside that box
function save_api_stub(){
  scraperwiki.exec('echo "' + scraperwiki.readSettings().target.url + '" > ~/tool/dataset_url.txt; ')
}
function run_once_install_packages(){
  scraperwiki.exec('run-one tool/runonce.R &> tool/log.txt &')
}
$(function(){
  save_api_stub();
[Results2008Cohort.csv]
CharacterSet=65001
Format=Delimited(|)
ColNameHeader=False
Col1="TESTID" Integer
Col2="VEHICLEID" Integer
Col3="TESTDATE" Char
Col4="TESTCLASSID" Char
Col5="TESTTYPE" Char
Col6="TESTRESULT" Char
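The schema.ini preview above describes a pipe-delimited, headerless file in code page 65001 (UTF-8). As a rough sketch of what that layout means in practice, the same file could be read in Python as shown here. The `COLUMNS` list and the `read_rows` helper are illustrative names, not part of the original gist, and only the six columns visible in the preview are covered.

```python
import csv
import io

# Column names and types copied from the visible part of the schema.ini preview.
# The preview may be truncated, so the real file could contain more columns.
COLUMNS = [
    ("TESTID", int),
    ("VEHICLEID", int),
    ("TESTDATE", str),
    ("TESTCLASSID", str),
    ("TESTTYPE", str),
    ("TESTRESULT", str),
]


def read_rows(handle):
    """Yield one dict per line of a pipe-delimited file with no header row."""
    for fields in csv.reader(handle, delimiter="|"):
        # zip stops at the shorter sequence, so extra trailing fields are ignored.
        yield {name: cast(value) for (name, cast), value in zip(COLUMNS, fields)}


if __name__ == "__main__":
    # Made-up sample line, for illustration only; not taken from the MOT data.
    sample = io.StringIO("1|2|2008-01-01|4|N|P\n")
    for row in read_rows(sample):
        print(row)
```

For a file on disk, the handle would be opened with `open(path, encoding="utf-8", newline="")` to match the `CharacterSet=65001` setting.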
IanHopkinson / MOT_upload_big_files.sql
Created July 15, 2013 14:05
SQL to upload the two big MOT data files to a MySQL database; requires the tables to be using the MyISAM engine.
USE MOT;
ALTER TABLE testitem DISABLE KEYS;
LOAD DATA LOCAL INFILE '[yourpath \\ separated in Windows]UAT_test_item_2011.txt'
INTO TABLE testitem
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
;
IanHopkinson / MOT_populate_db.sql
Created July 15, 2013 14:00
SQL to upload the smaller tables of the MOT test dataset
USE MOT;
LOAD DATA LOCAL INFILE '[yourpath \\ as separator in windows]failure_location.txt'
INTO TABLE failure_location
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
;
LOAD DATA LOCAL INFILE '[yourpath \\ as separator in windows]item_detail.txt'
INTO TABLE testitem_detail
IanHopkinson / MOT_setup_db.sql
Created July 15, 2013 13:53
A schema for uploading MOT test data, as suggested in the user guide with the exception of the `SET storage_engine=MyISAM` which is there to allow the `ENABLE KEYS\DISABLE KEYS` mechanism used in the large table uploads. Also the field `RFRINSPMANDESC TEXT(500) #Variation from suggested schema due to 1074 error` is suggested as `char` in the use…
SET storage_engine=MyISAM;
CREATE SCHEMA IF NOT EXISTS `MOT` ;
USE `MOT` ;
DROP TABLE IF EXISTS TESTRESULT;
CREATE TABLE TESTRESULT (
TESTID INT UNSIGNED
,VEHICLEID INT UNSIGNED
,TESTDATE DATE
IanHopkinson / doctests.py
Last active December 18, 2015 21:39
A demonstration of doctests
import collections


def threshold_above(hist, threshold_value):
    """
    Return the keys of hist whose count is strictly greater than threshold_value.

    >>> threshold_above(collections.Counter({518: 10, 520: 20, 530: 20, 525: 17}), 15)
    [520, 530, 525]
    """
    if not isinstance(hist, collections.Counter):
        raise ValueError("requires collections.Counter")
    above = [k for k, v in hist.items() if v > threshold_value]
    return above
IanHopkinson / TwitterRScraperWiki
Created April 5, 2013 10:44
A gist in R showing how to plot Twitter data from ScraperWiki as a smoothScatter plot
# Some experiments in R to handle twitter profile data
# Ian Hopkinson 2013-03-28
# First we get the data in JSON format and then convert to a data.frame
library(rjson)
TwitterRecordsFile = 'http://box.scraperwiki.com/bdaw7xa/a6e8e71d96b2449/sql?q=select+*+from+twitter_followers'
TwitterRecords = fromJSON(file=TwitterRecordsFile, method='C')
Twitterdf=data.frame(t(sapply(TwitterRecords,c)))
# smoothScatter is in base R
smoothScatter(Twitterdf$followers_count,Twitterdf$following_count,xlim=c(0,5000),ylim=c(0,5000),main="Followers vs following count for a subset of twitter users")