Maarten Hermans mhermans

@mhermans
mhermans / gist:5b33f154c97a9866c447
Last active August 29, 2015 14:01
dierentheater document scraper error
# ERROR
2014-05-19 23:57:10,980 - parsing http://www.lachambre.be/kvvcr/showpage.cfm?section=/flwb&language=fr&rightmenu=right&cfm=/site/wwwcfm/flwb/flwbn.cfm?lang=F&legislat=53&dossierID=3573 --- a document 3573
2014-05-19 23:57:12,065 - LXML parsing http://www.lachambre.be/kvvcr/showpage.cfm?section=/flwb&language=fr&rightmenu=right&cfm=/site/wwwcfm/flwb/flwbn.cfm?lang=F&legislat=53&dossierID=3573 --- a document 3573
2014-05-19 23:57:12,068 - LXML parsing http://www.lachambre.be/kvvcr/showpage.cfm?section=/flwb&language=nl&rightmenu=right&cfm=/site/wwwcfm/flwb/flwbn.cfm?lang=F&legislat=53&dossierID=3573 --- a document 3573 nl
2014-05-19 23:57:12,995 - lachambre_deputy.find (0.00) {'lachambre_id': '01201'}
Traceback (most recent call last):
  File "/home/mhermans/tmp/dierentheater/lachambre_parser/documents.py", line 68, in parse_every_documents
    handle_document(document)
  File "/home/mhermans/tmp/dierentheater/lachambre_parser/documents.py", line 121, in handle_document
mhermans / group_mean.r
Created November 21, 2014 11:51
Calculate group means in R
# Calculate group means in R
data(iris)
summary(iris)
str(iris)
typeof(iris$Petal.Length)
typeof(iris$Species)
class(iris$Petal.Length)
class(iris$Species)
mhermans / writeTable.r
Created November 26, 2014 11:28
Write R tables to Excel
library(XLConnect)
# write function
# --------------
writeTable <- function(table, excel_fn) {
  wbFilename <- excel_fn
  wb <- loadWorkbook(wbFilename, create = TRUE)
  sheet <- 'tables'
  createSheet(wb, sheet)
mhermans / arduino_sunshine.cpp
Created December 14, 2014 19:16
Arduino "You are my sunshine" version using toneAC
#include <toneAC.h>
#define c 261.626 // 262 Hz
#define d 293.665 // 294 Hz
#define e 329.628 // 330 Hz
#define f 349.228 // 349 Hz
#define g 391.995 // 392 Hz
#define a 440.000 // 440 Hz
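The note constants above match twelve-tone equal temperament tuned to A = 440 Hz. As a quick cross-check, here is a minimal sketch in Python (not part of the Arduino code) that recomputes them from the standard formula f = 440 * 2^((n - 69) / 12), where n is the MIDI note number. The MIDI numbers and the names `NOTES` and `note_frequency` are assumptions made for illustration; the gist itself does not mention them.

```python
# Reference pitch: A above middle C, MIDI note 69, at 440 Hz.
A4_HZ = 440.0
A4_MIDI = 69

# Assumed mapping of the gist's note names to MIDI numbers
# (octave starting at middle C). Hypothetical, for illustration only.
NOTES = {"c": 60, "d": 62, "e": 64, "f": 65, "g": 67, "a": 69}


def note_frequency(midi_note: int) -> float:
    """Equal-temperament frequency in Hz for a MIDI note number."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)


# Round to three decimals, the precision used in the #define lines.
frequencies = {name: round(note_frequency(n), 3) for name, n in NOTES.items()}

if __name__ == "__main__":
    for name, hz in frequencies.items():
        print(f"{name}: {hz} Hz")
```

Each semitone multiplies the frequency by the twelfth root of two, so the whole scale follows from the single 440 Hz reference.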
mhermans / 201502110_linkdump_power_mapping_cc.md
Last active August 29, 2015 14:15
Linkdump power mapping/analysis Ch & Cl (Feb. 2015)

Linkdump power mapping Ch & Cl February 2015

The following is an unstructured dump of links, projects, software, and discussion groups that I mentioned yesterday and/or that are interesting for power mapping & data journalism in general.

The number of exclamation marks indicates relevance.

mhermans / 20150227_rstudio_demoserver.md
Last active August 29, 2015 14:16
Set up RStudio demo server

Add swap (walkthrough):

sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
sudo swapon -s

Start up Docker:

mhermans / zotero_box_match.py
Last active August 29, 2015 14:19
Python script to match Zotero group library metadata with Box-hosted ebook files
from boxsdk import Client, OAuth2 # https://github.com/box/box-python-sdk
from boxsdk.exception import BoxAPIException
from pyzotero import zotero # https://github.com/urschrei/pyzotero
import os.path
"""
This script scans a Box input-folder and a Zotero group-library, and matches the ebook PDFs
in the Box-folder with the metadata in the Zotero-library.
Matches are based on the SHA1 file checksum, which Box calculates and tracks automatically
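The checksum-based matching described above can be sketched with the standard library alone. This is a minimal illustration, not the gist's actual code: the preview does not show where the second checksum comes from, so the sketch assumes local copies of the files are hashed with `hashlib` and compared against a dictionary of SHA1 values (for example, as reported by Box). The names `sha1_of_file`, `match_by_sha1` and `remote_sha1_to_name` are hypothetical.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA1 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_by_sha1(local_paths, remote_sha1_to_name):
    """Map each local path to the remote name that shares its SHA1 checksum.

    remote_sha1_to_name is an assumed dict of {sha1_hex: remote_file_name};
    local files without a matching checksum are left out of the result.
    """
    matches = {}
    for path in local_paths:
        checksum = sha1_of_file(path)
        if checksum in remote_sha1_to_name:
            matches[str(path)] = remote_sha1_to_name[checksum]
    return matches
```

Matching on a content checksum rather than on the file name means renamed copies of the same PDF still pair up, while files that differ by even one byte do not.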
Dear,
You were previously registered with this email address for the Sociodata notification service of the Vereniging voor Sociologie.
This service was recently restarted. You can _register here_ if you would like to receive these messages again.
If this is no longer the case, or if you are registered under a different email address, you can ignore this email. We will then send one more reminder and subsequently remove this email address from the Sociodata mailing list.
Kind regards,

From a reddit-comment:

Stupid question on "semanticweb": how do I actually get data? It says the Library of Congress is on 'link-web-data' now. If I want to get a book name by ISBN (using LOC's 'linked data'), how would I do that?

Is there a website for the standardized format of link-data? Are there APIs available?

Also, how do I cross-correlate link-data? Say Amazon also had a link data set (or another "publisher"). How do I correlate ISBN numbers between Amazon, LOC, the patent office, etc. to verify the integrity of such data? Lots of stuff on Google is inaccurate, but that is "ok" because people are verifying it. But with an application, you need a way to ensure the data is correct and is what you are actually looking for.

How do I get the data?

mhermans / strat_example.R
Created June 23, 2010 09:55
Basic example for the R strat package
library(strat)
isco <- c(1200, 3131, 9110)
isei <- recode(isco, informat="isco88", outformat="isei")
esec <- recode(isco, informat="isco88", outformat="esec")
table(isei, esec)
    esec
isei 1 4 6
  29 0 1 0
  48 0 0 1
  68 1 0 0