Philip Parker pdparker

  • Australian Catholic University
  • Australia
@pdparker
pdparker / myCitation
Last active August 29, 2015 14:07
Extract Google Scholar 'My Citation' information
#load beautiful soup and itertools
from bs4 import BeautifulSoup
import itertools
import re
#open URLs while imitating a browser user agent
from urllib import FancyURLopener
class MyOpener(FancyURLopener):
    version = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36'
openurl = MyOpener().open
print "sep=;"
pdparker / cricos
Created January 28, 2015 05:30
CRICOS web scraper
uni = raw_input("Please enter CRICOS ID: ")
#based on https://stockrt.github.io/p/emulating-a-browser-in-python-with-mechanize/
import mechanize
import cookielib
from bs4 import BeautifulSoup
import itertools
import re
pdparker / knitr
Last active August 29, 2015 14:15
knitr makefile
# Transform .Rmd files to slidy files
.SUFFIXES: .Rmd .html .md
all: Day1Part1-Introduction.md Day1Part1-Introduction.html Day1Part1-session2.md Day1Part1-session2.html \
Day1Part1-session3.md Day1Part1-session3.html Day1Part2-session1.md Day1Part2-session1.html \
Day1Part2-session2.md Day1Part2-session2.html
#markdown
%.md: %.Rmd
pdparker / myCite
Last active August 29, 2015 14:19
GS citation profile for conky or geektools
#load beautiful soup and itertools
from bs4 import BeautifulSoup
import itertools
import re
from urllib import FancyURLopener
class MyOpener(FancyURLopener):
    version = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36'
openurl = MyOpener().open
#To use this script, change the "user=" value in the url below to your own Google Scholar ID
url = 'http://scholar.google.com.au/citations?user=xHY4MJ8AAAAJ&hl=en'
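The note about changing the "user=" value can be sketched as a small helper. This is only an illustration, written for Python 3 (the gist itself targets Python 2); the function name `scholar_url` and the `lang` parameter are hypothetical and do not appear in the gist.

```python
from urllib.parse import urlencode

# Base address taken from the gist's url line.
BASE = "http://scholar.google.com.au/citations"

def scholar_url(user_id, lang="en"):
    """Build a citations-profile URL for a given Scholar user ID (hypothetical helper)."""
    # urlencode escapes the values and joins them as user=...&hl=...
    return BASE + "?" + urlencode({"user": user_id, "hl": lang})

# Rebuild the URL used in the gist from its user ID.
print(scholar_url("xHY4MJ8AAAAJ"))
```

Swapping the ID passed to the helper is then the only change needed to point the script at a different profile.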
pdparker / gist:f227b5b3592ed0c52443
Created April 25, 2015 05:15
cricos_splinter_scrapper
import itertools
import re
import string
import csv
from splinter import Browser
from bs4 import BeautifulSoup
#uni = "00219C"
uni = raw_input("")
pdparker / APAstyle
Created June 17, 2015 23:17
APA Style for Markdown
/* I have only tested this on Chrome but it
prints nicely to A4 size */
@media print {
  body {
    width: 210mm;
    height: 297mm;
  }
}
pdparker / APAtemplate
Created June 17, 2015 23:18
Template file for producing an APA document with R Markdown
---
output:
  html_document:
    number_sections: no
    toc: no
    fig_caption: yes
    css: style.css
---
```{r titlePage, echo=FALSE, message=FALSE, warning=FALSE,results='asis'}
pdparker / C
Created July 23, 2015 00:10
C script to split numbers into digits
/* compile using gcc digits.c -lm -o digits */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <errno.h>
#define DIGITS 5
void die(const char *message){
pdparker / R
Created November 19, 2015 00:19
DSTK link to R
data <- read.csv("/Users/phparker/Dropbox/Databases/AmericanStudy.csv")
iplocation <- function(ip=""){
  response <- readLines(paste("http://www.datasciencetoolkit.org//ip2coordinates/",ip,sep=""))
  success <- !any(grepl("null",response))
  ip <- grep("[[:digit:]]*\\.[[:digit:]]*\\.[[:digit:]]*\\.[[:digit:]]*",response,value=T)
  match <- regexpr("[[:digit:]]*\\.[[:digit:]]*\\.[[:digit:]]*\\.[[:digit:]]*",ip)
  ip <- substr(ip,match,as.integer(attributes(match)[1])+match-1)
  if(success==T){
pdparker / pmScraper.R
Last active July 15, 2020 05:05
Scrape Australian PM speeches into a MongoDB database
#################################### Set up database ###########################
# - Make sure the database is set up to be readable, writable and executable without sudo
# - Make sure to start the mongo daemon before setting up the database: usr$ mongod
################################################################################
##Produces mongodb documents with the following fields:
# _id: Transcript id - used to index the files
# title: Title of the speech or interview
# primMinister: Who gave the speech in format 'Last name, First name'
# releaseDate: Given in days since 1970-01-01 as per R's default date storage
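
The field list above can be illustrated with a minimal Python sketch of one such document, assuming only the four fields shown here. The helper names `to_release_days` and `from_release_days` and all field values are hypothetical placeholders, not data from the scraper; the point is only how a days-since-1970-01-01 `releaseDate` converts to and from a calendar date.

```python
from datetime import date, timedelta

# R stores Date values as days since this origin.
EPOCH = date(1970, 1, 1)

def to_release_days(d):
    """Convert a calendar date to days since 1970-01-01 (hypothetical helper)."""
    return (d - EPOCH).days

def from_release_days(n):
    """Convert days since 1970-01-01 back to a calendar date (hypothetical helper)."""
    return EPOCH + timedelta(days=n)

# Placeholder document using the field names from the comment block.
doc = {
    "_id": 1,                                 # transcript id (placeholder)
    "title": "Example speech title",          # placeholder
    "primMinister": "Last name, First name",  # format from the comments
    "releaseDate": to_release_days(date(2015, 11, 19)),
}

# Recover a readable date from the stored integer.
print(from_release_days(doc["releaseDate"]).isoformat())
```

Storing the date as an integer keeps it sortable and range-queryable without any date parsing on the database side.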