Giorgio Comai giocomai

giocomai / SaveWebpage.js
Created May 1, 2015 15:00
Download a webpage with PhantomJS from the command line. Unlike wget, this waits for JavaScript to be processed before the page is saved. Download SaveWebpage.js, then run from the terminal: phantomjs SaveWebpage.js URL nameOfSavedFile
var system = require('system');
var page = require('webpage').create();
var url = system.args[1];
var destination = system.args[2];
page.settings.resourceTimeout = 10000;
setTimeout(function(){
setInterval(function () {
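The invocation given in the description (phantomjs SaveWebpage.js URL nameOfSavedFile) can be repeated for several pages. The following is only a minimal sketch: the function name save_pages and the variables PHANTOMJS_BIN and SAVE_SCRIPT are hypothetical and not part of the gist, and it assumes SaveWebpage.js sits in the current directory.

```shell
# Hypothetical wrapper around the gist's command line; not part of the gist.
# PHANTOMJS_BIN and SAVE_SCRIPT are assumed names that can be overridden.
PHANTOMJS_BIN="${PHANTOMJS_BIN:-phantomjs}"
SAVE_SCRIPT="${SAVE_SCRIPT:-SaveWebpage.js}"

save_pages() {
  # Arguments are consumed in pairs: a URL, then the name of the saved file.
  # A trailing argument without a partner is ignored.
  while [ "$#" -ge 2 ]; do
    "$PHANTOMJS_BIN" "$SAVE_SCRIPT" "$1" "$2" || return 1
    shift 2
  done
}
```

It could be called as save_pages URL1 file1.html URL2 file2.html; it stops at the first page for which the command fails.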
giocomai / RemoveNewLines
Created May 2, 2015 15:39
Remove line breaks from the selection in LibreOffice; useful after copying and pasting from a PDF
sub RemoveNewLine
rem ----------------------------------------------------------------------
rem define variables
dim document as object
dim dispatcher as object
rem ----------------------------------------------------------------------
rem get access to the document
document = ThisComponent.CurrentController.Frame
dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")
giocomai / importCountryNamesRu
Created October 6, 2015 20:50
Import the names of the world's countries in Russian into R
library("XML")
library("plyr")
countriesRu <- ldply(xmlToList(readLines("https://www.artlebedev.ru/tools/country-list/xml/")), data.frame)
countriesRu <- countriesRu[-1,]
giocomai / EUtemplate.R
Last active June 28, 2017 13:45
Castarter template for downloading a website, extracting metadata and exporting a dataset in R
## Install castarter (devtools required for installing from github)
# install.packages("devtools")
devtools::install_github("giocomai/castarter")
## Load castarter
library("castarter")
## Set project and website name
SetCastarter(project = "EuropeanUnion", website = "EuropeanParliament")
giocomai / install_luminance.sh
Last active February 26, 2019 11:53
Script for installing craigcabrey's luminance - A Philips Hue client for Linux written in Python and GTK+ - on Fedora 24
# Clone over HTTPS so that no GitHub SSH key is required
git clone https://github.com/craigcabrey/luminance.git
cd luminance
pip3 install requests
pip3 install netdisco
sudo dnf install gsettings-desktop-schemas gsettings-desktop-schemas-devel pygobject3-devel gtk3-devel
sudo pip3 install phue
./autogen.sh
./configure
sudo dnf install R R-RCurl curl-devel R-zoo R-XML openssl-devel libxml2-devel
giocomai / Turkish language wikipedia pageviews.R
Last active June 5, 2017 13:41
Extracts pageview dumps for the Turkish-language version of Wikipedia for April 2017 and creates basic graphs
library("rvest")
library("tidyverse")
library("lubridate")
library("scales")
Sys.setlocale(category = "LC_TIME", locale = "en_IE")
dumpList <- read_html("https://dumps.wikimedia.org/other/pageviews/2017/2017-04/")
links <- tibble(filename = html_attr(html_nodes(dumpList, "a"), "href")) %>% # extracting links (tibble() replaces the deprecated data_frame())
filter(grepl(x = filename, "projectviews")) %>% # keeping only aggregated data by project
giocomai / UN_country_names.R
Created May 28, 2017 07:39
Extract the names of all UN member states from the official website of the United Nations in R #rstats
library("rvest")
read_html(x = "http://www.un.org/en/member-states/") %>%
html_nodes(xpath = "//span[@class='member-state-name']") %>%
html_text()
giocomai / udunits in Fedora.R
Created June 14, 2017 15:03
Install udunits2 for R on Fedora
# First install the system library and headers from a terminal:
# sudo dnf install udunits2 udunits2-devel
install.packages("udunits2", configure.args = c(udunits2 = '--with-udunits2-include=/usr/include/udunits2'))
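The configure argument above points R at the directory holding the udunits2 headers. The sketch below shows one way to check that location from the shell before installing. It assumes the header file is named udunits2.h and sits either in a udunits2 subdirectory of the include root or in the root itself; the function name find_udunits2_include is hypothetical.

```shell
# Hypothetical helper: print the configure argument for the directory that
# contains udunits2.h, searching under an include root (default /usr/include).
find_udunits2_include() {
  udunits_root="${1:-/usr/include}"
  # Prefer the udunits2 subdirectory, then fall back to the root itself.
  for udunits_dir in "$udunits_root/udunits2" "$udunits_root"; do
    if [ -f "$udunits_dir/udunits2.h" ]; then
      printf '%s\n' "--with-udunits2-include=$udunits_dir"
      return 0
    fi
  done
  # Header not found: the devel package is probably missing.
  return 1
}
```

If the function prints nothing and returns a non-zero status, the header was not found in either location.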
giocomai / 2017-06-19-wikipediaTurk.R
Created June 20, 2017 07:26
Create a graph of pageviews for Turkish-language Wikipedia projects (April-June 2017)
library("rvest")
library("tidyverse")
library("lubridate")
library("scales")
Sys.setlocale(category = "LC_TIME", locale = "en_IE")
dumpListApril <- read_html("https://dumps.wikimedia.org/other/pageviews/2017/2017-04/")
linksApril <- tibble(filename = html_attr(html_nodes(dumpListApril, "a"), "href")) %>% # extracting links (tibble() replaces the deprecated data_frame())
filter(grepl(x = filename, "projectviews")) %>% # keeping only aggregated data by project