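# Download and unpack the example texts.
# mode = "wb" keeps the zip archive intact on Windows, where the URL suffix hides the .zip extension.
download.file("https://github.com/lse-me314/assignment09/blob/master/UKimmigTexts.zip?raw=true", destfile = "UKimmigTexts.zip", mode = "wb")
unzip("UKimmigTexts.zip", exdir = "UKimmigTexts")
# Read every file into a single string, one element per document.
fls <- list.files("UKimmigTexts", full.names = TRUE)
txts <- character(length(fls))
for (i in seq_along(fls)) {
txts[i] <- paste(readLines(fls[i]), collapse = " ")
}
# Build a quanteda corpus, using the file paths as document names.
mycorpus <- quanteda::corpus(txts, docnames = fls)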
@tpaskhalis
tpaskhalis / r_benchmarks.md
Last active September 11, 2017 10:01
R benchmarks

A comparison of two functions for creating a list of arrays in R. The lambda-function implementation (using lapply()) is about 4 times slower than the one using rep().

> larray1 <- function(nx, B) rep(list(array(data = 0.0, dim = c(nx, 1))), B)
> larray2 <- function(nx, B) lapply(1:B, function(x) x <- array(data = 0.0, dim = c(nx, 1)))
> library(microbenchmark)
> microbenchmark(
+   larray1(1211, 5),
+   larray2(1211, 5)
+ )
@tpaskhalis
tpaskhalis / lobbying_uk.md
Last active March 1, 2017 12:23
Links to data sources on lobbying in the UK
@tpaskhalis
tpaskhalis / lm_multilang.md
Created January 27, 2017 18:58
Linear Regression Models in R, Python, Stata

Just a short overview of how fitting linear models and their output look in R, Python and Stata. It will be expanded to more complex models (GLM, GAM) in the future.

For illustration purposes, the mtcars dataset will be used. It comes from a 1981 paper in Biometrics, where it was used to contrast manual vs automatic variable selection in multiple linear regression.

Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391–411.

Here is how it looks in R:

@tpaskhalis
tpaskhalis / convertpdf.md
Last active October 16, 2021 19:53
Batch conversion of pdf files to text

The procedure for automatically converting PDF files into text files in the shell has previously been described in detail here.

In this gist I will focus on writing a bash script that uses the find command-line program. It allows a much sleeker implementation with less code (essentially a one-liner), while being robust to file and folder names that contain whitespace or other non-standard characters (more on the issues of word splitting in bash here).
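As a rough illustration of the find-based approach, here is a minimal sketch. It assumes that pdftotext (from poppler-utils) is the converter; the function name convert_pdfs and the CONVERTER override are hypothetical and are not part of the original gist.

```shell
#!/bin/bash
# Sketch only: convert every PDF under a directory tree to text with find.
# Assumes pdftotext (from poppler-utils) is installed; the CONVERTER
# variable and the convert_pdfs name are hypothetical, added for illustration.
convert_pdfs() {
  local dir="$1"
  local converter="${CONVERTER:-pdftotext}"
  # -exec hands each path to the converter as a single argument, so names
  # containing spaces are not split; pdftotext writes foo.txt next to foo.pdf.
  find "$dir" -type f -name '*.pdf' -exec "$converter" {} \;
}
```

Calling convert_pdfs ~/pdfs would then process that folder and its subfolders. Because find passes each file name directly to the converter, no shell word splitting takes place on the paths.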

Here's the original script:

#!/bin/bash
FILES=~/pdfs/*.pdf
for f in $FILES
do
@tpaskhalis
tpaskhalis / resources.md
Last active March 22, 2018 17:53
Useful Resources
@tpaskhalis
tpaskhalis / ssh_ubuntu_cloud
Last active December 19, 2022 07:39
Adding ssh public key to Ubuntu cloud image
# Instructions for working with QEMU image come from
# https://www.kumari.net/index.php/system-adminstration/49-mounting-a-qemu-image
1. Download image http://cloud-images.ubuntu.com/
2. Install qemu, nbd-client
3. Load the module
sudo modprobe nbd max_part=8
4. Connect image
sudo qemu-nbd --connect=/dev/nbd0 Downloads/xenial-server-cloudimg-amd64-disk1.img
5. Mount image