Titus von der Malsburg (tmalsburg)

tmalsburg / test_simulate.R
Last active August 29, 2015 14:21
Test how simulate.merMod deals with new factor levels
library(lme4)
head(sleepstudy)
summary(sleepstudy)
# Relabel subjects:
d <- sleepstudy
d$Subject <- factor(rep(1:18, each=10))
# Fit model (the gist preview is truncated here; this is the standard
# sleepstudy random-slopes model):
fm <- lmer(Reaction ~ Days + (Days | Subject), data=d)
tmalsburg / bold_author_hack.tex
Last active August 29, 2015 14:23
CV written in org-mode
\usepackage{xpatch}% or use http://tex.stackexchange.com/a/40705
\def\makenamesetup{%
\def\bibnamedelima{~}%
\def\bibnamedelimb{ }%
\def\bibnamedelimc{ }%
\def\bibnamedelimd{ }%
\def\bibnamedelimi{ }%
\def\bibinitperiod{.}%
\def\bibinitdelim{~}%
}% closes \makenamesetup (the gist preview is truncated here)

A potential pitfall when running Ibex experiments on Amazon Mechanical Turk

Ibex assigns Latin-square lists in a way that can have serious unintended consequences in the form of spurious effects.

The problem: when you submit an experiment to Amazon Mechanical Turk, many workers jump at it immediately, but the rate of participation quickly decays (the distribution over time often resembles an exponential decay). For every participant, Ibex selects a stimulus list based on an internal counter, and this counter is incremented only when a participant submits their results. Unfortunately, this means that the initial wave of participants all work on the same list of the Latin square, and that list is therefore strongly overrepresented. This can produce strong spurious effects that are due not to the experimental manipulation but to between-item differences. This is an easy-to-miss problem, and I would not be surprised if some published results obtained with Ibex were false because of it.
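The counter mechanism described above can be sketched in a few lines. This is a minimal simulation, not Ibex's actual code; the list count and arrival numbers are made up for illustration:

```python
NUM_LISTS = 4

def assign_lists(initial_wave, later_arrivals):
    """Each participant receives list (counter % NUM_LISTS); the counter
    is incremented only on submission.  Everyone in the initial wave
    starts before anyone has submitted, so they all get the same list."""
    counter = 0
    assignments = [counter % NUM_LISTS for _ in range(initial_wave)]
    counter += initial_wave  # their submissions trickle in afterwards
    for _ in range(later_arrivals):
        assignments.append(counter % NUM_LISTS)
        counter += 1
    return assignments

counts = [assign_lists(30, 10).count(k) for k in range(NUM_LISTS)]
print(counts)  # → [32, 2, 3, 3]: list 0 absorbs the whole initial wave
```

With 30 participants in the initial burst and only 10 stragglers, one list ends up with roughly eight times as many participants as the others.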

tmalsburg / mturk_completion_times.org
Last active June 15, 2016 19:45
How to correctly calculate worker compensation for Amazon Mechanical Turk


tl;dr: When calculating the average time it takes to complete a HIT, it may be more appropriate to use the geometric mean or the median instead of the arithmetic mean. You may otherwise spend considerably more money than necessary (in our case 50% more).
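The effect of a skewed distribution on the three averages can be seen with made-up, right-skewed completion times of the kind HITs typically produce (most workers fast, a few very slow):

```python
from statistics import mean, median, geometric_mean

# Hypothetical completion times in minutes (illustrative, not real data):
times = [4, 5, 5, 6, 6, 7, 8, 10, 25, 60]

print(mean(times))                       # → 13.6, inflated by the slow tail
print(median(times))                     # → 6.5
print(round(geometric_mean(times), 1))   # → 8.9
```

Paying a per-minute rate based on the arithmetic mean here would cost about twice as much as paying based on the median, even though the typical worker finished in 6–7 minutes.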


tmalsburg / rmarkdown-render.el
Last active April 26, 2018 17:23
Elisp function for rendering RMarkdown files to PDF. Shows the output of the render process in a separate window.
(defun tmalsburg-rmarkdown-render ()
  "Compiles the current RMarkdown file to PDF and shows output of
the compiler process in a separate window."
  (interactive)
  (let* ((buf (get-buffer-create "*rmarkdown-render*"))
         (temp-window (or (get-buffer-window buf)
                          (split-window-below -10)))
         (command "Rscript -e 'library(rmarkdown); render(\"%s\", output_format=\"%s\")'")
         (command (format command (buffer-file-name) "pdf_document")))
    (set-window-buffer temp-window buf)
    ;; The gist preview is truncated here; a plausible completion runs
    ;; the command asynchronously, sending output to the buffer above:
    (async-shell-command command buf)))
tmalsburg / README.org
Last active May 12, 2018 17:23
R functions for calculating binomial credible intervals
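The gist itself provides R functions; as an illustrative sketch of the same idea in Python, here is an equal-tailed credible interval for a binomial proportion under a Jeffreys Beta(1/2, 1/2) prior, computed by Monte Carlo sampling from the Beta posterior with only the standard library. The prior, the function name, and the sampling approach are assumptions for illustration; the gist's own functions may differ:

```python
import random

def binom_credible_interval(successes, n, level=0.95, draws=100_000, seed=1):
    """Equal-tailed credible interval for a binomial proportion,
    assuming a Jeffreys Beta(1/2, 1/2) prior (illustrative sketch)."""
    rng = random.Random(seed)
    a, b = successes + 0.5, n - successes + 0.5
    # Posterior is Beta(a, b); sort Monte Carlo draws and read off
    # the empirical quantiles:
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo = samples[int((1 - level) / 2 * draws)]
    hi = samples[int((1 + level) / 2 * draws) - 1]
    return lo, hi

lo, hi = binom_credible_interval(7, 20)
print(round(lo, 2), round(hi, 2))  # roughly (0.17, 0.57)
```

An exact version would use the Beta quantile function (e.g. `qbeta` in R) instead of sampling; the Monte Carlo route is used here only to stay dependency-free.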
tmalsburg / week04.org
Created November 6, 2019 15:15
Sample org file for teaching slides

Foundations of Math

Agenda for today

tmalsburg / Instructions.md
Last active August 16, 2022 13:58
LaTeX template for articles in APA format

Compile this template by executing the following in a command shell:

  pdflatex test && biber test && pdflatex test && pdflatex test

This template uses biblatex and biber instead of traditional BibTeX. The bibliography files (*.bib) can keep the same format (although biblatex supports some interesting extensions), and the biblatex+biber combination is much more powerful (e.g., it supports multiple bibliographies in one document) and comes with excellent documentation.

Suggestions for improvements welcome.

tmalsburg / predict_vs_simulate.org
Last active November 18, 2022 01:14
Predict vs simulate in lme4

Predict vs simulate in lme4

For this investigation we are going to use the sleepstudy data set from the lme4 package. Here is the head of the data frame:

tmalsburg / lengths_of_words_brown_corpus.py
Last active December 11, 2022 11:42
Python script that uses NLTK to calculate the average length of English words across tokens and types in the Brown corpus
import nltk
from statistics import mean, stdev, median, mode
nltk.download('brown')
tokens = nltk.corpus.brown.tagged_words(tagset="universal")
types = list(dict.fromkeys(tokens))
# Lengths of tokens / types, but ignoring punctuation, numbers, and X,
# which is mostly foreign words (German, French, Latin) but strangely
# also a small number of common English words:
excluded = {".", "NUM", "X"}
token_lengths = [len(w) for w, tag in tokens if tag not in excluded]
type_lengths = [len(w) for w, tag in types if tag not in excluded]
# The gist preview is truncated above; these summary statistics are a
# plausible completion:
print(mean(token_lengths), mean(type_lengths))