import nltk
from statistics import mean, stdev, median, mode

nltk.download('brown')
tokens = nltk.corpus.brown.tagged_words(tagset="universal")
types = list(dict.fromkeys(tokens))
# Lengths of tokens / types but ignoring punctuation, numbers, and X,
# which is mostly foreign words (German, French, Latin) but strangely
# also a small number of common English words:
<?xml version="1.0"?>
<opml version="1.0">
  <head>
    <title>(Psycho)linguistics journals</title>
  </head>
  <body>
    <outline title="Annual Rev Ling" xmlUrl="https://www.annualreviews.org/action/showFeed?ui=45mu4&amp;mi=3fndc3&amp;ai=6690&amp;jc=linguistics&amp;type=etoc&amp;feed=atom"/>
    <outline title="Brain &amp; Language" xmlUrl="https://rss.sciencedirect.com/publication/science/0093934X"/>
    <outline title="Cognition" xmlUrl="http://rss.sciencedirect.com/publication/science/00100277"/>
    <outline title="Cognitive Science" xmlUrl="https://onlinelibrary.wiley.com/feed/15516709/most-recent"/>
This little project now lives in a GitHub repository: https://github.com/tmalsburg/binomialCRIs
Compile this template by executing the following in a command shell:
pdflatex test && biber test && pdflatex test && pdflatex test
This template uses biblatex and biber instead of plain old BibTeX. The bibliography files (*.bib) have the same format (although biblatex supports some interesting extensions). The biblatex+biber combination is much more powerful than traditional BibTeX (e.g., it supports multiple bibliographies in one document) and comes with excellent documentation.
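For orientation, a minimal biblatex preamble might look like this (the style option, file name `references.bib`, and citation key are placeholders, not taken from the template itself):

```latex
\documentclass{article}
% Assumed minimal setup; adjust backend/style options to taste.
\usepackage[backend=biber,style=authoryear]{biblatex}
\addbibresource{references.bib}  % full filename, unlike BibTeX's \bibliography
\begin{document}
Some claim \parencite{knuth1984}.
\printbibliography
\end{document}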
Suggestions for improvements welcome.
(defun tmalsburg-rmarkdown-render ()
  "Compile the current RMarkdown file to PDF and show the output
of the compiler process in a separate window."
  (interactive)
  (let* ((buf (get-buffer-create "*rmarkdown-render*"))
         (temp-window (or (get-buffer-window buf)
                          (split-window-below -10)))
         (command "Rscript -e 'library(rmarkdown); render(\"%s\", output_format=\"%s\")'")
         (command (format command (buffer-file-name) "pdf_document")))
    (set-window-buffer temp-window buf)
    ;; Run the compiler asynchronously, streaming its output to the buffer.
    ;; (The snippet was cut off here; start-process-shell-command is one
    ;; plausible way to complete it.)
    (start-process-shell-command "rmarkdown-render" buf command)))
How to correctly calculate worker compensation for Amazon Mechanical Turk
tl;dr: When calculating the average time it takes to complete a HIT, it may be more appropriate to use the geometric mean or the median instead of the arithmetic mean. You may otherwise spend considerably more money than necessary (in our case 50% more).
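The point is easy to see with hypothetical completion times (the numbers below are made up for illustration): HIT completion times are typically right-skewed, so a few very slow workers pull the arithmetic mean well above what a typical worker needs.

```python
from statistics import mean, median, geometric_mean

# Hypothetical completion times in minutes for a HIT; right-skewed,
# as such distributions usually are (note the one very slow worker).
times = [4, 5, 5, 6, 6, 7, 8, 10, 15, 40]

print(mean(times))            # inflated by the slow tail
print(geometric_mean(times))  # less sensitive to outliers
print(median(times))          # what a typical worker actually needs
```

Basing pay on the arithmetic mean of such a sample compensates for the slow tail as if it were typical; the geometric mean or median tracks the bulk of the distribution instead.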
A potential pitfall when running Ibex experiments on Amazon Mechanical Turk
Ibex does Latin squares in a way that can potentially have serious unintended consequences in the form of spurious effects.
The problem: when you submit an experiment to Amazon Mechanical Turk, many workers will immediately jump at it, but the rate of participation quickly decays (the distribution over time often resembles an exponential decay). For every participant, Ibex selects a stimulus list based on an internal counter, and this counter is incremented only when a participant submits their results. Unfortunately, this means that the initial wave of participants all work on the same list of the Latin square, and that list is therefore strongly overrepresented. This can produce strong spurious effects that are due not to the experimental manipulation but to between-item differences. This is an easy-to-miss problem, and I would not be surprised if some published results obtained with Ibex were false because of it.
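A toy model makes the mechanism concrete (this is an illustrative sketch of the counter logic described above, not Ibex's actual code; the numbers of workers and lists are arbitrary):

```python
# Toy model: list assignment via a counter that is incremented only
# when results are *submitted*, as described above.
from collections import Counter

NUM_LISTS = 4
counter = 0
assignments = []

# Suppose 10 workers accept the HIT before anyone has finished:
# they all read the same counter value and get the same list.
for _ in range(10):
    assignments.append(counter % NUM_LISTS)  # everyone sees counter == 0

# Submissions then trickle in and bump the counter, so later,
# slower-arriving participants get rotated lists as intended.
for _ in range(10):
    counter += 1
    assignments.append(counter % NUM_LISTS)

print(Counter(assignments))  # list 0 is heavily overrepresented
```

Assigning lists by randomization, or by a counter incremented at accept time rather than at submission time, avoids piling the initial wave onto a single list.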
Predict vs simulate in lme4
For this investigation we are going to use the sleepstudy data set from the lme4 package. Here is the head of the data frame: