Spearman correlation: Wikidata QRank and Wikidata PageRank (danker)
#!/usr/bin/env bash
export LC_ALL=C
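# Download the QRank file, drop its header line, turn the first comma into a tab, and sort by item ID.
if [ ! -f "qrank_sorted.tsv" ]; then
  wget -O - | \
    gunzip -c | \
    tail -n+2 | \
    sed "s/,/\t/" | \
    sort -k1,1 \
    > qrank_sorted.tsv
fi
# Download the PageRank file and sort it by item ID.
if [ ! -f "pr_202111_sorted.tsv" ]; then
  wget -O - | \
    bunzip2 -c | \
    sort -k1,1 \
    > pr_202111_sorted.tsv
fi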
join qrank_sorted.tsv pr_202111_sorted.tsv > qrank_pr_joined.tsv
wc -l qrank_sorted.tsv pr_202111_sorted.tsv qrank_pr_joined.tsv
Rscript <(printf "qpr <- read.table(file = 'qrank_pr_joined.tsv', sep = ' ')\ncor(qpr[2],qpr[3], method='spearman')")
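
The `join` step keeps only the IDs that occur in both sorted files, so the third count printed by `wc -l` is the number of items that actually enter the correlation. A minimal sketch of that behaviour, assuming made-up IDs and scores (these are not real QRank or PageRank values):

```shell
#!/usr/bin/env bash
# Toy illustration of the sort + join step; all IDs and scores are invented.
export LC_ALL=C                     # same byte-wise collation for sort and join
tmp=$(mktemp -d)

# Two small tab-separated files, each sorted on the first column.
printf 'Q2\t30\nQ1\t50\nQ3\t10\n'   | sort -k1,1 > "$tmp/a.tsv"
printf 'Q3\t0.5\nQ1\t2.0\nQ4\t9.9\n' | sort -k1,1 > "$tmp/b.tsv"

# join matches on the first field and writes space-separated output;
# Q2 and Q4 are dropped because each occurs in only one file.
joined=$(join "$tmp/a.tsv" "$tmp/b.tsv")
echo "$joined"

rm -r "$tmp"
```

Because `join` separates its output fields with spaces by default, the R call in the script reads the joined file with `sep = ' '` even though the file name ends in `.tsv`.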
The download URL is redirected to another URL, which has been broken for at least a week (error 502).
Do you know of any other URL providing the same data?

Good question, thanks @madrisan. Maybe @brawer can help here.

brawer commented Aug 29, 2022

Sorry about this! See brawer/wikidata-qrank#8