@andrie
Last active April 16, 2021 01:50
Analyze R packages for popularity, using pagerank algorithm
## Analyze R packages for popularity, using pagerank algorithm
# Inspired by Antonio Piccolboni, http://piccolboni.info/2012/05/essential-r-packages.html
library(miniCRAN)
library(igraph)
library(magrittr)
# Download matrix of available packages at specific date ------------------
MRAN <- "http://mran.revolutionanalytics.com/snapshot/2014-11-01/"
pdb <- MRAN %>%
  contrib.url(type = "source") %>%
  available.packages(type = "source", filters = NULL)
# Use miniCRAN to build a graph of package dependencies -------------------
# Note that this step takes a while, expect ~15-30 seconds
g <- pdb[, "Package"] %>%
  makeDepGraph(availPkgs = pdb, suggests = FALSE, enhances = TRUE,
               includeBasePkgs = FALSE)
# Use the PageRank algorithm in igraph -------------------------------------
# page_rank() is the current name; page.rank() is deprecated in recent igraph
pr <- g %>%
  page_rank(directed = FALSE) %>%
  use_series("vector") %>%
  sort(decreasing = TRUE) %>%
  as.matrix %>%
  set_colnames("page.rank")
# Display results ---------------------------------------------------------
head(pr, 25)
# Build dependency graph of top packages -----------------------------------
set.seed(42)
pr %>%
  head(25) %>%
  rownames %>%
  makeDepGraph(pdb) %>%
  plot(main = "Top packages by page rank", cex = 0.5)
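
To see what the ranking above measures, it can help to run PageRank on a graph small enough to reason about by hand. The sketch below is not part of the original analysis. It builds a hypothetical toy graph in which three made-up packages (`a`, `b`, `c`) each depend on a fourth (`core`), and it treats edges as directed, unlike the script above, which passes `directed = FALSE`. Assuming igraph 1.0 or later, the depended-upon package should come out on top.

```r
library(igraph)

# Hypothetical toy graph: "a", "b" and "c" each point to "core".
# In graph_from_literal(), `x -+ y` is a directed edge from x to y.
toy <- graph_from_literal(a -+ core, b -+ core, c -+ core)

# page_rank() returns a list; the scores are in its `vector` element.
toy_pr <- page_rank(toy, directed = TRUE)$vector

# Highest-ranked node first; "core" receives every link in this graph.
sort(toy_pr, decreasing = TRUE)
```

The scores sum to one, so each value can be read as that node's share of the total rank.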