@vanhumbeecka
Last active August 29, 2015 14:15
R packages on CRAN
## Plot the number of R packages on CRAN per year, using the publication date listed for each package.
## Each package is counted once, so the chart is only a rough indication of how R's popularity has grown over the years.
library(XML)
library(data.table)
library(stringr)
library(ggplot2)
library(plyr)
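## readHTMLTable() cannot fetch HTTPS URLs directly, so read the page with readLines() and parse the text.
theurl <- "https://cran.r-project.org/web/packages/available_packages_by_date.html"
page <- paste(readLines(theurl, warn = FALSE), collapse = "\n")
tables <- readHTMLTable(htmlParse(page, asText = TRUE))
## The first table on the page holds the Date, Package and Title columns.
dt <- data.table(tables[[1]])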
## Trim stray whitespace from all column names in one call.
setnames(dt, str_trim(names(dt)))
dt[,Date:=as.Date(Date)]
dt[,Package:=as.character(Package)]
dt[,Title:=as.character(Title)]
dt[, Year := year(Date)]
## Count packages per year; plyr's count() returns the columns "x" and "freq".
year.vector <- dt$Year
frequency.dt <- as.data.table(count(year.vector))
setnames(frequency.dt, c("x", "freq"), c("year", "count"))
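## Bar chart of the package count per year, with each bar labelled by its count.
p <- ggplot(frequency.dt, aes(x = year, y = count))
p + geom_bar(stat = "identity") +
  geom_text(aes(label = count), vjust = -0.5) +
  ggtitle(paste("Number of R packages on CRAN as of", Sys.Date(), "\n")) +
  ## One axis break per distinct year, rather than one per package row.
  scale_x_continuous(breaks = frequency.dt$year)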
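
## A minimal offline sketch of the counting step above, assuming only the data.table package.
## It shows one way to get the per-year counts without plyr, using data.table's .N.
## `sample.dt` and `sample.freq` are hypothetical names, and the three rows are
## made-up illustration data, not real CRAN output.
library(data.table)
sample.dt <- data.table(
  Package = c("pkgA", "pkgB", "pkgC"),
  Date = as.Date(c("2013-05-01", "2014-02-10", "2014-11-30"))
)
## .N is the number of rows in each group; keyby sorts the result by year.
sample.freq <- sample.dt[, .(count = .N), keyby = .(year = year(Date))]
print(sample.freq)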