Skip to content

Instantly share code, notes, and snippets.

@johnbaums
Last active December 25, 2015 09:59
Show Gist options
  • Save johnbaums/6958408 to your computer and use it in GitHub Desktop.
Save johnbaums/6958408 to your computer and use it in GitHub Desktop.
Improved tpl_get, based on tpl_get in the taxize package (http://cran.r-project.org/web/packages/taxize/index.html). Downloads plant species records (with accepted names) indexed by theplantlist.org.
tpl_get2 <- function (dir_, family = NULL)
{
require(RCurl)
require(XML)
require(plyr)
temp <-getURL('http://www.theplantlist.org/browse/-/')
temp <- htmlParse(temp)
families <- xpathSApply(temp, "//ul[@id='nametree']//a", xmlValue)
csvlinks <- sprintf('http://www.theplantlist.org%s%s.csv',
xpathSApply(temp, "//ul[@id='nametree']//a", xmlGetAttr, 'href'),
families)
if (all(!family %in% families)) {
stop(paste('Requested families not found on TPL.',
'Use tpl_families() to list plant families indexed by TPL.'),
call.=FALSE)
}
if (any(!family %in% families)) {
warning(sprintf('Requested families not found on TPL: %s.\n%s',
paste(family[!family %in% families], collapse=', '),
'Use tpl_families() to list plant families indexed by TPL.'),
call.=FALSE)
}
if (!is.null(family)) {
csvlinks <- csvlinks[families %in% family]
families <- families[families %in% family]
}
getcsv <- function(x) {
download.file(x, destfile=file.path(dir_, basename(x)), quiet=TRUE)
}
message("Downloading csv files to ", dir_, "...")
dir.create(dir_)
l_ply(csvlinks, getcsv, .progress = "text")
message("...el fin")
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment