@opavlov24
Last active September 15, 2016 12:00
Downloading articles from PubMed
#!/usr/bin/env groovy
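// Entrez query: the cancer subset, publication dates 2016/09/01 through 2016/09/02.
def term = '("2016/09/01"[PDAT]:"2016/09/02"[PDAT]) AND cancer[sb]'
def baseUrl = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils'
// URL-encode the term so quotes, brackets and spaces survive the request.
def searchUrl = "${baseUrl}/esearch.fcgi?db=pubmed&term=${URLEncoder.encode(term, 'UTF-8')}&usehistory=y"

def parser = new XmlSlurper()
// E-utilities responses carry a DOCTYPE: allow it, but do not download the external DTD.
parser.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false)
parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
def eSearchResult = parser.parse(new URL(searchUrl).openStream())

// usehistory=y keeps the result set on the history server; WebEnv and QueryKey identify it.
def webEnv = eSearchResult.WebEnv.toString()
int queryKey = eSearchResult.QueryKey.toInteger()
int amountOfArticles = eSearchResult.Count.toInteger()

def retMax = 1000
def xmlFile = new File("articles.xml")
// Note: each batch is appended as its own XML document, so articles.xml ends up as a
// concatenation of documents rather than a single well-formed XML file.
// The exclusive range avoids an extra, empty request when the count is a multiple of retMax.
(0..<amountOfArticles).step(retMax) { retstart ->
    URL url = new URL("${baseUrl}/efetch.fcgi?db=pubmed&query_key=${queryKey}&WebEnv=${webEnv}&retstart=${retstart}&retmax=${retMax}&retmode=xml")
    println url
    // withInputStream closes the response stream once the batch has been appended.
    url.withInputStream { xmlFile << it }
}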