@dwinter
Created July 1, 2014 18:51
Open tree proposal

# An rOpenSci library for the Open Tree of Life API.

rOpenSci is a project that allows programmatic access to data repositories in the popular R programming language. rOpenSci already provides libraries to query the phylogeny databases TreeBASE and Phylomatic, as well as data resources provided by NCBI and Dryad. A library wrapping the Open Tree of Life would be an excellent addition to the rOpenSci project and would hopefully increase the availability of the Open Tree of Life data.

I imagine the first step in creating such a library would be to faithfully map the Open Tree API. By providing well-documented functions backed by a thorough test suite, we would offer reliable programmatic access to the Open Tree of Life at a relatively 'low level'. These low-level functions could then be used by developers of other libraries, or within the proposed library, to automate common use-cases involving phylogenetic data. In time, the functions provided by the proposed library and existing rOpenSci libraries could be united around a single 'umbrella' interface for phylogenetic data, similar to the taxize library, which allows the retrieval of taxonomic information from many data sources.
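As a rough illustration of that layering, here is a minimal sketch, written in Python only for brevity (the proposed library itself would be in R). The base URL, the `tnrs/match_names` method path, and the function names `build_request`, `call_api`, and `match_names` are all assumptions made for the example, not part of this proposal or a confirmed description of the Open Tree API.

```python
import json
from urllib import request

# Assumption: the base URL and method path used below are illustrative only
# and are not taken from the proposal.
BASE_URL = "https://api.opentreeoflife.org/v3"


def build_request(method_path, payload):
    """Low level: turn one API method plus its arguments into one HTTP request."""
    url = f"{BASE_URL}/{method_path}"
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(method_path, payload, opener=request.urlopen):
    """Low level: send the request and decode the JSON response."""
    with opener(build_request(method_path, payload)) as response:
        return json.loads(response.read().decode("utf-8"))


def match_names(names, opener=request.urlopen):
    """Higher level (hypothetical): a convenience function for one common use-case."""
    return call_api("tnrs/match_names", {"names": list(names)}, opener=opener)
```

The `opener` argument is there so a test suite can substitute a fake transport and exercise the wrappers without touching the network, which is one way the 'thorough test suite' could stay fast and reliable.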

rOpenSci is a collaborative project, and we welcome contributions in the form of code, documentation, tests or use-cases from anyone. In particular, we would like to combine efforts with anyone planning or providing R functions to access the Open Tree data, and work closely with projects like Arbor that already include R as part of their workflow. I welcome suggestions, comments, or offers of collaboration in this thread!

@sckott

sckott commented Jul 2, 2014

hey @dwinter Looks great!

  • s/sould/should ?
  • there is an R library https://github.com/nicolewhite/RNeo4j - I think they use neo4j as their DB? don't know if we'd need to roll our own wrappers to neo4j or not
  • I think we likely would want a single package for opentree, then a taxize-like package to combine trees from different sources (some from individual packages, and some from small sources like Phylomatic (move the phylomatic_tree() function out of taxize))

I (and I'm sure Carl too) would be very interested in helping out on the package.

@karthik

karthik commented Jul 8, 2014

rOpenSci is a project that allows programatic access to data repostories

rOpenSci is a project that provides programmatic access to various scientific data repositories through the widely used R programming language.

increase the availability

increase uptake and promote greater use of ...

Would these really be characterized as low level if you are directly mapping the API methods? The API itself accomplishes that task by providing language-agnostic pipelines to the data. The appeal here would be that you can provide a much more user-friendly R interface.

Other than that and a few typos, this looks great!
