Dynamic Report - Demo for the talk on "Improving access to geospatial Big Data in the hydrology domain" - Royal Statistical Society 18.11.2015
---
title: "RNRFA: an R package to interact with the UK National River Flow Archive"
author: "Claudia Vitolo"
date: "18 November 2015"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(cache=TRUE)
```
**Updated on 01.01.2016 to work with rnrfa version 0.4.0**

The UK National River Flow Archive (NRFA) serves daily streamflow data, spatial rainfall averages and information on elevation, geology, land cover and FEH-related catchment descriptors. An API currently under development should in future provide access to four services: a metadata catalogue, catalogue filters based on a geographical bounding box, catalogue filters based on metadata entries, and gauged daily data for about 400 stations. The first three services return JSON, while the gauged daily data are served as WaterML2, an XML-based OGC standard for describing hydrological time series. The RNRFA package aims to make access to these data simpler and more efficient by providing wrapper functions that send the HTTP requests and interpret the XML/JSON responses.
# Install dependencies
```{r, eval=TRUE, include=TRUE, echo=TRUE}
# "parallel" ships with base R, so it does not need to be installed from CRAN
install.packages( c("devtools", "ggplot2", "DT", "leaflet", "dygraphs") )
```
# Install the package
The stable version of rnrfa (the preferred option) is available from CRAN via `install.packages("rnrfa")`, while the development version is available from GitHub via devtools:
```{r, eval=TRUE, include=TRUE, echo=TRUE}
library(devtools)
install_github("cvitolo/r_rnrfa", subdir = "rnrfa")
```
# List monitoring stations
The R function that queries the NRFA catalogue is `catalogue()`. Used with no inputs, it requests the full list of gauging stations with their associated metadata. The output is a data frame with one record per station and one column per metadata entry. The arguments `metadataColumn` and `entryValue` restrict the list to the stations whose metadata match a given value, as in the example below.
```{r}
library(rnrfa)
# Retrieve information for all the stations operated by Natural Resources Wales
someStations <- catalogue(metadataColumn="operator", entryValue="Natural Resources Wales")
```
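A filter on a metadata column can be pictured as an ordinary subset of a data frame. The sketch below uses a toy table with made-up stations, and `filter_by_metadata()` is a hypothetical helper written for illustration only. It is not part of rnrfa and is not meant to describe how `catalogue()` is implemented.

```r
# Toy catalogue with invented stations (hypothetical values, not NRFA metadata)
toy_catalogue <- data.frame(
  id = c(1, 2, 3),
  name = c("Station A", "Station B", "Station C"),
  operator = c("Natural Resources Wales", "Another operator",
               "Natural Resources Wales"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose `column` equals `value`
filter_by_metadata <- function(df, column, value) {
  df[df[[column]] == value, , drop = FALSE]
}

welsh <- filter_by_metadata(toy_catalogue, "operator", "Natural Resources Wales")
nrow(welsh)  # number of matching stations in the toy table
```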
# Convert coordinates
The only geospatial information in the station catalogue is the OS grid reference (column "gridRef"). The RNRFA package converts it to more standard coordinate systems. By default, the function `OSGparse()` converts the string to easting and northing in the British or Irish National Grid coordinate system (EPSG codes 27700 and 29902 respectively). To get coordinates in latitude and longitude (WGS84 coordinate system, EPSG code 4326), set the parameter `CoordSystem = "WGS84"`.
```{r}
# Convert OS grid reference to BNG easting and northing
OSGparse("SN853872")
# Convert OS grid reference to WGS84 longitude and latitude
OSGparse("SN853872", CoordSystem = "WGS84")
```
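To make the first conversion less of a black box, here is a minimal base-R sketch of how a British National Grid reference maps to easting and northing. The function `os_grid_to_bng()` is a hypothetical helper written for this example, not the rnrfa implementation. It assumes a two-letter prefix followed by an even number of digits, and it ignores Irish grid references and the WGS84 transformation.

```r
# Hypothetical helper (not part of rnrfa): British National Grid only
os_grid_to_bng <- function(grid_ref) {
  grid_ref <- toupper(gsub(" ", "", grid_ref))
  grid_letters <- setdiff(LETTERS, "I")   # the grid alphabet skips the letter I
  l1 <- match(substr(grid_ref, 1, 1), grid_letters) - 1
  l2 <- match(substr(grid_ref, 2, 2), grid_letters) - 1
  digits <- substr(grid_ref, 3, nchar(grid_ref))
  stopifnot(!is.na(l1), !is.na(l2),
            grepl("^[0-9]*$", digits), nchar(digits) %% 2 == 0)

  # Indices of the 100 km square, counted from the false origin
  e100 <- ((l1 - 2) %% 5) * 5 + (l2 %% 5)
  n100 <- (19 - (l1 %/% 5) * 5) - (l2 %/% 5)

  # The digits split evenly into easting and northing; pad them to metres
  half <- nchar(digits) / 2
  scale <- 10^(5 - half)
  e <- if (half > 0) as.numeric(substr(digits, 1, half)) * scale else 0
  n <- if (half > 0) as.numeric(substr(digits, half + 1, 2 * half)) * scale else 0

  c(easting = e100 * 1e5 + e, northing = n100 * 1e5 + n)
}

os_grid_to_bng("SN853872")  # easting 285300, northing 287200
```

The letters select a 100 km square (SN is the square whose south-west corner is at 200000, 200000) and the digits give the offset within it, here to 100 m precision.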
# Get time series data
The first column of the table "someStations" contains the station id number. Passing an id to the function `GDF()` retrieves the streamflow time series, converting the WaterML2 file to a time series object. Retrieving 129 time series is time-consuming, so here I use the parallel package to speed up the process.
```{r}
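library(parallel)
# Number of cores available on this machine
detectCores()
# Download the time series of every station in parallel and time the operation
# (mclapply() comes from the parallel package)
system.time( s <- mclapply(someStations$id, GDF) )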
```
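`mclapply()` relies on forking, which is not available on Windows, where it only accepts `mc.cores = 1`. The sketch below shows one way to choose the number of workers explicitly. The function `slow_task()` is a made-up stand-in for a slow download and `n_cores` is a name introduced for this example, so no data are requested from the NRFA.

```r
library(parallel)

# Stand-in for a slow request (hypothetical, sleeps instead of downloading)
slow_task <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Forking is unavailable on Windows, so fall back to a single core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

res <- mclapply(1:8, slow_task, mc.cores = n_cores)
unlist(res)
```

The result is a list in the same order as the input, exactly as with `lapply()`, so the rest of the analysis does not change.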
Use the result for a simple analysis:
```{r, message=FALSE, warning=FALSE}
# Mean flow of each station's time series
someStations$meanGDF <- unlist( lapply(s, mean) )
```
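Real series often contain gaps, and a failed request in `mclapply()` comes back as a "try-error" object rather than a series. A slightly more defensive version of the step above could look like the following sketch. The values in `toy_series` are invented and `safe_mean()` is a hypothetical helper, not an rnrfa function.

```r
# Toy stand-ins for downloaded series (made-up values, not NRFA data);
# the third element mimics a station for which nothing was returned
toy_series <- list(c(1.2, 0.8, NA, 1.0), c(5, 7, 6), NULL)

# Hypothetical helper: mean that ignores gaps and tolerates failed downloads
safe_mean <- function(x) {
  if (is.null(x) || inherits(x, "try-error")) return(NA_real_)
  mean(as.numeric(x), na.rm = TRUE)
}

# vapply() always returns one number per element, so the result
# keeps the same length as the list of stations
toy_means <- vapply(toy_series, safe_mean, numeric(1))
toy_means
```

Unlike `unlist(lapply(...))`, which silently drops `NULL` elements, this keeps one value per station, so the vector can be assigned to a column without a length mismatch.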
```{r}
# Scatter plot of mean flow against catchment area, with a fitted linear model
library(ggplot2)
ggplot(someStations, aes(x = as.numeric(catchmentArea), y = meanGDF)) +
  geom_point() +
  stat_smooth(method = "lm", col = "red") +
  xlab(expression("Catchment area [" * km^2 * "]")) +
  ylab(expression("Mean flow [" * m^3 * "/s]"))
```
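With its default formula, `stat_smooth(method = "lm")` draws the line fitted by `lm(y ~ x)`. To inspect the coefficients rather than just the plot, the model can be fitted directly. The sketch below uses a synthetic data frame, `toy`, built from an exact linear relation chosen for illustration, so the numbers say nothing about real catchments.

```r
# Synthetic data (invented for illustration): flow = 0.5 + 0.02 * area
toy <- data.frame(area = c(10, 50, 100, 200, 400))
toy$meanFlow <- 0.5 + 0.02 * toy$area

# Same model that stat_smooth(method = "lm") fits by default
fit <- lm(meanFlow ~ area, data = toy)
coef(fit)  # intercept and slope
```

On the real data the equivalent call would use the columns plotted above, for example `lm(meanGDF ~ as.numeric(catchmentArea), data = someStations)`.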
# Interoperability
Display your data frame as an interactive table using DT:
```{r, cache=FALSE}
library(DT)
datatable(someStations[,c(1:4,7,9,10,12:14,17)])
```
Create interactive maps using leaflet:
```{r, cache=FALSE}
library(leaflet)
leaflet(data = someStations) %>% addTiles() %>%
addMarkers(~lon, ~lat, popup = ~as.character(paste(id,name)))
```
Generate interactive plots using dygraphs:
```{r, cache=FALSE}
library(dygraphs)
dygraph(s[[1]]) %>% dyRangeSelector()
```