Dynamic Report - Demo for the talk on "Improving access to geospatial Big Data in the hydrology domain" - Royal Statistical Society 18.11.2015
RNRFA: an R package to interact with the UK National River Flow Archive
Claudia Vitolo
18 November 2015

*** Updated on the 01.01.2016 to work with rnrfa version 0.4.0 ***

The UK National River Flow Archive (NRFA) serves daily streamflow data, spatial rainfall averages and information on elevation, geology, land cover and FEH-related catchment descriptors. An API is currently under development and should in future provide access to the following services: a metadata catalogue, catalogue filters based on a geographical bounding box, catalogue filters based on metadata entries, and gauged daily data for about 400 stations in WaterML2 format, the OGC standard used to describe hydrological time series. The first three services return JSON, while the last returns an XML variant. The RNRFA package aims to make access to these data simpler and more efficient by providing wrapper functions that send HTTP requests and interpret the XML/JSON responses.

Install dependencies

# "parallel" ships with base R, so it does not need to be installed
install.packages( c("devtools", "ggplot2", "DT", "leaflet", "dygraphs") )

Install the package

The stable version of rnrfa (the preferred option) is available from CRAN using install.packages("rnrfa"), while the development version is available on GitHub via devtools:

devtools::install_github("cvitolo/r_rnrfa", subdir = "rnrfa")

List monitoring stations

The R function that queries the NRFA catalogue to retrieve the list of monitoring stations is called catalogue(). Used with no inputs, it requests the full list of gauging stations with associated metadata. The output is a data frame containing one record per station and one column per available metadata entry.

library(rnrfa)

# Retrieve information for all the stations operated by Natural Resources Wales
someStations <- catalogue(metadataColumn="operator", entryValue="Natural Resources Wales")
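
To make the effect of the two filter arguments concrete, here is a minimal sketch that applies the same kind of filter to a small made-up data frame in base R. The station ids and names are invented for illustration, filter_by_metadata() is a hypothetical helper, and exact string matching is an assumption: the package may match entries differently.

```r
# Toy stand-in for the catalogue; ids and names are invented for illustration
toy_catalogue <- data.frame(
  id = c(1001, 1002, 1003),
  name = c("Station A", "Station B", "Station C"),
  operator = c("Natural Resources Wales", "Environment Agency",
               "Natural Resources Wales"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose metadata column equals the entry value
# (exact matching is an assumption, not necessarily what rnrfa does)
filter_by_metadata <- function(df, metadataColumn, entryValue) {
  keep <- !is.na(df[[metadataColumn]]) & df[[metadataColumn]] == entryValue
  df[keep, , drop = FALSE]
}

welsh <- filter_by_metadata(toy_catalogue, "operator", "Natural Resources Wales")
welsh$name
```

The result keeps all metadata columns, so it can be passed on to later steps in the same way as someStations.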

Convert coordinates

The only geospatial information in the station catalogue is the OS grid reference (column "gridRef"). The RNRFA package allows convenient conversion to more standard coordinate systems. By default, the function OSGparse() converts the grid reference string to easting and northing in the British or Irish National Grid coordinate system (EPSG codes 27700 and 29902, respectively). To get latitude and longitude (WGS84 coordinate system, EPSG code 4326), use the parameter CoordSystem = "WGS84".

# Convert OS grid reference to BNG easting and northing (default)
OSGparse("SN853872")
# Convert OS grid reference to WGS84 latitude and longitude
OSGparse("SN853872", CoordSystem = "WGS84")
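
For readers curious about what the default conversion involves, the following is a rough base-R sketch of how a British National Grid reference maps to easting and northing. It is not the package's implementation: osg_to_bng() is a hypothetical helper, it handles only two-letter British references (not Irish ones), and it returns the south-west corner of the referenced square.

```r
# Hypothetical helper, not part of rnrfa: British National Grid references only
osg_to_bng <- function(grid_ref) {
  grid_ref <- toupper(gsub(" ", "", grid_ref))
  square <- substr(grid_ref, 1, 2)
  digits <- substr(grid_ref, 3, nchar(grid_ref))
  stopifnot(grepl("^[A-HJ-Z]{2}$", square),
            grepl("^[0-9]*$", digits),
            nchar(digits) %% 2 == 0)

  # The grid lettering skips "I"; use zero-based positions in that alphabet
  alphabet <- setdiff(LETTERS, "I")
  l1 <- match(substr(square, 1, 1), alphabet) - 1
  l2 <- match(substr(square, 2, 2), alphabet) - 1

  # Index of the 100 km square, counted from the false origin
  e100km <- ((l1 - 2) %% 5) * 5 + (l2 %% 5)
  n100km <- (19 - (l1 %/% 5) * 5) - (l2 %/% 5)

  # Split the digits in half and scale them to metres
  half <- nchar(digits) / 2
  scale <- 10^(5 - half)
  e_part <- if (half > 0) as.numeric(substr(digits, 1, half)) else 0
  n_part <- if (half > 0) as.numeric(substr(digits, half + 1, 2 * half)) else 0

  c(easting = e100km * 1e5 + e_part * scale,
    northing = n100km * 1e5 + n_part * scale)
}

osg_to_bng("SN853872")
# easting 285300, northing 287200
```

Going from easting and northing to WGS84 latitude and longitude additionally requires a datum transformation, which is best left to the package or to a dedicated spatial library.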

Get time series data

The first column of the table someStations contains the station id number. This can be used to retrieve the streamflow time series, converting the WaterML2 file to a time series object. Retrieving 129 time series is time-consuming, so here I use a library for parallel programming to speed up the process.

library(parallel)
system.time( s <- mclapply(someStations$id, GDF) )  # from the parallel package
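
The pattern can be tried without network access. The sketch below compares lapply() and mclapply() on a stand-in function that simply sleeps; fake_fetch() and its delay are made up to mimic a slow download. Since mclapply() relies on forking, which is not available on Windows, the sketch falls back to a single core there.

```r
library(parallel)

# Stand-in for a slow download; the function name and delay are made up
fake_fetch <- function(id) {
  Sys.sleep(0.1)
  id * 2
}

ids <- 1:8

# Forking is unavailable on Windows, where mc.cores must be 1
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

t_serial <- system.time(serial <- lapply(ids, fake_fetch))
t_forked <- system.time(forked <- mclapply(ids, fake_fetch, mc.cores = n_cores))

# Both approaches return the same list, in the same order
identical(serial, forked)
```

With more than one core the elapsed time in t_forked should be noticeably shorter than in t_serial, although the exact speed-up depends on the machine.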

Use the result for a simple analysis

library(ggplot2)

# Mean flow of each gauged daily flow series
someStations$meanGDF <- unlist( lapply(s, mean) )
# Linear model
ggplot(someStations, aes(x = as.numeric(catchmentArea), y = meanGDF)) +
  geom_point() +
  stat_smooth(method = "lm", col = "red") +
  xlab(expression(paste("Catchment area [", km^2, "]"))) +
  ylab(expression(paste("Mean flow [", m^3, "/s]")))
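
The red line drawn by stat_smooth(method = "lm") is an ordinary least-squares fit of y on x. To inspect the fitted coefficients rather than only the plot, the same model can be fitted directly with lm(). The sketch below uses a small made-up data frame, so the numbers are for illustration only and say nothing about the real stations.

```r
# Made-up catchment areas (km^2) and mean flows (m^3/s), for illustration only
toy <- data.frame(
  catchmentArea = c(10, 50, 100, 200, 400),
  meanGDF = c(0.3, 1.4, 3.1, 5.8, 12.2)
)

# Same model form that stat_smooth(method = "lm") fits by default: y ~ x
fit <- lm(meanGDF ~ catchmentArea, data = toy)

coef(fit)                  # intercept and slope
summary(fit)$r.squared     # proportion of variance explained
```

With the real data, replacing toy with someStations (and converting catchmentArea with as.numeric() as above) should give the coefficients of the line shown in the plot.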


Upgrade your data.frame to a data.table:


Create interactive maps using leaflet:


library(leaflet)
leaflet(data = someStations) %>% addTiles() %>%
  addMarkers(~lon, ~lat, popup = ~paste(id, name))

Generate interactive plots using dygraphs:

library(dygraphs)
dygraph(s[[1]]) %>% dyRangeSelector()