@mbjones
Last active May 10, 2019 01:12
Plotting FAIR metrics
---
title: "FAIR Metrics"
author: "Matt Jones"
date: "5/8/2019"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)
```
## Generate a simulated data set
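The simulated data set has one row per object version: an update date, the identifying fields (`version`, `object`, `scope`, `pid`), an integer score for each FAIR facet (`f`, `a`, `i`, `r`), and an overall `score` that averages the four facets: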
```{r load_data}
# Fix the random seed so the simulated scores are reproducible (the value is arbitrary)
set.seed(42)

# 30 update dates: three overlapping monthly series, pooled and sorted
updates <- data.frame(v1 = seq(as.Date("2000/1/1"), by = "month", length.out = 10),
                      v2 = seq(as.Date("2000/5/1"), by = "month", length.out = 10),
                      v3 = seq(as.Date("2000/8/1"), by = "month", length.out = 10)) %>%
  gather(key = version, value = update_date) %>%
  arrange(update_date) %>%
  select(update_date)

# 30 object versions (3 versions x 5 objects x 2 scopes) with random facet scores
fair <- expand.grid(version = seq(1, 3, 1), object = seq(1, 5, 1), scope = c("adc", "knb")) %>%
  mutate(pid = paste(scope, object, version, sep = ".")) %>%
  mutate(f = sample(90:100, 30, replace = TRUE)) %>%
  mutate(a = sample(70:100, 30, replace = TRUE)) %>%
  mutate(i = sample(40:70, 30, replace = TRUE)) %>%
  mutate(r = sample(30:50, 30, replace = TRUE)) %>%
  mutate(score = as.integer((f + a + i + r) / 4))

# Pair each object version with an update date by row position
scores <- bind_cols(updates, fair)
head(scores)
```
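Because `bind_cols()` pairs rows purely by position, it can help to see how `expand.grid()` orders its rows. The sketch below uses a hypothetical, smaller grid (`toy_grid` is an illustrative name, not part of the analysis) to show that the first argument varies fastest, so all `adc` rows come before any `knb` row.

```{r grid_order_sketch}
# Hypothetical 2 x 2 x 2 grid, for illustration only
toy_grid <- expand.grid(version = 1:2, object = 1:2, scope = c("adc", "knb"))

# Build identifiers the same way as above: scope.object.version
toy_grid$pid <- paste(toy_grid$scope, toy_grid$object, toy_grid$version, sep = ".")

# version changes on every row, object every 2 rows, scope every 4 rows
toy_grid$pid
```

The same ordering applies to the 30-row grid above, which is why the earliest update dates end up attached to the `adc` objects.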
## Calculate stats
Group the scores by month and, within each month, keep only the most recent
version of each object. Then average each FAIR facet across those rows to get
one mean per facet per month. Finally, reshape the result from wide to long
format, with one row per month and facet, which is the form `ggplot2` expects.
```{r calculate_means}
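# Within each month and object, keep the row(s) with the highest version
most_recent <- scores %>%
  arrange(update_date, object, version) %>%
  group_by(update_date, object) %>%
  top_n(1, version)

# Average each facet by month, then reshape to long format (one row per month and facet)
score_means <- most_recent %>%
  group_by(update_date) %>%
  summarise(f = mean(f), a = mean(a), i = mean(i), r = mean(r)) %>%
  gather(metric, mean, -update_date)

# Order the facets F, A, I, R and give them readable labels for the legend
score_means$metric <- factor(score_means$metric,
                             levels = c("f", "a", "i", "r"),
                             labels = c("Findable", "Accessible", "Interoperable", "Reusable"))
head(score_means)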
```
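The filtering step can be checked by hand on a tiny example. This is a minimal sketch with hypothetical data (`toy`, `toy_recent`, and `toy_means` are illustrative names): object 1 has two versions in the same month, so only version 2 should contribute to the monthly mean.

```{r most_recent_sketch}
library(dplyr)
library(tidyr)

# Hypothetical toy scores: object 1 has versions 1 and 2 in the same month
toy <- data.frame(
  update_date = as.Date(c("2000-01-01", "2000-01-01", "2000-01-01")),
  object      = c(1, 1, 2),
  version     = c(1, 2, 1),
  f           = c(90, 100, 80)
)

# Keep the highest version per month and object: drops object 1, version 1
toy_recent <- toy %>%
  group_by(update_date, object) %>%
  top_n(1, version) %>%
  ungroup()

# Monthly mean of the remaining rows, reshaped to long format
toy_means <- toy_recent %>%
  group_by(update_date) %>%
  summarise(f = mean(f)) %>%
  gather(metric, mean, -update_date)

toy_means  # mean is (100 + 80) / 2 = 90
```

Without the `top_n()` step the superseded version would be included and the mean would be 90 only by coincidence of the chosen numbers; changing the first `f` value makes the difference visible.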
## Plot it!
```{r plot}
# One color per facet, in the same order as the factor levels (F, A, I, R)
d1_colors <- c("#ff582d", "#c70a61", "#1a6379", "#60c5e4")

# Monthly mean of each facet over time, on a fixed 0-100 score axis
ggplot(data = score_means, mapping = aes(x = update_date, y = mean, color = metric)) +
  geom_line() +
  geom_point(size = 1) +
  theme_bw() +
  scale_colour_manual(values = d1_colors) +
  scale_x_date(date_breaks = "3 months", date_minor_breaks = "1 month", labels = date_format("%Y %b")) +
  scale_y_continuous(limits = c(0, 100))
```