Skip to content

Instantly share code, notes, and snippets.

@giannou
Last active August 1, 2023 12:48
Show Gist options
  • Save giannou/0de533a3fd68f2f718214655004e99be to your computer and use it in GitHub Desktop.
Save giannou/0de533a3fd68f2f718214655004e99be to your computer and use it in GitHub Desktop.
NAVIGATE vetting (global)
---
title: "NAVIGATE Vetting"
author: "Anastasis Giannousakis (E3Modelling)"
output:
pdf_document: default
html_document: default
---
The R code (R Markdown document) that computes these checks can be found here: https://gist.github.com/giannou/0de533a3fd68f2f718214655004e99be.
The code reads a file called `dataset.rds` containing the data for the region "World" of the NAVIGATE vetting dataset as a long-format data-frame. The file `dataset.rds` has to be present in the same folder as the R code. For the summation check, a table (`summation_groups.csv`) needs to be present in the folder. The summation table has the form: column A contains the "parent" variables, and column B the "children" (i.e. variables whose sum must be the value of the "parent").
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r package preliminaries, include=FALSE}
# add additional R package repository (in case it is needed)
if (!any(grepl("https://rse.pik-potsdam.de/r/packages", getOption("repos"))))
options(repos = c(getOption("repos"), pik = "https://rse.pik-potsdam.de/r/packages"))
```
```{r check installed, include=FALSE}
#define packages to install
packages <- c("dplyr", "quitte", "tidyr")
#install all packages that are not already installed
install.packages(setdiff(packages, rownames(installed.packages())))
```
```{r load packages, include=FALSE}
# load packages
lapply(packages, require, character.only = TRUE)
```
```{r load data, include=FALSE}
# load data
if (file.exists("dataset.rds")) {
x <- readRDS("dataset.rds")
x$Region <- "GLO"
} else {
error("Error: file dataset.rds not found, it has to be in the same folder as this script")
}
```
# Check aggregate emissions
Aggregate global CO2 emissions from 2020 to the time of net-zero. These should be lower than 650 Gt in 1p5C scenarios and lower than 1150 in the 2C family of scenarios (30 / 50 Gt tolerance is used for 1.5 and 2 C scenarios, respectively). These models and scenarios are off:
```{r filter data, include=FALSE}
# filter data (keep only total CO2 emissions, and time period after 2020)
tmp <- filter(x, `Variable` == "Emissions|CO2", `Year` >= 2015) %>%
as.quitte() %>% # convert to "quitte" dataframe, to use built-in interpolation function (data has 5-year resolution) #nolint
interpolate_missing_periods(expand.values = TRUE, period = seq(2015, max(x[["Year"]]), 1))
```
```{r, include=FALSE}
tmp2 <- filter(tmp, `period` >= 2020) %>% mutate(.by = c("model", "scenario"), cum_sum = cumsum(value))
tmp2[["cum_sum"]] <- tmp2[["cum_sum"]] / 1000
```
```{r find year of net-zero, echo=FALSE}
tmp3 <- mutate(tmp, .by = c("scenario", "model"), minemi = min(abs(value)))
tmp4 <- tmp3[abs(tmp3$value) - tmp3$minemi < 0.01, ]
for (i in levels(x$Model)) {
for (j in levels(x$Scenario)) {
if (!is.na(as.numeric(tmp[tmp$model == i & tmp$scenario == j & tmp$period == 2020, "value"]))) {
if (grepl("1p5C", j)) {
nzp <- as.numeric(tmp4[tmp4$model == i & tmp4$scenario == j, "period"])
tae <- as.numeric(tmp2[tmp2$model == i & tmp2$scenario == j & tmp2$period == nzp, "cum_sum"])
if (tae > 680) print(paste("not OK: ", i, j, round(tae, 2)))
}
if (grepl("_2C", j)) {
nzp <- as.numeric(tmp4[tmp4$model == i & tmp4$scenario == j, "period"])
tae <- as.numeric(tmp2[tmp2$model == i & tmp2$scenario == j & tmp2$period == nzp, "cum_sum"])
if (tae > 1200) print(paste("not OK: ", i, j, round(tae, 2)))
}
}
}
}
```
# Check short term behavior
2025 Emissions|CO2|Energy and Industrial Processes should be no lower than 0.95 * 2019 levels. 2019 level from IEA is 36200 Mt. These models and scenarios are off::
```{r echo=FALSE, paged.print=TRUE}
# filter data (keep only "Emissions|CO2|Energy and Industrial Processes", and time period 2025)
tmp <- filter(x, `Variable` == "Emissions|CO2|Energy and Industrial Processes", `Year` == 2025)
# 2025 Emissions|CO2|Energy and Industrial Processes should be no lower than 0.95 * 2019 levels… 2019 level from IEA is 36.2 Gt
print(select(tmp[tmp$value < 0.95 * 36.2 * 1000, ], -c("Region")), n = 1e3)
```
# Check primary energy
Primary energy from fossils should be < 20% / 10% of primary energy supply (excl. non energy use of fossils) in 2100 in 2C and 1p5C scenarios, respectively. These 1p5C scenarios are off:
```{r echo=FALSE, paged.print=TRUE}
# filter data (keep only "Primary Energy"&"Primary Energy|Fossil", and time period 2100)
tmp <- filter(x, `Variable` %in% c("Primary Energy", "Primary Energy|Fossil"), `Year` == 2100, `Scenario` %in% grep("1p5C", levels(x$Scenario), v = T))
print(select(tmp[tmp$Variable=="Primary Energy",][c(tmp[tmp$Variable=="Primary Energy|Fossil", "value"] > 0.1 * tmp[tmp$Variable=="Primary Energy", "value"]),],-c("Region")),n=1e3)
```
These 2C scenarios are off:
```{r echo=FALSE, paged.print=TRUE}
# filter data (keep only "Primary Energy"&"Primary Energy|Fossil", and time period 2100)
tmp <- filter(x, `Variable` %in% c("Primary Energy", "Primary Energy|Fossil"), `Year` == 2100, `Scenario` %in% grep("_2C", levels(x$Scenario), v = T))
print(select(tmp[tmp$Variable == "Primary Energy", ][c(tmp[tmp$Variable=="Primary Energy|Fossil", "value"] > 0.2 * tmp[tmp$Variable == "Primary Energy", "value"]), ], -c("Region")), n=1e3)
```
# Check model behavior
CO2 emissions are lower in 1p5C&2C scenarios than NPi:
```{r echo=FALSE}
# filter data (keep only "Emissions|CO2")
tmp <- filter(x, `Variable` == "Emissions|CO2")
tmp5<-tidyr::pivot_wider(tmp, names_from = "Scenario")
for (i in grep("NPi", levels(x$Scenario), v = T, invert = T)) tmp5[[i]] <- tmp5[[i]] < tmp5$SUP_NPi_Default
tmp5 <- tidyr::pivot_longer(tmp5, values_to = "check", cols = c(6:10, 12:30))
names(tmp5) <- sub("name", "Scenario", names(tmp5))
tmp5 <- select(tmp5[tmp5$Year>2020, ], -c("Unit", "Variable", "Region"))
if (any(!tmp5$check, na.rm = T)) {
print(tmp5[which(!tmp5$check), ], n = 1e3)
} else {
print("All models and scenarios pass the test")
}
```
CO2 emissions are lower in 1p5C scenarios than in 2C:
```{r echo=FALSE}
# filter data (keep only "Emissions|CO2")
tmp <- filter(x, `Variable` == "Emissions|CO2")
tmp5<-tidyr::pivot_wider(tmp, names_from = "Scenario")
for (i in grep("1p5C", levels(x$Scenario), v = T)) tmp5[[i]] <- tmp5[[i]] < tmp5[[paste0("SUP_2C_", strsplit(i, split = "_1p5C_")[[1]][2])]]
tmp5 <- tidyr::pivot_longer(tmp5, values_to = "check", cols = grep("1p5C", names(tmp5)))
tmp5 <- select(tmp5, -grep("SUP", names(tmp5)))
names(tmp5) <- sub("name", "Scenario", names(tmp5))
tmp5 <- select(tmp5[tmp5$Year>2020,], -c("Unit", "Variable", "Region"))
if (any(!tmp5$check, na.rm = T)) {
print(tmp5[which(!tmp5$check), ], n = 1e3)
} else {
print("All models and scenarios pass the test")
}
```
Final energy is lower in 2C&1p5C than in NPi
```{r echo=FALSE}
# filter data (keep only "Final Energy")
tmp <- filter(x, `Variable` == "Final Energy")
tmp5 <- tidyr::pivot_wider(tmp, names_from = "Scenario")
for (i in grep("NPi", levels(x$Scenario), v = T, invert = T)) tmp5[[i]] <- tmp5[[i]] < tmp5$SUP_NPi_Default
tmp5 <- tidyr::pivot_longer(select(tmp5, -c("SUP_NPi_Default")), values_to = "check", cols = c(6:29))
names(tmp5) <- sub("name", "Scenario", names(tmp5))
tmp5 <- select(tmp5[tmp5$Year>2020,], -c("Unit", "Variable", "Region"))
if (any(!tmp5$check, na.rm = T)) {
print(tmp5[which(!tmp5$check), ], n = 1e3)
} else {
print("All models and scenarios pass the test")
}
```
# Check summation groups
```{r echo=FALSE}
checkSum <- function(name,x) {
tmp <- x %>%
group_by(Model, Scenario, Region, Year) %>%
summarise(grsum = sum(value), .groups = "drop") %>%
ungroup()
tmp$Variable <- name
return(tmp)
}
tmp <- read.csv("summation_groups.csv", stringsAsFactors = FALSE)
xt <- filter(x, Variable %in% unique(c(tmp$child, tmp$parent)))
check_variables <- list()
for (i in unique(tmp[, "parent"])) check_variables[[i]] <- tmp[which(tmp[,"parent"] ==i), "child"]
tmp <- tmp %>% mutate(Variable = child)
xt <- left_join(xt, tmp, by = "Variable", relationship = "many-to-many")
tmp2 <- NULL
for (i in names(check_variables)) tmp2 <- rbind(tmp2, checkSum(i, filter(xt, parent == i, Variable %in% check_variables[[i]])))
#tmp2$Variable <- gsub(" [1-9]$", "", tmp2$Variable)
tmp2 <- suppressWarnings(left_join(tmp2, xt, c("Scenario", "Region", "Variable", "Year", "Model"), relationship = "many-to-many") %>%
mutate(diff = abs(grsum - value), reldiff = 100*abs((grsum - value)/value)) %>%
select(-c("parent", "child")))
tmp2 <- tmp2[, c(1:10)]
tmp2_tiny <- filter(tmp2, reldiff < 3, diff < 0.001)
tmp2 <- filter(tmp2, reldiff >= 3, diff >= 0.001)
message("Summation groups:")
sg<-read.csv("summation_groups.csv", stringsAsFactors = FALSE)
options(width = 80)
for (i in unique(sg$parent)) {
message("parent:")
print(i)
message("children:")
print(filter(sg,parent==i)[,"child"])
message("")
}
options(width = 125)
for (i in levels(x$Model)) {
for (j in levels(x$Scenario)) {
pt <- select(as.data.frame(filter(tmp2, Model==i, Scenario == j)), -c(Region, Unit, diff, Model, Scenario))
if (nrow(pt) > 0 ) {
message(paste0("Summation check for model ", i, " and scenario ", j, ".", "\n", "Shown are > 3% differences (reldiff) between group sum (grsum) and parent variable (value):"))
print(pt, row.names = F)
}
}
}
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment