Skip to content

Instantly share code, notes, and snippets.

@hakimabdi
Last active January 5, 2024 00:49
Show Gist options
  • Star 6 You must be signed in to star a gist
  • Fork 6 You must be signed in to fork a gist
  • Save hakimabdi/8ca1d2783b0ab8aec5cac993573b08ee to your computer and use it in GitHub Desktop.
Save hakimabdi/8ca1d2783b0ab8aec5cac993573b08ee to your computer and use it in GitHub Desktop.
This function performs a Mann-Kendall trend test with associated p-values on a time seris of gridded satellite data
#############################################################################################
# title : Perform a Mann-Kendall trend test of satellite image time series
# purpose : Mann-Kendall test of monotonic trend and associated statistical significance
# author : Abdulhakim Abdi (@HakimAbdi)
# input : Raster stack/brick comprising a time series of observations
# output : Raster object of monotoic trend (Kendall's tau), significance or both (i.e. brick)
# data : Test data: https://1drv.ms/u/s!AsHsKb_qtbkwg4ti8pJCxvUer9rMqg?e=BGcH1K
# notes : Depending on the time series, it is might be useful prewhiten the data to remove serial correlation
# : prior to extracting the trend. See blog post on www.hakimabdi.com for more information.
#############################################################################################
MKraster <- function(rasterstack, type=c("trend","pval","both")){
# Values for (layers, ncell, ncol, nrow, method, crs, extent) come straight from the input raster stack
# e.g. nlayers(rasterstack), ncell(rasterstack)... etc.
print(paste("Start MKraster:",Sys.time()))
print("Loading parameters")
layers=nlayers(rasterstack);ncell=ncell(rasterstack);
ncol=ncol(rasterstack);nrow=nrow(rasterstack);crs=crs(rasterstack);
extent=extent(rasterstack);pb = txtProgressBar(min = 0, max = ncell, initial = 0)
print("Done loading parameters")
mtrx <- as.matrix(rasterstack,ncol=layers)
empt <- matrix(nrow=ncell, ncol=2)
print("Initiating loop operation")
if (type == "trend"){
for (i in 1:length(mtrx[,1])){
if (all(is.na(mtrx[i,]))){
empt[i,1] <- NA
} else
if (sum(!is.na(mtrx[i,])) < 4){
empt[i,1] <- NA
} else
empt[i,1] <- as.numeric(MannKendall(as.numeric(mtrx[i,]))$tau)
}
print("Creating empty raster")
trend <- raster(nrows=nrow,ncols=ncol,crs=crs)
extent(trend) <- extent
print("Populating trend raster")
values(trend) <- empt[,1]
print(paste("Ending MKraster on",Sys.time()))
trend
}
else
if (type == "pval"){
for (i in 1:length(mtrx[,1])){
if (all(is.na(mtrx[i,]))){
empt[i,1] <- NA
} else
if (sum(!is.na(mtrx[i,])) < 4){
empt[i,1] <- NA
} else
empt[i,1] <- as.numeric(MannKendall(as.numeric(mtrx[i,]))$sl)
}
pval <- raster(nrows=nrow,ncols=ncol,crs=crs)
extent(pval) <- extent
print("Populating significance raster")
values(pval) <- empt[,1]
print(paste("Ending MKraster on",Sys.time()))
pval
}
else
if (type == "both"){
for (i in 1:length(mtrx[,1])){
if (all(is.na(mtrx[i,]))){
empt[i,1] <- NA
} else
if (sum(!is.na(mtrx[i,])) < 4){
empt[i,1] <- NA
} else
tryCatch({
empt[i,1] <- as.numeric(MannKendall(as.numeric(mtrx[i,]))$tau)
empt[i,2] <- as.numeric(MannKendall(as.numeric(mtrx[i,]))$sl)
})
}
tr <- raster(nrows=nrow,ncols=ncol,crs=crs)
pv <- raster(nrows=nrow,ncols=ncol,crs=crs)
print("Populating raster brick")
values(tr) <- empt[,1]
values(pv) <- empt[,2]
brk <- brick(tr,pv)
extent(brk) <- extent
names(brk) <- c("trend","p.value")
print(paste("Ending MKraster on",Sys.time()))
brk
}
}
@LUKE2019-ENG
Copy link

Thank you for sharing the code

@cbchisanga
Copy link

Good R scripts. I have used it in my analysis

@antobaro00
Copy link

Hi,
I am not a code expert at all. But I should perform the Mann-kendall test on a set of rasters using R in qgis.
Could you explain me how to do it, using this code?

Thanks

@cbchisanga
Copy link

Send me you email so that I send you the complete script that calculates the significance and p-value.

This is the script below:

# Perform a Mann-Kendall trend test on satellite image time series in R

The Mann-Kendall trend test has become popular in the remote sensing community to test whether a time series of satellite observations is consistently increasing or decreasing. This consistency is referred to as monotonicity, and the function that describes it is called a monotonic function. On the other hand, a non-monotonic trend increases and decreases at different intervals in the time series.

A number of reasons are behind the popularity of the Mann-Kendall test in remote sensing: (1) it is a non-parametric test, which means it makes no assumptions on the distribution of the data. (2) It also does not assume the data to be homoscedastic, which means the variance around the regression line does not have to be the same for all values of the predictor variable. (3) It is resistant to outliers, so these do not have to be removed prior to trend detection.

The Mann-Kendall test does require that the data be independent and is sensitive to presence of serial correlation (i.e. data points are correlated with one another). This makes the null hypothesis of no trend to be rejected too frequently, making the test very liberal. In this case, it is might be useful to remove the serial correlation from the data through a process called prewhitening (case for prewhitening). On the other hand, if the sample size of the data and the magnitude of the trend are large, serial correlation does not affect the Mann-Kendall test statistics and prewhitening the data may remove part of the trend itself (case against prewhitening). So, the short answer on whether to prewhiten or not is that it depends on your data.

The Mann-Kendall test is usually performed on univariate time series such as hydrometeorological data derived from rain or streamflow gauges. However, satellite image time series has a different data structure that presents a unique challenge, i.e. each pixel is its own time series. While conducting analysis for the last research paper in my PhD, I developed a function in R that can take in raster stacks or bricks to perform the Mann-Kendall trend test and calculate its statistical significance (p values). The advantage of this function is that it only requires two R packages and only has two parameters, which makes it straightforward to run.

# TUTORIAL: HOW TO PERFORM THE M-K TEST ON RASTER DATA IN R

Here I’ve provided sample data to try out the function. The sample dataset comprises biweekly observations of normalized difference vegetation index (NDVI) over South America between 2000 and 2009. The spatial resolution is 0.5 degrees (~50km), so the dataset is not large (about 5.2 MB).

The function depends on the kendall and raster packages in R, so you will need to install them. Once you’ve loaded them as well as custom function I’ve developed then you’re good to go.

require(raster)

require(kendall)

library(Kendall)
library(raster)

load NDVI time series stack from file.

ndvi.stack = stack("ndvi_SouthAmerica_2000-2009.tif")

setwd("D:/2023_modeled_climate_data/CHIRPS/chirps_geotif/Output_TIF/input_geotif/")
all_NDVI_files <- Sys.glob('zambia_chirps-v2*.tif')
all_NDVI_files
tiffiles <- all_NDVI_files

chirps_stack <- stack(tiffiles)
chirps_stack

source("MKraster.R") # this script should in folder where geotiff files are located

get the Mann-Kendall trend and p-values for the time series

MKraster(chirps_stack, type = "both")
mk_precip <- MKraster(chirps_stack, type = "both")

system.time({
mk_precip = MKraster(chirps_stack, type = "both")
})

plot the trend and p-value

plot(mk_precip$trend)
plot(mk_precip$p.value)

plot(mk_precip[[1]])
plot(mk_precip[[2]])

write the result to file

writeRaster(mk_precip, "D:/2023_modeled_climate_data/kendall_analysis/outputs/mk.Annual_evi_Trend.2000.2022.tif", overwrite=TRUE)
getwd()

calculate the trend, p-value and significance

set the path

path <- "D:/MOOC/kanona_forest/correlation_r/"

set the working directory

setwd(path)
list.files()

gimms <- brick("annual.evi.2000.2022.nc", varname= "_250m_16_days_EVI")

gimms[] <- NA

gimms[gimms == -999] <- NA

plot(gimms)

Remove NAs

gimms[is.na(gimms)] <- -9999

gimms <- na.omit(gimms)

first, let’s create annual sums, as autocorrelation might be present with monthly values. To keep NDVI as the unit, the results are devided by 24:

The MOD11A2 Version 6 product provides an average 8-day per-pixel Land Surface Temperature and Emissivity (LST&E) with a 1 kilometer (km) spatial resolution

fun <- function(x) {
gimms.ts = ts(x, start=c(2000,2), end=c(2022,1), frequency=1) # annual timestep
x <- aggregate(gimms.ts)
}
gimms.sum <- calc(gimms, fun)
gimms.sum=gimms.sum/1
plot(gimms.sum)
nlayers(gimms.sum)

#then the slope is calculated to get the direction and magnitude of trends, multiplied by the number of years to get the change in NDVI units:

time <- 1:nlayers(gimms.sum)
fun=function(x) { if (is.na(x[1])){ NA } else { m = lm(x ~ time); summary(m)$coefficients[2] }}
gimms.slope=calc(gimms.sum, fun)
gimms.slope=gimms.slope*25
plot(gimms.slope)

now we need to see which trends are significant. Thus we first extract the p-value:

fun=function(x) { if (is.na(x[1])){ NA } else { m = lm(x ~ time); summary(m)$coefficients[8] }}
p <- calc(gimms.sum, fun=fun)
plot(p, main="p-Value")

then mask all values >0.05 to get a confidence level of 95%:

m = c(0, 0.05, 1, 0.05, 1, 0)
rclmat = matrix(m, ncol=3, byrow=TRUE)
p.mask = reclassify(p, rclmat)
fun=function(x) { x[x<1] <- NA; return(x)}
p.mask.NA = calc(p.mask, fun)

and finaly mask all insignificant values in the trend map, so we only get NDVI change significant at the 95% level:

trend.sig = mask(gimms.slope, p.mask.NA)

plot(trend.sig, main="significant EVI change")

plot(shp, add=T)

evi.trend.sig <- trend.sig
plot(evi.trend.sig, main="significant EVI change")
plot(shp, add=T)

output<-mask(trend.sig, shp) # mask netcdf data using shp
output

The result could look like that:

The duration will vary depending on your data and computing configuration. If you have a large dataset and a multicore computer, you can use the clusterR function from within the raster package to speed up the process.

Using the sample NDVI data, this Mann-Kendall trend and its significance took 2 seconds on my laptop.

Using the sample NDVI data, this Mann-Kendall trend and its significance took 2 seconds on my laptop.

The function was developed for the paper below, so please cite it if you use any part of the code in a publication or if you build upon it.

# Citation:

Abdi, A. M., Boke-Olén, N., Jin, H., Eklundh, L., Tagesson, T., Lehsten, V., & Ardö, J. (2019). First assessment of the plant phenology index (PPI) for estimating gross primary productivity in African semi-arid ecosystems. International Journal of Applied Earth Observation and Geoinformation, 78, 249-260.

The function is available as a Gist on Github. The sample NDVI data I used to produce the figure above along with a tutorial script can be downloaded here. These sample data can conceivably be used in a study, if anyone is interested.

@antobaro00
Copy link

antobaro00 commented Jul 2, 2023 via email

@antobaro00
Copy link

@cbchisanga
Unfortunately, my e-mail address is censored.
In any case, I would need the procedure that can be used via R but in QGIS.
Could you explain how to do it?

Thank you very much
Antonio

@cbchisanga
Copy link

You can use QGIS 2.18.24. It has plugin for SPI Utility for Raster & SPI Utility for Vector. Read the manual on input data.
R scripts for computing drought indices are plenty.
For python install the climate-indices module in Miniconda or Anaconda or in QGIS. Using the python console.

@cbchisanga Unfortunately, my e-mail address is censored. In any case, I would need the procedure that can be used via R but in QGIS. Could you explain how to do it?

Thank you very much Antonio

You can use QGIS 2.18.24. It has plugin for SPI Utility for Raster & SPI Utility for Vector. Read the manual on input data.
R scripts for computing drought indices are plenty.
For python install the climate-indices module in Miniconda or Anaconda or in QGIS. Using the python console.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment