Created
August 18, 2014 21:44
function for computing periods of "stationarity" following a known measurement in sparse data
Do you have lots of meteorological data?
Most of it missing?
Maybe your sensor is buried! Maybe your battery is low! Or maybe the weather is just really, really boring.
Fear not. stationarity.m is here to save the day.
Just feed it four arguments, in this order: the cumulative difference against which you wish to test, the number of intervals you wish to test across (for example, 100 intervals), a vector of differences between subsequent measurements (using the diff() function is the easiest way!), and a vector of date-nums.
For each window of that many intervals, the function sums the absolute values of the differences in your NaN-filled time series, skipping the NaNs. The goal is for that sum to exceed the cumulative difference. If it does not, the window's start date and the difference at that point are pulled out as "bad date" and "bad vals", and then you can do with them as you want.
This algorithm is simply for exploration, but I've found it to be performant. A threshold of about 5-10x the standard deviation of a 5-minute interval seems to be sufficient for my data at a scale of 12 hours, but your mileage may vary.
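As a rough illustration of preparing the inputs, here is a minimal sketch on a made-up 5-minute record. The variable `temp`, its values, and the threshold of 0.1 are hypothetical and chosen only to keep the example small. They are not taken from the author's data and do not follow the 12-hour rule of thumb above.

```matlab
% Made-up 5-minute record (hypothetical values) with two missing readings.
t0 = datenum(2014, 8, 18, 0, 0, 0);
daysvec = t0 + (0:11)' * (5 / 1440);    % 5 minutes expressed in days
temp = [10.0 10.2 NaN 10.5 10.5 10.5 10.5 NaN 10.5 10.9 11.4 11.8]';

vector = diff(temp);    % differences; each NaN reading yields two NaN diffs
cumdiff = 0.1;          % illustrative threshold, in the units of temp
intervals = 4;          % window length, in number of differences

% Call the function only if stationarity.m is on the path.
if exist('stationarity', 'file') == 2
    [dvbad, dvalbad] = stationarity(cumdiff, intervals, vector, daysvec);
    disp(dvbad)         % one datevec row per flagged window start
end
```

Note that `vector` is one element shorter than `daysvec`, since `diff` returns one fewer value than its input. The flat stretch in the middle of `temp` is the part one would expect to see flagged.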
function [dvbad, dvalbad] = stationarity(cumdiff, intervals, vector, daysvec)
%%% STATIONARITY takes four inputs: a cumulative difference (cumdiff) to
%%% test against, the number of intervals (measurements) to assess, a
%%% vector containing DIFFERENCES between subsequent measurements (i.e.
%%% diff(measurement vector)), and a vector of date nums (daysvec). It
%%% produces an output of the datevec of the days/times where this
%%% cumulative difference is not reached over the subsequent INTERVALS
%%% (dvbad) and the values of vector at those times (dvalbad).

% One flag per window start. nansum requires the Statistics Toolbox.
testvec = zeros(length(vector)-intervals,1);
for i = 1:length(testvec)
    % NaNs are skipped, so an all-NaN window sums to 0 and is flagged.
    testvec(i,1) = nansum(abs(vector(i:i+intervals-1)))<=cumdiff;
end

% Map the flagged window starts back to dates and difference values.
dnbad = daysvec(testvec == 1);
dvbad = datevec(dnbad);
dvalbad = vector(testvec == 1);
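The loop above recomputes every window sum from scratch. For long records, one possible alternative (not part of the original gist) is a prefix-sum formulation, sketched here on made-up differences. The names `a`, `c`, `winsum`, and `flat` are hypothetical, and the window count is kept the same as in the loop.

```matlab
% Made-up differences with NaN gaps (hypothetical values).
d = [0.5 0.4 NaN 0 0 0.01 0 NaN 0.6 0.7]';
intervals = 3;
cumdiff = 0.05;

a = abs(d);
a(isnan(a)) = 0;                 % treat NaN as zero, mirroring nansum
c = [0; cumsum(a)];              % c(k+1) holds the sum of a(1:k)

n = numel(d) - intervals;        % same number of windows as the loop
idx = (1:n)';
winsum = c(idx + intervals) - c(idx);   % sum of a(i:i+intervals-1)
flat = winsum <= cumdiff;        % true where the window barely changes
```

Differences of running sums can pick up small floating-point error on long series, so results very close to the threshold may differ from the loop version.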