@Kcnarf
Last active December 3, 2018 08:34
timeline - seasonality detection (I)
license: mit

If you suspect that a time series has a seasonal component, this example shows how to validate or invalidate that hypothesis.

Seasonality means that the time series has a periodic component: the same pattern repeats in each period. For example, a store's sales may have a weekly seasonality: sales increase on Saturday, and there are no sales at all on Sunday.
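To make "the same pattern in each period" concrete, here is a minimal sketch that averages a series by position within a suspected period. The function name `seasonalProfile` and the sales figures are hypothetical illustrations; they are not part of this block's code.

```javascript
// Mean value observed at each position of the period (0 .. period-1).
function seasonalProfile(values, period) {
  const sums = new Array(period).fill(0);
  const counts = new Array(period).fill(0);
  values.forEach(function (v, i) {
    sums[i % period] += v;
    counts[i % period] += 1;
  });
  return sums.map(function (s, k) {
    return counts[k] ? s / counts[k] : NaN;
  });
}

// Hypothetical daily sales over three weeks, Monday first:
// a Saturday peak and no sales on Sunday.
const weeklySales = [
  12, 11, 13, 12, 14, 25, 0,
  13, 12, 12, 11, 15, 27, 0,
  11, 13, 12, 13, 14, 26, 0
];
const weekProfile = seasonalProfile(weeklySales, 7);
console.log(weekProfile);
```

If the profile is far from flat (here a clear Saturday peak and a Sunday at zero), a weekly seasonality is plausible; a nearly flat profile suggests the opposite.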

Graphically, detecting seasonality is (quite) easy: just look for a repeating pattern. It can be difficult, however, if the pattern has a long period and/or the magnitude of the seasonality is low (i.e. the lowest and highest values stay close to the season's mean; in that case there might be no seasonality at all!).

Computationally, one can use autocorrelation. Correlation measures whether two time series are correlated (i.e. whether they behave in the same way). Autocorrelation is the same technique applied to a single time series: it measures the correlation between the time series and a copy of itself shifted by a certain amount of time. Hence, if you suspect a seasonality of 7 days, the autocorrelation between the time series and the same series shifted by 7 days will validate (autocorrelation near 1) or invalidate (autocorrelation near 0) that hypothesis.
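A minimal sketch of that computation, assuming a plain array of evenly spaced values. The helper names `pearson` and `autocorrelation` and the sample pattern are illustrative assumptions; the block's own code computes the same kind of coefficient inside `updateTrends`.

```javascript
// Pearson correlation coefficient between two equally long arrays.
function pearson(a, b) {
  const n = a.length;
  let sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
  for (let i = 0; i < n; i++) {
    sumA += a[i];
    sumB += b[i];
    sumAA += a[i] * a[i];
    sumBB += b[i] * b[i];
    sumAB += a[i] * b[i];
  }
  const meanA = sumA / n, meanB = sumB / n;
  const covariance = sumAB / n - meanA * meanB;
  const varianceA = sumAA / n - meanA * meanA;
  const varianceB = sumBB / n - meanB * meanB;
  return covariance / Math.sqrt(varianceA * varianceB);
}

// Correlation between the series and itself shifted by `lag` samples.
function autocorrelation(values, lag) {
  const current = values.slice(lag);                   // values[lag], values[lag+1], ...
  const lagged = values.slice(0, values.length - lag); // values[0], values[1], ...
  return pearson(current, lagged);
}

// Hypothetical series: a pattern of period 4 repeated five times, no trend.
const basePattern = [0, 9, 11, 20];
const periodic = [];
for (let i = 0; i < 20; i++) {
  periodic.push(basePattern[i % 4]);
}
console.log("lag 4:", autocorrelation(periodic, 4)); // the true period
console.log("lag 2:", autocorrelation(periodic, 2)); // a wrong period
```

On this trend-free series, the coefficient at the true period should be close to 1 and noticeably lower at the wrong one.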

Usage:

  • in the left graph, drag and drop points to update the timeline and create seasons of your choice (shortcuts are available below the graph);
  • then test for a particular seasonality period, and see how the correlation coefficient behaves;
  • decrease the magnitude of the seasonal component to see that a small seasonal component is difficult to detect: the correlation coefficient stays high whatever period is tested, even when the suspected season length is not the real one.
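The last point can be reproduced outside the demo. The sketch below is an illustration under stated assumptions: the names `lagCorrelation` and `trendedSeries` are hypothetical, and the series (a linear trend plus a period-4 pattern) only loosely mirrors what `updateSeasonalityPeriod` generates, without the random noise. One plausible reason the coefficient stays high is that the upward trend dominates once the seasonal pattern is small.

```javascript
// Correlation between a series and itself shifted by `lag` samples.
function lagCorrelation(values, lag) {
  const n = values.length - lag;
  let sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
  for (let i = 0; i < n; i++) {
    const a = values[i + lag], b = values[i];
    sx += a;
    sy += b;
    sxx += a * a;
    syy += b * b;
    sxy += a * b;
  }
  const covariance = sxy / n - (sx / n) * (sy / n);
  const varianceA = sxx / n - (sx / n) * (sx / n);
  const varianceB = syy / n - (sy / n) * (sy / n);
  return covariance / Math.sqrt(varianceA * varianceB);
}

// Hypothetical series: linear trend plus a period-4 pattern scaled by `magnitude`.
function trendedSeries(magnitude) {
  const seasonPattern = [-7, 2, 0, 5];
  const out = [];
  for (let i = 0; i < 20; i++) {
    out.push(1.6 * (i + 1) + 10 + magnitude * seasonPattern[i % 4]);
  }
  return out;
}

const strongSeason = trendedSeries(1);  // marked seasonal pattern
const weakSeason = trendedSeries(0.1);  // same pattern, ten times smaller

console.log("strong, lag 4:", lagCorrelation(strongSeason, 4));
console.log("strong, lag 3:", lagCorrelation(strongSeason, 3));
console.log("weak,   lag 4:", lagCorrelation(weakSeason, 4));
console.log("weak,   lag 3:", lagCorrelation(weakSeason, 3));
```

With the marked pattern, the wrong lag (3) should score clearly below the true lag (4). With the small pattern, both lags should score close to 1, so the coefficient no longer discriminates between them.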

Notes:

  • another block explains time series correlation
  • another block deals with the impact of seasonality when computing the trend of a timeline

Acknowledgments:

<!DOCTYPE html>
<meta charset="utf-8">
<style>
body {
position: relative;
background-color: #ddd;
margin: auto;
}
#under-construction {
display: none;
position: absolute;
top: 200px;
left: 300px;
font-size: 40px;
}
#controls {
position: absolute;
top: 420px;
right: 10px;
font: 11px arial;
text-align: right;
}
.viz {
position: absolute;
background-color: white;
border-radius: 10px;
}
.viz#timelines {
top: 5px;
left: 5px;
}
.viz#correlation-plot {
top: 5px;
right: 5px;
}
.viz#correlation {
top: 385px;
right: 5px;
}
.flow {
position: absolute;
font-size: 30px;
color: darkgrey;
}
.flow#flow1 {
top: 310px;
left: 470px;
}
.flow#flow2 {
top: 350px;
right: 435px;
}
.axis path,
.axis line {
fill: none;
stroke: black;
shape-rendering: crispEdges;
}
.axis text {
font-family: sans-serif;
font-size: 11px;
}
.grid>line, .grid>.intersect {
fill: none;
stroke: #ddd;
shape-rendering: crispEdges;
vector-effect: non-scaling-stroke;
}
.legend {
font-size: 12px;
}
.shift text {
font-size: 12px;
}
.shift path {
fill: none;
stroke: black;
shape-rendering: crispEdges;
}
.dot {
fill: steelblue;
stroke: white;
stroke-width: 3px;
}
.dot.correlated {
stroke: none;
opacity: 0.5;
}
.dot.draggable:hover, .dot.dragging {
fill: pink;
cursor: ns-resize;
}
.timeline {
fill: none;
stroke: lightsteelblue;
stroke-width: 2px;
}
.timeline.lagged {
opacity: 0.5;
}
.timeline.draggable:hover, .timeline.dragging {
stroke: pink;
opacity: 1;
cursor: ns-resize;
}
.trend {
stroke: steelblue;
stroke-width: 2px;
}
.trend.lagged {
stroke: green;
}
.trend.correlated {
stroke: grey;
}
#correlation-bar {
fill: grey;
}
</style>
<body>
<div id="timelines" class="viz">
<div id="controls">
update time series with a seasonality's length of <a href="#" onclick="updateSeasonalityPeriod(2);">2</a> / <a href="#" onclick="updateSeasonalityPeriod(3);">3</a> / <a href="#" onclick="updateSeasonalityPeriod(4);">4</a> / <a href="#" onclick="updateSeasonalityPeriod(5);">5</a> / <a href="#" onclick="updateSeasonalityPeriod(6);">6</a> / <a href="#" onclick="updateSeasonalityPeriod(7);">7</a> / <a href="#" onclick="updateSeasonalityPeriod(8);">8</a> / <a href="#" onclick="updateSeasonalityPeriod(9);">9</a> / <a href="#" onclick="updateSeasonalityPeriod(10);">10</a> periods
<br/>
<br/>
test for a seasonality's length of <a href="#" onclick="testSeasonalityPeriod(2);">2</a> / <a href="#" onclick="testSeasonalityPeriod(3);">3</a> / <a href="#" onclick="testSeasonalityPeriod(4);">4</a> / <a href="#" onclick="testSeasonalityPeriod(5);">5</a> / <a href="#" onclick="testSeasonalityPeriod(6);">6</a> / <a href="#" onclick="testSeasonalityPeriod(7);">7</a> / <a href="#" onclick="testSeasonalityPeriod(8);">8</a> / <a href="#" onclick="testSeasonalityPeriod(9);">9</a> / <a href="#" onclick="testSeasonalityPeriod(10);">10</a> periods
<br/>
<br/>
<a href="#" onclick="disperse();">increase</a> / <a href="#" onclick="concentrate();">decrease</a> seasonality's order of magnitude
<br/>
</div>
</div>
<div id="correlation-plot" class="viz"></div>
<div id="correlation" class="viz"></div>
<div id="flow1" class="flow">&#8614;</div>
<div id="flow2" class="flow">&#8615;</div>
<div id="under-construction">
UNDER CONSTRUCTION
</div>
<script src="https://d3js.org/d3.v3.min.js"></script>
<script>
var timeSerie = [];
var laggedTimeSerie = [];
var correlatedSerie = [];
var suspectedSeasonLength = 4;
var WITH_TRANSITION = true;
var WITHOUT_TRANSITION = false;
var duration = 500;
var timelineVizDimension = {width: 960/2, height:500},
correlationPlotVizDimension = {width: 960/2, height:370},
correlationVizDimension = {width: 960/2, height:130},
vizMargin = 5,
flowWidth = 20,
legendHeight = 20,
xAxisLabelHeight = 10,
yAxisLabelWidth = 10,
margin = {top: 20, right: 20, bottom: 20, left: 20},
timelineSvgWidth = timelineVizDimension.width - 2*vizMargin - flowWidth/2,
timelineSvgHeight = timelineVizDimension.height - 2*vizMargin,
correlationPlotSvgWidth = correlationPlotVizDimension.width - 2*vizMargin - flowWidth/2,
correlationPlotSvgHeight = correlationPlotVizDimension.height - 2*vizMargin - flowWidth/2,
correlationSvgWidth = correlationVizDimension.width - 2*vizMargin - flowWidth/2,
correlationSvgHeight = correlationVizDimension.height - 2*vizMargin - flowWidth/2,
timelineWidth = timelineSvgWidth - margin.left - margin.right - yAxisLabelWidth,
// timelineHeight = timelineSvgHeight - margin.top - margin.bottom - xAxisLabelHeight;
correlationPlotWidth = correlationPlotSvgWidth - margin.left - margin.right - yAxisLabelWidth,
correlationPlotHeight = correlationPlotSvgHeight - margin.top - margin.bottom - xAxisLabelHeight,
timelineHeight = correlationPlotHeight,
correlationWidth = correlationSvgWidth - margin.left - margin.right,
correlationHeight = correlationSvgHeight - margin.top - margin.bottom - xAxisLabelHeight - 2*legendHeight;
var drag = d3.behavior.drag()
.origin(function(d) { return d; })
.on("dragstart", dragStarted)
.on("drag", dragged1)
.on("dragend", dragEnded);
var x = d3.scale.linear()
.domain([0, 20])
.range([0, timelineWidth]);
var y = d3.scale.linear()
.domain([0, 50])
.range([0, -timelineHeight]);
var xPlot = d3.scale.linear()
.domain([0, 50])
.range([0, correlationPlotWidth]);
var yPlot = d3.scale.linear()
.domain([0, 50])
.range([0, -correlationPlotHeight]);
var xCorrelation = d3.scale.linear()
.domain([-1, 1])
.range([0, correlationWidth]);
var xAxisDef = d3.svg.axis()
.scale(x)
.ticks(11);
var yAxisDef = d3.svg.axis()
.scale(y)
.orient("left");
var xAxisPlotDef = d3.svg.axis()
.scale(xPlot);
var yAxisPlotDef = d3.svg.axis()
.scale(yPlot)
.orient("left");
var xAxisCorrelationDef = d3.svg.axis()
.scale(xCorrelation)
.ticks(5);
var svg = d3.select("#timelines").append("svg")
.attr("width", timelineSvgWidth)
.attr("height", timelineSvgHeight)
.append("g")
.attr("transform", "translate(" + [margin.left, margin.top] + ")");
var container = svg.append("g")
.attr("id", "graph")
.attr("transform", "translate(" + [yAxisLabelWidth, timelineHeight] + ")");
var grid = container.append("g")
.attr("class", "grid");
var intersects = [];
d3.range(2, x.invert(timelineWidth), 2).forEach(function(a) { d3.range(5, y.invert(-timelineHeight),5).forEach(function(b) { intersects.push([a,b])})});
grid.selectAll(".intersect")
.data(intersects)
.enter().append("path")
.classed("intersect", true)
.attr("d", function(d) { return "M"+[x(d[0])-1,y(d[1])]+"h3M"+[x(d[0]),y(d[1])-1]+"v3"});
container.append("text")
.attr("transform", "translate(" + [timelineWidth/2, -timelineHeight] + ")")
.attr("text-anchor", "middle")
.text("Timelines");
container.append("g")
.attr("class", "axis x")
.call(xAxisDef)
.append("text")
.attr("x", timelineWidth)
.attr("y", -6)
.style("text-anchor", "end")
.text("Time");
container.append("g")
.attr("class", "axis y")
.call(yAxisDef)
.append("text")
.attr("transform", "rotate(-90)")
.attr("x", timelineHeight)
.attr("y", 16)
.style("text-anchor", "end")
.text("Amount");
var legend = container.append("g")
.classed("legend", true)
.attr("transform", "translate(" + 100 + "," + (xAxisLabelHeight+legendHeight) + ")");
var currentLegend = legend.append("g")
.attr("transform", "translate(" + 0 + ",0)");
currentLegend.append("line")
.classed("timeline", true)
.attr("x1", -20)
.attr("y1", -5)
.attr("x2", -5)
.attr("y2", -5);
currentLegend.append("text")
.attr("dx", 5)
.text(": raw timeline");
currentLegend = legend.append("g")
.attr("transform", "translate(" + 120 + ",0)");
currentLegend.append("line")
.classed("timeline lagged", true)
.attr("x1", -20)
.attr("y1", -5)
.attr("x2", -5)
.attr("y2", -5);
var laggedTimelineLegend = currentLegend.append("text")
.attr("dx", 5)
.text(": 4-period lagged timeline");
var shiftLegend = container.append("g")
.classed("shift", true)
.attr("transform", "translate(" + [x(1), y(1)] + ")");
shiftLegend.append("text")
.attr("transform", "translate(" + [x(4)+5,0] + ")")
.text("suspected season's length");
shiftLegend.append("path")
.attr("d", "M0,0h"+x(4)+"l-3,-3");
// The "d" attributes are set later by updateTimelines(), once the data is loaded.
var timeline1 = container.append("path")
.datum(1)
.classed("timeline serie1", true);
var timeline2 = container.append("path")
.datum(2)
.classed("timeline lagged", true);
var dotContainer = container.append("g")
.classed("dots", true);
svg = d3.select("#correlation-plot").append("svg")
.attr("width", correlationPlotSvgWidth)
.attr("height", correlationPlotSvgHeight)
.append("g")
.attr("transform", "translate(" + [margin.left, margin.top] + ")");
var container2 = svg.append("g")
.attr("id", "graph correlated")
.attr("transform", "translate(" + [yAxisLabelWidth, correlationPlotHeight] + ")");
var grid2 = container2.append("g")
.attr("class", "grid");
var intersects = [];
d3.range(5, xPlot.invert(correlationPlotWidth), 5).forEach(function(a) { d3.range(5, yPlot.invert(-correlationPlotHeight),5).forEach(function(b) { intersects.push([a,b])})});
grid2.selectAll(".intersect")
.data(intersects)
.enter().append("path")
.classed("intersect", true)
.attr("d", function(d) { return "M"+[xPlot(d[0])-1,yPlot(d[1])]+"h3M"+[xPlot(d[0]),yPlot(d[1])-1]+"v3"});
container2.append("text")
.attr("transform", "translate(" + [correlationPlotWidth/2, -correlationPlotHeight] + ")")
.attr("text-anchor", "middle")
.text("Scatter plot");
var scatterPlotSubTitle = container2.append("text")
.attr("transform", "translate(" + [correlationPlotWidth/2, -correlationPlotHeight+15] + ")")
.attr("text-anchor", "middle")
.style("font-size", "12px")
.text("between the time series and its 4-period lagged version");
container2.append("g")
.attr("class", "axis x")
.call(xAxisPlotDef)
.append("text")
.attr("x", correlationPlotWidth)
.attr("y", -6)
.style("text-anchor", "end")
.text("Amount (from time series)");
container2.append("g")
.attr("class", "axis y")
.call(yAxisPlotDef)
.append("text")
.attr("transform", "rotate(-90)")
.attr("x", correlationPlotHeight)
.attr("y", 16)
.style("text-anchor", "end")
.text("Amount (from lagged time series)");
var correlatedDotContainer = container2.append("g")
.classed("dots correlated", true);
var correlatedTrendLine = container2.append("line")
.attr("class", "trend correlated")
.attr("x1", xPlot(0))
.attr("y1", yPlot(0))
.attr("x2", xPlot(50))
.attr("y2", yPlot(50));
container = d3.select("#correlation").append("svg")
.attr("width", correlationSvgWidth)
.attr("height", correlationSvgHeight)
.append("g")
.attr("transform", "translate(" + [margin.left, margin.top] + ")");
var graph3 = container.append("g")
.attr("id", "graph correlation")
.attr("transform", "translate(" + [0, correlationHeight] + ")");
var correlationTitle = graph3.append("text")
.attr("transform", "translate(" + [correlationWidth/2, -correlationHeight] + ")")
.attr("text-anchor", "middle")
.text("Coefficient of 4-period autocorrelation");
graph3.append("g")
.attr("class", "axis x")
.call(xAxisCorrelationDef);
var legend3 = container.append("g")
.classed("legend", true)
.attr("transform", "translate(" + [0, correlationHeight+xAxisLabelHeight] + ")");
legend3.append("g")
.selectAll("text")
.data(["Perfect", "Inverted", "Correlation"])
.enter().append("text")
.attr("x", 0)
.attr("y", function(d,i) { return 25 + i*10; })
.style("text-anchor", "start")
.text( function(d) { return d; });
legend3.append("g")
.attr("transform", "translate(" + [correlationWidth/2, 0] + ")")
.selectAll("text")
.data(["No", "Correlation"])
.enter().append("text")
.attr("x", 0)
.attr("y", function(d,i) { return 25 + i*10; })
.style("text-anchor", "middle")
.text( function(d) { return d; });
legend3.append("g")
.attr("transform", "translate(" + [correlationWidth, 0] + ")")
.selectAll("text")
.data(["Perfect", "Correlation"])
.enter().append("text")
.attr("x", 0)
.attr("y", function(d,i) { return 25 + i*10; })
.style("text-anchor", "end")
.text( function(d) { return d; });
var correlationBar = graph3.append("path")
.attr("id", "correlation-bar")
.attr("d", "M"+[xCorrelation(0), -1]+"v-10H"+xCorrelation(0.34)+"v10z")
d3.csv("timeserie.csv", dottype, function(error, dots) {
if (error) throw error; // stop early if timeserie.csv could not be loaded
updateLaggedTimeSerie();
updateCorrelatedSerie();
updateDots(WITHOUT_TRANSITION);
updateCorrelatedDots(WITHOUT_TRANSITION);
updateTimelines(WITHOUT_TRANSITION);
updateTrends(WITHOUT_TRANSITION);
});
function dottype(d) {
d.x = +d.x;
d.y = +d.y;
timeSerie.push(d);
return d;
}
function updateLaggedTimeSerie() {
var serieLength = timeSerie.length; // declared locally to avoid an implicit global
var newLaggedTimeSerie = [];
timeSerie.forEach(function(d,i) {
if (i<serieLength-suspectedSeasonLength) {
newLaggedTimeSerie.push({x: i+suspectedSeasonLength+1, y: d.y});
}
});
laggedTimeSerie = newLaggedTimeSerie;
}
function updateCorrelatedSerie() {
var newCorrelatedSerie = laggedTimeSerie.map(function(d,i) {
return {y: timeSerie[i+suspectedSeasonLength].y, laggedY:d.y }
})
correlatedSerie = newCorrelatedSerie
}
var line1 = d3.svg.line()
.x(function(d) { return x(d.x); })
.y(function(d) { return y(d.y); });
var line2 = d3.svg.line()
.x(function(d) { return x(d.x); })
.y(function(d) { return y(d.y); });
function updateSeasonalityPeriod(newSeasonPeriod) {
var trend = 1.6;
var intercept = 10;
var expected;
timeSerie.forEach(function(d,i) {
expected = trend*d.x+intercept;
switch (i%newSeasonPeriod) {
case 0: expected -= 7; break;
case 1: expected += 2; break;
case (newSeasonPeriod-1): expected += 5; break;
}
d.y = expected+2*(Math.random()-0.5);
})
updateLaggedTimeSerie();
updateCorrelatedSerie();
updateDots(WITH_TRANSITION);
updateTimelines(WITH_TRANSITION);
updateCorrelatedDots(WITH_TRANSITION);
updateTrends(WITH_TRANSITION);
}
function testSeasonalityPeriod(seasonLength) {
suspectedSeasonLength = seasonLength;
updateLaggedTimeSerie();
updateCorrelatedSerie();
updateDots(WITH_TRANSITION);
updateTimelines(WITH_TRANSITION);
updateCorrelatedDots(WITH_TRANSITION);
updateTrends(WITH_TRANSITION);
}
function changeDispersion(scale) {
var serieLength = timeSerie.length;
var timeInterval = 1;
var ySum = 0;
var timeYSum = 0;
timeSerie.forEach(function(d){
ySum += d.y;
timeYSum += d.x*d.y;
});
var trend = (12*timeYSum - 6*(serieLength+1)*ySum)/(timeInterval*serieLength*(serieLength-1)*(serieLength+1));
var intercept = (2*(2*serieLength+1)*ySum - 6*timeYSum)/(serieLength*(serieLength-1));
var expected;
timeSerie.forEach(function(d){
expected = d.x*trend + intercept;
d.y = expected + scale*(d.y-expected);
});
updateLaggedTimeSerie();
updateCorrelatedSerie();
updateDots(WITH_TRANSITION);
updateTimelines(WITH_TRANSITION);
updateCorrelatedDots(WITH_TRANSITION);
updateTrends(WITH_TRANSITION);
}
function disperse() {
changeDispersion(1.6)
}
function concentrate() {
changeDispersion(0.625)
}
function updateDots(withTransition) {
var dots = dotContainer.selectAll(".dot.serie1")
.data(timeSerie);
dots.enter()
.append("circle")
.classed("dot draggable serie1", true)
.attr("r", 5)
.attr("cx", function(d) { return x(d.x); })
.call(drag);
dots.transition()
.duration(withTransition? duration : 0)
.attr("cy", function(d) { return y(d.y); })
}
function updateTimelines(withTransition) {
laggedTimelineLegend.text(": "+suspectedSeasonLength+"-period lagged timeline");
shiftLegend.select("text").transition()
.duration(withTransition? duration : 0)
.attr("transform", "translate("+[x(suspectedSeasonLength)+5,0]+")");
shiftLegend.select("path").transition()
.duration(withTransition? duration : 0)
.attr("d", "M0,0h"+x(suspectedSeasonLength)+"l-3,-3");
timeline1.transition()
.duration(withTransition? duration : 0)
.attr("d", line1(timeSerie));
timeline2.transition()
.duration(withTransition? duration : 0)
.attr("d", line2(laggedTimeSerie));
}
function updateCorrelatedDots(withTransition) {
scatterPlotSubTitle.text("between the time series and its " +suspectedSeasonLength+"-period lagged version");
var correlatedDots = correlatedDotContainer.selectAll(".dot.correlated")
.data(correlatedSerie);
correlatedDots.enter()
.append("circle")
.classed("dot correlated", true)
.attr("r", 3.5);
correlatedDots.transition()
.duration(withTransition? duration : 0)
.attr("cx", function(d) { return xPlot(d.y); })
.attr("cy", function(d) { return yPlot(d.laggedY); })
correlatedDots.exit().remove();
}
function updateTrends(withTransition) {
var correlatedSerieLength = correlatedSerie.length;
var timeInterval = 1
var ySum = 0;
var squareYSum = 0;
var laggedYSum = 0;
var squareLaggedYSum = 0;
// below sums are for trend lines and correlation
var yLaggedYSum = 0;
correlatedSerie.forEach(function(d){
ySum += d.y;
squareYSum += Math.pow(d.y, 2);
laggedYSum += d.laggedY;
squareLaggedYSum += Math.pow(d.laggedY, 2);
yLaggedYSum += (d.y)*(d.laggedY);
});
var yMean = ySum/correlatedSerieLength;
var laggedYMean = laggedYSum/correlatedSerieLength;
var yVariance = squareYSum/correlatedSerieLength - Math.pow(yMean, 2);
var laggedYVariance = squareLaggedYSum/correlatedSerieLength - Math.pow(laggedYMean, 2);
var yStdDev = Math.pow(yVariance, 0.5)
var laggedYStdDev = Math.pow(laggedYVariance, 0.5)
var yLaggedYCovariance = yLaggedYSum/correlatedSerieLength - yMean*laggedYMean;
var correlatedTrend = yLaggedYCovariance/(yVariance);
var correlatedIntercept = laggedYMean - correlatedTrend*yMean;
var correlationCoefficient = yLaggedYCovariance/(yStdDev*laggedYStdDev);
correlatedTrendLine
.transition()
.duration(withTransition? duration : 0)
.attr("y1", yPlot(correlatedIntercept))
.attr("y2", yPlot(correlatedTrend*50+correlatedIntercept));
correlationTitle.text("Coefficient of "+suspectedSeasonLength+"-period autocorrelation");
correlationBar
.transition()
.duration(withTransition? duration : 0)
.attr("d", "M"+[xCorrelation(0), -1]+"v-10H"+xCorrelation(correlationCoefficient)+"v10z");
}
function dragStarted(d) {
d3.select(this).classed("dragging", true);
}
function dragged1(d) {
d.y += y.invert(d3.event.dy)
updateLaggedTimeSerie();
updateCorrelatedSerie();
updateDots(WITHOUT_TRANSITION);
updateTimelines(WITHOUT_TRANSITION);
updateCorrelatedDots(WITHOUT_TRANSITION);
updateTrends(WITHOUT_TRANSITION);
}
function dragEnded(d) {
d3.select(this).classed("dragging", false);
}
</script>
x,y
1,7
2,15
3,16
4,25
5,11
6,20
7,20
8,30
9,16
10,24
11,25
12,35
13,20
14,30
15,30
16,42
17,25
18,33
19,34
20,44