@galvanic
Last active September 21, 2015 16:06
user journey visualisation
*/
getDataAWS.js

This is a visualisation made using D3.js for visualising marketing data about user journeys. A demo can be found here: http://bl.ocks.org/galvanic/2eb5043ea7c2dd845975.

The data describes each unique journey. A journey is a series of touches at pre-defined touchpoints. Each touchpoint is represented by a letter, for conciseness. For example, 'A' could be 'viewed ad #5 online'. Representing the touchpoints as letters has two advantages: it is a quick visual shortcut, and it allows further analysis on the string (e.g. string matching). The granularity of these touchpoints is chosen earlier in the pipeline, at the Spark stage (at least at this stage of the project). The order of the touches is important.
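Because each journey is just a string, ordinary string operations are enough for simple analyses. The sketch below is illustrative only: the `journeys` sample values and the `sessionsMatching` helper are hypothetical and not part of this gist, and the items are assumed to be already flattened to plain strings and numbers.

```javascript
// Hypothetical sample: journey strings with the number of sessions that followed each.
var journeys = [
  { FunnelLetters: 'ABBBCAABCC', sessionCount: 589 },
  { FunnelLetters: 'AC', sessionCount: 120 },
  { FunnelLetters: 'CAB', sessionCount: 45 }
]

// Sum the sessions of every journey whose letter sequence matches `pattern`.
function sessionsMatching(journeys, pattern) {
  return journeys
    .filter(function(j) { return pattern.test(j.FunnelLetters) })
    .reduce(function(total, j) { return total + j.sessionCount }, 0)
}

// Journeys that start at touchpoint A and touch C later on (order matters).
var startsAThenC = sessionsMatching(journeys, /^A.*C/)
// Journeys that touch B at least twice in a row.
var repeatedB = sessionsMatching(journeys, /BB/)

console.log(startsAThenC, repeatedB) // 709 589
```

Regular expressions are only one option here; `indexOf` or a prefix comparison would work equally well for simpler questions.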

The data looks like this:

  • FunnelLetters: e.g. "ABBBCAABCC". A string of letters that represents the sequence of touches (to be renamed JourneyLetters)
  • sessionCount: e.g. 589. A natural number (positive integer) that represents the number of 'marketing sessions' that followed that journey
  • conversionCount: e.g. 15. A natural number, lower than the session count, that represents the number of sessions going through that journey that converted.
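The conversion rate plotted on the y axis follows directly from these two counts. A minimal sketch, assuming an item already flattened to plain numbers (the `journey` object and the `conversionRate` helper are hypothetical names used only for illustration):

```javascript
// Hypothetical item using the example values from the list above.
var journey = { FunnelLetters: 'ABBBCAABCC', sessionCount: 589, conversionCount: 15 }

// Conversion rate = converted sessions / all sessions that followed the journey.
function conversionRate(journey) {
  if (journey.sessionCount <= 0) return 0 // guard against division by zero
  return journey.conversionCount / journey.sessionCount
}

var rate = conversionRate(journey)
var ratePercent = (rate * 100).toFixed(2) + '%'
console.log(ratePercent) // 2.55%
```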

There is an extra item in the table under the hash key "===" that stores the legend mapping each letter to its touchpoint, in a letterToTouchpoint attribute. letterToTouchpoint is a Map: keys are letters and values are the corresponding touchpoint descriptions.
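As an illustration of how the legend can be used, the sketch below expands a journey string into its touchpoint descriptions. It assumes the Map has already been unwrapped into a plain object; the `describeJourney` helper is a hypothetical name, and the descriptions are the ones from the sample data in this gist.

```javascript
// Legend in plain-object form (letter -> touchpoint description).
var letterToTouchpoint = {
  A: 'Ad #1 Clicked',
  B: 'Ad #2 Displayed',
  C: 'Ad #1 Displayed'
}

// Expand a journey string into ordered touchpoint descriptions.
// Letters missing from the legend fall back to the letter itself.
function describeJourney(letters, legend) {
  return letters.split('').map(function(letter) {
    return Object.prototype.hasOwnProperty.call(legend, letter) ? legend[letter] : letter
  })
}

var steps = describeJourney('AABC', letterToTouchpoint)
console.log(steps.join(' > '))
// Ad #1 Clicked > Ad #1 Clicked > Ad #2 Displayed > Ad #1 Displayed
```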

There are also some assumptions about the number of journeys, the number of unique touchpoints and the number of touches per journey for each randomly generated dataset; see the generateData.js file. While the conversion rate can technically be anywhere between 0 and 1, in practice it is more likely to lie between 0 and about 5%, so the generator caps it at a similarly low upper limit (maxConversionRate).
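One of those assumptions is that short journeys are much more common than long ones. generateData.js expresses this by picking the journey length uniformly from a pool in which the lengths 1 and 2 are repeated many times. The sketch below reproduces that pool without d3 or Chance to show the resulting probabilities; the `range`, `repeat` and `probabilityOf` helpers are hypothetical stand-ins written for this example.

```javascript
// Like d3.range(start, stop): the stop value is exclusive.
function range(start, stop) {
  var out = []
  for (var i = start; i < stop; i++) out.push(i)
  return out
}

// An array containing `value` repeated `times` times.
function repeat(value, times) {
  var out = []
  for (var i = 0; i < times; i++) out.push(value)
  return out
}

// Lengths 1..9 once each, plus 30 extra ones and 15 extra twos.
var lengthPool = range(1, 10).concat(repeat(1, 30)).concat(repeat(2, 15))

// Probability of drawing `length` with a uniform pick from the pool.
function probabilityOf(length) {
  var hits = lengthPool.filter(function(d) { return d === length }).length
  return hits / lengthPool.length
}

console.log(lengthPool.length)  // 54
console.log(probabilityOf(1))   // 31/54, roughly 0.57
console.log(probabilityOf(2))   // 16/54, roughly 0.30
console.log(probabilityOf(10))  // 0
```

With these numbers more than half of the generated journeys have a single touch, and a length of 10 is never drawn because the range excludes its upper bound.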

function generateDataset(seed) {
var chance = new Chance(seed)
// params
var numJourneys = 100 // chance.integer({ min: 5, max: 20 })
var numTouchpoints = 8 // chance.integer({ min: 3, max: 6 })
var numTouches = { min: 1, max: 10 }
var numSessions = { min: 1, max: 100000 }
var maxConversionRate = 0.06
var marketingChannels = ['Social', 'Email', 'Advertising', 'Web', 'Event', 'TV', 'Content', 'Partners', 'Search Marketing', 'Mobile']
var touchpointPool = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.slice(0, numTouchpoints)
// for each journey:
function getNumSessions() {
return chance.integer(numSessions)
}
function getConversionRate() {
return chance.floating({ min: 0, max: maxConversionRate, fixed: 4 })
}
function getTouches(touchpointPool) {
return chance.string({
pool: touchpointPool
// , length: chance.weighted(d3.range(numTouches.min, numTouches.max), [10,10,1,1,1,1,1,1,1,1]) // only in version 0.7.6
, length: chance.pick(d3.range(numTouches.min, numTouches.max)
.concat(Array.apply(null, Array(30)).map(Number.prototype.valueOf,1))
.concat(Array.apply(null, Array(15)).map(Number.prototype.valueOf,2)))
})
}
function getLegend(touchpointPool) {
// zips letter to channels name
var legend = {}
for (var i = 0; i < touchpointPool.length; i++) {
legend[touchpointPool[i]] = { S: marketingChannels[i] }
}
return legend
}
function generateJourney(touchpointPool) {
var sessionCount = getNumSessions()
return {
'FunnelLetters': { S: getTouches(touchpointPool) } // Funnel instead of Journey to mirror data in DynamoDB
, 'sessionCount': { N: sessionCount }
, 'conversionCount': { N: Math.round(getConversionRate() * sessionCount) } // rounded so the count stays a whole number
}
}
var dataset = new Array(numJourneys)
for (var i = 0; i < dataset.length; i++) {
dataset[i] = generateJourney(touchpointPool)
}
dataset[0].letterToTouchpoint = { M: getLegend(touchpointPool) }
return {
'Count': dataset.length
, 'Items': dataset
, 'ScannedCount': dataset.length
}
}
<!DOCTYPE html>
<html>
<head>
<title>Visualisation</title>
<link rel='stylesheet' type='text/css' href='style.css'>
<script src='https://cdn.rawgit.com/mbostock/d3/master/lib/colorbrewer/colorbrewer.js'></script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.js' charset='utf-8'></script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/chance/0.5.6/chance.min.js'></script>
<script src='https://cdnjs.cloudflare.com/ajax/libs/aws-sdk/2.1.34/aws-sdk.js'></script>
<script src='generateData.js'></script>
<script src='journeysChart.js'></script>
<script src='main.js'></script>
</head>
<body>
<div id='container'>
<div id='vis'></div>
</div>
</body>
</html>
// A good understanding of important D3 concepts (selections, scales, transitions, etc.) is helpful to understand this code
// see: https://github.com/mbostock/d3/wiki/Tutorials
// follows the reusable chart convention - see http://bost.ocks.org/mike/chart/
function JourneysChart() {
// assumptions about the data passed to this chart:
// i.e. dataset of each selection
// -> list of objects. each object has the following keys:
// -> `letters` an Array of characters from the journey string, order preserved
// -> `conversionRate` a Float between 0.0001 and 1 (in practice, 0.1)
// -> `sessionCount` an Integer between 1 and 100000
//
// DEFAULT CONFIGURATION PARAMETERS
//
var margin = {top: 30, bottom: 30, left: 50, right: 50}
var totalHeight = 700
var spaceBetweenYAxisAndChart = 40
var minBarHeight = 5
var maxBarHeight = 50
var viewportWidth = 20
var barTipLength = 5
var maxScalingFactor = 400
var legendTitle = 'Channels touched'
var legendWidth = 120
var legendXPos = 2
var legendYPos = 175
var legendKeyHeight = 13
var legendKeyWidth = 20
var spaceBetweenLegendKeys = 3
var spaceBetweenLegendTitleAndKeys = 20
var spaceBetweenSwatchAndText = 5
//
// CONFIGURATION PARAMETERS - calculated programmatically
//
var height = totalHeight - margin.top - margin.bottom
//
// SCALES - that don't depend on data
//
var yScaleOnAxis = d3.scale.linear()
.domain([0, 1])
.range([height, 0])
var yScaleOnChart = d3.scale.linear()
.domain(yScaleOnAxis.domain())
.range([height, 0])
var yAxis = d3.svg.axis()
.scale(yScaleOnAxis)
.orient('left')
.ticks(2)
.tickSize(5, 0)
.tickPadding(10)
.tickFormat(d3.format('%'))
//
// returned function that draws the whole chart
//
function drawChart(selection) {
selection.each(function(dataset, i) {
//
// CONFIGURATION PARAMETERS - calculated programmatically
//
var containerWidth = this.clientWidth
var isScreenSmall = window.innerWidth < 550
var totalWidth = isScreenSmall ? containerWidth : containerWidth - legendWidth - 5// bit of a hack to make legend go to left when screen big enough
var width = totalWidth - margin.left - margin.right
//
// SCALES - that depend on data
//
var journeyStepXPos = d3.scale.ordinal()
.domain(d3.range(dataset.longestJourneyLength))
.rangeRoundBands([0, width], 0.1, 0)
var touchpointToColorScale = d3.scale.ordinal()
.domain(dataset.touchpoints.map(function(d) { return d.key }))
.range(colorbrewer.Dark2[dataset.touchpoints.length])
var countToBarHeightScale = d3.scale.linear()
.domain([1, d3.max(dataset.values, function(d) { return d.sessionCount })])
.range([minBarHeight, maxBarHeight])
//
// LEGEND
//
// draw legend before chart so that it can be floated left
// legend is selected at each chart update in order to update its keys
var legend = d3.select(this).selectAll('svg.legend')
.data(function(dataset) { return [dataset.touchpoints] })
var legendEnter = legend.enter()
.append('svg')
.classed('legend', true)
.attr('width', legendWidth)
legendEnter
.append('text')
.text(legendTitle)
.attr('alignment-baseline', 'before-edge')
legendEnter
.append('g')
.classed('legend keys', true)
.attr('transform', 'translate(' + 1 // to align with title
+ ',' + spaceBetweenLegendTitleAndKeys + ')')
var legendKeys = legend.select('g.legend.keys').selectAll('g.legend.key')
.data(function(d) { return d }, function(d) { return d.value })
var legendKeysEnter = legendKeys.enter()
.append('g')
.classed('legend key', true)
.attr('transform', function(d,i) {
return 'translate(' + 0 + ',' + (i * (legendKeyHeight + spaceBetweenLegendKeys)) + ')' }
)
legendKeysEnter
.append('rect')
.attr('width', legendKeyWidth)
.attr('height', legendKeyHeight)
.style('fill', function(d) { return touchpointToColorScale(d.key) })
legendKeysEnter
.append('text')
.text(function(d) { return d.value })
.classed('label', true)
.style('alignment-baseline', 'central')
.attr('x', legendKeyWidth + spaceBetweenSwatchAndText)
.attr('y', legendKeyHeight/2)
legendKeysEnter
.append('title')
.text(function(d) { return d.key })
legendKeys.exit()
.remove()
//
// THE WHOLE CHART - includes y axis (but not legend)
//
// each svg.chart element is bound to one dataset
// in practice there is one dataset we are passing to the chart, so only one svg will be created, and then updated when new data is passed
var wholeChart = d3.select(this).selectAll('svg.chart')
.data(function(dataset) {
return [dataset.values.sort(function(a,b) { return b.sessionCount - a.sessionCount })]
}) // sort so that the smallest bars are drawn last and therefore on top of larger ones
var wholeChartEnter = wholeChart.enter()
.append('svg')
.classed('chart', true)
.attr('width', totalWidth)
.attr('height', totalHeight)
//
// MAIN CHART
//
wholeChartEnter
.append('g')
.classed('innerchart', true)
.attr('transform', 'translate(' + (margin.left + spaceBetweenYAxisAndChart) + ',' + margin.top + ')')
var innerchart = wholeChart.select('g.innerchart')
//
// Y AXIS
//
var yAxisG = wholeChartEnter
.append('g')
.classed('y axis', true)
.attr('transform', 'translate(' + margin.left + ',' + margin.top + ')')
.call(yAxis)
yAxisG
.append('text')
.classed('label', true)
.text('Conversion rate')
.style('text-anchor', 'end')
.attr('transform', 'rotate(270)')
.attr('y', -30)
.attr('x', -20)
//
// VIEWPORT - brush that enables the zoom and scroll effect
//
var viewport = d3.svg.brush()
.y(yScaleOnAxis)
.on('brush', function() { // 'brush' is D3 terminology for click-and-dragging the viewport on the y axis
// if viewport isn't present, change scale of chart to the scale of y axis
// otherwise change scale of chart to the range covered by the viewport on the y axis
// reading about D3 Scales would help to understand, for example: http://bost.ocks.org/mike/bar/#scaling
yScaleOnChart.domain(viewport.empty() ? yScaleOnAxis.domain() : viewport.extent())
drawInnerChart({ barTransitionDuration: 50 }) // very short duration to make zoom & scroll smooth
links
.transition()
.duration(50)
.attr('y2', function(d) { return yScaleOnChart(d.conversionRate) })
})
.on('brushend', function() {
zoomListener.y(yScaleOnChart)
// update scaleExtent otherwise can't zoom back out with mouse in innerchart
var fullDomainExtent = yScaleOnAxis.domain()[1] - yScaleOnAxis.domain()[0]
var currentDomainExtent = yScaleOnChart.domain()[1] - yScaleOnChart.domain()[0]
var minScale = currentDomainExtent / fullDomainExtent
var maxScale = minScale * maxScalingFactor
zoomListener.scaleExtent([minScale, maxScale])
})
var viewportElement = yAxisG
.append('g')
.classed('viewport', true)
.call(viewport.extent(yScaleOnAxis.domain()))
viewportElement
.selectAll('rect')
.attr('x', -viewportWidth)
.attr('width', viewportWidth)
.style('fill', 'gray')
.style('opacity', 0.5)
//
// ARROWS
//
var numPixelsOnChartMovedByArrow = 20
yAxisG.append('polygon')
.attr('transform', 'translate(' + (-10) + ',' + (-5) + ')')
.attr('points', '-10,0 0,-10 10,0')
.on('click', function() {
moveViewportByXPixelsOnChart('up', numPixelsOnChartMovedByArrow)
})
yAxisG.append('polygon')
.attr('transform', 'translate(' + (-10) + ',' + (5 + height) + ')')
.attr('points', '-10,0 0,10 10,0')
.on('click', function() {
moveViewportByXPixelsOnChart('down', numPixelsOnChartMovedByArrow)
})
function moveViewportByXPixelsOnChart(direction, numPixels) {
var directionFunc // declared locally to avoid creating an implicit global
switch (direction) {
case 'up': // move viewport up
directionFunc = function(d, offset) { return d+offset }
break
case 'down': // move viewport down
directionFunc = function(d, offset) { return d-offset }
break
default:
directionFunc = function(d) { return d }
}
var offset = yScaleOnChart.invert(0) - yScaleOnChart.invert(numPixels)
var newViewportExtent = viewport.extent().map(function(d) {
return directionFunc(d, offset)
})
if (newViewportExtent[1] >= yScaleOnAxis.domain()[1] || newViewportExtent[0] <= yScaleOnAxis.domain()[0]) {
// TODO still move it but only by what is left to close the gap
return
}
viewportElement
.call(viewport.extent(newViewportExtent)) // move viewport rectangle
.call(viewport.event) // send brush events so that chart updates too
}
//
// ZOOM ON CHART
//
var zoomListener = d3.behavior.zoom()
.y(yScaleOnChart)
.scaleExtent([1, maxScalingFactor]) // do not change the first value, must always be 1; last value can be changed to increase the scale factor
.on('zoom', function() {
if (yScaleOnChart.domain()[0] < 0) {
var yVector = zoomListener.translate()[1] - yScaleOnChart(0) + yScaleOnChart.range()[0]
zoomListener.translate([0, yVector])
} else if (yScaleOnChart.domain()[1] > 1) {
var yVector = zoomListener.translate()[1] - yScaleOnChart(1) + yScaleOnChart.range()[1]
zoomListener.translate([0, yVector])
}
drawInnerChart({ barTransitionDuration: 50 })
links
.transition()
.duration(50)
.attr('y2', function(d) { return yScaleOnChart(d.conversionRate) })
d3.select('g.viewport').call(viewport.extent(yScaleOnChart.domain())) // updates viewport
})
wholeChartEnter
.append('rect')
.classed('innerchart zoom pane', true)
.attr('transform', 'translate(' + (margin.left + spaceBetweenYAxisAndChart) + ',' + margin.top + ')')
.attr('width', width)
.attr('height', height)
.call(zoomListener)
//
// LINK LAYER - links between y axis and chart journeys
//
wholeChartEnter
.append('g')
.classed('linklayer', true)
.attr('transform', 'translate(' + margin.left + ',' + margin.top + ')')
var links = wholeChart.select('g.linklayer').selectAll('line')
.data(function(d) { return d }, function(d) { return d.letters })
links
.transition()
.duration(1900)
.attr('y1', function(d) { return yScaleOnAxis(d.conversionRate) })
.attr('y2', function(d) { return yScaleOnChart(d.conversionRate) })
links.enter()
.append('line')
.attr('x1', 1)
.attr('y1', function(d) { return yScaleOnAxis(d.conversionRate) })
.attr('x2', 1)
.attr('y2', function(d) { return yScaleOnChart(d.conversionRate) })
.transition()
.delay(1900-500)
.duration(400)
.attr('x2', spaceBetweenYAxisAndChart - barTipLength + 1)
links.exit()
.transition()
.delay(100)
.duration(200)
.attr('x1', spaceBetweenYAxisAndChart - barTipLength + 1)
.remove()
//
// DENSITY PLOT - circles on y axis that convey conversionRate density
//
wholeChartEnter
.append('g')
.classed('density plot', true)
.attr('transform', 'translate(' + margin.left + ',' + margin.top + ')')
var circleTicks = wholeChart.select('g.density.plot').selectAll('circle')
.data(function(d) { return d }, function(d) { return d.letters })
circleTicks
.transition()
.duration(1900)
.attr('cy', function(d) { return yScaleOnAxis(d.conversionRate) })
circleTicks.enter()
.append('circle')
.attr('cy', function(d) { return yScaleOnAxis(d.conversionRate) })
.attr('r', 0)
.style('opacity', 0)
.transition()
.delay(1900-500-100)
.duration(500)
.attr('r', 5)
.style('opacity', 0.05)
circleTicks.exit()
.transition()
.duration(500)
.attr('r', 0)
.style('opacity', 0)
.remove()
//
// THE CHART - in its own function so that it can be redrawn when needed (not only when new dataset passed in)
//
// bring viewport and chart back to full domain when new dataset passed in
d3.select('g.viewport')
.call(viewport.extent(yScaleOnAxis.domain()))
yScaleOnChart.domain(viewport.extent())
drawInnerChart() // draw the inner chart when new dataset passed in
function drawInnerChart(params) {
// this function is called every time the chart must be redrawn on screen
// i.e. when the viewport changes (zoom or scroll) and at new dataset
var params = params || {}
var barTransitionDuration = params.barTransitionDuration || 1900
var journeys = innerchart.selectAll('g.journey')
.data(function(d) { return d }, function(d) { return d.letters }) // the second function, the 'key' function, determines that each journey is uniquely identified by its letter sequence
journeys
.transition()
.duration(barTransitionDuration)
.attr('transform', function(d) { return 'translate(' + 0 + ',' + yScaleOnChart(d.conversionRate) + ')' })
var journeysEnter = journeys.enter()
.append('g')
.classed('journey', true)
.attr('transform', function(d) { return 'translate(' + 0 + ',' + (-200) + ')' })
journeysEnter
.append('line')
.attr('x1', -barTipLength + 1)
.attr('x2', width + barTipLength - 1)
journeysEnter
.transition()
.duration(barTransitionDuration)
.attr('transform', function(d) { return 'translate(' + 0 + ',' + yScaleOnChart(d.conversionRate) + ')' })
journeys.exit()
.transition()
.delay(300)
.duration(barTransitionDuration)
.attr('transform', 'translate(' + 0 + ',' + (totalHeight + 200) + ')')
.remove()
var journeySteps = journeys.selectAll('rect')
.data(function(d) { return d.letters }, function(d,i) { return d+i.toString() })
journeySteps.enter()
.append('rect')
.classed('journeyStep step', true)
.style('fill', function(d) { return touchpointToColorScale(d) })
.style('stroke', function(d) {
return d3.rgb(touchpointToColorScale(d)).darker()
})
.append('title')
.text(function(d) { return d })
journeySteps
.transition()
.delay(500)
.attr('x', function(d,i) { return journeyStepXPos(i) })
.attr('y', function(d) { return -getBarHeightFromParent(this)/2 })
.attr('width', journeyStepXPos.rangeBand())
.attr('height', function(d) { return getBarHeightFromParent(this) })
}
//
// HELPER FUNCTIONS
//
function getBarHeightFromParent(node) {
return countToBarHeightScale(
d3.select(node.parentNode).datum().sessionCount
)
}
})
}
return drawChart
}
'use strict'
function transformDynamoDBDatum(d) {
// Returns flattened datum that can then be used by `transformDatum`
return {
journeyLetters: d.FunnelLetters.S
, sessionCount: +d.sessionCount.N
, conversionCount: +d.conversionCount.N
}
}
function transformDatum(d) {
// Returns transformed datum that can be used by the d3 chart cleanly
// useful for this to be in a separate function for use on data from csv files
return {
letters: d.journeyLetters.toUpperCase().split('')
, sessionCount: +d.sessionCount
, conversionRate: +d.conversionCount/+d.sessionCount
}
}
function getLegendKeys(data) {
var legendKeys = data.Items.filter(function(d) { return d.letterToTouchpoint })[0].letterToTouchpoint.M
var legendKeyValuePairs = []
for (var key in legendKeys) {
if (legendKeys.hasOwnProperty(key)) {
legendKeyValuePairs.push({
'key': key
, 'value': legendKeys[key].S
})
}
}
return legendKeyValuePairs
}
function cleanDataFromDynamoDB(data) {
return data.Items
.filter(function(d) { return d.sessionCount && d.conversionCount })
.map(transformDynamoDBDatum)
.map(transformDatum)
}
function computeDatasetProperties(data, legendKeys) {
// Returns a dataset object with properties of the dataset and the data in the 'values' property
// fallback in case touchpoints key wasn't present in DynamoDB table
var uniqueTouchpoints = d3.set( // get unique touchpoints from all the journeys
[].concat.apply([], data.map(function(d) { return d.letters })) // flatMap
).values()
var touchpointsFallback = uniqueTouchpoints.map(function(d) {
return { 'key': d, 'value': d }
})
return {
values: data
, longestJourneyLength: d3.max(data, function(d) { return d.letters.length })
, touchpoints: legendKeys || touchpointsFallback
}
}
function prepareDataFromDynamoDB(data) {
var legendKeys = getLegendKeys(data)
var dataset = cleanDataFromDynamoDB(data)
var datasetWithProperties = computeDatasetProperties(dataset, legendKeys)
return datasetWithProperties
}
function draw(dataset) {
// config params
// the chart
var chart = JourneysChart() // You can add configuration to the chart by adding chained methods here; JourneysChart() returns a function that draws the configured chart using the data passed to it by the selection
d3.select('div#vis')
.datum(dataset)
.call(chart)
}
window.onload = function() {
var data = generateDataset(Math.random)
draw(prepareDataFromDynamoDB(data))
}
{
"Count": 5,
"Items": [
{
"FunnelLetters": {
"S": "A"
},
"conversionCount": {
"N": 54
},
"sessionCount": {
"N": 234
}
},
{
"FunnelLetters": {
"S": "B"
},
"conversionCount": {
"N": 155
},
"sessionCount": {
"N": 345
}
},
{
"FunnelLetters": {
"S": "C"
},
"conversionCount": {
"N": 79
},
"sessionCount": {
"N": 89
}
},
{
"FunnelLetters": {
"S": "AC"
},
"conversionCount": {
"N": 797
},
"sessionCount": {
"N": 2345
}
},
{
"FunnelLetters": {
"S": "AABC"
},
"conversionCount": {
"N": 1742
},
"sessionCount": {
"N": 2601
}
},
{
"FunnelLetters": {
"S": "==="
},
"letterToTouchpoint": {
"M": {
"A": {
"S": "Ad #1 Clicked"
},
"B": {
"S": "Ad #2 Displayed"
},
"C": {
"S": "Ad #1 Displayed"
}
}
}
}
],
"ScannedCount": 5
}
div#container {
width: 960px;
}
svg.legend {
padding-top: 25px;
float: left;
}
rect.innerchart.zoom {
fill: white;
fill-opacity: 0;
}
g.journey {
opacity: 0.8;
}
g.journey:hover {
opacity: 1;
}
g.journey line, g.linklayer line {
stroke: lightgray;
}
g.journey line {
opacity: 0.6;
stroke-width: 4;
}
g.linklayer line {
opacity: 0.2;
}
g.journey rect.step {
shape-rendering: crispEdges;
}
g.density.plot circle {
fill: black;
}
.axis {
font: 10px sans-serif;
}
.axis text.label, .legend text {
font: 14px sans-serif;
}
.y.axis text.label {
cursor: vertical-text;
}
.legend.key text.label {
font-size: 12px;
}
.axis line, .axis path {
stroke: black;
shape-rendering: crispEdges;
}