@randyzwitch
Created September 10, 2014 20:42
RSiteCatalyst Sankey Diagram - Single Page to Multiple Pages
library("RSiteCatalyst")
library("d3Network")
#### Authentication
SCAuth("key", "secret")
#### Get Pathing data: Single page, then ::anything:: pattern
pathpattern <- c("http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-1", "::anything::")
next_page <- QueuePathing("zwitchdev",
                          "2014-01-01",
                          "2014-08-31",
                          metric="pageviews",
                          element="page",
                          pathpattern,
                          top = 50000)
#Optional step: Cleaning my pagename URLs to remove the domain for clarity
next_page$step.1 <- sub("http://randyzwitch.com/","",
                        next_page$step.1, ignore.case = TRUE)
next_page$step.2 <- sub("http://randyzwitch.com/","",
                        next_page$step.2, ignore.case = TRUE)
#Get unique values of page name to create nodes df
#Create an index value, starting at 0
nodes <- as.data.frame(unique(c(next_page$step.1, next_page$step.2)))
names(nodes) <- "name"
nodes$nodevalue <- as.numeric(row.names(nodes)) - 1
#Convert string to numeric nodeid
links <- merge(next_page, nodes, by.x="step.1", by.y="name")
names(links) <- c("step.1", "step.2", "value", "source")
links <- merge(links, nodes, by.x="step.2", by.y="name")
#merge() returns the key column (step.2) first, so label the columns in that order
names(links) <- c("step.2", "step.1", "value", "source", "target")
#Create next page Sankey chart
d3output = "C:/Users/rzwitc200/Desktop/sankey.html"
d3Sankey(Links = links, Nodes = nodes, Source = "source",
         Target = "target", Value = "value", NodeID = "name",
         fontsize = 12, nodeWidth = 100, file = d3output, width = 750, height = 600)
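
To see what the nodes/links step produces without an Adobe Analytics login, here is a minimal sketch on toy data. The page names and counts are made up, and the third column name ("count") is an assumption about what QueuePathing returns. It uses match() rather than merge() to look up each page's zero-based index, which is an alternative to the approach in the gist, not a copy of it.

```r
# Toy pathing data standing in for the QueuePathing() result.
# All values are hypothetical; "count" is an assumed column name.
next_page <- data.frame(
  step.1 = c("home", "home", "home"),
  step.2 = c("about", "blog", "Exited Site"),
  count  = c(120, 80, 40),
  stringsAsFactors = FALSE
)

# Node table: one row per unique page name, with a zero-based index
nodes <- data.frame(name = unique(c(next_page$step.1, next_page$step.2)),
                    stringsAsFactors = FALSE)
nodes$nodevalue <- seq_len(nrow(nodes)) - 1

# Link table: match() gives the 1-based row position of each page
# in nodes$name, so subtract 1 to get the zero-based node index
links <- data.frame(
  source = match(next_page$step.1, nodes$name) - 1,
  target = match(next_page$step.2, nodes$name) - 1,
  value  = next_page$count
)

print(nodes)
print(links)
```

Because match() keeps the rows of next_page in their original order, no column renaming is needed after the lookup, and the resulting links and nodes data frames have the same source/target/value/name columns that the d3Sankey call above refers to.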
@thesmarthomeninja
Your work is very fascinating Randy. It has inspired me to take some programming & data/statistics classes (which I'm already enrolled in after finding these gems last month). Thank you so much for everything you have done with RSiteCatalyst. I swore I saw some kind of link for an Adobe Summit 2017 session you had or that you were associated with earlier this year. I'm kicking myself in the foot that I didn't get a chance to attend that session if you did in fact hold a session at the Summit.

@thesmarthomeninja
Oh, and you probably saw my LinkedIn notification because I couldn't find a way to send my gratitude when browsing your site. My name is Kalen Daniel, and I do web analytics and work with the implementations as well as create reports/visualizations (well, dumbed-down versions compared to R at least, built with Excel/Google Sheets and scripts from clickstream data, which has been painful to say the least). Just mentioning that because you may have seen when I was poking around to find out more about your work.