@alexbrey
Last active June 26, 2016 03:23
# Networks of Gothic Illuminators Tutorial
## Step 1: Acquire Data
- scan the summary list of manuscripts
- crop the scans using briss
- OCR the scans using Adobe Acrobat
- manually copy the Artist, Region, and MSS data into a Google Sheet (there must be a better way than this, I do not know what it is)
- download the Google Sheet as a tab-separated values file (.tsv)

Estimated time: ca. 3-4 hours
## Step 2:
# Things I had to look up
- how to create a unimodal adjacency list from a bimodal adjacency list (many thanks to Miriam Posner and Matthew Lincoln)
https://github.com/miriamposner/cytoscape_tutorials/blob/master/get-a-unimodal-network.md
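
As a rough illustration of the idea (not necessarily the method used in the linked tutorial), a bimodal Artist-to-MSS list can be turned into a unimodal Artist-to-Artist edge list in base R by merging the table with itself on the manuscript column. The data frame `bimodal` and its contents are made up for this sketch; the column names `Artist` and `MSS` are assumed to match the sheet from Step 1.

```r
# Hypothetical bimodal list: one row per artist per manuscript.
bimodal <- data.frame(
  Artist = c("A", "B", "C", "A", "C"),
  MSS    = c("ms1", "ms1", "ms1", "ms2", "ms2"),
  stringsAsFactors = FALSE
)

# Self-merge on MSS: every pair of artists who share a manuscript.
pairs <- merge(bimodal, bimodal, by = "MSS", suffixes = c("_1", "_2"))

# Keep each undirected pair once and drop self-pairs.
pairs <- pairs[pairs$Artist_1 < pairs$Artist_2, ]

# Count shared manuscripts per pair to get an edge weight.
unimodal <- aggregate(MSS ~ Artist_1 + Artist_2, data = pairs, FUN = length)
names(unimodal)[3] <- "weight"
unimodal
```

Here the weight is simply the number of manuscripts two artists share; whether that is the right weight for a given analysis is a separate decision.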
- how to code categorical variables as integers in a dataframe d for variables in column a (useful for fast color-coding of igraph networks)
l=unique(c(as.character(d$a)))
d1 <- data.frame(a=as.numeric(factor(d$a, levels=l)))
read more here:
http://stackoverflow.com/questions/13778950/converting-string-to-unique-integer-in-r
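- how to get only the rows that contain duplicated MSS names for the collaborative MSS unimodal edge list (assuming a dataframe d with the manuscript names in a column MSS; for a plain vector, drop the trailing comma):
d[duplicated(d$MSS) | duplicated(d$MSS, fromLast=TRUE), ]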
read more about duplicated here:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/duplicated.html
http://stackoverflow.com/questions/7854433/finding-all-duplicate-rows-including-elements-with-smaller-subscripts
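
A small sketch of why both calls to `duplicated` are needed: on its own, `duplicated` flags only the second and later occurrences, so the first artist of a shared manuscript would be lost. The data frame `d` and its values are invented for illustration.

```r
# Hypothetical data: ms1 has two artists, ms2 and ms3 have one each.
d <- data.frame(
  Artist = c("A", "B", "C", "D"),
  MSS    = c("ms1", "ms1", "ms2", "ms3"),
  stringsAsFactors = FALSE
)

# Flags only later occurrences: misses artist A.
later_only <- d[duplicated(d$MSS), ]

# Scanning from both ends keeps every row of a shared manuscript.
collaborative <- d[duplicated(d$MSS) | duplicated(d$MSS, fromLast = TRUE), ]
collaborative
```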
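- how to save files as .tsv (with col.names = NA the row names are written as an unnamed first column that usually has to be removed afterwards; row.names = FALSE should leave that column out)
write.table(test, file='test.tsv', quote=FALSE, sep='\t', row.names=FALSE)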
read more about it here:
http://stackoverflow.com/questions/17108191/how-to-export-proper-tsv
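
A quick round trip, as a sketch, to check that a file written this way reads back without an extra row-name column. The data frame `test` is invented, and a temporary file is used so nothing is written to the working directory.

```r
# Hypothetical table to export.
test <- data.frame(
  Artist = c("A", "B"),
  MSS    = c("ms1", "ms2"),
  stringsAsFactors = FALSE
)

# Write a tab-separated file with a header row and no row names.
path <- tempfile(fileext = ".tsv")
write.table(test, file = path, quote = FALSE, sep = "\t", row.names = FALSE)

# Read it back; read.delim assumes a tab separator and a header row.
back <- read.delim(path, stringsAsFactors = FALSE)
names(back)
```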
- merging dataframes by adding columns
total <- merge(dataframeA, dataframeB, by="ID") # merge two data frames by ID
read more here:
http://www.statmethods.net/management/merging.html
- read a weighted edge list into igraph
el=read.csv(file.choose()) # read the 'el.with.weights.csv' file
g2=graph_from_data_frame(el) # current name for graph.data.frame
read more here:
http://www.shizukalab.com/toolkits/sna/weighted-edgelists
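
A minimal sketch of the same step with an in-memory edge list instead of a file, assuming the igraph package is installed. The edge list is invented, and the graph is treated as undirected here as an assumption. Columns after the first two become edge attributes, so a column named `weight` is picked up as the edge weight.

```r
library(igraph)

# Hypothetical weighted edge list: from, to, then the weight column.
el <- data.frame(
  from   = c("A", "A", "B"),
  to     = c("B", "C", "C"),
  weight = c(3, 1, 2),
  stringsAsFactors = FALSE
)

# Build an undirected graph; extra columns are stored as edge attributes.
g2 <- graph_from_data_frame(el, directed = FALSE)

is_weighted(g2)  # TRUE when the graph has a 'weight' edge attribute
E(g2)$weight     # the weights, in edge order
```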
- calculate different forms of centrality with igraph (weights=NA ignores edge weights; to use the graph's weight attribute, pass weights=NULL instead, see the documentation excerpt below)
closeness(net, mode="all", weights=NA)
eigen_centrality(net, directed=TRUE, weights=NA)
betweenness(net, directed=TRUE, weights=NA)
read more here:
http://kateto.net/networks-r-igraph
weights: A numerical vector or NULL. This argument can be used to give edge weights for calculating the weighted eigenvector centrality of vertices. If this is NULL and the graph has a weight edge attribute, then that is used. If weights is a numerical vector, then it is used, even if the graph has a weight edge attribute. If this is NA, then no edge weights are used (even if the graph has a weight edge attribute). Note that if there are negative edge weights and the direction of the edges is considered, then the eigenvector might be complex. In this case only the real part is reported.
read more here:
http://igraph.org/r/doc/eigen_centrality.html
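
To see the NULL versus NA difference in practice, here is a small sketch on an invented three-node graph, assuming igraph is installed. One caveat worth keeping in mind: `betweenness` and `closeness` treat weights as distances (costs), so a weight that counts shared manuscripts may need to be inverted before it is used this way.

```r
library(igraph)

# Hypothetical triangle: the direct a-c edge is "long" (weight 5),
# while the detour through b costs 1 + 1 = 2.
el <- data.frame(
  from   = c("a", "b", "a"),
  to     = c("b", "c", "c"),
  weight = c(1, 1, 5),
  stringsAsFactors = FALSE
)
net <- graph_from_data_frame(el, directed = FALSE)

# weights = NULL: use the 'weight' attribute, so a-c routes through b.
b_weighted <- betweenness(net, directed = FALSE, weights = NULL)

# weights = NA: ignore weights, so a-c is a direct one-step path.
b_unweighted <- betweenness(net, directed = FALSE, weights = NA)

b_weighted["b"]
b_unweighted["b"]
```

With the weights in play, node b sits on the shortest a-c path; without them it sits on none.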
- for graphing centrality with ggplot2, faceting by region:
facet_wrap(facets, nrow = NULL, ncol = NULL, scales = "fixed", shrink = TRUE, labeller = "label_value", as.table = TRUE, switch = NULL, drop = TRUE, dir = "h")
http://docs.ggplot2.org/current/facet_wrap.html
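
A sketch of what such a faceted plot could look like, assuming ggplot2 is installed. The data frame `centrality`, its column names, and its values are invented; in practice the centrality column would come from the igraph calls above.

```r
library(ggplot2)

# Hypothetical per-artist centrality scores with a region label.
centrality <- data.frame(
  Artist      = c("A", "B", "C", "D"),
  Region      = c("Paris", "Paris", "Ghent", "Ghent"),
  betweenness = c(0, 2.5, 1, 0),
  stringsAsFactors = FALSE
)

# One panel per region; free x scales so each panel lists only its own artists.
p <- ggplot(centrality, aes(x = Artist, y = betweenness)) +
  geom_col() +
  facet_wrap(~ Region, scales = "free_x")

# print(p) draws the plot in an interactive session.
```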
- how to calculate the E-I ratio (inspired by Matthew Lincoln):
not perfect: as far as I can tell, the code at this link only retrieves E-I data for the largest connected component of a graph
https://lists.nongnu.org/archive/html/igraph-help/2012-10/msg00002.html
for Lincoln's use see http://matthewlincoln.net/assets/docs/scsc2014.pdf
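
As an alternative sketch that does not depend on connected components, the whole-network E-I index can be computed in base R directly from an edge list and a region lookup table, using the usual definition (E - I) / (E + I), where E counts ties between regions and I counts ties within a region. This is not the code from the linked thread; the tables `edges` and `regions` are invented for illustration.

```r
# Hypothetical artist-to-artist edge list.
edges <- data.frame(
  from = c("A", "A", "B", "C", "A"),
  to   = c("B", "C", "D", "D", "D"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table of each artist's region.
regions <- data.frame(
  Artist = c("A", "B", "C", "D"),
  Region = c("Paris", "Paris", "Ghent", "Ghent"),
  stringsAsFactors = FALSE
)

# Region of each endpoint of every edge.
region_from <- regions$Region[match(edges$from, regions$Artist)]
region_to   <- regions$Region[match(edges$to, regions$Artist)]

external <- sum(region_from != region_to)  # ties that cross regions
internal <- sum(region_from == region_to)  # ties inside one region

# Ranges from -1 (all ties internal) to +1 (all ties external).
ei_index <- (external - internal) / (external + internal)
ei_index
```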
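- how to set your own colors in networkD3 (single quotes on the outside so the double quotes around the hex codes survive; newer networkD3 releases are built on D3 v4, where the equivalent call should be d3.scaleOrdinal())
colourScale = JS('d3.scale.ordinal().range(["#7d3945","#e0677b","#244457"])')
http://stackoverflow.com/questions/32209372/own-colour-range-for-sankey-diagram-with-networkd3-package-in-r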
- how to look up the region value associated with a matching illuminator value in a reference table and insert it into the group column of a networkD3 object
sg_d3$nodes$group <- regions[match(sg_d3$nodes$name, regions$Artist), "Region"]
found in one of the lower answers here:
http://stackoverflow.com/questions/14485984/r-how-to-match-values-within-column-1-and-assign-adjacent-values-from-column-2
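
The same lookup on plain data frames, as a sketch with invented values. `nodes` stands in for `sg_d3$nodes`. Note that `match` returns NA for names missing from the reference table, so unmatched artists end up with an NA group.

```r
# Hypothetical reference table: one region per artist.
regions <- data.frame(
  Artist = c("A", "B", "C"),
  Region = c("Paris", "Ghent", "Bruges"),
  stringsAsFactors = FALSE
)

# Hypothetical node table in a different order, with one unknown artist.
nodes <- data.frame(name = c("B", "C", "A", "Z"), stringsAsFactors = FALSE)

# match() gives the row of each node name in the reference table;
# indexing by those rows pulls out the matching Region values.
nodes$group <- regions[match(nodes$name, regions$Artist), "Region"]
nodes
```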
# Other cool things
networkD3
read more here:
https://christophergandrud.github.io/networkD3/
http://kateto.net/network-visualization
other tutorials I used:
Lincoln Mullen's Tudor tutorial
http://lincolnmullen.com/projects/dh-r/networks.html