@mhawksey
Created February 7, 2012 16:54
PROD Sweave Report Demo
<<echo=FALSE>>=
strand <- "Open educational resources programme"
library(RCurl)
library(tm)
library(wordcloud)
library(RColorBrewer)
library(igraph)
library(xtable)
@
\documentclass[12pt,a4paper]{article}
\begin{document}
\title{CETIS PROD Demonstration Report: \Sexpr{strand}}
\maketitle
\sffamily
\section{Background}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. In nec dui eget sapien aliquet aliquet. Praesent tincidunt ultrices rhoncus. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Vivamus id nisi sed justo tempor bibendum. Curabitur non velit purus, sit amet fermentum nunc. Vestibulum mattis cursus libero, et scelerisque nulla vulputate non. Phasellus lectus metus, fermentum sit amet suscipit eget, faucibus ac velit. Duis id velit ut purus viverra commodo at sed ligula. Ut lectus felis, ornare id lobortis eget, rutrum sed magna.
\section{Programme Overview}
<<echo=FALSE>>=
projects <- read.csv(textConnection(getURL("http://data-gov.tw.rpi.edu/ws/sparqlproxy.php?query=PREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+owl%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0D%0APREFIX+xsd%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2001%2FXMLSchema%23%3E%0D%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0APREFIX+jisc%3A+%3Chttp%3A%2F%2Fwww.rkbexplorer.com%2Fontologies%2Fjisc%23%3E%0D%0APREFIX+doap%3A+%3Chttp%3A%2F%2Fusefulinc.com%2Fns%2Fdoap%23%3E%0D%0APREFIX+prod%3A+%3Chttp%3A%2F%2Fprod.cetis.ac.uk%2Fvocab%2F%3E%0D%0APREFIX+mu%3A+%3Chttp%3A%2F%2Fwww.jiscmu.ac.uk%2Fschema%2Fmuweb%2F%3E%0D%0APREFIX+geo%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2003%2F01%2Fgeo%2Fwgs84_pos%23%3E%0D%0APREFIX+dc%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%3E%0D%0ASELECT+DISTINCT+%3FprojectID+%3FProject+%3FProject_Name+%3FProgramme+%3FStrand+%3FDesc%0D%0AWHERE+%7B++%0D%0A%3FprojectID+a+doap%3AProject+.%0D%0AOPTIONAL+%7B+%3FprojectID+prod%3Aprogramme+%3FProgramme+%7D+.%0D%0AOPTIONAL+%7B+%3FprojectID+prod%3Astrand+%3FStrand+%7D+.%0D%0AOPTIONAL+%7B+%3FprojectID+jisc%3Ashort-name+%3FProject+%7D+.%0D%0AOPTIONAL+%7B+%3FprojectID+doap%3Aname+%3FProject_Name+%7D+.%0D%0AOPTIONAL+%7B%0D%0A%3FprojectID+doap%3Ashortdesc+%3FDesc+.%0D%0A%7D%0D%0A%7D&output=csv&callback=&tqx=&default-graph-uri=&service-uri=http%3A%2F%2Fapi.talis.com%2Fstores%2Fjisc-prod-dev1%2Fservices%2Fsparql")), header = T)
projectSub <- subset(projects, grepl("^Open education", projects$Strand ))
# Word cloud preparation adapted from http://onertipaday.blogspot.com/2011/07/word-cloud-in-r.html
# Note: if you pull in multiple columns, you may need to change which column
# of the dataset is selected, e.g. dataset[,2]
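# VectorSource takes the character vector directly; newer tm releases require
# doc_id/text columns for DataframeSource
ap.corpus <- Corpus(VectorSource(as.character(projectSub$Desc)))
ap.corpus <- tm_map(ap.corpus, removePunctuation)
# wrap base functions in content_transformer() (tm >= 0.6) so documents keep their class
ap.corpus <- tm_map(ap.corpus, content_transformer(tolower))
ap.corpus <- tm_map(ap.corpus, removeWords, stopwords("english"))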
# additional stopwords can be used as shown below
#ap.corpus <- tm_map(ap.corpus, function(x) removeWords(x, c("ukoer","oer")))
ap.tdm <- TermDocumentMatrix(ap.corpus)
ap.m <- as.matrix(ap.tdm)
ap.v <- sort(rowSums(ap.m),decreasing=TRUE)
ap.d <- data.frame(word = names(ap.v),freq=ap.v)
#table(ap.d$freq)
pal2 <- brewer.pal(8,"Dark2")
@
\setkeys{Gin}{width=3in}
\begin{figure}
\centering
\caption{Wordcloud of project descriptions}
<<fig=TRUE,echo=FALSE>>=
wordcloud(ap.d$word,ap.d$freq, scale=c(8,.2),min.freq=3, max.words=Inf, random.order=FALSE, rot.per=.15, colors=pal2)
@
\label{fig:desccloud}
\end{figure}
There are \Sexpr{nrow(projectSub)} projects in the \Sexpr{strand} strand.
\subsection{Project Relationship}
Figure~\ref{fig:buildson} shows how projects in this strand relate to other projects.
<<echo=FALSE>>=
buildson <- read.csv(textConnection(getURL("http://data-gov.tw.rpi.edu/ws/sparqlproxy.php?query=PREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+jisc%3A+%3Chttp%3A%2F%2Fwww.rkbexplorer.com%2Fontologies%2Fjisc%23%3E%0D%0APREFIX+doap%3A+%3Chttp%3A%2F%2Fusefulinc.com%2Fns%2Fdoap%23%3E%0D%0APREFIX+prod%3A+%3Chttp%3A%2F%2Fprod.cetis.ac.uk%2Fvocab%2F%3E%0D%0ASELECT+DISTINCT+%3FProject+%3FBuilds_Project%0D%0AWHERE+%7B%0D%0A%3FprojectID+a+doap%3AProject+.%0D%0A%3FprojectID+prod%3Astrand+%3FStrand+.%0D%0AFILTER+regex%28%3FStrand%2C+%22%5Eopen+education%22%2C+%22i%22%29+.%0D%0A%3FprojectID+jisc%3Ashort-name+%3FProject+.%0D%0A%3Frelationship+prod%3Abuilds_on_by+%3FprojectID+.%0D%0A%3Frelationship+prod%3Abuilds_on+%3FBuilds_projectID+.%0D%0A%3FBuilds_projectID+jisc%3Ashort-name+%3FBuilds_Project+.%0D%0A%7D%0D%0A&output=csv&callback=&tqx=&default-graph-uri=&service-uri=http%3A%2F%2Fapi.talis.com%2Fstores%2Fjisc-prod-dev1%2Fservices%2Fsparql")), header = T)
g <- graph.data.frame(buildson, directed=TRUE)
V(g)$size <- degree(g) * 2 # scale vertex size by degree (x2); the plot call below overrides this with vertex.size=3
l <- layout.fruchterman.reingold(g)
l <- layout.norm(l, -1,1, -1,1)
@
\begin{figure}
\centering
\caption{Project - builds on}
<<fig=TRUE,echo=FALSE>>=
plot(g, layout=l, vertex.size=3, vertex.label=V(g)$name,
     vertex.color="#ff0000", vertex.frame.color="#ff0000", edge.color="#555555",
     vertex.label.dist=0, vertex.label.cex=1, vertex.label.font=2,
     edge.arrow.size=0.3, xlim=range(l[,1]), ylim=range(l[,2]),
     main="Builds on")
@
\label{fig:buildson}
\end{figure}
\end{document}