Practising some data shaping in R for #compdata week 1
# compdata week 1 practice
# Script reads a NodeXL Twitter search for the #compdata hashtag that has been uploaded to a Google Spreadsheet
# Data is reshaped using subsetting to get a slice of rows and columns fitting a certain condition
# read csv from Google Spreadsheet; headers are in row 2, so skip the first row (in this case a vertices list)
vertices <- read.csv("", header=TRUE, skip=1)
# see number of rows
nrow(vertices)
# read csv from Google Spreadsheet; headers are in row 2, so skip the first row (in this case an edges list)
edges <- read.csv("", header=TRUE, skip=1)
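# look at the data
str(edges)
# Note that $ Relationship : Factor w/ 4 levels "Followed","Mentions"
# (since R 4.0.0 read.csv() returns character columns unless stringsAsFactors=TRUE is set)
# What are all the levels in $Relationship?
# wrapping in factor() works whether the column is a factor or a character vector
levels(factor(edges$Relationship))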
# how many rows are there where $Tweet starts with 'I just signed up for Computing for Data Analysis .. '
iJust <- grepl("^I just signed up for Computing for Data Analysis", edges$Tweet)
# iJust is a logical vector, so sum() counts the matching rows
sum(iJust)
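# A minimal sketch of the counting idea on made-up data. toyTweets is hypothetical
# and is not taken from the spreadsheet.
toyTweets <- c("I just signed up for Computing for Data Analysis #compdata",
               "RT: I just signed up for Computing for Data Analysis",
               "Week 1 quiz done #compdata")
# "^" anchors the pattern to the start of the string, so the retweet is not counted
toyStarts <- grepl("^I just signed up for Computing for Data Analysis", toyTweets)
sum(toyStarts)
# without the anchor the pattern can match anywhere in the tweet
toyAnywhere <- grepl("I just signed up for Computing for Data Analysis", toyTweets)
sum(toyAnywhere)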
# Want to get a subset of data of $Vertex.1 and $Vertex.2 where $Relationship is 'Followed'
# To get 'Followed' subset
followed <- edges$Relationship == "Followed"
# now make a new data.frame with the first two columns of edges ($Vertex.1 and $Vertex.2) for rows where followed is TRUE
edgeList <- edges[followed,1:2]
# the two steps above (building followed, then subsetting) can be combined into one line
edgeList <- edges[edges$Relationship == "Followed",1:2]
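# A small sketch on made-up data to check that the two-step and one-step forms agree.
# toyEdges and all of its values are hypothetical.
toyEdges <- data.frame(Vertex.1 = c("a", "b", "c", "a"),
                       Vertex.2 = c("b", "c", "a", "c"),
                       Relationship = c("Followed", "Mentions", "Followed", "Tweet"))
toyFollowed <- toyEdges$Relationship == "Followed"
toyTwoStep <- toyEdges[toyFollowed, 1:2]
# selecting columns by name gives the same rows and does not depend on column order
toyOneStep <- toyEdges[toyEdges$Relationship == "Followed", c("Vertex.1", "Vertex.2")]
identical(toyTwoStep, toyOneStep)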
# look at the new data
head(edgeList)
# Now look at the most frequent occurrences of $Vertex.1 values from edges
# table() will give us a frequency table
topInVert1 <- data.frame(table(edges$Vertex.1))
# now we can sort by frequency, highest first
topInVert1 <- topInVert1[order(-topInVert1$Freq), ]
# print the top 10 results
head(topInVert1, 10)
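# A sketch of the frequency-table steps on made-up data. toyUsers is hypothetical.
toyUsers <- c("ann", "bob", "ann", "cat", "ann", "bob")
# table() counts each distinct value; data.frame() adds the counts as a Freq column
toyFreq <- data.frame(table(toyUsers))
# order() on the negated counts sorts from most to least frequent
toyFreq <- toyFreq[order(-toyFreq$Freq), ]
head(toyFreq, 2)
# sort() on the table itself is an alternative that skips the data.frame step
sort(table(toyUsers), decreasing = TRUE)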