@thiemehennis
thiemehennis / edX .mongo forum data into csv
Last active August 29, 2015 14:00
Turn edX .mongo forum data into csv files
#### Entire workflow:
# Checked some of the data in JSONLint and corrected the errors → the objects must be separated by },{ instead of }{ between each line, and the file needs [ at the beginning and ] at the end
# Made a smaller file to experiment with, containing about 11 JSON lines
# Used the code below to parse the data file, first checking that the individual list items are not lists themselves (nested lists cause problems)
# As you will see, I also removed characters such as \n because they caused errors, and added an empty value for parent_id when the data has none (otherwise the columns would get mixed up)
# The code to import the .mongo file into R and then parse it into CSV:
setwd("/your/favourite/dir/json to csv/")
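The workflow notes above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not the gist's actual code: `wrap_json_lines` and `flatten_record` are hypothetical helpers, the records are assumed to be already parsed into named lists by whatever JSON package is in use, and every field name except `parent_id` is made up for the example. Nested list items are collapsed into one string here, which is one possible way to handle them.

```r
# Hypothetical helper: turn newline-delimited JSON objects into one JSON array
# by joining them with commas and wrapping the result in [ and ].
wrap_json_lines <- function(lines) {
  lines <- lines[nzchar(trimws(lines))]  # drop blank lines
  paste0("[", paste(lines, collapse = ","), "]")
}

# Hypothetical helper: flatten one parsed record (a named list) into a
# one-row data frame with a fixed set of columns.
flatten_record <- function(record, fields) {
  values <- lapply(fields, function(field) {
    value <- record[[field]]
    if (is.null(value)) return("")  # e.g. no parent_id in this record
    if (is.list(value) || length(value) > 1) {
      # nested items would break the table, so collapse them into one string
      value <- paste(unlist(value), collapse = ";")
    }
    gsub("[\r\n]+", " ", as.character(value))  # strip line breaks
  })
  names(values) <- fields
  as.data.frame(values, stringsAsFactors = FALSE)
}

# Made-up records standing in for parsed forum data (assumption).
records <- list(
  list(body = "First line\nsecond line", parent_id = "abc123", author_id = "7"),
  list(body = "A top-level post", author_id = "9")  # no parent_id
)
fields <- c("author_id", "parent_id", "body")

forum <- do.call(rbind, lapply(records, flatten_record, fields = fields))
write.csv(forum, file.path(tempdir(), "forum_sample.csv"), row.names = FALSE)
```

Fixing the column set up front means that every row has the same columns in the same order, so a record without `parent_id` gets an empty cell rather than shifting the other values.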
@thiemehennis
thiemehennis / edX student module analytics to R
Last active December 31, 2015 12:49
This R code cleans up a raw SQL edX dataset into a new dataset with the columns "Student", "Question", "Answer". The previous version contained a Credits: Written by @cbdvs for beautiful karma!
# Credits to @cbdvs (Christopher Davis): check out his amazing work at http://enipedia.tudelft.nl or email him at c.b.davis@tudelft.nl
# never ever convert strings to factors
options(stringsAsFactors = FALSE)
###### TODO: set this to the directory where the data should be written
setwd("/home/username/Desktop/R data/")
if (file.exists("allData.csv")) file.remove("allData.csv") ## removes the old data file, if there is one, so that a new file is created instead of the data being appended to the old one
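To illustrate why the old file is removed first, here is a minimal sketch of an append-style writer. It assumes the script writes its results in several chunks, which the preview does not show; `append_rows`, the temporary file path, and the sample rows are all hypothetical.

```r
# Hypothetical helper: append a chunk of rows to a CSV file,
# writing the header only when the file does not exist yet.
append_rows <- function(rows, path) {
  first_write <- !file.exists(path)
  write.table(rows, file = path, sep = ",", row.names = FALSE,
              col.names = first_write, append = !first_write)
}

out_file <- file.path(tempdir(), "allData_example.csv")  # assumed demo path
if (file.exists(out_file)) file.remove(out_file)          # start from a clean file

# Made-up rows using the column names from the gist description.
append_rows(data.frame(Student = "s1", Question = "q1", Answer = "a"), out_file)
append_rows(data.frame(Student = "s2", Question = "q1", Answer = "b"), out_file)

all_data <- read.csv(out_file, stringsAsFactors = FALSE)
```

Without the removal step, a second run of the script would add its rows below those of the previous run, and the output would contain duplicates.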