@ramnov
Last active April 30, 2017 22:17
RStudio Server example code for running Microsoft R Server (MRS) in a Spark compute context
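# Connect to Spark; rxSparkConnect() also makes it the active compute context
sparkCC <- rxSparkConnect()

# Copy the sample airline data from the local sample directory into HDFS
rxHadoopMakeDir("/share/SampleData")
rxHadoopCopyFromLocal(file.path(rxGetOption("sampleDataDir"), "AirlineDemoSmall.csv"), "/share/SampleData")

# Define the HDFS-backed data source; "M" marks missing values in the CSV
airDS <- RxTextData(file = "/share/SampleData/AirlineDemoSmall.csv", missingValueString = "M",
                    fileSystem = RxHdfsFileSystem())

# Compute summary statistics in the Spark compute context
adsSummary <- rxSummary(~ArrDelay + CRSDepTime + DayOfWeek, data = airDS)
print(adsSummary)

# Close the Spark session
rxSparkDisconnect(sparkCC)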