@wesslen
Created February 6, 2018 17:27
R Code to connect to UNCC Vis DataLake (MongoDB) via mongolite
# install mongolite if you do not have it
# see https://github.com/jeroen/mongolite
#install.packages("mongolite")
library(mongolite)
# set the database name -- it is the last path segment of the URL (e.g., "gab")
mongoUrl <- "mongodb://datalake:27017/gab"
# change col to your collection
col <- "id"
# create connection (con)
con <- mongo(collection = col, url = mongoUrl)
# count how many records
con$count('{}')
# count how many records have a score greater than 3
# (counting on the server avoids downloading every matching record)
con$count('{"score" : { "$gt" : 3 } }')
# get collection info
info <- con$info()
# sample -- get five elements
con$find(limit = 5)
# get the body of every post (be careful: this loads the whole collection into memory)
t <- con$find(fields = '{"body" : true}')
## how to query in mongolite
# see https://jeroen.github.io/mongolite/query-data.html#query-syntax
t2 <- con$find(
  query = '{"score" : { "$gt" : 3 } }',
  fields = '{"id" : true, "created_at" : true, "body" : true}',
  limit = 5
)
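# A possible alternative to hand-writing the JSON query above: build it from an
# R list with jsonlite (assumed to be installed; mongolite depends on it).
# min_score and score_query are illustrative names, not part of the original.
library(jsonlite)
min_score <- 3
score_query <- toJSON(list(score = list("$gt" = min_score)), auto_unbox = TRUE)
# score_query now holds {"score":{"$gt":3}} and can be passed as query = score_query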
# indexing -- time a sorted query; an index on created_at (see con$index()) should speed this up
system.time(con$find(sort = '{"created_at" : 1}', limit = 100))
# query by date
t3 <- con$find(
  query = '{"created_at": { "$gte" : { "$date" : "2016-10-19T01:35:26Z" }}}',
  fields = '{"id" : true, "created_at" : true, "body" : true, "_id": false}'
)
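# A sketch for building the "$date" value from an R date-time rather than typing
# the timestamp by hand. since, since_iso and date_query are illustrative names;
# the timestamp is formatted as ISO 8601 in UTC, matching the query above.
since <- as.POSIXct("2016-10-19 01:35:26", tz = "UTC")
since_iso <- format(since, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
date_query <- sprintf('{"created_at": { "$gte" : { "$date" : "%s" }}}', since_iso)
# date_query can then be passed as con$find(query = date_query, ...)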
# aggregation & map-reduce https://jeroen.github.io/mongolite/calculation.html#aggregate
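# A minimal sketch of the aggregation mentioned above, assuming the collection
# has the "score" field used in the earlier queries: count posts per score value.
# count_by_score is a hypothetical helper, not part of mongolite; it only wraps
# the connection's aggregate() method with a fixed pipeline.
count_by_score <- function(con) {
  pipeline <- '[{"$group" : {"_id" : "$score", "count" : {"$sum" : 1}}},
                {"$sort" : {"_id" : 1}}]'
  con$aggregate(pipeline)
}
# usage with the connection created earlier: by_score <- count_by_score(con)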
@gabefair

I just used this. Thanks for posting!