
Facebook Mining

facebook_mining.r
# go to 'https://developers.facebook.com/tools/explorer' to get your access token
access_token <- "******************* INPUT YOUR ACCESS TOKEN ******************************"

require(RCurl)
require(rjson)

# Facebook JSON function copied from the original (Romain Francois) post
facebook <- function(path = "me", access_token, options) {
    if (!missing(options)) {
        options <- sprintf("?%s", paste(names(options), "=", unlist(options), collapse = "&", sep = ""))
    } else {
        options <- ""
    }
    data <- getURL(sprintf("https://graph.facebook.com/%s%s&access_token=%s", path, options, access_token))
    fromJSON(data)
}

### MY FACEBOOK POSTS

myposts <- list()
i <- 0
next.path <- "me/posts"

# download all my posts
while (length(next.path) != 0) {
    i <- i + 1
    myposts[[i]] <- facebook(path = next.path, access_token = access_token)
    next.path <- sub("https://graph.facebook.com/", "", myposts[[i]]$paging$'next')
}
myposts[[i]] <- NULL

# parse the list, extract the number of likes and the corresponding text (status)
parse.master <- function(x, f)
    sapply(x$data, f)
parse.likes <- function(x) if (!is.null(x$likes$count)) x$likes$count else 0
mylikes <- unlist(sapply(myposts, parse.master, f = parse.likes))
parse.messages <- function(x) if (!is.null(x$message)) x$message else NA
mymessages <- unlist(sapply(myposts, parse.master, f = parse.messages))

# and the most liked status is...
mymessages[which.max(mylikes)]

### TED FACEBOOK PAGE
# http://www.facebook.com/TED
# TED's Facebook ID 29092950651 can be found on http://graph.facebook.com/TED

ted <- list()
i <- 0
next.path <- "29092950651/posts"

# download all TED posts
while (length(next.path) != 0) {
    i <- i + 1
    ted[[i]] <- facebook(path = next.path, access_token = access_token)
    next.path <- sub("https://graph.facebook.com/", "", ted[[i]]$paging$'next')
}
ted[[i]] <- NULL

# parse just the video links posted by TED
parse.count.ted <- function(x)
    if (x$type == "link" & x$from$id == "29092950651") x$likes$count else NA
parse.link.ted <- function(x)
    if (x$type == "link" & x$from$id == "29092950651") x$link else NA
ted.counts <- unlist(sapply(ted, parse.master, f = parse.count.ted))
ted.links <- unlist(sapply(ted, parse.master, f = parse.link.ted))

# see the three most popular talks
ted.links[order(ted.counts, decreasing = TRUE)][1:3]

I had some SSL trouble. Inserting

options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))

helped. Nice tool!

While running the above code with my access token, RStudio shows me

Error in order(ted.counts, decreasing = TRUE) :
argument 1 is not a vector

Can you suggest where I am going wrong?

It looks like the facebook function is using the wrong delimiter for the request: & instead of ?

facebook <- function(path = "me", access_token, options) {
    if (!missing(options)) {
        options <- sprintf("?%s", paste(names(options), "=", unlist(options), collapse = "&", sep = ""))
    } else {
        options <- ""
    }
    data <- getURL(sprintf("https://graph.facebook.com/%s%s?access_token=%s", path, options, access_token))
    fromJSON(data)
}

Worked for me.

Worked for me too; I just had to change the delimiter in the facebook function's request (? instead of &).
I'm having a small issue though: my access token has expired. How can I get a new one through R?

Thanks

I'd really like to get this code to work. I am a novice!

When I download the TED posts, I get the following:

Error in fromJSON(data) : unexpected escaped character '\w' at pos 53

I also get an error when I try to find the three most popular posts, even though I made the changes outlined above:

Error in order(ted.counts, decreasing = TRUE) :
argument 1 is not a vector
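For what it's worth, the "argument 1 is not a vector" error can appear when sapply() fails to simplify its result because the parser returns NULL (rather than NA) for posts that have no likes count. A sketch of a possible guard, reusing parse.master and the TED ID from the gist above (this is an assumption about the cause, not a confirmed fix):

```r
# Return NA instead of NULL when a link post has no likes count,
# so sapply() can simplify the result into a vector.
parse.count.ted <- function(x) {
    count <- x$likes$count
    if (!is.null(x$type) && x$type == "link" &&
        x$from$id == "29092950651" && !is.null(count)) count else NA
}

# Coerce defensively before ordering, in case a list still comes back.
ted.counts <- as.numeric(unlist(sapply(ted, parse.master, f = parse.count.ted)))
ted.links[order(ted.counts, decreasing = TRUE)][1:3]
```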

I resolved my first error by using:
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

I tried the code several times (the part before the TED code chunk).

All I am getting is "NULL" values. :(

Nice piece of code! It worked just fine with all the corrections people posted. Does anyone know why one can only scrape 25 posts?

On the 25 posts: the 'next' handling is not working this way.
I changed the facebook function to:

facebook <- function(path = "me", access_token, options) {
    if (!missing(options)) {
        options <- sprintf("?%s", paste(names(options), "=", unlist(options), collapse = "&", sep = ""))
    } else {
        options <- ""
    }
    if (regexpr("access_token=", path) <= 0) {
        data <- getURL(sprintf("https://graph.facebook.com/%s%s?limit=2&access_token=%s&format=json", path, options, access_token))
    } else {
        data <- getURL(sprintf(path))
    }
    fromJSON(data)
}

and then the next.path is:

next.path <- myposts[[i]]$paging$'next'

At least it works for me, aside from a JSON issue with some posts, but that seems to be a different thing.
