@simecek
Created January 18, 2012 18:26
Facebook Mining
# go to 'https://developers.facebook.com/tools/explorer' to get your access token
access_token <- "******************* INPUT YOUR ACCESS TOKEN ******************************"
library(RCurl)
library(rjson)
# Facebook json function copied from original (Romain Francois) post
facebook <- function( path = "me", access_token, options){
  # turn a named list of options into a "?name=value&name=value" query string
  if( !missing(options) ){
    options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
  } else {
    options <- ""
  }
  # NOTE: when no options are given, this URL has no "?" before "&access_token=";
  # the comments below discuss switching the delimiter to "?"
  data <- getURL( sprintf( "https://graph.facebook.com/%s%s&access_token=%s", path, options, access_token ) )
  fromJSON( data )
}
### MY FACEBOOK POSTS
myposts <- list()
i <- 0
next.path <- "me/posts"
# download all my posts
while(length(next.path)!=0) {
  i<-i+1
  myposts[[i]] <- facebook(path=next.path , access_token=access_token)
  next.path <- sub("https://graph.facebook.com/", "", myposts[[i]]$paging$'next')
}
# drop the final page (the one that came back without a 'next' link)
myposts[[i]] <- NULL
# parse the list, extract number of likes and the corresponding text (status)
parse.master <- function(x, f)
  sapply(x$data, f)
parse.likes <- function(x) if(!is.null(x$likes$count)) x$likes$count else 0
mylikes <- unlist(sapply(myposts, parse.master, f=parse.likes))
parse.messages <- function(x) if(!is.null(x$message)) x$message else NA
mymessages <- unlist(sapply(myposts, parse.master, f=parse.messages))
# and the most liked status is...
mymessages[which.max(mylikes)]
### TED FACEBOOK PAGE
# http://www.facebook.com/TED
# TED's Facebook ID 29092950651 can be found on http://graph.facebook.com/TED
ted <- list()
i<-0
next.path <- "29092950651/posts"
# download all TED posts
while(length(next.path)!=0) {
  i<-i+1
  ted[[i]] <- facebook( path=next.path , access_token=access_token)
  next.path <- sub("https://graph.facebook.com/","",ted[[i]]$paging$'next')
}
# drop the final page (the one that came back without a 'next' link)
ted[[i]] <- NULL
# parse just video links posted by TED
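# TRUE only for link posts published by TED itself; identical() is safe when a field is missing
is.ted.link <- function(x) identical(x$type, "link") && identical(x$from$id, "29092950651")
# return NA (never NULL) so that counts and links stay aligned after unlist()
parse.count.ted <- function(x)
  if (is.ted.link(x) && !is.null(x$likes$count)) x$likes$count else NA
parse.link.ted <- function(x)
  if (is.ted.link(x) && !is.null(x$link)) x$link else NA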
ted.counts <- unlist(sapply(ted, parse.master, f=parse.count.ted))
ted.links <- unlist(sapply(ted, parse.master, f=parse.link.ted))
# see three most popular talks
ted.links[order(ted.counts,decreasing=TRUE)][1:3]
@jan-glx

jan-glx commented Jun 18, 2012

I had some SSL trouble. Inserting
options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))
helped. Nice tool!

@bhattarai842

When I run the above code with my access token, RStudio shows:

Error in order(ted.counts, decreasing = TRUE) :
argument 1 is not a vector

Can you suggest where I am going wrong?
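
One possible explanation, offered as a guess rather than a diagnosis: if the download loop ends up with an empty `ted` list (for example because the first request returned an error), `sapply` over it gives an empty list, `unlist` turns that into `NULL`, and `order` refuses `NULL`. The sketch below reproduces the empty case offline and adds a guard; `top_links` is a hypothetical helper name, not part of the gist.

```r
# An empty download leads to NULL after unlist(), which order() rejects
empty <- unlist(sapply(list(), function(x) x$likes$count))
is.null(empty)

# Hypothetical guard: returns the top n links, or character(0) if nothing was parsed
top_links <- function(links, counts, n = 3) {
  if (is.null(counts) || length(counts) == 0) return(character(0))
  head(links[order(counts, decreasing = TRUE)], n)
}

top_links(NULL, NULL)
top_links(c("a", "b", "c", "d"), c(5, 20, NA, 7))
```

If the guard returns an empty result, inspecting the raw response of the first `facebook()` call would be the next thing to check.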

@d2k

d2k commented Apr 24, 2013

It looks like the facebook function is using the wrong delimiter for the request: & instead of ?

facebook <- function( path = "me", access_token, options){
  if( !missing(options) ){
    options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
  } else {
    options <- ""
  }
  data <- getURL( sprintf( "https://graph.facebook.com/%s%s?access_token=%s", path, options, access_token ) )
  fromJSON( data )
}

Worked for me.
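
A URL may contain only one "?", so hard-coding either delimiter works for calls with options or for calls without them, but not both. One way around that, as a minimal sketch: collect the token and any options into a single query string. `graph_url` is a hypothetical helper name (not part of RCurl or the gist), and it assumes the values need no URL encoding.

```r
# Hypothetical helper: builds a Graph API URL with exactly one "?" separator.
graph_url <- function(path = "me", access_token, options = list()) {
  # the token always goes into the query; any extra options come before it
  params <- c(options, list(access_token = access_token))
  query <- paste(names(params), unlist(params), sep = "=", collapse = "&")
  sprintf("https://graph.facebook.com/%s?%s", path, query)
}

graph_url("me/posts", "TOKEN")
graph_url("me/posts", "TOKEN", list(limit = 25))
```

The result could then be passed to `getURL()` in place of the `sprintf()` call inside `facebook`.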

@yborghol

Worked for me; I just had to change the delimiter in the request of the facebook function (? instead of &).
I'm having a small issue though: my access token has expired. How can I get a new one through R?

Thanks

@d2k

d2k commented May 2, 2013

You need to request a new one via https://developers.facebook.com/tools/explorer

@tryingstats

I'd really like to get this code to work. I am a novice!!

When I download the TED posts, I am getting the following:

Error in fromJSON(data) : unexpected escaped character '\w' at pos 53

I am also getting the error when I try to find the 3 most popular posts even though I made the changes outlined above.

Error in order(ted.counts, decreasing = TRUE) :
argument 1 is not a vector

@subasish

I resolved my first error by using:
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

I tried the code several times (before the TED code chunk).

All I am getting is a "NULL" value. :(

@kasper2619

Nice piece of code! Worked just fine with all the corrections people posted. Does anyone know why one can only scrape 25 posts?

@d2k

d2k commented Dec 5, 2013

On the 25 posts: the handling of the 'next' link does not work this way.
I changed the facebook function to:

facebook <- function( path = "me", access_token, options){
  if( !missing(options) ){
    options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
  } else {
    options <- ""
  }
  if( regexpr("access_token=", path) <= 0 ){
    # first request: build the URL from the path and the token
    data <- getURL( sprintf( "https://graph.facebook.com/%s%s?limit=2&access_token=%s&format=json", path, options, access_token ) )
  } else {
    # follow-up request: path is already a complete 'next' URL, so fetch it as-is
    # (passing it through sprintf would misread any % escapes in the URL)
    data <- getURL( path )
  }
  fromJSON( data )
}

and then the next.path is:

next.path <- myposts[[i]]$paging$'next'

At least it works for me, apart from a JSON issue with some posts, but that seems to be a different thing.
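
The paging logic described here can be tried without a token. The sketch below is an illustration under assumptions, not the gist's method: `fetch_all_pages` and its `fetch` argument are hypothetical names, and a small hand-written list stands in for the API responses. It follows `paging$next` until a page arrives without one, with a page cap as a safety net.

```r
# Hypothetical helper: follows paging$`next` until no further page is reported.
# `fetch` is any function that takes a path or URL and returns the parsed page.
fetch_all_pages <- function(first, fetch, max_pages = 100) {
  pages <- list()
  target <- first
  while (length(target) == 1 && length(pages) < max_pages) {
    page <- fetch(target)
    pages[[length(pages) + 1]] <- page
    target <- page$paging$`next`   # NULL on the last page, which ends the loop
  }
  pages
}

# Offline stand-in for the API so the loop can be run without a token
fake_pages <- list(
  p1 = list(data = list(1, 2), paging = list(`next` = "p2")),
  p2 = list(data = list(3), paging = list(`next` = "p3")),
  p3 = list(data = list())
)
pages <- fetch_all_pages("p1", function(key) fake_pages[[key]])
length(pages)
```

With the modified function above, the call would presumably look like `fetch_all_pages("me/posts", function(p) facebook(path = p, access_token = access_token))`.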

@deep-mukherjee

When I run the function
friends <- facebook( path="me/friends" , access_token=access_token)

I get this error:

Error in function (type, msg, asError = TRUE) :
SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

Any idea how to fix it?

@zippeurfou

@englianhu

I tried generating a token a few times, but all tokens come back as inactive.
http://dni-institute.in/blogs/extracting-data-from-facebook-using-r/

> access_token <-'CAACEdEose0cBADrmigxgKk68AUu2IJapqA4ZAlsUkZARKhcBZCuWBQyZClkFSS73CRFoZBNC7bd2SEAsnvvtYPWFCKZBLtL02o0VEw7oP3OOgO4l5nMypUFox23yGuTyp9isa89wtem62F8GZAFUnjkbIWtfrVOF2Jod5BexAAIFUlXoZBDzxdb4F9O5wqOUiGvcpvYh22oUBdZAOrtcNr8Xo'
> library(RCurl)
> # Getting URL link along with access code
> f_url <-sprintf( "https://graph.facebook.com/%s&access_token=%s", "me/photos", access_token )
> 
> #Connnect and Extract Data
> connect <- getURL(f_url)
> connect
[1] "{\"error\":{\"message\":\"An active access token must be used to query information about the current user.\",\"type\":\"OAuthException\",\"code\":2500,\"fbtrace_id\":\"BEXIOk2pFoi\"}}"
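
Note that the URL in this snippet joins the path and the token with "&" rather than "?", which is the delimiter issue discussed earlier in this thread and may be why the token is not recognized. Separately, a response like the one above can be caught early instead of being parsed as if it were data. The sketch below is one possible check; `check_graph_response` is a hypothetical helper name, and the error is written out by hand as the list that a JSON parser such as `fromJSON` would be expected to produce.

```r
# Hypothetical helper: stops with the API's own message if the parsed
# response contains an "error" element, otherwise returns it unchanged.
check_graph_response <- function(response) {
  if (!is.null(response$error)) {
    stop(sprintf("Graph API error %s (%s): %s",
                 response$error$code, response$error$type, response$error$message),
         call. = FALSE)
  }
  response
}

# Parsed form of the error shown above, written out by hand
err <- list(error = list(
  message = "An active access token must be used to query information about the current user.",
  type = "OAuthException",
  code = 2500
))
result <- tryCatch(check_graph_response(err), error = function(e) conditionMessage(e))
result
```

Wrapping each download in such a check would make a bad token or a malformed URL fail at the request, rather than later in `order()` or `sapply()`.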
