Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Facebook Mining
# go to 'https://developers.facebook.com/tools/explorer' to get your access token
access_token <- "******************* INPUT YOUR ACCESS TOKEN ******************************"
require(RCurl)
require(rjson)
# Facebook json function copied from original (Romain Francois) post
facebook <- function( path = "me", access_token, options){
if( !missing(options) ){
options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
} else {
options <- ""
}
data <- getURL( sprintf( "https://graph.facebook.com/%s%s&access_token=%s", path, options, access_token ) )
fromJSON( data )
}
### MY FACEBOOK POSTS
myposts <- list()
i <- 0
next.path <- "me/posts"
# download all my posts
while(length(next.path)!=0) {
i<-i+1
myposts[[i]] <- facebook(path=next.path , access_token=access_token)
next.path <- sub("https://graph.facebook.com/", "", myposts[[i]]$paging$'next')
}
myposts[[i]] <- NULL
# parse the list, extract number of likes and the corresponding text (status)
parse.master <- function(x, f)
sapply(x$data, f)
parse.likes <- function(x) if(!is.null(x$likes$count)) x$likes$count else 0
mylikes <- unlist(sapply(myposts, parse.master, f=parse.likes))
parse.messages <- function(x) if(!is.null(x$message)) x$message else NA
mymessages <- unlist(sapply(myposts, parse.master, f=parse.messages))
# and the most liked status is...
mymessages[which.max(mylikes)]
### TED FACEBOOK PAGE
# http://www.facebook.com/TED
# TED's Facebook ID 29092950651 can be found on http://graph.facebook.com/TED
ted <- list()
i<-0
next.path <- "29092950651/posts"
# download all TED posts
while(length(next.path)!=0) {
i<-i+1
ted[[i]] <- facebook( path=next.path , access_token=access_token)
next.path <- sub("https://graph.facebook.com/","",ted[[i]]$paging$'next')
}
ted[[i]] <- NULL
# parse just video links posted by TED
parse.count.ted <- function(x)
if (x$type=="link" & x$from$id=="29092950651") x$likes$count else NA
parse.link.ted <- function(x)
if (x$type=="link" & x$from$id=="29092950651") x$link else NA
ted.counts <- unlist(sapply(ted, parse.master, f=parse.count.ted))
ted.links <- unlist(sapply(ted, parse.master, f=parse.link.ted))
# see three most popular talks
ted.links[order(ted.counts,decreasing=TRUE)][1:3]
@jan-glx

This comment has been minimized.

Copy link

commented Jun 18, 2012

I had some ssh truble. inserting
options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))
helped-nice tool!

@bhattarai842

This comment has been minimized.

Copy link

commented Mar 14, 2013

While running the above code with my access token my RStudio shows me

Error in order(ted.counts, decreasing = TRUE) :
argument 1 is not a vector

Can you suggest me where am I wrong??

@d2k

This comment has been minimized.

Copy link

commented Apr 24, 2013

It looks like that the facebook function is using the wrong delimiter for the request - & instead of ?

facebook <- function( path = "me", access_token, options){
if( !missing(options) ){
options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
} else {
options <- ""
}
data <- getURL( sprintf( "https://graph.facebook.com/%s%s?access_token=%s", path, options, access_token ) )
fromJSON( data )
}

worked for me

@yborghol

This comment has been minimized.

Copy link

commented Apr 29, 2013

Woked for me, I just had to change the delimiter for the request of the facebook function (? instead of &).
I'm having a small issue though, my access token has expired, how can I get a new one through R?

Thanks

@d2k

This comment has been minimized.

Copy link

commented May 2, 2013

you need to request a new one via https://developers.facebook.com/tools/explorer

@tryingstats

This comment has been minimized.

Copy link

commented Jun 10, 2013

I'd really like to get this code to work. I am a novice!!

When I download the TED posts, I am getting the following:

Error in fromJSON(data) : unexpected escaped character '\w' at pos 53

I am also getting the error when I try to find the 3 most popular posts even though I made the changes outlined above.

Error in order(ted.counts, decreasing = TRUE) :
argument 1 is not a vector

@subasish

This comment has been minimized.

Copy link

commented Jun 26, 2013

I resolved my first error by using:
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

I tried the code several times( before TED code chuck).

All I am getting some "NULL" value. :(

@kasper2619

This comment has been minimized.

Copy link

commented Jul 11, 2013

Nice pice of code! Worked just fine with all the corrections people posted. Anyone who knows why one can only scrape 25 posts???

@d2k

This comment has been minimized.

Copy link

commented Dec 5, 2013

on the 25 posts - the next handling is not working this way
I changed the facebook function to:

facebook <- function( path = "me", access_token, options){
if( !missing(options) ){
options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
} else {
options <- ""
}
if( regexpr("access_token=", path) <= 0 ){
data <- getURL( sprintf( "https://graph.facebook.com/%s%s?limit=2&access_token=%s&format=json", path, options, access_token ) )
} else {
data <- getURL( sprintf(path) )
}
fromJSON( data )
}

and then the next.path is:

next.path <- myposts[[i]]$paging$'next'

at least works for me - beside a json issue with some posts but this seems to be a different thing

@deep-mukherjee

This comment has been minimized.

Copy link

commented Jun 1, 2014

when I run the function
friends <- facebook( path="me/friends" , access_token=access_token)

I get some error saying

Error in function (type, msg, asError = TRUE) :
SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

Any idea how to fix it ?

@zippeurfou

This comment has been minimized.

@englianhu

This comment has been minimized.

Copy link

commented Oct 5, 2015

Tried to generate few times but all tokens inactive.
http://dni-institute.in/blogs/extracting-data-from-facebook-using-r/

> access_token <-'CAACEdEose0cBADrmigxgKk68AUu2IJapqA4ZAlsUkZARKhcBZCuWBQyZClkFSS73CRFoZBNC7bd2SEAsnvvtYPWFCKZBLtL02o0VEw7oP3OOgO4l5nMypUFox23yGuTyp9isa89wtem62F8GZAFUnjkbIWtfrVOF2Jod5BexAAIFUlXoZBDzxdb4F9O5wqOUiGvcpvYh22oUBdZAOrtcNr8Xo'
> library(RCurl)
> # Getting URL link along with access code
> f_url <-sprintf( "https://graph.facebook.com/%s&access_token=%s", "me/photos", access_token )
> 
> #Connnect and Extract Data
> connect <- getURL(f_url)
> connect
- [1] "{\"error\":{\"message\":\"An active access token must be used to query information about the current  user.\",\"type\":\"OAuthException\",\"code\":2500,\"fbtrace_id\":\"BEXIOk2pFoi\"}}"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.