Facebook Mining
# go to 'https://developers.facebook.com/tools/explorer' to get your access token
access_token <- "******************* INPUT YOUR ACCESS TOKEN ******************************"
require(RCurl)
require(rjson)
# Facebook json function copied from original (Romain Francois) post
facebook <- function( path = "me", access_token, options){
  if( !missing(options) ){
    options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
  } else {
    options <- ""
  }
  data <- getURL( sprintf( "https://graph.facebook.com/%s%s&access_token=%s", path, options, access_token ) )
  fromJSON( data )
}
### MY FACEBOOK POSTS
myposts <- list()
i <- 0
next.path <- "me/posts"
# download all my posts
while(length(next.path)!=0) {
  i <- i+1
  myposts[[i]] <- facebook(path=next.path , access_token=access_token)
  next.path <- sub("https://graph.facebook.com/", "", myposts[[i]]$paging$'next')
}
myposts[[i]] <- NULL
# parse the list, extract number of likes and the corresponding text (status)
parse.master <- function(x, f)
  sapply(x$data, f)
parse.likes <- function(x) if(!is.null(x$likes$count)) x$likes$count else 0
mylikes <- unlist(sapply(myposts, parse.master, f=parse.likes))
parse.messages <- function(x) if(!is.null(x$message)) x$message else NA
mymessages <- unlist(sapply(myposts, parse.master, f=parse.messages))
# and the most liked status is...
mymessages[which.max(mylikes)]
### TED FACEBOOK PAGE
# http://www.facebook.com/TED
# TED's Facebook ID 29092950651 can be found on http://graph.facebook.com/TED
ted <- list()
i<-0
next.path <- "29092950651/posts"
# download all TED posts
while(length(next.path)!=0) {
  i <- i+1
  ted[[i]] <- facebook( path=next.path , access_token=access_token)
  next.path <- sub("https://graph.facebook.com/","",ted[[i]]$paging$'next')
}
ted[[i]] <- NULL
# parse just video links posted by TED
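# keep only links posted by TED itself; anything else (or a missing field) yields NA
parse.count.ted <- function(x)
  if (identical(x$type, "link") && identical(x$from$id, "29092950651") && !is.null(x$likes$count)) x$likes$count else NA
parse.link.ted <- function(x)
  if (identical(x$type, "link") && identical(x$from$id, "29092950651") && !is.null(x$link)) x$link else NA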
ted.counts <- unlist(sapply(ted, parse.master, f=parse.count.ted))
ted.links <- unlist(sapply(ted, parse.master, f=parse.link.ted))
# see three most popular talks
ted.links[order(ted.counts,decreasing=TRUE)][1:3]
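When no pages were downloaded (for example because every request returned an error), the `unlist(sapply(...))` calls above return `NULL`, and `order(NULL)` then fails with "argument 1 is not a vector". Below is a minimal sketch of a more defensive parse, assuming the same page structure as above (each page has a `$data` list of posts). `likes_per_post` is a hypothetical helper name, not part of the original script.

```r
# Hypothetical helper: one like count per post, always returned as a numeric vector.
likes_per_post <- function(pages) {
  # Flatten the list of pages into one list of posts.
  posts <- unlist(lapply(pages, function(p) p$data), recursive = FALSE)
  # No posts at all: return an empty numeric vector instead of NULL.
  if (length(posts) == 0) return(numeric(0))
  vapply(posts, function(x) {
    n <- x$likes$count
    # Posts without a like count are treated as having 0 likes.
    if (is.null(n)) 0 else as.numeric(n)
  }, numeric(1))
}

# Made-up example data in the shape the script expects.
example_pages <- list(
  list(data = list(list(likes = list(count = 3)), list(message = "no likes yet"))),
  list(data = list(list(likes = list(count = 7))))
)
counts <- likes_per_post(example_pages)
order(counts, decreasing = TRUE)
```

Because the result is always a numeric vector, `order()` and `which.max()` can be called on it even when the download produced nothing.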
jan-glx Jun 18, 2012

I had some SSL trouble. Inserting
options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))
helped. Nice tool!

bhattarai842 Mar 14, 2013

While running the above code with my access token, RStudio shows me:

Error in order(ted.counts, decreasing = TRUE) :
  argument 1 is not a vector

Can you suggest where I went wrong?

d2k Apr 24, 2013

It looks like the facebook function is using the wrong delimiter before access_token in the request: & instead of ?

facebook <- function( path = "me", access_token, options){
  if( !missing(options) ){
    options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
  } else {
    options <- ""
  }
  data <- getURL( sprintf( "https://graph.facebook.com/%s%s?access_token=%s", path, options, access_token ) )
  fromJSON( data )
}

Worked for me.
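The delimiter depends on whether `options` was supplied: a URL takes a single `?` before the query string and `&` between parameters. One way to avoid choosing by hand is to treat the token as just another parameter. This is only a sketch: `build_graph_url` is a hypothetical helper name, and it assumes the parameter values need no URL encoding.

```r
# Hypothetical helper: assemble a Graph API URL with a well-formed query string.
build_graph_url <- function(path = "me", access_token, options = list()) {
  # The access token is simply one more query parameter.
  params <- c(options, list(access_token = access_token))
  # "name=value" pairs joined by "&".
  query <- paste(names(params), unlist(params), sep = "=", collapse = "&")
  # Exactly one "?" separates the path from the query string.
  sprintf("https://graph.facebook.com/%s?%s", path, query)
}

build_graph_url("me/posts", "TOKEN")
build_graph_url("me/posts", "TOKEN", list(limit = 25))
```

The resulting string can be passed to `getURL()` in place of the `sprintf()` call inside `facebook`.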

yborghol Apr 29, 2013

Worked for me. I just had to change the delimiter in the request of the facebook function (? instead of &).
I'm having a small issue though: my access token has expired. How can I get a new one through R?

Thanks

d2k commented May 2, 2013

You need to request a new one via https://developers.facebook.com/tools/explorer

tryingstats Jun 10, 2013

I'd really like to get this code to work. I am a novice!!

When I download the TED posts, I am getting the following:

Error in fromJSON(data) : unexpected escaped character '\w' at pos 53

I am also getting the error when I try to find the 3 most popular posts even though I made the changes outlined above.

Error in order(ted.counts, decreasing = TRUE) :
argument 1 is not a vector

subasish Jun 26, 2013

I resolved my first error by using:
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

I tried the code several times (the part before the TED code chunk).

All I am getting is a "NULL" value. :(

kasper2619 Jul 11, 2013

Nice piece of code! Worked just fine with all the corrections people posted. Does anyone know why one can only scrape 25 posts?

d2k Dec 5, 2013

About the 25 posts: the handling of the next link does not work as written.
I changed the facebook function to:

facebook <- function( path = "me", access_token, options){
  if( !missing(options) ){
    options <- sprintf( "?%s", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
  } else {
    options <- ""
  }
  if( regexpr("access_token=", path) <= 0 ){
    data <- getURL( sprintf( "https://graph.facebook.com/%s%s?limit=2&access_token=%s&format=json", path, options, access_token ) )
  } else {
    # path is already a complete paging URL; fetch it unchanged
    # (sprintf(path) would fail on any "%" escape in the URL)
    data <- getURL( path )
  }
  fromJSON( data )
}

and then the next.path is:

next.path <- myposts[[i]]$paging$'next'

At least it works for me, apart from a JSON issue with some posts, but that seems to be a different problem.
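The paging loop itself can be sketched independently of the download step. This is an illustration under assumptions, not a tested replacement: `fetch_all_pages` and its `max_pages` cap are hypothetical, and `fetch` stands for any function that takes a URL and returns the parsed response as a list, e.g. `function(u) fromJSON(getURL(u))`.

```r
# Hypothetical helper: follow `paging$next` links until none is left.
fetch_all_pages <- function(first_url, fetch, max_pages = 100) {
  pages <- list()
  url <- first_url
  while (!is.null(url) && length(pages) < max_pages) {
    page <- fetch(url)
    # Stop on an API error or an empty page instead of storing it.
    if (!is.null(page$error) || length(page$data) == 0) break
    pages[[length(pages) + 1]] <- page
    # NULL when there is no further page, which ends the loop.
    url <- page$paging$`next`
  }
  pages
}

# Offline demonstration with a fake two-page response (made-up data).
fake_responses <- list(
  u1 = list(data = list("post 1", "post 2"), paging = list(`next` = "u2")),
  u2 = list(data = list("post 3"), paging = list())
)
all_pages <- fetch_all_pages("u1", function(u) fake_responses[[u]])
length(all_pages)
```

Since the loop stops before storing an empty or failed page, there is no trailing element to delete afterwards.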

deep-mukherjee Jun 1, 2014

When I run the function
friends <- facebook( path="me/friends" , access_token=access_token)

I get the following error:

Error in function (type, msg, asError = TRUE) :
  SSL certificate problem, verify that the CA cert is OK. Details:
  error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

Any idea how to fix it?

englianhu Oct 5, 2015

I tried generating a token a few times, but every one is reported as inactive.
http://dni-institute.in/blogs/extracting-data-from-facebook-using-r/

> access_token <-'CAACEdEose0cBADrmigxgKk68AUu2IJapqA4ZAlsUkZARKhcBZCuWBQyZClkFSS73CRFoZBNC7bd2SEAsnvvtYPWFCKZBLtL02o0VEw7oP3OOgO4l5nMypUFox23yGuTyp9isa89wtem62F8GZAFUnjkbIWtfrVOF2Jod5BexAAIFUlXoZBDzxdb4F9O5wqOUiGvcpvYh22oUBdZAOrtcNr8Xo'
> library(RCurl)
> # Getting URL link along with access code
> f_url <-sprintf( "https://graph.facebook.com/%s&access_token=%s", "me/photos", access_token )
> 
> #Connnect and Extract Data
> connect <- getURL(f_url)
> connect
- [1] "{\"error\":{\"message\":\"An active access token must be used to query information about the current  user.\",\"type\":\"OAuthException\",\"code\":2500,\"fbtrace_id\":\"BEXIOk2pFoi\"}}"
