GitHub Gist Comments Feed Generator in R (this is how much I hate Ruby)
# Roll your own GitHub Gist Comments Feed in R

library(xml2)    # github version
library(rvest)   # github version
library(stringr) # for str_trim & str_replace
library(dplyr)   # for data_frame & bind_rows
library(pbapply) # free progress bars for everyone!
library(XML)     # to build the RSS feed

who <- "hrbrmstr" # CHANGE ME!

# Grab the user's gist feed -----------------------------------------------

gist_feed <- sprintf("https://gist.github.com/%s.atom", who)
feed_pg <- read_xml(gist_feed)
ns <- xml_ns_rename(xml_ns(feed_pg), d1 = "feed")

# Extract the links & titles of the gists in the feed ---------------------

links <- xml_attr(xml_find_all(feed_pg, "//feed:entry/feed:link", ns), "href")
titles <- xml_text(xml_find_all(feed_pg, "//feed:entry/feed:title", ns))

#' This function does the hard part by iterating over the
#' links/titles and building a tbl_df of all the comments per-gist
get_comments <- function(links, titles) {

  # seq_along() is safe even when the feed has no entries
  bind_rows(pblapply(seq_along(links), function(i) {

    # get gist
    pg <- read_html(links[i])

    # look for comments
    ref <- tryCatch(html_attr(html_nodes(pg, "div.timeline-comment-wrapper a[href^='#gistcomment']"), "href"),
                    error=function(e) character(0))

    # in theory, if 'ref' exists then the rest will, too
    if (length(ref) != 0) {
      # if there were comments, get all the metadata we care about
      author <- html_text(html_nodes(pg, "div.timeline-comment-wrapper a.author"))
      timestamp <- html_attr(html_nodes(pg, "div.timeline-comment-wrapper time"), "datetime")
      contentpg <- str_trim(html_text(html_nodes(pg, "div.timeline-comment-wrapper div.comment-body")))
    } else {
      ref <- author <- timestamp <- contentpg <- character(0)
    }

    # bind_rows ignores length 0 tbl_df's, so return an empty one
    # if any of the fields came back empty
    if (any(lengths(list(ref, author, timestamp, contentpg)) == 0)) {
      return(data_frame())
    }

    return(data_frame(title=titles[i], link=links[i],
                      ref=ref, author=author,
                      timestamp=timestamp, contentpg=contentpg))

  }))

}

comments <- get_comments(links, titles)

# Build the feed ------------------------------------------------------------

feed <- xmlTree("feed")
feed$addNode("id", sprintf("user:%s", who))
feed$addNode("title", sprintf("%s's gist comments", who))
feed$addNode("icon", "https://assets-cdn.github.com/favicon.ico")
feed$addNode("link", attrs=list(href=sprintf("https://github.com/%s", who)))
feed$addNode("updated", format(Sys.time(), "%Y-%m-%dT%H:%M:%SZ", tz="GMT"))

# seq_len() skips the loop entirely when there are no comments;
# comments$col[i] pulls a plain character value instead of a 1x1 tbl_df
for (i in seq_len(nrow(comments))) {
  feed$addNode("entry", close=FALSE)
  feed$addNode("id", sprintf("gist:comment:%s:%s", who, comments$timestamp[i]))
  feed$addNode("link", attrs=list(href=sprintf("%s%s", comments$link[i], comments$ref[i])))
  feed$addNode("title", sprintf("Comment by %s", comments$author[i]))
  feed$addNode("updated", comments$timestamp[i])
  feed$addNode("author", close=FALSE)
  feed$addNode("name", comments$author[i])
  feed$closeTag()
  feed$addNode("content", saveXML(xmlTextNode(comments$contentpg[i]), prefix=""),
               attrs=list(type="html"))
  feed$closeTag()
}

# xmlTree() wrote a bare <feed>, so patch in the Atom namespace declaration
rss <- str_replace(saveXML(feed), "<feed>", '<feed xmlns="http://www.w3.org/2005/Atom">')

writeLines(rss, con="feed.xml")
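
One thing to watch in the script above: each entry `<id>` is built from the comment timestamp, so two comments posted in the same second would end up with the same id. A minimal sketch of an alternative, assuming the `#gistcomment-NNNNNNN` anchor in `ref` is unique per comment (it looks that way in the scraped links, but that is an assumption). `comment_id()` is a hypothetical helper, not part of the original script:

```r
# Hypothetical helper (not in the original gist): build an entry id
# from the comment anchor rather than from the timestamp.
comment_id <- function(who, ref) {
  # drop the leading "#gistcomment-" and keep only the numeric part
  sprintf("gist:comment:%s:%s", who, sub("^#gistcomment-", "", ref))
}

comment_id("hrbrmstr", "#gistcomment-1503378")
# → "gist:comment:hrbrmstr:1503378"
```

If that holds up, it could replace the `sprintf()` call that builds the entry id inside the loop, with the comment's `ref` value passed in place of its timestamp.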
<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>user:hrbrmstr</id>
  <title>hrbrmstr's gist comments</title>
  <icon>https://assets-cdn.github.com/favicon.ico</icon>
  <link href="https://github.com/hrbrmstr"/>
  <updated>2015-07-25T13:55:04Z</updated>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-23T23:23:41Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1503378"/>
    <title>Comment by ateucher</title>
    <updated>2015-07-23T23:23:41Z</updated>
    <author>
      <name>ateucher</name>
    </author>
    <content type="html">Very nice! Regarding the extreme values, is truncating them back to maximum the right thing to do? Or should they &quot;wrap&quot; into the other half of the globe (eg, rather than converting -186.0 to -179.99999, should it actually be 174.0?)
I ask this out of ignorance of the source of the errors...</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-24T12:55:57Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1506227"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-07-24T12:55:57Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">That&apos;s a good question. I&apos;m going to post this to the r-sig-geo list to get some feedback.</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-24T18:06:40Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1507414"/>
    <title>Comment by ateucher</title>
    <updated>2015-07-24T18:06:40Z</updated>
    <author>
      <name>ateucher</name>
    </author>
    <content type="html"></content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-24T19:01:32Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1507609"/>
    <title>Comment by ateucher</title>
    <updated>2015-07-24T19:01:32Z</updated>
    <author>
      <name>ateucher</name>
    </author>
    <content type="html">So I don&apos;t think that chopping it at 180 is the answer, as those values &gt; 180 are actually &apos;valid&apos;, as Russia, Fiji, and Antarctica all cross the 180th meridian (https://en.wikipedia.org/wiki/180th_meridian). But I don&apos;t know what the answer is - see the &apos;software representation problems&apos; in the Wikipedia article - we&apos;re not alone :)
library(ggplot2)
library(maps)
world &lt;- map_data(&quot;world&quot;)
gg &lt;- ggplot()
gg &lt;- gg + geom_map(data=world, map=world,
aes(x=long, y=lat, map_id=region))
gg &lt;- gg + xlim(c(170, 200)) + ylim(c(60, 70))
gg</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-24T20:02:06Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1507834"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-07-24T20:02:06Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">As I posted on Twitter (adding it here just for folks who stumble on this via my blog post) i totally knew I was DESTROYING THE EARTH with that hack ;-) rworldmap::getMap() has a cleaner shapefile for the world that doesn&apos;t impact this, but I do need to do something about this before it becomes &quot;a real thing&quot; for folks. No replies from r-sig-geo yet but I&apos;ll research over the weekend and see what I can come up with. It won&apos;t be super-scary math, but i need to ensure I cover all the edge cases (no pun intended).</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-24T20:27:50Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1507926"/>
    <title>Comment by hadley</title>
    <updated>2015-07-24T20:27:50Z</updated>
    <author>
      <name>hadley</name>
    </author>
    <content type="html">There some good stuff on the general problem in http://bost.ocks.org/mike/example/</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-24T21:10:56Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1508014"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-07-24T21:10:56Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">heh. that site of Bostock&apos;s always makes me dizzy. thx for that, tho. hopefully won&apos;t be too hard to work around.</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-24T21:19:45Z</id>
    <link href="https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7#gistcomment-1508021"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-07-24T21:19:45Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">This comment is solely to see if the IFTTT action is working</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-07-12T18:32:08Z</id>
    <link href="https://gist.github.com/hrbrmstr/bf821a2e4b48151a8e96#gistcomment-1491158"/>
    <title>Comment by abresler</title>
    <updated>2015-07-12T18:32:08Z</updated>
    <author>
      <name>abresler</name>
    </author>
    <content type="html">This code is so clean just wanted to say nice!!!</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-06-29T13:02:23Z</id>
    <link href="https://gist.github.com/hrbrmstr/32e9c140129d7d51db52#gistcomment-1482541"/>
    <title>Comment by bearloga</title>
    <updated>2015-06-29T13:02:23Z</updated>
    <author>
      <name>bearloga</name>
    </author>
    <content type="html">Error in UseMethod(&quot;html_nodes&quot;) :
no applicable method for &apos;html_nodes&apos; applied to an object of class &quot;c(&apos;xml_document&apos;, &apos;xml_node&apos;)&quot;
:\ Have you seen that error?
P.S. My machine has:
Package Version
1 xml2 0.1.1
2 rvest 0.2.0
3 htmltools 0.2.6</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-06-29T13:22:38Z</id>
    <link href="https://gist.github.com/hrbrmstr/32e9c140129d7d51db52#gistcomment-1482548"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-06-29T13:22:38Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">aye. i just made a note in the source.
rvest * 0.2.0.9000 2015-06-21 Github (hadley/rvest@9461bc4) is what I&apos;m using. I think i can tweak this, tho.</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-06-29T13:38:22Z</id>
    <link href="https://gist.github.com/hrbrmstr/32e9c140129d7d51db52#gistcomment-1482557"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-06-29T13:38:22Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">and, it should work on stable and github versions</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-06-29T13:48:55Z</id>
    <link href="https://gist.github.com/hrbrmstr/32e9c140129d7d51db52#gistcomment-1482563"/>
    <title>Comment by cpsievert</title>
    <updated>2015-06-29T13:48:55Z</updated>
    <author>
      <name>cpsievert</name>
    </author>
    <content type="html">I think you want xml2, not xml</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-06-29T14:12:04Z</id>
    <link href="https://gist.github.com/hrbrmstr/32e9c140129d7d51db52#gistcomment-1482570"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-06-29T14:12:04Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">aye. thxk @cpsievert. v1 was beautiful. v2+ has been coded whilst catching up from being on vacation and dealing with the morning routine.</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-05-17T14:45:01Z</id>
    <link href="https://gist.github.com/hrbrmstr/51f961198f65509ad863#gistcomment-1455240"/>
    <title>Comment by irichgreen</title>
    <updated>2015-05-17T14:45:01Z</updated>
    <author>
      <name>irichgreen</name>
    </author>
    <content type="html">Hi,
I&apos;ve got an error message in the line number 9 code.
&quot;us &lt;- readOGR(&quot;us_states_hexgrid.geojson&quot;, &quot;OGRGeoJSON&quot;)&quot;
Error in ogrInfo(dsn = dsn, layer = layer, encoding = encoding, use_iconv = use_iconv, :
GDAL Error 3: Cannot open file &apos;us_states_hexgrid.geojson&apos;
Could you please resolve it?</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-05-17T16:15:09Z</id>
    <link href="https://gist.github.com/hrbrmstr/51f961198f65509ad863#gistcomment-1455260"/>
    <title>Comment by bnjcbsn</title>
    <updated>2015-05-17T16:15:09Z</updated>
    <author>
      <name>bnjcbsn</name>
    </author>
    <content type="html">Curious about this error as well. Interesting topic.</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-05-21T17:19:31Z</id>
    <link href="https://gist.github.com/hrbrmstr/51f961198f65509ad863#gistcomment-1458397"/>
    <title>Comment by hrbrmstr</title>
    <updated>2015-05-21T17:19:31Z</updated>
    <author>
      <name>hrbrmstr</name>
    </author>
    <content type="html">I really need to figure out how to get notices abt comments on gists
You need the latest gdal library and the a fresh install of rgdal
You need the shapefile referenced in the previous blog post. Here&apos;s the link to said shapefile https://team.cartodb.com/u/andrew/tables/andrew.us_states_hexgrid/public/map
I also added it here</content>
  </entry>
  <entry>
    <id>gist:comment:hrbrmstr:2015-03-18T22:47:43Z</id>
    <link href="https://gist.github.com/hrbrmstr/43a6d52622825fbd9e3d#gistcomment-1415781"/>
    <title>Comment by timelyportfolio</title>
    <updated>2015-03-18T22:47:43Z</updated>
    <author>
      <name>timelyportfolio</name>
    </author>
    <content type="html">freaking awesome</content>
  </entry>
</feed>
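
To sanity-check a feed like the one above, it can be read back with the same `xml2` namespace-rename trick the script uses on the gist feed. This is only a sketch: the inline `atom` string is a cut-down stand-in for `feed.xml` (two entries, most fields dropped), and the variable names are made up for illustration.

```r
library(xml2)

# Cut-down stand-in for feed.xml (assumption: same layout as the real feed)
atom <- '<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Comment by ateucher</title>
    <updated>2015-07-23T23:23:41Z</updated>
    <author><name>ateucher</name></author>
  </entry>
  <entry>
    <title>Comment by hadley</title>
    <updated>2015-07-24T20:27:50Z</updated>
    <author><name>hadley</name></author>
  </entry>
</feed>'

doc <- read_xml(atom)

# The default Atom namespace shows up as "d1"; rename it so XPath is readable
ns <- xml_ns_rename(xml_ns(doc), d1 = "feed")

authors <- xml_text(xml_find_all(doc, "//feed:entry/feed:author/feed:name", ns))
updated <- xml_text(xml_find_all(doc, "//feed:entry/feed:updated", ns))

authors
# → "ateucher" "hadley"
```

Swapping `atom` for `"feed.xml"` in the `read_xml()` call should do the same against the real file, assuming it was written by the script.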