@iandouglas
Created August 10, 2018 19:06
Scraping GitHub pull requests using Ruby
# Determine whether a user has opened one or more pull requests against a repo.
# The result is printed as a JSON payload either way (an empty list if none were found).
require 'nokogiri'
require 'open-uri'
require 'json'
student = '' # GitHub username to filter by, e.g. 'iandouglas'; leave empty to list every pull request
repo = 'turingschool/portfolios'
# URI.open is required on Ruby 3.0+, where Kernel#open no longer accepts URLs
doc = Nokogiri::HTML(URI.open("https://github.com/#{repo}/pulls"))
# Important: only the first page of results is scraped; pagination is not handled!
pull_requests = []
next_state = 'pr_link'
pr_link = ''
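# Walk the links inside each list item. The code assumes that the link following
# a pull request link is the author's username, so it alternates between two states.
doc.xpath('//li//div//a').each do |link|
  href = link['href'].to_s # empty string when the anchor has no href
  if next_state == 'pr_link' && href.include?('/pull/')
    pr_link = href
    next_state = 'github_username'
  elsif next_state == 'github_username'
    next_state = 'pr_link'
    # keep the pull request when no student is set or the link text matches the student
    if student.empty? || (link.text.strip.downcase == student.downcase)
      pull_requests << "https://github.com#{pr_link}"
    end
    pr_link = ''
  end
end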
pr = {'pr_list' => pull_requests}.to_json
puts pr
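
The loop above alternates between two states: it waits for a pull request link, then treats the next link as the author. Below is a minimal sketch of that pairing logic on plain data, so it can be tried without Nokogiri or network access. The method name `pair_pull_requests`, the `[href, text]` pair format, and the sample links (including the pull request numbers) are made up for illustration and do not come from the original script.

```ruby
require 'json'

# Pairs each pull request href with the text of the link that follows it.
# `links` is an array of [href, text] pairs in document order (a hypothetical
# stand-in for the Nokogiri anchor nodes). `student` filters by author;
# an empty string matches everyone.
def pair_pull_requests(links, student = '')
  pull_requests = []
  pending_pr = nil # href of a PR link still waiting for its author link

  links.each do |href, text|
    if pending_pr.nil?
      # state 'pr_link': ignore everything until a pull request link shows up
      pending_pr = href if href.to_s.include?('/pull/')
    else
      # state 'github_username': this link is assumed to be the author
      if student.empty? || text.to_s.strip.downcase == student.downcase
        pull_requests << "https://github.com#{pending_pr}"
      end
      pending_pr = nil
    end
  end

  pull_requests
end

# Made-up sample data, not real pull requests
sample_links = [
  ['/turingschool/portfolios/labels', 'Labels'],
  ['/turingschool/portfolios/pull/1', 'Add portfolio'],
  ['/iandouglas', 'iandouglas'],
  ['/turingschool/portfolios/pull/2', 'Update README'],
  ['/someone-else', 'someone-else']
]

puts({ 'pr_list' => pair_pull_requests(sample_links, 'IanDouglas') }.to_json)
```

With the sample data only the first pull request is kept, because the comparison is case-insensitive and the second one belongs to a different user. If GitHub changes the order of links in its markup, the pairing assumption breaks in both this sketch and the script above.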