@jenningsanderson
Created February 13, 2014 23:22
Scrape one or more Moodle forums and generate a CSV of unique participants.
# Reference: http://stackoverflow.com/questions/19271622/rails-ruby-mechanize-how-to-get-a-page-after-redirection
require 'nokogiri'  # backs page.search (Mechanize loads it as well)
require 'mechanize'
require 'csv'
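# Logs in to a Moodle site and collects the unique participants of its forums.
class MoodleScraper
  # URL of the page that lists every forum topic (used by get_all_forums).
  attr_writer :base

  def initialize(username, password)
    login(username, password)
    @participants = [] # unique [first name, last name] pairs
  end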
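
  # Submit the Moodle login form; @agent keeps the session cookies afterwards.
  def login(uname, pwd)
    @agent = Mechanize.new
    # The alias sets the User-Agent header, so a separate user_agent= call is redundant.
    @agent.user_agent_alias = 'Linux Mozilla'
    @agent.get('https://moodle.cs.colorado.edu/login/index.php') do |page|
      # Despite its name, @login_page holds the page returned after the form is submitted.
      @login_page = page.form_with(:id => 'login') do |f|
        f.username = uname
        f.password = pwd
      end.click_button
    end
  end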
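
  # Scrape every discussion linked from the forum index page.
  def get_all_forums(base_url = nil)
    @forum_urls = []
    base_url ||= @base
    raise ArgumentError, 'set base= or pass base_url' if base_url.nil?
    page = @agent.get(base_url)
    page.search('td.topic').each do |link|
      # Assumes the cell's first child is the <a> and its first attribute is href.
      @forum_urls << link.child.values[0].to_s
    end
    @forum_urls.each do |forum|
      scrape_forum(forum)
      puts "Scraped #{forum}"
    end
  end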
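
  # Record each unique author of the posts in one discussion.
  def scrape_forum(forum_url)
    # The server checks authentication and, on success, redirects to the
    # destination page; Mechanize follows that redirect.
    page = @agent.get(forum_url)
    page.search('div.author').each do |author|
      # Keep only the first and last words of the author's display name.
      name = author.child.next.text.split
      full_name = [name[0], name[-1]]
      @participants << full_name unless @participants.include?(full_name)
    end
  end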

  # Write the collected participants to a CSV file.
  def write_csv(name = nil)
    name ||= 'moodle_participants.csv'
    # The block form closes the file, so every row is flushed to disk.
    CSV.open(name, 'w') do |csv|
      csv << ['First Name', 'Last Name']
      @participants.each { |part| csv << part }
    end
  end
end
## Example call:
## Initialize a scraper & authenticate
#forum_scrape = MoodleScraper.new('username', 'password')
## Scrape all Forums:
## The base is the page that lists each forum topic, see below for example
#forum_scrape.base='http://moodle.cs.colorado.edu/mod/forum/view.php?id=8960'
## Call this to get all forums.
#forum_scrape.get_all_forums
## OR you can scrape only one forum:
#forum_scrape.scrape_forum('http://moodle.cs.colorado.edu/mod/forum/discuss.php?d=5198')
## Write the CSV
#forum_scrape.write_csv
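
The de-duplication and CSV steps can be tried without a Moodle login. The following is a minimal sketch, not part of the gist: `first_and_last` and `participants_csv` are hypothetical helper names, and the sample names are made up. It mirrors the `name[0]` / `name[-1]` logic of `scrape_forum`, but swaps the array `include?` check for a `Set`, which keeps the membership test cheap on large forums.

```ruby
require 'csv'
require 'set'

# Hypothetical helper (not in the gist): reduce a display name to
# [first, last], like name[0] / name[-1] in scrape_forum.
def first_and_last(display_name)
  parts = display_name.split
  [parts.first, parts.last]
end

# Hypothetical helper: build the CSV text for a list of display names,
# writing each [first, last] pair only once.
def participants_csv(display_names)
  seen = Set.new
  CSV.generate do |csv|
    csv << ['First Name', 'Last Name']
    display_names.each do |display_name|
      pair = first_and_last(display_name)
      # Set#add? returns nil when the pair is already present.
      csv << pair if seen.add?(pair)
    end
  end
end

puts participants_csv(['Ada Lovelace', 'Alan M. Turing', 'Ada Lovelace'])
```

Note that middle names are dropped, so two participants who share a first and last name collapse into one row, and a single-word name appears in both columns. The scraper above behaves the same way.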