@ashaw
Created October 26, 2013 11:04
namespace :transcribable do
  desc "Harvest documents to transcribe from DocumentCloud"
  task :harvest_kmiz => :environment do
    require 'json'
    require 'rest-client'

    # Resolve the model class that backs the Transcribable table
    klass = Kernel.const_get(Transcribable.table.classify)

    # Search DocumentCloud for "contributedto:freethefiles station:KMIZ"
    result = JSON.parse(RestClient.get("http://www.documentcloud.org/api/search?q=contributedto%3Afreethefiles+station%3AKMIZ"))

    result['documents'].each do |doc|
      obj = klass.find_or_initialize_by_url("https://www.documentcloud.org/documents/#{doc['id']}")
      # don't plow over verified docs if rerunning the script
      obj.verified = false if obj.new_record?
      obj.save
      puts "== added #{obj.url}"
    end
  end
end
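
The search URL in the task is hard-coded and already percent-encoded. A minimal sketch of how the same URL could be built from its parts, assuming only Ruby's standard `uri` library; the helper name `documentcloud_search_url` and its two parameters are hypothetical and not part of the original task:

```ruby
require 'uri'

# Hypothetical helper (not in the original gist): builds the DocumentCloud
# search URL from a "contributedto" value and a station call sign.
def documentcloud_search_url(contributed_to, station)
  # Same query the task uses, before encoding
  query = "contributedto:#{contributed_to} station:#{station}"
  # encode_www_form turns ":" into "%3A" and spaces into "+"
  "http://www.documentcloud.org/api/search?" + URI.encode_www_form(q: query)
end

puts documentcloud_search_url("freethefiles", "KMIZ")
```

Called with `"freethefiles"` and `"KMIZ"`, this yields the same string that is passed to `RestClient.get` in the task, so harvesting a different station would only require a different argument.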