@juliends
Created October 24, 2017 17:22
scrapper_imdb
require 'open-uri' # Open a URL
require 'nokogiri' # HTML ==> Nokogiri document

BASE_URL = "http://www.imdb.com"

# Fetch and parse the Top 250 chart.
# URI.open needs Ruby 2.5+; calling Kernel#open on a URL no longer works in Ruby 3.0.
html = URI.open("#{BASE_URL}/chart/top").read
html_doc = Nokogiri::HTML(html)

html_doc.search('.titleColumn a').each do |element|
  title = element.text.strip
  link = element['href']
  actors = element['title'] # the link's title attribute (director and cast)

  # Fetch the movie page and extract its summary.
  # URI.join avoids the double slash produced by "#{BASE_URL}/#{link}".
  movie_url = URI.join(BASE_URL, link).to_s
  movie_doc = Nokogiri::HTML(URI.open(movie_url).read)
  summary = movie_doc.search('.summary_text').text.strip

  movie_text = "#{title}\n#{actors}\n#{summary}\n"

  # One text file per movie; "/" is replaced too because it is a path separator.
  file_path = "#{title.gsub(/[ \/]/, '_')}.txt"
  File.write(file_path, movie_text)
end
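
The script derives each file name from the movie title by swapping spaces for underscores. Titles can also contain characters such as `:` or `/` that some file systems reject, so a stricter rule may be useful. The following is a minimal sketch, not part of the original script: `safe_file_name` is a hypothetical helper, and its replacement rule (keep letters, digits and hyphens, collapse everything else into `_`) is an assumption.

```ruby
# Hypothetical helper (not in the original gist): build a file name from a
# movie title, keeping only letters, digits and hyphens.
def safe_file_name(title, extension = "txt")
  # Collapse every run of other characters into a single underscore.
  base = title.strip.gsub(/[^[:alnum:]\-]+/, "_")
  # Drop underscores left over at either end.
  base = base.gsub(/\A_+|_+\z/, "")
  # Fall back to a placeholder if nothing usable remains.
  base = "untitled" if base.empty?
  "#{base}.#{extension}"
end

puts safe_file_name("The Godfather")
puts safe_file_name("Face/Off")
```

In the scraper loop, `file_path = safe_file_name(title)` could then replace the `gsub` call.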