@dvoryankin
Created February 15, 2018 15:08
nokogiri scraper answers
# 1) What is your favorite Ruby class/library or method and why?
# I like Ruby's overall design: its methods are intuitive, and the language reads almost like plain English.
# I started using Ruby for Linux administration and scripting, so my favorite class is 'File'; I like its convenient methods for working with the filesystem.
# I also like the Array and Hash classes, because objects of these classes are simply a pleasure to work with.
# As for a method, probably reduce, because it lets you express complex operations with simple logic and avoids large, hard-to-read constructs.
# I also like many of the Hash and Array methods for their simplicity and clarity.
# Before Ruby, I had experience with Java and C#, and now I'm glad to be working with Ruby.
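# A minimal sketch of the kind of reduce usage described above. The file names and
# sizes are hypothetical sample data, not taken from a real system; File.extname is
# used only to tie the example back to the File class.

```ruby
# Hypothetical data: file sizes in bytes, keyed by file name.
sizes = { 'a.log' => 120, 'b.log' => 300, 'c.txt' => 80 }

# Sum all sizes in a single pass.
total = sizes.values.reduce(0) { |sum, size| sum + size }

# Accumulate sizes per file extension; the block must return the accumulator.
by_ext = sizes.reduce(Hash.new(0)) do |acc, (name, size)|
  acc[File.extname(name)] += size
  acc
end

puts total           # prints 500
puts by_ext['.log']  # prints 420
```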
-----
# 2) Given HTML: "<div class="images"><img src="/pic.jpg"></div>" Using Nokogiri, how would you select the src attribute from the image? Show me two different ways to do that correctly with the HTML given.
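require 'nokogiri'

html = Nokogiri::HTML('<div class="images"><img src="/pic.jpg"></div>')
# CSS selector, first match only
html.at_css('.images img').attr('src')
# => "/pic.jpg"
#or
# CSS selector returning a NodeSet, then index into it
html.css('img')[0]['src']
# => "/pic.jpg"
#or
# XPath selecting the attribute nodes directly
html.xpath('//div[@class="images"]/img/@src').map(&:value)
# => ["/pic.jpg"]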
-----
# 3) If the found HTML was a collection of li tags within a div with class="attr", how would you use Nokogiri to collect that information into one array?
html = Nokogiri::HTML("<div class='attr'><ul><li>first</li><li>second</li></ul></div>")
html.xpath("//div[@class='attr']//ul//li").map(&:text)
# => ["first", "second"]
-----
# 4) Given the following HTML:
# <div class="listing"> <div class="row"> <span class="left">Title:</span>
# <span class="right">The Well-Grounded Rubyist</span> </div> <div class="row"> <span class="left">Author:</span>
# <span class="right">David A. Black</span> </div> <div class="row"> <span class="left">Price:</span>
# <span class="right">$34.99</span> </div> <div class="row"> <span class="left">Description:</span>
# <span class="right">A great book for Rubyists</span> </div> <div class="row"> <span class="left">Seller:</span>
# <span class="right">Ruby Scholar</span> </div> </div>
# Please collect all of the data presented into a key-value store. Please include code and the output.
html = Nokogiri::HTML('<div class="listing"> <div class="row"> <span class="left">Title:</span>
<span class="right">The Well-Grounded Rubyist</span> </div> <div class="row">
<span class="left">Author:</span> <span class="right">David A. Black</span> </div>
<div class="row"> <span class="left">Price:</span> <span class="right">$34.99</span>
</div> <div class="row"> <span class="left">Description:</span> <span class="right">A great book for Rubyists</span>
</div> <div class="row"> <span class="left">Seller:</span> <span class="right">Ruby Scholar</span> </div> </div>')
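hash = {}
# Each row holds one label (.left) and one value (.right); each is used here
# because the loop only fills the hash and its return value is not needed.
html.css('.row').each do |row|
  hash[row.at_css('.left').text.to_sym] = row.at_css('.right').text
end
hash.each { |key, value| puts "#{key} #{value}" }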
# =>
# Title: The Well-Grounded Rubyist
# Author: David A. Black
# Price: $34.99
# Description: A great book for Rubyists
# Seller: Ruby Scholar
-----
# 5) What Ruby feature do you hate?
# There is nothing in Ruby that I really hate, but sometimes I am a little sad that it is not very fast.
# Ruby's other pleasant qualities outweigh these minor downsides, though.