@Andsbf
Last active Aug 29, 2015
Web Scrapper Exercise
# comment.rb
class Comment
  attr_accessor :content

  def initialize(content)
    @content = content
  end
end

# main.rb
require 'pry'
require 'colorize'
require_relative 'post'

ARGV[0]

begin
  post = Post.new(ARGV[0])
  post.print_details
rescue
  p "Invalid URL"
end

# post.rb
#classes
require 'nokogiri'
require 'open-uri'
require_relative 'comment'

class Post
  attr_accessor :title, :url, :points, :item_id, :content

  def initialize(url)
    @content = Nokogiri::HTML(open(url))
    @title = content.search('.title > a').map { |a| a.inner_text }[0].sub(/\u00E2\u0080\u0093/, '-')
    @url = url
    @points = content.search('.subtext > span:first-child').map { |span| span.inner_text }
    @item_id = (/=\d+/.match(url)[0]).match(/\d+/)[0]
    @comments_array = content.search('.comment').map { |comment| Comment.new(comment.inner_text) }
  end

  def show_comments
    @comments_array.each { |each_comment| p each_comment }
  end

  def add_comment(comment_obj)
    @comments_array.push(comment_obj)
  end

  def print_details
    puts "Post title: #{title}".colorize(:black).colorize(:background => :white)
    puts "Number of comments: #{@comments_array.length}".colorize(:white).colorize(:background => :red).blink
  end
end

@jansepar jansepar commented Mar 12, 2015

Some code review for this:

Small thing, but you should take special care to make sure indentation is correct. Your initialize is unnecessarily indented in comment.rb:

https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-comment-rb-L5

Maybe this was a mistake, but this line doesn't really do anything:

https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-main-rb-L6

In case it's not clear, ARGV is just an array that contains all of the arguments passed to your program on the command line.
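A quick sketch of what that means in practice. The `first_argument` helper is made up for illustration and is not part of the gist; the point is only that a bare `ARGV[0]` evaluates to a value and throws it away, so it has to be assigned or passed somewhere to matter.

```ruby
# ARGV holds the command-line arguments as an array of strings.
# For `ruby main.rb some_url`, ARGV is ["some_url"].

# Hypothetical helper: takes an argument array so it can be tried without a shell.
def first_argument(args)
  args[0]         # evaluated and discarded, like the bare ARGV[0] in main.rb
  url = args[0]   # assigning the value is what makes it usable
  url.nil? ? "no URL given" : "URL: #{url}"
end

puts first_argument(ARGV)
```

Running the script with no arguments leaves ARGV empty, so `ARGV[0]` is `nil` rather than an error.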

You didn't need to set @url, @points, @item_id, as you don't do anything with them after they are set. https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L13

You never call show_comments or add_comment, so you can probably get rid of them: https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L20 and https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L24

You shouldn't have to replace UTF-8 characters with hyphens by hand. There should be a way to specify the encoding. It seems like the answer is somewhere in here (but I haven't tried it myself):

https://gist.github.com/Andsbf/0e64c54d151c30dbd174#file-post-rb-L12
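To show what is probably going on with that line, here is a standard-library-only sketch. An en dash is three bytes in UTF-8 (E2 80 93); if those bytes are decoded as ISO-8859-1, each byte becomes its own character, which is exactly the sequence the regex in `sub` matches. The sample title is made up, and I am assuming the page really is UTF-8. Nokogiri::HTML also takes an encoding as its third argument, which should avoid the problem at parse time, but I have not tested that against this page.

```ruby
# Hypothetical title containing an en dash (U+2013), held as raw bytes.
raw = "Show HN \u2013 demo".b

# Decoding the bytes as ISO-8859-1 maps each byte to its own character.
garbled = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Tagging the same bytes as UTF-8 keeps the dash intact, so no sub() is needed.
title = raw.dup.force_encoding("UTF-8")

puts garbled.include?("\u00E2\u0080\u0093")  # the sequence the regex looks for
puts title.include?("\u2013")
puts title.valid_encoding?
```

If a string has already been garbled this way, `garbled.encode("ISO-8859-1").force_encoding("UTF-8")` reverses it under the same assumption.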

If you have any responses feel free to comment back here!
