@JakeAustwick
Created December 29, 2011 22:29
{"generator"=>"WordPress 3.2.1", "description"=>"Mumbo Design will dramatically increase your ebay sales by providing you with ebay templates that will blow your competition away! We have a selection of free ebay listing templates as well as premium ebay templates. We design our templates to be eye catching and tempting. If you want a custom designed ebay template, it is not a problem. We can customise your ebay template to your exact needs. Don\u2019t forget to check out our testimonials to see our customer\u2019s comments about the quality of our work. We guarantee that all of our templates are unique and are coded to the required standard. All our work is done in-house by experienced designers.", "keywords"=>"ebay template, auction template, ebay listing, free ebay templates, ebay template design, ebay designer, ebay auction template, ebay html template, ebay designs, free ebay design, free ebay listing, free ebay, free listing, premium ebay template, ebay templates, free ebay template maker"}
"Mumbo Design \u2014 Ebay Templates | Ebay Template Design"
"Mumbo Design \u2014 Ebay Templates | Ebay Template Design"
["ebay template", "auction template", "ebay listing", "free ebay templates", "ebay template design", "ebay designer", "ebay auction template", "ebay html template", "ebay designs", "free ebay design", "free ebay listing", "free ebay", "free listing", "premium ebay template", "ebay templates", "free ebay template maker"]
[Finished]
require './meta_grabber'
grabber = MetaGrabber.new('http://www.mumbodesign.com')
grabber.grab_meta
p grabber.meta
p grabber.meta_title
p grabber.title
p grabber.keywords_array
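require 'nokogiri'
require 'open-uri'

# Fetches a page and exposes its <title> and named <meta> tags.
class MetaGrabber
  attr_reader :doc, :meta

  def initialize(url)
    @url = url
    # URI.open comes from open-uri; a bare Kernel#open no longer fetches URLs on Ruby 3.0+.
    @doc = Nokogiri::HTML.parse(URI.open(@url))
    @meta = {}
  end

  def title
    @title ||= @doc.xpath('//title').text
  end

  # Some sites use <meta name="title" ... /> for some weird reason.
  # Falls back to the <title> element when no such meta tag was grabbed.
  def meta_title
    @meta['title'] ||= title
  end

  # Collects every named meta tag into @meta, keyed by the lowercased name.
  def grab_meta
    @doc.xpath('//meta').each do |tag|
      next unless tag[:name] # skip tags without a name (http-equiv, charset, etc.)

      meta[tag[:name].to_s.downcase] = tag[:content]
    end
  end

  # Returns the keywords meta tag as an array of trimmed strings, or nil if absent.
  def keywords_array
    @meta['keywords']&.split(',')&.map(&:strip)
  end

  def common_words
    # TODO: use readability to get the main content, then strip out stop words like "and", "etc."
  end
end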
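
The common_words stub above is left unimplemented. One possible way to fill in the counting half is sketched below. It assumes the main content has already been extracted to plain text (the readability step is not shown), and the STOP_WORDS list, the method signature, and the limit parameter are all hypothetical, not part of the original gist.

```ruby
# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = %w[a an and are as at be by for in is it of on or the to with etc].freeze

# Returns up to `limit` of the most frequent words in `text`, ignoring stop words.
# Ties are broken alphabetically so the result is deterministic.
def common_words(text, stop_words: STOP_WORDS, limit: 10)
  # Lowercase the text and pull out runs of letters, digits and apostrophes.
  words = text.downcase.scan(/[a-z0-9']+/)

  # Count every word that is not a stop word.
  counts = words.each_with_object(Hash.new(0)) do |word, tally|
    tally[word] += 1 unless stop_words.include?(word)
  end

  # Highest count first, then alphabetical; keep only the words.
  counts.sort_by { |word, count| [-count, word] }.first(limit).map(&:first)
end

p common_words('Free ebay templates and premium ebay templates for ebay sellers', limit: 3)
```

Inside MetaGrabber, a method like this could be fed the text of the extracted content node; which node counts as "main content" is exactly the part the original comment delegates to readability.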