@pricees
Created November 13, 2012 19:54
RSS Feed processor class
require 'rss'
require 'open-uri'
# Reads and processes RSS feeds.
#
# Examples
#
#   # GOOD: one element per feed, each holding one array per item
#   Feed.run("http://celebuzz.com/feed", 1)
#   # => [ [ [ "[link 1]", "[title 1]", "[description 1]", "[image link 1]" ] ] ]
#
#   Feed.run("http://celebuzz.com/feed", 4, [ :link, :title, :description ])
#   # => [ [ [ "[link 1]", "[title 1]", "[description 1]", "[image link 1]" ],
#   #        [ "[link 2]", "[title 2]", "[description 2]", "[image link 2]" ],
#   #        [ "[link 3]", "[title 3]", "[description 3]", "[image link 3]" ],
#   #        [ "[link 4]", "[title 4]", "[description 4]", "[image link 4]" ] ] ]
#
#   # BAD: a feed that fails yields an :exception array instead
#   Feed.run("http://celebuzz.com/fe", 1)
#   # => [ [ :exception, "http://celebuzz.com/fe", "404..." ] ]
class Feed
  # Public: Initialize a feed
  #
  # options - Optional values to instantiate attributes to
  #           :source - RSS url
  #
  # Examples
  #
  #   Feed.new :source => "http://foo.com"
  #   # => #<Feed..
  #
  # Returns a Feed instance
  def initialize(options = {})
    @source = options[:source] if options.key?(:source)
  end

  # Public: Sets attributes hash with channel-level values from rss feed
  #
  # attrs - Array of attribute keys as symbols
  #
  # Examples
  #
  #   feed.info([ :link ])
  #   # => [ :link ]
  #
  # Returns the attrs array
  def info(attrs = [:link, :title, :description])
    # attributes is private, so call it without an explicit receiver
    attrs.each { |x| attributes[x] = channel.public_send(x) }
  end

  # Public: Return an array of values from an rss feed
  #
  # n     - number of items to return
  # attrs - array of attributes to return for each rss item
  #
  # NOTE: the src of the first image in the description is appended to each
  # item's array (nil when the description has no image). The result is
  # memoized, so later calls on the same instance ignore n and attrs.
  #
  # Example
  #
  #   feed.to_ary(1, [ :link ])
  #   # => [ [ "http://example.net", "http://example.net/image.png" ] ]
  #
  # Returns an array of arrays
  def to_ary(n = 5, attrs = [:link, :title])
    @ary ||= channel.items.first(n).map do |item|
      tmp = attrs.map { |attr| item.public_send(attr) }
      # Find the first image; a "|" inside a character class is a literal
      # pipe, so the quote class is just ["']
      tmp << item.description[/<img [^>]*src=["']([^"']+)/i, 1] if item.description
      tmp
    end
  end

  # Public: Processes feeds and returns array of data
  #
  # feeds - One or more feed urls
  # n     - number of items to return per feed
  # attrs - array of attributes to return for each rss item
  #
  # Examples
  #
  #   Feed.run("http://example.net")
  #   # => [ [ [ "http://example.net/item1.html", ..., "[image]" ], ... ] ]
  #
  # NOTE: a feed that fails is returned as [ :exception, source, message ]
  # Returns an array with one element per feed
  def self.run(feeds, n = 5, attrs = [ :link, :title, :description ])
    Array(feeds).map do |source|
      begin
        new(source: source).to_ary(n, attrs)
      rescue StandardError => blasto
        # StandardError covers HTTP, socket, and parse errors without
        # swallowing interrupts or exit signals
        [ :exception, source, blasto.to_s ]
      end
    end
  end

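  private

  attr_accessor :content
  attr_accessor :source

  def channel
    @channel ||= parse.channel
  end

  # Fetches the feed body and parses it without validation
  def parse
    # URI.open comes from open-uri; Kernel#open no longer accepts URLs on Ruby 3.0+
    URI.open(source) { |s| self.content = s.read }
    RSS::Parser.parse(content, false)
  end

  def attributes
    @attributes ||= {}
  end
end
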
if $PROGRAM_NAME == __FILE__
  feeds = [
    "http://celebuzz.com/feed", # good
    "http://celebuzz.com/fee",  # bad
  ]
  p Feed.run(feeds, 1)
end
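
The image lookup in to_ary can be tried on its own, without fetching a feed. The following is a minimal sketch, assuming a standalone helper named first_image_src (a hypothetical name, not part of the gist) that applies roughly the same pattern to an HTML fragment. It accepts any whitespace after `<img`, which is slightly more permissive than the single space the class matches.

```ruby
# Hypothetical helper (not part of the gist): returns the src of the first
# <img> tag in an HTML fragment, or nil when no image is present.
def first_image_src(html)
  return nil if html.nil?

  # String#[] with a regexp and a group index returns that capture group,
  # or nil when the pattern does not match.
  html[/<img\s[^>]*?src=["']([^"']+)/i, 1]
end

first_image_src('<p>Hi</p><img class="a" src="http://example.net/a.png">')
# => "http://example.net/a.png"
first_image_src('<p>no image here</p>')
# => nil
```

Using String#[] rather than =~ and $1 keeps the capture local to the expression, so no global match state is involved.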