@wallerdev
Created January 26, 2012 03:13
# This script checks Craigslist RSS feeds to see whether units in certain apartment complexes become available.
# Customize the Apartments, Periods and Bedrooms constants below to change the search parameters.
require 'rubygems'
require 'htmlentities'
require 'rss'
require 'open-uri'

# Each constant lists several patterns so that alternate spellings and phrasings of the same concept still match.
Apartments = {
  "Beaumont" => [/beaumont/i, /baeumont/i],
  "The Oaks" => [/oaks/i, /oak's/i],
  "Abbott Pointe" => [/abbott point/i, /abbot point/i, /abott point/i]
}
Periods = [/spring/i, /jan/i, /dec/i, /second semester/i]
Bedrooms = [/2 bed/i, /2 br/i, /2br/i, /2 bd/i, /2bd/i]

class ApartmentFinder
  # Find apartments based on the RSS feed URLs given.
  def find(urls)
    urls.each do |url|
      # URI.open comes from open-uri; on Ruby 3.0 and later a bare Kernel#open no longer opens URLs.
      URI.open(url) do |rss|
        feed = RSS::Parser.parse(rss)
        find_from_feed(feed)
      end
    end
  end

  # Check each item in the feed.
  def find_from_feed(feed)
    feed.items.each do |entry|
      checker = AdChecker.new(entry.title, entry.description, entry.link)
      checker.check_ad
    end
  end
end

# Holds one ad (title, body and link) and reports which of the patterns above it matches.
class AdChecker
  def initialize(title, content, url)
    # Feed text arrives HTML-escaped, so decode entities before matching.
    coder = HTMLEntities.new
    @title = coder.decode(title)
    @content = coder.decode(content)
    @url = url
  end

  # Returns true if the title or the content matches any of the given regexes.
  def check_title_and_content(regexes)
    regexes.any? { |regex| @title =~ regex || @content =~ regex }
  end

  # Prints the ad, tagged with period and bedroom matches, for every apartment complex it mentions.
  def check_ad
    Apartments.each do |name, regexes|
      next unless check_title_and_content(regexes)

      print "Apartment found at #{name}"
      print " ( SPRING )" if check_title_and_content(Periods)
      print " ( 2 BEDROOM )" if check_title_and_content(Bedrooms)
      puts "\n", @title, @url, "\n"
    end
  end
end

apartment_finder = ApartmentFinder.new
apartment_finder.find([
  'http://lansing.craigslist.org/apa/index.rss',
  'http://lansing.craigslist.org/sub/index.rss'
])
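
The matching step can be tried without fetching a feed or installing any gems. The sketch below is an illustration rather than part of the original script. It copies the pattern lists into new constants (with the case-insensitive flag on every pattern). `classify_ad` is a hypothetical helper and both sample ads are made up. Unlike `check_ad`, which reports every complex an ad mentions, this helper stops at the first matching complex.

```ruby
# Patterns copied from the script's constants; the names here are illustrative.
APARTMENTS = {
  "Beaumont"      => [/beaumont/i, /baeumont/i],
  "The Oaks"      => [/oaks/i, /oak's/i],
  "Abbott Pointe" => [/abbott point/i, /abbot point/i, /abott point/i]
}
PERIODS  = [/spring/i, /jan/i, /dec/i, /second semester/i]
BEDROOMS = [/2 bed/i, /2 br/i, /2br/i, /2 bd/i, /2bd/i]

# Hypothetical helper: returns the labels that apply to one ad,
# or nil when no apartment complex is mentioned.
def classify_ad(title, content)
  # Join with a newline so no pattern can match across the title/content boundary.
  text = "#{title}\n#{content}"
  name, _regexes = APARTMENTS.find { |_, regexes| regexes.any? { |re| text =~ re } }
  return nil unless name

  {
    apartment:   name,
    spring:      PERIODS.any? { |re| text =~ re },
    two_bedroom: BEDROOMS.any? { |re| text =~ re }
  }
end

# Made-up sample ads, not real listings.
p classify_ad("Sublease at Abbot Pointe", "2BR available for second semester")
p classify_ad("Downtown studio", "Available now")
```

The first call returns a hash naming "Abbott Pointe" with both flags set, and the second returns nil. Because the patterns are plain substring matches, a short one such as `/dec/i` also fires on words like "decorated", so a SPRING tag is a hint rather than a guarantee.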