@sarchertech
Created April 17, 2011 03:21
Quick script I wrote to find stores that share a name and may be franchises.
require 'yaml'
# Plain record for one store; instances are deserialized from the YAML files.
class Store
  attr_accessor :name, :address, :city, :state, :zip, :phone_number
end
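# Load every *.yml file in the current directory and concatenate the stores.
# Each file is expected to hold an array of Store objects.
def list_of_stores
  files = Dir.glob('*.yml')
  stores = []
  files.each do |file|
    # Psych 4+ (Ruby 3.1+) rejects custom classes in YAML.load;
    # YAML.unsafe_load is the equivalent there, for trusted files only.
    File.open(file, 'r') {|f| stores += YAML.load(f)}
  end
  return stores
end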
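# Remove stores that match an earlier store on name prefix, zip, phone number,
# and address prefix. Mutates the array and returns the number removed.
def delete_duplicates(stores)
  seen = []
  marker = []
  counter = 0
  stores.each do |store|
    # Prefixes tolerate small spelling differences in names and addresses.
    attr_array = [store.name[0..5], store.zip, store.phone_number, store.address[0..3]]
    if seen.include?(attr_array)
      marker << store
      counter += 1
    else
      seen << attr_array
    end
  end
  marker.each {|m| stores.delete(m)}
  return counter
end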
# Print every store name that occurs more than once, most frequent first,
# with the state of the first store seen under that name.
def print_multi_stores(stores)
  seen = {}
  stores.each do |store|
    if seen.has_key?(store.name)
      seen[store.name][0] += 1
    else
      seen[store.name] = [1, store.state]
    end
  end
  #sorting converts hash to array of [name, [count, state]] pairs
  seen = seen.sort_by {|k,v| v[0]}
  seen.reverse!
  puts ""
  puts "multi stores"
  puts "-------------"
  seen.each do |k,v|
    num, state = v
    puts num.to_s + "--" + state + "--" + k if num > 1
  end
  puts "-------------"
  puts ""
end
stores = list_of_stores
stores.sort_by! {|s| s.zip}
puts stores.length
puts "deleted " + delete_duplicates(stores).to_s + " duplicate stores"
print_multi_stores(stores)

coty commented Apr 20, 2011

I think you could use 1.9's uniq! with a block to simplify your delete_duplicates method: https://gist.github.com/932961
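
For readers who cannot open the link, here is a minimal sketch of what that suggestion could look like. It is not the contents of the linked gist. The `Store` Struct and the sample records are hypothetical stand-ins so the snippet runs on its own; the key array is copied from the original `delete_duplicates`.

```ruby
# Hypothetical stand-in for the gist's Store class, so the sketch is self-contained.
Store = Struct.new(:name, :address, :city, :state, :zip, :phone_number)

# Same fuzzy key as the original: name prefix, zip, phone number, address prefix.
def delete_duplicates(stores)
  before = stores.length
  # uniq! with a block keeps the first store for each distinct key and removes
  # later ones in place. It returns nil when nothing was removed, so the count
  # is computed from the lengths instead of from the return value.
  stores.uniq! { |s| [s.name[0..5], s.zip, s.phone_number, s.address[0..3]] }
  before - stores.length
end

# Made-up sample data: the first two records share the same key.
stores = [
  Store.new("Acme Hardware", "12 Main St", "Springfield", "IL", "62701", "555-0100"),
  Store.new("Acme Hardware Inc", "12 Main Street", "Springfield", "IL", "62701", "555-0100"),
  Store.new("Acme Hardware", "900 Oak Ave", "Peoria", "IL", "61602", "555-0199")
]

removed = delete_duplicates(stores)
puts "deleted #{removed} duplicate stores"
```

Like the original, this mutates the array and returns the number of stores removed, so the call at the bottom of the script would not need to change.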

sarchertech (author) commented Apr 21, 2011 via email
