Created
April 17, 2011 03:21
Quick script I wrote to find stores with the same name that may be franchises.
require 'yaml'

class Store
  attr_accessor :name, :address, :city, :state, :zip, :phone_number
end

def list_of_stores
  files = Dir.glob('*.yml')
  stores = []
  files.each do |file|
    File.open(file, 'r') {|f| stores += YAML.load(f)}
  end
  return stores
end

def delete_duplicates(stores)
  seen = []
  marker = []
  counter = 0
  stores.each do |store|
    attr_array = [store.name[0..5], store.zip, store.phone_number, store.address[0..3]]
    if seen.include?(attr_array)
      marker << store
      counter += 1
    else
      seen << attr_array
    end
  end
  marker.each {|m| stores.delete(m)}
  return counter
end

def print_multi_stores(stores)
  seen = {}
  stores.each do |store|
    if seen.has_key?(store.name)
      seen[store.name][0] += 1
    else
      seen[store.name] = [1, store.state]
    end
  end
  seen = seen.sort_by {|k,v| v[0]}
  seen.reverse!
  puts ""
  puts "multi stores"
  puts "-------------"
  # sorting converts hash to array of arrays
  seen.each do |k,v|
    num, state = v
    puts num.to_s + "--" + state + "--" + k if num > 1
  end
  puts "-------------"
  puts ""
end

stores = list_of_stores
stores.sort_by! {|s| s.zip}
puts stores.length
puts "deleted " + delete_duplicates(stores).to_s + " duplicate stores"
print_multi_stores(stores)
Author
sarchertech
commented
Apr 17, 2011
via email
Awesome! Thanks for taking the time to review it. I had no idea that
Hash.new could take a block; I'll have to check out that screencast.
On Sun, Apr 17, 2011 at 11:25 AM, JohnFord wrote:
So, here are my mods: https://gist.github.com/923754
At first, I was focused on the block at line 45 in your code. Often in Ruby, you can eliminate the whole "if the key doesn't exist, initialize it; otherwise do something else" idiom by telling Ruby ahead of time what to do whenever it encounters a key it hasn't seen. In this case, I'm passing Hash.new a block that I want it to run whenever it encounters a new key. That block, in turn, creates yet another hash that initializes any new key's value to 0. So calling seen['New Store']['GA'] += 1 will automagically create:
{ 'New Store' => { 'GA' => 1 } } without us explicitly initializing either hash.
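A minimal sketch of the nested-default idea described above may help. It is an illustration only, not the contents of the linked gist; the store names and states are made up, and the variable name seen is borrowed from the original script.

```ruby
# Outer hash: when a store name is missing, the block runs and stores
# a fresh inner hash under that name.
# Inner hash: Hash.new(0) makes any missing state read as 0,
# so += 1 works the first time a state is seen.
seen = Hash.new { |hash, name| hash[name] = Hash.new(0) }

# Hypothetical sample data, not taken from the original YAML files.
seen['New Store']['GA'] += 1
seen['New Store']['GA'] += 1
seen['New Store']['FL'] += 1

# seen now equals { 'New Store' => { 'GA' => 2, 'FL' => 1 } }
```

The block form matters for the outer hash: Hash.new(Hash.new(0)) would hand every missing key the same shared inner hash and would not store it under the key, whereas the block builds and assigns a separate inner hash per store name.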
Dave Brady actually posted a screencast with more detail on this just the other day: http://www.heartmindcode.com/blog/2011/04/creating-ruby-hashes/ (There's also a follow-up with JEG2 on Dave's site that's worth watching.)
It ended up being a bit messier than I originally intended since I decided to keep a separate count for each state, as well.
I think you could use 1.9's uniq! with a block to simplify your delete_duplicates method: https://gist.github.com/932961
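For readers who cannot open the linked gist, here is a rough sketch of how uniq! with a block could replace the bookkeeping in delete_duplicates. It is an assumption about the approach, not a copy of that gist. The Struct below is a hypothetical stand-in for the Store class, and the sample records are invented.

```ruby
# Hypothetical stand-in for the Store class in the original script.
Store = Struct.new(:name, :address, :city, :state, :zip, :phone_number)

# Removes stores whose comparison key matches an earlier store's key,
# modifying the array in place. Returns how many stores were removed.
def delete_duplicates(stores)
  before = stores.length
  # uniq! with a block (Ruby 1.9.2+) keeps the first element for each
  # distinct block result and drops the later ones.
  stores.uniq! { |s| [s.name[0..5], s.zip, s.phone_number, s.address[0..3]] }
  before - stores.length
end

# Invented sample data: the first two records share the same key.
stores = [
  Store.new('Acme Hardware', '12 Main St', 'Atlanta', 'GA', '30301', '555-0100'),
  Store.new('Acme Hardware Inc', '12 Main Street', 'Atlanta', 'GA', '30301', '555-0100'),
  Store.new('Acme Hardware', '99 Oak Ave', 'Macon', 'GA', '31201', '555-0199')
]
removed = delete_duplicates(stores)
```

Note that uniq! returns nil when nothing was removed, which is why the sketch computes the count from the array length instead of relying on the return value.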
Nice refactor, I believe I'll use it.
On Wed, Apr 20, 2011 at 5:56 PM, coty wrote:
I think you could use 1.9's uniq! with a block to simplify your delete_duplicates method: https://gist.github.com/932961