Skip to content

Instantly share code, notes, and snippets.

@jqr
Created May 6, 2010 16:05
Show Gist options
  • Save jqr/392297 to your computer and use it in GitHub Desktop.
Save jqr/392297 to your computer and use it in GitHub Desktop.
module FindDuplicates
# Returns a hash of duplicate values and their counts, select can be
# anything that evaluates to a single SQL column.
# Warning: select and minimum_count are not SQL escaped!
#
# User.find_duplicates(:username) # { 'jqr' => 2 }
# User.find_duplicates(LOWER(city), 5) # { 'indianaplis' => 25, 'new york' => 201 }
def find_duplicates(select, minimum_count = 2)
rows = connection.select_all("SELECT #{select} AS value, COUNT(*) AS count FROM #{table_name} GROUP BY #{select} HAVING COUNT(*) >= #{minimum_count}")
rows.inject({}) do |acc, row|
acc[row['value']] = row['count'].to_i
acc
end
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment