@DiegoSalazar
Created February 6, 2015 03:00
Print out hotspots from some kind of data file
#!/usr/bin/env ruby
# from the command line run:
# ruby ./find_hotspots.rb FILE_NAME THRESHOLD
class Array
  # collapse a list of integers into ranges of consecutive values
  def to_ranges
    compact.sort.uniq.inject([]) do |r, x|
      r.empty? || r.last.last.succ != x ? r << (x..x) : r[0..-2] << (r.last.first..x)
    end
  end
end
# read arguments from the command line
data_file = ARGV[0]
threshold = ARGV[1].to_f # to_f keeps fractional thresholds such as 0.5
# initialize a dictionary mapping row number => aggregation value
aggregations = {}
# go through each line of the file along with its zero-based row number
File.readlines(data_file).each_with_index do |line, row_number|
  next if row_number == 0 # skip the first row
  # split the line on whitespace into an array of columns
  columns = line.split(/\s+/)
  # store the 6th column's value under this row number
  aggregations[row_number] = columns[5]
end
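# split the entries into those above the threshold (true) and the rest (false)
grouped = aggregations.group_by do |line_number, aggr_value|
  aggr_value.to_f > threshold
end
# Grab just the row numbers: [[ROW, aggr], ...] becomes [ROW, ...].
# grouped[true] is nil when no row exceeds the threshold, so fall back to [].
hotspots = (grouped[true] || []).map do |group|
  group[0]
end
# print each range of consecutive hotspot rows on its own line
puts hotspots.to_ranges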
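
The range-collapsing step is the least obvious part of the script, so here is a minimal sketch of the same idea on made-up row numbers. `rows_to_ranges` is a hypothetical helper, not part of the gist; it uses `Enumerable#slice_when` (Ruby 2.2 or later) instead of reopening `Array`, and the sample input is invented for illustration.

```ruby
# Hypothetical helper (not from the gist): groups consecutive integers
# into ranges, like Array#to_ranges, without monkey-patching Array.
def rows_to_ranges(rows)
  rows.compact.sort.uniq
      .slice_when { |prev, curr| curr != prev.succ } # cut where the run breaks
      .map { |run| run.first..run.last }             # one range per run
end

# Invented sample: unsorted row numbers with a nil and a duplicate.
ranges = rows_to_ranges([7, 3, 4, 5, nil, 9, 10, 3])
puts ranges.inspect # prints [3..5, 7..7, 9..10]
```

A single isolated row still comes back as a one-element range such as `7..7`, which matches what the script's `to_ranges` produces.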