Skip to content

Instantly share code, notes, and snippets.

@timuruski
Last active December 14, 2015 18:27
Show Gist options
  • Save timuruski/3824eca2861e4c4c15fc to your computer and use it in GitHub Desktop.
Save timuruski/3824eca2861e4c4c15fc to your computer and use it in GitHub Desktop.
A dumb algorithm for compressing blocks of sorted integers, like row IDs.
# This algorithm takes a *sorted* list of integers, probably record
# IDs and compresses them into a series of ranges. It also includes a
# verification check that no values were lost.
def compress(ary)
range_start = ary.first
compressed = []
ary.each_with_index do |value, index|
# TODO: This could skip indices that have already been considered.
# The value + 2 forces the last cunk to flush at the end.
next_value = ary.fetch(index + 1) { value + 2 }
if next_value - value > 1
chunk = (range_start..value)
range_start = next_value
compressed << chunk
end
end
compressed
end
def verify(compressed, uncompressed)
lost = uncompressed.reject { |n|
compressed.any? { |r|
r.include?(n)
}
}
raise "Lost values: #{lost.join(' ')}" unless lost.empty?
compressed
end
def compress!(ary)
compress(ary).tap { |compressed|
verify(compressed, ary)
}
end
# Process ARGF
ids = ARGF.readlines.map(&:to_i).sort
puts compress!(ids)
@timuruski
Copy link
Author

Takes input like:

1
2
3
5
6
7
9
10

And produces:

1..3
5..7
9..10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment