#!/usr/bin/ruby -w
require 'set'

# One record from the input. Struct provides value-based ==, eql? and hash,
# so two entries with the same fields count as the same Set member.
Entry = Struct.new :id, :instance do
  def self.parse(line)
    if /ID=\s*'([^']*)'\s+INSTANCE=\s*'([^']*)'/ =~ line
      new $1, $2
    else
      raise "Cannot parse: %p" % line
    end
  end
end

entries = Set.new

ARGV.each do |file|
  File.foreach file do |line|
    begin
      entry = Entry.parse(line)
      # Print any line whose entry has already been seen
      if entries.include? entry
        puts line
      else
        entries << entry
      end
    rescue
      # Ignore lines that don't parse
    end
  end
end
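The script relies on `Struct` instances comparing and hashing by value, which is what lets a `Set` detect repeated entries. A minimal sketch of that idea follows; the sample lines, the `LINES` and `PATTERN` constants, and the `find_duplicates` helper are hypothetical and only illustrate the technique on an input format assumed from the regex.

```ruby
require 'set'

# Same shape as the Entry struct in the script above.
Entry = Struct.new(:id, :instance)

# Same regex as in the script above.
PATTERN = /ID=\s*'([^']*)'\s+INSTANCE=\s*'([^']*)'/

# Hypothetical sample input; the real data format is an assumption.
LINES = [
  "ID='a1' INSTANCE='x'",
  "ID='a2' INSTANCE='y'",
  "ID= 'a1'  INSTANCE= 'x'", # same values as the first line, different spacing
  "not an entry"             # does not match and is skipped
]

# Returns the lines whose (id, instance) pair was already seen.
def find_duplicates(lines)
  seen = Set.new
  dups = []
  lines.each do |line|
    m = PATTERN.match(line)
    next unless m
    entry = Entry.new(m[1], m[2])
    # Set#add? returns nil when an equal element is already present
    dups << line unless seen.add?(entry)
  end
  dups
end

puts find_duplicates(LINES)
```

Using `Set#add?` folds the membership check and the insertion into one call; the original script does the same work with `include?` followed by `<<`.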
And with Robert's original:
$ /usr/bin/time -v ruby doit_orig.rb input > /dev/null
Command being timed: "ruby doit_orig.rb input"
User time (seconds): 16.28
System time (seconds): 0.19
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.50
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 913344
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 57191
Voluntary context switches: 3
Involuntary context switches: 1656
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
So the Set-based version brings no memory savings over Robert's original, but it is faster.
With 1,000,000 entries and 452 duplicates: