@billdueber
Created September 22, 2010 15:38
# The current version. Calling self.index(field) inside the loop makes this O(n^2)!
# Rebuild the HashWithChecksumAttribute with the current
# values of the fields Array
def reindex
  @tags = {}
  self.each do |field|
    @tags[field.tag] ||= []
    @tags[field.tag] << self.index(field) ##### AAAAAAAHHHHHHHH ####
  end
  @clean = true
end
### Instead, use this ###
def reindex
  @tags = {}
  self.each_with_index do |field, i|
    @tags[field.tag] ||= []
    @tags[field.tag] << i
  end
  @clean = true
end
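# A minimal, self-contained sketch of the difference between the two versions
# above, runnable without ruby-marc. Field and both helper method names are
# hypothetical stand-ins, not part of the library. Besides the extra scan per
# field, Array#index returns the first element that is == to its argument, so
# in this sketch a repeated, equal field is recorded at the wrong position.

```ruby
# Hypothetical stand-in for a MARC field; Struct compares members with ==.
Field = Struct.new(:tag, :value)

# Quadratic version: Array#index rescans the array from the start for every field.
def tag_index_with_index_lookup(fields)
  tags = {}
  fields.each do |field|
    (tags[field.tag] ||= []) << fields.index(field)
  end
  tags
end

# Linear version: each_with_index hands over the position directly.
def tag_index_with_each_with_index(fields)
  tags = {}
  fields.each_with_index do |field, i|
    (tags[field.tag] ||= []) << i
  end
  tags
end

fields = [
  Field.new('245', 'Title'),
  Field.new('700', 'Smith'),
  Field.new('700', 'Jones'),
  Field.new('700', 'Smith') # equal (==) to the field at position 1
]

slow = tag_index_with_index_lookup(fields)
fast = tag_index_with_each_with_index(fields)

puts "index lookup:    #{slow.inspect}"
puts "each_with_index: #{fast.inspect}"
```

# In this sketch the two results differ for tag '700': the index-lookup version
# records position 1 twice, because the last field is == to the one at position 1.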
### Benchmarking by number of tags
require 'benchmark'
require 'marc'

tags = ['001', '005', '100', '110', '111', '240', '243', '245', '700', '710', '711']
rec = MARC::Reader.new('batch.dat').first
Benchmark.bm(22) do |x|
  (1..tags.size).each do |numtags|
    tagset = tags[0...numtags]
    # Indexed lookup: rebuild the tag index once per iteration, then look up each tag.
    x.report("#{numtags} tag with #fields") do
      10000.times do
        rec.reindex
        tagset.each do |tag|
          t = rec.fields(tag)
        end
      end
    end
    # Linear scan: walk every field in the record once per tag.
    x.report("#{numtags} tag with #find_all") do
      10000.times do
        tagset.each do |tag|
          t = rec.find_all { |f| f.tag == tag }
        end
      end
    end
    puts ""
  end
end
# Results (columns: user, system, total, real; all in seconds)
# 1 tag with #fields 0.630000 0.010000 0.640000 ( 0.638195)
# 1 tag with #find_all 0.200000 0.000000 0.200000 ( 0.206705)
#
# 2 tag with #fields 0.730000 0.010000 0.740000 ( 0.752267)
# 2 tag with #find_all 0.380000 0.000000 0.380000 ( 0.389939)
#
# 3 tag with #fields 0.840000 0.010000 0.850000 ( 1.234076)
# 3 tag with #find_all 0.580000 0.000000 0.580000 ( 0.679571)
#
# 4 tag with #fields 0.910000 0.010000 0.920000 ( 1.014209)
# 4 tag with #find_all 0.760000 0.010000 0.770000 ( 0.796539)
#
# 5 tag with #fields 1.010000 0.010000 1.020000 ( 1.040470)
# 5 tag with #find_all 0.930000 0.000000 0.930000 ( 0.972256)
#
# 6 tag with #fields 1.090000 0.010000 1.100000 ( 1.105366)
# 6 tag with #find_all 1.120000 0.010000 1.130000 ( 1.180861)
#
# 7 tag with #fields 1.190000 0.010000 1.200000 ( 1.614809)
# 7 tag with #find_all 1.310000 0.010000 1.320000 ( 1.359643)
#
# 8 tag with #fields 1.270000 0.010000 1.280000 ( 1.346432)
# 8 tag with #find_all 1.500000 0.020000 1.520000 ( 1.579795)
#
# 9 tag with #fields 1.360000 0.010000 1.370000 ( 1.420000)
# 9 tag with #find_all 1.680000 0.010000 1.690000 ( 1.863628)
#
# 10 tag with #fields 1.460000 0.010000 1.470000 ( 1.665508)
# 10 tag with #find_all 1.870000 0.020000 1.890000 ( 1.976831)
#
# 11 tag with #fields 1.520000 0.010000 1.530000 ( 1.572052)
# 11 tag with #find_all 2.060000 0.020000 2.080000 ( 2.151522)
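# A rough back-of-the-envelope reading of the numbers above, as a sketch only:
# it assumes both timings grow linearly with the number of tags and uses just
# the 1-tag and 11-tag rows (third numeric column, total CPU time). Under that
# assumption the estimated break-even falls between 5 and 6 tags, which matches
# the rows where #fields first beats #find_all.

```ruby
# Total CPU seconds per 10,000 iterations, copied from the 1-tag and 11-tag rows.
fields_1, fields_11     = 0.64, 1.53
find_all_1, find_all_11 = 0.20, 2.08

# Average extra cost of each additional tag (10 steps between 1 and 11 tags).
fields_per_tag   = (fields_11 - fields_1) / 10.0
find_all_per_tag = (find_all_11 - find_all_1) / 10.0

# Solve fields_1 + a * (n - 1) = find_all_1 + b * (n - 1) for n.
break_even = 1 + (fields_1 - find_all_1) / (find_all_per_tag - fields_per_tag)

puts format('#fields:   +%.3fs per extra tag', fields_per_tag)
puts format('#find_all: +%.3fs per extra tag', find_all_per_tag)
puts format('break-even at about %.1f tags', break_even)
```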