Skip to content

Instantly share code, notes, and snippets.

@cvonkleist
Created October 21, 2009 04:50
Show Gist options
  • Save cvonkleist/214881 to your computer and use it in GitHub Desktop.
Save cvonkleist/214881 to your computer and use it in GitHub Desktop.
# This is a method I use when working with slow ruby scripts that
# operate on huge datasets.
#
# It caches the return value of a block of extremely slow code
# in a file so that subsequent runs are fast.
#
# It's indispensable to me when doing edit-debug-edit-debug-edit
# cycles on giant datasets.
# loads ruby object from a cache file or (if cache file doesn't exist) runs
# provided block and writes result to cache file
def cache(file)
if File.exists?(file)
STDERR.puts "(used cache)"
Marshal.load(File.read(file))
else
object = yield
File.open(file, 'w') { |f| f.print Marshal.dump(object) }
object
end
end
# usage example (loading 8,000 marc-xml files)
STDERR.puts "loading marc files"
marc_records = cache('marc_records.cache') do
marc_records = {}
marc_files.each do |f|
doc = Document.new(File.read(f))
title = doc.elements['//datafield[@tag="245"]/subfield[@code="a"]']
marc_records[f] = title.text
end
marc_records
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment