@rsinger
Created March 8, 2010 15:55
require 'rubygems'
require 'json'

# Reads a tab-separated dump line by line. For each value in the second
# column, collects the top-level keys of the JSON document in the last column.
# For each key whose value is a hash or an array of hashes, also collects the
# child keys under that key's own name.
def inspect_file(file)
  elements = {}
  while (line = file.gets)
    parts = line.chomp.split("\t")
    group = parts[1]
    elements[group] ||= []
    j = JSON.parse(parts.last)
    j.each do |key, value|
      elements[group] << key unless elements[group].include?(key)
      # Only hashes and arrays whose first entry is a hash have child keys.
      children = value.is_a?(Array) ? value : [value]
      next unless children.first.is_a?(Hash)
      elements[key] ||= []
      children.each do |child|
        next unless child.is_a?(Hash) # skip non-hash entries in mixed arrays
        child.each_key do |c|
          elements[key] << c unless elements[key].include?(c)
        end
      end
    end
  end
  elements
end

# The block form closes the file once the scan finishes.
File.open('/Volumes/External 7/shared/open-library/edition-2009-09-11.txt', 'r') do |file|
  fields = inspect_file(file)
  puts fields.inspect
end
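
The script above assumes each dump line is tab-separated, with a grouping value in the second column and a JSON object in the last. The sketch below runs the same idea against two made-up rows so it can be tried without the dump file. The row contents, `SAMPLE_ROWS`, and `sample_keys` are hypothetical names for illustration, and `sample_keys` is a condensed variant rather than the script's exact logic: it records child keys from every hash in an array instead of checking only the first entry.

```ruby
require 'json'
require 'stringio'

# Hypothetical rows: the column layout is inferred from the script,
# and the values are invented for illustration.
SAMPLE_ROWS = [
  ['/b/OL1M', '/type/edition',
   '{"title": "A", "authors": [{"key": "/a/OL1A"}], "subjects": ["x"]}'].join("\t"),
  ['/b/OL2M', '/type/edition',
   '{"title": "B", "notes": {"type": "/type/text", "value": "n"}}'].join("\t")
].join("\n")

# Condensed variant of the key scan: the union of top-level keys per
# second-column value, plus the union of child keys per nested key.
def sample_keys(io)
  seen = Hash.new { |hash, name| hash[name] = [] }
  io.each_line do |line|
    parts = line.chomp.split("\t")
    record = JSON.parse(parts.last)
    seen[parts[1]] |= record.keys
    record.each do |key, value|
      children = value.is_a?(Array) ? value : [value]
      # grep(Hash) keeps only hash entries, so scalars and strings are ignored.
      children.grep(Hash).each { |child| seen[key] |= child.keys }
    end
  end
  seen
end

sample_fields = sample_keys(StringIO.new(SAMPLE_ROWS))
puts sample_fields.inspect
```

Passing a `StringIO` works because the scan only needs an object that yields lines, so the same method accepts an open `File` as well.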