@jpmckinney
Last active February 29, 2016 23:48
Oj::ScHandler and Oj.sc_parse have seen little usage, yet Oj's author considers them the fastest way to parse. I wrote a few handlers to understand how Oj::ScHandler works and to compare its performance to Oj.load. Conclusion: if you want to parse an entire document (usually the case), the simple Oj.load is still the fastest.
require 'json'
require 'oj'
require 'multi_json'
# @see https://github.com/ohler55/oj/blob/master/lib/oj/schandler.rb
# @see https://github.com/ohler55/oj/blob/master/test/test_scp.rb#L37
# @see https://github.com/platzhirsch/metadata-harvester/blob/master/lib/dump_handler.rb
# Prints all calls, identifying hashes and arrays with integers.
class DebugHandler
  def initialize
    @id = 0
  end

  def hash_start(*args)
    debug :hash_start, *args
    @id += 1
  end

  def hash_end(*args)
    debug :hash_end, *args
  end

  def array_start(*args)
    debug :array_start, *args
    @id += 1
  end

  def array_end(*args)
    debug :array_end, *args
  end

  def hash_set(*args)
    debug :hash_set, *args
  end

  def array_append(*args)
    debug :array_append, *args
  end

  def add_value(*args)
    debug :add_value, *args
  end

  def error(*args)
    debug :error, *args
  end

  private

  def debug(*args)
    p args
  end
end
# Constructs Ruby objects and stores all objects added with `add_value` in an array.
class ConservativeHandler
  attr_reader :values

  def initialize
    @values = []
  end

  # @return [Hash] the first argument to `hash_set`
  def hash_start
    {}
  end

  def hash_end
  end

  # @return [Array] the first argument to `array_append`
  def array_start
    []
  end

  def array_end
  end

  def hash_set(h, key, value)
    h[key] = value
  end

  def array_append(a, value)
    a << value
  end

  # Oj seems to call `add_value` only once per document, with the top-level value.
  def add_value(value)
    @values << value
  end

  # Raise a parse error rather than a bare `Exception`, so that a plain
  # `rescue` (which catches `StandardError`) can handle it.
  def error(message, line, column)
    raise Oj::ParseError, "#{message} line #{line} column #{column}"
  end
end
# Constructs Ruby objects and stores one object added with `add_value`.
class ParseHandler
  attr_reader :value

  # @return [Hash] the first argument to `hash_set`
  def hash_start
    {}
  end

  def hash_end
  end

  # @return [Array] the first argument to `array_append`
  def array_start
    []
  end

  def array_end
  end

  def hash_set(h, key, value)
    h[key] = value
  end

  def array_append(a, value)
    a << value
  end

  # Oj seems to call `add_value` only once per document, with the top-level value.
  def add_value(value)
    @value = value
  end

  # Raise a parse error rather than a bare `Exception`, so that a plain
  # `rescue` (which catches `StandardError`) can handle it.
  def error(message, line, column)
    raise Oj::ParseError, "#{message} line #{line} column #{column}"
  end
end
f = File.read('/path/to/name.json')
# Inspect the output of the handlers.
Oj.sc_parse(DebugHandler.new, f)
cnt = ConservativeHandler.new
Oj.sc_parse(cnt, f)
puts JSON.pretty_generate(cnt.values)
cnt = ParseHandler.new
Oj.sc_parse(cnt, f)
puts JSON.pretty_generate(cnt.value)
# Compare the running time of different parsers.
# Each line prints the elapsed time in seconds for 100,000 parses.
t = Time.now; 100_000.times { Oj.sc_parse(ParseHandler.new, f) }; puts Time.now - t
t = Time.now; 100_000.times { Oj.load(f) }; puts Time.now - t
t = Time.now; 100_000.times { MultiJson.load(f) }; puts Time.now - t
t = Time.now; 100_000.times { JSON.load(f) }; puts Time.now - t
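The timing lines above can also be written with Ruby's standard `Benchmark` module, which keeps the measurements in one place. This is only a sketch: `time_parsers` is a hypothetical helper, the sample input and iteration count are arbitrary, and the Oj entry is added only if Oj is already loaded.

```ruby
require 'benchmark'
require 'json'

# Hypothetical helper: times each parser over the same input string.
# `parsers` maps a label to a callable that takes the JSON string.
def time_parsers(input, parsers, iterations: 1_000)
  parsers.each_with_object({}) do |(name, parser), timings|
    timings[name] = Benchmark.realtime { iterations.times { parser.call(input) } }
  end
end

parsers = { 'JSON.parse' => ->(s) { JSON.parse(s) } }
# Only compare against Oj when it has been required elsewhere.
parsers['Oj.load'] = ->(s) { Oj.load(s) } if defined?(Oj)

timings = time_parsers('{"a":[1,2,3],"b":{"c":"d"}}', parsers, iterations: 1_000)
timings.each { |name, seconds| puts format('%-12s %.4fs', name, seconds) }
```

Absolute numbers depend on the machine and the input, so only the relative ordering is meaningful.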
# `Oj.load` is faster than `Oj.sc_parse` if you are parsing a full document.
# `Oj.sc_parse` can be faster if you only want to parse part of a document.
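A minimal sketch of what parsing only part of a document might look like. `KeyCollector` and `values_for_key` are hypothetical names, not part of Oj. The sketch assumes that Oj only invokes the callbacks a handler defines, so leaving out `hash_start` and `array_start` means no Ruby hashes or arrays are built. It has not been benchmarked here.

```ruby
# Hypothetical handler (not part of Oj): keeps only the values stored under one key.
class KeyCollector
  attr_reader :matches

  def initialize(wanted_key)
    @wanted_key = wanted_key
    @matches = []
  end

  # Assumption: Oj calls this for every hash member. Because `hash_start` is
  # not defined here, no Hash is built and the parent argument is ignored.
  def hash_set(_parent, key, value)
    @matches << value if key == @wanted_key
  end
end

# Hypothetical helper: runs the handler over a JSON string.
def values_for_key(json, wanted_key)
  require 'oj' # loaded here so the handler itself has no dependency
  collector = KeyCollector.new(wanted_key)
  Oj.sc_parse(collector, json)
  collector.matches
end
```

Calling `values_for_key(f, 'name')` would then return the values collected for the `name` key. Nested hashes and arrays would presumably arrive as `nil`, since nothing constructs them.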