@binarycleric
Last active April 17, 2017 16:01
JSON parsing is horrible

We had a small public API change at work that caused some of our clients to break. This was entirely unintentional, and the fact that our tests didn't catch it is somewhat embarrassing. We traced the problem to format changes introduced by moving some code from JSON.generate to MultiJson.dump. This led me down a bit of a rabbit hole, and I discovered how different marshalling can be between libraries.

Configuration

Our MultiJson setup uses Oj with the following dump options.

{
                    :mode => :compat,
             :time_format => :ruby,
             :use_to_json => false,
   :bigdecimal_as_decimal => false,
             :use_as_json => true
}
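
For reference, a minimal sketch of how a setup like this might be wired up. MultiJson.use and passing per-call options to MultiJson.dump are standard MultiJson API; applying the options on each call (rather than as a global default) is an assumption for illustration.

require 'multi_json'
require 'oj'

MultiJson.use :oj   # select the Oj adapter

OJ_DUMP_OPTIONS = {
  mode: :compat,
  time_format: :ruby,
  use_to_json: false,
  bigdecimal_as_decimal: false,
  use_as_json: true
}.freeze

# Per-call options are passed straight through to the Oj adapter.
MultiJson.dump({ created_at: Time.now }, OJ_DUMP_OPTIONS)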

Some areas of the code use JSON.generate. Others use ActiveSupport#to_json. A few use Oj.dump directly. It is a total mess and something I'd really like to clean up. Aren't 8-year-old "founder" codebases wonderful?

Parsing Time Objects

WRITEABLE [9] jdaniel(main)> Time.now.to_json
"\"2017-04-17T15:10:13.030+00:00\""

Give me the primitive that is eventually serialized into JSON.

WRITEABLE [10] jdaniel(main)> Time.now.as_json
"2017-04-17T15:11:00.020+00:00"
WRITEABLE [11] jdaniel(main)> JSON.dump Time.now
"\"2017-04-17 15:11:18 +0000\""
WRITEABLE [12] jdaniel(main)> JSON.generate Time.now
"\"2017-04-17 15:11:35 +0000\""
# Uses Oj with the following dump options
# {
#                      :mode => :compat,
#               :time_format => :ruby,
#               :use_to_json => false, # Explicitly override MultiJSON's default
#     :bigdecimal_as_decimal => false,
#               :use_as_json => true
# }
WRITEABLE [13] jdaniel(main)> MultiJson.dump Time.now
"\"2017-04-17T15:11:56.275+00:00\""
# Oj with defaults
WRITEABLE [16] jdaniel(main)> Oj.dump Time.now
"{\"^t\":1492442007.768789000e0}"

With Sequel::Postgres::PGArray

WRITEABLE [19] jdaniel(main)> sgids.class
Sequel::Postgres::PGArray < #<Class:0x007f91465be9e0>
WRITEABLE [20] jdaniel(main)> sgids
[
    [0] "sg-460d6f3a"
]
WRITEABLE [21] jdaniel(main)> sgids.to_json
"[\"sg-460d6f3a\"]"

Give me the primitive that is eventually serialized into JSON.

WRITEABLE [22] jdaniel(main)> sgids.as_json
[
    [0] "sg-460d6f3a"
]
WRITEABLE [23] jdaniel(main)> JSON.dump sgids
"[\"sg-460d6f3a\"]"
WRITEABLE [24] jdaniel(main)> JSON.generate sgids
"[\"sg-460d6f3a\"]"
WRITEABLE [26] jdaniel(main)> MultiJson.dump sgids
"[\"sg-460d6f3a\"]"
WRITEABLE [28] jdaniel(main)> Oj.dump sgids
"{\"^o\":\"Sequel::Postgres::PGArray\",\"delegate_dc_obj\":[\"sg-460d6f3a\"],\"array_type\":\"text\"}"

It is worth noting that the above behavior occurred with MultiJson's default options for the Oj adapter. We changed some dump options to fix this case, but we wanted to understand the root cause.
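
For completeness, a rough sketch of the kind of check that helps while tracking this down. MultiJson.adapter and MultiJson.dump are standard MultiJson API; sgids here is the PGArray from the session above.

require 'multi_json'

# Confirm which adapter MultiJson has picked, then avoid the delegator
# entirely by dumping a plain Array.
MultiJson.adapter            # => MultiJson::Adapters::Oj
MultiJson.dump(sgids.to_a)   # => "[\"sg-460d6f3a\"]"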

Conclusion

Everyone seems to have their own opinions on how JSON should be marshalled. This isn't necessarily a bad thing, but it can lead to some real pain when trying to switch between parsers for performance or uniformity reasons. I'm sure there are numerous other cases out there, but these are the two that have caused us the most pain.

Computers were a mistake.

@colindean

Computers were a mistake.

XD
