@chancancode
Last active December 29, 2015 10:09
Semi-formal ActiveSupport JSON spec

JSON Base Types and JSON Primitives

For the purpose of this document...

  1. A Ruby Class is a JSON base type if it is, or inherits from, one of these Ruby Classes: FalseClass, NilClass, TrueClass, Numeric and String.

  2. A Ruby object is a JSON primitive if and only if it is...

    2.1. An instance of a JSON base type, OR

    2.2. An instance of Array (or an instance of an Array subclass) AND all of its elements are JSON primitives, OR

    2.3. An instance of Hash (or an instance of a Hash subclass) AND all of its keys and values are JSON primitives.
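
The definition above can be read as a recursive predicate. The following is only a sketch: json_primitive? and JSON_BASE_TYPES are hypothetical names that do not appear in ActiveSupport, and cyclic data structures are not handled.

```ruby
# The JSON base types listed above. Matching with `case`/`is_a?` also
# covers instances of their subclasses.
JSON_BASE_TYPES = [FalseClass, NilClass, TrueClass, Numeric, String].freeze

# Hypothetical helper: true if obj is a JSON primitive per the definition above.
def json_primitive?(obj)
  case obj
  when *JSON_BASE_TYPES
    true
  when Array
    obj.all? { |element| json_primitive?(element) }
  when Hash
    obj.all? { |key, value| json_primitive?(key) && json_primitive?(value) }
  else
    false
  end
end

json_primitive?("foo")                    # => true
json_primitive?([1, nil, { "a" => 2 }])   # => true
json_primitive?({ a: 1 })                 # => false (a Symbol key is not a JSON base type)
json_primitive?([Time.now])               # => false
```

Note that a Hash with Symbol keys is not a JSON primitive under this definition, even though most encoders can serialize it.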

as_json Hook

The purpose of this hook is to provide a way for users to customize how objects that are not JSON primitives should be encoded into JSON without having to directly construct the JSON string themselves.

Users can opt-in by implementing an as_json instance method on their custom Classes subject to the following constraints:

  1. It SHOULD return a "meaningful" Ruby representation of the object.

  2. It MUST accept an optional argument options. The meaning of its content is intentionally left undefined, other than that it MUST be a Hash (or an instance of a Hash subclass) when present. Implementations MAY alter their behaviour based on their interpretation of the content of options.

  3. Side-effect-free: It MUST NOT mutate the object or the options Hash (when present).

  4. Idempotent: It MUST return the same representation when called with the same arguments.

The as_json hook is merely considered a "hint" for JSON encoders. Notably, it is NOT required to return a JSON primitive. If the implementation chooses to return a non-primitive, it is up to the encoders to interpret the result.
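
As an illustration of these constraints, a user-defined hook might look like the sketch below. The Money class and the :only option are assumptions made for this example; the spec itself leaves the meaning of options undefined.

```ruby
# Hypothetical domain class, used only for illustration.
class Money
  attr_reader :amount, :currency

  def initialize(amount, currency)
    @amount = amount
    @currency = currency
  end

  # Accepts an optional options Hash and builds a fresh Hash on every call,
  # so neither the receiver nor options is mutated, and repeated calls with
  # the same arguments return equal results.
  def as_json(options = nil)
    result = { "amount" => amount, "currency" => currency }
    # :only is an assumption of this sketch, not something the spec defines.
    return result unless options && options[:only]

    wanted = Array(options[:only]).map(&:to_s)
    result.select { |key, _value| wanted.include?(key) }
  end
end

price = Money.new(5, "USD")
price.as_json                   # => { "amount" => 5, "currency" => "USD" }
price.as_json(only: [:amount])  # => { "amount" => 5 }
```

Here the hook happens to return a JSON primitive, but as noted above it is not required to.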

The ActiveSupport JSON Encoder Interface and the Recursive JSON-ify Algorithm described here assume a "basic implementation" of this hook to be available on all Ruby objects. A possible implementation could be:

class Object
  def as_json(options = nil)
    to_s
  end
end

class FalseClass
  def as_json(options = nil)
    self
  end
end

class NilClass
  def as_json(options = nil)
    self
  end
end

class TrueClass
  def as_json(options = nil)
    self
  end
end

class Numeric
  def as_json(options = nil)
    self
  end
end

class String
  def as_json(options = nil)
    self
  end
end

class Array
  def as_json(options = nil)
    self
  end
end

class Hash
  def as_json(options = nil)
    self
  end
end

Array#as_json and Hash#as_json in the above implementation are examples of as_json hooks that return non-primitives (because their content might not be JSON primitives).

ActiveSupport provides a more extensive set of as_json hooks for built-in types in 'lib/active_support/core_ext/object/json.rb'.

ActiveSupport JSON Encoder Interface

ActiveSupport provides a to_json method on all Ruby objects that serializes them into a JSON string. Under the hood, this method uses a JSON encoder to handle the encoding.

While ActiveSupport ships with a default encoder for this purpose, it also exposes an API for switching to any custom encoder that implements the following interface:

  1. It MUST provide a constructor that accepts an optional argument options, which is the parameter being passed to the original #to_json call. For example, in "My String".to_json(some: "option"), its constructor will be called with {some: "option"} as an argument.

  2. It MUST provide an instance method encode that takes exactly one argument value, which is the object to be encoded into JSON. For example, in "foo".to_json, its encode method will be called with "foo" as the only argument. In addition...

    2.1. It MUST NOT mutate value or options.

    2.2. It MUST return a UTF-8 encoded String that is a valid JSON value as defined in [RFC4627].

    2.3. It SHOULD call as_json on value with options as the only argument (when present) and encode the result appropriately.

    2.4. It SHOULD apply the recursive JSON-ify algorithm (defined below) to the result of 2.3.

    2.5. If value is a nested data structure (such as an Array or Hash), it SHOULD NOT call as_json with options on any of value's children, even if the recursive JSON-ify algorithm is not implemented.

    2.6. It SHOULD encode any resulting JSON primitives from 2.3. (2.4. if implemented) into their closest JSON value representation defined in [RFC4627]. (For example, nil SHOULD map to null, a Hash SHOULD map to an Object, etc.)

    2.7. It MUST encode the unicode characters \u2028 and \u2029 inside any JSON string values in their escaped forms.

    2.8. If ActiveSupport.escape_html_entities_in_json is set to a "truthy" value, it MUST encode the characters ><& inside any JSON string values in their escaped form (\u003e, \u003c and \u0026 respectively). Otherwise, it MUST NOT encode these characters in their escaped form.

    2.9. It MAY raise built-in or encoder-specific errors when it encounters a value that it cannot handle, such as circular data structures.

  3. It SHOULD provide a minimal public interface, ideally exposing only the two required methods, in order to avoid potential conflicts with future additions to this interface.

In code, this is what it might look like:

class SomeEncoder
  def initialize(options = nil)
    @options = options
  end
  
  def encode(value)
    # Return a JSON representation of value here
    @options ? encode_json(value.as_json(@options)) : encode_json(value.as_json)
  end
  
  private
    def encode_json(value)
      # ...
    end
end

Assuming SomeEncoder is a Class conforming to this interface, you can enable it by setting ActiveSupport.json_encoder = SomeEncoder.
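
To make the calling convention concrete, the sketch below shows how a to_json-style entry point might hand off to whichever encoder is configured. MiniSupport and TinyEncoder are hypothetical stand-ins written for this example and are not part of ActiveSupport. TinyEncoder also skips the as_json hook and the escaping rules, so it is not a conforming encoder.

```ruby
require 'json'

# Hypothetical stand-in for the configuration point; not the real ActiveSupport module.
module MiniSupport
  class << self
    attr_accessor :json_encoder
  end

  # Follows the interface described above: options go to the constructor,
  # and the value to be serialized goes to #encode.
  def self.encode(value, options = nil)
    json_encoder.new(options).encode(value)
  end
end

# A deliberately tiny encoder that only handles values the json library
# already understands.
class TinyEncoder
  def initialize(options = nil)
    @options = options
  end

  def encode(value)
    ::JSON.generate(value)
  end
end

MiniSupport.json_encoder = TinyEncoder
MiniSupport.encode({ "a" => [1, nil, true] }, some: "option")
# => '{"a":[1,null,true]}'
```

A new encoder instance is created for each call in this sketch, so an encoder can keep per-call state in instance variables.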

The following issues are intentionally left open-ended:

  1. Whether an encoder could mutate the options hash in its constructor, or store a modified version of options.

  2. Whether an encoder could use the options hash for its own configuration, such as "obj.to_json(pretty: true, indent: 3)".

Recursive JSON-ify Algorithm

The recursive JSON-ify algorithm is designed to recursively transform any Ruby object into a JSON primitive while respecting the user-defined as_json hooks. Virtually all available JSON libraries will be able to encode the output of this algorithm and produce very consistent results.

This algorithm is always called with a single argument obj, which is the object being transformed into a JSON primitive:

  1. If obj is an instance of a JSON base type, return obj.

  2. If obj is a Hash, or if obj is an instance of a Hash subclass...

    2.1. Let cloned be an empty Hash.

    2.2. For each key-value pair in obj traversed in arbitrary order...

      2.2.1. Let stringified_key be the result of converting key into a String. If key is already a String, stringified_key MUST be identical to (i.e. contain exactly the same characters as) key itself; otherwise, the specific algorithm used for this conversion is implementation-specific. The implementation MAY raise an error if its chosen algorithm is unable to convert key into a String.

      2.2.2. Let jsonified_value be the result of applying the recursive JSON-ify algorithm with obj set to value.

      2.2.3. Set cloned[stringified_key] = jsonified_value.

    2.3. Return cloned.

  3. If obj is an Array, or if obj is an instance of an Array subclass...

    3.1. Let cloned be an empty Array.

    3.2. For each value in obj traversed in their original order...

      3.2.1. Let jsonified_value be the result of applying the recursive JSON-ify algorithm with obj set to value.

      3.2.2. Add jsonified_value to the end of cloned.

    3.3. Return cloned.

  4. Otherwise...

    4.1. Let value be the result of calling as_json on obj without options.

    4.2. Let jsonified be the result of applying the recursive JSON-ify algorithm with obj set to value.

    4.3. Return jsonified.

The input obj MUST NOT be mutated.

Since it is possible for this algorithm to get "stuck" (e.g. a non-primitive returning itself in as_json), an implementation MAY pick an arbitrary maximum depth to traverse and raise an error when exceeded.
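
One way such a depth limit might be added is sketched below. NestingError and MAX_DEPTH are hypothetical names and the limit of 100 is arbitrary; the spec leaves both choices to the implementation. Keys are converted with to_s here, which is one of many permitted conversions.

```ruby
# Hypothetical error class and limit; neither is defined by the spec.
class NestingError < StandardError; end
MAX_DEPTH = 100

def jsonify(obj, depth = 0)
  raise NestingError, "exceeded maximum depth of #{MAX_DEPTH}" if depth > MAX_DEPTH

  case obj
  when FalseClass, NilClass, TrueClass, Numeric, String
    obj
  when Hash
    obj.each_with_object({}) do |(key, value), cloned|
      cloned[key.to_s] = jsonify(value, depth + 1)
    end
  when Array
    obj.map { |value| jsonify(value, depth + 1) }
  else
    # Assumes obj responds to as_json, as the spec expects of all objects.
    jsonify(obj.as_json, depth + 1)
  end
end

# A non-primitive whose hook returns itself would otherwise recurse forever.
class Stuck
  def as_json(options = nil)
    self
  end
end

jsonify({ id: 1, tags: ["a", "b"] })  # => { "id" => 1, "tags" => ["a", "b"] }

begin
  jsonify(Stuck.new)
rescue NestingError => e
  e.message  # => "exceeded maximum depth of 100"
end
```

Because the counter increases on every recursive call, the same guard also stops on circular Arrays and Hashes.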

An encoder implementing this algorithm MAY also incorporate this algorithm into its encoding step for performance reasons, so long as it maintains the same semantics.

A possible implementation of this algorithm could be:

def jsonify(obj)
  case obj
  when FalseClass, NilClass, TrueClass, Numeric, String
    obj
  when Hash
    Hash[obj.map { |k, v| [jsonify(k).to_s, jsonify(v)] }]
  when Array
    obj.map { |v| jsonify(v) }
  else
    jsonify(obj.as_json)
  end
end

An important limitation of this algorithm is that it never calls the as_json hooks on instances of JSON base types, Hash, Array or their subclasses. Whether the hooks are called on hash keys is implementation-specific.

Examples

The following examples show some possible implementations of the ActiveSupport JSON encoder interface and the recursive JSON-ify algorithm. They are optimized for clarity and might not be particularly performant.

An encoder that uses an external library

This example shows an encoder that uses the built-in JSON gem as its core:

require 'json'

class JsonGemEncoder
  def initialize(options = nil)
    @options = options
  end

  def encode(value)
    if @options
      escape stringify jsonify value.as_json(@options)
    else
      escape stringify jsonify value.as_json
    end
  end

  private

    def jsonify(obj)
      case obj
      when FalseClass, NilClass, TrueClass, Numeric, String
        obj
      when Hash
        Hash[obj.map { |k, v| [jsonify(k).to_s, jsonify(v)] }]
      when Array
        obj.map { |v| jsonify(v) }
      else
        jsonify(obj.as_json)
      end
    end

    def stringify(jsonified)
      ::JSON.generate(jsonified, quirks_mode: true)
    end

    def escape(str)
      replacements = {
        ">" => '\u003e',
        "<" => '\u003c',
        "&" => '\u0026',
        "\u2028" => '\u2028',
        "\u2029" => '\u2029'
      }

      # Use gsub rather than gsub!, which returns nil when nothing is replaced.
      if ActiveSupport.escape_html_entities_in_json
        str.gsub(/[><&\u2028\u2029]/u, replacements)
      else
        str.gsub(/[\u2028\u2029]/u, replacements)
      end
    end
end

A pure-ruby encoder

This example shows a pure-ruby encoder that does not depend on any external libraries. It also incorporates the recursive JSON-ify algorithm into the encoding process directly:

class PureRubyEncoder
  def initialize(options = nil)
    @options = options
  end

  def encode(value)
    if @options
      stringify value.as_json(@options)
    else
      stringify value.as_json
    end
  end

  private

    def stringify(obj, buffer = '', is_key = false)
      case obj
      when FalseClass
        is_key ? (buffer << %("false")) : (buffer << 'false')
      when NilClass
        is_key ? (buffer << %("null")) : (buffer << 'null')
      when TrueClass
        is_key ? (buffer << %("true")) : (buffer << 'true')
      when Numeric
        is_key ? (buffer << %("#{obj.to_s}")) : (buffer << obj.to_s)
      when String
        buffer << '"' << escape(obj) << '"'
      when Hash
        if is_key
          buffer << '"' << escape(stringify(obj)) << '"'
        else
          buffer << '{'
          obj.each do |k, v|
            stringify(k, buffer, true)
            buffer << ':'
            stringify(v, buffer)
            buffer << ','
          end
          buffer.chop! if buffer.end_with?(',')
          buffer << '}'
        end
      when Array
        if is_key
          buffer << '"' << escape(stringify(obj)) << '"'
        else
          buffer << '['
          obj.each do |v|
            stringify(v, buffer)
            buffer << ','
          end
          buffer.chop! if buffer.end_with?(',')
          buffer << ']'
        end
      else
        stringify(obj.as_json, buffer, is_key)
      end
    end

    def escape(str)
      replacements = Hash.new do |hash, key|
        hash[key] = '\u' + key.unpack('U*')[0].to_s(16).rjust(4,'0')
      end

      replacements["\""] = "\\\""
      replacements["\\"] = "\\\\"

      if ActiveSupport.escape_html_entities_in_json
        str.gsub(/[\u0000-\u001f\\\"><&\u2028\u2029]/u, replacements)
      else
        str.gsub(/[\u0000-\u001f\\\"\u2028\u2029]/u, replacements)
      end
    end
end