How to use custom Python JSON serializers and deserializers to automatically roundtrip complex types.
import json, datetime


class RoundTripEncoder(json.JSONEncoder):
    DATE_FORMAT = "%Y-%m-%d"
    TIME_FORMAT = "%H:%M:%S"

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            # Wrap datetimes in a tagged dict so the decoder can recognise them.
            return {
                "_type": "datetime",
                "value": obj.strftime("%s %s" % (
                    self.DATE_FORMAT, self.TIME_FORMAT
                ))
            }
        return super(RoundTripEncoder, self).default(obj)


data = {
    "name": "Silent Bob",
    "dt": datetime.datetime(2013, 11, 11, 10, 40, 32)
}

print(json.dumps(data, cls=RoundTripEncoder, indent=2))
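
import json, datetime
from dateutil import parser


class RoundTripDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        json.JSONDecoder.__init__(self, object_hook=self.object_hook, *args, **kwargs)

    def object_hook(self, obj):
        # Called for every decoded JSON object; untagged dicts pass through unchanged.
        if '_type' not in obj:
            return obj
        type_name = obj['_type']
        if type_name == 'datetime':
            return parser.parse(obj['value'])
        return obj


# s is the JSON text printed by the encoder example.
s = '{"name": "Silent Bob", "dt": {"_type": "datetime", "value": "2013-11-11 10:40:32"}}'
print(json.loads(s, cls=RoundTripDecoder))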

simonw commented Oct 15, 2013

First output is:

{
  "dt": {
    "_type": "datetime", 
    "value": "2013-11-11 10:40:32"
  }, 
  "name": "Silent Bob"
}

Second output is:

{u'dt': datetime.datetime(2013, 11, 11, 10, 40, 32), u'name': u'Silent Bob'}

Timokasse commented Apr 10, 2017

This is very nice, easy-to-read code. Now I would like to go a step further and avoid serializing the datetime into a dict:

{ "dt": "2013-11-11T10:40:32", "name": "Silent Bob" }

To decode it, one would simply check whether each string is in ISO format, parse it if so, and return the plain string if not. I don't see how this is possible by inheriting from JSONDecoder. Would you have any suggestions?
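
One way to approximate this without the `_type` wrapper is an `object_hook` that inspects the string values of each decoded object. The sketch below is only an illustration: the name `iso_datetime_hook` is hypothetical, it assumes Python 3.7+ for `datetime.fromisoformat`, and it only sees strings that sit directly inside a JSON object (not strings inside arrays or a bare top-level string). Any string that `fromisoformat` accepts is converted, which may be more than intended.

```python
import datetime
import json


def iso_datetime_hook(obj):
    """Hypothetical object_hook: convert ISO 8601 string values to datetime."""
    converted = {}
    for key, value in obj.items():
        if isinstance(value, str):
            try:
                value = datetime.datetime.fromisoformat(value)
            except ValueError:
                pass  # not an ISO timestamp, keep the plain string
        converted[key] = value
    return converted


text = '{"dt": "2013-11-11T10:40:32", "name": "Silent Bob"}'
decoded = json.loads(text, object_hook=iso_datetime_hook)
print(decoded)
```

The same function could be passed as `object_hook` from a `JSONDecoder` subclass's `__init__`, as the gist's `RoundTripDecoder` does.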

foresmac commented Jan 8, 2018

@Timokasse I'm not sure it's possible with the current design of the JSONDecoder API. It might be possible to override object_pairs_hook, but to be honest, I'm not sure about that either. I just wrote something that loops through all the items in the output of a standard json.loads call and runs a converter on any value whose key contains date, datetime, or timestamp, but that only works because I can be sure about the expected keys.
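
A post-processing pass of the kind described above might look roughly like the sketch below. It is not foresmac's actual code: the names `convert_dates`, `_convert_item`, and `DATE_KEY_HINTS` are made up for illustration, and ISO 8601 input (parsed with `datetime.fromisoformat`, Python 3.7+) is an assumption.

```python
import datetime
import json

# Assumed naming convention: keys containing one of these substrings hold timestamps.
DATE_KEY_HINTS = ("date", "datetime", "timestamp")


def convert_dates(node):
    """Walk the output of json.loads and convert values under date-like keys."""
    if isinstance(node, dict):
        return {key: _convert_item(key, value) for key, value in node.items()}
    if isinstance(node, list):
        return [convert_dates(item) for item in node]
    return node


def _convert_item(key, value):
    if isinstance(value, str) and any(hint in key.lower() for hint in DATE_KEY_HINTS):
        try:
            return datetime.datetime.fromisoformat(value)
        except ValueError:
            return value  # key looked date-like but the value did not parse
    return convert_dates(value)


raw = json.loads(
    '{"created_date": "2013-11-11T10:40:32", "name": "Silent Bob",'
    ' "events": [{"timestamp": "2020-01-02T03:04:05"}]}'
)
result = convert_dates(raw)
print(result)
```

Substring matching is deliberately loose here, so a key such as `updated` would also match `date`; a stricter rule may be preferable when the key set is known.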

foresmac commented Jan 8, 2018

Maybe one strategy is to override scanstring on a subclass of JSONDecoder, though since it will be pure Python, it will likely be slower than the standard C-based implementation. https://github.com/python/cpython/blob/3.6/Lib/json/decoder.py#L69

setaou commented Apr 5, 2018

In case anyone is looking for the hack discussed above, I have implemented it here:
https://gist.github.com/setaou/ff98e82a9ce68f4c2b8637406b4620d1
In the end, the function json.decoder.scanstring still uses the C version, but json.scanner.make_scanner must use the Python version.
I have not benchmarked this hack, as I do not use it heavily.
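
The idea can be sketched as follows. This is not the content of the linked gist, only a guess at the general shape: `iso_scanstring` and `IsoStringDecoder` are hypothetical names, and the sketch relies on CPython internals (`parse_string`, `scan_once`, `json.scanner.py_make_scanner`) that are not part of the documented API and may change between versions.

```python
import datetime
import json
from json import scanner
from json.decoder import scanstring  # C implementation when available


def iso_scanstring(s, end, strict=True):
    """Parse a JSON string, then try to turn it into a datetime."""
    value, end = scanstring(s, end, strict)
    try:
        return datetime.datetime.fromisoformat(value), end
    except ValueError:
        return value, end  # not an ISO timestamp, keep the plain string


class IsoStringDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Swap in the custom string parser, then rebuild the scanner in pure
        # Python so that it actually consults self.parse_string.
        self.parse_string = iso_scanstring
        self.scan_once = scanner.py_make_scanner(self)


decoded = json.loads(
    '{"dt": "2013-11-11T10:40:32", "tags": ["Silent Bob", "2020-01-02T03:04:05"]}',
    cls=IsoStringDecoder,
)
print(decoded)
```

Unlike an `object_hook`, this also reaches strings inside arrays. As far as I can tell, object keys are parsed through a separate code path in CPython and are left as plain strings.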

jtlz2 commented Aug 17, 2018

@simonw Awesome example, for which thanks. Does the custom decoder operate recursively over the whole JSON tree, or only on the top level?

raph92 commented Nov 7, 2019

> @simonw Awesome example, for which thanks. Does the custom decoder operate recursively over the whole JSON tree, or only on the top level?

Recursively
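
A quick way to check this: `object_hook` is invoked for every JSON object the decoder finishes parsing, including objects nested inside other objects or arrays. The sketch below uses a standalone hook function (`datetime_hook`, a hypothetical name) and `strptime` instead of `dateutil`, so it runs with the standard library only.

```python
import datetime
import json


def datetime_hook(obj):
    """Turn {"_type": "datetime", "value": ...} dicts back into datetimes."""
    if obj.get("_type") == "datetime":
        return datetime.datetime.strptime(obj["value"], "%Y-%m-%d %H:%M:%S")
    return obj


nested = """
{
  "outer": {"inner": {"_type": "datetime", "value": "2013-11-11 10:40:32"}},
  "items": [{"_type": "datetime", "value": "2013-11-12 00:00:00"}]
}
"""
result = json.loads(nested, object_hook=datetime_hook)
print(result["outer"]["inner"])
print(result["items"][0])
```

Both the doubly nested object and the one inside the array come back as `datetime` instances.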

Koubae commented Nov 18, 2020

@raph92 it acts recursively. This is my implementation.

INPUT

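`import json
from datetime import date, datetime
from decimal import Decimal     # must be in scope for eval() in eva_data
from fractions import Fraction  # must be in scope for eval() in eva_data


class MainDecoder(json.JSONDecoder):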

    date_time_map = {'date', 'datetime', 'day', 'hour', 'minutes', 'month', 'seconds', 'time', 'year'}
    num_type_data = {'fraction', 'decimal', 'complex'}

    def __init__(self, *args, **kwargs):
        super().__init__(object_hook=self.object_hook, strict=False, *args, **kwargs)

    def object_hook(self, obj):
        if '_type' not in obj:
            return obj
        get_type = obj['_type']
        if get_type in self.date_time_map: # check if _type is a datetime type
            obj['value'] = self.date_deserialize(obj['value'], get_type)
        elif get_type in self.num_type_data:  # Checks for fractions, decimal and complex
            try:
                obj['value'] = self.eva_data(obj['value'])
            except ValueError as err:
                print('object_hook ---> in num_type_data eval', err)
        elif get_type == '_set':
            obj['value'] = set(obj['value'])
        return obj

    @staticmethod
    def eva_data(obj):
        """Eval fractions, Decimals and complex num types"""
        # Caution: eval() executes arbitrary code, so only use this on trusted input.
        return eval(obj)

    @staticmethod
    def date_deserialize(obj, _type):

        # TODO deserialize date with other format types, for instance 2020/11/17
        if _type == 'date':
            try:
                if isinstance(obj, list):  # Date can be [2020, 11, 17] or '2020-11-17'
                    obj = date(*[int(item) for item in obj])
                else:
                    obj = date(*[int(item) for item in obj.split('-')])
            except ValueError as err:
                print('date_deserialize -- date', err)

        elif _type == 'datetime':
            try:
                obj = datetime.strptime(str(obj), '%Y-%m-%d %H:%M:%S')
            except ValueError as err:
                try:
                    obj = datetime.fromisoformat(str(obj))
                except ValueError as err:
                    print('date_deserialize -- datetime', err)
        return obj`

JSON

`json_schema_ok = '''

    {
      "decimal": {
        "_type": "decimal",
        "value": "Decimal(1.5)",
        "required": null
      },
      "fraction": {
        "_type": "fraction",
        "value": "Fraction(1, 2)",
        "required": null
      },
      "complex": {
        "_type": "complex",
        "value": "complex(2+2j)",
        "required": null
      },
      "datetime": {
        "_type": "datetime",
        "value": "2020-11-18T04:13:07.947272",
        "required": true
      },
      "date": {
        "_type": "date",
        "value": [
          3020,
          11,
          17
        ],
        "required": null
      },
      "_set": {
        "_type": "_set",
        "value": [
          1,
          2,
          3
        ],
        "required": null
      }
    }

'''`

OUTPUT

`schema_output_1 = {'decimal': {'_type': 'decimal', 'value': Decimal('1.5'), 'required': None}, 
                   'fraction': {'_type': 'fraction', 'value': Fraction(1, 2), 'required': None}, 
                   'complex': {'_type': 'complex', 'value': (2+2j), 'required': None}, 
                   'datetime': {'_type': 'datetime', 'value': datetime.datetime(2020, 11, 18, 4, 13, 7, 947272), 'required': True}, 
                   'date': {'_type': 'date', 'value': datetime.date(3020, 11, 17), 'required': None}, 
                   '_set': {'_type': '_set', 'value': {1, 2, 3}, 'required': None}}`
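
Because `eval` runs whatever expression the JSON contains, one possible alternative is to store plain literals and map each `_type` to a constructor. The sketch below assumes a different wire format from the JSON above (`"1.5"`, `"1/2"`, `"2+2j"` rather than `"Decimal(1.5)"` and so on); `NUMERIC_CONSTRUCTORS` and `numeric_hook` are hypothetical names.

```python
import json
from decimal import Decimal
from fractions import Fraction

# Assumed wire format: "value" holds a plain literal string, not a Python expression.
NUMERIC_CONSTRUCTORS = {
    "decimal": Decimal,
    "fraction": Fraction,
    "complex": complex,
}


def numeric_hook(obj):
    """Rebuild numeric values by constructor lookup instead of eval()."""
    type_name = obj.get("_type")
    constructor = NUMERIC_CONSTRUCTORS.get(type_name) if isinstance(type_name, str) else None
    if constructor is not None:
        obj["value"] = constructor(obj["value"])
    return obj


doc = """
{
  "decimal": {"_type": "decimal", "value": "1.5"},
  "fraction": {"_type": "fraction", "value": "1/2"},
  "complex": {"_type": "complex", "value": "2+2j"}
}
"""
numbers = json.loads(doc, object_hook=numeric_hook)
print(numbers)
```

An unknown `_type` simply leaves the object untouched, and a malformed literal raises from the constructor rather than being executed.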
