@simonw
Created October 15, 2013 23:53
How to use custom Python JSON serializers and deserializers to automatically roundtrip complex types.
import json, datetime

class RoundTripEncoder(json.JSONEncoder):
    DATE_FORMAT = "%Y-%m-%d"
    TIME_FORMAT = "%H:%M:%S"

    def default(self, obj):
        # Wrap datetimes in a tagged dict so the decoder can recognise them.
        if isinstance(obj, datetime.datetime):
            return {
                "_type": "datetime",
                "value": obj.strftime("%s %s" % (
                    self.DATE_FORMAT, self.TIME_FORMAT
                ))
            }
        return super(RoundTripEncoder, self).default(obj)

data = {
    "name": "Silent Bob",
    "dt": datetime.datetime(2013, 11, 11, 10, 40, 32)
}

print(json.dumps(data, cls=RoundTripEncoder, indent=2))
import json, datetime
from dateutil import parser

class RoundTripDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        json.JSONDecoder.__init__(self, object_hook=self.object_hook, *args, **kwargs)

    def object_hook(self, obj):
        # Untagged dicts pass through unchanged.
        if '_type' not in obj:
            return obj
        obj_type = obj['_type']
        if obj_type == 'datetime':
            return parser.parse(obj['value'])
        return obj

# s is the JSON text printed by the encoder above.
s = '{"name": "Silent Bob", "dt": {"_type": "datetime", "value": "2013-11-11 10:40:32"}}'
print(json.loads(s, cls=RoundTripDecoder))
@simonw

simonw commented Oct 15, 2013

First output is:

{
  "dt": {
    "_type": "datetime", 
    "value": "2013-11-11 10:40:32"
  }, 
  "name": "Silent Bob"
}

Second output is:

{u'dt': datetime.datetime(2013, 11, 11, 10, 40, 32), u'name': u'Silent Bob'}
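For anyone on Python 3 who wants to check the round trip in one script, here is a minimal sketch. It is not the gist's exact code: as an assumption it swaps `dateutil.parser.parse` for the standard library's `datetime.strptime`, and it uses a single hypothetical `DATETIME_FORMAT` constant in place of the separate `DATE_FORMAT` and `TIME_FORMAT` attributes.

```python
import datetime
import json

# Assumption: one shared format string instead of the gist's separate
# DATE_FORMAT / TIME_FORMAT class attributes.
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"


class RoundTripEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects the json module cannot serialize itself.
        if isinstance(obj, datetime.datetime):
            return {"_type": "datetime", "value": obj.strftime(DATETIME_FORMAT)}
        return super().default(obj)


class RoundTripDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, object_hook=self.object_hook, **kwargs)

    def object_hook(self, obj):
        # Called for every decoded JSON object, including nested ones.
        if obj.get("_type") == "datetime":
            return datetime.datetime.strptime(obj["value"], DATETIME_FORMAT)
        return obj


data = {"name": "Silent Bob", "dt": datetime.datetime(2013, 11, 11, 10, 40, 32)}
s = json.dumps(data, cls=RoundTripEncoder, indent=2)
restored = json.loads(s, cls=RoundTripDecoder)
print(restored == data)
```

With this format string, microseconds and any timezone information are dropped on the way out, so a datetime carrying either would not compare equal after the round trip.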

@Timokasse

This is very nice, easy-to-read code. Now I would like to go a step further and avoid having to serialize the datetime into a dict:

{ "dt": "2013-11-11T10:40:32", "name": "Silent Bob" }

To decode it, one would first check whether the string is in ISO format, then parse it if so or return the string unchanged if not. I don't see how to do that by inheriting from JSONDecoder. Do you have any suggestions?
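One partial approach, sketched here under stated assumptions, is to let `object_hook` inspect the string values of each decoded dict and try `datetime.datetime.fromisoformat` on them (Python 3.7+). The function name `parse_iso_strings` is hypothetical. The hook only sees dicts, so strings sitting directly inside a JSON array are not converted, and any ordinary string that happens to look like an ISO date will be converted as well.

```python
import datetime
import json


def parse_iso_strings(obj):
    """object_hook: replace ISO-format string values with datetime objects."""
    for key, value in obj.items():
        if isinstance(value, str):
            try:
                obj[key] = datetime.datetime.fromisoformat(value)
            except ValueError:
                pass  # not an ISO datetime string; keep it unchanged
    return obj


s = '{"dt": "2013-11-11T10:40:32", "name": "Silent Bob"}'
decoded = json.loads(s, object_hook=parse_iso_strings)
print(decoded)
```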

@foresmac

foresmac commented Jan 8, 2018

@Timokasse I'm not sure it's possible with the current design of the JSONDecoder API. It might be possible to override object_pairs_hook, but to be honest, I'm not sure about that either. I just wrote something that loops through all the items in the output of a standard json.loads call and runs a converter on any value whose key contains date, datetime, or timestamp, but that only works because I can be sure about the expected keys.
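A rough sketch of that kind of post-processing pass might look like the following. The names `DATE_KEY_HINTS`, `convert_dates` and `_convert_value` are made up for illustration, and the converter is assumed to be `datetime.datetime.fromisoformat`; the original code was not posted, so this is only one way it could be written.

```python
import datetime
import json

# Assumption: keys containing any of these substrings hold ISO datetime strings.
DATE_KEY_HINTS = ("date", "datetime", "timestamp")


def convert_dates(node):
    """Recursively walk decoded JSON and parse values stored under date-like keys."""
    if isinstance(node, dict):
        return {key: _convert_value(key, value) for key, value in node.items()}
    if isinstance(node, list):
        return [convert_dates(item) for item in node]
    return node


def _convert_value(key, value):
    # Strings under a date-like key are parsed; everything else is walked further.
    if isinstance(value, str) and any(hint in key.lower() for hint in DATE_KEY_HINTS):
        return datetime.datetime.fromisoformat(value)
    return convert_dates(value)


raw = json.loads(
    '{"name": "Silent Bob", "events": [{"created_timestamp": "2013-11-11T10:40:32"}]}'
)
converted = convert_dates(raw)
print(converted)
```

Because the conversion is driven by key names, a string under a matching key that is not a valid ISO datetime raises `ValueError`, which fits the case where the expected keys are known in advance.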

@foresmac

foresmac commented Jan 8, 2018

Maybe one strategy is to override scanstring on a subclass of JSONDecoder, though since it will be pure Python, it will likely be slower than the standard C-based implementation. https://github.com/python/cpython/blob/3.6/Lib/json/decoder.py#L69

@setaou

setaou commented Apr 5, 2018

In case anyone is looking for the hack discussed above, I have implemented it here:
https://gist.github.com/setaou/ff98e82a9ce68f4c2b8637406b4620d1
In the end, the function json.decoder.scanstring still uses the C version, but json.scanner.make_scanner must use the Python version.
I have not benchmarked this hack, as I do not use it heavily.

@jtlz2

jtlz2 commented Aug 17, 2018

@simonw Awesome example, for which thanks. Does the custom decoder operate recursively over the whole JSON tree, or only on the top level?

@raphant

raphant commented Nov 7, 2019

@simonw Awesome example, for which thanks. Does the custom decoder operate recursively over the whole JSON tree, or only on the top level?

Recursively
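A quick way to see this for yourself is to record every dict that `object_hook` receives. In the small sketch below (the `record` function and `seen` list are illustrative names), nested objects reach the hook before the objects that contain them.

```python
import json

seen = []


def record(obj):
    # Log the sorted keys of each dict as the decoder finishes building it.
    seen.append(sorted(obj))
    return obj


json.loads('{"outer": {"inner": {"leaf": 1}}, "items": [{"a": 1}]}', object_hook=record)
print(seen)  # [['leaf'], ['inner'], ['a'], ['items', 'outer']]
```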

@Koubae

Koubae commented Nov 18, 2020

@raph92 it acts recursively. This is my implementation.

INPUT

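`import json
from datetime import date, datetime
from decimal import Decimal  # must be in scope for eval() in eva_data
from fractions import Fraction  # must be in scope for eval() in eva_data

class MainDecoder(json.JSONDecoder):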

    date_time_map = {'date', 'datetime', 'day', 'hour', 'minutes', 'month', 'seconds', 'time', 'year'}
    num_type_data = {'fraction', 'decimal', 'complex'}

    def __init__(self, *args, **kwargs):
        super().__init__(object_hook=self.object_hook,strict=False, *args, **kwargs)

    def object_hook(self, obj):
        if '_type' not in obj:
            return obj
        get_type = obj['_type']
        if get_type in self.date_time_map: # check if _type is a datetime type
            obj['value'] = self.date_deserialize(obj['value'], get_type)
        elif get_type in self.num_type_data:  # Checks for fractions, decimal and complex
            try:
                obj['value'] = self.eva_data(obj['value'])
            except ValueError as err:
                print('object_hook ---> in num_type_data eval', err)
        elif get_type == '_set':
            obj['value'] = set(obj['value'])
        return obj

    @staticmethod
    def eva_data(obj):
        """Eval fractions, Decimals and complex num types"""
        return eval(obj)

    @staticmethod
    def date_deserialize(obj, _type):

        # TODO deserialize date with other format types, for instance 2020/11/17
        if _type == 'date':
            try:
                if isinstance(obj, list):  # Date can be [2020, 11, 17] or '2020-11-17'
                    obj = date(*[int(item) for item in obj])
                else:
                    obj = date(*[int(item) for item in obj.split('-')])
            except ValueError as err:
                print('date_deserialize -- date', err)

        elif _type == 'datetime':
            try:
                obj = datetime.strptime(str(obj), '%Y-%m-%d %H:%M:%S')
            except ValueError as err:
                try:
                    obj = datetime.fromisoformat(str(obj))
                except ValueError as err:
                    print('date_deserialize -- datetime', err)
        return obj`

JSON

`json_schema_ok = '''

    {
      "decimal": {
        "_type": "decimal",
        "value": "Decimal(1.5)",
        "required": null
      },
      "fraction": {
        "_type": "fraction",
        "value": "Fraction(1, 2)",
        "required": null
      },
      "complex": {
        "_type": "complex",
        "value": "complex(2+2j)",
        "required": null
      },
      "datetime": {
        "_type": "datetime",
        "value": "2020-11-18T04:13:07.947272",
        "required": true
      },
      "date": {
        "_type": "date",
        "value": [
          3020,
          11,
          17
        ],
        "required": null
      },
      "_set": {
        "_type": "_set",
        "value": [
          1,
          2,
          3
        ],
        "required": null
      }
    }

'''`

OUTPUT

`schema_output_1 = {'decimal': {'_type': 'decimal', 'value': Decimal('1.5'), 'required': None}, 
                   'fraction': {'_type': 'fraction', 'value': Fraction(1, 2), 'required': None}, 
                   'complex': {'_type': 'complex', 'value': (2+2j), 'required': None}, 
                   'datetime': {'_type': 'datetime', 'value': datetime.datetime(2020, 11, 18, 4, 13, 7, 947272), 'required': True}, 
                   'date': {'_type': 'date', 'value': datetime.date(3020, 11, 17), 'required': None}, 
                   '_set': {'_type': '_set', 'value': {1, 2, 3}, 'required': None}}`
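One caution about `eva_data` in the implementation above: `eval` executes whatever expression the JSON contains, so it is only safe on input you fully control. A possible alternative, sketched here under the assumption that values are stored as plain strings such as "1.5", "1/2" and "2+2j" rather than as Python expressions like "Decimal(1.5)", is to map each `_type` to its constructor. `NUMERIC_CONSTRUCTORS` and `numeric_hook` are hypothetical names.

```python
import json
from decimal import Decimal
from fractions import Fraction

# Assumption: values are plain strings ("1.5", "1/2", "2+2j"),
# not Python expressions such as "Decimal(1.5)".
NUMERIC_CONSTRUCTORS = {
    "decimal": Decimal,
    "fraction": Fraction,
    "complex": complex,
}


def numeric_hook(obj):
    type_name = obj.get("_type")
    # Only string tags are looked up; anything else passes through untouched.
    if isinstance(type_name, str) and type_name in NUMERIC_CONSTRUCTORS:
        obj["value"] = NUMERIC_CONSTRUCTORS[type_name](obj["value"])
    return obj


doc = """
{
  "d": {"_type": "decimal", "value": "1.5"},
  "f": {"_type": "fraction", "value": "1/2"},
  "c": {"_type": "complex", "value": "2+2j"}
}
"""
result = json.loads(doc, object_hook=numeric_hook)
print(result)
```

The constructors raise `ValueError` or `decimal.InvalidOperation` on malformed input instead of running it as code.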

@andelink

andelink commented Apr 7, 2022

@Timokasse @simonw I think it is simpler than that, unless I am misunderstanding.

>>> import json
>>> import datetime
>>> data = {
...     "name": "Silent Bob",
...     "dt": datetime.datetime(2013, 11, 11, 10, 40, 32)
... }

# Fails as expected
>>> json.dumps(data)
TypeError: Object of type datetime is not JSON serializable

# Succeeds
>>> json.dumps(data, default=str)
'{"name": "Silent Bob", "dt": "2013-11-11 10:40:32"}'
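It may be worth noting that `default=str` only covers the encoding direction. A short sketch of what comes back from `json.loads`, and of restoring the value by hand with `datetime.datetime.fromisoformat` (Python 3.7+, and an assumption about which parser you would pick):

```python
import datetime
import json

data = {"name": "Silent Bob", "dt": datetime.datetime(2013, 11, 11, 10, 40, 32)}

encoded = json.dumps(data, default=str)
decoded = json.loads(encoded)

# The datetime comes back as a plain string, so the round trip is lossy.
print(type(decoded["dt"]).__name__)  # str
print(decoded == data)  # False

# The caller has to know which keys to convert back.
restored = datetime.datetime.fromisoformat(decoded["dt"])
print(restored == data["dt"])  # True
```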
