@grapo
Created July 10, 2012 18:14
Django serialization

##Django declarative (de)serializers

A Serializer is a class that serializes and deserializes objects, such as Django models, to and from a serialization format (json, xml, pyyaml). It is composed of two classes: NativeSerializer and FormatSerializer. NativeSerializer serializes and deserializes objects to and from Python native datatypes:

  • Iterables are serialized to list generators
  • Objects are serialized to dicts

FormatSerializer serializes and deserializes Python native datatypes to and from a serialization format (json, xml, pyyaml).
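
The two-stage split can be illustrated with a minimal, framework-free sketch. Everything here (the class bodies, `JsonFormatSerializer`, the `Author` class and the module-level `serialize` helper) is a hypothetical stand-in written for illustration, not the proposed implementation:

```python
import json


class NativeSerializer:
    """Hypothetical stage 1: objects -> Python native datatypes."""

    def serialize(self, obj):
        # Iterables become a generator of serialized items.
        if isinstance(obj, (list, tuple, set)):
            return (self.serialize(item) for item in obj)
        # Plain objects become dicts built from their instance attributes.
        if hasattr(obj, "__dict__"):
            return {name: self.serialize(value) for name, value in vars(obj).items()}
        return obj  # already a native value (str, int, ...)


class JsonFormatSerializer:
    """Hypothetical stage 2: native datatypes -> JSON text."""

    def serialize(self, native, **options):
        return json.dumps(self._materialize(native), **options)

    def _materialize(self, native):
        # json.dumps cannot consume generators, so turn them into lists.
        if isinstance(native, dict):
            return {key: self._materialize(value) for key, value in native.items()}
        if isinstance(native, (list, tuple)) or hasattr(native, "__next__"):
            return [self._materialize(value) for value in native]
        return native


class Author:
    def __init__(self, name):
        self.name = name


def serialize(objects):
    # Chain the two stages: objects -> native -> JSON.
    return JsonFormatSerializer().serialize(NativeSerializer().serialize(objects))
```

Keeping the stages separate means the same native structure could be handed to a different format serializer without touching the object-walking code.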

If a format is defined, the user can serialize user_data to it with the 'serializers.serialize' function:

serializers.serialize(format, user_data, **options)

This is backward compatible with the current serialization solution. The user can also pass a custom NativeSerializer class when the output structure needs to change or some fields need to be serialized in an unusual way:

serializers.serialize(format, user_data, serializer=MyNativeSerializer, **options)

There are two primary types of NativeSerializers:

  • ObjectSerializer - for serializing complex Python objects. It should return a dict or a list of dicts.
  • Field - for serializing an object's fields, or whole objects, to Python native datatypes. It should return Python native datatypes like string, int, datetime etc.

There is also ModelSerializer, which is an ObjectSerializer adapted to serialize Django models, and DumpdataSerializer, which is a backward-compatible ModelSerializer.

The ObjectSerializer.serialize method returns a Python native datatype version of the object passed to it. It can also return metadata (if the runtime option is specified) that can be valuable when serializing further to the chosen format (xml, json, yaml, etc.). Metadata can contain information related to the presentation layer of the serialized format, such as which fields should be serialized as attributes in xml or how to present data in html (ul/p/table).

Suppose we want to serialize this model:

    class Article(models.Model):
        author = models.ForeignKey(Author)
        headline = models.CharField(max_length=50)
        pub_date = models.DateTimeField()
        categories = models.ManyToManyField(Category)

Below is the definition of the serializer class ArticleSerializer:

    class ArticleSerializer(ModelSerializer):
        pass # in this case ArticleSerializer is not needed at all

If we want to serialize an Article queryset:

serializers.serialize('json|xml|yaml|new_xml', queryset, serializer=ArticleSerializer, **options)

This will build a Serializer class that instantiates ArticleSerializer, passes the queryset to its serialize method and serializes the returned value to, e.g., json. Each element of the queryset is serialized independently by ArticleSerializer, but a single instance handles all of them; no new instance is created per element.

[
  {
    "headline": "Poker has no place on ESPN", 
    "pub_date": "2006-06-16T11:00:00", 
    "categories": [
      3, 
      1
    ], 
    "author": 2
  }, 
  {
    "headline": "Time to reform copyright", 
    "pub_date": "2006-06-16T13:00:11.000", 
    "categories": [
      2, 
      3
    ], 
    "author": 1
  }
]

If some field of the Article object needs special treatment, it can be declared explicitly:

    class ArticleSerializer(ModelSerializer):
        headline = ToUpperCaseField()

This overrides the default Field serializer for the headline field of the Article object.

[
  {
    "headline": "POKER HAS NO PLACE ON ESPN", 
    "pub_date": "2006-06-16T11:00:00", 
    "categories": [
      3, 
      1
    ], 
    "author": 2
  }
]

By default all object fields are serialized: for ObjectSerializer the fields in __dict__, for ModelSerializer the fields in _meta.local_fields and _meta.many_to_many. Fields declared in the NativeSerializer are also serialized. The fields and exclude options override this; they work like in ModelForms:

    class ArticleSerializer(ModelSerializer):
        headline = ToUpperCaseField()
        class Meta:
            fields = ('author',)

This will serialize only the headline field (because it is declared) and the author field. fields = () will serialize only declared fields.

[
  {
    "headline": "POKER HAS NO PLACE ON ESPN", 
    "author": 2
  }
]

Declared fields can be renamed:

    class ArticleSerializer(ModelSerializer):
        headline = ToUpperCaseField(label='big_headline')
        class Meta:
            fields = ('author',)

This will serialize the object.headline field as 'big_headline':

[
  {
    "big_headline": "POKER HAS NO PLACE ON ESPN", 
    "author": 2
  }
]
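
How declared fields, Meta.fields and label might combine can be sketched without Django. This is only an illustration under assumptions: the `Field` and `ObjectSerializer` bodies, the `declared_fields` helper and the plain `Article` class are hypothetical, and the real proposal may resolve fields differently:

```python
class Field:
    """Hypothetical minimal field: passes the value through unchanged."""

    def __init__(self, label=None):
        self.label = label

    def serialize(self, value):
        return value


class ToUpperCaseField(Field):
    def serialize(self, value):
        return value.upper()


class ObjectSerializer:
    """Hypothetical sketch: declared fields plus Meta.fields are serialized."""

    class Meta:
        fields = None  # None means "all attributes of the object"

    def declared_fields(self):
        # Collect Field instances declared on the class and its bases.
        return {
            name: attr
            for klass in reversed(type(self).__mro__)
            for name, attr in vars(klass).items()
            if isinstance(attr, Field)
        }

    def serialize(self, obj):
        declared = self.declared_fields()
        names = self.Meta.fields
        if names is None:
            names = list(vars(obj))
        result = {}
        # Declared fields always come first, then the remaining listed names.
        for name in list(declared) + [n for n in names if n not in declared]:
            field = declared.get(name, Field())
            result[field.label or name] = field.serialize(getattr(obj, name))
        return result


class Article:
    def __init__(self, headline, author, pub_date):
        self.headline = headline
        self.author = author
        self.pub_date = pub_date


class ArticleSerializer(ObjectSerializer):
    headline = ToUpperCaseField(label="big_headline")

    class Meta:
        fields = ("author",)
```

With these stand-ins, serializing an Article yields only `big_headline` and `author`, matching the output above; `pub_date` is dropped because it is neither declared nor listed in fields.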

By default a related field is serialized to its pk value. This can be changed for a single field:

    class ArticleSerializer(ModelSerializer):
        author = ModelSerializer()

[
  {
    "headline": "Poker has no place on ESPN", 
    "pub_date": "2006-06-16T11:00:00", 
    "categories": [
      3, 
      1
    ], 
    "author": {
      "name": "Jane"
    }
  }
]

The ToUpperCaseField serializer was introduced above. Let's write it:

    class ToUpperCaseField(Field):
        def serialize(self, obj):
            return obj.upper()

This field serializes strings to upper-case strings.

Sometimes a Field should work not on one of the object's fields but on the object itself:

    class PkField(Field):
        def get_object(self, obj, field_name):
            # return getattr(obj, field_name, obj) # default implementation
            return obj

        def serialize(self, obj):
            return obj._get_pk_val()
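
The difference between the default get_object and the PkField override can be tried out in isolation. In this sketch the `run` driver, the plain `Article` class and the use of a `pk` attribute in place of Django's `_get_pk_val()` are assumptions made so the example runs without Django:

```python
class Field:
    """Hypothetical base class mirroring the default behaviour shown above."""

    def get_object(self, obj, field_name):
        # Default: work on the named attribute, falling back to the object.
        return getattr(obj, field_name, obj)

    def serialize(self, obj):
        return obj

    def run(self, obj, field_name):
        # Hypothetical driver: pick the target first, then serialize it.
        return self.serialize(self.get_object(obj, field_name))


class PkField(Field):
    def get_object(self, obj, field_name):
        return obj  # operate on the whole object, not on one attribute

    def serialize(self, obj):
        return obj.pk  # stand-in for Django's obj._get_pk_val()


class Article:
    def __init__(self, pk, headline):
        self.pk = pk
        self.headline = headline


article = Article(7, "Time to reform copyright")
headline = Field().run(article, "headline")  # value of article.headline
primary_key = PkField().run(article, "id")   # field name is ignored here
```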

The ObjectSerializer and ModelSerializer classes have options like fields, exclude or field_serializer, which are determined when the class is created. For backward compatibility, fields and exclude can also be changed when the class is instantiated.

ObjectSerializer options:

  • class_name - used when deserializing. It is either the object class or a string naming where the object class name is stored. If None, the serializer does not create a new object but uses the object passed at initialization.
  • fields - list of field names included in serialization and deserialization (default None - serialize all fields)
  • exclude - list of field names that should not be included in serialization or deserialization output (default None - serialize all fields)
  • field_serializer - which NativeSerializer class to use for serializing and deserializing object fields (default FlatField)

ModelSerializer options:

  • related_serializer - which NativeSerializer class to use for serializing and deserializing the object's fk fields
  • m2m_serializer - which NativeSerializer class to use for serializing and deserializing the object's m2m fields

Other options, such as using or encoding, must be set before an ObjectSerializer instance's serialize or deserialize method is called, but may differ for each instance. This leads to a context object that holds these options. Another reason for it is that NativeSerializer may depend on FormatSerializer. For example, in xml all Python native datatypes must be serialized to strings before being serialized to xml. Some custom model fields have a more sophisticated way of serializing to a string than unicode(), so field.value_to_string must be called, and fields are only accessible in the NativeSerializer object.

###NativeSerializer context options:

Options that can be set at runtime when a NativeSerializer object is initialized:

  • using - db that will be used
  • use_natural_keys - if True then use natural keys
  • text_only - if True then all fields are serialized to strings
  • use_metadata - if True then add metadata to fields
  • ordered_fields - ordering of fields in serialized data (for byte to byte compatibility)
  • encoding - string encoding
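
As a small illustration of how one of these runtime options might be consumed, the sketch below shows a field that honours text_only. The constructor signature and the plain dict used as the context are assumptions; only the option name comes from the list above:

```python
class FlatField:
    """Hypothetical field that honours a runtime context dict."""

    def __init__(self, context=None):
        self.context = context or {}

    def serialize(self, value):
        if self.context.get("text_only"):
            return str(value)  # e.g. xml needs every value as text
        return value


as_text = FlatField({"text_only": True}).serialize(2)
as_native = FlatField().serialize(2)
```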

Deserialization is also based on NativeSerializer fields, so one class can be used for both serialization and deserialization. If a field shouldn't be deserialized, its set_object method must be overridden:

    class SerializableOnlyField(Field):
        def set_object(self, obj, instance, field_name):
            # setattr(instance, field_name, obj) # default action
            pass # will not deserialize this field

For performance reasons it is a good idea to override the deserialize method as well.
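
The effect of overriding set_object can be seen in a minimal deserialization loop. The `deserialize_object` function, the dict of fields and the plain `Article` class below are hypothetical stand-ins; the proposal's own deserialize_object(self, serialized_obj, instance) may work differently:

```python
class Field:
    def deserialize(self, native):
        return native

    def set_object(self, obj, instance, field_name):
        setattr(instance, field_name, obj)  # default action


class SerializableOnlyField(Field):
    def set_object(self, obj, instance, field_name):
        pass  # never written back to the instance


class Article:
    headline = None
    slug = "unchanged"


def deserialize_object(fields, native, instance):
    # Hypothetical loop: each field deserializes its value and assigns it.
    for name, field in fields.items():
        if name in native:
            field.set_object(field.deserialize(native[name]), instance, name)
    return instance


article = deserialize_object(
    {"headline": Field(), "slug": SerializableOnlyField()},
    {"headline": "Time to reform copyright", "slug": "ignored"},
    Article(),
)
```

Note that in this loop deserialize still runs for the serialize-only field even though its result is discarded, which is the wasted work that overriding deserialize would avoid.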

###NativeSerializer methods:

  • get_object(self, obj, field_name) - returns the object on which serializer will work. Usually it's obj.field_name
  • metadata(self, metadict) - adds user-defined values to metadict and returns it; this metadata is usable by the format serializer.
  • get_fields_for_object(self, obj) - returns the fields that should be serialized for the given object.
  • serialize(self, obj) - returns a Python native datatype from obj. For usability reasons this method is split into two methods, serialize_iterable and serialize_object.
  • serialize_iterable(self, obj) - Serializes iterable objects.
  • serialize_object(self, obj) - Serializes given object to python native datatype
  • deserialize(self, native) - returns object from python native datatype
  • deserialize_iterable(self, obj) - Deserialize iterable objects.
  • deserialize_object(self, serialized_obj, instance) - deserializes an object from the given Python native datatype.

###Field methods:

  • set_object(self, obj, instance, field_name) - assigns deserialized object obj to instance

###ObjectSerializer methods:

  • get_object_fields_names(self, obj) - returns all fields names that should be serialized by default for given obj
  • create_instance(self, obj) - returns instance to which given object should be deserialized.

##FormatSerializer

FormatSerializer must implement three methods:

  • serialize(self, object, **options) that will serialize python native datatype to some format
  • deserialize(self, stream, **options) that will deserialize stream to python native datatype
  • get_context(self) that will return dict with options that can be used in NativeSerializer context
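
A JSON-backed FormatSerializer with these three methods could look roughly like the sketch below. The class name and the particular keys returned by get_context are assumptions; the key names simply reuse the NativeSerializer context options listed earlier:

```python
import io
import json


class JsonFormatSerializer:
    """Hypothetical FormatSerializer for JSON with the three methods above."""

    def serialize(self, obj, **options):
        return json.dumps(obj, **options)

    def deserialize(self, stream, **options):
        # Accept either a string or a file-like object.
        if isinstance(stream, str):
            stream = io.StringIO(stream)
        return json.load(stream, **options)

    def get_context(self):
        # JSON has native numbers and booleans, so values need not be
        # converted to text, and this sketch asks for no metadata.
        return {"text_only": False, "use_metadata": False}


fmt = JsonFormatSerializer()
text = fmt.serialize([{"author": 2, "headline": "Poker"}], sort_keys=True)
roundtrip = fmt.deserialize(text)
```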

If there is a need to add data to the serialized object, the metadata method can be used:

    class ArticleSerializer(ModelSerializer):
        def metadata(self, metadata):
            metadata['attributes'] = ['headline']
            return metadata

This lets the xml serializer know that some fields should be rendered as attributes.
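
One way an xml format serializer might consume such metadata is sketched here with the standard library's xml.etree.ElementTree. The `to_xml` function and the element layout are assumptions for illustration; only the 'attributes' metadata key comes from the example above:

```python
import xml.etree.ElementTree as ET


def to_xml(tag, native, metadata):
    """Hypothetical XML rendering of one serialized object."""
    element = ET.Element(tag)
    attributes = metadata.get("attributes", [])
    for name, value in native.items():
        if name in attributes:
            element.set(name, str(value))  # rendered as an XML attribute
        else:
            ET.SubElement(element, name).text = str(value)  # child element
    return ET.tostring(element, encoding="unicode")


xml = to_xml(
    "article",
    {"headline": "Poker has no place on ESPN", "author": 2},
    {"attributes": ["headline"]},
)
# xml == '<article headline="Poker has no place on ESPN"><author>2</author></article>'
```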

There may be a need to add additional information to a field:

    class FieldWithAdditionalData(Field):
        special_data = AdditionalDataField()

        def serialize(self, obj):
            return obj

These additional fields will be in the metadata returned by this field. Xml will render them as the field's attributes by default; json will omit them because by default json blocks metadata.
