ahoy-jon/yo.md

## yo.md

      
For info, I gave a talk about this: http://www.slideshare.net/jwinandy/data-encoding-and-metadata-for-streams/17

A few points:


A reference to a schema takes 64 bits (with hashing), or 32 bits if you use a coordination store (as Kafka + Camus does).

That is not really wasted space, because the same reference can be reused for many payloads.
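To make the sizes concrete, here is a minimal sketch of both framing styles using only the Python standard library. Everything in it is an assumption made for illustration: `schema_ref64` uses a truncated SHA-256 as a stand-in hash, `SchemaStore` is a hypothetical in-memory replacement for a real coordination store, and the payload bytes are placeholders for already-encoded records.

```python
import hashlib
import json
import struct


def schema_ref64(schema: dict) -> bytes:
    """Return an 8-byte (64-bit) reference derived from the schema text.

    Assumption: truncated SHA-256 over sorted JSON, used here as a stand-in
    for whatever hash a real deployment would pick.
    """
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()[:8]


class SchemaStore:
    """Hypothetical in-memory stand-in for a coordination store."""

    def __init__(self):
        self.schemas = []  # position in this list is the 32-bit id

    def register(self, schema: dict) -> int:
        # Registering the same schema twice returns the same id.
        if schema not in self.schemas:
            self.schemas.append(schema)
        return self.schemas.index(schema)


def frame_with_hash(schema: dict, payload: bytes) -> bytes:
    return schema_ref64(schema) + payload  # 8 bytes of overhead


def frame_with_id(schema_id: int, payload: bytes) -> bytes:
    return struct.pack(">I", schema_id) + payload  # 4 bytes of overhead


user_schema = {"type": "record", "name": "User",
               "fields": [{"name": "name", "type": "string"}]}
store = SchemaStore()
user_id = store.register(user_schema)

payloads = [b"\x06Ann", b"\x06Bob"]  # placeholder encoded records
hashed = [frame_with_hash(user_schema, p) for p in payloads]
by_id = [frame_with_id(user_id, p) for p in payloads]
```

Both messages in `hashed` start with the same 8 bytes, and both messages in `by_id` start with the same 4 bytes, so the schema itself is stored once no matter how many payloads point to it.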


Field renaming is well supported. In Avro you read your data with not one but two schemas:

the one that was used to encode the data (easy to get, since it is kept alongside the data as metadata),
and the one you want to use to read your data.

So you can have one common read schema (thanks to unions and renaming) for several write schemas.
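The idea can be sketched in a few lines of plain Python. This is a toy illustration, not the Avro library: the `resolve` function is hypothetical, it works on records that are already decoded into dicts, and it only covers field aliases and defaults. The schemas and field names (`mail`, `email`, `age`) are made up for the example.

```python
def resolve(record: dict, writer_schema: dict, reader_schema: dict) -> dict:
    """Project a record written with writer_schema onto reader_schema."""
    writer_names = {f["name"] for f in writer_schema["fields"]}
    out = {}
    for field in reader_schema["fields"]:
        # A reader field matches a writer field by name or by alias.
        candidates = [field["name"]] + field.get("aliases", [])
        match = next((n for n in candidates if n in writer_names), None)
        if match is not None:
            out[field["name"]] = record[match]
        elif "default" in field:
            # The writer did not know this field: use the reader's default.
            out[field["name"]] = field["default"]
        else:
            raise ValueError("no value for field " + field["name"])
    return out


# Two write schemas: the old one says "mail", the new one says "email".
writer_v1 = {"type": "record", "name": "User", "fields": [
    {"name": "name", "type": "string"},
    {"name": "mail", "type": "string"}]}
writer_v2 = {"type": "record", "name": "User", "fields": [
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "age", "type": "int"}]}

# One common read schema: an alias handles the rename, and a union with
# null plus a default handles the field that old writers never produced.
reader = {"type": "record", "name": "User", "fields": [
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string", "aliases": ["mail"]},
    {"name": "age", "type": ["null", "int"], "default": None}]}

old = resolve({"name": "Ann", "mail": "ann@example.org"}, writer_v1, reader)
new = resolve({"name": "Bob", "email": "bob@example.org", "age": 30},
              writer_v2, reader)
```

Both `old` and `new` come out with the same three keys (`name`, `email`, `age`), so downstream code only ever deals with the read schema.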


One of Avro's great features is its genericity. You don't have to generate code to parse a message, so you can build a smart intermediary, such as Hadoop jobs that do generic work: https://github.com/viadeo/viadeo-avro-utils
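As a rough sketch of what "no generated code" means, the decoder below is driven entirely by a schema passed in at runtime. It is a hand-written subset for illustration, not the Avro library: it assumes only records, `int`/`long` (zigzag varints) and `string` (length-prefixed UTF-8), and the function names are my own.

```python
import io


def write_long(out, n: int) -> None:
    z = (n << 1) ^ (n >> 63)  # zigzag: small negatives stay small
    while z & ~0x7F:
        out.write(bytes([(z & 0x7F) | 0x80]))
        z >>= 7
    out.write(bytes([z]))


def read_long(buf) -> int:
    z, shift = 0, 0
    while True:
        b = buf.read(1)[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)  # undo zigzag


def encode(schema, value, out) -> None:
    kind = schema["type"] if isinstance(schema, dict) else schema
    if kind == "record":
        for field in schema["fields"]:  # fields are written in schema order
            encode(field["type"], value[field["name"]], out)
    elif kind in ("int", "long"):
        write_long(out, value)
    elif kind == "string":
        data = value.encode("utf-8")
        write_long(out, len(data))
        out.write(data)
    else:
        raise NotImplementedError(str(kind))


def decode(schema, buf):
    kind = schema["type"] if isinstance(schema, dict) else schema
    if kind == "record":
        return {f["name"]: decode(f["type"], buf) for f in schema["fields"]}
    if kind in ("int", "long"):
        return read_long(buf)
    if kind == "string":
        return buf.read(read_long(buf)).decode("utf-8")
    raise NotImplementedError(str(kind))


# The schema is plain data; nothing below was generated from it.
schema = {"type": "record", "name": "User", "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"}]}

out = io.BytesIO()
encode(schema, {"name": "Ann", "age": 30}, out)
raw = out.getvalue()
user = decode(schema, io.BytesIO(raw))
```

Because `decode` only needs the schema and the bytes, the same function can sit in the middle of a pipeline and handle any message type it has never seen before, which is what makes generic jobs possible.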