@jjyr
Last active June 21, 2019 10:35

fixed length encoding

  • u8 - 1 byte, little-endian
  • u32 - 4 bytes, little-endian
  • u64 - 8 bytes, little-endian
  • u256 - 32 bytes, little-endian
  • H160 - 20 bytes, raw binary
  • H256 - 32 bytes, raw binary
def encode_u8(n):
    return n.to_bytes(1, 'little', signed=False)

def encode_u32(n):
    return n.to_bytes(4, 'little', signed=False)

def encode_u64(n):
    return n.to_bytes(8, 'little', signed=False)
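The list above also names u256, H160 and H256, which the snippet does not cover. A minimal sketch of how they could look, assuming hashes are passed in as `bytes`; the function names `encode_u256`, `encode_h160` and `encode_h256` are hypothetical and not part of the original gist:

```python
def encode_u256(n):
    # 32 bytes, little endian, unsigned (same pattern as encode_u64)
    return n.to_bytes(32, 'little', signed=False)

def encode_h160(data):
    # H160 is raw binary: the 20 bytes are emitted unchanged
    assert len(data) == 20
    return bytes(data)

def encode_h256(data):
    # H256 is raw binary: the 32 bytes are emitted unchanged
    assert len(data) == 32
    return bytes(data)
```

Because these types have a fixed size, no length prefix is written.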

variable length encoding

For variable-length data, we encode the length as a u32, then append the data itself. If the data is a list, the length is the number of elements; we encode each element and concatenate the results into bytes.

def encode_bytes(data):
    # the parameter is named `data` to avoid shadowing the built-in `bytes` type
    length = encode_u32(len(data))
    return length + data

# encode_bytes(b"hello world") => b'\x0b\x00\x00\x00hello world'
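The reverse direction is not shown in the gist. A minimal sketch of a decoder for the same layout (u32 little-endian length, then the payload); `decode_u32` and `decode_bytes` are hypothetical helper names, and returning the next offset is just one possible design:

```python
def decode_u32(buf, offset=0):
    # read 4 bytes as an unsigned little-endian integer
    if offset + 4 > len(buf):
        raise ValueError("truncated input: missing u32")
    return int.from_bytes(buf[offset:offset + 4], 'little', signed=False)

def decode_bytes(buf, offset=0):
    # returns (payload, next_offset) so callers can keep reading after the payload
    length = decode_u32(buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated input: payload shorter than its length prefix")
    return buf[start:end], end

# decode_bytes(b'\x0b\x00\x00\x00hello world') => (b'hello world', 15)
```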
# int_type is ignored when an element is not an int; for ints, pass the bit width: 8, 32 or 64.
def encode_list(lst, int_type=0):
    length = encode_u32(len(lst))
    return length + b"".join([encode_elem(elem, int_type=int_type) for elem in lst])

def encode_elem(item, int_type=0):
    if isinstance(item, bytes):
        return encode_bytes(item)
    elif isinstance(item, int):
        # int_type is a bit width and must be a positive multiple of 8
        assert int_type % 8 == 0
        bytes_length = int_type // 8
        assert bytes_length > 0
        return item.to_bytes(bytes_length, 'little', signed=False)

    raise Exception("nested lists are not supported in this demo")

# encode_list([1, 2, 3], int_type=8) => b'\x03\x00\x00\x00\x01\x02\x03'
# encode_list([b"hello", b"world", b"blockchain"]) => 
# b'\x03\x00\x00\x00\x05\x00\x00\x00hello\x05\x00\x00\x00world\n\x00\x00\x00blockchain'
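Note that the encoded list does not record its element type, so a decoder has to be told what to expect. A minimal sketch of such a decoder, assuming the same `int_type` convention as `encode_list` (0 means byte strings, otherwise a bit width); `decode_list` is a hypothetical name and the sketch omits bounds checks for brevity:

```python
def decode_list(buf, int_type=0):
    # the first u32 is the number of elements, not the byte length
    count = int.from_bytes(buf[0:4], 'little', signed=False)
    offset = 4
    items = []
    for _ in range(count):
        if int_type:
            # fixed-width integers are packed back to back
            width = int_type // 8
            items.append(int.from_bytes(buf[offset:offset + width], 'little', signed=False))
            offset += width
        else:
            # byte strings each carry their own u32 length prefix
            length = int.from_bytes(buf[offset:offset + 4], 'little', signed=False)
            offset += 4
            items.append(buf[offset:offset + length])
            offset += length
    return items

# decode_list(b'\x03\x00\x00\x00\x01\x02\x03', int_type=8) => [1, 2, 3]
```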

advanced types:

A struct is encoded as a list of its fields. For example, a struct Person(name, description) is encoded as encode_list([person.name, person.description]).
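As an illustration of the struct rule, here is a self-contained sketch that assumes both fields of Person are byte strings. The `Person` namedtuple and `encode_person` are hypothetical; the helpers are repeated so the block runs on its own:

```python
from collections import namedtuple

# hypothetical struct definition; both fields are assumed to be bytes
Person = namedtuple("Person", ["name", "description"])

def encode_u32(n):
    return n.to_bytes(4, 'little', signed=False)

def encode_bytes(data):
    return encode_u32(len(data)) + data

def encode_person(person):
    # same layout as encode_list([person.name, person.description]):
    # field count as u32, then each field as length-prefixed bytes
    fields = [person.name, person.description]
    return encode_u32(len(fields)) + b"".join(encode_bytes(f) for f in fields)

encoded = encode_person(Person(b"alice", b"miner"))
# encoded == b'\x02\x00\x00\x00\x05\x00\x00\x00alice\x05\x00\x00\x00miner'
```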

doitian commented Jun 19, 2019

H160 and H256 can be treated as integers.

xxuejie commented Jun 21, 2019

Since most lists will be small, I suggest we add a variable length encoding integer type for encoding list length:

  • If length < 128, encode a single u8 value length << 1
  • If length >= 128, encode a little endian u32 value (length << 1) | 1

That should strike a balance between encoded message length and decoding speed.
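A minimal sketch of this proposal, assuming the lowest bit of the first byte acts as the flag (0 for the one-byte form, 1 for the four-byte form). `encode_length` and `decode_length` are hypothetical names. Since one bit is spent on the flag, the four-byte form can hold lengths only up to 2**31 - 1:

```python
def encode_length(length):
    if length < 0 or length >= 1 << 31:
        raise ValueError("length does not fit in 31 bits")
    if length < 128:
        # short form: one byte, low bit 0
        return (length << 1).to_bytes(1, 'little', signed=False)
    # long form: little-endian u32, low bit 1
    return ((length << 1) | 1).to_bytes(4, 'little', signed=False)

def decode_length(buf, offset=0):
    # returns (length, next_offset)
    first = buf[offset]
    if first & 1 == 0:
        return first >> 1, offset + 1
    value = int.from_bytes(buf[offset:offset + 4], 'little', signed=False)
    return value >> 1, offset + 4

# encode_length(3)   => b'\x06'
# encode_length(128) => b'\x01\x01\x00\x00'
```

Because the value is little-endian, the flag bit always sits in the first byte, so the decoder can pick the form after reading a single byte.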

doitian commented Jun 21, 2019

> Since most lists will be small, I suggest we add a variable length encoding integer type for encoding list length:
>
>   • If length < 128, encode a single u8 value length << 1
>   • If length >= 128, encode a little endian u32 value (length << 1) | 1
>
> That should strike a balance between encoded message length and decoding speed.

It depends on where it is used. If it is used only for a streaming digest, a plain u32 is faster.
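To make the streaming-digest case concrete, here is a sketch that feeds the encoding of a list of byte strings to a hasher piece by piece, without building the full message first. SHA-256 is only a placeholder here, since the thread does not name a hash function, and `digest_list` is a hypothetical name:

```python
import hashlib

def digest_list(items):
    # hash function chosen for illustration only
    hasher = hashlib.sha256()
    # element count as a fixed-width u32, little endian
    hasher.update(len(items).to_bytes(4, 'little', signed=False))
    for item in items:
        # each element: u32 length prefix, then the raw bytes
        hasher.update(len(item).to_bytes(4, 'little', signed=False))
        hasher.update(item)
    return hasher.digest()
```

In this setting the encoded bytes are never stored or transmitted, so a shorter length prefix saves nothing, while a fixed-width u32 avoids the branch on the length.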
