@Demindiro
Last active September 30, 2022 18:05
Python script implementing a decoder for the LZ4 block format. License: 0BSD
import lz4.block

f = open('lz4_Block_format.md', 'rb').read()
e = lz4.block.compress(f, compression=12, store_size=False)
d = [0] * len(f)
i = k = 0


def get_len(l):
    # Get a variable length
    global i
    if l == 15:
        while True:
            n = e[i]
            i += 1
            l += n
            if n != 255:
                break
    return l


while i < len(e):
    # Get token
    t = e[i]
    i += 1

    # Copy literals
    l = get_len(t >> 4)
    d[k:k + l] = e[i:i + l]
    i += l
    k += l

    # "The block ends right after the literals (no offset field)"
    if i == len(e):
        break

    # Get offset (16 bit little-endian)
    offt = (e[i + 1] << 8) | e[i]
    i += 2
    assert offt != 0  # zero offset is invalid

    # Copy match
    l = get_len(t & 0xf) + 4
    # We have to copy byte by byte because it may overlap and we're copying
    # from low to high
    m = k - offt
    for _ in range(l):
        d[k] = d[m]
        k += 1
        m += 1

print(bytes(d).decode('utf-8'))
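
The script above relies on module-level state and on the third-party lz4 package to produce its input. As a rough sketch only, the same decoding steps could be wrapped in a self-contained function and exercised on a small hand-built block. The names lz4_block_decompress and read_length are hypothetical and not part of the original gist, and the bounds checks are an added assumption about how malformed input should be handled.

```python
def lz4_block_decompress(src: bytes) -> bytes:
    """Decode a single raw LZ4 block (no frame header, no stored size)."""
    out = bytearray()
    i = 0

    def read_length(base: int) -> int:
        # A 4-bit field equal to 15 means extra length bytes follow.
        # Each byte is added to the total; a byte below 255 ends the run.
        nonlocal i
        if base == 15:
            while True:
                n = src[i]
                i += 1
                base += n
                if n != 255:
                    break
        return base

    while i < len(src):
        # Token: high nibble = literal length, low nibble = match length - 4
        token = src[i]
        i += 1

        # Copy literals
        lit_len = read_length(token >> 4)
        if i + lit_len > len(src):
            raise ValueError("literal run exceeds input")
        out += src[i:i + lit_len]
        i += lit_len

        # The last sequence ends right after its literals (no offset field)
        if i == len(src):
            break

        # Offset: 16-bit little-endian, counted back from the output end
        if i + 2 > len(src):
            raise ValueError("truncated offset")
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")

        # Copy the match byte by byte, since source and destination may
        # overlap (an offset of 1 repeats the previous byte)
        match_len = read_length(token & 0xF) + 4
        start = len(out) - offset
        for j in range(match_len):
            out.append(out[start + j])

    return bytes(out)


# Hand-built block: literal "a", a 4-byte match at offset 1, then 5 literals
block = bytes([0x10, 0x61, 0x01, 0x00, 0x50]) + b"bcdef"
print(lz4_block_decompress(block))  # b'aaaaabcdef'
```

Growing a bytearray avoids having to know the decompressed size up front, which the original script gets from len(f) because it compresses the file itself.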