The file is compressed with a scheme that ignores whitespace and uses markers such as (10x2), which means: repeat the next 10 characters 2 times. The marker itself isn't included in the decompressed output, and decompression resumes after the repeated section. Parentheses or other characters within the data referenced by a marker are treated as normal data, so a marker-like sequence inside a repeated section is not expanded.
Examples of decompression include:
MURPHY → MURPHY (length 6)
(3x3)DBZ → DBZDBZDBZ (length 9)
A(2x2)BCD(2x2)EFG → ABCBCDEFEFG (length 11)
A(1x5)BC → ABBBBBC (length 7)
(6x1)(1x3)A → (1x3)A (length 6)
X(8x2)(3x3)ABCY → X(3x3)ABC(3x3)ABCY (length 18)
What is the decompressed length of the file, excluding whitespace?
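One possible way to compute this length without building the output string is sketched below. The function name decompressed_length_v1 and the MARKER pattern are illustrative choices, not part of the puzzle, and the sketch assumes that a marker reaching past the end of the input only repeats the characters that are actually there.

```python
import re

# A marker looks like (AxB): repeat the next A characters B times.
MARKER = re.compile(r"\((\d+)x(\d+)\)")


def decompressed_length_v1(data: str) -> int:
    """Length after one pass of decompression; repeated data is not re-scanned."""
    data = "".join(data.split())  # the format ignores whitespace
    total = 0
    i = 0
    while i < len(data):
        m = MARKER.match(data, i)
        if m:
            span, times = int(m.group(1)), int(m.group(2))
            start = m.end()
            # Assumption: if fewer than `span` characters remain, repeat what is left.
            available = min(span, len(data) - start)
            total += available * times
            # Skip the referenced data; any markers inside it are plain characters.
            i = start + span
        else:
            total += 1  # an ordinary character outside any marker
            i += 1
    return total


if __name__ == "__main__":
    for sample in ["MURPHY", "(3x3)DBZ", "A(2x2)BCD(2x2)EFG",
                   "A(1x5)BC", "(6x1)(1x3)A", "X(8x2)(3x3)ABCY"]:
        print(sample, decompressed_length_v1(sample))
```

Run on the six examples above, this sketch reports 6, 9, 11, 7, 6 and 18, matching the stated lengths.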
In version two, the only difference is that markers within decompressed data are decompressed. This, the documentation explains, provides much more substantial compression capabilities, allowing many-gigabyte files to be stored in only a few kilobytes.
For example:
(3x3)XYZ still becomes XYZXYZXYZ, as the decompressed section contains no markers.
X(8x2)(3x3)ABCY becomes XABCABCABCABCABCABCY, because the decompressed data from the (8x2) marker is then further decompressed, thus triggering the (3x3) marker twice for a total of six ABC sequences.
(27x12)(20x12)(13x14)(7x10)(1x12)A decompresses into a string of A repeated 241920 times.
(25x3)(3x3)ABC(2x3)XY(5x2)PQRSTX(18x9)(3x2)TWO(5x7)SEVEN becomes 445 characters long.
What is the decompressed length of the file using this improved format?
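Because the version two output can be far too large to materialize, a recursive length count is one reasonable approach: the length of a marked section is its repeat count times the version two length of the section itself. The sketch below is not an official solution. The names decompressed_length_v2 and _section_length are hypothetical, and it assumes that a marker inside a repeated section never refers to data beyond the end of that section (the range is clamped if it does).

```python
import re

# A marker looks like (AxB): repeat the next A characters B times.
MARKER = re.compile(r"\((\d+)x(\d+)\)")


def _section_length(data: str, start: int, end: int) -> int:
    """Version two length of data[start:end], expanding nested markers."""
    total = 0
    i = start
    while i < end:
        m = MARKER.match(data, i, end)
        if m:
            span, times = int(m.group(1)), int(m.group(2))
            body_start = m.end()
            # Assumption: a marker's range stays inside the enclosing section.
            body_end = min(body_start + span, end)
            # The repeated section is itself decompressed, so recurse into it.
            total += times * _section_length(data, body_start, body_end)
            i = body_end
        else:
            total += 1  # an ordinary character
            i += 1
    return total


def decompressed_length_v2(data: str) -> int:
    data = "".join(data.split())  # the format ignores whitespace
    return _section_length(data, 0, len(data))


if __name__ == "__main__":
    print(decompressed_length_v2("(27x12)(20x12)(13x14)(7x10)(1x12)A"))
```

For the four examples above, this sketch gives 9, 20, 241920 and 445. Only integer lengths are tracked, so memory use stays proportional to the compressed input rather than the decompressed output.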