@LiEnby
Last active February 23, 2024 16:39
Reverse Engineering; Microsoft Excel Sheet Protection

Microsoft Excel offers a feature called "Sheet Protection". it essentially lets you lock down an excel document so that certain sheets and aspects of it cannot be edited. attempting to edit a protected sheet returns an error message:

image

and trying to unprotect it under the review tab asks for a password.

image

i figured, "This is MS Excel, surely someone has found a way to disable this?" and well, they have. if you do a quick search you will find that MS Office files are just .ZIP files with XML data inside, so all you need to do is open the .xlsx file in a zip archiver, remove all the sheetProtection tags from the XML within, and it'll be unprotected.
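for reference, here is a minimal sketch of that XLSX trick in python. it is not from any of the guides i found. it assumes the worksheets live under xl/worksheets/ and that sheetProtection is written as a self-closing tag, which is how Excel normally writes it. the name `strip_sheet_protection` is made up for this example.

```python
import io
import re
import zipfile

# assumption: the tag is self-closing, e.g. <sheetProtection password="..." sheet="1"/>
SHEET_PROTECTION = re.compile(rb"<sheetProtection\b[^>]*/>")


def strip_sheet_protection(xlsx_bytes):
    """Return a copy of an .xlsx archive with the sheetProtection tags removed."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            data = src.read(name)
            # only touch worksheet XML, copy everything else unchanged
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                data = SHEET_PROTECTION.sub(b"", data)
            dst.writestr(name, data)
    return out.getvalue()
```

you would read the .xlsx into bytes, pass it through `strip_sheet_protection`, and write the result to a new file so the original stays untouched.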

seems easy enough, so i opened it in 7zip and gave it a try:

image

huh? there is no xl/ folder, or any XML files at all. it was all binary data.

looking in a hex editor, it seems this isn't a ZIP file at all:

image

so i had a look at that "D0 CF 11 E0 A1 B1 1A E1" value right at the start and just dropped it into google. turns out, from MS Excel 97 all the way to MS Excel 2003, Microsoft had their own completely different, proprietary binary format for MS Office, which for Excel is .XLS. with Office 2007 they switched to the zipped XML format that is used today, known as XLSX (see the X at the end?). the document i'm trying to mess with is an .XLS file (NOT .XLSX). Excel even makes a distinction between these in the save dialog:

image
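a quick way to tell the two formats apart without a hex editor is to check the first few bytes. this is just a sketch: the OLE signature is the value quoted above, the ZIP signature is the standard "PK" local file header, and the function name `sniff_container` is made up for this example.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy compound file, e.g. .xls
ZIP_MAGIC = b"PK\x03\x04"                      # zip local file header, e.g. .xlsx


def sniff_container(header):
    """Guess the container format from the first bytes of a file."""
    if header.startswith(OLE_MAGIC):
        return "ole"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

usage would be something like `sniff_container(open(path, "rb").read(8))`. note the file extension is not checked at all, which is the point: a file named .xlsx can still be an OLE file inside.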

all the information online on removing the sheet protection feature is for the XLSX format and NOT XLS. i tried looking around for other ways to remove it, but that seemed to be all anyone had, so i figured i'd try to do it myself.

so like, the first thought i had was pretty simple: just open it, then save the document as an XLSX.

unfortunately this gives a warning: apparently XLSX cannot hold macros, and this document uses macros, so its functionality would be lost in XLSX format.

image

i saved an XLSX copy of the file anyway, just so i could have a look at its contents, and sure enough the sheetProtection tag is in here

image

with just a .. password="CDD6". what? was this the password? gave it a try, but nope, not that easy

image

i didn't know what this was. it looked too short to be a hash, so maybe it's obscured or encrypted in some way?

next i created my own document, set sheet protection on it too and saved it, but it seems whatever this thing is, microsoft no longer uses it for the newer XLSX format, and newer versions of excel use proper hashing algorithms like sha512

image

i couldn't figure out what was going on just by comparing files, it seemed my version of office was too new for this. around this time i had a thought: can LibreOffice understand the sheet protection?

turns out it totally can, and it even supports the unprotect thing with the password

image

this was absolutely great because, you see, LibreOffice is open source, so somewhere in there must be the code for whatever the heck it's doing with this password.

even with the source code, however, it is still not always easy to find exactly what you're looking for.

had a guess and just searched "Unprotect", came across here which looked right, so i traced it back here

image

to here

image

image

then finally down to "hashPassword"

image

it doesn't seem like it's using SHA1 or SHA512 etc, so i started looking at HASH_XL, and this XL hash references a UINT16 value.

image

and finally, bingo; the actual hash function itself

image

so .. remember the "password" from earlier? "CDD6"? it seems this was actually a 16 bit hash of the password!

but wait, i don't think a 16 bit hash would be very secure? there are at most 65536 possible values, so it should be possible to brute force a password that produces the same 0xCDD6 relatively quickly.

then i could use that password to unlock the file, as it would have the same hash! alright, finally an attack plan.
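pulling that value out of the converted XLSX copy can be scripted too. this is a rough sketch, not what i actually did (i just read it off in a text editor). it assumes the sheet XML uses the standard SpreadsheetML namespace and that the legacy hash sits in the `password` attribute as hex, like the "CDD6" above. `legacy_sheet_hash` is a made-up name.

```python
import xml.etree.ElementTree as ET

# standard SpreadsheetML main namespace used by worksheet XML
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def legacy_sheet_hash(sheet_xml):
    """Return the legacy 16 bit hash from a worksheet's XML, or None if absent."""
    root = ET.fromstring(sheet_xml)
    for node in root.iter(NS + "sheetProtection"):
        value = node.get("password")
        if value:
            # the attribute holds the hash as a hex string, e.g. "CDD6"
            return int(value, 16)
    return None
```

sheets protected by newer versions of excel have no `password` attribute (they use the sha512 style attributes instead), in which case this returns None.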

== breaking it ==

so. i re-implemented the hash function in python

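def excel_hash(password):
    result = 0
    MAX_UINT16 = 0xFFFF

    if len(password) <= MAX_UINT16:
        # walk the password back to front
        for c in password[::-1]:
            # rotate the 15 bit state left by one bit, then mix in the character
            result = ((result >> 14) & 0x01) | ((result << 1) & 0x7FFF)
            result ^= ord(c)

        # one last rotate, then XOR in the constant 0xCE4B and the length
        result = ((result >> 14) & 0x01) | ((result << 1) & 0x7FFF)
        result ^= (0x8000 | (ord('N') << 8) | ord('K'))
        result ^= len(password)

    return result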

then i can just loop through all possible combinations of characters until one matches a given hash, like so:

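# needs "import itertools" and "import string"
def crack_password(targetHash):
    # try every printable string of length 0 to 5, shortest first
    for combLen in range(0, 6):
        for combination in itertools.product(string.printable, repeat=combLen):
            attempt = ''.join(combination)
            gotHash = excel_hash(attempt)
            if gotHash == targetHash:
                return attempt
    # nothing up to 5 characters matched
    return None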

so the complete brute forcer script is as follows:

import itertools
import string
def excel_hash(password):
    result = 0
    MAX_UINT16 = 0xFFFF

    if len(password) <= MAX_UINT16:
        for c in password[::-1]:
            result = ((result >> 14) & 0x01) | ((result << 1) & 0x7FFF)
            result ^= ord(c)
        
        result = ((result >> 14) & 0x01) | ((result << 1) & 0x7FFF)
        result ^= (0x8000 | (ord('N') << 8) | ord('K'))
        result ^= len(password)

    return result


def crack_password(targetHash):
    for combLen in range(0, 6):
        for combination in itertools.product(string.printable, repeat=combLen):
            attempt = ''.join(combination)
            gotHash = excel_hash(attempt)
            if gotHash == targetHash:
                return attempt

print(crack_password(0xCDD6))

running this gives the output '11g'

image

and well, entering "11g" as the password in the sheet protection unlock prompt

image

very unceremoniously unlocked the sheet protection, and i can now edit the excel document. hurray!

image
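to see why "11g" works, it helps to trace the hash one character at a time. the sketch below is the same algorithm as the python re-implementation above, just split up so the intermediate state can be printed. `rotl15` and `trace_excel_hash` are names made up for this example. keep in mind "11g" only needs to produce the same hash, it is not necessarily the password the document's author typed.

```python
def rotl15(value):
    """Rotate a 15 bit value left by one bit."""
    return ((value >> 14) & 0x01) | ((value << 1) & 0x7FFF)


def trace_excel_hash(password):
    """Return ([(char, state), ...], final_hash) for the legacy hash."""
    steps = []
    result = 0
    for c in reversed(password):
        result = rotl15(result) ^ ord(c)
        steps.append((c, result))
    result = rotl15(result)
    result ^= 0x8000 | (ord('N') << 8) | ord('K')  # constant 0xCE4B
    result ^= len(password)
    return steps, result


steps, final = trace_excel_hash("11g")
for char, state in steps:
    print(f"after {char!r}: 0x{state:04X}")
print(f"final: 0x{final:04X}")
# after 'g': 0x0067
# after '1': 0x00FF
# after '1': 0x01CF
# final: 0xCDD6
```

the loop state after three characters is only 0x01CF, and the final rotate, the 0xCE4B constant and the length 3 turn that into 0xCDD6.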

== polishing it because why not ==

okay so this is cool, i can unlock old excel documents, but in order to extract the hash you have to save it as an XLSX.. yknow what would be really cool? a script you could just pass ANY XLS file, and it outputs the password needed to unlock sheet protection. to do this you would need to parse the password hashes from the XLS file, then crack them all.

we found this old script for converting MS office files for use with John the Ripper (this is for the full document encryption thing, not sheet protection btw), https://github.com/openwall/john/blob/bleeding-jumbo/run/office2john.py and i modified it to extract the password hash from excel files.

old XLS files are an 'ole' archive, which is a format created by microsoft that is also used for MSI files, and office2john uses an old library called 'olefile' to parse it. inside the "Workbook" stream and the other streams is a bunch of "blocks". each has a header with 2 uint16s: the first determines the block type and the 2nd determines the block size, so parsing out the password hashes is relatively easy.

by simply printing out every block in the file and the type associated with it, i was able to find that type 0x13 had the password hash inside it.
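the "print out every block" step can be sketched roughly like this. it works on raw bytes so it doesn't need olefile, and it assumes the layout used by the extraction code in this write-up: a little-endian uint16 type, a uint16 length, then that many bytes of data. `iter_records` is a made-up name, not something from office2john.

```python
import struct


def iter_records(data):
    """Yield (record_type, payload) for each block in a Workbook stream."""
    offset = 0
    while offset + 4 <= len(data):
        # header: little-endian uint16 type followed by uint16 payload length
        record_type, length = struct.unpack_from("<HH", data, offset)
        offset += 4
        payload = data[offset:offset + length]
        if len(payload) < length:
            break  # truncated block, stop here
        yield record_type, payload
        offset += length
```

dumping a stream is then just `for t, p in iter_records(stream_bytes): print(hex(t), len(p))`, and the blocks with type 0x13 and a 2 byte payload are the ones holding the hash.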

here is the code to read the password hash from XLS using olefile library

import struct
import olefile

def extract_sheet_hashes(stream):
    hashes = []
    while True:
        pos = stream.tell()
        if pos >= stream.size:
            break
        try:
            # each block starts with a little-endian uint16 type and uint16 length
            record_type = struct.unpack("<H", stream.read(2))[0]
            length = struct.unpack("<H", stream.read(2))[0]
            data = stream.read(length)

            # type 0x13 carries the 16 bit sheet protection hash
            if record_type == 0x13 and len(data) >= 2:
                hashes.append(struct.unpack("<H", data[:2])[0])
        except struct.error:
            # truncated header, stop parsing this stream
            break
    return hashes

def read_xls(filename):
    ole = olefile.OleFileIO(filename)
    for streamname in ole.listdir():
        stream = ole.openstream(streamname)
        hashes = extract_sheet_hashes(stream)
        print(hashes)
        stream.close()
    ole.close()

so anyway, the result of all of this is finally this python script: https://github.com/LiEnby/CrackSheetProtection/tree/master

you can just run it on any XLS file with sheet protection and it'll work out passwords for everything.

== final thoughts ==

i think a much more optimized method for finding a matching hash value could be used. it seems longer text results in a larger hash value, which makes direct bruteforce a bit slower in some cases; randomly guessing is a lot quicker for higher values but makes the lower ones take longer. the hash algorithm doesn't look that complicated, so i'm sure there's more you could do to optimize finding a working password for it.
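one possible direction, as a sketch only: the hash is nothing but rotates and XORs, so each character just XORs a rotated copy of itself into a 15 bit state, and a matching password can be built directly instead of searched for. the version below is my own guess at how to do that, not something from LibreOffice or the script above. it always builds an 11 character password out of the characters 0x40 to 0x5F ('@', 'A' to 'Z', '[', '\', ']', '^', '_'), using the low 5 bits of characters 0, 5 and 10 to set the 15 state bits. it assumes the target has its top bit set, which the hash function always produces for any sane password length. i have only reasoned about it against the python re-implementation, not tested it in Excel itself.

```python
def excel_hash(password):
    """Legacy 16 bit hash, same algorithm as the re-implementation above."""
    result = 0
    for c in password[::-1]:
        result = ((result >> 14) & 0x01) | ((result << 1) & 0x7FFF)
        result ^= ord(c)
    result = ((result >> 14) & 0x01) | ((result << 1) & 0x7FFF)
    result ^= 0x8000 | (ord('N') << 8) | ord('K')
    result ^= len(password)
    return result


def rotl15(value, count):
    """Rotate a 15 bit value left by count bits."""
    count %= 15
    return ((value << count) | (value >> (15 - count))) & 0x7FFF


LENGTH = 11           # long enough for free characters at indexes 0, 5 and 10
FREE_POSITIONS = (0, 5, 10)


def forge_password(target_hash):
    """Build a password whose legacy hash equals target_hash, without searching."""
    if not 0x8000 <= target_hash <= 0xFFFF:
        raise ValueError("expected a 16 bit hash with the top bit set")
    # undo the final XORs, then undo the final rotate (rotate right by one)
    state = target_hash ^ 0xCE4B ^ LENGTH
    state = ((state >> 1) | ((state & 1) << 14)) & 0x7FFF
    # the character at index i contributes rotl15(ord(char), i) to the state;
    # cancel out the fixed 0x40 bit of every character first
    for i in range(LENGTH):
        state ^= rotl15(0x40, i)
    chars = [0x40] * LENGTH
    # what is left is covered 5 bits at a time by the free characters
    for pos in FREE_POSITIONS:
        chars[pos] |= (state >> pos) & 0x1F
    return ''.join(chr(c) for c in chars)
```

`forge_password(0xCDD6)` gives a different password than "11g", but it hashes to the same value, which is all the unlock prompt should care about.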

also, it goes without saying, but none of this will work with the newer XLSX files (for those, just remove the sheetProtection tags from the XML ..)

anyway, in the end i didn't need to pull out IDA or Ghidra, because really, the libreoffice people reversed it for me a long time ago.
