@vonKrafft
Last active April 18, 2018 09:52
Write-up for Shreddinger (NDH Quals 2018, Dev, 500pt)

Shreddinger

The infamous Shredder tried to destroy important documents! Please help us recover them and prevent his evil scheme.

Points 500
Category Dev
Validation 11
URL http://shreddinger.challs.malice.fr/

The given URL leads to a form asking for the shredded token. We learn that we have 10 seconds to solve the challenge, and a link lets us download a ZIP file containing an image shredded into 100 pieces of 10x1400 pixels each. The pieces are shuffled and some of them are upside down. We will have to:

  • Download the ZIP file and uncompress it;
  • Rotate the upside down pieces;
  • Sort the pieces to reassemble the document;
  • Read the token using OCR;
  • Send the token to retrieve the flag.

First of all, we need to flip the pieces that are upside down. Fortunately, the token is not written in the middle of the document. We check the position of each letter: if the top of a letter is higher than 688px, we flip the piece.

# Find the top of the token: the first non-white pixel
# (a pure white RGB pixel sums to 255*3 = 765) between 650px and 750px
token_height = 750
for w in range(0, 10):
    for h in range(650, 750):
        if sum(img.getpixel((w, h))) < 765:
            token_height = min(h, token_height)
            break
# If the token starts too high, the piece is upside down
if token_height < 688:
    img = img.rotate(180)
After a few tries, we notice that some letters do not start at 688px. But there are sentences above and below the token that we can use to detect the orientation of a piece. Let us say the sentences are 9 pixels high and count the number of black pixels in these 10x9 blocks. The first sentence should be located between 91px and 100px, the second between 291px and 300px, and so on...

We could check every sentence, but remember we only have 10 seconds, so we will check the first, the third and the fifth sentences (a totally arbitrary choice, but I get to decide, right?).

# Count the non-white pixels in the bands just above and just below each expected baseline
sentence_top, sentence_bottom = (0, 0)
for w in range(0, 10):
    for h in range(0, 1400):
        if (91 <= h < 100) or (391 <= h < 400) or (591 <= h < 600):
            sentence_top += min(1, 765-sum(img.getpixel((w, h))))
        if (100 <= h < 109) or (400 <= h < 409) or (600 <= h < 609):
            sentence_bottom += min(1, 765-sum(img.getpixel((w, h))))
if sentence_top < sentence_bottom:
    img = img.rotate(180)

A few tries later, all pieces have the right orientation. The next step is to sort the pieces. First, we need to decide which piece comes first. If the first column is completely white but the last one is not, then a sentence starts on that piece. This is not 100% accurate because of the token font, but it does the job.
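This first-piece test can be sketched as follows (`is_first_piece` is a hypothetical helper; it assumes a 10x1400 RGB piece where a pure-white pixel's channels sum to 765, as in the rest of the write-up):

```python
def is_first_piece(img):
    """A piece starts the document if its first column is all white
    but its last column contains at least one non-white pixel."""
    first_col_white = all(sum(img.getpixel((0, h))) == 765 for h in range(1400))
    last_col_inked = any(sum(img.getpixel((9, h))) < 765 for h in range(1400))
    return first_col_white and last_col_inked
```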

Once we have the first piece, we build a trace of its last column. I call a trace the binary representation of a column: a white pixel is a zero, any other pixel is a one. Then we XOR this trace with the trace of the first column of every other piece. The XOR result with the fewest 1s is the one most likely to match.
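A minimal sketch of this matching step (`best_match` is a hypothetical helper; `candidates` are `(index, trace_first)` pairs and traces are the column integers described above):

```python
def best_match(prev_trace_last, candidates):
    """Among (index, trace_first) pairs, pick the piece whose first-column
    trace differs from the previous piece's last-column trace in the
    fewest bit positions (popcount of the XOR)."""
    return min(candidates, key=lambda c: bin(c[1] ^ prev_trace_last).count('1'))
```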

To sort the pieces, we need to compute the traces of the first and last columns of each piece.

# Traces as 1400-bit integers: bit h is set when the pixel at height h is not pure white
trace_first, trace_last = (0, 0)
for w in range(0, 10):
    for h in range(0, 1400):
        if w == 0:
            trace_first += pow(2, h) * min(1, 765-sum(img.getpixel((w, h))))
        if w == 9:
            trace_last += pow(2, h) * min(1, 765-sum(img.getpixel((w, h))))

The result is not perfect, mostly because it is difficult to find the first piece. With more time, we could build several candidate assemblies for comparison, but we only have 10 seconds. With this approach, we spend 5 to 6 seconds building the document, and we have not even performed the OCR yet. For this challenge, I decided to live with these failures: there is no attempt limit, so if we fail with one ZIP file, we can retry.

For character recognition, pytesseract does the job pretty well. But because of the token font, the recognition failed every time. To be honest, I left my script looping for a long time before I realized that the OCR was not working as I wanted.

To improve the OCR, we keep only uppercase letters and digits. We can also notice that the token is a hexadecimal string. We could simply delete all non-hexadecimal characters, but we would lose some letters. We therefore replace the badly recognized characters with similar hexadecimal characters. Thanks to the large number of images collected earlier, I was able to observe which characters were often poorly recognized.

text = pytesseract.image_to_string(img)
text = re.sub(r'[^0-9A-Z]', '', text.upper())
text = text.replace('O', '0').replace('P', 'F').replace('I', '1').replace('Z', '2').replace('Q', '0').replace('G', '6').replace('S', '5')

That's all; we can let our script run until it successfully rebuilds a document and recognizes the token. Personally, it took me twenty attempts before I managed to send the right token and get the flag.

Congratz! The flag is: NDH{H0ly_fr@cking_PONY_!--D3Ath-to-the_0n3_whO-ordered_piZzas-w!th_pineapple_!}

Here is the complete script:

from PIL import Image
import pytesseract
import requests
import zipfile
import shutil
import time
import re


class Picture(object):
    def __init__(self, index):
        super(Picture, self).__init__()
        self.index = index
        self.img = Image.open('chall/%d.png' % index).convert('RGB')
        # Rotation using the token position
        token_height = 750
        for w in range(0, 10):
            for h in range(650, 750):
                if sum(self.img.getpixel((w, h))) < 765:
                    token_height = min(h, token_height)
                    break
        if token_height < 688:
            self.img = self.img.rotate(180)
        # Traces and sentence-band pixel counts
        self.trace_first, self.trace_last = (0, 0)
        sentence_top, sentence_bottom = (0, 0)
        for w in range(0, 10):
            for h in range(0, 1400):
                if w == 0:
                    self.trace_first += pow(2, h) * min(1, 765 - sum(self.img.getpixel((w, h))))
                if w == 9:
                    self.trace_last += pow(2, h) * min(1, 765 - sum(self.img.getpixel((w, h))))
                if (91 <= h < 100) or (391 <= h < 400) or (591 <= h < 600):
                    sentence_top += min(1, 765 - sum(self.img.getpixel((w, h))))
                if (100 <= h < 109) or (400 <= h < 409) or (600 <= h < 609):
                    sentence_bottom += min(1, 765 - sum(self.img.getpixel((w, h))))
        # Rotation adjustment: a 180-degree turn swaps the first and last
        # columns and reverses each one, so swap the traces simultaneously
        # (not one after the other) and reverse them as fixed-width
        # 1400-bit strings
        if sentence_top < sentence_bottom:
            self.img = self.img.rotate(180)
            self.trace_first, self.trace_last = (
                int(format(self.trace_last, '01400b')[::-1], 2),
                int(format(self.trace_first, '01400b')[::-1], 2),
            )


# Main
sess = requests.session()
start = time.time()
print('[*] Downloading ZIP ...')
response = sess.get('http://shreddinger.challs.malice.fr/challenge_accepted', stream=True)
if response.status_code == 200:
    with open('output/%s_chall.zip' % start, 'wb') as f:
        response.raw.decode_content = True
        shutil.copyfileobj(response.raw, f)
print('[*] Downloading complete: %.4f' % (time.time() - start))
print('[*] Extracting ZIP ...')
zip_ref = zipfile.ZipFile('output/%s_chall.zip' % start, 'r')
zip_ref.extractall('chall')
zip_ref.close()
print('[*] Reassembling image ...')
pictures_list = [Picture(i) for i in range(0, 100)]
background = Image.new('RGBA', (1000, 1400), (255, 255, 255, 255))
used_pictures = []
# Find the first piece: blank first column, non-blank last column
for pic in pictures_list:
    if pic.trace_first == 0 and pic.trace_last != 0:
        break
background.paste(pic.img, (0, 0))
used_pictures.append(pic.index)
previous_pic = pic
# Greedily append the unused piece whose first-column trace best matches
# the last-column trace of the previous piece (also trying 1px shifts)
for col in range(1, 100):
    lighter_img = (1400, None)
    for i in range(0, 100):
        pic = pictures_list[i]
        if pic.index not in used_pictures:
            diff = bin(pic.trace_first ^ previous_pic.trace_last)[2:].count('1')
            diff += bin((pic.trace_first >> 1) ^ previous_pic.trace_last)[2:].count('1')
            diff += bin((pic.trace_first << 1) ^ previous_pic.trace_last)[2:].count('1')
            diff = int(diff / 3)
            if diff < lighter_img[0]:
                lighter_img = (diff, pic)
    if lighter_img[1] is None:
        break
    background.paste(lighter_img[1].img, (10 * col, 0))
    used_pictures.append(lighter_img[1].index)
    previous_pic = lighter_img[1]
background.save('output/%s_out4.png' % start)
print('[*] Reassembling complete: %.4f' % (time.time() - start))
print('[*] Processing OCR ...')
token = background.crop((0, 650, 1000, 750))
text = pytesseract.image_to_string(token)
text = re.sub(r'[^0-9A-Z]', '', text.upper())
text = text.replace('O', '0').replace('P', 'F').replace('I', '1').replace('Z', '2').replace('Q', '0').replace('G', '6').replace('S', '5')
print('[*] Processing OCR complete: %.4f' % (time.time() - start))
if len(text) >= 40:
    print('[*] Sending token ...')
    response = sess.post('http://shreddinger.challs.malice.fr/', data={'shredded_token': text})
    with open('output/%s_response.html' % start, 'a') as logfile:
        logfile.write(response.text)
    print('[+] Capturing Token: %s' % text)
    token.save('output/%s_%s.png' % (start, text))
elif len(text) > 35:
    print('[-] Capturing Token: %s' % text)
    token.save('output/%s_%s.png' % (start, text))
else:
    print('[!] Capturing Token: %s' % text)