Challenge: Which Lee?
Category: MISCELLANEOUS
Challenge-Author: waituck
Description:
Which Lee do you want to be? Can you be the best Lee of them all?
Find out which Lee you are at this website!
p.s. we are using pytorch 1.8.0+cpu
Hint: Numerical InstabiLEEty
File-Attached: distrib.zip
Solves: 3
Final-Points: 992
A quick sidenote:
I'm a smol brained person who can't understand ML very well.
This is evident in how easily people solved the ML challenge that I set in STACK the Flags 2020.
Thankfully, I was eventually able to modify my STACK the Flags solution enough to solve this challenge.
In this challenge, we are supposed to get the right LEE returned from the server.
distrib.zip contains eval.py and leenet.ph.
I uploaded a random image, and put that image through eval.py. I then modified eval.py a bit to also print the weights. I got this:
So it's quite clear that there are 5 possible results, and the image I uploaded is index 4, or Mark LEE.
The approach I took was to essentially try and fuzz the ML to submission, which is partially outlined in this blog.
Another thing of note is these lines in eval.py: we are dealing with greyscale images that are 16x16.
transform = transforms.Compose([transforms.Resize(16),
                                transforms.CenterCrop(16),
                                transforms.Grayscale(),
                                transforms.ToTensor()])
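As a rough mental model of what this pipeline hands to the network, here is a pure-Python sketch of the last step. It assumes the image is already a 16x16 greyscale PNG (so Resize, CenterCrop and Grayscale should leave it effectively unchanged) and only mimics how ToTensor scales 8-bit values into the 0.0 to 1.0 range. The helper name to_tensor_grey is made up for illustration and is not something in eval.py.

```python
def to_tensor_grey(pixels):
    """Mimic transforms.ToTensor() for one 8-bit greyscale image.

    pixels: rows of ints in 0..255, indexed as pixels[y][x].
    Returns a nested list shaped [1][height][width] with floats in [0.0, 1.0].
    """
    return [[[value / 255 for value in row] for row in pixels]]


# Black 16x16 canvas with one white pixel at x=3, y=7 (an arbitrary choice).
canvas = [[0] * 16 for _ in range(16)]
canvas[7][3] = 255

tensor = to_tensor_grey(canvas)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # channels, height, width
print(tensor[0][7][3], tensor[0][0][0])                # white pixel, black pixel
```

So the model only ever sees 256 numbers between 0 and 1 per image, which keeps the input space small enough to poke at pixel by pixel.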
After some modification of my STACK the Flags solution (which uses imagemagick to generate the images) we start with this fuzz.py:
'''
There were some additional modifications made to thread the thing
But I got too lazy to finish it up properly in the end
'''
import json
import subprocess

def magick(param1, param2):
    cmd = f'magick -size 16x16 {param1} -colorspace Gray ./a/{param2}/lee/{param2}.png'
    subprocess.call(cmd, shell=True)

def post(param2):
    cmd = f'python -W ignore ./eval.py ./a/{param2}'
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
    return p.communicate()[0].decode('utf-8').strip()
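Both helpers build a command string and hand it to the shell. A possible variation, sketched below, passes an argument list instead, which sidesteps quoting problems once the ImageMagick arguments start to contain spaces or quotes. The function names magick_argv, eval_argv and run are hypothetical, and this is not the code that was actually run for the challenge; the ImageMagick and eval.py arguments are the same ones used in magick() and post().

```python
import subprocess
import sys


def magick_argv(image_args, name, size=16):
    """Argument list mirroring the magick() command string."""
    return ['magick', '-size', f'{size}x{size}', *image_args,
            '-colorspace', 'Gray', f'./a/{name}/lee/{name}.png']


def eval_argv(name):
    """Argument list mirroring post(), using the current interpreter."""
    return [sys.executable, '-W', 'ignore', './eval.py', f'./a/{name}']


def run(argv):
    # No shell=True: each list element reaches the program as one argument.
    done = subprocess.run(argv, stdout=subprocess.PIPE)
    return done.stdout.decode('utf-8').strip()


# Only print the commands here; calling run() needs ImageMagick and eval.py.
print(magick_argv(['canvas:rgb(0,0,0)'], '0'))
print(eval_argv('0'))
```

With this shape, a draw operation becomes two list elements such as '-draw' and 'color 3,7 point', with no nested quotes to escape.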
I also modified eval.py to just print the predictions in an array:
print(f'{[float(a.data) for a in y_pred[0]]}')
We first want to have a look around and see if we can score any easy wins by just altering the greys in fuzz.py:
def fuzz_greys():
    path = '0'
    for i in range(0, 256, 8):
        magick(f'canvas:rgb({i},{i},{i})', path)
        list_ = json.loads(post(path))
        maxScore = list_.index(max(list_))
        print(f'rgb({i},{i},{i}):\t{list_} - {maxScore}')

fuzz_greys()
result:
rgb(0,0,0): [0.044459812343120575, 0.09277509897947311, 0.26647526025772095, 0.04446981102228165, 0.5518100261688232] - 4
rgb(8,8,8): [0.04275749623775482, 0.11253977566957474, 0.2634689509868622, 0.042767494916915894, 0.5384563207626343] - 4
rgb(16,16,16): [0.041637588292360306, 0.13498768210411072, 0.2140824943780899, 0.04164758697152138, 0.5676347017288208] - 4
rgb(24,24,24): [0.041420068591833115, 0.1492651402950287, 0.17769992351531982, 0.041430067270994186, 0.5901747941970825] - 4
rgb(32,32,32): [0.041490498930215836, 0.16069574654102325, 0.15623055398464203, 0.04150049760937691, 0.6000727415084839] - 4
rgb(40,40,40): [0.04182657226920128, 0.17313922941684723, 0.13585084676742554, 0.04183657094836235, 0.6073367595672607] - 4
rgb(48,48,48): [0.042387332767248154, 0.18530000746250153, 0.11940641701221466, 0.042397331446409225, 0.6104989051818848] - 4
rgb(56,56,56): [0.043064016848802567, 0.1961868554353714, 0.10716338455677032, 0.04307401552796364, 0.610501766204834] - 4
rgb(64,64,64): [0.043797604739665985, 0.20591259002685547, 0.09783407300710678, 0.04380760341882706, 0.6086381673812866] - 4
rgb(72,72,72): [0.04454972594976425, 0.21459907293319702, 0.0905671939253807, 0.044559724628925323, 0.6057142615318298] - 4
...
The result was rather disappointing (even after I tried iterating through all the greys instead of stepping 8 at a time).
There was only some variance in index 1, 2 and 4, and the model clearly favours index 4 a large proportion of the time.
I then attempted a pure brute force approach that generates random images, but that didn't work out well either (fuzz.py):
def fuzz_random():
    path = '0'
    while True:
        magick('xc: +noise Random', path)
        list_ = json.loads(post(path))
        maxScore = list_.index(max(list_))
        print(f'score:\t{list_} - {maxScore}')
        if maxScore != 4:
            exit()

fuzz_random()
After trying out various options and failing, I decided to try a more logical approach.
Since the canvas is small, 16x16, we can try to see how each pixel influences the model by brute force.
Let's start with a black canvas, then draw a single white pixel on the canvas each time, so that there is only ever 1 white pixel on a black canvas (fuzz.py):
scoreList = []

def fuzz_test(i, j, path='0'):
    global scoreList
    magick(f'canvas:rgb(0,0,0) -fill white -draw "color {i},{j} point"', path)
    list_ = json.loads(post(path))
    scoreList.append(list_)
    maxScore = list_.index(max(list_))
    print(f'({i},{j}):\t{list_} - {maxScore}')

def fuzz_test_single():
    for i in range(0, 16):
        for j in range(0, 16):
            fuzz_test(i, j)

fuzz_test_single()
# saved into a file so I can analyze as and how I want later
with open('results.txt', 'w+') as f:
    f.write(json.dumps(scoreList))
result:
(0,0): [0.04147734120488167, 0.13917842507362366, 0.2549462914466858, 0.04148733988404274, 0.5229007005691528] - 4
(0,1): [0.05107993632555008, 0.05947576090693474, 0.2170117348432541, 0.05108993500471115, 0.6213326454162598] - 4
(0,2): [0.04738770052790642, 0.09460298717021942, 0.09097101539373398, 0.04739769920706749, 0.7196305990219116] - 4
(0,3): [0.04606426879763603, 0.0795510783791542, 0.21261478960514069, 0.046074267476797104, 0.6156855821609497] - 4
(0,4): [0.04434482753276825, 0.09341193735599518, 0.16994573175907135, 0.04435482621192932, 0.6479326486587524] - 4
(0,5): [0.046178340911865234, 0.0826142281293869, 0.3667926788330078, 0.046188339591026306, 0.45821645855903625] - 4
(0,6): [0.04499290883541107, 0.08992206305265427, 0.14749741554260254, 0.045002907514572144, 0.6725746989250183] - 4
(0,7): [0.043800752609968185, 0.10243406146764755, 0.14953751862049103, 0.04381075128912926, 0.6604069471359253] - 4
(0,8): [0.041059162467718124, 0.16706378757953644, 0.1807200312614441, 0.041069161146879196, 0.5700778365135193] - 4
(0,9): [0.04617907851934433, 0.08152937889099121, 0.3128729462623596, 0.0461890771985054, 0.5132195353507996] - 4
...
Still lots of 4s, with 2s sprinkled in a few areas.
If I upload one of the 2s, I get Bruce Lee.
But let's analyze the results a bit more. I really want an index 1, since it's the only other one that goes above 0.10.
I saved the results to saved_initial.txt so that they do not get overwritten.
I wrote analyzer.py:
import json
import math

IDX = 1

# we can calculate the coordinate this corresponds to based on the index
def coord(i):
    x = math.floor(i/16)
    y = i % 16
    return x, y

def idxFind(elem):
    return elem[IDX]

with open('saved_initial.txt') as f:
    json_ = json.loads(f.read())

# we add the index to each result as we will be sorting later and losing the old index
for i, val in enumerate(json_):
    json_[i].append(i)

json_.sort(key=idxFind)
for val in json_:
    print(f'{coord(val[5])}\t{val[IDX]}')
result:
(5, 1) 0.053585462272167206
(1, 12) 0.053627923130989075
(1, 4) 0.05366024374961853
...
(1, 15) 0.2926003336906433
(13, 14) 0.3117857575416565
(14, 5) 0.33840033411979675
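The ranking step can also be written without appending the index to each row. The following is a minimal sketch of the same idea, using only the first three rows of the single-pixel run (pixels (0,0), (0,1) and (0,2)) as sample data. The function name rank_pixels is an illustrative assumption, and divmod(n, 16) is meant to give the same (i, j) pair as coord().

```python
# First three rows of the single-pixel run, in scan order: (0,0), (0,1), (0,2).
scores = [
    [0.04147734120488167, 0.13917842507362366, 0.2549462914466858,
     0.04148733988404274, 0.5229007005691528],
    [0.05107993632555008, 0.05947576090693474, 0.2170117348432541,
     0.05108993500471115, 0.6213326454162598],
    [0.04738770052790642, 0.09460298717021942, 0.09097101539373398,
     0.04739769920706749, 0.7196305990219116],
]


def rank_pixels(scores, idx, width=16):
    """Return ((i, j), score) pairs sorted ascending by the score of class idx."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1][idx])
    return [(divmod(n, width), row[idx]) for n, row in ranked]


for coord, value in rank_pixels(scores, idx=1):
    print(f'{coord}\t{value}')
```

As in analyzer.py, the list is sorted ascending, so the pixel that pushes the chosen class highest ends up on the last line.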
This result tells me that a white pixel on (14, 5) makes the model think that it's more similar to whatever index 1 is.
So let's add that into fuzz.py by slightly modifying fuzz_test!
def fuzz_test(i, j, path='0'):
    global scoreList
    l = [
        (14,5)
    ]
    s = ''
    for v in l:
        s += f'-fill white -draw "color {v[0]},{v[1]} point" '
    magick(f'canvas:rgb(0,0,0) {s} -fill white -draw "color {i},{j} point"', path)
    list_ = json.loads(post(path))
    scoreList.append(list_)
    maxScore = list_.index(max(list_))
    print(f'({i},{j}):\t{list_} - {maxScore}')
result:
...
(0,8): [0.04008222371339798, 0.3721823990345001, 0.22077898681163788, 0.04009222239255905, 0.3268541693687439] - 1
...
Nice, we got an index 1 (it's Bobby Lee), but it's still not the right solution.
After a few failed attempts at trying this method for index 0 and index 3, I decided to try something a bit different.
What if, instead of trying to find the highest possible value for index 0 and index 3 individually, I try to find the lowest possible value for index 1, 2 and 4 as a sum?
This way, I optimize for removing the index 1, 2 and 4 results that I'm no longer interested in.
I modify analyzer.py slightly:
def idxFind(elem):
    return elem[1] + elem[2] + elem[4]
...
for val in json_:
    print(f'{coord(val[5])}\t{val[1] + val[2] + val[4]}')
results:
(5, 10) 0.8913063481450081
(9, 0) 0.8916431814432144
(2, 12) 0.8921749405562878
...
(10, 3) 0.919761523604393
(6, 5) 0.9197917729616165
(14, 5) 0.92027947306633
I add the coordinates that correspond to the lowest sum to fuzz_test in fuzz.py, and re-run the script.
Then I re-analyze the new set of results, and keep repeating this for a while.
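This manual loop (run, analyze, fix the best pixel, repeat) is essentially a greedy search, and could be sketched roughly as follows. Everything here is an assumption made for illustration: evaluate is a hypothetical callable standing in for magick() plus post(), and toy_evaluate is a made-up scoring function that exists only so the loop can run without the real model.

```python
from itertools import product

UNWANTED = (1, 2, 4)  # the classes to push down


def greedy_search(evaluate, rounds, size=16, unwanted=UNWANTED):
    """Repeatedly fix the extra white pixel that minimises the unwanted sum.

    evaluate(pixels) is a hypothetical callable returning the five scores for
    a black canvas with `pixels` (a list of (i, j) tuples) painted white.
    """
    fixed = []
    for _ in range(rounds):
        candidates = [p for p in product(range(size), repeat=2) if p not in fixed]
        best = min(candidates,
                   key=lambda p: sum(evaluate(fixed + [p])[k] for k in unwanted))
        fixed.append(best)
    return fixed


def toy_evaluate(pixels):
    # Stand-in for the real model, NOT leenet: each white pixel moves a
    # little score from classes 1, 2 and 4 over to class 0.
    shift = sum(((i * 7 + j * 3) % 11) / 1000 for i, j in pixels)
    return [0.04 + shift, 0.10 - shift / 3, 0.26 - shift / 3,
            0.04, 0.56 - shift / 3]


result = greedy_search(toy_evaluate, rounds=2, size=4)
print(result)
```

Against the real model, each round would cost roughly 256 runs of eval.py, which is slow but still manageable for a handful of rounds.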
I ended up with:
In this state, when I re-ran the script, something happened:
...
(2,15): [0.046100229024887085, 0.1069311872124672, 0.09369169920682907, 0.04611022770404816, 0.7071566581726074] - 4
(3,0): [0.058916423469781876, 0.07276828587055206, 0.7504523992538452, 0.05892642214894295, 0.05892642214894295] - 2
Traceback (most recent call last):
  File "C:\Users\duckness\Downloads\distrib\fuzz.py", line 76, in <module>
    fuzz_test_all()
  File "C:\Users\duckness\Downloads\distrib\fuzz.py", line 72, in fuzz_test_all
    fuzz_test(i, j)
  File "C:\Users\duckness\Downloads\distrib\fuzz.py", line 64, in fuzz_test
    list_ = json.loads(post(path))
  File "C:\Python39\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Python39\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python39\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
Clearly, we have an interesting image.
Let's run that in eval.py:
python -W ignore eval.py ./a/0
[nan, nan, nan, nan, nan]
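This also explains the crash: a bare lowercase nan is not something json.loads accepts. A minimal sketch of a more tolerant parser is below, assuming eval.py keeps printing a flat list of floats. The name parse_scores is hypothetical and not part of the original fuzz.py.

```python
import json
import math


def parse_scores(text):
    """Parse eval.py's printed list, tolerating the bare `nan` that json rejects."""
    body = text.strip().strip('[]')
    return [float(part) for part in body.split(',')] if body else []


line = '[nan, nan, nan, nan, nan]'
try:
    json.loads(line)
except json.JSONDecodeError as err:
    print('json.loads failed:', err)

scores = parse_scores(line)
print(all(math.isnan(s) for s in scores))
```

With a parser like this, the fuzzer could check math.isnan on the scores and stop cleanly on such an image instead of dying with a traceback.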
And uploading it to the website:
Hey, if it works, it works.
final fuzz.py:
import json
import subprocess

scoreList = []

def magick(param1, param2):
    cmd = f'magick -size 16x16 {param1} -colorspace Gray ./a/{param2}/lee/{param2}.png'
    subprocess.call(cmd, shell=True)

def post(param2):
    cmd = f'python -W ignore ./eval.py ./a/{param2}'
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
    return p.communicate()[0].decode('utf-8').strip()

def fuzz_greys():
    path = '0'
    for i in range(0, 256, 8):
        magick(f'canvas:rgb({i},{i},{i})', path)
        list_ = json.loads(post(path))
        maxScore = list_.index(max(list_))
        print(f'rgb({i},{i},{i}):\t{list_} - {maxScore}')

def fuzz_random():
    path = '0'
    while True:
        magick('xc: +noise Random', path)
        list_ = json.loads(post(path))
        maxScore = list_.index(max(list_))
        print(f'score:\t{list_} - {maxScore}')
        if maxScore != 4:
            exit()

def fuzz_test(i, j, path='0'):
    global scoreList
    l = [
        (5, 10),
        (14, 5)
    ]
    s = ''
    for v in l:
        s += f'-fill white -draw "color {v[0]},{v[1]} point" '
    magick(f'canvas:rgb(0,0,0) {s} -fill white -draw "color {i},{j} point"', path)
    list_ = json.loads(post(path))
    scoreList.append(list_)
    maxScore = list_.index(max(list_))
    print(f'({i},{j}):\t{list_} - {maxScore}')

def fuzz_test_all():
    for i in range(0, 16):
        for j in range(0, 16):
            fuzz_test(i, j)

# fuzz_greys()
# fuzz_random()
fuzz_test_all()
with open('saved.txt', 'w+') as f:
    f.write(json.dumps(scoreList))
final analyzer.py:
import json
import math

# we can calculate the coordinate this corresponds to based on the index
def coord(i):
    x = math.floor(i/16)
    y = i % 16
    return x, y

def idxFind(elem):
    return elem[1] + elem[2] + elem[4]

with open('saved.txt') as f:
    json_ = json.loads(f.read())

# we add the index to each result as we will be sorting later and losing the old index
for i, val in enumerate(json_):
    json_[i].append(i)

json_.sort(key=idxFind)
for val in json_:
    print(f'{coord(val[5])}\t{val[1] + val[2] + val[4]}')
An actual BIG BRAIN solution by 4yn: https://github.com/4yn/slashbadctf/blob/master/sgctf21/which-lee/which-lee-solution.md