Goal: identify unique patterns and strings
Note: A Linux executable (ELF) starts with the bytes 7f 45 4c 46; the next three bytes, 02 01 01, narrow the match to 64-bit little-endian ELF files.
Note: A Windows executable (MZ) starts with 4D 5A.
Note: Add this sequence of hexadecimal numbers to a rule to avoid false positives.
Note: Hexadecimal strings must be put between { }.
$magic_bytes = { 7f 45 4c 46 02 01 01 } //linux elf
Note: The ascii, wide, and nocase keywords tell YARA to search for ASCII strings, to search for wide (two bytes per character) strings, and to ignore case.
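A minimal sketch of how the magic-byte note might look as a complete rule. The rule name is hypothetical. Hex strings match anywhere in a file unless the condition says otherwise, so this sketch anchors the pattern at offset 0.

```yara
rule Example_ELF64_Magic
{
    strings:
        // ELF magic (7f 45 4c 46) followed by 02 01 01:
        // 64-bit class, little-endian data, ELF version 1
        $magic_bytes = { 7f 45 4c 46 02 01 01 } // linux elf

    condition:
        // Only accept the pattern at the very start of the file
        $magic_bytes at 0
}
```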
Viewing strings in a file
Note: strings malware.exe | less
String rule: Malware samples often contain unique and interesting strings, which are ideal for building out a YARA rule. To use a string within a rule, declare it as a variable.
$a="string from malware sample"
In addition to declaring a string, we can also append modifiers after the declared string to fine-tune the search.
$a="malwarestring" fullword – This modifier matches only when the string is delimited by non-alphanumeric characters. For example ‘www.malwarestring.com’ would return a match, but ‘www.abcmalwarestring.com’ would not.
$a="malwarestring" wide – This matches the string encoded with two bytes per character, where each character is followed by a null byte, for example ‘w.w.w…m.a.l.w.a.r.e.s.t.r.i.n.g…c.o.m.’ (the dots between letters stand for null bytes). On its own, wide does not match the plain ASCII form.
$a="malwarestring" wide ascii – This matches both the wide and the ASCII form of the string.
$a="MalwareString" nocase – The rule will match the string regardless of case.
$a={5C 70 68 6F 74 6F 2E 70 6E 67} – Note that hex strings use curly brackets instead of quotation marks.
$a={5C 70 68 6F ?? ?F 2E 70 6E 67} – Question marks can be used as wildcards, for a whole byte (??) or a single nibble (?F), if you have detected a slight variation of a hex pattern across multiple samples.
$a={5C [2-10] 6F 74 6F 2E 70 6E 67} – Here the string starts with the value ‘5C’, followed by 2 to 10 arbitrary bytes before the matching pattern resumes.
$a={5C (01 02 | 03 04) 6F 2E 70 6E 67} – Here the hex values in this location can be either ‘01 02’ or ‘03 04’.
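The modifiers and hex-string forms listed above can be sketched together in one rule. This is only an illustration: the rule name and string identifiers are hypothetical, and the patterns are the placeholder values from these notes rather than indicators from a real sample.

```yara
rule Example_String_Modifiers
{
    strings:
        // Text string that must be delimited by non-alphanumeric characters
        $full = "malwarestring" fullword
        // Matches both the ASCII and the wide form, ignoring case
        $any_enc = "MalwareString" ascii wide nocase
        // Hex string with a whole-byte wildcard (??) and a nibble wildcard (?F)
        $hex_wild = { 5C 70 68 6F ?? ?F 2E 70 6E 67 }
        // Hex string with a jump of 2 to 10 arbitrary bytes
        $hex_jump = { 5C [2-10] 6F 74 6F 2E 70 6E 67 }
        // Hex string with two alternative byte sequences
        $hex_alt = { 5C ( 01 02 | 03 04 ) 6F 2E 70 6E 67 }

    condition:
        any of them
}
```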
Strings example:
strings:
$pdf_magic = {25 50 44 46}
$magic_bytes = { 7f 45 4c 46 02 01 01 } //linux elf
$s_anchor_tag = "<a " ascii wide nocase //$s_anchor_tag matches any HTML anchor tag, which some PDF converters may leave in a document converted from HTML.
$s_uri = /\(http.+\)/ ascii wide nocase //$s_uri is a regular expression that matches any URI/URL in parentheses, which will match the PDF standard for URI actions and URLs in forms.
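As a rough sketch, the PDF-related strings above could be combined into a complete rule like the one below. The rule name and the condition are assumptions made for illustration; the strings are copied from the example.

```yara
rule Example_PDF_With_Links
{
    strings:
        $pdf_magic = { 25 50 44 46 }              // "%PDF"
        $s_anchor_tag = "<a " ascii wide nocase   // leftover HTML anchor tag
        $s_uri = /\(http.+\)/ ascii wide nocase   // URI/URL in parentheses

    condition:
        // Require the PDF header at offset 0, plus at least one link indicator
        $pdf_magic at 0 and ($s_anchor_tag or $s_uri)
}
```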
Condition example:
uint16(0) == 0x5A4D – Checking the header of a file is a great condition to include in your YARA rules. This condition requires the file to be a Windows executable, because the bytes 4D 5A (‘MZ’) are always located at the start of an executable file header. The value is written as 0x5A4D because uint16 reads the two bytes as a little-endian integer.
(uint32(0) == 0x464c457f) or (uint32(0) == 0xfeedfacf) or (uint32(0) == 0xcffaedfe) or (uint32(0) == 0xfeedface) or (uint32(0) == 0xcefaedfe) – Used to identify Linux (ELF, 0x464c457f) and macOS (Mach-O, the remaining four values) binaries by checking the file header.
(#a == 6) – The string $a occurs exactly 6 times.
(#a > 6) – The string $a occurs more than 6 times.
There are a few different ways to specify the file size condition.
(filesize>512)
(filesize<5000000)
(filesize<5MB)
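A minimal sketch combining the header check, the occurrence count, and the file size condition in one rule. The rule name and the string are hypothetical placeholders.

```yara
rule Example_Header_Count_Size
{
    strings:
        $a = "malwarestring" ascii wide

    condition:
        uint16(0) == 0x5A4D and   // bytes 4D 5A ("MZ") read as a little-endian integer
        #a > 6 and                // $a occurs more than 6 times
        filesize < 5MB            // KB and MB suffixes are accepted
}
```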
Once the strings have been declared within a rule, you can specify how many of them
must match for the rule's condition to be satisfied.
Where possible, combine 2-3 groups of conditions to avoid generating false positives
and to create a reliable rule.
2 of ($a,$b,$c)
3 of them
4 of ($a*)
all of them
any of them
$a and not $b
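The set expressions above might be combined with other condition groups as in the following sketch. All names and string values here are hypothetical and only show the shape of such a rule.

```yara
rule Example_String_Sets
{
    strings:
        $a1 = "first indicator"
        $a2 = "second indicator"
        $a3 = "third indicator"
        $b = "known benign marker"

    condition:
        uint16(0) == 0x5A4D and   // group 1: file type
        filesize < 5MB and        // group 2: file size
        2 of ($a*) and            // group 3: at least two of $a1, $a2, $a3
        not $b                    // exclude files containing the benign marker
}
```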
Imports:
PE Library:
Adding the line import "pe" to the start of a YARA rule file allows you to use the PE functionality of YARA.
This is useful if you cannot identify any unique strings.
pe.exports("Botanist") and pe.exports("Chechako") and pe.exports("Originator") and pe.exports("Repressions") – pe.exports takes one export name per call.
pe.imports("winhttp.dll", "WinHttpConnect")
pe.machine == pe.MACHINE_AMD64 – Used for checking machine type.
pe.imphash() == "0e18f33408be6e4cb217f0266066c51c" – The hash must be written in lowercase.
pe.timestamp == 1616850469 // Sat Mar 27 13:07:49 2021 UTC // timestamp must be converted to an epoch unix timestamp
pe.version_info["CompanyName"] contains "AmAZon.cOm" – The contains operator is case-sensitive.
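pe.locale(0x0804) // Chinese (Simplified, PRC) – Resource languages can be matched by their 16-bit Microsoft locale identifier. pe.language takes only the 8-bit primary language identifier (0x04 for Chinese).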
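A short sketch of a rule that relies only on the PE module, with no strings section. The rule name is hypothetical, the values are the placeholders from these notes, and each pe.exports call checks a single export name.

```yara
import "pe"

rule Example_PE_Module
{
    condition:
        pe.machine == pe.MACHINE_AMD64 and
        pe.imports("winhttp.dll", "WinHttpConnect") and
        pe.exports("Botanist") and
        pe.version_info["CompanyName"] contains "AmAZon.cOm"
}
```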