Wesley Shields
wxsBSD
Security Engineer. Retired FreeBSD committer. I tend to hack on things involving security or networking.
I've shared this technique with some people privately, but might as well share it publicly now since I was asked about it. I've been using this for a while now with good success. It works well for parsing .NET droppers and other things.
If you don't know what the -D flag to YARA does, I suggest you import a module and run a file through it using that flag. It will print, to stdout, everything the module parsed that doesn't involve you calling a function. This is a great way to get a quick idea of the structure of a file.
I've been working on optimizing the YARA compiler to generate better bytecode for loops. The goal is to skip as much of a loop as possible by not iterating further once the loop condition is met. Here's the rule I'm using. It is completely contrived and excessive, but it shows the performance improvement:
wxs@wxs-mbp yara % cat rules/test.yara
rule a {
condition:
for any i in (0..100000000): (i == 1)
}
wxs@wxs-mbp yara %
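To make the early-exit idea concrete, here is a minimal Python model of a `for any` loop that stops as soon as the condition is met. It is only a sketch of the behavior, not YARA's bytecode or implementation; the helper name `for_any` and the iteration counter are hypothetical and exist purely for illustration.

```python
def for_any(start, stop, predicate):
    """Model `for any i in (start..stop): predicate(i)` with early exit.

    Returns (result, iterations) so the amount of work done is visible.
    This is an illustrative model, not how the YARA VM is implemented.
    """
    iterations = 0
    for i in range(start, stop + 1):  # YARA integer ranges are inclusive
        iterations += 1
        if predicate(i):
            # "any" is satisfied, so the remaining iterations can be skipped.
            return True, iterations
    return False, iterations


# Same shape as the contrived rule above: the match happens at i == 1.
result, iterations = for_any(0, 100_000_000, lambda i: i == 1)
print(result, iterations)  # True 2
```

With early exit the loop body runs twice; without it, the same loop would walk all 100,000,001 values before reporting the same result.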
Let's look at the bytecode without my optimizations. Before we do that, let's set some terminology, because I find it easier to use names compared to YARA VM memory locations. These are the names I've mostly borrowed from the comments in the grammar:
wxs@wxs-mbp yara % cat rules/test.yara
rule a {
strings:
// This program cannot VGhpcyBwcm9ncmFtIGNhbm5vdA==
// AThis program cannot QVRoaXMgcHJvZ3JhbSBjYW5ub3Q=
// AAThis program cannot QUFUaGlzIHByb2dyYW0gY2Fubm90
$a = "This program cannot" base64 ascii
// Custom alphabets are supported, but I have it commented out for now. ;)
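The three comments in the rule pair the plaintext with zero, one, and two leading "A" bytes, which appears to show how the base64 output changes with the byte alignment of the plaintext. The following Python sketch reproduces those three encodings with the standard library. The helper name `base64_alignments` is hypothetical, and this only regenerates the values in the comments; it does not describe how YARA itself matches them.

```python
import base64


def base64_alignments(plaintext: bytes, pad: bytes = b"A"):
    """Return the base64 encodings of plaintext with 0, 1 and 2 pad bytes prepended.

    Base64 works on 3-byte groups, so shifting the plaintext by one or two
    bytes yields a completely different encoded string.
    """
    return [base64.b64encode(pad * n + plaintext).decode("ascii") for n in range(3)]


for encoded in base64_alignments(b"This program cannot"):
    print(encoded)
# VGhpcyBwcm9ncmFtIGNhbm5vdA==
# QVRoaXMgcHJvZ3JhbSBjYW5ub3Q=
# QUFUaGlzIHByb2dyYW0gY2Fubm90
```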
This started with a tweet from Steve Miller (https://twitter.com/stvemillertime/status/1508441489923313664) in which he asked what is better for performance: 1 rule with 10k strings or 10k rules with 1 string each? Based upon my understanding of YARA I guessed it wouldn't matter for search time and the difference in bytecode evaluation would be in the noise. Effectively, I guessed you would not be able to tell the difference between the two.
Costin was the first to provide actual results and he claimed a 35 second vs 31 second difference between the two (https://twitter.com/craiu/status/1508445059129163783). That didn't make much sense to me so I asked for his rules so I could test them. He provided me with two rules files (10k.yara and 10kv2.yara) and a text file with a bunch of strings in it.
This is my attempt to replicate his findings and also document why he was getting the warning he was getting. Because I wanted the run to take a bit of time I ended up not using his text file with all the strings (it
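For anyone who wants to set up a similar comparison, here is a rough Python sketch that generates both layouts from a list of strings: one rule holding every string, and one rule per string. Everything in it is an assumption made for illustration: the function names, the rule names, the `any of them` condition, and the placeholder strings. It is not a reconstruction of the 10k.yara or 10kv2.yara files mentioned above.

```python
def escape(text):
    """Escape backslashes and double quotes for a YARA text string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def one_rule_many_strings(strings, name="many_strings"):
    """Build a single rule that contains every string (hypothetical layout)."""
    lines = [f"rule {name} {{", "  strings:"]
    lines += [f'    $s{i} = "{escape(s)}"' for i, s in enumerate(strings)]
    lines += ["  condition:", "    any of them", "}"]
    return "\n".join(lines) + "\n"


def many_rules_one_string(strings, prefix="single_string"):
    """Build one rule per string (hypothetical layout)."""
    rules = []
    for i, s in enumerate(strings):
        rules.append(
            f"rule {prefix}_{i} {{\n"
            "  strings:\n"
            f'    $s = "{escape(s)}"\n'
            "  condition:\n"
            "    $s\n"
            "}\n"
        )
    return "\n".join(rules)


# Placeholder strings; substitute a real word list for an actual test.
strings = [f"needle_{i:05d}" for i in range(10_000)]
one_rule_text = one_rule_many_strings(strings)
many_rules_text = many_rules_one_string(strings)
```

Writing `one_rule_text` and `many_rules_text` to two files gives a pair of rule sets that differ only in how the strings are grouped, which is the variable the question is about.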
One way to find PE files that start at offset 0 and have a single byte xor key:
rule single_byte_xor_pe_and_mz {
meta:
author = "Wesley Shields <wxs@atarininja.org>"
description = "Look for single byte xor of a PE starting at offset 0"
strings:
$b = "PE\x00\x00" xor(0x01-0xff)
condition:
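The rule's condition is cut off here, so as a rough illustration of the same idea outside YARA, the Python sketch below checks whether a buffer is a PE file XORed with a single byte key. It assumes the usual PE layout: "MZ" at offset 0 and a little-endian 32-bit pointer to the "PE\0\0" signature at offset 0x3C. The function name is hypothetical and this is not a translation of the rule's condition.

```python
import struct


def find_single_byte_xor_key(data: bytes):
    """Return the XOR key (1..255) under which data decodes to a PE at offset 0.

    Returns None if no such key exists. Illustrative only; not derived from
    the YARA rule's (truncated) condition.
    """
    if len(data) < 0x40:
        return None
    # The first decoded byte must be "M", which fixes the only candidate key.
    key = data[0] ^ ord("M")
    if key == 0 or data[1] ^ key != ord("Z"):
        return None
    # Decode e_lfanew, the offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack("<I", bytes(b ^ key for b in data[0x3C:0x40]))
    signature = bytes(b ^ key for b in data[pe_offset:pe_offset + 4])
    return key if signature == b"PE\x00\x00" else None


# Build a minimal fake header and encode it with key 0x5A to try it out.
plain = bytearray(0x84)
plain[0:2] = b"MZ"
plain[0x3C:0x40] = struct.pack("<I", 0x80)
plain[0x80:0x84] = b"PE\x00\x00"
encoded = bytes(b ^ 0x5A for b in plain)
print(find_single_byte_xor_key(encoded))  # 90
```

Because the first byte must decode to "M", there is only one candidate key per file, so no brute force over all 255 keys is needed in this sketch.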