Wesley Shields wxsBSD

## yara-parser.md

      
Last active: February 5, 2023 20:18

Using YARA python interface to parse files

I've shared this technique with some people privately, but might as well share it publicly now since I was asked about it. I've been using this for a while now with good success. It works well for parsing .NET droppers and other things.
If you don't know what the -D flag to YARA does, I suggest you import a module and run a file through YARA with that flag. It prints to stdout everything the module parsed that doesn't require you to call a function. This is a great way to get a quick idea of the structure of a file.
For example:
wxs@mbp yara % cat always_false.yara
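
As a rough sketch of the same idea from Python: yara-python accepts a `modules_callback` argument to `match()`, which receives the data each imported module parsed as a dictionary. The rule source, the `pe` default and the helper names `flatten` and `dump_module_data` are assumptions made for illustration, not the contents of `always_false.yara`.

```python
def flatten(value, prefix=""):
    """Yield (dotted_name, leaf_value) pairs from nested dicts and lists."""
    if isinstance(value, dict):
        for key in sorted(value, key=str):
            name = f"{prefix}.{key}" if prefix else str(key)
            yield from flatten(value[key], name)
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from flatten(item, f"{prefix}[{index}]")
    else:
        yield prefix, value


def dump_module_data(path, module="pe"):
    """Run a never-matching rule over `path` and print what `module` parsed."""
    import yara  # yara-python; imported here so flatten() is usable without it

    # Hypothetical rule: it exists only so the module gets loaded and run.
    source = f'import "{module}"\nrule always_false {{ condition: false }}'
    rules = yara.compile(source=source)

    def on_module(data):
        # Called once per imported module with that module's parsed fields.
        for name, leaf in flatten(data):
            print(f"{name} = {leaf!r}")
        return yara.CALLBACK_CONTINUE

    rules.match(path, modules_callback=on_module)
```

Calling `dump_module_data("/bin/ls", module="macho")` would print one line per parsed field, which is roughly what `-D` gives you on the command line, but with the values available as Python objects.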


## gist:3e9452c3699bf68ff2e83a5d6a521801

      
Created: September 29, 2021 02:23
french yara hits, no sorting

Test rules:
wxs@wxs-mbp yara % cat rules/test.yara
rule b {
  strings:
    $a = "LSCOLORS"
  condition:
    $a
}


## gist:2936585412fd57f039fd7ecd7b24cd1b
/*
 * fmtid + 24 == number of property identifiers and offsets
 * fmtid + 28 == start of property identifier and offsets (4 bytes each)
 */
rule test {
  strings:
    //$fmtid = { 02 d5 cd d5 9c 2e 1b 10 93 97 08 00 2b 2c f9 ae }
    $fmtid = { e0 85 9f f2 f9 4f 68 10 ab 91 08 00 2b 27 b3 d9 }
    $redacted_author = "REDACTED AUTHOR"
  condition:
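
The layout described in the comment can be sketched in Python as follows. This is only an illustration of the two offsets, assuming little-endian 32-bit fields and that each entry is a 4-byte identifier followed by a 4-byte offset; `property_table` is a hypothetical helper, not part of the rule.

```python
import struct

# FMTID bytes copied from the uncommented $fmtid pattern in the rule above.
FMTID = bytes.fromhex("e0859ff2f94f6810ab9108002b27b3d9")


def property_table(data, fmtid=FMTID):
    """Return [(property_id, offset), ...] relative to the first FMTID hit."""
    base = data.find(fmtid)
    if base < 0 or base + 28 > len(data):
        return []
    # fmtid + 24: number of property identifier/offset pairs.
    (count,) = struct.unpack_from("<I", data, base + 24)
    if base + 28 + 8 * count > len(data):
        return []  # truncated or bogus count
    # fmtid + 28: the pairs themselves, two 4-byte fields each.
    return [
        struct.unpack_from("<II", data, base + 28 + 8 * i)
        for i in range(count)
    ]
```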

## yara-loop-optimization.md

      
Last active: April 29, 2022 15:08

I've been working on optimizing the YARA compiler to generate better bytecode for loops. The goal is to skip as much of each loop as possible by not iterating further once the loop condition is met. Here's the rule I'm using. It is completely contrived and excessive, but it shows the performance improvement:
wxs@wxs-mbp yara % cat rules/test.yara
rule a {
  condition:
    for any i in (0..100000000): (i == 1)
}
wxs@wxs-mbp yara %
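
To make the saving concrete, here is a small Python model of a `for any` loop with and without an early exit. It is not YARA bytecode, only a sketch of the idea; `for_any` and the smaller upper bound in the usage note are assumptions chosen so it runs quickly.

```python
def for_any(lower, upper, predicate, short_circuit):
    """Model `for any i in (lower..upper): predicate(i)`.

    Returns (result, iterations) so the cost of each strategy is visible.
    """
    result = False
    iterations = 0
    for i in range(lower, upper + 1):  # YARA integer ranges are inclusive
        iterations += 1
        if predicate(i):
            result = True
            if short_circuit:
                break  # the condition is already met, skip the rest
    return result, iterations
```

With `for_any(0, 1000, lambda i: i == 1, False)` the model returns `(True, 1001)`, while the short-circuiting call `for_any(0, 1000, lambda i: i == 1, True)` returns `(True, 2)`. The answer is the same; only the number of iterations changes.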


## yara-loop-optimization-details.md

      
              1 file
            
          
              1 fork
            
          
              0 comments
            
          
              0 stars
            
          
                wxsBSD
                / yara-loop-optimization-details.md
            
            
Last active: April 29, 2022 15:08

YARA Loop Optimization Details

Let's look at the bytecode without my optimizations. Before we do that, let's set some terminology, because I find it easier to use names than YARA VM memory locations. I've mostly borrowed these names from the comments in the grammar:

memory 0: lower bound
memory 1: boolean_expression accumulator
memory 2: iteration counter
memory 3: upper bound

We'll be using this rule for the first example:

  
## base64 and ascii working.md

      
Created: December 4, 2019 04:33

wxs@wxs-mbp yara % cat rules/test.yara
rule a {
  strings:
    // This program cannot VGhpcyBwcm9ncmFtIGNhbm5vdA==
    // AThis program cannot QVRoaXMgcHJvZ3JhbSBjYW5ub3Q=
    // AAThis program cannot QUFUaGlzIHByb2dyYW0gY2Fubm90
    $a = "This program cannot" base64 ascii

 // Custom alphabets are supported, but I have it commented out for now. ;)
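
The three encodings in the comments above come from where the plaintext falls within base64's 3-byte groups. A short Python check reproduces them; `shifted_encodings` is a hypothetical helper, and the `A` filler byte simply mirrors the comments.

```python
import base64


def shifted_encodings(needle, filler=b"A"):
    """Base64-encode `needle` with 0, 1 and 2 filler bytes in front of it."""
    return [
        base64.b64encode(filler * pad + needle).decode("ascii")
        for pad in range(3)
    ]


for encoded in shifted_encodings(b"This program cannot"):
    print(encoded)
```

Each added byte shifts the plaintext by one position within a 3-byte group, so the same text produces three different base64 strings, which is why a base64 search has to account for all three alignments.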


## gist:4ec929a0eb07d8e3feeccc49e0d9aa2a

      
Last active: April 29, 2022 15:06
Counting string matches in YARA with awk

Counting number of times strings match in YARA with awk...
wxs@wxs-mbp yara % cat rules/test.yara
rule a { strings: $a = "FreeBSD" nocase  $b = "usage: " condition: any of them }
wxs@wxs-mbp yara % ./yara -s rules/test.yara /bin/ls
a /bin/ls
0xb8e1:$a: FreeBSD
0xb9a1:$a: FreeBSD
0xb9f1:$a: FreeBSD
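
For comparison, the same tally can be done in Python instead of awk. This is a sketch that assumes the `offset:$identifier: data` line shape shown in the `-s` output above; `count_string_matches` is a hypothetical helper, not the awk one-liner from this gist.

```python
from collections import Counter


def count_string_matches(output):
    """Count hits per string identifier in `yara -s` output."""
    counts = Counter()
    for line in output.splitlines():
        # Match lines look like "0xb8e1:$a: FreeBSD"; lines such as
        # "a /bin/ls" name the rule and the file, so they are skipped.
        if not line.startswith("0x"):
            continue
        parts = line.split(":", 2)
        if len(parts) == 3:
            counts[parts[1]] += 1
    return counts
```

Feeding it the output lines shown above gives a count of 3 for `$a`.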


## sets.md

      
Created: December 2, 2021 02:30
Example of using rule sets to write higher order logic

wxs@wxs-mbp yara % cat rules/sets.yara
rule a0 { condition: false }
rule a1 { condition: true }
rule b { condition: 1 of (a*) }
rule c { condition: 2 of (a*) }
rule d { condition: 50% of (a*) }
rule e { condition: 1 of (a1) }
rule f { condition: all of (a1, e) }
wxs@wxs-mbp yara %
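
One way to reason about these rules is to model `N of (rule set)` in Python. This is a simplified model and not YARA's implementation: `n_of` is a hypothetical helper, wildcard expansion is approximated with `fnmatch`, and rounding percentages up is an assumption.

```python
import fnmatch
import math

# Results of the two leaf rules from the listing above.
results = {"a0": False, "a1": True}


def n_of(quantifier, patterns, results):
    """Model `<quantifier> of (<patterns>)` over already-evaluated rules."""
    names = [
        name for name in results
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    ]
    matched = sum(results[name] for name in names)
    if quantifier == "all":
        return matched == len(names)
    if isinstance(quantifier, str) and quantifier.endswith("%"):
        # Assumption: the required number of matches is rounded up.
        needed = math.ceil(len(names) * int(quantifier[:-1]) / 100)
        return matched >= needed
    return matched >= quantifier


# Evaluate in file order, since a rule set can only name earlier rules.
results["b"] = n_of(1, ["a*"], results)
results["c"] = n_of(2, ["a*"], results)
results["d"] = n_of("50%", ["a*"], results)
results["e"] = n_of(1, ["a1"], results)
results["f"] = n_of("all", ["a1", "e"], results)
```

Under this model `b`, `d`, `e` and `f` come out true and `c` comes out false, because only one of the two `a*` rules matches.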

  
## gist:76dc97427252f2dda8e7c9f4870ebb5a

      
Last active: March 29, 2022 02:15
y10k - YARA 10k test

This started with a tweet from Steve Miller (https://twitter.com/stvemillertime/status/1508441489923313664) in which he asked what is better for performance: 1 rule with 10k strings or 10k rules with 1 string each? Based upon my understanding of YARA I guessed it wouldn't matter for search time and the difference in bytecode evaluation would be in the noise. Effectively, I guessed you would not be able to tell the difference between the two.
Costin was the first to provide actual results and he claimed a 35 second vs 31 second difference between the two (https://twitter.com/craiu/status/1508445059129163783). That didn't make much sense to me so I asked for his rules so I could test them. He provided me with two rules files (10k.yara and 10kv2.yara) and a text file with a bunch of strings in it.
This is my attempt to replicate his findings and also document why he was getting the warning he was getting. Because I wanted the run to take a bit of time I ended up not using his text file with all the strings (it
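
For anyone who wants to build a similar pair of rule files, here is a hedged Python sketch of the two layouts being compared. The rule names, string identifiers, `any of them` condition and placeholder strings are all assumptions; they are not the contents of 10k.yara or 10kv2.yara.

```python
def one_rule_many_strings(strings):
    """Build a single rule that holds every string."""
    lines = ["rule many_strings {", "  strings:"]
    lines += [f'    $s{i} = "{s}"' for i, s in enumerate(strings)]
    lines += ["  condition:", "    any of them", "}"]
    return "\n".join(lines) + "\n"


def many_rules_one_string(strings):
    """Build one rule per string."""
    return "".join(
        f'rule r{i} {{ strings: $s = "{s}" condition: $s }}\n'
        for i, s in enumerate(strings)
    )


# Placeholder strings; plain alphanumerics, so no escaping is needed.
strings = [f"needle{i:05d}" for i in range(10000)]
single = one_rule_many_strings(strings)
many = many_rules_one_string(strings)
```

Writing `single` and `many` to two files gives a matched pair to time against the same input.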

  
## rules.md

      
Last active: January 12, 2022 19:51
xor PE rules

One way to find PE files that start at offset 0 and have a single byte xor key:
rule single_byte_xor_pe_and_mz {
  meta:
    author = "Wesley Shields <wxs@atarininja.org>"
    description = "Look for single byte xor of a PE starting at offset 0"
  strings:
    $b = "PE\x00\x00" xor(0x01-0xff)
  condition:
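
The same check can be sketched outside YARA. The Python below is not the rule's condition; it is an assumed equivalent that derives the key from the `MZ` magic at offset 0, decodes `e_lfanew` at 0x3c, and confirms the `PE\x00\x00` signature. `find_xor_key` is a hypothetical helper.

```python
import struct


def find_xor_key(data):
    """Return the single-byte xor key hiding an MZ/PE file, or None."""
    if len(data) < 0x40:
        return None
    # Only one key can turn the first byte into "M".
    key = data[0] ^ ord("M")
    # Key 0 is a plain PE, which xor(0x01-0xff) would not cover either.
    if key == 0 or data[1] ^ key != ord("Z"):
        return None
    # e_lfanew sits at 0x3c in the DOS header and is encoded as well.
    raw = bytes(b ^ key for b in data[0x3C:0x40])
    (e_lfanew,) = struct.unpack("<I", raw)
    signature = bytes(b ^ key for b in data[e_lfanew:e_lfanew + 4])
    return key if signature == b"PE\x00\x00" else None
```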