Steve stvemillertime

## entropy_functions.yar
import "math"

rule general_vba_high_entropy_function_names : General
{
    meta:
        author = "threatintel@volexity.com"
        description = "Looks for VBA files containing function names that have been randomized based on their entropy."
        date = "2022-03-14"
        hash1 = "c2badcdfa9b7ece00f245990bb85fb6645c05b155b77deaf2bb7a2a0aacbe49"
        memory_suitable = 0
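
The preview is cut off before the rule's strings and condition, so the actual matching logic is not shown here. As a rough illustration of the idea in the description only, the sketch below computes Shannon entropy (bits per byte) of a function name in Python. The helper name `shannon_entropy` and the sample names are hypothetical, and nothing in it is taken from the Volexity rule itself.

```python
import math
from collections import Counter

def shannon_entropy(name: str) -> float:
    """Shannon entropy, in bits per byte, of the UTF-8 encoding of `name`."""
    data = name.encode()
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # Sum -p * log2(p) over the frequency of each distinct byte value.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical names: one readable, one random-looking.
for candidate in ("GetValue", "qZxKvWpT"):
    print(candidate, round(shannon_entropy(candidate), 3))
```

Short names cap the achievable entropy (eight distinct bytes give at most 3 bits per byte), so any threshold would have to be tuned against real samples.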

## entropy.yar
// Add as an alias like:
// alias entropy=yara /path/to/entropy.yar $*

// Usage:
// entropy file.bin

import "console"
import "math"

rule entropy

## generate-stackstrings-yara.py
#!/usr/bin/env python3

import sys, string, struct

def strByByte(_strval):
    strval = bytearray(_strval.encode())
    for s in strval: yield s

def strByDword(_strval):
    strval = bytearray(_strval.encode())
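
The preview cuts `strByDword` off after its first line, so its real body is unknown. Below is a minimal sketch of what a dword-wise generator could look like, assuming 4-byte little-endian chunks and NUL padding of the final chunk. Both choices are assumptions, as is the name `str_by_dword_sketch`.

```python
import struct

def str_by_dword_sketch(value: str):
    """Yield `value` as 32-bit little-endian integers (hypothetical helper)."""
    data = value.encode()
    # Assumption: pad with NUL bytes so the length is a multiple of four.
    data += b"\x00" * (-len(data) % 4)
    for offset in range(0, len(data), 4):
        (dword,) = struct.unpack_from("<I", data, offset)
        yield dword

print([hex(d) for d in str_by_dword_sketch("cmd.exe")])
```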

## boilerplate.py
#!/usr/bin/env python3


import argparse
import sys
import json

import logging


## yara-parser.md

      
Created April 29, 2022 15:10, forked from wxsBSD/yara-parser.md

Using YARA python interface to parse files

I've shared this technique with some people privately, but might as well share it publicly now since I was asked about it. I've been using this for a while now with good success. It works well for parsing .NET droppers and other things.
If you don't know what YARA's -D flag does, I suggest importing a module and running a file through YARA with that flag. It prints to stdout everything the module parsed that doesn't require you to call a function. This is a great way to get a quick idea of the structure of a file.
For example:
wxs@mbp yara % cat always_false.yara


## gist:c38f9fbdaaf17455921915a7d01fb63d
/*
 * fmtid + 24 == number of property identifiers and offsets
 * fmtid + 28 == start of property identifier and offsets (4 bytes each)
 */
rule test {
  strings:
    //$fmtid = { 02 d5 cd d5 9c 2e 1b 10 93 97 08 00 2b 2c f9 ae }
    $fmtid = { e0 85 9f f2 f9 4f 68 10 ab 91 08 00 2b 27 b3 d9 }
    $redacted_author = "REDACTED AUTHOR"
  condition:

## yara-loop-optimization.md

      
Created April 29, 2022 15:08, forked from wxsBSD/yara-loop-optimization.md

I've been working on optimizing the YARA compiler to generate better bytecode for loops. The goal is to skip as much of a loop as possible by not iterating further once the loop condition is met. Here's the rule I'm using. It's completely contrived and excessive, but it shows the performance improvement:
wxs@wxs-mbp yara % cat rules/test.yara
rule a {
  condition:
    for any i in (0..100000000): (i == 1)
}
wxs@wxs-mbp yara %
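
To make the intended saving concrete, here is a small Python model of a `for any` loop that counts iterations with and without early exit. It is only a sketch of the short-circuit idea, not YARA's bytecode or VM. The function name `for_any` is made up for illustration, and the upper bound is reduced from 100000000 to 1000 so it runs instantly.

```python
def for_any(lower: int, upper: int, predicate, short_circuit: bool):
    """Model 'for any i in (lower..upper): predicate(i)' and count iterations."""
    matched = False
    iterations = 0
    for i in range(lower, upper + 1):  # YARA integer ranges include both bounds
        iterations += 1
        if predicate(i):
            matched = True
            if short_circuit:
                break  # "any" is already satisfied, so stop iterating
    return matched, iterations

print(for_any(0, 1000, lambda i: i == 1, short_circuit=False))  # (True, 1001)
print(for_any(0, 1000, lambda i: i == 1, short_circuit=True))   # (True, 2)
```

Both calls reach the same verdict. The only difference is how many iterations run after the condition is first met.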


## yara-loop-optimization-details.md

      
Created April 29, 2022 15:08, forked from wxsBSD/yara-loop-optimization-details.md

YARA Loop Optimization Details

Let's look at the bytecode without my optimizations. Before we do that, let's set some terminology, because I find it easier to use names than YARA VM memory locations. I've mostly borrowed these names from the comments in the grammar:

memory 0: lower bound
memory 1: boolean_expression accumulator
memory 2: iteration counter
memory 3: upper bound

We'll be using this rule for the first example:

  
## base64 and ascii working.md

      
Created April 29, 2022 15:07, forked from wxsBSD/base64 and ascii working.md

wxs@wxs-mbp yara % cat rules/test.yara
rule a {
  strings:
    // This program cannot VGhpcyBwcm9ncmFtIGNhbm5vdA==
    // AThis program cannot QVRoaXMgcHJvZ3JhbSBjYW5ub3Q=
    // AAThis program cannot QUFUaGlzIHByb2dyYW0gY2Fubm90
    $a = "This program cannot" base64 ascii

 // Custom alphabets are supported, but I have it commented out for now. ;)
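
The three comments in the rule show the plaintext encoded with zero, one, and two leading `A` bytes. Base64 encodes input in 3-byte groups, so the encoded form of a substring depends on its offset modulo 3 within the original data. The Python sketch below reproduces those three encodings with the standard library. It illustrates why three variants exist and says nothing about how YARA implements the `base64` modifier internally. The helper name `encodings_by_alignment` is hypothetical.

```python
import base64

PLAINTEXT = b"This program cannot"

def encodings_by_alignment(data: bytes):
    """Base64-encode `data` preceded by 0, 1 and 2 filler bytes."""
    return [base64.b64encode(b"A" * pad + data).decode() for pad in range(3)]

for encoded in encodings_by_alignment(PLAINTEXT):
    print(encoded)
# VGhpcyBwcm9ncmFtIGNhbm5vdA==
# QVRoaXMgcHJvZ3JhbSBjYW5ub3Q=
# QUFUaGlzIHByb2dyYW0gY2Fubm90
```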


## gist:5cfb0c71c8d7b296e75830031fdfd2b8

      
Created April 29, 2022 15:06, forked from wxsBSD/gist:4ec929a0eb07d8e3feeccc49e0d9aa2a

Counting string matches in YARA with awk

Counting number of times strings match in YARA with awk...
wxs@wxs-mbp yara % cat rules/test.yara
rule a { strings: $a = "FreeBSD" nocase  $b = "usage: " condition: any of them }
wxs@wxs-mbp yara % ./yara -s rules/test.yara /bin/ls
a /bin/ls
0xb8e1:$a: FreeBSD
0xb9a1:$a: FreeBSD
0xb9f1:$a: FreeBSD
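
The preview ends before the awk command itself appears, so that command is not reproduced here. As a stand-in, the sketch below does the same kind of tally in Python rather than awk, counting matches per string identifier from `yara -s` output. It assumes match lines have the `offset:$identifier: data` shape shown above. The sample text is copied from that output, and the function name `count_string_matches` is hypothetical.

```python
from collections import Counter

SAMPLE_OUTPUT = """\
a /bin/ls
0xb8e1:$a: FreeBSD
0xb9a1:$a: FreeBSD
0xb9f1:$a: FreeBSD
"""

def count_string_matches(output: str) -> Counter:
    """Count match lines per string identifier in `yara -s` style output."""
    counts = Counter()
    for line in output.splitlines():
        # Assumption: match lines start with a hex offset; rule lines do not.
        if not line.startswith("0x"):
            continue
        parts = line.split(":", 2)  # offset, identifier, matched data
        if len(parts) != 3:
            continue
        counts[parts[1]] += 1
    return counts

print(count_string_matches(SAMPLE_OUTPUT))
```

In practice the text would come from piping `yara -s` into the script instead of a hard-coded sample.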