Skip to content

Instantly share code, notes, and snippets.

@stvemillertime
stvemillertime / entropy_functions.yar
Created September 26, 2022 00:03 — forked from tlansec/entropy_functions.yar
Generic rule for suspicious function names
import "math"
rule general_vba_high_entropy_function_names : General
{
meta:
author = "threatintel@volexity.com"
description = "Looks for VBA files containing function names that have been randomized based on their entropy."
date = "2022-03-14"
hash1 = "c2badcdfa9b7ece00f245990bb85fb6645c05b155b77deaf2bb7a2a0aacbe49"
memory_suitable = 0
@stvemillertime
stvemillertime / entropy.yar
Created September 26, 2022 00:03 — forked from tlansec/entropy.yar
Print out information about a files entropy
// Add as an alias like:
// alias entropy=yara /path/to/entropy.yar $*
// Usage:
// entropy file.bin
import "console"
import "math"
rule entropy
@stvemillertime
stvemillertime / generate-stackstrings-yara.py
Created May 14, 2022 17:15 — forked from notareverser/generate-stackstrings-yara.py
Script to generate stackstrings YARA signatures for common implementation patterns
#!/usr/bin/env python3
import sys, string, struct
def strByByte(_strval):
strval = bytearray(_strval.encode())
for s in strval: yield s
def strByDword(_strval):
strval = bytearray(_strval.encode())
@stvemillertime
stvemillertime / boilerplate.py
Created May 13, 2022 14:22 — forked from notareverser/boilerplate.py
Boilerplate Python script
#!/usr/bin/env python3
import argparse
import sys
import json
import logging

Using YARA python interface to parse files

I've shared this technique with some people privately, but might as well share it publicly now since I was asked about it. I've been using this for a while now with good success. It works well for parsing .NET droppers and other things.

If you don't know what the -D flag to YARA does I suggest you import a module and run a file through using that flag. It will print, to stdout, everything the module parsed that doesn't involve you calling a function. This is a great way to get a quick idea for the structure of a file.

For example:

wxs@mbp yara % cat always_false.yara
/*
* fmtid + 24 == number of property identifiers and offsets
* fmtid + 28 == start of property identifier and offsets (4 bytes each)
*/
rule test {
strings:
//$fmtid = { 02 d5 cd d5 9c 2e 1b 10 93 97 08 00 2b 2c f9 ae }
$fmtid = { e0 85 9f f2 f9 4f 68 10 ab 91 08 00 2b 27 b3 d9 }
$redacted_author = "REDACTED AUTHOR"
condition:

I've been working on optimizing the YARA compiler to generate better bytecode for loops. The goal is to skip as much of loops as possible by not iterating further once the loop condition is met. Here's the rule I'm using. Completely contrived and excessive, but it's to show the performance improvement:

wxs@wxs-mbp yara % cat rules/test.yara
rule a {
  condition:
    for any i in (0..100000000): (i == 1)
}
wxs@wxs-mbp yara %

YARA Loop Optimization Details

Let's look at the bytecode without my optimizations. Before we do that let's set some terminology, because I find it easier to use names compared YARA VM memory locations. These are the names I've mostly borrowed from the comments in the grammar:

  • memory 0: lower bound
  • memory 1: boolean_expression accumulator
  • memory 2: iteration counter
  • memory 3: upper bound

We'll be using this rule for the first example:

wxs@wxs-mbp yara % cat rules/test.yara
rule a {
  strings:
    // This program cannot VGhpcyBwcm9ncmFtIGNhbm5vdA==
    // AThis program cannot QVRoaXMgcHJvZ3JhbSBjYW5ub3Q=
    // AAThis program cannot QUFUaGlzIHByb2dyYW0gY2Fubm90
    $a = "This program cannot" base64 ascii

 // Custom alphabets are supported, but I have it commented out for now. ;)

Counting number of times strings match in YARA with awk...

wxs@wxs-mbp yara % cat rules/test.yara
rule a { strings: $a = "FreeBSD" nocase  $b = "usage: " condition: any of them }
wxs@wxs-mbp yara % ./yara -s rules/test.yara /bin/ls
a /bin/ls
0xb8e1:$a: FreeBSD
0xb9a1:$a: FreeBSD
0xb9f1:$a: FreeBSD