Counting number of times strings match in YARA with awk...

wxs@wxs-mbp yara % cat rules/test.yara
rule a { strings: $a = "FreeBSD" nocase  $b = "usage: " condition: any of them }
wxs@wxs-mbp yara % ./yara -s rules/test.yara /bin/ls
a /bin/ls
0xb8e1:$a: FreeBSD
0xb9a1:$a: FreeBSD
0xb9f1:$a: FreeBSD
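
The awk command itself is not shown in the transcript above, so here is a rough Python equivalent rather than the original one-liner. It is a minimal sketch that assumes the `yara -s` output format seen above (`offset:$identifier: data`); the function name `count_string_matches` is made up for illustration.

```python
import re
from collections import Counter

# Matches string-hit lines such as "0xb8e1:$a: FreeBSD" printed by `yara -s`.
# Rule/file lines such as "a /bin/ls" do not start with an offset and are skipped.
MATCH_LINE = re.compile(r"^0x[0-9a-fA-F]+:(\$\w*):")


def count_string_matches(output: str) -> Counter:
    """Count how many times each string identifier appears in the output."""
    counts = Counter()
    for line in output.splitlines():
        found = MATCH_LINE.match(line)
        if found:
            counts[found.group(1)] += 1
    return counts


# Sample copied from the transcript above.
sample = """a /bin/ls
0xb8e1:$a: FreeBSD
0xb9a1:$a: FreeBSD
0xb9f1:$a: FreeBSD
"""

counts = count_string_matches(sample)
print(dict(counts))  # {'$a': 3}
```

Note that this counts across the whole output; if several rules or files are scanned in one run, the counts would need to be keyed by the preceding rule/file line as well.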
@stvemillertime
stvemillertime / sets.md
Created April 29, 2022 15:05 — forked from wxsBSD/sets.md
Example of using rule sets to write higher order logic
wxs@wxs-mbp yara % cat rules/sets.yara
rule a0 { condition: false }
rule a1 { condition: true }
rule b { condition: 1 of (a*) }
rule c { condition: 2 of (a*) }
rule d { condition: 50% of (a*) }
rule e { condition: 1 of (a1) }
rule f { condition: all of (a1, e) }
wxs@wxs-mbp yara %
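
To reason about which of these rules should fire, the set logic can be modelled in plain Python. This is only a toy model of the semantics, not how YARA evaluates rules internally: the helper names (`n_of`, `percent_of`, `all_of`) are invented here, and rounding the percentage up to a whole number of rules is an assumption.

```python
import math
from fnmatch import fnmatchcase


def expand(patterns, results):
    """Results of previously defined rules whose names match any pattern."""
    return [ok for name, ok in results.items()
            if any(fnmatchcase(name, p) for p in patterns)]


def n_of(n, patterns, results):
    """Model of `N of (set)`: at least n rules in the set matched."""
    return sum(expand(patterns, results)) >= n


def percent_of(pct, patterns, results):
    """Model of `P% of (set)`; rounding up is an assumption of this sketch."""
    matched = expand(patterns, results)
    return sum(matched) >= math.ceil(len(matched) * pct / 100)


def all_of(patterns, results):
    """Model of `all of (set)`: every rule in a non-empty set matched."""
    matched = expand(patterns, results)
    return bool(matched) and all(matched)


# Evaluate in definition order, since a set can only see earlier rules.
results = {}
results["a0"] = False
results["a1"] = True
results["b"] = n_of(1, ["a*"], results)
results["c"] = n_of(2, ["a*"], results)
results["d"] = percent_of(50, ["a*"], results)
results["e"] = n_of(1, ["a1"], results)
results["f"] = all_of(["a1", "e"], results)

print([name for name, ok in results.items() if ok])
# ['a1', 'b', 'd', 'e', 'f']
```

Under this model only `a0` and `c` fail: `a*` expands to `a0` and `a1`, and just one of those two is true.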
stvemillertime / xor_kernel32_dll_twobyte_65K.yar
Created April 15, 2022 20:45
Two byte XOR brute force of "kernel32.dll"
rule xor_kernel32_dll_key_0001 {
strings:
$0001 = { 6b 64 72 6f 65 6d 33 33 2e 65 6c 6d }
condition:
any of them
}
rule xor_kernel32_dll_key_0002 {
strings:
$0002 = { 6b 67 72 6c 65 6e 33 30 2e 66 6c 6e }
condition:
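
The listing is cut off here, but the two visible rules are enough to see the pattern: the bytes of "kernel32.dll" are XORed with a repeating two-byte key whose high byte comes first. A file like this could be generated with something along the following lines. This is a sketch, not the script that produced the gist; the four-digit hexadecimal key naming and the helper names are assumptions.

```python
PLAINTEXT = b"kernel32.dll"


def xor_two_byte(data: bytes, key: int) -> bytes:
    """XOR data with a repeating two-byte key, high byte first."""
    key_bytes = key.to_bytes(2, "big")
    return bytes(b ^ key_bytes[i % 2] for i, b in enumerate(data))


def make_rule(key: int) -> str:
    """Render one rule in the same layout as the listing above."""
    hex_bytes = " ".join(f"{b:02x}" for b in xor_two_byte(PLAINTEXT, key))
    return (
        f"rule xor_kernel32_dll_key_{key:04x} {{\n"
        f"strings:\n"
        f"${key:04x} = {{ {hex_bytes} }}\n"
        f"condition:\n"
        f"any of them\n"
        f"}}\n"
    )


def make_all_rules() -> str:
    """All non-zero two-byte keys (0x0001..0xffff), one rule per key."""
    return "".join(make_rule(key) for key in range(1, 0x10000))


print(make_rule(1))
```

For keys 1 and 2 this reproduces the hex strings shown in the two rules above.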

This started with a tweet from Steve Miller (https://twitter.com/stvemillertime/status/1508441489923313664) in which he asked which is better for performance: 1 rule with 10k strings or 10k rules with 1 string each? Based on my understanding of YARA, I guessed it wouldn't matter for search time and that the difference in bytecode evaluation would be in the noise. In other words, I guessed you would not be able to tell the difference between the two.

Costin was the first to provide actual results and he claimed a 35 second vs 31 second difference between the two (https://twitter.com/craiu/status/1508445059129163783). That didn't make much sense to me so I asked for his rules so I could test them. He provided me with two rules files (10k.yara and 10kv2.yara) and a text file with a bunch of strings in it.
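
For anyone who wants to try the comparison themselves, the two layouts can be generated with a few lines of Python. This is a sketch only: the rule names and placeholder strings below are invented, and they are not the contents of 10k.yara or 10kv2.yara.

```python
def one_rule_many_strings(strings):
    """One rule that carries every string and fires on any of them."""
    lines = ["rule many_strings {", "  strings:"]
    for i, s in enumerate(strings):
        lines.append(f'    $s{i} = "{s}"')
    lines += ["  condition:", "    any of them", "}"]
    return "\n".join(lines) + "\n"


def many_rules_one_string(strings):
    """One single-string rule per input string."""
    rules = [
        f'rule single_string_{i} {{ strings: $s = "{s}" condition: $s }}'
        for i, s in enumerate(strings)
    ]
    return "\n".join(rules) + "\n"


# Placeholder strings; real inputs containing quotes or backslashes
# would need escaping before being embedded in a rule.
strings = [f"placeholder_string_{i:05d}" for i in range(10_000)]

single_rule_text = one_rule_many_strings(strings)
many_rules_text = many_rules_one_string(strings)
print(single_rule_text.count("$s"), many_rules_text.count("rule "))
```

Writing each result to its own file and timing a scan with each against the same target is then enough to repeat the experiment.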

This is my attempt to replicate his findings and also document why he was getting the warning he was getting. Because I wanted the run to take a bit of time I ended up not using his text file with all the strings (it

stvemillertime / yara-rules-for-libraries.txt
Created February 25, 2022 14:20 — forked from notareverser/yara-rules-for-libraries.txt
Brief treatise on the tradeoffs between YARA rules made from strings, code, and data
Today for #100DaysOfYARA I want to further explore one of my favorite topics:
"How to reliably detect libraries", or how to identify that a particular program has linked or otherwise included a particular library.
Detecting libraries (especially ones written in C) poses unique challenges compared to detecting malware, including:
- libraries tend to be platform/architecture nonspecific
- compilerisms overwhelm otherwise decent signal
- copy/pasta and groupthink across libraries
stvemillertime / externals_example.py
Created February 21, 2022 14:28 — forked from tlansec/externals_example.py
Simple script to demo use of yara-python + externals
# Simple script to demo use of yara-python + externals
# think of all the externals you could define!
import os
import sys
import yara
example_rule = '''
rule demo_externals
{
stvemillertime / code-signatures.treatise.txt
Created February 15, 2022 16:50 — forked from notareverser/code-signatures.treatise.txt
A brief treatise on code-based YARA signatures
Today for #100DaysOfYARA I want to dive into some of the dirty secrets of creating/maintaining code-based YARA signatures.
Let's use SQLite3 as an example. Go get the source here (I prefer the amalgamation):
https://sqlite.org/download.html
I would like to reliably detect when a file is using SQLite. I often look at Windows executables, so I'm going to first concentrate on x86 programs that use this library. The easiest way to find them is to first concentrate on cleartext strings. In this case, I'm gonna pop over to VirusTotal and search for an easily-identifiable string:
content: "failed to allocate %u bytes of memory" type:pe
import "pe"
import "console"
rule CreatePEPolyObject {
strings:
$a = "CreatePEPolyObject" xor
$b = "CreatePEPolyObject" nocase ascii wide
$c = "CreatePEPolyObject" base64 base64wide
condition:
any of them
}
import "pe"
import "math"
import "hash"
rule IterateResourcesDemo
{
meta:
description = "Example rule to iterate over PE resources and calculate entropy, MD5 and check for strings"
strings:
rule Cerebro_FALLCHILL_common_PE_strings
{
strings:
$ThisprogramcannotberuninDOSmode_fallchill = "Tsrh kiltian xammlg yv ifm rm DOS nlwv"
$ThisprogrammustberununderWin32_fallchill = "Tsrh kiltian nfhg yv ifm fmwvi Wrm32"
$abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_fallchill = "ayxwvutsrqponmlkjihgfedcbzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
$ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_fallchill = "ABCDEFGHIJKLMNOPQRSTUVWXYZayxwvutsrqponmlkjihgfedcbz0123456789+/" nocase
$0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_fallchill = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZayxwvutsrqponmlkjihgfedcbz+/" nocase
$msvcrtdll_fallchill = "nhexig.woo" nocase
$VCRUNTIME140dll_fallchill = "VCRUNTIME140.woo" nocase
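
Comparing each identifier (the plaintext) with its string value suggests a simple substitution: lowercase 'b' through 'y' are reversed, while 'a', 'z', uppercase letters, digits and punctuation are left alone. The sketch below is inferred only from the strings shown above, not from the malware itself, and the function name `fallchill_encode` is made up; it could be used to produce encoded forms of other common PE strings.

```python
import string

LOWER = string.ascii_lowercase

# Assumption inferred from the strings above: 'b'..'y' map to 'y'..'b',
# and every other character (including 'a' and 'z') is unchanged.
MIDDLE = LOWER[1:-1]
TABLE = str.maketrans(MIDDLE, MIDDLE[::-1])


def fallchill_encode(text: str) -> str:
    """Apply the substitution; it is its own inverse, so it also decodes."""
    return text.translate(TABLE)


for plain in ("This program cannot be run in DOS mode",
              "msvcrt.dll",
              "VCRUNTIME140.dll"):
    print(f"{plain!r} -> {fallchill_encode(plain)!r}")
```

The three examples reproduce the corresponding string values in the rule above, and applying the function twice returns the original text.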