Wesley Shields wxsBSD


Using YARA python interface to parse files

I've shared this technique with some people privately, but might as well share it publicly now since I was asked about it. I've been using this for a while now with good success. It works well for parsing .NET droppers and other things.

If you don't know what the -D flag to YARA does, I suggest you import a module in a rule and scan a file with that flag set. It prints to stdout everything the module parsed, apart from values that are only available by calling a function. This is a great way to get a quick idea of the structure of a file.

For example:

wxs@mbp yara % cat always_false.yara
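The same module data can be collected from Python instead of being read off stdout. The following is a minimal sketch, not the gist's own code (which is cut off here). It assumes the yara-python package and its `modules_callback` argument to `match()`, it assumes the callback's dictionary names its module under a `"module"` key, and the rule source is a guessed stand-in for always_false.yara. The helper names `make_collector` and `parse_with_module` are hypothetical.

```python
def make_collector(store, continue_value):
    """Build a callback that files each module's parsed data under the module name."""
    def on_module_data(data):
        # yara-python hands the callback one dict per imported module.
        # Assumption: the dict names its module under the "module" key.
        store[data.get("module", "unknown")] = data
        return continue_value
    return on_module_data


def parse_with_module(path, module="pe"):
    """Scan `path` with a rule that never matches and return the module's data."""
    import yara  # third-party yara-python package, imported lazily

    # Assumed stand-in for always_false.yara: import the module, match nothing.
    source = 'import "%s"\nrule always_false { condition: false }' % module
    rules = yara.compile(source=source)
    store = {}
    callback = make_collector(store, yara.CALLBACK_CONTINUE)
    rules.match(path, modules_callback=callback)
    return store.get(module, {})
```

The rule never matches because the match result is not the point; the scan is only there to make the module parse the file. Passing `"dotnet"` as the module name would be the natural choice for .NET droppers, assuming that module is compiled into the local YARA build.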
View gist:07a5709fdcb59d346e9e

Problems with pehash implementations

I've started to add a pehash implementation to YARA. I decided to base my implementation on the description in the paper and only use the totalhash and viper implementations for comparing results. In doing so I've noticed some problems, and it is unclear who is right.

Totalhash implementation

For starters let's take a look at running the implementation from totalhash against a binary.

wxs@psh Desktop % shasum 4180ee367740c271e05b3637ee64619fb9fe7b1d2b28866e590e731b9f81de36
View gist:a3ba7f4733125813e58a


This is outdated. The canonical source of documentation on this is over here.


I recently put YARA inside osquery and thought I would provide some details on how to use it. There are two YARA-related tables in osquery, which serve very different purposes. The first table, called yara_events, uses osquery's pub-sub framework to monitor for filesystem changes and executes YARA when a file change event fires. The second table, called yara, is an on-demand YARA scanning table.



I've been working on optimizing the YARA compiler to generate better bytecode for loops. The goal is to skip as much of a loop as possible by not iterating further once the loop condition is met. Here's the rule I'm using. It is completely contrived and excessive, but it shows the performance improvement:

wxs@wxs-mbp yara % cat rules/test.yara
rule a {
  condition:
    for any i in (0..100000000): (i == 1)
}
wxs@wxs-mbp yara %
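To make the saving concrete, here is a small Python model of the idea. It is only a sketch of the early-exit behaviour described above, not YARA's bytecode; the function name `for_any` and its `short_circuit` switch are made up for illustration.

```python
def for_any(lower, upper, predicate, short_circuit=True):
    """Model `for any i in (lower..upper): predicate(i)`.

    Returns (result, iterations) so the cost of each strategy is visible.
    """
    iterations = 0
    matched = False
    for i in range(lower, upper + 1):  # YARA integer ranges are inclusive
        iterations += 1
        if predicate(i):
            matched = True
            if short_circuit:
                break  # "any" is already satisfied, so stop iterating
    return matched, iterations
```

With a range like the one in the rule, the short-circuit version stops on the second iteration (i == 1), while the other version still walks the whole range before giving the same answer.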
wxsBSD / gist:019740e83faa7a7206f4
Last active May 11, 2019
YARA, now with more Math(TM)! (Thanks @alexcpsec)


I'd like to explain some of the new things I've added to YARA which will be in the next release. This is in addition to the stuff I've written about here, which is already in 3.2.0. If you have not read that, I suggest you start there, as it will tie in nicely with some of the things I'm going to mention here. Lastly, some of these things are not yet merged into master, but I expect them to be very soon.

Math Module

There is a new module in YARA called math. The intention of this module is to expose some functions which you can use in your rules to calculate specific properties.


In particular it provides these functions for calculating different values:

  • entropy
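For readers who want to sanity-check values outside of YARA, byte entropy can be computed in a few lines of Python. This is a sketch of the standard Shannon entropy over byte frequencies, in bits per byte; it is not the module's source, and the function name `shannon_entropy` is my own.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())
```

A buffer made of one repeated byte scores 0, and a buffer in which all 256 byte values are equally likely scores 8, which is why packed or encrypted data tends to sit near the top of the scale.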

Keybase proof

I hereby claim:

  • I am wxsbsd on github.
  • I am wxs on keybase.
  • I have a public key whose fingerprint is 96D1 2E6B F61C 2F3D 83EF 8F0B BE54 310C 17F0 AA37

To claim this, I am signing this object:


YARA Loop Optimization Details

Let's look at the bytecode without my optimizations. Before we do that, let's set some terminology, because I find names easier to work with than YARA VM memory locations. I've mostly borrowed these names from the comments in the grammar:

  • memory 0: lower bound
  • memory 1: boolean_expression accumulator
  • memory 2: iteration counter
  • memory 3: upper bound
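As a rough mental model of how those four slots interact, here is a Python sketch of an unoptimized `for any` loop over an integer range. It is an illustration only, not the real bytecode. In particular, I am assuming that the lower bound slot doubles as the loop variable and that the accumulator counts how many iterations evaluated to true; the function name is hypothetical.

```python
LOWER, ACCUMULATOR, COUNTER, UPPER = 0, 1, 2, 3  # slot numbers from the list above


def run_unoptimized_any(lower, upper, predicate):
    """Walk the whole range, then decide; returns (result, final memory)."""
    mem = [0, 0, 0, 0]
    mem[LOWER] = lower
    mem[UPPER] = upper
    while mem[LOWER] <= mem[UPPER]:
        if predicate(mem[LOWER]):   # boolean_expression for this iteration
            mem[ACCUMULATOR] += 1   # count iterations that were true
        mem[COUNTER] += 1           # count iterations performed
        mem[LOWER] += 1             # assumed: slot 0 is also the loop variable
    # "any" only needs one true iteration, but it is checked after the loop.
    return mem[ACCUMULATOR] > 0, mem
```

The point of the model is that the quantifier is only consulted once the range is exhausted, so every iteration is paid for even when the answer is known early.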

We'll be using this rule for the first example:

View ssl-profiling-local.bro

SSL Profiling in Bro

I wrote about profiling applications over SSL recently, and this is my attempt at doing so in Bro. I haven't written a Bro script before this one, so I'm betting I've got a bunch of things wrong here. The code comes in two parts. The first is the main script, which has the core logic. The second is the "local" script, which defines the application profiles you are interested in.

The Main Script

@load base/protocols/conn
@load base/protocols/ssl
@load base/frameworks/notice
wxsBSD / gist:6d5e777afc31b3cf46d0
Last active Jul 14, 2018
Inferring contents of SSL sessions


Everything I'm talking about below is not new, but I thought it was an interesting idea and realized I already had the majority of the pieces in place to play with it. I want to share what I learned. If you are at all interested in exploring this topic further, a good paper on it is here. Also, a few years ago IOActive published a blog post on the technique, which is also a good read. Finally, the last two paragraphs in section 6 of RFC 5246 document the problem more clearly than anything else I've been able to find:

Any protocol designed for use over TLS must be carefully designed to
deal with all possible attacks against it.  As a practical matter,
this means that the protocol designer must be aware of what security
properties TLS does and does not provide and cannot safely rely on
the latter.
View gist:2936585412fd57f039fd7ecd7b24cd1b
* fmtid + 24 == number of property identifiers and offsets
* fmtid + 28 == start of property identifier and offsets (4 bytes each)
rule test {
//$fmtid = { 02 d5 cd d5 9c 2e 1b 10 93 97 08 00 2b 2c f9 ae }
$fmtid = { e0 85 9f f2 f9 4f 68 10 ab 91 08 00 2b 27 b3 d9 }
$redacted_author = "REDACTED AUTHOR"
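The offsets in the notes above can be tried out in Python before committing them to a rule. This is a sketch under the assumptions stated in those notes: a 32-bit little-endian count at fmtid + 24, followed at fmtid + 28 by pairs of 4-byte identifier and 4-byte offset. The function name `property_table` is hypothetical, and the rest of the rule is cut off here, so this does not reproduce its condition.

```python
import struct

# Same bytes as the active $fmtid string in the rule.
SUMMARY_INFO_FMTID = bytes.fromhex("e0859ff2f94f6810ab9108002b27b3d9")


def property_table(data, fmtid=SUMMARY_INFO_FMTID):
    """Return [(property_id, offset), ...] read relative to the first fmtid hit."""
    base = data.find(fmtid)
    if base < 0 or base + 28 > len(data):
        return []
    # fmtid + 24: number of property identifier/offset pairs
    (count,) = struct.unpack_from("<I", data, base + 24)
    entries = []
    for n in range(count):
        pos = base + 28 + 8 * n  # fmtid + 28: first pair, 4 + 4 bytes each
        if pos + 8 > len(data):
            break  # truncated table; keep what was readable
        entries.append(struct.unpack_from("<II", data, pos))
    return entries
```

Each returned offset still has to be resolved to find the property value itself; that step is left out because the notes shown here do not describe what the offsets are relative to.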