wxsBSD/yara-loop-optimization-details.md

## yara-loop-optimization-details.md

      
    Raw
  

              yara-loop-optimization-details.md
            
          
    YARA Loop Optimization Details

Let's look at the bytecode without my optimizations. Before we do that let's set some terminology, because I find it easier to use names compared YARA VM memory locations. These are the names I've mostly borrowed from the comments in the grammar:

memory 0: lower bound
memory 1: boolean_expression accumulator
memory 2: iteration counter
memory 3: upper bound

We'll be using this rule for the first example:
"any" integer range case

for any i in (0..5): (true)

Unoptimized

These are the instructions generated by the master branch:
1       INIT_RULE
2       PUSH 1      ; "any"
3       PUSH UNDEF  ; "end of list"
4       PUSH 0      ; integer range lower bound
5       PUSH 5      ; integer range upper bound
6       CLEAR_M 1   ; clear boolean expression accumulator
7       CLEAR_M 2   ; clear iteration counter
8    .--POP_M 3     ; store upper bound
9    |  POP_M 0     ; store lower bound
10   |  PUSH 1      ; true
11   |  ADD_M 1     ; add boolean_expression result to accumulator
12   |  INCR_M 2    ; increment loop iteration counter
13   |  INCR_M 0    ; increment lower bound (more like current bound)
14   |  PUSH_M 0    ; lower (current) bound
15   |  PUSH_M 3    ; upper bound
16   `--JLE         ; jump to start of loop if we haven't iterated enough
17      POP         ; if we didn't take the jump we have to clean up the stack
18      POP
19      POP         ; pop end of list
20      SWAPUNDEF 2 ; at this point only our "any" is on the stack, this is effectively a NOP
21      PUSH_M 1    ; push the boolean_expression accumulator
22      INT_LE      ; compare boolean_expression accumulator to primary_expression ("any")
23      MATCH_RULE
24      NOP
25      HALT

Optimized

My branch adds another memory location:
memory 4: primary_expression minimum
Here is the instructions generated by my optimization branch:
1       INIT_RULE
2       PUSH 1      ; "any"
3       CLEAR_M 4   ; store primary_expression in m4 while
4       ADD_M 4     ; keeping it on the stack by pushing
5       PUSH_M 4    ; it back on after ADD_M pop'ed it off.
6       PUSH UNDEF  ; "end of list"
7       PUSH 0      ; integer range lower bound
8       PUSH 5      ; integer range upper bound
9       CLEAR_M 1   ; clear loop local variables
10      CLEAR_M 2   ; clear loop local variables
11   .--POP_M 3     ; store upper bound
12   |  POP_M 0     ; store lower bound
13   |  PUSH 1      ; true
14   |  ADD_M 1     ; add boolean_expression result to accumulator
15   |  INCR_M 2    ; increment loop iteration counter
16   |  INCR_M 0    ; increment lower bound (more like current bound)
17   |  PUSH_M 4    ; primary expression minimum
18   |  PUSH_M 1    ; boolean_expression accumulator
19 .-+--JLE         ; jump out of loop if (minimum <= accumulator)
20 | |  POP         ; clean up stack if we don't take jump
21 | |  POP         ; clean up stack if we don't take jump
22 | |  PUSH_M 0    ; lower (current) bound
23 | |  PUSH_M 3    ; upper bound
24 | `--JLE         ; jump to start of loop if we haven't iterated enough
25 `----NOP         ; jump target for early exit
26      POP         ; if we took the early exit this will clean up those two pushes
27      POP         ; if we didn't take the second jump this cleans up those two pushes
28      POP         ; pop end of list
29      SWAPUNDEF 2 ; at this point only our "any" is on the stack, this is effectively a NOP
30      PUSH_M 1    ; push the boolean_expression accumulator
31      INT_LE      ; compare boolean_expression accumulator to primary_expression ("any")
32      MATCH_RULE
33      NOP
34      HALT

Some things to note:

Lines 3 to 5 are new. They exist because we want to store the primary expression in memory 4 while still keeping it on the stack. These could be reduced to a single SET_M instruction which restores the value on the stack after pop'ing it and setting it in memory.
Lines 17 to 21 are new. We compare the primary_expression minimum in memory 4 ("any" or 1 in this example) to the boolean expression accumulator (1 the first time through the loop) and jump out of the loop if we have met the condition. If we haven't met the condition we pop the two numbers off the stack and do the regular loop iteration comparison.
Line 25 is new and serves as a jump target for the early exit check. It's worth noting that the number of POP instructions after the loop are the same because we will either be cleaning up after the early exit check or the loop iteration check, but the size of the stack is the same in both cases.

"all" integer range case

Now let's look at the "all" case:
In this case I'm using memory 4 to store the last boolean expression result. This is stored differently from the accumulator because I want to check it at each iteration of the loop.
for all i in (0..5): (true)

Unoptimized

Here's the instructions generated by the master branch:
1       INIT_RULE
2       PUSH UNDEF  ; "all"
3       PUSH UNDEF  ; "end of list"
4       PUSH 0      ; integer range lower bound
5       PUSH 5      ; integer range upper bound
6       CLEAR_M 1   ; clear boolean expression accumulator
7       CLEAR_M 2   ; clear iteration counter
8    .--POP_M 3     ; store upper bound
9    |  POP_M 0     ; store lower bound
10   |  PUSH 1      ; true
11   |  ADD_M 1     ; add boolean_expression result to accumulator
12   |  INCR_M 2    ; increment loop iteration counter
13   |  INCR_M 0    ; increment lower bound (more like current bound)
14   |  PUSH_M 0    ; lower (current) bound
15   |  PUSH_M 3    ; upper bound
16   `--JLE         ; jump to start of loop if we haven't iterated enough
17      POP         ; if we didn't take the jump we have to clean up the stack
18      POP
19      POP         ; pop end of list
20      SWAPUNDEF 2 ; swap the UNDEF ("all") with loop iteration counter (memory 2)
21      PUSH_M 1    ; push the boolean_expression accumulator
22      INT_LE      ; compare boolean_expression accumulator to loop iteration counter
23      MATCH_RULE
24      NOP
25      HALT

The difference between the "any" case is two things:

The expression which is pushed on line 2 is UNDEF instead of 1.
Line 20 is no longer a NOP because we pushed UNDEF on line 2.

Optimized

Here's the instructions for this condition as generated by my loop_optimization branch:
1       INIT_RULE
2       PUSH UNDEF  ; "all"
3       PUSH UNDEF  ; "end of list"
4       PUSH 0      ; integer range lower bound
5       PUSH 5      ; integer range upper bound
6       CLEAR_M 1   ; clear loop local variables
7       CLEAR_M 2   ; clear loop local variables
8    .--POP_M 3     ; store upper bound
9    |  POP_M 0     ; store lower bound
10   |  PUSH 1      ; true
11   |  POP_M 4     ; store boolean expression result in memory 4
12   |  PUSH_M 4    ; we still want it on the stack so we can add it to the accumulator
13   |  ADD_M 1     ; add boolean_expression result to accumulator
14   |  INCR_M 2    ; increment loop iteration counter
15   |  INCR_M 0    ; increment lower bound (more like current bound)
16   |  PUSH_M 4    ; boolean expression result
17 .-+--JFALSE      ; jump out of loop if last result is false
18 | |  POP         ; clean up stack if we don't take jump
19 | |  PUSH_M 0    ; lower (current) bound
20 | |  PUSH_M 3    ; upper bound
21 | `--JLE         ; jump to start of loop if we haven't iterated enough
22 |    POP         ; clean up the upper bound left on the stack
23 `----NOP         ; jump target for early exit
24      POP         ; if we took the early exit this will clean up boolean expression result or it will clean up line 19
25      POP         ; pop end of list
26      SWAPUNDEF 2 ; swap the UNDEF ("all") with loop iteration counter (memory 2)
27      PUSH_M 1    ; push the boolean_expression accumulator
28      INT_LE      ; compare boolean_expression accumulator to loop iteration counter
29      MATCH_RULE
30      NOP
31      HALT

This is similar to the "any" case but rather than a JLE we have a JFALSE and jump if the boolean expression result is ever false.
Integer Sets (TBD)

I haven't quite figured out integer sets or string sets (they are the same problem). Here's how the instructions look for this rule:
for any i in (0,1,2,3,4,5): (true)

1       INIT_RULE
2       PUSH 1      ; "any"
3       PUSH UNDEF  ; "end of list"
4       PUSH 0
5       PUSH 1
6       PUSH 2
7       PUSH 3
8       PUSH 4
9       PUSH 5
10      CLEAR_M 1   ; clear boolean expression accumulator
11      CLEAR_M 2   ; clear iteration counter
12   .--POP_M 0     ; store lower bound
13   |  PUSH 1      ; true
14   |  ADD_M 1     ; add boolean_expression result to accumulator
15   |  INCR_M 2    ; increment loop iteration counter
16   `--JNUNDEF     ; Are we at the end of list?
17      POP         ; pop end of list
18      SWAPUNDEF 2 ; at this point only our "any" is on the stack, this is effectively a NOP
19      PUSH_M 1    ; push the boolean_expression accumulator
20      INT_LE      ; compare boolean_expression accumulator to primary_expression ("any")
21      MATCH_RULE
22      NOP
23      HALT

If I apply the same logic for early exits in the "any" and "all" case as I have explained above I won't know where to put the jump target so that the stack unwinds properly. To put it another way, I would need to add some number of POP instructions to clean up the stack but I won't know how many (and thus where to put the jump target), because the right value depends entirely upon how many times we have iterated through the loop (and thus consumed something off the stack). I'm still considering this case but for now I'm leaving that unoptimized.