I present some new additions to what AFL can currently offer. It helps to make AFL have better heurstics, and along with DTA, it helps AFL find what data points to fuzz more intensively. Will the decrease in performance outweigh its benefits of heurstics?
Recap of AFl
Esentiaally AFL starts by taking some seed inputs, and then mutates it. Then it tests whether if the mutated input results in a different control flow. If it does, AFL will retain it to base other further fuzzing from this input.
Limitations of AFL
Overall, AFL has a very primitive and minimal understanding of the target binary. It doesn't know how to manipulate the input other than making "educated guesses" of mutations. It also doesn't know what parts of an input to change.
Furthermore, AFL can detect when a certain input causes an advancement of the "frontier" (i.e. a branch path that is has not taken before), and it will save this input so that it will not have to jump over this frontier again, but does not know how to focus its resources on advancing on a "frontier" that it has not discovered yet.
The problem with focusing on resources on advancing a "frontier" that is not discovered yet is that it is pretty difficult to determine the probabilities of getting a certain input value that will allow getting past the current frontier. For example, you could possibly focus so much of your resources on a particular branch only to realize that most of your attempts are useless.
AFL with frontier
AFL frontier creates a heuristic that AFL itself lacks. It creates a heuristic for detecting a frontier.
A frontier is a code location that has never been visited before, but it can be directly reached by going the other branch of a visited conditional branch.
Currently, AFL frontier uses three steps to incorporate frontier heuristics.
First, it documents all the possible frontiers during compile time, and it places those possible frontiers into the instrumentation code. This involves working a bit at the assembly level to do that.
Second, during runtime, it tracks the visited and frontier locations. This allows AFL to confirm that the frontier locations reported by the instrumentation actually becomes "frontier", or if it is already visited, it should not be regarded as a frontier. It also records the number of times that a frontier is hit (i.e. the conditional branch leading up to the frontier is hit). This allows AFL-FTR to boost those frontiers that are being hit more times, but cannot be penetrated yet.
Third, when AFL-FTR computes the performance score to determine the number of undeterministic mutation, AFL-FTR will boost the performance score of those input seeds with higher frontier/ frontier hit counts.
How does AFL-FTR determine frontiers? This will briefly describe the algorithm for determining frontiers. First we have to modify the instrumentation code, particularly when we are at a conditional branch. During runtime, when we encounter a conditional branch we first determine where the branch will take us based on the conditional expression.
We define the branch that we end up taking as "visited", and the other branch as "frontier". Once we find which path we end up taking, we mark the taken path as "visited" and the not-taken path as "frontier". There is an exception to this rule: if the not-taken path has been taken before, i.e. "visited" we do NOT mark it as frontier again.
Then simultaneously, we also count the number of times the "frontier" branch has been marked as a frontier. This constitutes asthe "hit count" of our frontier branch. By acheiving higher levels of hits, we can assume that there might be a higher chance of possibly breaking into that frontier branch.
Then in the last step, AFL-FTR then takes these frontier bits and updates the global frontier bits. Sometimes frontiers found in this particular input is actually not a frontier when all the other paths have also been calculated, i.e. maybe a frontier location is in fact already visited before. These locations are then retroactively revoked as "frontier" in the particular input case.
AFL vs AFL-FTR
So I ran my AFL frontier code alongside with AFL to see how they fare. It appears that AFL Frontier has a pretty significant performance gap, which is expected because I had to add more processing for each run, and have not done much optimizations yet. However, we can see that AFL frontier has very evidentially provided with more paths than AFL could find, despite the less execs. We can see that AFL frontier generates almost 25% more favored paths than its AFL counterparts, in effect, greatly increasing its overall efficiency.
When I ran this, I only tested with the undeterministic form only because that was the only change that might affect the efficiency of AFL vs AFL Frontier. Afterwards, more rigourous testing should be done in order to test the performance of AFL frontier vs AFL. Another interesting to note is that AFL does also generate more "unique" crashes because of the shear number of executions. Furthermore, I theorize that perhaps AFL performs better because the crash probably is a "shallow" crash, i.e., it doesn't require much fuzzing to find the crash at all. In theory, if AFL frontier can find more paths, then it would be more keen to discover deeper and more elusive bugs and crashes.
I also added some additional interesting statistics that show the overall frontier coverage. The most important values are the proximity degree, and average proximity. The proximity degree here takes into account the total number of frontiers for each input case and also the hit count of each frontier. Intuitively, the more frontiers that an input case is exposed to, the higher the proximity. We also have an average proximity, with a standard deviation attached to it. Together the average proximity and current proximity can show whether, relative to recent input cases whether this new input seed should be used over others. Currently, I have not incorporated the standard deviation, but later on, I will aim at incorporating the standard deviation as well as a better standard of comparing input cases for more frontiers.
AFL Frontier TODO
There are a few things that need to be added to AFL frontier to be better.
First of all, AFL frontier is unable to detect frontiers from jump tables and switch blocks. The reason why currently it is out of reach is because a switch block could have multiple frontiers, and currently the instrumentation only supports one other frontier.
Second of all, AFL frontier does not filter out input cases that have a low proximity. I have not been able to test out its resulting efficacy in time, so this hasn't been fully implemented yet. This allows AFL to skip those input cases that have lower number of frontiers favor of those input cases that have much higher frontier counts
Also currently, the heuristic for determining whether an input case "stands out" as having a much higher/lower proximity is done so by using a factor multiple. I would like to instead utilize standard deviations since I found that standard deviations could widely differ, and perhaps a constant multiple may not always be suitable.
If given more time, I would also like to perform more rigorous testing and optimize any performances to really confirm its efficiency over AFL. There are also some miscellaneous backend documentation and code that I want to also touch on (but it's not particularly interesting to address here).
Unfortunately, I will not be here for long. I will only be here for just two more weeks, will officially leave next next Friday.
I will close with making some ideas for incorporating dynamic taint analysis. First to note one thing: I've actually been digging around, and found that the author of AFL has actually made a few comments on dynamic taint analysis, symbolic execution, and any other whitebox fuzzing in general. He acknowledges that there are many more sophisticated fuzzers than his AFL fuzzer, but he also remarks that these more sophisticated fuzzers are often much slower than traditional fuzzing, that any gains in deeper knowledge of the program gets overcome by its shear lack of performance.
Therefore, I think one thing to reiterate: with adding more sophisticated components for fuzzing, they should be done more and more sparingly. With dynamic taint analysis, I think to make it minimal as possible, DTA should only run on each interesting input seed once. Once this one-time transaction is completed, it should not need to run again. We also should only run it on just intersting input seeds, to avoid some redundancies with similar input seeds.
After DTA is invoked once, we can use DTA to map each input byte to a corresponding frontier code location. For even more efficiency, we can just whether if a bytes IS a frontier byte or not at that time. This gives AFL a pretty good basis of what to probably fuzz in order to provide even more coverage.