james0zan/llvm-bloat.md

## llvm-bloat.md

      
    Raw
  

              llvm-bloat.md
            
          
    Removing Redundant Operations with LLVM

Summary

An operation is classified as redundant if either (a) it is unneeded thus simply omitting the execution of it will not change the output of the program; or (b) it is repeated, which means the result of this operation is the same as a former run.
The prevalent existence of these redundant operations is a main cause of performance bugs [1].
In this project, we intend to detect these bugs by 1) identifying the redundant operations on instruction granularity via dynamic slicing; and then 2) aggregating them into higher-level blocks that worth fixing.
In comparison with our method, the previous works [2, 3] on this topic only focus on detecting cacheable data (the repeated operations) and neglect the unneeded operations.
Introduction

A key reason why performance bugs escape so easily to production is the lack of reliable oracles, which means that it is usually hard for the developers to distinguish whether a performance bug is triggered or not.
The profiler, which is the most commonly used method at present, are not effective enough because it only reports what takes time but not what wastes time, thus it may miss a performance bug if the buggy code is not slow compared to the rest of that execution.
However, the situation changes when we only focus on the bugs cause by redundant operations.
An operation is classified as redundant if either (a) it is unneeded thus simply omitting the execution of it will not change the output of the program; or (b) it is repeated, which means the result of this operation is the same as a former run.
The time consumed by these redundant operations are simply wasted (although it does not mean that these wastage can be easily avoided).
And more importantly, according to our investigation, the prevalent existence of them is a main source of performance bugs [1].
As an illustration, Apache#45464 gives an example of unneeded operations, in which the programmer passes an inappropriate parameter to apr_stat() thus unneeded data is calculated and causes more than ten times slowdown in Apache servers.
For the repeated operations, Apache#48778 shows a bug that can be avoided by memorizing the result of NumberFormat.getInstance().
In this project, we intend to resolve the performance bugs caused by redundant operations in three steps:

identifying the redundant operations on instruction granularity via dynamic slicing;
aggregating them into higher-level blocks that worth fixing;
and 3) fixing the bugs by passing specific parameters or adding proper break/return/if statements to bypass the redundant code.

In comparison with our method, the previous works on this topic only focus on detecting cacheable data [2] or similar memory-access patterns [3], which are both a subset of the repeated operations.
Thus they cannot detect the bugs caused by unneeded operations.
Tentative Method

In the reset of this section, the word “instruction” means a dynamic instance of a static instruction observed at runtime.
Step 1: Identifying the redundant operations on instruction granularity
    Part 1: Identifying unneeded instructions
        All the parameters of an output system call (e.g., print, write, create) are needed.
        If a data is needed, the last efficacious write instruction to it is needed. （A write is efficacious if the written value is different from the previous value).
        If an instruction is needed,
        1) all the data it reads is needed;
        2) the control flow instruction that leads to its execution is needed.
        An instruction is unneeded if it is not needed.
    Part 2: Identifying repeated instructions
        An instruction is repeated if there exist K other instructions that
        1) are derived from the static instruction; and
        2) read/write the same value.
Aggregating them into higher-level blocks that worth fixing
    Performance bugs caused by redundant operations can be classified into several different categories
        The result of certain function is usually the same
        Only part of the function result are used in future
        The later part of a loop/function need not to be executed
        Some operation needs not to be executed under certain context
    Identifying performance bugs by calculating the average ratio of redundant instructions of each call-site (call stack + calling instruction) or loop. (Different formula is used for different category.)
Fixing the bugs by inspecting manually.

All the information is gathered by using static instrumenting implemented with LLVM.
Reference

[1] Apache#44408 TILES#521 WW#3577 Apache#48778 Apache#50792 PIVOT#708 WW#3581 HDFS#1676 ZOOKEEPER#1015 Apache#19101 Apache#50716 Mozilla#913310 Apache#45464 Apache#50878 Mozilla#855464 GCC#57786 Mozilla#897258 Mozilla#983570 …
[2] Cachetor: Detecting Cacheable Data to Remove Bloat. FSE ‘13
[3] Toddler: Detecting Performance Problems via Similar Memory-Access Patterns. ICSE ‘13