@Arsenic-ATG
Last active December 1, 2023 20:19
GSoC 2021 : Extending gcc static analyzer to support virtual function calls.

Overview

The project aims to make GCC's static analysis pass ( -fanalyzer option ) understand dynamic dispatch ( calls to virtual functions ) in C++.

This project will greatly benefit people who, like me, use the static analysis pass to find problems in their C++ programs at compile time rather than spending far more time hunting for them at runtime. This makes debugging any C++ project faster and easier, with no additional tools beyond the GCC compiler itself.

Status of Analyzer Before (GCC 11):

Although GCC's static analyzer does its job amazingly well for C code, it doesn't properly support C++ yet. Let's see this with an example where the analyzer produces a false negative, i.e. misses a real bug :-

#include <cstdlib>

struct A
{
    virtual int foo (void) 
    {
        return 42;
    }
};

struct B: public A
{
    int *ptr;
    void alloc ()
    {
        ptr = (int*)malloc(sizeof(int));
    }
    int foo (void) 
    { 
        free(ptr);
        return 0;
    }
};

int test()
{
    struct B b, *bptr=&b;
    b.alloc();
    bptr->foo();
    return bptr->foo();
}

int main()
{
    test();
}

(compiler explorer link : https://godbolt.org/z/xWhcY63Tz )

When the above program is compiled with the static analysis pass enabled ( -fanalyzer option ), the analyzer fails to detect a double free() in the program, which is then only caught at runtime.

This occurs because the analyzer doesn't understand dynamic dispatch and thus fails to warn about a problem that arises because of it. This can be confirmed by looking at the analyzer's "exploded graph" for the above program ( obtained via -fdump-analyzer-exploded-graph option ), which shows that it fails to follow calls made via a virtual table pointer ( _vptr ) rather than directly.
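To make the difficulty concrete, here is a minimal hand-written sketch of what a virtual call roughly lowers to: a load of the table pointer, a load of the slot, and an indirect call. Every name in it (`Obj`, `VTable`, `call_foo`, and so on) is hypothetical and chosen for illustration; it is not GCC's actual object layout or generated code.

```cpp
#include <cassert>

// Hypothetical, hand-rolled stand-in for a class with one virtual function.
struct Obj;
using FooFn = int (*)(Obj *);

// One slot per virtual function.
struct VTable { FooFn foo; };

// Every object carries a pointer to the table of its dynamic type.
struct Obj { const VTable *vptr; };

int a_foo(Obj *) { return 42; }  // plays the role of A::foo
int b_foo(Obj *) { return 0; }   // plays the role of B::foo

const VTable a_vtable = {a_foo};
const VTable b_vtable = {b_foo};

// The call site names no callee: the target is only known once the
// value of obj->vptr is known.
int call_foo(Obj *obj) {
    assert(obj != nullptr && obj->vptr != nullptr);
    return obj->vptr->foo(obj);
}
```

Nothing in `call_foo` says which function runs, so a tool that only follows statically known call edges has nothing to follow here.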

Project Aims

  • Find out what analyzer emits in case of a vfunc call.
  • Find out which function is called at the callsite.
  • Make analyzer analyse calls to functions that don't possess an underlying supergraph edge ( or a cgraph edge ) representing the call.
  • Update gcc's testsuite with tests covering this new feature.

Challenges Faced :

  1. Whenever a function is called indirectly ( either via a function pointer or via a vtable pointer ), gcc's middle end doesn't know which function is called, so it doesn't create a cgraph edge representing the call. This led to adding enough logic to the analyzer to detect those calls dynamically while analysing the program.

  2. A call in the analyzer's call stack was represented by a supergraph edge, making it almost impossible to represent calls that don't have a supergraph edge. This led to changing the way the analyzer represents its call stack: a call in the call stack is now represented by a custom structure that contains information about the callee and caller nodes.
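The second point can be sketched as follows. This is only an illustration of the idea of storing caller and callee nodes per frame instead of an edge; `Node`, `CallFrame`, and `CallStack` are hypothetical names and do not reproduce the analyzer's real classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a supergraph node.
struct Node { std::string function_name; };

// A frame records both ends of the call directly, so it can describe
// a call even when no supergraph edge exists for it.
struct CallFrame {
    const Node *caller;
    const Node *callee;
};

class CallStack {
public:
    void push_call(const Node *caller, const Node *callee) {
        frames_.push_back({caller, callee});
    }

    // Pops the innermost frame and returns the node to resume at.
    const Node *pop_return() {
        assert(!frames_.empty());
        const Node *resume_at = frames_.back().caller;
        frames_.pop_back();
        return resume_at;
    }

    std::size_t depth() const { return frames_.size(); }

private:
    std::vector<CallFrame> frames_;
};
```

With frames of this shape, a call discovered during analysis (for example, once the target of a vtable slot becomes known) can be pushed like any other call.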

List of commits :

List of Bugs solved along the way :

Status of Analyzer now (GCC 12):

The project not only made the analyzer understand virtual function calls; the analyzer can now also follow function calls made via function pointers.
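As an illustration of the function-pointer case, here is a small, deliberately buggy sketch. The names `release`, `apply_twice`, and `demo` are hypothetical and this is not one of the testsuite files; the exact diagnostic the analyzer prints for it is not reproduced here.

```cpp
#include <cassert>
#include <cstdlib>

// Frees the buffer it is given.
void release(int *p) { std::free(p); }

// Calls fn twice on the same pointer through a function pointer.
void apply_twice(void (*fn)(int *), int *p) {
    assert(fn != nullptr);
    fn(p);
    fn(p);
}

// Deliberately buggy: with fn == release, the second indirect call
// frees memory that the first call already freed.
int demo() {
    int *p = static_cast<int *>(std::malloc(sizeof(int)));
    apply_twice(release, p);
    return 0;
}
```

Compiling a file like this with `g++ -fanalyzer` is a quick way to check whether a given GCC version follows the indirect calls to `release`.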

Let's take the same example ( now also a part of gcc's testsuite ) where we saw the analyzer failing to catch double free before and see if it catches it now : -

#include <cstdlib>

struct A
{
    virtual int foo (void) 
    {
        return 42;
    }
};

struct B: public A
{
    int *ptr;
    void alloc ()
    {
        ptr = (int*)malloc(sizeof(int));
    }
    int foo (void) 
    { 
        free(ptr);
        return 0;
    }
};

int test()
{
    struct B b, *bptr=&b;
    b.alloc();
    bptr->foo();
    return bptr->foo();
}

int main()
{
    test();
}

When the above program is now compiled with the static analysis pass enabled, the analyzer detects the double free properly and generates the following warning : -

20:13: warning: double-'free' of 'b.B::ptr' [CWE-415] [-Wanalyzer-double-free]
   20 |         free(ptr);
      |         ~~~~^~~~~
  'int test()': events 1-2
    |
    |   25 | int test()
    |      |     ^~~~
    |      |     |
    |      |     (1) entry to 'test'
    |......
    |   28 |     b.alloc();
    |      |     ~~~~~~~~~
    |      |            |
    |      |            (2) calling 'B::alloc' from 'test'
    |
    +--> 'void B::alloc()': events 3-4
           |
           |   14 |     void alloc ()
           |      |          ^~~~~
           |      |          |
           |      |          (3) entry to 'B::alloc'
           |   15 |     {
           |   16 |         ptr = (int*)malloc(sizeof(int));
           |      |                     ~~~~~~~~~~~~~~~~~~~
           |      |                           |
           |      |                           (4) allocated here
           |
    <------+
    |
  'int test()': events 5-6
    |
    |   28 |     b.alloc();
    |      |     ~~~~~~~^~
    |      |            |
    |      |            (5) returning to 'test' from 'B::alloc'
    |   29 |     bptr->foo();
    |      |     ~~~~~~~~~~~
    |      |              |
    |      |              (6) calling 'B::foo' from 'test'
    |
    +--> 'virtual int B::foo()': events 7-8
           |
           |   18 |         int foo (void)
           |      |             ^~~
           |      |             |
           |      |             (7) entry to 'B::foo'
           |   19 |     {
           |   20 |         free(ptr);
           |      |         ~~~~~~~~~
           |      |             |
           |      |             (8) first 'free' here
           |
    <------+
    |
  'int test()': events 9-10
    |
    |   29 |     bptr->foo();
    |      |     ~~~~~~~~~^~
    |      |              |
    |      |              (9) returning to 'test' from 'B::foo'
    |   30 |     return bptr->foo();
    |      |            ~~~~~~~~~~~
    |      |                     |
    |      |                     (10) calling 'B::foo' from 'test'
    |
    +--> 'virtual int B::foo()': events 11-12
           |
           |   18 |         int foo (void)
           |      |             ^~~
           |      |             |
           |      |             (11) entry to 'B::foo'
           |   19 |     {
           |   20 |         free(ptr);
           |      |         ~~~~~~~~~
           |      |             |
           |      |             (12) second 'free' here; first 'free' was at (8)
           |

(compiler explorer link : https://godbolt.org/z/9oYqoxbTe )

@mw6969

mw6969 commented Nov 22, 2023

Hello, tell me please if it's possible to add '-fanalyzer' option through CMakeLists?

@Arsenic-ATG
Author

Hello, tell me please if it's possible to add '-fanalyzer' option through CMakeLists?

Hi @mw6969, -fanalyzer is just a GCC command line option, so in theory it shouldn't be any different from adding any other compiler option to your CMake project.

https://cmake.org/cmake/help/latest/command/add_compile_options.html
