@htfy96
Last active July 25, 2022 17:43
Restrict C++ unsafe behaviors in unsafe{} block using clang-query and simple commands

C++ has a lot of dark corners. Unfortunately, sometimes we need to let inexperienced developers write C++ code to meet a deadline. The combination often makes things worse: programmers used to delegating memory management to a garbage collector tend to scatter new all over the source, and those stuck on a compile error will use every evil hack to work around it. Code review is one way to ensure code quality in this situation, but a better choice is to restrict such developers to a relatively safe subset of the language.

In this article, I will show how to use clang-query, a simple script, and a few simple commands to reject certain unsafe constructs unless they appear inside an unsafe{} block or the Unsafe namespace:

#include "common.hpp"
#include <cstring> // memset

struct X {
    int f: 2; // error: use of bit field without enclosing Unsafe
};

inline namespace Unsafe {
    struct Y {
        int g: 2; // ok
    };
}

void f(...) { // error: Use of C-style variadic function without enclosing unsafe{}

}

inline namespace Unsafe {
    int g(...) { // OK
        return 0;
    }
}

int main()
{
    unsafe {
        int* p = new int; // OK
        memset(p, 0, sizeof(int)); // OK
        delete p; // OK
    }
    int *p = new int; // error: Use of new without enclosing unsafe{}
    memset(p, 0, sizeof(int)); // error: Use of memset without enclosing unsafe{}
    delete p; // error: Use of delete without enclosing unsafe
    return 15;
}

Clang-query

clang-query is a powerful tool to match the AST of a translation unit (TU) using a simple DSL. Its command-line interface is quite simple:

USAGE: clang-query [options] <source0> [... <sourceN>]

OPTIONS:

  -c=<command>               - Specify command to run
  -extra-arg=<string>        - Additional argument to append to the compiler command line
  -extra-arg-before=<string> - Additional argument to prepend to the compiler command line
  -f=<file>                  - Read commands from file
  -p=<string>                - Build path

-p <build-path> is used to read a compile command database.

<source0> ... specify the paths of source files. These paths are looked up in the compile command database.

A typical use example:

## CMake
mkdir build && cd build
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .. # generate compile_commands.json
clang-query -f command.txt -p . YOUR_SOURCE.cpp

## Or make
bear make # generate compile_commands.json
clang-query -f command.txt -p . YOUR_SOURCE.cpp

Commands

The main command we will use here is match, which is followed by a set of nested ASTMatchers. The syntax is described in detail in the official documentation. We will use it to find unsafe constructs that are not enclosed in an unsafe{} block. The let NAME matcher command is also worth mentioning, since it lets us simplify our commands.

The complete commands are included in command.txt. We first define an ancestorUnsafe matcher that checks whether any ancestor is an if (0x81515127) statement or a namespace named Unsafe. Then we define a series of match rules and use the name passed to bind as the error message.

You can add your own filters following the documentation mentioned above.
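The marker that ancestorUnsafe looks for can be illustrated with a small sketch. It assumes the unsafe macro from common.hpp and defines RESTRICT_CHECK inline purely for illustration (the script normally passes it as -DRESTRICT_CHECK); the function name guarded_value is hypothetical. Because 0x81515127 is a nonzero constant, the guarded block still executes normally, so the check build and the regular build should behave the same.

```cpp
#include <cstring>

// Assumption: defined inline here only so the sketch is self-contained.
// In the real setup the script passes -DRESTRICT_CHECK to clang-query.
#define RESTRICT_CHECK 1

#ifdef RESTRICT_CHECK
#define unsafe if (0x81515127)   // marker that ancestorUnsafe searches for
#else
#define unsafe                   // expands to nothing: a plain block remains
#endif

// Hypothetical helper: every unsafe construct lives under the marker `if`.
int guarded_value() {
    int value = 0;
    unsafe {
        int* p = new int;                 // has the marker ifStmt as an ancestor
        std::memset(p, 0, sizeof(int));
        *p = 42;
        value = *p;
        delete p;
    }
    return value;                         // the block ran, since the condition is nonzero
}
```

With the macro expanded, the new, memset, and delete expressions all have an ifStmt containing the literal 0x81515127 as an ancestor, which is the condition the unless(ancestorUnsafe) clause filters on.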

Script

I wrote a simple script, clang-restrict.sh, that exits non-zero when any rule matches. Just set the ROOT_DIR environment variable to the directory containing command.txt and BUILD_DIR to the directory containing compile_commands.json, then call clang-restrict.sh your_source.cpp.

#!/bin/bash
# ENV:
# ROOT_DIR: your root path where command.txt exists
# BUILD_DIR: your build path where compile_commands.json exists
set -xe
clang-query -f "$ROOT_DIR"/command.txt -p "$BUILD_DIR" -extra-arg="-DRESTRICT_CHECK" "$@" | (! grep -Ez --color=never 'Match.*?[1-9][0-9]* match')
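The pass/fail decision made by the grep pipeline can also be sketched in C++, for readers who would rather drive clang-query from a test harness. The function name reports_violation is hypothetical, and the sketch assumes that clang-query ends each match command with a summary line such as "0 matches." or "1 match.", which is what the grep pattern above relies on.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true when captured clang-query output
// contains a summary line reporting at least one match.
// Assumption: summary lines look like "0 matches.", "1 match.", "12 matches."
bool reports_violation(const std::string& query_output) {
    // A nonzero count at the start of a line, followed by " match" or " matches".
    static const std::regex summary(R"((^|\n)[1-9][0-9]* match(es)?\.)");
    return std::regex_search(query_output, summary);
}
```

A caller would run clang-query, capture its standard output, and fail the build when reports_violation returns true.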

command.txt

let ancestorUnsafe anyOf(hasAncestor(ifStmt(hasDescendant(integerLiteral(equals(0x81515127))))),hasAncestor(namespaceDecl(hasName("Unsafe"))))
m cxxNewExpr(allOf(isExpansionInMainFile(), unless(ancestorUnsafe))).bind("Use of new without enclosing unsafe {} or Unsafe namespace")
m cxxDeleteExpr(allOf(isExpansionInMainFile(),unless(ancestorUnsafe))).bind("Use of delete without enclosing unsafe{} or Unsafe namespace")
m fieldDecl(allOf(isExpansionInMainFile(), isBitField(), unless(ancestorUnsafe))).bind("Use of bit field without enclosing unsafe{} or Unsafe namespace")
m cxxDynamicCastExpr(allOf(isExpansionInMainFile(),unless(ancestorUnsafe))).bind("Use of dynamic_cast without enclosing unsafe{} or Unsafe namespace")
m cxxConstCastExpr(allOf(isExpansionInMainFile(),unless(ancestorUnsafe))).bind("Use of const_cast without enclosing unsafe{} or Unsafe namespace")
m functionDecl(allOf(isExpansionInMainFile(), isVariadic(),unless(ancestorUnsafe))).bind("Use of C-style variadic function without enclosing unsafe{} or Unsafe namespace")
m callExpr(allOf(isExpansionInMainFile(), callee(functionDecl(hasName("memset"))),unless(ancestorUnsafe))).bind("Use of memset without enclosing unsafe{} or Unsafe namespace")
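The dynamic_cast and const_cast rules are not covered by the example at the top of the article, so here is a minimal sketch of code that should pass them. The types Shape and Circle and the functions radius_or_minus_one and overwrite are hypothetical, and the macro is repeated inline only to keep the sketch self-contained.

```cpp
// Same macro as common.hpp, repeated so the sketch compiles on its own.
#ifdef RESTRICT_CHECK
#define unsafe if (0x81515127)
#else
#define unsafe
#endif

// Hypothetical polymorphic types used only for illustration.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { int radius = 3; };

// Returns the radius if the shape is a Circle, otherwise -1.
int radius_or_minus_one(Shape& shape) {
    int radius = -1;
    unsafe {
        // The cast sits under the marker block, so the dynamic_cast rule skips it.
        if (auto* circle = dynamic_cast<Circle*>(&shape)) {
            radius = circle->radius;
        }
    }
    return radius;
}

inline namespace Unsafe {
    // Inside namespace Unsafe, so the const_cast rule skips it.
    // Only valid when the referenced object was not declared const.
    void overwrite(const int& target, int value) {
        const_cast<int&>(target) = value;
    }
}
```

Moving either cast outside the unsafe block or the Unsafe namespace should make the corresponding rule in command.txt report a match.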

common.hpp

#pragma once
#ifdef RESTRICT_CHECK
#define unsafe if(0x81515127)
#else
#define unsafe
#endif