MaskRay/asan.md

## asan.md

      
AddressSanitizer (ASan) is a compiler technology that detects addressability-related memory errors, with some add-on checks.
It consists of two parts: compiler instrumentation and a runtime library. In the simplest terms:

The compiler instruments global variables, stack frames, and heap allocations to maintain shadow memory.
The compiler instruments memory access instructions to check shadow memory.
In case of an error, the inserted code calls a callback (implemented in the runtime library) to report the error with a stack trace. Normally the program exits after the error message is printed.

Clang 3.1 implemented AddressSanitizer in 2011.
GCC 4.8 integrated AddressSanitizer in 2012.
MSVC (starting in Visual Studio 2019 version 16.9) added /INFERASANLIBS.
compiler-rt/lib/asan provides the runtime library for Clang and GCC.
In the Linux kernel, the commit "kasan: add kernel address sanitizer infrastructure" introduced a runtime implementation in 2015.

Usage

Shadow memory

The core is shadow memory.
Memory that the program may legitimately access, such as global variables, stack variables, and heap-allocated memory, is called addressable.
The default granularity is 8 bytes: each aligned group of 8 application bytes maps to one shadow byte.
Many platforms use a fixed shadow offset.
Some use a dynamic shadow offset loaded from __asan_shadow_memory_dynamic_address. Specify -mllvm -asan-force-dynamic-shadow to force using a dynamic shadow offset.
Android AArch32 is special and uses __asan_shadow.
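As a rough illustration of the fixed-offset case, the sketch below computes the shadow address of an application address. The constants are assumptions for this example: a granularity of 8 bytes (scale 3) and an offset of 0x7fff8000, which is believed to be the Linux x86-64 default; other platforms use different values. The names kShadowScale, kShadowOffset, and MemToShadow are chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumptions for illustration: granularity of 8 bytes (scale 3) and a fixed
// shadow offset of 0x7fff8000 (believed to be the Linux x86-64 default).
constexpr unsigned kShadowScale = 3;
constexpr uint64_t kShadowOffset = 0x7fff8000;

// Map an application address to the address of its shadow byte.
constexpr uint64_t MemToShadow(uint64_t addr) {
  return (addr >> kShadowScale) + kShadowOffset;
}

// All bytes of one granule share a shadow byte; the next granule uses the
// next shadow byte.
static_assert(MemToShadow(0x1000) == MemToShadow(0x1007), "same granule");
static_assert(MemToShadow(0x1008) == MemToShadow(0x1000) + 1, "next granule");
```

With these assumed constants, MemToShadow(0x1000) is 0x7fff8200.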
For a memory access of size bytes to the address addr, AddressSanitizer conceptually converts the following code
*addr = ...; // [addr, addr+size)
to
// after instrumentation
if (isPoisoned((uintptr_t)addr, size)) {
  reportError(addr);
  __builtin_unreachable();
}
*addr = ...;
When size<=8 and the access is aligned (align >= size), isPoisoned can be implemented as:
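bool isPoisoned(uintptr_t addr, uintptr_t size) {
  // Load the shadow byte of the granule that contains addr.
  int8_t shadow = *(int8_t *)(addr/granularity + shadow_offset);
  // 0: the whole granule is addressable. 1..7: only the first `shadow` bytes are.
  return shadow && int8_t(addr%granularity+size-1) >= shadow;
}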
With the default granularity (8 bytes), when size == 8, int8_t(addr%granularity+size-1) >= shadow; is tautological and the check can be simplified to *(addr/8+offset) != 0.
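The single-byte check can be tried out on a toy model. The sketch below is not the real implementation: the shadow is a plain array indexed by granule instead of an offset mapping, and ToyShadow, Unpoison, and IsPoisoned are names made up for this example. 0xf3 is used as an arbitrary poison value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uintptr_t kGranularity = 8;

// Toy model: application addresses are indices into a small address space and
// the shadow is a separate array with one entry per granule.
struct ToyShadow {
  std::vector<int8_t> bytes;
  // Everything starts poisoned.
  explicit ToyShadow(size_t app_size)
      : bytes(app_size / kGranularity, static_cast<int8_t>(0xf3)) {}

  // Mark [begin, begin+size) addressable; begin must be granule-aligned.
  void Unpoison(uintptr_t begin, uintptr_t size) {
    assert(begin % kGranularity == 0);
    uintptr_t g = begin / kGranularity;
    for (; size >= kGranularity; size -= kGranularity) bytes[g++] = 0;
    // A trailing partial granule records how many leading bytes are valid.
    if (size) bytes[g] = static_cast<int8_t>(size);
  }
};

// Single shadow byte check, valid for size <= 8 with an aligned access.
bool IsPoisoned(const ToyShadow &s, uintptr_t addr, uintptr_t size) {
  int8_t shadow = s.bytes[addr / kGranularity];
  return shadow &&
         static_cast<int8_t>(addr % kGranularity + size - 1) >= shadow;
}
```

For a 13-byte object at address 16, the two shadow bytes become 0 and 5. A 4-byte access at address 24 passes (3 < 5), while a 4-byte access at address 28 is reported (7 >= 5).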
When size==16 && min(size, align) >= granularity (the access is aligned by the granularity), a similar isPoisoned can be used as well.
bool isPoisoned(uintptr_t addr, uintptr_t size) {
  int16_t shadow = *(int16_t*)(addr/granularity + shadow_offset);
  return shadow;
  // as shadow<granularity<=size, addr%granularity+size-1 >= shadow is tautological.
}
For other accesses (unaligned and size<=8, size==16 && align < granularity, or size>16), we can perform two checks, one at the first byte and the other at the last byte.
Two shadow bytes, if different, cover a region of 2*granularity bytes. When size > 2*granularity, some middle bytes are not covered and the implementation may have false negatives.
bool isPoisoned(uintptr_t addr, uintptr_t size) {
  // Check the first byte as a 1-byte access.
  int8_t shadow = *(int8_t *)(addr/granularity + shadow_offset);
  if (shadow && int8_t(addr%granularity) >= shadow)
    return true;

  // Check the last byte as a 1-byte access.
  addr += size-1;
  shadow = *(int8_t *)(addr/granularity + shadow_offset);
  if (shadow && int8_t(addr%granularity) >= shadow)
    return true;

  return false;
}
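To make the false negative concrete, the following sketch compares the first-byte/last-byte check with a byte-by-byte reference on a toy shadow array. The array-indexed shadow and the function names are assumptions of this example, not the real implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uintptr_t kGranularity = 8;

// One-byte check against a toy shadow array indexed by granule.
bool IsBytePoisoned(const std::vector<int8_t> &shadow, uintptr_t addr) {
  int8_t s = shadow[addr / kGranularity];
  return s && static_cast<int8_t>(addr % kGranularity) >= s;
}

// Two checks only: the first and the last byte of [addr, addr+size).
bool IsPoisonedFirstLast(const std::vector<int8_t> &shadow, uintptr_t addr,
                         uintptr_t size) {
  assert(size > 0);
  return IsBytePoisoned(shadow, addr) ||
         IsBytePoisoned(shadow, addr + size - 1);
}

// Reference: check every byte, as a full range check would.
bool IsPoisonedFullRange(const std::vector<int8_t> &shadow, uintptr_t addr,
                         uintptr_t size) {
  for (uintptr_t i = 0; i < size; i++)
    if (IsBytePoisoned(shadow, addr + i)) return true;
  return false;
}
```

With the shadow bytes {0, 0, 0xf2, 0, 0}, a 32-byte access at address 4 starts and ends in addressable granules, so the two-check variant misses the poisoned granule in the middle while the full range check reports it.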
Shadow memory is checked by interceptors and instrumented memory operands.
An instrumented memory operand typically expands to an inline code sequence. The threshold asan-instrumentation-with-call-threshold (default: 7000) controls this.
If the number of memory accesses in a function exceeds asan-instrumentation-with-call-threshold, AddressSanitizer instead emits calls to runtime functions that check the shadow memory.
(See -fsanitize-address-outline-instrumentation below.)
Function instrumentation

AddressSanitizer instruments certain intrinsics, certain library function calls, and interesting memory operands.
ASan instrumentation converts MemIntrinsic intrinsics to __asan_memset/__asan_memcpy/__asan_memmove callbacks.
The callbacks check the full range using ASAN_WRITE_RANGE.
Some library functions that perform memory accesses are recognized as builtins (e.g. strcpy, strlen).
They need to be marked with the function attribute nobuiltin to prevent the calls from being optimized out.
These functions will be intercepted by the runtime library.
For an interesting memory operand, see isPoisoned in the beginning.
When the address is unaligned so that the access spans two shadow bytes, yet the selected isPoisoned implementation checks only one shadow byte (size <= 8 && addr % align && addr%granularity+size > granularity), isPoisoned may have false negatives when the first shadow byte is zero.
// clang++ -O0 -fsanitize=address a.cc
int main() {
  char a[8];
  auto *p = (long *)(a+1);
  *p = 1; // out-of-bounds write is not detected with granularity==8
  return *p;
}
-fsanitize=alignment can catch some missing checks.
However, certain cases still cannot be caught. For example, with -O1 or above, the memcpy call below is not optimized out.
-fsanitize=alignment does not instrument the code and the implementation checks just one shadow byte, leading to a false negative.
// clang++ -O1 -fsanitize=address,alignment a.cc
#include <string.h>
int main(int argc, char *argv[]) {
  char a[8];
  auto *p = (long *)(a+1);
  memcpy(p, &argv, sizeof(long));
  return a[7];
}
As an optimization, within a basic block, repeated accesses to the same address need to be instrumented only once, unless they are separated by a call instruction (2011-11).
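The idea can be sketched on a simplified instruction model. Everything here is hypothetical (the Inst struct, keying on a raw address rather than an IR value, and the function name); it only illustrates the "forget everything at a call" rule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified instruction model for one basic block.
struct Inst {
  enum Kind { Access, Call } kind;
  uintptr_t addr;  // meaningful only for Access
};

// Returns the indices of the accesses that would still be instrumented.
std::vector<size_t> SelectAccessesToInstrument(const std::vector<Inst> &block) {
  std::set<uintptr_t> seen;  // addresses already checked since the last call
  std::vector<size_t> out;
  for (size_t i = 0; i < block.size(); i++) {
    if (block[i].kind == Inst::Call) {
      seen.clear();  // the callee may free or poison memory
      continue;
    }
    // insert() reports whether the address was new.
    if (seen.insert(block[i].addr).second) out.push_back(i);
  }
  return out;
}
```

For the sequence access A, access A, call, access A, access B, the second access is skipped and the remaining three are instrumented.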
Global variable instrumentation

See AddressSanitizer: Global variable instrumentation.
Stack instrumentation

ASan collects stack variables (AllocaInst) and sorts them by descending alignment (from low address to high address).
Variables are padded with redzones and are either placed on the stack or on the heap for use-after-return detection.
The layout of stack variables and redzones looks like the following:
left_redzone var0 middle_redzone var1 middle_redzone var2 right_redzone

The left redzone holds 4 words:

a magic value (kCurrentStackFrameMagic = 0x41B58AB3)
the address of the frame description string
the function address
unused

Multiple shadow bytes of the variables and redzones are defined to provide better diagnostics.

0: a variable whose lifetime is alive
0xf8 (kAsanStackUseAfterScopeMagic): a variable whose lifetime has ended
0xf1: the left redzone
0xf2: middle redzones
0xf3: the right redzone
0xca (kAsanAllocaLeftMagic): the left redzone of a dynamic alloca (see the next section)
0xcb (kAsanAllocaRightMagic): the right redzone of a dynamic alloca (see the next section)
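The values above can be collected into a small lookup, which is handy when reading a shadow dump by hand. This is a sketch for illustration: DescribeStackShadow and its strings are made up here, and the handling of 1..7 follows the partial-granule rule from isPoisoned.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Map a stack-related shadow byte to a short description.
std::string DescribeStackShadow(uint8_t shadow) {
  switch (shadow) {
    case 0x00: return "addressable";
    case 0xf1: return "stack left redzone";
    case 0xf2: return "stack middle redzone";
    case 0xf3: return "stack right redzone";
    case 0xf8: return "stack use after scope";
    case 0xca: return "alloca left redzone";
    case 0xcb: return "alloca right redzone";
    default:
      // 1..7: only the first `shadow` bytes of the granule are addressable.
      if (shadow < 8) return "partially addressable";
      return "other";
  }
}
```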

For each stack variable, VarAndRedzoneSize computes the total size of the variable and its redzone on the right.
static uint64_t VarAndRedzoneSize(uint64_t Size, uint64_t Granularity,
                                  uint64_t Alignment) {
  uint64_t Res = 0;
  if (Size <= 4)  Res = 16;
  else if (Size <= 16) Res = 32;
  else if (Size <= 128) Res = Size + 32;
  else if (Size <= 512) Res = Size + 64;
  else if (Size <= 4096) Res = Size + 128;
  else                   Res = Size + 256;
  return alignTo(std::max(Res, 2 * Granularity), Alignment);
}
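To get a feel for the numbers, the function can be exercised outside LLVM. The sketch below repeats it with a local AlignTo helper standing in for llvm::alignTo (assumed here to round up to a multiple of the alignment); the Toy prefix marks it as a copy made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Stand-in for llvm::alignTo: round value up to a multiple of align.
uint64_t AlignTo(uint64_t value, uint64_t align) {
  assert(align != 0);
  return (value + align - 1) / align * align;
}

// Standalone copy of the size computation shown above.
uint64_t ToyVarAndRedzoneSize(uint64_t size, uint64_t granularity,
                              uint64_t alignment) {
  uint64_t res = 0;
  if (size <= 4) res = 16;
  else if (size <= 16) res = 32;
  else if (size <= 128) res = size + 32;
  else if (size <= 512) res = size + 64;
  else if (size <= 4096) res = size + 128;
  else res = size + 256;
  return AlignTo(std::max(res, 2 * granularity), alignment);
}
```

With granularity 8 and alignment 32, a 10-byte variable occupies 32 bytes, a 100-byte variable occupies 160 bytes (132 rounded up), and a 600-byte variable occupies 736 bytes (728 rounded up).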
If -fsanitize-address-stack-use-after-scope is enabled (the default), when a variable goes out of scope, its shadow memory is filled with 0xf8 (kAsanStackUseAfterScopeMagic).
Accessing the variable afterwards leads to a stack-use-after-scope error.
When the stack frame goes out of scope, the magic value in the left redzone is set to kRetiredStackFrameMagic = 0x45E0360E.
There are some heuristics to remove instrumentation for memory operands that are always safe.
First, AllocaInsts that can be optimized by mem2reg do not need instrumentation (2015-02).
(Non-instrumented alloca instructions in the entry block before the first instrumented alloca should not be moved around, otherwise the debug information will be broken.)
Direct in-bounds accesses to stack variables do not need instrumentation (AddressSanitizer::isSafeAccess).
Second, StackSafetyGlobalAnalysis (-mllvm -asan-use-stack-safety=1) can identify stack variables that are always safely accessed, which do not need instrumentation, e.g. int x; foo(&x); return x; (isAllocaPromotable is false and does not optimize the case).
Stack use after return

Similar to stack-use-after-scope detection, ASan performs stack-use-after-return detection. -fsanitize-address-use-after-return= accepts one of the following values:

runtime (default): instrumented code checks a global variable __asan_option_detect_stack_use_after_return to decide whether a fake stack frame is used.
always: instrumented code unconditionally creates a fake stack frame. This saves code size.
never: don't detect use-after-return.

When the instrumentation decides to create a fake stack frame, it allocates one using __asan_stack_malloc_{0..10}(uptr size).
The runtime function may return nullptr, in which case a fake stack frame is unavailable, and alloca will be used to allocate the local stack frame; otherwise, stack variables and associated redzones are allocated on the fake stack frame.
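The suffix of __asan_stack_malloc_N is a size class chosen from the frame size. The sketch below shows one way the selection could work, assuming that class 0 holds frames of up to 64 bytes and each following class doubles the limit, so that class 10 tops out at 64 KiB. These constants and the -1 convention for oversized frames are assumptions of this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limits: class 0 holds frames up to 64 bytes, class 10 up to 64 KiB.
constexpr uint64_t kMinStackMallocSize = 1 << 6;
constexpr uint64_t kMaxStackMallocSize = 1 << 16;

// Pick N for __asan_stack_malloc_N. Returns -1 when the frame is too large
// for a fake stack (a convention made up for this sketch).
int StackMallocSizeClass(uint64_t frame_size) {
  if (frame_size > kMaxStackMallocSize) return -1;
  uint64_t max_size = kMinStackMallocSize;
  // Terminates because frame_size <= kMaxStackMallocSize.
  for (int i = 0;; i++, max_size *= 2)
    if (frame_size <= max_size) return i;
}
```

Under these assumptions a 64-byte frame uses class 0, a 65-byte frame uses class 1, and a 65536-byte frame uses class 10.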
The local stack base needs llvm.dbg.declare to provide reliable debug info for local variables at -O0.
Function argument spills are also placed before the __asan_stack_malloc_* call. This is for two purposes:

Location information of function arguments is correct when __asan_stack_malloc_* is called.
At -O0, an argument does not need to be spilled before storing into the reserved alloca, saving a stack slot.

When the stack frame goes out of scope, the shadow memory of the fake stack is filled with 0xf5 (kAsanStackAfterReturnMagic) bytes.
Smaller stack frames use an inline code sequence while a larger frame calls __asan_stack_free_*.
Instrumentation inserts stack shadow unpoisoning code before each ret/resume/cleanupret instruction.
If a function calls a noreturn function, it needs to unpoison the stack shadow before the noreturn function; otherwise if the stack is reused by another process, there may be a false positive.
google/sanitizers#37
Instrumentation inserts __asan_handle_no_return calls before some noreturn functions (e.g. _exit).
Interceptors of some noreturn functions call __asan_handle_no_return.

Dynamic alloca

AllocaInst::isStaticAlloca checks whether an alloca is in the entry block and has a constant size.
Such alloca instructions can be folded into the prolog/epilog code.
Other alloca instructions are called dynamic.
AddressSanitizer instrumentation changes an interesting dynamic alloca instruction to allocate a larger space to cover left, possible partial right, and right redzones.
(alignment_padding) left_redzone var (partial_right_redzone) right_redzone

var%32 == 0 && alignment_padding%32 == 0

The start address of the variable (new_alloca+alignment) is aligned to at least 32.
The address minus the alignment gives us the start address of the left redzone.
If the alignment is greater than 32, there may be padding on the left of the left redzone.
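The size of the enlarged alloca follows from the layout above. The sketch below is a plausible reconstruction, assuming a 32-byte redzone (kAllocaRzSize), an effective alignment of max(32, requested alignment), and a partial right redzone that pads the variable up to a multiple of 32. The struct and function names are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr uint64_t kAllocaRzSize = 32;  // assumed redzone size and minimum alignment

struct ToyAllocaSizes {
  uint64_t new_size;    // bytes requested by the enlarged alloca
  uint64_t var_offset;  // offset of the variable from the enlarged alloca
};

ToyAllocaSizes ComputeToyAllocaSizes(uint64_t old_size, uint64_t align) {
  uint64_t alignment = std::max(kAllocaRzSize, align);
  // Bytes of the variable that spill into the last, partially used 32-byte chunk.
  uint64_t partial = old_size % kAllocaRzSize;
  // Partial right redzone: pad the variable up to a multiple of 32.
  uint64_t partial_padding = partial ? kAllocaRzSize - partial : 0;
  // (padding +) left redzone, variable, partial right redzone, right redzone.
  uint64_t new_size = old_size + alignment + partial_padding + kAllocaRzSize;
  return {new_size, alignment};
}
```

Under these assumptions, a 100-byte alloca with alignment 16 grows to 192 bytes (32 + 100 + 28 + 32) and the variable starts 32 bytes in.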
The instrumentation inserts an __asan_alloca_poison(&var, size) call to poison the redzones.
The left redzone gets 0xca (kAsanAllocaLeftMagic) bytes while the right redzone gets 0xcb (kAsanAllocaRightMagic) bytes.
0xca and 0xcb identify the runtime error type dynamic-stack-buffer-overflow.
Uses of the dynamic alloca instruction are replaced with left_redzone+alignment.
The most recent dynamic alloca's address is stored in a local variable DynamicAllocaLayout, whose address is larger than every dynamic alloca.
The redzone needs to be unpoisoned when the function returns or when an llvm.stackrestore instruction is encountered; otherwise, accessing a different, future alloca may get a false positive.
Before each ret instruction, the instrumentation inserts an __asan_allocas_unpoison call to unpoison the region [*DynamicAllocaLayout, DynamicAllocaLayout).