@typesanitizer
Created October 24, 2020 22:22
Assembly intro comic - textual version

Assembly is wacky but cool!

Author: Varun Gandhi (@typesanitizer)

Panel 1

Compilers generate assembly in the penultimate stage of compilation.

The assembly is processed by an assembler, which generates machine code for subsequent processing by a linker.

Panel 2

Assembly typically consists of instructions, each of which has a name and zero or more operands.

Code example showing comparison between assembly and Python.

Example 1: nop in assembly is like pass in Python.
Example 2: inc rax in assembly is like rax += 1 in Python.
Example 3: mov rax, 42 in assembly is like rax = 42 in Python.
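
To make the "name plus zero or more operands" shape concrete, here is a minimal Python sketch of a toy interpreter for the three instructions above. The run function, the tuple encoding, and the register dictionary are assumptions made for illustration; a real processor decodes binary machine code, and this sketch ignores 64-bit wraparound.

```python
# Toy model: each instruction is a tuple (name, *operands).
# Illustration only; this is not how a real CPU decodes instructions.

def run(program, registers=None):
    """Execute a list of toy instructions and return the register state."""
    regs = dict(registers or {})
    for name, *operands in program:
        if name == "nop":      # zero operands: do nothing
            pass
        elif name == "inc":    # one operand: the register to increment
            (reg,) = operands
            regs[reg] = regs.get(reg, 0) + 1
        elif name == "mov":    # two operands: destination, then source
            dst, src = operands
            # An int source is an immediate; a str source names a register.
            regs[dst] = src if isinstance(src, int) else regs.get(src, 0)
        else:
            raise ValueError(f"unknown instruction: {name}")
    return regs


program = [
    ("mov", "rax", 42),  # rax = 42
    ("nop",),            # pass
    ("inc", "rax"),      # rax += 1
]
print(run(program))  # {'rax': 43}
```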

Panel 3

Assembly instructions are specific to a family of processors.

A laptop with the x86_64 architecture may have the following assembly:

mov rax, [rdx]
call bake_cake

For the same task, a phone with the arm64 architecture may have the following assembly:

ldr x0, [x1]
bl bake_cake

We will focus on x86_64 assembly.

Panel 4

Registers provide scratch space for processors to do calculations.

Data is loaded from memory into registers, used there for computation, and the results are stored back into memory.

In between the registers and the memory, there are some CPU caches.

Processors can also use constants (aka immediates) for calculations.
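
The load, compute, store cycle described in this panel can be sketched in a few lines of Python. The memory dictionary, the address 0x1000, and the immediate 5 are hypothetical values chosen for illustration; caches are left out entirely.

```python
# Toy model of the load / compute / store cycle.
memory = {0x1000: 10}   # hypothetical address -> value
registers = {"rax": 0}

registers["rax"] = memory[0x1000]   # load: memory -> register
registers["rax"] += 5               # compute, using the immediate 5
memory[0x1000] = registers["rax"]   # store: register -> memory

print(memory[0x1000])  # 15
```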

Panel 5

Registers can be special purpose or general purpose.

rax is a general purpose register that is used for calculations and, in the common calling conventions, for return values.

RFLAGS is a special purpose register that holds bitflags for information related to overflow, comparisons, and so on.

Panel 6

Registers can overlap!

For example, the eax register is 32 bits wide. The lower 16 bits of eax form the ax register. The upper 8 bits of ax form the ah register. The lower 8 bits of ax form the al register.

eax itself represents the lower 32 bits of the rax register!
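
The overlap can be modeled with plain bit masks and shifts. The following Python sketch treats a register as an integer and derives each narrower view from it; the sub_registers helper and the sample value are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1

def sub_registers(rax):
    """Return the overlapping views of a 64-bit rax value."""
    rax &= MASK64
    eax = rax & 0xFFFF_FFFF   # lower 32 bits of rax
    ax = eax & 0xFFFF         # lower 16 bits of eax
    ah = (ax >> 8) & 0xFF     # upper 8 bits of ax
    al = ax & 0xFF            # lower 8 bits of ax
    return {"rax": rax, "eax": eax, "ax": ax, "ah": ah, "al": al}


views = sub_registers(0x1122_3344_5566_7788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```

For the sample value, eax comes out as 0x55667788, ax as 0x7788, ah as 0x77, and al as 0x88.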

Panel 7

Some instructions modify bitflags to provide additional information.

add rax, 7
jo .overflow

Adding 7 to rax sets OF (the overflow flag) to 1 on signed overflow; unsigned overflow is reported through CF (the carry flag) instead. If OF is 1, jo makes execution jump to the instruction immediately after the .overflow label.
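
As a rough model of how a 64-bit add updates flags, the Python sketch below computes the result together with two flags: OF, which signals signed overflow, and CF, which signals unsigned overflow. The add64 helper is hypothetical and covers only these two flags; a real add also updates others such as ZF and SF.

```python
BITS = 64
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)

def add64(a, b):
    """Add two 64-bit values; return (result, carry_flag, overflow_flag)."""
    a &= MASK
    b &= MASK
    full = a + b
    result = full & MASK
    # Unsigned overflow: the sum did not fit in 64 bits.
    cf = full > MASK
    # Signed overflow: the result's sign differs from the sign of both operands.
    of = bool((a ^ result) & (b ^ result) & SIGN)
    return result, cf, of


# Largest positive signed value plus 7 wraps to a negative value: OF is set.
print(add64(0x7FFF_FFFF_FFFF_FFFF, 7))
# All ones is -1 when read as signed, so -1 + 7 = 6: CF is set, OF is not.
print(add64(MASK, 7))
```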

Panel 8

Different addressing modes can be tricky to understand.

Example 1: mov [rax], rdx in assembly is like *rax = rdx in C.
Example 2: mov rax, [rdx + 2*rbx] in assembly is like rax = rdx[2*rbx] in C.

Terms and conditions apply: The exact mapping may require additional factors of 2, 4, or 8 depending on the types of different variables in the C code.
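
One way to build intuition for these addressing modes is to compute the address by hand. The Python sketch below models memory as a bytearray and an address as base + index*scale + displacement; the helper names, the 64-byte memory, and the register values are all assumptions made for illustration.

```python
import struct

def effective_address(base, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, as in [base + scale*index + disp]."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return base + index * scale + disp

def load64(memory, address):
    """Read 8 little-endian bytes from memory at the given address."""
    return struct.unpack_from("<Q", memory, address)[0]

def store64(memory, address, value):
    """Write value as 8 little-endian bytes into memory at the given address."""
    struct.pack_into("<Q", memory, address, value & 0xFFFF_FFFF_FFFF_FFFF)


memory = bytearray(64)  # hypothetical 64-byte memory

# mov [rax], rdx      (store rdx at the address held in rax)
rax, rdx = 16, 0xDEADBEEF
store64(memory, effective_address(rax), rdx)

# mov rax, [rdx + 2*rbx]   (load from address rdx + 2*rbx = 8 + 2*4 = 16)
rdx, rbx = 8, 4
rax = load64(memory, effective_address(rdx, rbx, 2))
print(hex(rax))  # 0xdeadbeef
```

In C terms, rdx + 2*rbx is the address of element rbx in an array of 2-byte elements starting at rdx, which is where the extra factors of 2, 4, or 8 come from.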

Panel 9

Compiler Explorer (https://godbolt.org) is a friendly tool to explore the assembly generated by different compilers.

Panel 10

In practice, understanding assembly can be tricky due to the large number of instructions and concepts.

Search keywords: instruction selection, register allocation, position-independent code, global offset table, disassembler, SIMD instructions, atomic instructions, memory model.

Learning resources: Intel and ARM architecture manuals, Compiler Explorer, Agner Fog instruction tables, Computer Architecture and Compilers courses, Shenzhen I/O.
