linux kernel: debugging

Linux Kernel Debugging

Debugging an Oops

Let's say we have a kernel oops like in maruos/platform_external_lxc#2 (comment). Here's the relevant snippet:

[  289.238687] Unhandled fault: synchronous external abort (0x96000010) at 0xffffff800087a000
[  289.238853] Internal error: : 96000010 [#1] PREEMPT SMP
[  289.239003] CPU: 0 PID: 4928 Comm: lxc-start Not tainted 3.10.73-g84d48e8-00002-gfb7b9c7 #1
[  289.239083] task: ffffffc01f4635c0 ti: ffffffc05fe34000 task.ti: ffffffc05fe34000
[  289.239243] PC is at msm_hsl_startup+0xc8/0x2ac
[  289.239323] LR is at msm_hsl_startup+0xac/0x2ac
[  289.239468] pc : [<ffffffc000563cbc>] lr : [<ffffffc000563ca0>] pstate: 800001c5
[  289.239545] sp : ffffffc05fe37a20
...

In this case, the oops fortunately tells us where the fault happened: the PC line reads msm_hsl_startup+0xc8/0x2ac. This can be read as "the CPU was executing the instruction 0xc8 (decimal: 200) bytes from the beginning of the function msm_hsl_startup, which is 0x2ac bytes long". The oops also gives us the absolute address of that instruction: ffffffc000563cbc.
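The symbol+offset/size notation can be split mechanically. The following is a minimal sketch using plain shell parameter expansion; the variable names (loc, func, rest, offset, size) are illustrative and not part of any kernel tooling.

```shell
# Location string copied from the "PC is at" line of the oops.
loc='msm_hsl_startup+0xc8/0x2ac'

func=${loc%%+*}        # text before the '+': the function name
rest=${loc#*+}         # text after the '+': "0xc8/0x2ac"
offset=${rest%%/*}     # offset of the faulting instruction into the function
size=${rest#*/}        # total size of the function in bytes

# printf's %d accepts C-style hex constants, so it doubles as a converter.
printf '%s: offset %d of %d bytes\n' "$func" "$offset" "$size"
```

For the oops above this prints "msm_hsl_startup: offset 200 of 684 bytes".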

But what we would really like is the source code line number, so we can fix the issue.

First, let's find the location of the start of the faulting function msm_hsl_startup. The obvious way to do this is to calculate 0xffffffc000563cbc - 0xc8 = 0xffffffc000563bf4. Hex arithmetic is no fun though, so we can cheat and search the System.map file generated by your kernel build:

$ grep msm_hsl_startup System.map 
ffffffc000563bf4 t msm_hsl_startup

We can see that this agrees with our arithmetic, and is a nice check that we are using the right System.map file to debug this particular kernel's oops.
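The cross-check between the arithmetic and System.map can also be scripted. This is only a sketch, assuming bash (its arithmetic is 64-bit and wraps around, which is what makes kernel addresses work here); lookup_symbol is a hypothetical helper, and the temporary file is a stand-in for your real System.map that holds just the one entry shown above.

```shell
#!/usr/bin/env bash
# Values copied from the oops: absolute PC and the offset into the function.
pc=ffffffc000563cbc
offset=0xc8

# Subtract in 64-bit shell arithmetic, then print as unsigned hex with %x.
computed=$(printf '%016x' "$(( 0x$pc - $offset ))")

# lookup_symbol MAPFILE NAME: print the address of the symbol named exactly
# NAME (a plain grep would also match longer names that contain NAME).
lookup_symbol() {
    awk -v name="$2" '$3 == name { print $1; exit }' "$1"
}

# Stand-in for System.map so the sketch runs on its own.
map=$(mktemp)
printf '%s\n' 'ffffffc000563bf4 t msm_hsl_startup' > "$map"

mapped=$(lookup_symbol "$map" msm_hsl_startup)
if [ "$computed" = "$mapped" ]; then
    echo "match: msm_hsl_startup starts at 0x$computed"
else
    echo "mismatch: computed 0x$computed, System.map says 0x$mapped" >&2
fi
rm -f "$map"
```

In real use, replace "$map" with the path to the System.map from the same build as the kernel that produced the oops. A mismatch usually means the map file belongs to a different build.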

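When you build the Linux kernel, it creates a vmlinux object file that you can disassemble in situations like this. The source lines that objdump -S interleaves with the disassembly only appear if the kernel was built with debug info (for example CONFIG_DEBUG_INFO=y). So go to your kernel build directory, and run: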

$ aarch64-linux-android-objdump -S \
    --show-raw-insn --prefix-addresses --line-numbers \
    --start-address=0xffffffc000563bf4 --stop-address=0xffffffc000564000 \
    vmlinux > objdump.txt

Notice that we pass the start address we figured out previously, and a stop address a little past the end of the function, so we don't need to wait for the disassembly of the entire kernel.
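Instead of picking a rounded stop address by hand, the range can be derived from the oops itself, since the second number in msm_hsl_startup+0xc8/0x2ac is the size of the function. A small sketch, again assuming bash for the 64-bit arithmetic; the variable names are illustrative.

```shell
#!/usr/bin/env bash
start=ffffffc000563bf4   # start of msm_hsl_startup, from System.map
size=0x2ac               # function size, from "msm_hsl_startup+0xc8/0x2ac"

# End of the function = start + size, printed as unsigned 64-bit hex.
stop=$(printf '%016x' "$(( 0x$start + $size ))")

# Print the two options so they can be pasted into the objdump command.
printf '%s\n' "--start-address=0x$start --stop-address=0x$stop"
```

This limits the disassembly to exactly one function. The hand-picked stop address used above works just as well; it only produces a slightly longer listing.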

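Inspect objdump.txt and look for the line tagged <msm_hsl_startup+0xc8> to find the exact faulting instruction and its source. Here it is the ldar emitted by __raw_readl_no_log() (io.h:77), inlined into msm_hsl_read() near msm_serial_hs_lite.c:190, i.e. a read from the device register at port->membase + off: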

...
ffffffc000563cb0 <msm_hsl_startup+0xbc> b8626a83        ldr     w3, [x20,x2]
ffffffc000563cb4 <msm_hsl_startup+0xc0> f9400a62        ldr     x2, [x19,#16]
ffffffc000563cb8 <msm_hsl_startup+0xc4> 8b030042        add     x2, x2, x3
__raw_readl_no_log():
/var/maru/android_kernel_msm/arch/arm64/include/asm/io.h:77
}

static inline u32 __raw_readl_no_log(const volatile void __iomem *addr)
{
        u32 val;
        asm volatile("ldar %w0, [%1]" : "=r" (val) : "r" (addr));
ffffffc000563cbc <msm_hsl_startup+0xc8> 88dffc42        ldar    w2, [x2]
msm_hsl_read():
/var/maru/android_kernel_msm/drivers/tty/serial/msm_serial_hs_lite.c:190
                port->membase + off));
        __iormb();
ffffffc000563cc0 <msm_hsl_startup+0xcc> d5033d9f        dsb     ld
...
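For a long listing it can be quicker to jump straight to the faulting offset than to scroll. The helper below is a sketch, not part of binutils: show_fault is a hypothetical name, and it relies only on the <function+offset> tags that --prefix-addresses puts on each instruction line.

```shell
# show_fault FILE FUNC OFFSET [CONTEXT]: print the line tagged <FUNC+OFFSET>
# in an objdump listing, preceded by CONTEXT lines (default 10) so that the
# interleaved source and file:line markers above the instruction are visible.
show_fault() {
    grep -F -B "${4:-10}" -- "<$2+$3>" "$1"
}

# Example call, assuming objdump.txt was produced by the command above:
#   show_fault objdump.txt msm_hsl_startup 0xc8
```

The file:line marker closest above the matched instruction is the source location to look at first.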
