Skip to content

Instantly share code, notes, and snippets.

@mcastelino
Last active September 27, 2023 14:47
Show Gist options
  • Save mcastelino/b31f0648707b25478eb2a44f94a861fd to your computer and use it in GitHub Desktop.
Save mcastelino/b31f0648707b25478eb2a44f94a861fd to your computer and use it in GitHub Desktop.
Tracing QEMU-KVM Interactions

Tracing QEMU-KVM Interactions

But default in linux you can figure out how many times and for what reasons there is a VM Exit from a VM into the kvm kernel module. However given the ubiquity of vhost and the ability of kvm to emulate most device models directly in the kernel, most of those VM exits do not result in a transition from host kernel into the QEMU. The transitions from VM -> kvm -> QEMU are typically the most expensive.

Here we try to figure out how many of the VM Exits result in the invocation of QEMU.

Tracking VM-KVM Interactions

This can be done very simply with perf

sudo ./perf stat -e 'kvm:*' -a sleep 1s

This will give you statistics collected over 1s for all VM exits (across VMs)

You can also do this directly and get the full trace logs via

echo 1 > /sys/kernel/debug/tracing/events/kvm/enable
cat /sys/kernel/debug/tracing/trace_pipe

Background

All primary exits from kvm to QEMU are via the return path of KVM_RUN ioctl. Fortunately for us QEMU already has built in instrumentation for tracing. We will leverage that to selectively trace the return path of KVM_RUN and record the exit reasons

There is a second class of KVM-QEMU interaction that go through the ioeventfs,irqfd mechanism which we will instrument later.

Code that triggers tracing of the (explicit) kvm to qemu exit

https://github.com/qemu/qemu/blob/7e56accdaf35234b69c33c85e4a44a5d56325e53/accel/kvm/kvm-all.c#L2041

QEMU Tracing Documentation

http://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/devel/tracing.txt;hb=HEAD https://fossies.org/linux/qemu/docs/tracing.txt

ftrace backend

Enabling this backend allows you to trace both the VM-KVM events as well the KVM-QEMU events.

Build QEMU with tracking with ftrace backend

./configure --disable-bluez --disable-brlapi --disable-bzip2 --disable-curl --disable-curses --disable-debug-tcg --disable-fdt --disable-glusterfs --disable-gtk --disable-libiscsi --disable-libnfs --disable-libssh2 --disable-libusb --disable-linux-aio --disable-lzo --disable-opengl --disable-qom-cast-debug --disable-rbd --disable-rdma --disable-sdl --disable-seccomp --disable-slirp --disable-snappy --disable-spice --disable-strip --disable-tcg-interpreter --disable-tcmalloc --disable-tools --disable-tpm --disable-usb-redir --disable-uuid --disable-vnc --disable-vnc-jpeg --disable-vnc-png --disable-vnc-sasl --disable-vte --disable-xen --enable-attr --enable-cap-ng --enable-kvm --enable-virtfs --target-list=x86_64-softmmu --extra-cflags="-fno-semantic-interposition -O3 -falign-functions=32" --datadir=/usr/share/qemu-lite --libdir=/usr/lib64/qemu-lite --libexecdir=/usr/libexec/qemu-lite --enable-vhost-net --disable-docs --enable-trace-backends=ftrace

Launch QEMU with tracing enabled

Enable the tracking of just the kvm_run exit.

rm -f /tmp/event
echo kvm_run_exit > /tmp/events

sudo ../x86_64-softmmu/qemu-system-x86_64 -trace events=/tmp/events -machine pc-lite,accel=kvm,kernel_irqchip,nvdimm -cpu host -m 256,maxmem=1G,slots=2 -smp 2 -no-user-config -nodefaults -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -kernel ./vmlinux-4.9.34-63.1.container -append "reboot=k panic=1 rw tsc=reliable no_timer_check noreplace-smp root=/dev/pmem0p1  init=/usr/lib/systemd/systemd initcall_debug rootfstype=ext4 rootflags=dax,data=ordered dhcp rcupdate.rcu_expedited=1 clocksource=kvm-clock console=hvc0 single iommu=false " -device virtio-serial-pci,id=virtio-serial0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -nographic -object memory-backend-file,id=mem0,share,mem-path=./clear-16160-containers.img,size=235929600 -device nvdimm,memdev=mem0,id=nv0 -no-reboot -device virtio-net-pci,netdev=net0,mac=DE:AD:BE:EF:07:83 -netdev tap,id=net0,ifname=net0,script=no,downscript=no,queues=2,vhost=off -qmp tcp:localhost:4444,server,nowait

Tracing I/O interactions

The kvm_run exit only records the explicit control exits. However for handing I/O QEMU-KVM irqfd and ioeventfd interaction mechanism. Unfortunately there does not exit a direct way to tap into the ioeventfd. However we can trace the actuall callback functions ultimately invoked.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d34e6b175e61821026893ec5298cc8e7558df43a http://blog.allenx.org/2015/07/05/kvm-irqfd-and-ioeventfd

Obtained the combined trace logs

If you had enabled kvm tracing before, you will see a mix of KVM exits as well as KVM to QEMU transistion traces.

cat /sys/kernel/debug/tracing/trace_pipe

Simple Backend

The following example shows how to trace using the simple backend. This is another method that is simpler and directly creates trace files on disk.

Build QEMU with tracing

./configure --disable-bluez --disable-brlapi --disable-bzip2 --disable-curl --disable-curses --disable-debug-tcg --disable-fdt --disable-glusterfs --disable-gtk --disable-libiscsi --disable-libnfs --disable-libssh2 --disable-libusb --disable-linux-aio --disable-lzo --disable-opengl --disable-qom-cast-debug --disable-rbd --disable-rdma --disable-sdl --disable-seccomp --disable-slirp --disable-snappy --disable-spice --disable-strip --disable-tcg-interpreter --disable-tcmalloc --disable-tools --disable-tpm --disable-usb-redir --disable-uuid --disable-vnc --disable-vnc-jpeg --disable-vnc-png --disable-vnc-sasl --disable-vte --disable-xen --enable-attr --enable-cap-ng --enable-kvm --enable-virtfs --target-list=x86_64-softmmu --extra-cflags="-fno-semantic-interposition -O3 -falign-functions=32" --datadir=/usr/share/qemu-lite --libdir=/usr/lib64/qemu-lite --libexecdir=/usr/libexec/qemu-lite --enable-vhost-net --disable-docs --enable-trace-backends=simple

Launch QEMU with tracing enabled

rm -f /tmp/event
echo kvm_run_exit > /tmp/events

sudo ../x86_64-softmmu/qemu-system-x86_64 -trace events=/tmp/events -machine pc-lite,accel=kvm,kernel_irqchip,nvdimm -cpu host -m 256,maxmem=1G,slots=2 -smp 2 -no-user-config -nodefaults -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -kernel ./vmlinux-4.9.34-63.1.container -append "reboot=k panic=1 rw tsc=reliable no_timer_check noreplace-smp root=/dev/pmem0p1  init=/usr/lib/systemd/systemd initcall_debug rootfstype=ext4 rootflags=dax,data=ordered dhcp rcupdate.rcu_expedited=1 clocksource=kvm-clock console=hvc0 single iommu=false " -device virtio-serial-pci,id=virtio-serial0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -nographic -object memory-backend-file,id=mem0,share,mem-path=./clear-16160-containers.img,size=235929600 -device nvdimm,memdev=mem0,id=nv0 -no-reboot -device virtio-net-pci,netdev=net0,mac=DE:AD:BE:EF:07:83 -netdev tap,id=net0,ifname=net0,script=no,downscript=no,queues=2,vhost=off -qmp tcp:localhost:4444,server,nowait

Analyze the Trace Events

simpletrace.py trace-events-all trace-6440 &> /tmp/xxx

Note:

  • The trace-events-all file will be generated at the time of qemu compilation
  • The trace-xyz file will be generated at qemu execution

KVM-QEMU Exit Reasons

This will help you co-relate exit reasons to the trace logs. Teh most common ones will be for IO (2), as these are used for device modeling.

https://github.com/qemu/qemu/blob/0748b3526e8cb78b9cd64208426bfc3d54a72b04/linux-headers/linux/kvm.h#L182

#define KVM_EXIT_UNKNOWN          0
#define KVM_EXIT_EXCEPTION        1
#define KVM_EXIT_IO               2
#define KVM_EXIT_HYPERCALL        3
#define KVM_EXIT_DEBUG            4
#define KVM_EXIT_HLT              5
#define KVM_EXIT_MMIO             6
#define KVM_EXIT_IRQ_WINDOW_OPEN  7
#define KVM_EXIT_SHUTDOWN         8
#define KVM_EXIT_FAIL_ENTRY       9
#define KVM_EXIT_INTR             10
#define KVM_EXIT_SET_TPR          11
#define KVM_EXIT_TPR_ACCESS       12
#define KVM_EXIT_S390_SIEIC       13
#define KVM_EXIT_S390_RESET       14
#define KVM_EXIT_DCR              15 /* deprecated */
#define KVM_EXIT_NMI              16
#define KVM_EXIT_INTERNAL_ERROR   17
#define KVM_EXIT_OSI              18
#define KVM_EXIT_PAPR_HCALL	  19
#define KVM_EXIT_S390_UCONTROL	  20
#define KVM_EXIT_WATCHDOG         21
#define KVM_EXIT_S390_TSCH        22
#define KVM_EXIT_EPR              23
#define KVM_EXIT_SYSTEM_EVENT     24
#define KVM_EXIT_S390_STSI        25
#define KVM_EXIT_IOAPIC_EOI       26
#define KVM_EXIT_HYPERV           27

Tracing all events

All events supported by QEMU can be found in the trace-events-all file. You can extract them all (or a subset) and feed them via a script such as the one below

cat trace-events-all | grep "(" | grep -v "#" | cut -f 1 -d "(" > /tmp/events
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment