Oleksandr Natalenko pfactum
pfactum / ksm-5.1.patch
Created May 9, 2019
KSM "always" mode preparations
View ksm-5.1.patch
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2b8ee90bb644..510766a3fa05 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2008,6 +2008,13 @@
0: force disabled
1: force enabled
+ ksm_mode=
+ [KNL]
View bmq_oops.log
[ 9225.305695] ------------[ cut here ]------------
[ 9225.305699] ------------[ cut here ]------------
[ 9225.305707] DEBUG_LOCKS_WARN_ON(val > preempt_count())
[ 9225.305808] WARNING: CPU: 0 PID: 11923 at kernel/sched/bmq.c:2851 preempt_count_sub+0x5e/0xa0
[ 9225.318289] pvqspinlock: lock 0xffff8d43ff721980 has corrupted value 0x0!
[ 9225.327346] Modules linked in: joydev iTCO_wdt iTCO_vendor_support mousedev kvm_intel kvm bochs_drm ttm drm_kms_helper irqbypass psmouse pcspkr drm i2c_i801 lpc_ich input_leds intel_agp syscopyarea intel_gtt sysfillrect qemu_fw_cfg evdev sysimgblt fb_sys_fops agpgart mac_hid nls_iso88
View bmq_panic.log
[ 371.383452] psi: inconsistent task state! task=1317:Compositor cpu=2 psi_flags=4 clear=0 set=4
[ 371.383486] ------------[ cut here ]------------
[ 371.383513] bmq: dequeue task reside on cpu0 from cpu2
[ 371.383532] WARNING: CPU: 0 PID: 1227 at kernel/sched/bmq.c:607 detach_task+0x30f/0x370
[ 371.383538] Modules linked in: netconsole md4 cmac nls_utf8 cifs ccm dns_resolver fscache bridge stp llc nft_counter nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 msr nf_tables nfnetlink tun ext4 crc16 mbcache jbd2 arc4 ath9k intel_rapl x86_pkg_temp_thermal intel_powerclamp ath9k_common coretemp kvm_intel ath9k_hw ath kvm mac80211 snd_hda_codec_hdmi snd_hda_codec_cirrus snd_hda_codec_generic snd_hda_intel snd_hda_codec psmouse dell_wmi snd_hda_core irqbypass intel_cstate i2c_i801 intel_uncore rtsx_usb_ms dell_laptop ledtrig_audio dell_smbios wmi_bmof mei_hdcp memstick dell_wmi_descriptor dcdbas dell_smm_hwmon iTCO_wdt sparse_keymap alx snd_hwdep intel_rapl_perf iTCO_vendor_support cfg80211 lpc_ich mdio mo
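When comparing several of these traces, it can help to pull the CPU, PID, source location and symbol out of each WARNING line. The following is a minimal sketch, not part of the original logs; the names `WARNING_RE` and `parse_warning` are hypothetical, and the pattern only assumes the `WARNING: CPU: N PID: N at file:line symbol+offset/size` layout visible in the previews above.

```python
import re

# Matches the kernel "WARNING: CPU: ... PID: ... at file:line symbol+off/size"
# line; any timestamp or journal prefix before "WARNING:" is ignored.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_warning(log_line):
    """Return a dict of the WARNING fields, or None if the line does not match."""
    match = WARNING_RE.search(log_line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("cpu", "pid", "line"):
        fields[key] = int(fields[key])
    for key in ("offset", "size"):
        fields[key] = int(fields[key], 16)
    return fields


sample = (
    "[ 371.383532] WARNING: CPU: 0 PID: 1227 at "
    "kernel/sched/bmq.c:607 detach_task+0x30f/0x370"
)
print(parse_warning(sample))
```

The same helper should also accept the journal-prefixed form seen in oops.log, since it searches for the `WARNING:` marker anywhere in the line.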
View build_err.log
kernel/sched/psi.c: In function ‘cgroup_move_task’:
kernel/sched/psi.c:661:7: error: implicit declaration of function ‘task_rq_lock’; did you mean ‘task_unlock’? [-Werror=implicit-function-declaration]
rq = task_rq_lock(task, &rf);
^~~~~~~~~~~~
task_unlock
kernel/sched/psi.c:661:5: warning: assignment to ‘struct rq *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
rq = task_rq_lock(task, &rf);
^
kernel/sched/psi.c:663:6: error: implicit declaration of function ‘task_on_rq_queued’ [-Werror=implicit-function-declaration]
View oops.log
Dec 18 00:07:39 archlinux kernel: Unregister pv shared memory for cpu 30
Dec 18 00:07:39 archlinux kernel: ------------[ cut here ]------------
Dec 18 00:07:39 archlinux kernel: sched: Unexpected reschedule of offline CPU#30!
Dec 18 00:07:39 archlinux kernel: WARNING: CPU: 31 PID: 239 at arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x34/0x40
Dec 18 00:07:39 archlinux kernel: Modules linked in: nls_iso8859_1 nls_cp437 vfat fat 9p fscache joydev mousedev iTCO_wdt iTCO_vendor$
support kvm_intel kvm irqbypass psmouse pcspkr i2c_i801 bochs_drm input_leds ttm qemu_fw_cfg drm_kms_helper drm syscopyarea intel_agp
sysfillrect sysimgblt intel_gtt evdev fb_sys_fops agpgart lpc_ich mac_hid ip_tables x_tables xfs dm_thin_pool dm_persistent_data dm_b$
o_prison dm_bufio libcrc32c crc32c_generic dm_crypt algif_skcipher af_alg dm_mod raid10 md_mod serio_raw sr_mod atkbd cdrom sd_mod li$
ps2 crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcb
View oops.txt
Dec 14 08:19:39 spock kernel: sd 0:0:0:0: [sda] Stopping disk
Dec 14 08:19:39 spock kernel: ACPI: EC: interrupt blocked
Dec 14 08:19:39 spock kernel: ACPI: Preparing to enter system sleep state S3
Dec 14 08:19:39 spock kernel: ACPI: EC: event blocked
Dec 14 08:19:39 spock kernel: ACPI: EC: EC stopped
Dec 14 08:19:39 spock kernel: PM: Saving platform NVS memory
Dec 14 08:19:39 spock kernel: Disabling non-boot CPUs ...
Dec 14 08:19:39 spock kernel: smpboot: CPU 1 is now offline
Dec 14 08:19:39 spock kernel: ------------[ cut here ]------------
Dec 14 08:19:39 spock kernel: sched: Unexpected reschedule of offline CPU#2!
View investigation.txt
1. The issue reproduces 100% of the time on the laptop and *almost* always in QEMU.
For QEMU I use the following command with a minimal kernel:
$ qemu-system-x86_64 -machine q35,accel=kvm -cpu core2duo -smp cores=8,threads=2 -kernel arch/x86/boot/bzImage -append "ignore_loglevel threadirqs nokaslr"
I've added "threadirqs", because it triggers the issue in the VM more reliably (lots of threads to schedule at early stages). On a real machine the issue is reproducible without this option too.
Minimal kernel config: [1]
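For repeated runs, the QEMU invocation above can be assembled programmatically. This is only a sketch: the helper names `qemu_argv` and `smp_vcpus` are hypothetical, and the vCPU count assumes QEMU treats an omitted `sockets=` as 1, so that `cores=8,threads=2` yields 16 logical CPUs.

```python
import shlex


def smp_vcpus(cores, threads, sockets=1):
    """Logical CPU count for an -smp topology (assumes sockets defaults to 1)."""
    return sockets * cores * threads


def qemu_argv(kernel_image, cmdline, cores=8, threads=2):
    """Build the argument list for the reproduction command quoted above."""
    return [
        "qemu-system-x86_64",
        "-machine", "q35,accel=kvm",
        "-cpu", "core2duo",
        "-smp", f"cores={cores},threads={threads}",
        "-kernel", kernel_image,
        "-append", cmdline,
    ]


argv = qemu_argv("arch/x86/boot/bzImage", "ignore_loglevel threadirqs nokaslr")
print(shlex.join(argv))          # shell-quoted command line
print("logical CPUs:", smp_vcpus(8, 2))
```

Keeping the arguments as a list avoids quoting mistakes in the multi-word `-append` string; the list can be passed directly to `subprocess.run` if desired.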
View dmesg.txt
[ 0.000000] Linux version 4.19.0-pf7+ (pf@spock) (gcc version 8.2.1 20181127 (GCC)) #1 SMP PREEMPT Fri Dec 7 10:24:43 CET 2018
[ 0.000000] Command line: ignore_loglevel threadirqs nokaslr
[ 0.000000] x86/fpu: x87 FPU will use FXSAVE
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000007fdefff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000007fdf000-0x0000000007ffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
View config
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.19.0-pf7 Kernel Configuration
#
#
# Compiler: gcc (GCC) 8.2.1 20181127
#
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=80201
View pds_panic_smt.txt
[ 2.504941] smp: Brought up 1 node, 4 CPUs
[ 2.510787] smpboot: Max logical packages: 1
[ 2.514909] smpboot: Total of 4 processors activated (12800.00 BogoMIPS)
[ 2.528811] pds: cpu #0 affinity check mask - smt 0x00000002
[ 2.528816] pds: cpu #0 affinity check mask - coregroup 0x0000000e
[ 2.534857] pds: cpu #1 affinity check mask - smt 0x00000001
[ 2.544831] pds: cpu #1 affinity check mask - coregroup 0x0000000d
[ 2.554853] pds: cpu #2 affinity check mask - smt 0x00000008
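The hexadecimal masks in these lines are easier to read as CPU lists. The sketch below is not taken from the PDS sources; it assumes the usual cpumask convention that bit N stands for logical CPU N, and the names `cpus_in_mask` and `parse_pds_mask` are hypothetical.

```python
import re

# Layout assumed from the log lines above:
# "pds: cpu #<n> affinity check mask - <level> 0x<hex>"
PDS_MASK_RE = re.compile(
    r"pds: cpu #(?P<cpu>\d+) affinity check mask - "
    r"(?P<level>\w+) (?P<mask>0x[0-9a-f]+)"
)


def cpus_in_mask(mask):
    """Return the bit positions set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def parse_pds_mask(log_line):
    """Return (cpu, level, [cpus in mask]) or None if the line does not match."""
    match = PDS_MASK_RE.search(log_line)
    if match is None:
        return None
    mask = int(match.group("mask"), 16)
    return int(match.group("cpu")), match.group("level"), cpus_in_mask(mask)


lines = [
    "[ 2.528811] pds: cpu #0 affinity check mask - smt 0x00000002",
    "[ 2.528816] pds: cpu #0 affinity check mask - coregroup 0x0000000e",
    "[ 2.534857] pds: cpu #1 affinity check mask - smt 0x00000001",
]
for line in lines:
    print(parse_pds_mask(line))
```

Under that assumption, the first line reads as CPU 0 having CPU 1 in its SMT mask, and the second as CPUs 1, 2 and 3 in its coregroup mask.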