@terdon
Created March 1, 2023 18:34
Terdon's README for P14s CPU throttling

Running any intensive job immediately throttles the CPU to 400MHz! I found this:

https://www.reddit.com/r/thinkpad/comments/pvb87e/thinkpad_p14s_gen_2_intel_fix_for_aggressive_cpu/

which sent me to https://forums.lenovo.com/t5/Other-Linux-Discussions/X1C6-T480s-low-cTDP-and-trip-temperature-in-Linux/m-p/4028489?page=40#5069052 which explained there are actual Fn keys for this!

Can you check something for me. Do:

cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  

And then try FN+L, FN+M, FN+H with the command above in between each one. These function keys should be switching between low, medium and high mode and you should see the power limit shift with each key press.
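The sysfs value is in microwatts, so a small helper makes the before/after comparison easier to read. This is only a sketch: `rapl_limit_watts` is a made-up name, and the default path is simply the one from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a RAPL power limit in watts instead of microwatts.
# The default path is the one used in these notes; pass another file to override it.
rapl_limit_watts() {
    local file="${1:-/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw}"
    # The file holds a single integer in microwatts; divide by 1e6 for watts.
    LC_ALL=C awk '{ printf "%.1f W\n", $1 / 1000000 }' "$file"
}
```

Run `rapl_limit_watts` after each of Fn+L, Fn+M and Fn+H; a stored value of 8000000 prints as `8.0 W`.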

FN+H seems to have helped significantly!

Need to go through https://www.reddit.com/r/thinkpad/wiki/os/linux/

Try disabling Lenovo Intelligent Thermal Solution Service in the BIOS. No such thing, seems to be Windows shit.

OK, found this: https://forums.lenovo.com/topic/findpost/1306/5087833/5530806

The good news is that, at least on Linux, the thinkpad_acpi driver allows you to set the fan level and the intel_rapl driver allows you to set a reasonable power budget (the data sheet for my CPU calls for 12 to 28 watts depending on how much performance you want).

I put a bit more background here https://github.com/daniel-kristjansson/smart-fancontrol/blob/main/README.md

The README has a lot of info, but suggests using thermal-daemon (thermald) instead (quoting the README):

Ubuntu actually ships with a daemon that manages the power budget! It's the thermal-daemon mentioned in the alternatives section above. Unfortunately, it disables itself when the thinkpad_acpi driver is loaded. You can make sure it does its job by adding --ignore-cpuid-check to the ExecStart line in the systemd /lib/systemd/system/thermald.service file. This will prevent the catastrophic level of throttling that happens when nothing is managing the power budget.

So:

sudo pacman -S thermald 

Then edit /lib/systemd/system/thermald.service and change the ExecStart command to:

ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive

sudo systemctl enable thermald.service 
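Edits made directly to /lib/systemd/system/thermald.service can be overwritten when the package is upgraded. A systemd drop-in keeps the change separate from the packaged unit. The sketch below writes one; the function name `write_thermald_override` is invented for illustration, and the flags are just the ones tried in these notes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a systemd drop-in that replaces thermald's ExecStart.
# $1 is the drop-in directory, the remaining arguments are the thermald flags.
write_thermald_override() {
    local dir="$1"
    shift
    mkdir -p "$dir" || return 1
    # An empty "ExecStart=" clears the packaged command before the new one is set.
    printf '[Service]\nExecStart=\nExecStart=/usr/bin/thermald %s\n' "$*" > "$dir/override.conf"
}

# Example (as root), followed by: systemctl daemon-reload
# write_thermald_override /etc/systemd/system/thermald.service.d --systemd --dbus-enable --adaptive
```

With a drop-in, trying a different flag combination means rewriting one small file and running systemctl daemon-reload, and the packaged unit stays untouched.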

Holy fuck! I am now in a GMeet, with CPUs at slightly > 1000 MHz! No idea what will happen after restarting, but so far so good!

Not that good. 1200MHz is too slow for meet and anything else. Better than 400MHz, yes, but still way too slow to be usable. However, I just tried

sudo systemctl restart  thermald.service 

And then:

[root@oregano ~]# sudo systemctl status  thermald.service 
○ thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: inactive (dead) since Fri 2022-07-22 15:17:35 BST; 4s ago
Duration: 3ms
Process: 3132908 ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive (code=exited, status=0/SUCCESS)
Main PID: 3132908 (code=exited, status=0/SUCCESS)
CPU: 9ms

Jul 22 15:17:35 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 15:17:35 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 15:17:35 oregano thermald[3132908]: 27 CPUID levels; family:model:stepping 0x6:8c:1 (6:140:1)
Jul 22 15:17:35 oregano thermald[3132908]: [/sys/devices/platform/thinkpad_acpi/dytc_lapmode] present: Thermald can't run on this p>
Jul 22 15:17:35 oregano thermald[3132908]: Unsupported cpu model or platform
Jul 22 15:17:35 oregano systemd[1]: thermald.service: Deactivated successfully.

It isn't active, and suddenly my CPUs are at 2500MHz! WTF!?!?!?
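These notes never show how the clock speeds were read. One way, sketched here under the assumption that the cpufreq sysfs interface is available, is to read scaling_cur_freq for each core. The name `cpu_mhz` is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each core's current frequency in MHz.
# scaling_cur_freq reports kHz; $1 optionally overrides the sysfs base directory.
cpu_mhz() {
    local base="${1:-/sys/devices/system/cpu}"
    local f khz
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        read -r khz < "$f"
        # The path looks like .../cpu3/cpufreq/scaling_cur_freq; keep the "cpu3" part.
        f="${f%/cpufreq/scaling_cur_freq}"
        echo "${f##*/}: $(( khz / 1000 )) MHz"
    done
}
```

Calling `cpu_mhz` before and after a thermald restart makes it obvious whether the cap moved.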

Trying to open two meet instances, one in chromium one in brave (talking to myself again). Was working perfectly, at 2.5GHz for a while, then I pressed Fn+L and checked that /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw was back down to 6000000 and CPU down to 1.2GHz, but Fn+H doesn't bring it back up. I tried

# echo 35000000 > /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  ; cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw 

Which worked to bring the value up, but the CPU speed stayed constant and everything is slow again. Reboot and come back.
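Writing the limit by hand means counting zeros. A sketch of a wrapper that takes whole watts and reads the value back follows; `set_rapl_limit_watts` is an invented name and the default path is the one used above. As seen here, the stored value changing does not guarantee the CPU speed follows.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set a RAPL power limit given in whole watts.
# $1 = watts, $2 = constraint file (defaults to the path used in these notes).
set_rapl_limit_watts() {
    local watts="$1"
    local file="${2:-/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw}"
    case "$watts" in
        ''|*[!0-9]*) echo "usage: set_rapl_limit_watts WATTS [FILE]" >&2; return 2 ;;
    esac
    # sysfs expects microwatts.
    echo "$(( watts * 1000000 ))" > "$file" || return 1
    # Read the value back to see what was actually stored.
    cat "$file"
}
```

As root, `set_rapl_limit_watts 35` is equivalent to the echo 35000000 command above.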

After reboot, the output of systemctl status thermald.service is the same as above (deactivated), and the CPUs are at 400MHz again.

# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  
8000000

Try Fn+H: worked, we're back to 1.1GHz and

# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  
10000000

Still clearly capped at 1.2 though.

Found https://forums.lenovo.com/t5/Linux-Discussion/T480s-low-cTDP-and-trip-temperature-in-Linux/td-p/4028489?page=46 which suggests a different set of parameters in /lib/systemd/system/thermald.service:

If you pass "--ignore-cpuid-check" to thermald, then it should still run on those platforms. Disabling the check currently does not work in adaptive mode. Probably these people don't want "--adaptive" anyway, but it likely makes sense to explicitly remove "--adaptive" and add "--ignore-cpuid-check" at the same time.

Yeah, that makes it work, but no dice: back to capping, at 900MHz this time and Fn+H makes no difference (I also tried toggling with Fn+L and Fn+M and then Fn+H). Governor is set to powersave though:

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave
[root@oregano ~]# 

Try changing that:

[root@oregano ~]# echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
[root@oregano ~]# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
[root@oregano ~]# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  
10000000

Nope, still slow. We're now back to 1.2GHz which is better but still crap. I also tried forcing 35000000 which is what I had seen in /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw when it was working well, but no dice:

# echo 35000000 > /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  ; cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw 
35000000
# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  
35000000

Showing the right value, but CPUs capped at 1.2GHz. Try changing the thermald service file to have adaptive as well:

ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check --adaptive

Then

systemctl daemon-reload
systemctl restart thermald.service ; systemctl status thermald.service 

Loaded but capped. Removed the --ignore-cpuid-check, left --adaptive, restarted and now it doesn't load BUT we're back at decent speeds again, so WTF!? I'm now at >2.5GHz. Do I need to remove thermald completely or something?

# cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw  
35000000

I will try a full shutdown and then turn on again and see where I'm at.

I also re-enabled the intel power whatever service and set it to max performance, but I'm now back to 400MHz.

back to 1200 and:

cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw 
8000000

Fn+H takes me to

cat /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio\:0/constraint_0_power_limit_uw 
10000000

And 1000MHz. Thermald is still failing:

○ thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: inactive (dead) since Fri 2022-07-22 16:32:46 BST; 3min 9s ago
Duration: 4ms
Process: 454 ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive (code=exited, status=0/SUCCESS)
Main PID: 454 (code=exited, status=0/SUCCESS)
CPU: 17ms

Jul 22 16:32:46 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 16:32:46 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 16:32:46 oregano thermald[454]: NO RAPL sysfs present
Jul 22 16:32:46 oregano thermald[454]: 27 CPUID levels; family:model:stepping 0x6:8c:1 (6:140:1)
Jul 22 16:32:46 oregano thermald[454]: [/sys/devices/platform/thinkpad_acpi/dytc_lapmode] present: Thermald can't run on this platform
Jul 22 16:32:46 oregano thermald[454]: Unsupported cpu model or platform
Jul 22 16:32:46 oregano systemd[1]: thermald.service: Deactivated successfully.

But governor is back to powersave:

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave

Change it:

 # echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
 performance
 # cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
 performance
 performance
 performance
 performance
 performance
 performance
 performance
 performance

Still capped... Try the thermald restart thing again: nope, restarting it made no change. Try re-adding the --ignore-cpuid-check:

ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check --adaptive

Sigh. No, still capped after systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service. Go back to:

ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive

then

systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service

And there we go, this works. I'm now at my full 2.8GHz. And if I remove --adaptive?

ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check

and:

# systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: active (running) since Fri 2022-07-22 16:56:59 BST; 7ms ago
Main PID: 24393 (thermald)
Tasks: 2 (limit: 38139)
Memory: 3.3M
CPU: 11ms
CGroup: /system.slice/thermald.service
└─24393 /usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check

Jul 22 16:56:59 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 16:56:59 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 16:56:59 oregano thermald[24393]: sensor id 14 : No temp sysfs for reading raw temp
Jul 22 16:56:59 oregano thermald[24393]: sensor id 14 : No temp sysfs for reading raw temp
Jul 22 16:56:59 oregano thermald[24393]: sensor id 14 : No temp sysfs for reading raw temp
Jul 22 16:56:59 oregano thermald[24393]: Config file /etc/thermald/thermal-conf.xml does not exist
Jul 22 16:56:59 oregano thermald[24393]: Config file /etc/thermald/thermal-conf.xml does not exist

Still good. So how do I automate this? Will try another hard reboot (shutdown, restart) and see what's up.

This time, I saw a "FAILED to start simple and lightweight fan manager" message, thinkfan seems to not have loaded. Indeed, fans are very loud, but still throttled at 1.1-1.2 GHz.

# systemctl status thinkfan.service
× thinkfan.service - simple and lightweight fan control program
Loaded: loaded (/usr/lib/systemd/system/thinkfan.service; enabled; preset: disabled)
Drop-In: /etc/systemd/system/thinkfan.service.d
└─override.conf
Active: failed (Result: exit-code) since Fri 2022-07-22 17:25:58 BST; 2min 42s ago
Process: 463 ExecStart=/usr/bin/thinkfan $THINKFAN_ARGS (code=exited, status=1/FAILURE)
CPU: 18ms

Jul 22 17:25:58 oregano systemd[1]: Starting simple and lightweight fan control program...
Jul 22 17:25:58 oregano thinkfan[463]: ERROR: /etc/thinkfan.conf:11:
name: thinkpad
^
Could not find a hwmon with this name.
Jul 22 17:25:58 oregano systemd[1]: thinkfan.service: Control process exited, code=exited, status=1/FAILURE
Jul 22 17:25:58 oregano systemd[1]: thinkfan.service: Failed with result 'exit-code'.
Jul 22 17:25:58 oregano systemd[1]: Failed to start simple and lightweight fan control program.

That looks like it's because of this section in /etc/thinkfan.conf:

# Chassis
- hwmon: /sys/class/hwmon
  name: thinkpad
  indices: [3, 5, 6, 7]
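The error says no hwmon device reports the name thinkpad. Listing the names that are actually present shows what thinkfan could match at that moment. This is a sketch with an invented function name, `list_hwmon_names`. If thinkpad is missing right after boot but shows up later, that would hint at thinkfan starting too early, though these notes do not confirm that.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every hwmon device with the name it reports,
# to check which names thinkfan's "name:" key could match.
list_hwmon_names() {
    local base="${1:-/sys/class/hwmon}"
    local d
    for d in "$base"/hwmon*; do
        [ -r "$d/name" ] || continue
        echo "${d##*/}: $(cat "$d/name")"
    done
}
```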

I will comment that out and restart thinkfan. Yep, that worked. Whatevs, still throttled, I'm back to powersave:

# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave
powersave
powersave
powersave
powersave
powersave
powersave
powersave

Fn+H made no difference to that. So I did:

echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Which worked. Still throttled, but now I have the right governor.
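Since the governor goes back to powersave on every boot, the tee one-liner gets retyped a lot. A sketch of a function that sets it on every core and reports any core that did not take the change (`set_governor` is a made-up name):

```shell
#!/usr/bin/env bash
# Hypothetical helper: set one governor on every core, then report any core that
# did not accept it. $1 = governor, $2 = optional sysfs base directory.
set_governor() {
    local gov="$1"
    local base="${2:-/sys/devices/system/cpu}"
    local f cur rc=0
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue
        # Writing needs root; errors are silenced and caught by the read-back below.
        { echo "$gov" > "$f"; } 2>/dev/null
        read -r cur < "$f"
        if [ "$cur" != "$gov" ]; then
            echo "$f is still '$cur'" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

As root, `set_governor performance` does the same as the tee command and returns non-zero if any core kept its old governor.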

systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service

Still throttled. It looks like I really need to edit the damn config file every time. Try sed:

sed -i 's/ *--ignore-cpuid-check//' /lib/systemd/system/thermald.service
sed -i '/ExecStart=/s/$/ --ignore-cpuid-check/' /lib/systemd/system/thermald.service
systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service

Nope. Fn+H now? Nope. Remove --ignore-cpuid-check with emacs and then reload again? No change: thermald is running but still throttled. Put it back? Same. Fn+H again?
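Patching individual flags with sed is easy to get wrong: a missing space glues two flags together, and repeated runs can duplicate a flag. A more forgiving sketch rewrites the whole ExecStart line each time. The name `set_thermald_flags` is invented, and the binary path is the one from the unit file shown in these notes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the whole ExecStart line instead of patching flags.
# $1 = unit file, remaining arguments = thermald flags.
set_thermald_flags() {
    local file="$1"
    shift
    local tmp="$file.tmp.$$"
    # Rewriting the line wholesale means repeated runs give the same result.
    sed "s|^ExecStart=.*|ExecStart=/usr/bin/thermald $*|" "$file" > "$tmp" || return 1
    # Copy the content back so the original file keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_thermald_flags /lib/systemd/system/thermald.service --systemd --dbus-enable --adaptive`, followed by systemctl daemon-reload and a restart.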

Tried some more combinations of editing and eventually I got rid of the throttle again with:

ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive

And thermald off.

OK, try disabling thermald and rebooting.

# systemctl disable thermald.service 
Removed "/etc/systemd/system/multi-user.target.wants/thermald.service".
Removed "/etc/systemd/system/dbus-org.freedesktop.thermald.service".
# shutdown -h now

OK, rebooted straight to 1.2GHz capping now. Governor back to powersave. Changing to performance

echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

still capped.

systemctl status thermald.service
○ thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; disabled; prese>
Active: inactive (dead)

As soon as I opened chromium, I got capped to 400MHz. Fn+H raised it to 900MHz.

Re-enable thermald:

# grep ExecStart /lib/systemd/system/thermald.service
ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive


systemctl enable thermald.service 
systemctl start thermald.service 

Status is back to "[/sys/devices/platform/thinkpad_acpi/dytc_lapmode] present: Thermald can't run", so add --ignore-cpuid-check and drop --adaptive:

ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check

then

systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service

Started, but still throttled. Put back to

ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive

Still throttled, thermald not on.

ExecStart=/usr/bin/thermald --systemd --dbus-enable --ignore-cpuid-check

Nope, still throttled. Played around a bit more and in the end this worked:

ExecStart=/usr/bin/thermald --systemd --dbus-enable --adaptive --ignore-cpuid-check 

With thermald running:

# systemctl daemon-reload; systemctl restart thermald.service ; systemctl status thermald.service
● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disabled)
Active: active (running) since Fri 2022-07-22 18:10:52 BST; 6ms ago
Main PID: 16468 (thermald)
Tasks: 2 (limit: 38139)
Memory: 3.2M
CPU: 7ms
CGroup: /system.slice/thermald.service
└─16468 /usr/bin/thermald --systemd --dbus-enable --adaptive --ignore-cpuid-check

Jul 22 18:10:52 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 18:10:52 oregano systemd[1]: Started Thermal Daemon Service.

So enable the service, hard reboot (shutdown, then restart). Back to throttling, bloody hell.

[root@oregano ~]# systemctl status thermald.service 
● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset>
Active: active (running) since Fri 2022-07-22 18:13:15 BST; 1min 17s ago
Main PID: 409 (thermald)
Tasks: 4 (limit: 38139)
Memory: 6.2M
CPU: 135ms
CGroup: /system.slice/thermald.service
└─409 /usr/bin/thermald --systemd --dbus-enable --adaptive --igno>

Jul 22 18:13:15 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 18:13:15 oregano thermald[409]: NO RAPL sysfs present
Jul 22 18:13:15 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 18:13:15 oregano thermald[409]: sensor id 12 : No temp sysfs for readin>
Jul 22 18:13:15 oregano thermald[409]: sensor id 12 : No temp sysfs for readin>
Jul 22 18:13:15 oregano thermald[409]: sensor id 12 : No temp sysfs for readin>
Jul 22 18:13:15 oregano thermald[409]: Polling mode is enabled: 4

NO RAPL sysfs present? What if I just restart it a few times? Yes! restarting fixed it without modifying config files!

● thermald.service - Thermal Daemon Service
Loaded: loaded (/usr/lib/systemd/system/thermald.service; enabled; preset: disab>
Active: active (running) since Fri 2022-07-22 18:15:38 BST; 19s ago
Main PID: 4890 (thermald)
Tasks: 4 (limit: 38139)
Memory: 3.6M
CPU: 32ms
CGroup: /system.slice/thermald.service
└─4890 /usr/bin/thermald --systemd --dbus-enable --adaptive --ignore-cpu>

Jul 22 18:15:38 oregano systemd[1]: Starting Thermal Daemon Service...
Jul 22 18:15:38 oregano systemd[1]: Started Thermal Daemon Service.
Jul 22 18:15:39 oregano thermald[4890]: sensor id 14 : No temp sysfs for reading raw >
Jul 22 18:15:39 oregano thermald[4890]: sensor id 14 : No temp sysfs for reading raw >
Jul 22 18:15:39 oregano thermald[4890]: sensor id 14 : No temp sysfs for reading raw >
Jul 22 18:15:39 oregano thermald[4890]: Polling mode is enabled: 4

I'm now above 2GHz even with the powersave governor!
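The only visible difference between the throttled start and the good one is the "NO RAPL sysfs present" line. A sketch that checks a log for that line, so a restart can be made conditional, follows. `log_shows_missing_rapl` is a made-up name, and treating this message as the marker of a bad start is an assumption drawn from the two logs above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the log text on stdin contains the RAPL
# message seen in these notes after a cold boot.
log_shows_missing_rapl() {
    grep -q 'NO RAPL sysfs present'
}

# Example (as root): restart thermald only when the current boot's log has the message.
# journalctl -b -u thermald.service | log_shows_missing_rapl && systemctl restart thermald.service
```

The message stays in the journal for the rest of the boot, so this check is only meaningful once, before the first restart.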

Last time, reboot and then just restart the service and see if that fixes everything again.

YESH!

OK, try adding a systemd timer to do this a minute after each reboot so I don't need to restart manually:

https://wiki.archlinux.org/title/Systemd/Timers#Timer_units

emacs /etc/systemd/system/thermaldRestart.timer

Add:

[Unit]
Description=Restart the thermald service to make it work (see ~terdon/README.install)

[Timer]
OnBootSec=1min

[Install]
WantedBy=timers.target

Then /etc/systemd/system/thermaldRestart.service:

[Unit]
Description=Restart Thermal Daemon Service
ConditionVirtualization=no

[Service]
Type=oneshot
ExecStart=/sbin/systemctl restart thermald.service

And:

$ systemctl enable thermaldRestart.timer 
Created symlink /etc/systemd/system/timers.target.wants/thermaldRestart.timer → /etc/systemd/system/thermaldRestart.timer.

Try rebooting and check if it works. YESH! Works!
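The one-minute delay is a guess. If the real problem is that thermald starts before the RAPL sysfs files exist (the "NO RAPL sysfs present" line suggests this, but nothing here proves it), the oneshot service could wait for the file instead of a fixed time. A sketch, with `wait_for_path` as an invented helper:

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait up to $2 seconds (default 60) for path $1 to appear.
# Returns 0 as soon as the path exists, 1 on timeout.
wait_for_path() {
    local path="$1"
    local timeout="${2:-60}"
    local waited=0
    while [ ! -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 0
}

# Example for the oneshot service, replacing the fixed one-minute delay:
# wait_for_path /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw 60 \
#     && systemctl restart thermald.service
```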
