@jmbjorndalen
Created March 23, 2023 12:41
Hack to fix high idle CPU time for Minecraft servers on a Linux host
#!/usr/bin/env python3
"""
This is a quick hack intended as a tool to help diagnose high CPU usage by
what should be an idle Minecraft server.

Use with care! It may break things.
Needs to run as root to work.

The main problem is that Minecraft appears to be calling parkNanos in a tight
loop with a very low timeout. Linux has higher precision timeouts by default
than, for instance, Windows. When this happens, you can observe that Minecraft
spins on three system calls:
- clock_gettime
- futex - with a timeout 1 microsecond away from the timestamp returned by
  clock_gettime
- futex - possibly to wake some other thread after the first futex times out.

The result is that an idle Minecraft server ends up consuming considerable CPU
time. As far as I could tell, out of around 60-70 threads on my modded server,
there are at least 3 threads that consume significant time. The most active
will be around 35% CPU time (out of a single core) on my system.

This hack works by increasing the timer slack (timerslack_ns) for Minecraft's
Java threads, which makes their timeouts coarser and closer to what a Windows
developer might have experienced.

The problem can be a bit confusing to diagnose. If you, for instance, use the
spark profiler in Minecraft, it looks like the process is spending 99% of its
time sleeping. As far as I can tell, this is an illusion caused by the level
that the profiler works on.

Disclaimer:
- I have not tried to look at how Minecraft uses parkNanos internally.
- I have not traced the JDK source (apparently, it uses pthread condition
  variables that then use futexes, but I have not verified this).

References (for more details on the problem and the hack to alleviate it):
- https://bugs.mojang.com/browse/MC-183518
- https://hazelcast.com/blog/locksupport-parknanos-under-the-hood-and-the-curious-case-of-parking/

TODO:
- probably don't need to do it for all of the threads (possibly only
  those that call parkNanos). This is simpler though.
"""
import sys
import re
import subprocess


def check_cmd_java(tpid):
    "Check that the command line for the pid starts with java"
    # Arguments in /proc/<pid>/cmdline are NUL-separated; only the prefix matters here.
    with open(f"/proc/{tpid}/cmdline") as f:
        cmd = f.read().strip()
    return cmd.startswith("java")


def get_java_pids(mainpid):
    """Returns a list of pids for the main process (the first in the list) and the threads in the java process"""
    p = subprocess.Popen(f"jstack {mainpid}", shell=True, stdout=subprocess.PIPE)
    jpids = [mainpid]
    for line in p.stdout.readlines():
        line = line.decode("utf-8").strip()
        # nid is the thread pid as a hex value
        m = re.search(r".*tid=(0x[0-9A-Fa-f]+) nid=(0x[0-9A-Fa-f]+) .*", line)
        if m:
            tpid = int(m.group(2), base=16)
            is_java = check_cmd_java(tpid)
            print("Found ", m.group(1), m.group(2), tpid, is_java)
            if is_java:
                jpids.append(tpid)
    return jpids


def set_timerslack(pids, slack):
    """Sets timerslack_ns for each of the provided pids"""
    for pid in pids:
        with open(f"/proc/{pid}/timerslack_ns", 'w', encoding="utf-8") as f:
            f.write(str(slack))


if len(sys.argv) < 2:
    print("Usage: fix-java-timer.py pid [timer res in nanoseconds]")
    sys.exit()

pid = int(sys.argv[1])
timerslack = int(sys.argv[2]) if len(sys.argv) > 2 else 5_000_000  # set to 5 ms by default

if check_cmd_java(pid):
    print(f"Checking process {pid} and trying to set timerslack {timerslack}")
    jpids = get_java_pids(pid)
    set_timerslack(jpids, timerslack)
else:
    print(f"Specified pid {pid} is not a java process")
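
A possible companion to the script above, for checking that the new timer slack took effect and for seeing which threads are busy. This is a minimal sketch and not part of the original hack: it assumes thread ids can be listed from /proc/<pid>/task rather than parsed from jstack output, and it reads timerslack_ns through the same /proc/<tid>/ path the script writes to. The function names and the proc_root parameter are hypothetical, added only so the helpers can be pointed at a test directory. Like the script, it will most likely need to run as root.

```python
import os


def list_thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread ids listed under <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


def read_timerslack(tid, proc_root="/proc"):
    """Return the current timer slack in nanoseconds for one thread id."""
    path = os.path.join(proc_root, str(tid), "timerslack_ns")
    with open(path, encoding="utf-8") as f:
        return int(f.read().strip())


def thread_cpu_ticks(pid, tid, proc_root="/proc"):
    """Return utime + stime (in clock ticks) for one thread of a process."""
    path = os.path.join(proc_root, str(pid), "task", str(tid), "stat")
    with open(path, encoding="utf-8") as f:
        data = f.read()
    # The thread name sits in parentheses and may contain spaces,
    # so only split the part after the last ')'.
    fields = data[data.rfind(")") + 1:].split()
    # fields[0] is field 3 (state) of the stat line;
    # utime and stime are fields 14 and 15, i.e. indices 11 and 12 here.
    return int(fields[11]) + int(fields[12])
```

One way to use it: call list_thread_ids(pid) before and after running the script, print read_timerslack(tid) for each thread to confirm the value changed, and sample thread_cpu_ticks(pid, tid) twice a few seconds apart to see which threads accumulate the most ticks. Those threads are candidates for the narrower set mentioned in the TODO.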