@gaol
Last active June 1, 2022 09:38
OpenJDK 17.0.2 - Cgroup v1 initialization causes NullPointerException when cgroup path does not start with the mount root

When I ran the WildFly testsuite on JDK 17 inside a podman container, every test failed with a NullPointerException (see NPE.stacktrace.java below for the full stack trace); everything works fine when run on bare metal.

It is related to JDK issue https://bugs.openjdk.java.net/browse/JDK-8272124, but this demonstrates another case: the cgroup path does not start with the mount root.

In this case /proc/self/cgroup has the following line:

9:memory:/user.slice/user-1000.slice/session-3.scope

while /proc/self/mountinfo has the following line:

941 931 0:36 /user.slice/user-1000.slice/session-50.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory
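To make the failure mode concrete: during cgroup v1 initialization the JDK adjusts the controller path by matching the cgroup path from /proc/self/cgroup against the mount root from /proc/self/mountinfo, and leaves the path null when neither matches. The sketch below is a hypothetical simplification of that adjustment (class and method names are mine, not the JDK's; the real logic lives in CgroupV1SubsystemController.setPath), fed with the two lines above:

```java
public class CgroupPathMismatch {

    // Hypothetical simplification of the path adjustment done during
    // cgroup v1 initialization.
    static String resolvePath(String mountRoot, String mountPoint, String cgroupPath) {
        if (mountRoot.equals(cgroupPath)) {
            // The process is in exactly the cgroup the controller is mounted at.
            return mountPoint;
        }
        if (cgroupPath.startsWith(mountRoot)) {
            // Append the part of the cgroup path below the mount root.
            return mountPoint + cgroupPath.substring(mountRoot.length());
        }
        // Neither case applies: the path stays null, and a later read of it
        // triggers the NullPointerException in Paths.get(...).
        return null;
    }

    public static void main(String[] args) {
        String mountRoot  = "/user.slice/user-1000.slice/session-50.scope"; // from mountinfo
        String mountPoint = "/sys/fs/cgroup/memory";
        String cgroupPath = "/user.slice/user-1000.slice/session-3.scope";  // from /proc/self/cgroup
        System.out.println(resolvePath(mountRoot, mountPoint, cgroupPath)); // prints "null"
    }
}
```

Because session-3.scope is not a prefix of session-50.scope, the resolved path stays null and the first metrics read blows up, which matches the stack trace below.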

The environment:

  • Java version
openjdk version "17.0.2" 2022-01-18
OpenJDK Runtime Environment 21.9 (build 17.0.2+8)
OpenJDK 64-Bit Server VM 21.9 (build 17.0.2+8, mixed mode, sharing)
  • RHEL 8.5:
[jenkins@testjenkins ~]$ uname -a
Linux testjenkins 4.18.0-348.el8.x86_64 #1 SMP Mon Oct 4 12:17:22 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
[jenkins@testjenkins ~]$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.5 (Ootpa)
  • podman version:
[jenkins@testjenkins ~]$ podman --version
podman version 3.4.2
```
[ERROR] Failed to execute goal org.wildfly.plugins:wildfly-maven-plugin:2.0.1.Final:execute-commands (apply-elytron) on project wildfly-ts-integ-smoke: Failed to execute commands: Exception in thread "main"
java.lang.NullPointerException
[ERROR] at java.base/java.util.Objects.requireNonNull(Objects.java:208)
[ERROR] at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:263)
[ERROR] at java.base/java.nio.file.Path.of(Path.java:147)
[ERROR] at java.base/java.nio.file.Paths.get(Paths.java:69)
[ERROR] at java.base/jdk.internal.platform.CgroupUtil.lambda$readStringValue$1(CgroupUtil.java:67)
[ERROR] at java.base/java.security.AccessController.doPrivileged(AccessController.java:569)
[ERROR] at java.base/jdk.internal.platform.CgroupUtil.readStringValue(CgroupUtil.java:69)
[ERROR] at java.base/jdk.internal.platform.CgroupSubsystemController.getStringValue(CgroupSubsystemController.java:65)
[ERROR] at java.base/jdk.internal.platform.CgroupSubsystemController.getLongValue(CgroupSubsystemController.java:124)
[ERROR] at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getLongValue(CgroupV1Subsystem.java:175)
[ERROR] at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getHierarchical(CgroupV1Subsystem.java:149)
[ERROR] at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.initSubSystem(CgroupV1Subsystem.java:84)
[ERROR] at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getInstance(CgroupV1Subsystem.java:60)
[ERROR] at java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:116)
[ERROR] at java.base/jdk.internal.platform.CgroupMetrics.getInstance(CgroupMetrics.java:167)
[ERROR] at java.base/jdk.internal.platform.SystemMetrics.instance(SystemMetrics.java:29)
[ERROR] at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:58)
[ERROR] at java.base/jdk.internal.platform.Container.metrics(Container.java:43)
[ERROR] at jdk.management/com.sun.management.internal.OperatingSystemImpl.<init>(OperatingSystemImpl.java:182)
[ERROR] at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl.getOperatingSystemMXBean(PlatformMBeanProviderImpl.java:280)
[ERROR] at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl$3.nameToMBeanMap(PlatformMBeanProviderImpl.java:199)
[ERROR] at java.management/java.lang.management.ManagementFactory.lambda$getPlatformMBeanServer$0(ManagementFactory.java:488)
[ERROR] at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273)
[ERROR] at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
[ERROR] at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1779)
[ERROR] at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
[ERROR] at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
[ERROR] at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
[ERROR] at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
[ERROR] at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
[ERROR] at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
[ERROR] at java.management/java.lang.management.ManagementFactory.getPlatformMBeanServer(ManagementFactory.java:489)
[ERROR] at org.jboss.modules.ModuleLoader$RealMBeanReg$1.run(ModuleLoader.java:1258)
[ERROR] at org.jboss.modules.ModuleLoader$RealMBeanReg$1.run(ModuleLoader.java:1256)
[ERROR] at java.base/java.security.AccessController.doPrivileged(AccessController.java:318)
[ERROR] at org.jboss.modules.ModuleLoader$RealMBeanReg.<init>(ModuleLoader.java:1256)
[ERROR] at org.jboss.modules.ModuleLoader$TempMBeanReg.installReal(ModuleLoader.java:1240)
[ERROR] at org.jboss.modules.ModuleLoader.installMBeanServer(ModuleLoader.java:273)
[ERROR] at org.jboss.modules.Main.main(Main.java:605)
```
[
{
"Id": "1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac",
"Created": "2022-05-02T20:48:21.078644068+08:00",
"Path": "/var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/wait.sh",
"Args": [
"/var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/wait.sh"
],
"State": {
"OciVersion": "1.0.2-dev",
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 7632,
"ConmonPid": 7619,
"ExitCode": 0,
"Error": "",
"StartedAt": "2022-05-02T20:48:21.288271319+08:00",
"FinishedAt": "0001-01-01T00:00:00Z",
"Healthcheck": {
"Status": "",
"FailingStreak": 0,
"Log": null
}
},
"Image": "bfebb38e834973abcdd7928c858b36c7b1f2540f409ce440671bb6237bdbe03f",
"ImageName": "localhost/automatons:latest",
"Rootfs": "",
"Pod": "",
"ResolvConfPath": "/run/user/1000/containers/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata/resolv.conf",
"HostnamePath": "/run/user/1000/containers/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata/hostname",
"HostsPath": "/run/user/1000/containers/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata/hosts",
"StaticDir": "/home/jenkins/.local/share/containers/storage/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata",
"OCIConfigPath": "/home/jenkins/.local/share/containers/storage/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata/config.json",
"OCIRuntime": "runc",
"ConmonPidFile": "/run/user/1000/containers/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata/conmon.pid",
"PidFile": "/run/user/1000/containers/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata/pidfile",
"Name": "automaton-slave-eap-7.4.x-testsuite-19",
"RestartCount": 0,
"Driver": "overlay",
"MountLabel": "system_u:object_r:container_file_t:s0:c334,c907",
"ProcessLabel": "system_u:system_r:container_t:s0:c334,c907",
"AppArmorProfile": "",
"EffectiveCaps": null,
"BoundingCaps": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FOWNER",
"CAP_FSETID",
"CAP_KILL",
"CAP_NET_BIND_SERVICE",
"CAP_NET_RAW",
"CAP_SETFCAP",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETUID",
"CAP_SYS_CHROOT"
],
"ExecIDs": [
"cd38093eb72bc4947e6a09a297cd767ec49ef26cac452b2839d7c47713cb0de6"
],
"GraphDriver": {
"Name": "overlay",
"Data": {
"LowerDir": "/home/jenkins/.local/share/containers/storage/overlay/904c65c233abd3aa8ecbe08a1f3d6a4a4a7704abe47e673ffcea9030d082a9b8/diff:/home/jenkins/.local/share/containers/storage/overlay/930a368523e8d9ca054beac02aa2ec0009395486ab58c5a9d7646102e7a33601/diff:/home/jenkins/.local/share/containers/storage/overlay/ee494184ff635908355ff3828a5fdb21dae83f453bac2cdd4f147b315fbb75d2/diff:/home/jenkins/.local/share/containers/storage/overlay/1e4f5e9e1e495a879ebbb6c9c406a3b77016d74024d2d8ebdc4a56ee7434a580/diff:/home/jenkins/.local/share/containers/storage/overlay/5370b65977de2ed522b13ebaeca0fd41501f6eab2c55513c5d927594c2c253e4/diff:/home/jenkins/.local/share/containers/storage/overlay/93749af418e72b7f9d1998cdf41d4007dc27065fe4d79a3a05abf4bf274a2fac/diff",
"MergedDir": "/home/jenkins/.local/share/containers/storage/overlay/5f2f3e72dd86114d232fed1d07a450bcb60c84c3d55123ed797c82941a8eb9a6/merged",
"UpperDir": "/home/jenkins/.local/share/containers/storage/overlay/5f2f3e72dd86114d232fed1d07a450bcb60c84c3d55123ed797c82941a8eb9a6/diff",
"WorkDir": "/home/jenkins/.local/share/containers/storage/overlay/5f2f3e72dd86114d232fed1d07a450bcb60c84c3d55123ed797c82941a8eb9a6/work"
}
},
"Mounts": [
{
"Type": "bind",
"Source": "/opt",
"Destination": "/opt",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": false,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/home/jenkins/.m2",
"Destination": "/var/jenkins_home/.m2/",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/home/jenkins/.ssh",
"Destination": "/var/jenkins_home/.ssh/",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": false,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/home/jenkins/.gitconfig",
"Destination": "/var/jenkins_home/.gitconfig",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": false,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/home/jenkins/.netrc",
"Destination": "/var/jenkins_home/.netrc",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": false,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/home/jenkins/current/jobs/eap-7.4.x-build/builds/7/archive",
"Destination": "/parent_job/",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": false,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/home/jenkins/current/workspace/eap-7.4.x-testsuite",
"Destination": "/var/jenkins_home/workspace/eap-7.4.x-testsuite",
"Driver": "",
"Mode": "",
"Options": [
"rbind"
],
"RW": true,
"Propagation": "rprivate"
}
],
"Dependencies": [],
"NetworkSettings": {
"EndpointID": "",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "",
"Bridge": "",
"SandboxID": "",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": ""
},
"ExitCommand": [
"/usr/bin/podman",
"--root",
"/home/jenkins/.local/share/containers/storage",
"--runroot",
"/run/user/1000/containers",
"--log-level",
"warning",
"--cgroup-manager",
"cgroupfs",
"--tmpdir",
"/run/user/1000/libpod/tmp",
"--runtime",
"runc",
"--storage-driver",
"overlay",
"--events-backend",
"file",
"container",
"cleanup",
"--rm",
"1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac"
],
"Namespace": "",
"IsInfra": false,
"Config": {
"Hostname": "1a2b19d89150",
"Domainname": "",
"User": "1000:1000",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm",
"container=oci",
"HOME=/var/jenkins_home/",
"HOSTNAME=1a2b19d89150"
],
"Cmd": [
"/var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/wait.sh"
],
"Image": "localhost/automatons:latest",
"Volumes": null,
"WorkingDir": "/var/jenkins_home/workspace/eap-7.4.x-testsuite",
"Entrypoint": "",
"OnBuild": null,
"Labels": {
"architecture": "x86_64",
"build-date": "2022-03-16T16:53:14.681638",
"com.redhat.build-host": "cpt-1001.osbs.prod.upshift.rdu2.redhat.com",
"com.redhat.component": "ubi8-container",
"com.redhat.license_terms": "https://www.redhat.com/en/about/red-hat-end-user-license-agreements#UBI",
"description": "The Universal Base Image is designed and engineered to be the base layer for all of your containerized applications, middleware and utilities. This base image is freely redistributable, but Red Hat only supports Red Hat technologies through subscriptions for Red Hat products. This image is maintained by Red Hat and updated regularly.",
"distribution-scope": "public",
"io.buildah.version": "1.23.1",
"io.k8s.description": "The Universal Base Image is designed and engineered to be the base layer for all of your containerized applications, middleware and utilities. This base image is freely redistributable, but Red Hat only supports Red Hat technologies through subscriptions for Red Hat products. This image is maintained by Red Hat and updated regularly.",
"io.k8s.display-name": "Red Hat Universal Base Image 8",
"io.openshift.expose-services": "",
"io.openshift.tags": "base rhel8",
"maintainer": "Red Hat, Inc.",
"name": "ubi8",
"release": "236.1647448331",
"summary": "Provides the latest release of Red Hat Universal Base Image 8.",
"url": "https://access.redhat.com/containers/#/registry.access.redhat.com/ubi8/images/8.5-236.1647448331",
"vcs-ref": "3aadd00326f3dd6cfe65ee31017ab98915fddb56",
"vcs-type": "git",
"vendor": "Red Hat, Inc.",
"version": "8.5"
},
"Annotations": {
"io.container.manager": "libpod",
"io.kubernetes.cri-o.Created": "2022-05-02T20:48:21.078644068+08:00",
"io.kubernetes.cri-o.TTY": "false",
"io.podman.annotations.autoremove": "TRUE",
"io.podman.annotations.init": "FALSE",
"io.podman.annotations.privileged": "FALSE",
"io.podman.annotations.publish-all": "FALSE",
"org.opencontainers.image.base.digest": "sha256:e2cbaf307f898bb43ad5e0b67bd9325c585d5e1890de0dfe7d4832d634abd39e",
"org.opencontainers.image.base.name": "registry.access.redhat.com/ubi8/ubi:latest",
"org.opencontainers.image.stopSignal": "15"
},
"StopSignal": 15,
"CreateCommand": [
"podman",
"run",
"--name",
"automaton-slave-eap-7.4.x-testsuite-19",
"--userns=keep-id",
"-u",
"1000:1000",
"--add-host=olympus:10.88.0.1",
"--rm",
"-v",
"/home/jenkins/current//jobs/eap-7.4.x-build/builds/7/archive:/parent_job/:ro",
"--workdir",
"/var/jenkins_home/workspace/eap-7.4.x-testsuite",
"-v",
"/home/jenkins/current/workspace/eap-7.4.x-testsuite:/var/jenkins_home/workspace/eap-7.4.x-testsuite:rw",
"-v",
"/opt:/opt:ro",
"-v",
"/home/jenkins/.m2/:/var/jenkins_home/.m2/:rw",
"-v",
"/home/jenkins/.ssh/:/var/jenkins_home/.ssh/:ro",
"-v",
"/home/jenkins/.gitconfig:/var/jenkins_home/.gitconfig:ro",
"-v",
"/home/jenkins/.netrc:/var/jenkins_home/.netrc:ro",
"-d",
"localhost/automatons",
"/var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/wait.sh"
],
"Umask": "0022",
"Timeout": 0,
"StopTimeout": 10
},
"HostConfig": {
"Binds": [
"/opt:/opt:ro,rprivate,rbind",
"/home/jenkins/.m2:/var/jenkins_home/.m2/:rw,rprivate,rbind",
"/home/jenkins/.ssh:/var/jenkins_home/.ssh/:ro,rprivate,rbind",
"/home/jenkins/.gitconfig:/var/jenkins_home/.gitconfig:ro,rprivate,rbind",
"/home/jenkins/.netrc:/var/jenkins_home/.netrc:ro,rprivate,rbind",
"/home/jenkins/current/jobs/eap-7.4.x-build/builds/7/archive:/parent_job/:ro,rprivate,rbind",
"/home/jenkins/current/workspace/eap-7.4.x-testsuite:/var/jenkins_home/workspace/eap-7.4.x-testsuite:rw,rprivate,rbind"
],
"CgroupManager": "cgroupfs",
"CgroupMode": "host",
"ContainerIDFile": "",
"LogConfig": {
"Type": "k8s-file",
"Config": null,
"Path": "/home/jenkins/.local/share/containers/storage/overlay-containers/1a2b19d8915046a04773d7ac350c95d861bcb295d06e9ea2558178eaaa10a1ac/userdata/ctr.log",
"Tag": "",
"Size": "0B"
},
"NetworkMode": "slirp4netns",
"PortBindings": {},
"RestartPolicy": {
"Name": "",
"MaximumRetryCount": 0
},
"AutoRemove": true,
"VolumeDriver": "",
"VolumesFrom": null,
"CapAdd": [],
"CapDrop": [
"CAP_AUDIT_WRITE",
"CAP_MKNOD"
],
"Dns": [],
"DnsOptions": [],
"DnsSearch": [],
"ExtraHosts": [
"olympus:10.88.0.1"
],
"GroupAdd": [],
"IpcMode": "private",
"Cgroup": "",
"Cgroups": "default",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "private",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": [],
"Tmpfs": {},
"UTSMode": "private",
"UsernsMode": "private",
"ShmSize": 65536000,
"Runtime": "oci",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DiskQuota": 0,
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": 0,
"OomKillDisable": false,
"PidsLimit": 0,
"Ulimits": [],
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0,
"CgroupConf": null
}
}
]
#subsys_name hierarchy num_cgroups enabled
cpuset 6 4 1
cpu 8 126 1
cpuacct 8 126 1
blkio 11 126 1
memory 9 167 1
devices 3 126 1
freezer 7 4 1
net_cls 5 4 1
perf_event 4 4 1
net_prio 5 4 1
hugetlb 2 4 1
pids 10 159 1
rdma 12 1 1
12:rdma:/
11:blkio:/system.slice/sshd.service
10:pids:/user.slice/user-1000.slice/session-3.scope
9:memory:/user.slice/user-1000.slice/session-3.scope
8:cpu,cpuacct:/
7:freezer:/
6:cpuset:/
5:net_cls,net_prio:/
4:perf_event:/
3:devices:/system.slice/sshd.service
2:hugetlb:/
1:name=systemd:/user.slice/user-1000.slice/user@1000.service/user.slice/podman-637674.scope
931 921 0:86 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw,context="system_u:object_r:container_file_t:s0:c375,c611",mode=755,uid=100000,gid=100000
932 931 0:26 /user.slice/user-1000.slice/user@1000.service/user.slice/podman-637196.scope/2be1ee6e076d20b38a61a1e3289974662d646fb50489ccf22f7fa4e3dc082295 /sys/fs/cgroup/systemd ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
933 931 0:29 / /sys/fs/cgroup/hugetlb ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,hugetlb
934 931 0:30 /user.slice /sys/fs/cgroup/devices ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,devices
935 931 0:31 / /sys/fs/cgroup/perf_event ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,perf_event
936 931 0:32 / /sys/fs/cgroup/net_cls,net_prio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,net_cls,net_prio
937 931 0:33 / /sys/fs/cgroup/cpuset ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpuset
938 931 0:34 / /sys/fs/cgroup/freezer ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,freezer
940 931 0:35 /user.slice /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpu,cpuacct
941 931 0:36 /user.slice/user-1000.slice/session-50.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory
942 931 0:37 /user.slice/user-1000.slice/session-50.scope /sys/fs/cgroup/pids ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,pids
943 931 0:38 /user.slice /sys/fs/cgroup/blkio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,blkio
944 931 0:39 / /sys/fs/cgroup/rdma ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,rdma
diff --git a/test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java b/test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java
index 369d4244533..a84cb70d7f5 100644
--- a/test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java
+++ b/test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java
@@ -43,6 +43,7 @@ import jdk.internal.platform.CgroupSubsystemFactory;
import jdk.internal.platform.CgroupSubsystemFactory.CgroupTypeResult;
import jdk.internal.platform.CgroupV1MetricsImpl;
import jdk.internal.platform.cgroupv1.CgroupV1Subsystem;
+import jdk.internal.platform.cgroupv1.CgroupV1SubsystemController;
import jdk.internal.platform.Metrics;
import jdk.test.lib.Utils;
import jdk.test.lib.util.FileUtils;
@@ -72,8 +73,10 @@ public class TestCgroupSubsystemFactory {
private Path cgroupv1MntInfoDoubleCpusets;
private Path cgroupv1MntInfoDoubleCpusets2;
private Path cgroupv1MntInfoColonsHierarchy;
+ private Path cgroupv1MntInfoPrefix;
private Path cgroupv1SelfCgroup;
private Path cgroupv1SelfColons;
+ private Path cgroupv1SelfPrefix;
private Path cgroupv2SelfCgroup;
private Path cgroupv1SelfCgroupJoinCtrl;
private Path cgroupv1CgroupsOnlyCPUCtrl;
@@ -166,6 +169,20 @@ public class TestCgroupSubsystemFactory {
"42 30 0:38 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:14 - cgroup none rw,seclabel,cpuset\n" +
"43 30 0:39 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup none rw,seclabel,blkio\n" +
"44 30 0:40 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:16 - cgroup none rw,seclabel,freezer\n";
+ private String mntInfoPrefix =
+ "931 921 0:86 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw,context=\"system_u:object_r:container_file_t:s0:c375,c611\",mode=755,uid=100000,gid=100000\n" +
+ "932 931 0:26 /user.slice/user-1000.slice/user@1000.service/user.slice/podman-637196.scope/2be1ee6e076d20b38a61a1e3289974662d646fb50489ccf22f7fa4e3dc082295 /sys/fs/cgroup/systemd ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd\n" +
+ "933 931 0:29 / /sys/fs/cgroup/hugetlb ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,hugetlb\n" +
+ "934 931 0:30 /user.slice /sys/fs/cgroup/devices ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,devices\n" +
+ "935 931 0:31 / /sys/fs/cgroup/perf_event ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,perf_event\n" +
+ "936 931 0:32 / /sys/fs/cgroup/net_cls,net_prio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,net_cls,net_prio\n" +
+ "937 931 0:33 / /sys/fs/cgroup/cpuset ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpuset\n" +
+ "938 931 0:34 / /sys/fs/cgroup/freezer ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,freezer\n" +
+ "940 931 0:35 /user.slice /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,cpu,cpuacct\n" +
+ "941 931 0:36 /user.slice/user-1000.slice/session-50.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory\n" +
+ "942 931 0:37 /user.slice/user-1000.slice/session-50.scope /sys/fs/cgroup/pids ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,pids\n" +
+ "943 931 0:38 /user.slice /sys/fs/cgroup/blkio ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,blkio\n" +
+ "944 931 0:39 / /sys/fs/cgroup/rdma ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,rdma";
private String cgroupsNonZeroHierarchy =
"#subsys_name hierarchy num_cgroups enabled\n" +
"cpuset 9 1 1\n" +
@@ -217,6 +234,19 @@ public class TestCgroupSubsystemFactory {
"2:cpu,cpuacct:/\n" +
"1:name=systemd:/user.slice/user-1000.slice/user@1000.service/apps.slice/apps-org.gnome.Terminal.slice/vte-spawn-3c00b338-5b65-439f-8e97-135e183d135d.scope\n" +
"0::/user.slice/user-1000.slice/user@1000.service/apps.slice/apps-org.gnome.Terminal.slice/vte-spawn-3c00b338-5b65-439f-8e97-135e183d135d.scope\n";
+ private String cgroupv1SelfPrefixContent =
+ "12:rdma:/\n" +
+ "11:blkio:/system.slice/sshd.service\n" +
+ "10:pids:/user.slice/user-1000.slice/session-3.scope\n" +
+ "9:memory:/user.slice/user-1000.slice/session-3.scope\n" +
+ "8:cpu,cpuacct:/\n" +
+ "7:freezer:/\n" +
+ "6:cpuset:/\n" +
+ "5:net_cls,net_prio:/\n" +
+ "4:perf_event:/\n" +
+ "3:devices:/system.slice/sshd.service\n" +
+ "2:hugetlb:/\n" +
+ "1:name=systemd:/user.slice/user-1000.slice/user@1000.service/user.slice/podman-637674.scope";
private String cgroupv2SelfCgroupContent = "0::/user.slice/user-1000.slice/session-2.scope";
@Before
@@ -257,12 +287,18 @@ public class TestCgroupSubsystemFactory {
cgroupv1MntInfoColonsHierarchy = Paths.get(existingDirectory.toString(), "mountinfo_colons");
Files.writeString(cgroupv1MntInfoColonsHierarchy, mntInfoColons);
+ cgroupv1MntInfoPrefix = Paths.get(existingDirectory.toString(), "mountinfo-prefix");
+ Files.writeString(cgroupv1MntInfoPrefix, mntInfoPrefix);
+
cgroupv1SelfCgroup = Paths.get(existingDirectory.toString(), "self_cgroup_cgv1");
Files.writeString(cgroupv1SelfCgroup, cgroupv1SelfCgroupContent);
cgroupv1SelfColons = Paths.get(existingDirectory.toString(), "self_colons_cgv1");
Files.writeString(cgroupv1SelfColons, cgroupv1SelfColonsContent);
+ cgroupv1SelfPrefix = Paths.get(existingDirectory.toString(), "self_prefix_cgv1");
+ Files.writeString(cgroupv1SelfPrefix, cgroupv1SelfPrefixContent);
+
cgroupv2SelfCgroup = Paths.get(existingDirectory.toString(), "self_cgroup_cgv2");
Files.writeString(cgroupv2SelfCgroup, cgroupv2SelfCgroupContent);
@@ -393,6 +429,24 @@ public class TestCgroupSubsystemFactory {
assertEquals(memoryInfo.getMountRoot(), memoryInfo.getCgroupPath());
}
+ @Test
+ public void testMountPrefixCgroupsV1() throws IOException {
+ String cgroups = cgroupv1CgInfoNonZeroHierarchy.toString();
+ String mountInfo = cgroupv1MntInfoPrefix.toString();
+ String selfCgroup = cgroupv1SelfPrefix.toString();
+ Optional<CgroupTypeResult> result = CgroupSubsystemFactory.determineType(mountInfo, cgroups, selfCgroup);
+
+ assertTrue("Expected non-empty cgroup result", result.isPresent());
+ CgroupTypeResult res = result.get();
+ CgroupInfo memoryInfo = res.getInfos().get("memory");
+ assertEquals(memoryInfo.getCgroupPath(), "/user.slice/user-1000.slice/session-3.scope");
+ assertEquals("/sys/fs/cgroup/memory", memoryInfo.getMountPoint());
+ CgroupV1SubsystemController cgroupv1MemoryController = new CgroupV1SubsystemController(memoryInfo.getMountRoot(), memoryInfo.getMountPoint());
+ cgroupv1MemoryController.setPath(memoryInfo.getCgroupPath());
+ // issue to verify: path was not set because the cgroupPath does not start with mount root
+ assertNotNull(cgroupv1MemoryController.path());
+ }
+
@Test
public void testZeroHierarchyCgroupsV1() throws IOException {
String cgroups = cgroupv1CgInfoZeroHierarchy.toString();

gaol commented May 23, 2022

> Do you still have session 9055a14c4105? If so, can you confirm whether this process in the container is really $HOSTPID=50293?

jenkins        1       0  0 04:43 ?        00:00:00 /bin/bash /var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/wait

Sorry, no. The container exists for only a few minutes (all the tests fail quickly). I am updating the test script to sleep for 1 hour so the container stays around longer, but then we lose the Java process information inside the container.

In our setup, the container was started outside of Jenkins environment.

> Please do this on the host. It should report "NSpid: 1"

$ cat /proc/50293/status | grep NSpid

After I restarted a new job, I got the following:

[jenkins@testjenkins ~]$ podman ps --ns
CONTAINER ID  NAMES                                   PID         CGROUPNS    IPC         MNT         NET         PIDNS       USERNS      UTS
e4da90e1cdfc  automaton-slave-eap-7.4.x-testsuite-30  67694       4026531835  4026532632  4026532630  4026532635  4026532633  4026532629  4026532631

`cat /proc/67694/status | grep NSpid` gives me: `NSpid: 67694 1`

The PID of the container is 67694. Inside the container:

[jenkins@e4da90e1cdfc eap-7.4.x-testsuite]$ ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
jenkins        1       0  0 07:35 ?        00:00:00 /bin/bash /var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/wait
jenkins        8       0  0 07:35 pts/0    00:00:00 /bin/bash /var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/buil
jenkins       42       8  0 07:35 pts/0    00:00:00 /bin/bash /var/jenkins_home/workspace/eap-7.4.x-testsuite/harmonia/
jenkins       43       8  0 07:35 pts/0    00:00:00 /usr/bin/coreutils --coreutils-prog-shebang=tee /usr/bin/tee /var/j
jenkins       95      42  0 07:35 pts/0    00:00:00 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 60
jenkins      803       0  0 07:46 pts/1    00:00:00 /bin/bash
jenkins      823       1  0 07:47 ?        00:00:00 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 1

> In your host output of /proc/50293/cgroup and /proc/50293/mountinfo from above, they are both using session-13.scope, so we don't see the symptom you reported (where the two files disagree).

11:memory:/user.slice/user-1000.slice/session-13.scope
868 851 0:38 /user.slice/user-1000.slice/session-13.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory

What disagree are the files inside the container; in this session they are:

[jenkins@e4da90e1cdfc eap-7.4.x-testsuite]$ cat /proc/self/cgroup |grep memory
11:memory:/user.slice/user-1000.slice/session-3.scope
[jenkins@e4da90e1cdfc eap-7.4.x-testsuite]$ cat /proc/self/mountinfo |grep memory
912 901 0:38 /user.slice/user-1000.slice/session-28.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory

> Can you find the HOSTPID for the "java" process

jenkins       99      46  4 04:43 pts/0    00:00:01 /opt/oracle/jdk-17.0.2/bin/java -Dmaven.wagon.http.ssl.insecure=tru

> This can be done on the host with:

cd /proc
grep NSpid [0-9]*/status | grep ' 99$'

> Could you double-check whether the /proc/&lt;pid&gt;/cgroup and /proc/&lt;pid&gt;/mountinfo files for this process disagree with each other?

> In container:

cat /proc/99/cgroup
cat /proc/99/mountinfo
find /sys/fs/cgroup/memory -name tasks -exec grep -H 99 {} \;

In this session, the process is:

jenkins       95      42  0 07:35 pts/0    00:00:00 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 60

In host:

[jenkins@testjenkins ~]$ grep NSpid.*95 /proc/*/status 2> /dev/null
/proc/1952/status:NSpid:	1952
/proc/195/status:NSpid:	195
/proc/3955/status:NSpid:	3955
/proc/68095/status:NSpid:	68095	43
/proc/68147/status:NSpid:	68147	95

So, I think the host pid of this process is 68147, and the content:

[jenkins@testjenkins ~]$ cat /proc/68147/cgroup |grep memory
11:memory:/user.slice/user-1000.slice/session-29.scope
[jenkins@testjenkins ~]$ cat /proc/68147/mountinfo |grep memory
912 901 0:38 /user.slice/user-1000.slice/session-28.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory

Inside of the container:

[jenkins@e4da90e1cdfc eap-7.4.x-testsuite]$ cat /proc/95/cgroup |grep memory
11:memory:/user.slice/user-1000.slice/session-29.scope
[jenkins@e4da90e1cdfc eap-7.4.x-testsuite]$ cat /proc/95/mountinfo |grep memory
912 901 0:38 /user.slice/user-1000.slice/session-28.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory

> On host:

find /sys/fs/cgroup/memory -name tasks -exec grep -H $HOSTPID {} \;

In this session, it is:

[jenkins@testjenkins proc]$ find /sys/fs/cgroup/memory -name tasks -exec grep -H 68147 {} \;
/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-29.scope/tasks:68147
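The NSpid lookup shown above can also be done programmatically; here is a minimal sketch (class and method names are mine) that parses one NSpid line from a host-side /proc/&lt;pid&gt;/status and reports the host pid when the innermost namespace pid matches the container pid:

```java
import java.util.Optional;

public class NsPidLookup {

    // Given one "NSpid:" line from /proc/<pid>/status on the host, return
    // the host pid (first column) if the innermost namespace pid (last
    // column) equals the container pid we are looking for.
    static Optional<Long> hostPidIfMatches(String nspidLine, long containerPid) {
        String[] parts = nspidLine.replace("NSpid:", "").trim().split("\\s+");
        if (parts.length >= 2 && Long.parseLong(parts[parts.length - 1]) == containerPid) {
            return Optional.of(Long.parseLong(parts[0]));
        }
        // Either the process is not in a pid namespace, or the pids differ.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The line observed on the host above, for container pid 95:
        System.out.println(hostPidIfMatches("NSpid:\t68147\t95", 95).get()); // prints 68147
    }
}
```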

> Since java is launched by the jenkins process (pid=46), I wonder if jenkins did anything to change the cgroup setting for the java process.

No, pid=46 in the previous session is actually a bash wrapper script; it does nothing related to the container or to cgroup settings.


gaol commented May 23, 2022

The pstree in host:

[jenkins@testjenkins proc]$ pstree -ap 67694
wait.sh,67694 /var/jenkins_home/workspace/eap-7.4.x-testsuite/hera/wait.sh
  └─sleep,74232 --coreutils-prog-shebang=sleep /usr/bin/sleep 1

and the /proc/&lt;Container PID&gt;/cgroup and /proc/&lt;Container PID&gt;/mountinfo on the host match:

[jenkins@testjenkins proc]$ cat /proc/67694/cgroup |grep memory
11:memory:/user.slice/user-1000.slice/session-28.scope
[jenkins@testjenkins proc]$ cat /proc/67694/mountinfo |grep memory
912 901 0:38 /user.slice/user-1000.slice/session-28.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory

But the processes inside the container (PID != 1) do not match.

@jerboaa

jerboaa commented May 23, 2022

@gaol Is there a possibility to run this program in the affected container where you see this mismatch between /proc/self/cgroup and /proc/self/mountinfo? It does not rely on /proc/self/mountinfo and /proc/self/cgroup matching in the way the JDK currently does.

import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class FindCgroupPath {
    static final String pid;
    
    static {
        String pidVal = null;
        try {
            pidVal = Files.readSymbolicLink(Paths.get("/proc/self")).getFileName().toString();
        } catch (IOException e) {
            // Ignore: pid stays null and findPath() will simply never match.
        }
        pid = pidVal;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: FindCgroupPath <path/to/controller/mount>");
            System.exit(1);
        }
        String mount = args[0];
        System.out.println("PID is: " + pid + " walking: " + mount);
        Files.walkFileTree(Paths.get(mount), new SimpleFileVisitor<>() {

            @Override
            public FileVisitResult visitFile(Path file,
                    BasicFileAttributes attrs) throws IOException {
                if (file.getFileName().compareTo(Paths.get("cgroup.procs")) == 0) {
                    if (findPath(file)) {
                        return FileVisitResult.TERMINATE;
                    }
                }
                return FileVisitResult.CONTINUE;
            }

        });
    }
    
    public static boolean findPath(Path cgroupProcsFile) {
        try {
            System.out.println("Analyzing " + cgroupProcsFile);
            for (String line: Files.readAllLines(cgroupProcsFile)) {
                if (pid != null && pid.equals(line.trim())) {
                    System.out.println("Found process at path " + cgroupProcsFile);
                    System.out.println("Cgroup path is: " + cgroupProcsFile.getParent());
                    return true;
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return false;
    }

}

Run as:

$ java FindCgroupPath.java /sys/fs/cgroup/memory

@gaol

gaol commented May 24, 2022

Thanks @jerboaa, here is the report from running it in the affected container:

[jenkins@383e373c4282 jenkins_home]$ cat /proc/self/cgroup |grep memory
11:memory:/user.slice/user-1000.slice/session-3.scope
[jenkins@383e373c4282 jenkins_home]$ cat /proc/self/mountinfo |grep memory
916 905 0:38 /user.slice/user-1000.slice/session-31.scope /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,seclabel,memory

[jenkins@383e373c4282 jenkins_home]$ /opt/oracle/jdk-17.0.2/bin/java -version
openjdk version "17.0.2" 2022-01-18
OpenJDK Runtime Environment (build 17.0.2+8-86)
OpenJDK 64-Bit Server VM (build 17.0.2+8-86, mixed mode, sharing)

[jenkins@383e373c4282 jenkins_home]$ /opt/oracle/jdk-17.0.2/bin/java FindCgroupPath.java /sys/fs/cgroup/memory
PID is: 452 walking: /sys/fs/cgroup/memory
Analyzing /sys/fs/cgroup/memory/cgroup.procs

[jenkins@383e373c4282 jenkins_home]$ cat /sys/fs/cgroup/memory/cgroup.procs
1
609

@jerboaa

jerboaa commented May 24, 2022

@gaol Thank you. This output:

[jenkins@383e373c4282 jenkins_home]$ /opt/oracle/jdk-17.0.2/bin/java FindCgroupPath.java /sys/fs/cgroup/memory
PID is: 452 walking: /sys/fs/cgroup/memory
Analyzing /sys/fs/cgroup/memory/cgroup.procs

This suggests that the current process is not part of the memory cgroup hierarchy mounted in the container. If it were, the output would be something like this:

bash-5.1$ /opt/jdk/bin/java FindCgroupPath.java /sys/fs/cgroup/memory
PID is: 2 walking: /sys/fs/cgroup/memory
Analyzing /sys/fs/cgroup/memory/cgroup.procs
Found process at path /sys/fs/cgroup/memory/cgroup.procs
Cgroup path is: /sys/fs/cgroup/memory

@gaol

gaol commented May 24, 2022

May I ask whether you consider this a wrong setup? And in the current setup, where should one check the memory limit of a process inside the container? Thanks!

@jerboaa

jerboaa commented May 24, 2022

No, it should be fine. I'm guessing that some other controllers are enabled. Either way, OpenJDK needs to handle this case properly. The question is which of the relevant ones (if any) is indeed enabled. You could find out with something like this:

for c in $(ls -d /sys/fs/cgroup/*); do if echo $c | grep -qE 'memory|cpu|blkio|pids'; then java FindCgroupPath.java $c; fi; done

@iklam

iklam commented May 25, 2022

@gaol, since you are running on a cgroupv1 system, I think this problem can be worked around by running with podman ... --cgroupns=private ...

See https://docs.podman.io/en/latest/markdown/podman-run.1.html

--cgroupns=mode: Set the cgroup namespace mode for the container.
    host: use the host’s cgroup namespace inside the container.
    container:id: join the namespace of the specified container.
    private: create a new cgroup namespace.
    ns:path: join the namespace at the specified path.
If the host uses cgroups v1, the default is set to host. On cgroups v2, the default is private.

By default, if you don't use any memory settings, on a cgroupv1 system, podman puts the containerized process into the memory cgroup of the user on the host. Here's what I get:

U2110: ~$ cat /proc/self/cgroup | grep memory
7:memory:/user.slice/user-1000.slice/session-3.scope
U2110: ~$ cat /proc/self/mountinfo | grep memory
43 32 0:38 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,memory

U2110: ~$ podman run --rm -it --tty=true fedora bash
[root@dc24d11dcb5e /]# cat /proc/self/cgroup | grep memory
7:memory:/user.slice/user-1000.slice/user@1000.service
[root@dc24d11dcb5e /]# cat /proc/self/mountinfo | grep memory
1181 1174 0:38 /user.slice/user-1000.slice/user@1000.service /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,memory

Note that I used ssh to log onto the host U2110 and /user.slice/user-1000.slice/session-3.scope is the cgroup hierarchy used for this SSH session.

I think in your case, you use one SSH session to create the container, but another SSH session to launch the Java process. For some unknown reason, podman uses the hierarchy of the SSH session (instead of /user.slice/user-1000.slice/user@1000.service, as in my case). This causes the JVM to fail.

If you use --cgroupns=private, the cgroup and mountinfo should look like this and will make Java happy.

U2110: ~$ podman run --rm -it --tty=true --cgroupns=private fedora bash
[root@d14b44b4d754 /]# cat /proc/self/cgroup | grep memory
7:memory:/
[root@d14b44b4d754 /]# cat /proc/self/mountinfo | grep memory
1181 1174 0:38 / /sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime - cgroup cgroup rw,memory

@iklam

iklam commented May 25, 2022

@jerboaa I think we should reconsider how we process the mountpoint in mountinfo. If it's not '/', and the path doesn't exist (not in the filesystem namespace of the current process), I don't think we should interpret it at all.
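The rule proposed here can be sketched as follows. This is a minimal illustration, not JDK code: the method name `shouldInterpretRoot` is made up for demonstration, and the sample values come from the mountinfo lines quoted earlier in this thread. Only trust the mountinfo root field (4) if it is `/` or if it resolves to an existing path under the mount point in the current process's filesystem namespace:

```java
import java.nio.file.Files;
import java.nio.file.Paths;

public class MountRootCheck {
    // Sketch of the proposed rule: ignore a mountinfo root field (4)
    // that is not "/" and does not exist in the current namespace.
    static boolean shouldInterpretRoot(String root, String mountPoint) {
        if ("/".equals(root)) {
            return true; // whole hierarchy mounted; always usable
        }
        // A non-"/" root that does not exist under the mount point
        // belongs to another namespace and should not be interpreted.
        return Files.exists(Paths.get(mountPoint, root));
    }

    public static void main(String[] args) {
        // Sample values from the mountinfo line quoted above
        System.out.println(shouldInterpretRoot("/", "/sys/fs/cgroup/memory"));
        System.out.println(shouldInterpretRoot(
            "/user.slice/user-1000.slice/session-28.scope",
            "/sys/fs/cgroup/memory"));
    }
}
```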

@gaol

gaol commented May 25, 2022

I think in your case, you use one SSH session to create the container, but use another SSH session to launch the Java process

Thanks @iklam
Yes, this is exactly what we do. 👍

Currently we use -v /sys/fs/cgroup:/sys/fs/cgroup:ro to work around the failure; I will experiment with --cgroupns=private.

@gaol

gaol commented May 25, 2022

for c in $(ls -d /sys/fs/cgroup/*); do if echo $c | grep -qE 'memory|cpu|blkio|pids'; then java FindCgroupPath.java $c; fi; done

@jerboaa below is the report from running the script above:

[jenkins@8b078952dce0 jenkins_home]$ for c in $(ls -d /sys/fs/cgroup/*); do if echo $c | grep -qE 'memory|cpu|blkio|pids'; then /opt/oracle/jdk-17.0.2/bin/java FindCgroupPath.java $c; fi; done
PID is: 248 walking: /sys/fs/cgroup/blkio
Analyzing /sys/fs/cgroup/blkio/cgroup.procs
Found process at path /sys/fs/cgroup/blkio/cgroup.procs
Cgroup path is: /sys/fs/cgroup/blkio
PID is: 272 walking: /sys/fs/cgroup/cpu
PID is: 296 walking: /sys/fs/cgroup/cpu,cpuacct
Analyzing /sys/fs/cgroup/cpu,cpuacct/cgroup.procs
Found process at path /sys/fs/cgroup/cpu,cpuacct/cgroup.procs
Cgroup path is: /sys/fs/cgroup/cpu,cpuacct
PID is: 320 walking: /sys/fs/cgroup/cpuacct
PID is: 344 walking: /sys/fs/cgroup/cpuset
Analyzing /sys/fs/cgroup/cpuset/cgroup.procs
Found process at path /sys/fs/cgroup/cpuset/cgroup.procs
Cgroup path is: /sys/fs/cgroup/cpuset
PID is: 375 walking: /sys/fs/cgroup/memory
Analyzing /sys/fs/cgroup/memory/cgroup.procs
PID is: 407 walking: /sys/fs/cgroup/pids
Analyzing /sys/fs/cgroup/pids/cgroup.procs
[jenkins@8b078952dce0 jenkins_home]$ /opt/oracle/jdk-17.0.2/bin/java -XshowSettings:system -version
Exception in thread "main" java.lang.NullPointerException
	at java.base/java.util.Objects.requireNonNull(Objects.java:208)
	at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:263)
	at java.base/java.nio.file.Path.of(Path.java:147)
	at java.base/java.nio.file.Paths.get(Paths.java:69)
	at java.base/jdk.internal.platform.CgroupUtil.lambda$readStringValue$1(CgroupUtil.java:67)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:569)
	at java.base/jdk.internal.platform.CgroupUtil.readStringValue(CgroupUtil.java:69)
	at java.base/jdk.internal.platform.CgroupSubsystemController.getStringValue(CgroupSubsystemController.java:65)
	at java.base/jdk.internal.platform.CgroupSubsystemController.getLongValue(CgroupSubsystemController.java:124)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getLongValue(CgroupV1Subsystem.java:175)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getHierarchical(CgroupV1Subsystem.java:149)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.initSubSystem(CgroupV1Subsystem.java:84)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getInstance(CgroupV1Subsystem.java:60)
	at java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:116)
	at java.base/jdk.internal.platform.CgroupMetrics.getInstance(CgroupMetrics.java:167)
	at java.base/jdk.internal.platform.SystemMetrics.instance(SystemMetrics.java:29)
	at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:58)
	at java.base/jdk.internal.platform.Container.metrics(Container.java:43)
	at java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:318)
	at java.base/sun.launcher.LauncherHelper.showSettings(LauncherHelper.java:173)

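For context, the NPE matches the gist title: cgroup v1 initialization ends up with a null in-container path when the path from /proc/self/cgroup does not start with the mountinfo root field, and that null later reaches Paths.get(...). A simplified sketch of this failure mode follows; the method name `adjustedPath` and its structure are illustrative, not the JDK's actual implementation:

```java
public class CgroupPathMismatch {
    // Simplified sketch: derive the in-container cgroup path by matching
    // the /proc/self/cgroup path against the mountinfo root field (4).
    static String adjustedPath(String mountRoot, String cgroupPath) {
        if (mountRoot.equals(cgroupPath)) {
            return "/";
        }
        if (cgroupPath.startsWith(mountRoot)) {
            return cgroupPath.substring(mountRoot.length());
        }
        // Mismatch case from this gist: session-29 vs session-28 scopes.
        // A null here is what later feeds Paths.get(...) and throws NPE.
        return null;
    }

    public static void main(String[] args) {
        String root = "/user.slice/user-1000.slice/session-28.scope"; // mountinfo (4)
        String cg   = "/user.slice/user-1000.slice/session-29.scope"; // /proc/self/cgroup
        System.out.println(adjustedPath(root, cg)); // prints "null"
    }
}
```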
After using --cgroupns=private, the output becomes:

[jenkins@3cfd6c2aaac0 jenkins_home]$ for c in $(ls -d /sys/fs/cgroup/*); do if echo $c | grep -qE 'memory|cpu|blkio|pids'; then /opt/oracle/jdk-17.0.2/bin/java FindCgroupPath.java $c; fi; done
PID is: 214 walking: /sys/fs/cgroup/blkio
Analyzing /sys/fs/cgroup/blkio/cgroup.procs
Found process at path /sys/fs/cgroup/blkio/cgroup.procs
Cgroup path is: /sys/fs/cgroup/blkio
PID is: 238 walking: /sys/fs/cgroup/cpu
PID is: 262 walking: /sys/fs/cgroup/cpu,cpuacct
Analyzing /sys/fs/cgroup/cpu,cpuacct/cgroup.procs
Found process at path /sys/fs/cgroup/cpu,cpuacct/cgroup.procs
Cgroup path is: /sys/fs/cgroup/cpu,cpuacct
PID is: 286 walking: /sys/fs/cgroup/cpuacct
PID is: 310 walking: /sys/fs/cgroup/cpuset
Analyzing /sys/fs/cgroup/cpuset/cgroup.procs
Found process at path /sys/fs/cgroup/cpuset/cgroup.procs
Cgroup path is: /sys/fs/cgroup/cpuset
PID is: 340 walking: /sys/fs/cgroup/memory
Analyzing /sys/fs/cgroup/memory/cgroup.procs
PID is: 372 walking: /sys/fs/cgroup/pids
Analyzing /sys/fs/cgroup/pids/cgroup.procs
[jenkins@3cfd6c2aaac0 jenkins_home]$ /opt/oracle/jdk-17.0.2/bin/java -XshowSettings:system -version
Operating System Metrics:
    Provider: cgroupv1
    Effective CPU Count: 2
    CPU Period: 100000us
    CPU Quota: -1
    CPU Shares: -1
    List of Processors, 2 total: 
    0 1 
    List of Effective Processors, 2 total: 
    0 1 
    List of Memory Nodes, 1 total: 
    0 
    List of Available Memory Nodes, 1 total: 
    0 
    Memory Limit: Unlimited
    Memory Soft Limit: Unlimited
    Memory & Swap Limit: Unlimited

openjdk version "17.0.2" 2022-01-18
OpenJDK Runtime Environment (build 17.0.2+8-86)
OpenJDK 64-Bit Server VM (build 17.0.2+8-86, mixed mode, sharing)

@jerboaa

jerboaa commented May 25, 2022

@jerboaa I think we should reconsider how we process the mountpoint in mountinfo. If it's not '/', and the path doesn't exist (not in the filesystem namespace of the current process), I don't think we should interpret it at all.

@iklam Let's discuss this in https://bugs.openjdk.java.net/browse/JDK-8286212. I have no idea how transient gists are. It's not clear what you mean. Do you mean the root field or the mount point field according to man procfs (i.e. field (4) or (5))? Because we always need to consider (5) on cgroups v1 IMO.

@iklam

iklam commented May 31, 2022

@jerboaa I think we should reconsider how we process the mountpoint in mountinfo. If it's not '/', and the path doesn't exist (not in the filesystem namespace of the current process), I don't think we should interpret it at all.

@iklam Let's discuss this in https://bugs.openjdk.java.net/browse/JDK-8286212. I have no idea how transient gists are. It's not clear what you mean. Do you mean the root field or the mount point field according to man procfs (i.e. field (4) or (5))? Because we always need to consider (5) on cgroups v1 IMO.

Yes, I meant the root field (4).
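For reference, the field numbering follows man 5 proc: a mountinfo line's fields are space-separated, with the root at field (4) and the mount point at field (5). A minimal sketch extracting both from the memory controller line quoted earlier in this thread (class and method names are made up for illustration):

```java
public class MountinfoFields {
    // Returns {root, mountPoint}, i.e. fields (4) and (5) of a
    // mountinfo line. Fields are space-separated, so (4) and (5)
    // sit at zero-based indices 3 and 4.
    static String[] rootAndMountPoint(String line) {
        String[] fields = line.split(" ");
        return new String[] { fields[3], fields[4] };
    }

    public static void main(String[] args) {
        String line = "912 901 0:38 /user.slice/user-1000.slice/session-28.scope "
                + "/sys/fs/cgroup/memory ro,nosuid,nodev,noexec,relatime "
                + "- cgroup cgroup rw,seclabel,memory";
        String[] rm = rootAndMountPoint(line);
        System.out.println("root (4):        " + rm[0]);
        System.out.println("mount point (5): " + rm[1]);
    }
}
```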

@jerboaa

jerboaa commented Jun 1, 2022

OK. I'll consider this when rebooting the PR.
