@mcastelino
Last active April 10, 2019 18:19
Kata and Resource Management

Workload

When running a simple workload such as the following:

apiVersion: v1
kind: Pod
metadata:
  name: guar-2kc
spec:
  runtimeClassName: kata-qemu
  containers:
  - name: busybee
    image: busybox
    resources:
      limits:
        cpu: 2
        memory: "400Mi"
    command: ["md5sum"]
    args: ["/dev/urandom"]
  - name: busybum
    image: busybox
    resources:
      limits:
        cpu: 3
        memory: "200Mi"
    command: ["md5sum"]
    args: ["/dev/urandom"]

cpusets

  • The Kata cpuset on the pause container is the union of all the containers' cpusets (1-2 and 3-5 combine to 1-5), which is the desired behaviour
kata$ for i in `ls pod*/**/cpuset.cpus`; do echo $i && cat $i; done
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/cpuset.cpus
0-7
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/cpuset.cpus
1-2
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/cpuset.cpus
3-5
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/cpuset.cpus
0-7
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/cpuset.cpus
0-7
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/cpuset.cpus
0-7
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/cpuset.cpus
1-5
kata$



─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
runc$ for i in `ls pod*/**/cpuset.cpus`; do echo $i && cat $i; done
pod3b75cd39-5b0c-11e9-8a48-525400eac274/cpuset.cpus
0-7
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/cpuset.cpus
1-2
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/cpuset.cpus
3-5
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/cpuset.cpus
0-7
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/cpuset.cpus
0-7
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/cpuset.cpus
0-7
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/cpuset.cpus
0-7
runc$ for i in `ls pod*/**/tasks`; do echo $i && cat $i; done
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/tasks
18652
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/tasks
18714
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/tasks
18536
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/tasks
18631
18634
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/tasks
18702
18704
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/tasks
18525
18527
pod3b75cd39-5b0c-11e9-8a48-525400eac274/tasks
  • The pause container is crio-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a
  • But Kata shows a whole bunch of tasks under the pause container's conmon cgroup, which does not make sense
  • There is also a whole bunch of tasks under each container cgroup, which again does not make sense
  • The expectation was that all of the tasks would be under the pause container
kata$ for i in `ls pod*/**/tasks`; do echo $i && cat $i; done
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/tasks
24985
24986
24987
24988
24989
24990
24992
24993
24994
24995
24996
24997
24998
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/tasks
25183
25185
25186
25187
25188
25189
25190
25191
25192
25193
25194
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/tasks
24964
24966
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/tasks
25090
25092
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/tasks
7830
19505
24584
24586
24602
24603
24604
24607
24608
24609
24610
24611
24612
24613
24614
24615
24616
25025
27555
27556
31294
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/tasks
24605
24639
24641
24642
24644
24645
24646
24648
24649
24650
24651
24652
24979
24980
25105
25106
25107
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/tasks

kata$ for i in `ls pod*/**/tasks`; do echo $i && for j in `cat $i`; do ps auxw | grep $j; done; done;
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/tasks
root     13041  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24985
root     24985  0.0  0.2 1004980 19352 ?       Sl   21:13   0:00 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -container 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956 -exec-id 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956
root     13043  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24986
root     13045  0.0  0.0   6360   976 pts/0    S+   21:40   0:00 grep 24987
root     13047  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24988
root     13049  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24989
root     13051  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24990
root     13053  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24992
root     13055  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24993
root     13057  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 24994
root     13059  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24995
root     13061  0.0  0.0   6360   980 pts/0    S+   21:40   0:00 grep 24996
root     13063  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24997
root     13065  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24998
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/tasks
root     13068  0.0  0.0   6360   920 pts/0    S+   21:40   0:00 grep 25183
root     25183  0.0  0.2 858668 21992 ?        Sl   21:13   0:00 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -container 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be -exec-id 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be
root     13070  0.0  0.0   6360   972 pts/0    S+   21:40   0:00 grep 25185
root     13072  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 25186
root     13074  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 25187
root     13076  0.0  0.0   6360   904 pts/0    S+   21:40   0:00 grep 25188
root     13078  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 25189
root     13080  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 25190
root     13082  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 25191
root     13084  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 25192
root     13086  0.0  0.0   6360   920 pts/0    S+   21:40   0:00 grep 25193
root     13088  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 25194
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/tasks
root     13091  0.0  0.0   6360   976 pts/0    S+   21:40   0:00 grep 24964
root     24964  0.0  0.0  78328  2008 ?        Ssl  21:13   0:00 /usr/libexec/crio/conmon --syslog -c 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956 -u 2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956 -r /opt/kata/bin/kata-qemu -b /var/run/containers/storage/overlay-containers/2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/userdata -p /var/run/containers/storage/overlay-containers/2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/userdata/pidfile -l /var/log/pods/default_guar-2kc_5884dc6c-5b0c-11e9-90bc-525400cfa589/busybee/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root     13093  0.0  0.0   6360   900 pts/0    S+   21:40   0:00 grep 24966
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/tasks
root     13096  0.0  0.0   6360   980 pts/0    S+   21:40   0:00 grep 25090
root     25090  0.0  0.0  78328  2008 ?        Ssl  21:13   0:00 /usr/libexec/crio/conmon --syslog -c 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be -u 5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be -r /opt/kata/bin/kata-qemu -b /var/run/containers/storage/overlay-containers/5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/userdata -p /var/run/containers/storage/overlay-containers/5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/userdata/pidfile -l /var/log/pods/default_guar-2kc_5884dc6c-5b0c-11e9-90bc-525400cfa589/busybum/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root     13098  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 25092
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/tasks
root      2846  0.0  0.0  78328  2020 ?        Ssl  21:06   0:00 /usr/libexec/crio/conmon --syslog -c 0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35 -u 0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35 -r /usr/bin/runc -b /var/run/containers/storage/overlay-containers/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35/userdata -p /var/run/containers/storage/overlay-containers/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35/userdata/pidfile -l /var/log/pods/kube-system_etcd-clr-01_af3e4a507ec0af8c2233ee5bf0783073/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root      3034  0.0  0.0  78328  2020 ?        Ssl  21:06   0:00 /usr/libexec/crio/conmon --syslog -c 96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce -u 96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce -r /usr/bin/runc -b /var/run/containers/storage/overlay-containers/96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce/userdata -p /var/run/containers/storage/overlay-containers/96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce/userdata/pidfile -l /var/log/pods/kube-system_etcd-clr-01_af3e4a507ec0af8c2233ee5bf0783073/etcd/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root     13101  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 7830
root     13103  0.0  0.0   6360   920 pts/0    S+   21:40   0:00 grep 19505
root     13105  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24584
root     24584  0.0  0.0  78328   172 ?        Ssl  21:13   0:00 /usr/libexec/crio/conmon --syslog -c f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -u f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -r /opt/kata/bin/kata-qemu -b /var/run/containers/storage/overlay-containers/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/userdata -p /var/run/containers/storage/overlay-containers/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/userdata/pidfile -l /var/log/pods/default_guar-2kc_5884dc6c-5b0c-11e9-90bc-525400cfa589/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root     13107  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24586
root     13109  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24602
root     24602  100  2.7 3590552 226136 ?      Sl   21:13  26:33 /opt/kata/bin/qemu-system-x86_64 -name sandbox-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -uuid ada3582a-9766-4030-82e7-95427d95ad17 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host,pmu=off -qmp unix:/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/qmp.sock,server,nowait -m 2048M,slots=10,maxmem=8992M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/opt/kata/share/kata-containers/kata-containers-image_clearlinux_1.6.1_agent_992b4987a32.img,size=134217728 -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/kata.sock,server,nowait -device virtio-9p-pci,disable-modern=true,fsdev=extra-9p-kataShared,mount_tag=kataShared,romfile= -fsdev local,id=extra-9p-kataShared,path=/run/kata-containers/shared/sandboxes/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a,security_model=none -netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 -device driver=virtio-net-pci,netdev=network-0,mac=b2:78:0b:80:8b:a2,disable-modern=true,mq=on,vectors=4,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -kernel /opt/kata/share/kata-containers/vmlinuz-4.19.28-31 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 
i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=8 init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket systemd.mask=systemd-journald.service systemd.mask=systemd-journald.socket systemd.mask=systemd-journal-flush.service systemd.mask=systemd-udevd.service systemd.mask=systemd-udevd.socket systemd.mask=systemd-udev-trigger.service systemd.mask=systemd-timesyncd.service systemd.mask=systemd-update-utmp.service systemd.mask=systemd-tmpfiles-setup.service systemd.mask=systemd-tmpfiles-cleanup.service systemd.mask=systemd-tmpfiles-cleanup.timer systemd.mask=tmp.mount -pidfile /run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/pid -smp 1,cores=1,threads=1,sockets=1,maxcpus=8
root     24604  0.0  0.0      0     0 ?        S    21:13   0:00 [vhost-24602]
root     24606  0.0  0.0      0     0 ?        S    21:13   0:00 [kvm-pit/24602]
root     13111  0.0  0.0   6360   920 pts/0    S+   21:40   0:00 grep 24603
root     13113  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24604
root     24604  0.0  0.0      0     0 ?        S    21:13   0:00 [vhost-24602]
root     13115  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 24607
root     24607  0.0  0.1 1215688 15420 ?       Sl   21:13   0:01 /opt/kata/libexec/kata-containers/kata-proxy -listen-socket unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -mux-socket /run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/kata.sock -sandbox f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a
root     13117  0.0  0.0   6360   904 pts/0    S+   21:40   0:00 grep 24608
root     13119  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24609
root     13121  0.0  0.0   6360   968 pts/0    S+   21:40   0:00 grep 24610
root     13123  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24611
root     13125  0.0  0.0   6360   984 pts/0    S+   21:40   0:00 grep 24612
root     13127  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 24613
root     13129  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 24614
root     13131  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24615
root     13133  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24616
root     13135  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 25025
root     13137  0.0  0.0   6360   908 pts/0    S+   21:40   0:00 grep 27555
root     13139  0.0  0.0   6360   920 pts/0    S+   21:40   0:00 grep 27556
root     13141  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 31294
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/tasks
root     13144  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24605
root     13146  0.0  0.0   6360   984 pts/0    S+   21:40   0:00 grep 24639
root     24639  0.0  0.2 858668 22160 ?        Sl   21:13   0:00 /opt/kata/libexec/kata-containers/kata-shim -agent unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -container f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -exec-id f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a
root     13148  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24641
root     13150  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24642
root     13152  0.0  0.0   6360   972 pts/0    S+   21:40   0:00 grep 24644
root     13154  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24645
root     13156  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24646
root     13158  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 24648
root     13160  0.0  0.0   6360   920 pts/0    S+   21:40   0:00 grep 24649
root     13162  0.0  0.0   6360   976 pts/0    S+   21:40   0:00 grep 24650
root     13164  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 24651
root     13166  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 24652
root     13168  0.0  0.0   6360   856 pts/0    S+   21:40   0:00 grep 24979
root     13170  0.0  0.0   6360   920 pts/0    S+   21:40   0:00 grep 24980
root     13172  0.0  0.0   6360   916 pts/0    S+   21:40   0:00 grep 25105
root     13174  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 25106
root     13176  0.0  0.0   6360   852 pts/0    S+   21:40   0:00 grep 25107
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/tasks
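Most of the bare "grep NNNNN" lines above are the grep process matching its own command line. The deeper reason so few PIDs resolve is that a cgroup v1 tasks file lists thread IDs (TIDs), not just process IDs, and plain ps aux only shows thread-group leaders, so QEMU vCPU threads and the Go runtime threads of the shims never match. A sketch of a more reliable lookup via /proc (the tid_comm helper name is made up here):

```shell
# Resolve one entry of a cgroup v1 `tasks` file (a thread ID) to its
# command name. /proc/<tid> is reachable by direct path even for threads
# that `ps aux` does not list.
tid_comm() {
  if [ -r "/proc/$1/comm" ]; then
    printf '%s %s\n' "$1" "$(cat "/proc/$1/comm")"
  else
    printf '%s (exited)\n' "$1"
  fi
}

# e.g.: for t in `cat pod*/crio-*/tasks`; do tid_comm "$t"; done
```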


clear@clr-01 ~/clr-k8s-examples $ kubectl describe po guar-2kc | grep "Container ID" -B 1
  busybee:
    Container ID:  cri-o://2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956
--
  busybum:
    Container ID:  cri-o://5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be


mcastelino commented Apr 9, 2019

  • CPU shares are a relative weight against sibling cgroups, i.e. a percentage of the parent
  • Hence it is impossible to set up the pause container to reflect the sum of the containers' cpu shares
  • So placing the VMM in the pause container is not the right choice
  • But as the other containers are mostly idle it does not matter much today (i.e. only the shim would live there if done correctly, which we do not do :()
  • However the correct behavior is to set all the other containers to 2, as Kubernetes tends to do
  • The pause container's shares should be equal to the parent's shares, as Kubernetes tends to do
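For reference, the per-container values in the dumps follow from how Kubernetes converts a CPU quantity into cpu.shares, roughly milliCPU × 1024 / 1000 (i.e. cores × 1024). A quick check against the pod spec (cpu_to_shares is a throwaway helper, not a real tool):

```shell
# Kubernetes derives cpu.shares from a CPU quantity as milliCPU * 1024 / 1000.
cpu_to_shares() { echo $(( $1 * 1024 / 1000 )); }

cpu_to_shares 2000   # busybee, cpu: 2     -> 2048
cpu_to_shares 3000   # busybum, cpu: 3     -> 3072
cpu_to_shares 5000   # pod parent (2 + 3)  -> 5120
```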
kata$ for i in `ls pod*/**/cpu.shares`; do echo $i && cat $i; done;
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/cpu.shares
5120
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/cpu.shares
2048
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/cpu.shares
3072
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/cpu.shares
1024
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/cpu.shares
1024
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/cpu.shares
1024
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/cpu.shares
3072
kata$




─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
runc$ for i in `ls pod*/**/cpu.shares`; do echo $i && cat $i; done;
pod3b75cd39-5b0c-11e9-8a48-525400eac274/cpu.shares
5120
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/cpu.shares
2048
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/cpu.shares
3072
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/cpu.shares
2
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/cpu.shares
1024
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/cpu.shares
1024
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/cpu.shares
1024
runc$


Here again tasks are getting added to the wrong cgroup, i.e. the pause container's conmon cgroup

kata$ for i in `cat pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/tasks`; do ps auxw | grep $i; done;
root      2846  0.0  0.0  78328  2020 ?        Ssl  21:06   0:00 /usr/libexec/crio/conmon --syslog -c 0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35 -u 0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35 -r /usr/bin/runc -b /var/run/containers/storage/overlay-containers/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35/userdata -p /var/run/containers/storage/overlay-containers/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35/userdata/pidfile -l /var/log/pods/kube-system_etcd-clr-01_af3e4a507ec0af8c2233ee5bf0783073/0df3ec69da862320d2b8947aa2481f92de3274f1fb3cffd8594a26d1e6627b35.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root      3034  0.0  0.0  78328  2020 ?        Ssl  21:06   0:00 /usr/libexec/crio/conmon --syslog -c 96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce -u 96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce -r /usr/bin/runc -b /var/run/containers/storage/overlay-containers/96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce/userdata -p /var/run/containers/storage/overlay-containers/96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce/userdata/pidfile -l /var/log/pods/kube-system_etcd-clr-01_af3e4a507ec0af8c2233ee5bf0783073/etcd/0.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root     26919  0.0  0.0   6360   852 pts/0    S+   22:04   0:00 grep 7830
root     26921  0.0  0.0   6360   976 pts/0    S+   22:04   0:00 grep 19505
root     24584  0.0  0.0  78328   172 ?        Ssl  21:13   0:00 /usr/libexec/crio/conmon --syslog -c f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -u f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -r /opt/kata/bin/kata-qemu -b /var/run/containers/storage/overlay-containers/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/userdata -p /var/run/containers/storage/overlay-containers/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/userdata/pidfile -l /var/log/pods/default_guar-2kc_5884dc6c-5b0c-11e9-90bc-525400cfa589/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a.log --exit-dir /var/run/crio/exits --socket-dir-path /var/run/crio --log-level error
root     26923  0.0  0.0   6492   852 pts/0    S+   22:04   0:00 grep 24584
root     26925  0.0  0.0   6360   856 pts/0    S+   22:04   0:00 grep 24586
root     24602  100  2.7 3590552 228184 ?      Sl   21:13  50:32 /opt/kata/bin/qemu-system-x86_64 -name sandbox-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a -uuid ada3582a-9766-4030-82e7-95427d95ad17 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host,pmu=off -qmp unix:/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/qmp.sock,server,nowait -m 2048M,slots=10,maxmem=8992M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/opt/kata/share/kata-containers/kata-containers-image_clearlinux_1.6.1_agent_992b4987a32.img,size=134217728 -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/kata.sock,server,nowait -device virtio-9p-pci,disable-modern=true,fsdev=extra-9p-kataShared,mount_tag=kataShared,romfile= -fsdev local,id=extra-9p-kataShared,path=/run/kata-containers/shared/sandboxes/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a,security_model=none -netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 -device driver=virtio-net-pci,netdev=network-0,mac=b2:78:0b:80:8b:a2,disable-modern=true,mq=on,vectors=4,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -kernel /opt/kata/share/kata-containers/vmlinuz-4.19.28-31 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 
i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests
net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=8 init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket systemd.mask=systemd-journald.service systemd.mask=systemd-journald.socket systemd.mask=systemd-journal-flush.service systemd.mask=systemd-udevd.service systemd.mask=systemd-udevd.socket systemd.mask=systemd-udev-trigger.service systemd.mask=systemd-timesyncd.service systemd.mask=systemd-update-utmp.service systemd.mask=systemd-tmpfiles-setup.service systemd.mask=systemd-tmpfiles-cleanup.service systemd.mask=systemd-tmpfiles-cleanup.timer systemd.mask=tmp.mount -pidfile /run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/pid -smp 1,cores=1,threads=1,sockets=1,maxcpus=8
root     24604  0.0  0.0      0     0 ?        S    21:13   0:00 [vhost-24602]
root     24606  0.0  0.0      0     0 ?        S    21:13   0:00 [kvm-pit/24602]
root     26927  0.0  0.0   6492   852 pts/0    S+   22:04   0:00 grep 24602
root     26929  0.0  0.0   6360   852 pts/0    S+   22:04   0:00 grep 24603
root     24604  0.0  0.0      0     0 ?        S    21:13   0:00 [vhost-24602]
root     26932  0.0  0.0   6492   916 pts/0    S+   22:04   0:00 grep 24604
root     24607  0.1  0.1 1215688 15388 ?       Sl   21:13   0:03 /opt/kata/libexec/kata-containers/kata-proxy -listen-socket unix:///run/vc/sbs/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/proxy.sock -mux-socket /run/vc/vm/f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/kata.sock -sandbox f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a
root     26934  0.0  0.0   6492   920 pts/0    S+   22:04   0:00 grep 24607
root     26936  0.0  0.0   6360   916 pts/0    S+   22:04   0:00 grep 24608
root     26938  0.0  0.0   6360   908 pts/0    S+   22:04   0:00 grep 24609
root     26940  0.0  0.0   6360   852 pts/0    S+   22:04   0:00 grep 24610
root     26942  0.0  0.0   6360   852 pts/0    S+   22:04   0:00 grep 24611
root     26944  0.0  0.0   6360   968 pts/0    S+   22:04   0:00 grep 24612
root     26946  0.0  0.0   6360   852 pts/0    S+   22:04   0:00 grep 24613
root     26948  0.0  0.0   6360   852 pts/0    S+   22:04   0:00 grep 24614
root     26950  0.0  0.0   6360   916 pts/0    S+   22:04   0:00 grep 24615
root     26952  0.0  0.0   6360   916 pts/0    S+   22:04   0:00 grep 24616
root     26954  0.0  0.0   6360   856 pts/0    S+   22:04   0:00 grep 25025
root     26956  0.0  0.0   6360   916 pts/0    S+   22:04   0:00 grep 27555
root     26958  0.0  0.0   6360   916 pts/0    S+   22:04   0:00 grep 27556
root     26960  0.0  0.0   6360   856 pts/0    S+   22:04   0:00 grep 31294

@mcastelino
Copy link
Author

What are all the zombies?

Are they the kata-runtime OCI calls invoked by conmon?
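A generic way to check (not Kata-specific): list defunct processes along with their parent PID; if the parent is conmon, they are likely short-lived kata-runtime invocations that have not been reaped yet.

```shell
# Print the header plus any zombie (state Z) processes with their parent
# PID and command, so the responsible reaper (e.g. conmon) can be spotted.
ps -eo pid,ppid,stat,comm | awk 'NR == 1 || $3 ~ /^Z/'
```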


What about memory? Kata pods run wild: no host-side memory limit is applied to any of their cgroups.

  • We should probably place QEMU under the pod cgroup
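A minimal sketch of that idea, assuming cgroup v1 and a known QEMU PID (the helper name is hypothetical). Writing a PID to cgroup.procs migrates the whole thread group, vCPU and vhost threads included, which per-TID writes to the tasks file would not:

```shell
# Hypothetical helper: move a process (all of its threads) into the
# pod-level cgroup so the pod's memory.limit_in_bytes applies to the VMM.
move_to_pod_cgroup() {
  local pid=$1 pod_cg=$2
  echo "$pid" > "$pod_cg/cgroup.procs"
}

# e.g. for this pod:
# move_to_pod_cgroup 24602 \
#   /sys/fs/cgroup/memory/kubepods/pod5884dc6c-5b0c-11e9-90bc-525400cfa589
```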
kata$ for i in `ls pod*/**/memory.limit_in_bytes`; do echo $i && cat $i; done;
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/memory.limit_in_bytes
9223372036854771712
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/memory.limit_in_bytes
9223372036854771712
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956/memory.limit_in_bytes
9223372036854771712
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be/memory.limit_in_bytes
9223372036854771712
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-conmon-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/memory.limit_in_bytes
9223372036854771712
pod5884dc6c-5b0c-11e9-90bc-525400cfa589/crio-f105771f71cfaeed86175bc2bc10f9925c75d12b749231716dfe9f86e640ff0a/memory.limit_in_bytes
9223372036854771712
kata$



─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
runc$ cd /sys/fs/cgroup/memory/kubepods/
runc$ for i in `ls pod*/**/memory.limit_in_bytes`; do echo $i && cat $i; done;
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/memory.limit_in_bytes
419430400
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/memory.limit_in_bytes
209715200
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/memory.limit_in_bytes
9223372036854771712
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-69ec0ae8491657436bc591fdf4715f2de5e6a0a3e5e1d8be1c0d9484bfbb8d8b/memory.limit_in_bytes
9223372036854771712
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-a5d317a244c87b3d6c718fe28f64aa1d6fc4e97ba47633d74bab16587bd1b5ac/memory.limit_in_bytes
9223372036854771712
pod3b75cd39-5b0c-11e9-8a48-525400eac274/crio-conmon-bade5c0a9030dab147e4c66ae68cfddcf8c996dbc38451396c11391090357ae9/memory.limit_in_bytes
9223372036854771712
pod3b75cd39-5b0c-11e9-8a48-525400eac274/memory.limit_in_bytes
629145600
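Note that the runc pod-level limit is simply the sum of the per-container limits (400 MiB + 200 MiB = 600 MiB = 629145600 bytes). This is the value Kata would need to mirror if QEMU were placed under the pod cgroup. A minimal sketch of that arithmetic (container names from the pod spec above):

```go
package main

import "fmt"

func main() {
	const mi = 1024 * 1024
	// Per-container memory limits from the pod spec: busybee (400Mi), busybum (200Mi).
	containerLimits := []int64{400 * mi, 200 * mi}
	var podLimit int64
	for _, l := range containerLimits {
		podLimit += l
	}
	fmt.Println(podLimit) // 629145600, matching the runc pod cgroup
}
```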

@mcastelino

But stats are not working for the Kata pod:

clear@kata ~/clr-k8s-examples $ kubectl top pod
W0409 22:45:10.077232    8078 top_pod.go:259] Metrics not available for pod default/guar-2kc, age: 1h31m31.077220927s
error: Metrics not available for pod default/guar-2kc, age: 1h31m31.077220927s

clear@kata ~/clr-k8s-examples $ sudo crictl stats
CONTAINER           CPU %               MEM                 DISK                INODES
23df1b6ce8b36       0.26                17.32MB             24.58kB             7
425a34b76ccf4       0.00                12.14MB             53.25kB             14
70f28a1850484       0.03                14.24MB             48.93kB             16
8fb1525823c40       0.00                314.1MB             378.9kB             149
96eb687b2c446       1.27                54.77MB             49.15kB             15
9858ff7a9416f       0.20                15.18MB             45.06kB             14
a23c753b6d5bc       0.00                73.75MB             65.54kB             18
a53b2d30192cc       0.00                14.35MB             53.34kB             16
ac6249a8ada16       2.19                264.1MB             36.86kB             9
b71df30c4f400       0.22                15.2MB              45.06kB             14
c8f663ca46956       0.84                51.7MB              61.44kB             16
f08912f063eb0       0.00                167.9kB             80.17kB             29
clear@kata ~ $ kubectl describe po --all-namespaces | grep "Container ID:"
    Container ID:  cri-o://2cc1e6e2ae40b7c94dac72d68c1fff6b6d9e8058f8e26c4bd5e03ac9318b3956
    Container ID:  cri-o://5be201403ea55bb4d5cb8de2904bfb7f4251a5fafce45886ae639841fd2833be
    Container ID:   cri-o://f08912f063eb085c82f0d4628f975e3129d81802228856fe0eff6f3f2c56e9b3
    Container ID:  cri-o://a23c753b6d5bc562f880e58bc4665a16499d51fdb803e57e9bad99aec64524d2
    Container ID:  cri-o://a53b2d30192cc4b7b61e84b5c4f95c44272dd3892a947fc419d3a5c42d47fefb
    Container ID:  cri-o://b71df30c4f40073cbbbea7ca68832c58104b734115f98add2e5e37699861d60d
    Container ID:  cri-o://9858ff7a9416f6d4a7492417ad31fa6294fb989fc29356a9aec8c9d918b9a0b3
    Container ID:  cri-o://96eb687b2c4467fb893ef300c6ff3cf66a57ef92d0c464f4e71bf4e4718a31ce
    Container ID:  cri-o://8fb1525823c40758150a61a43b8b3a389f0632e2bbe5c894e8f4ce5c5f19a631
    Container ID:  cri-o://ac6249a8ada164050230fac7eb2c4c97d4ae036838f5fde536468dc6ee0456c9
    Container ID:  cri-o://c8f663ca469564d22895abc93fbcd71c8d3df4119ada77dcac0d43ad3f7ec444
    Container ID:  cri-o://425a34b76ccf401b4aaa56f4b05ba2fa3b960ba0c603490395acfc1b7bc0bec4
    Container ID:  cri-o://23df1b6ce8b360c2b50c95dfa39c39cad6b3b0d5be763732c7c5ad377d34dc23
    Container ID:  cri-o://70f28a18504849b0a5b603dc3762a8f19934a4c084e16eaa840a3968f56d45d8

@mcastelino

cri-o 1.13 will not work, because its stats path is hardcoded to runc's state directory and can only load runc containers:

https://sourcegraph.com/github.com/cri-o/cri-o@release-1.13/-/blob/lib/container_server_linux.go#L17:27

// libcontainerStats gets the stats for the container with the given id from runc/libcontainer
func (c *ContainerServer) libcontainerStats(ctr *oci.Container) (*libcontainer.Stats, error) {
	// TODO: make this not hardcoded
	// was: c.runtime.Path(ociContainer) but that returns /usr/bin/runc - how do we get /run/runc?
	// runroot is /var/run/runc
	// Hardcoding probably breaks ClearContainers compatibility
	factory, err := loadFactory("/run/runc")
	if err != nil {
		return nil, err
	}
	container, err := factory.Load(ctr.ID())
	if err != nil {
		return nil, err
	}
	return container.Stats()
}
