-
Let’s run a test container. It runs an application that listens on a given port, but that’s not important for now:
podman run -d --rm --name reversewords-test quay.io/mavazque/reversewords:latest
-
We can always get capabilities for a process by querying the /proc filesystem:
# Get container's PID
CONTAINER_PID=$(podman inspect reversewords-test --format {{.State.Pid}})
# Get caps for a given PID
grep Cap /proc/${CONTAINER_PID}/status
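The `grep` output uses short field names. As a small sketch, the helper below (the name `label_cap_sets` is hypothetical, not part of any tool) reads the status text on stdin and prints each capability field with the name of the set it represents:

```shell
# label_cap_sets: read /proc/<pid>/status text on stdin and print each
# capability field together with the name of the set it holds.
# Hypothetical helper, shown only for illustration.
label_cap_sets() {
  while read -r key value; do
    case "$key" in
      CapInh:) echo "inheritable $value" ;;
      CapPrm:) echo "permitted $value" ;;
      CapEff:) echo "effective $value" ;;
      CapBnd:) echo "bounding $value" ;;
      CapAmb:) echo "ambient $value" ;;
    esac
  done
}

# Possible usage, assuming CONTAINER_PID is set as above:
# label_cap_sets < "/proc/${CONTAINER_PID}/status"
```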
-
We get the capability sets in hex format. We can decode them using the capsh tool:
capsh --decode=00000000800405fb
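To see what the decoding does, here is a rough sketch of it in plain shell. Each bit of the mask corresponds to one capability, numbered as in `linux/capability.h`. The function name `decode_caps` is hypothetical, and the output format (comma-separated names) is a choice made for this sketch, not necessarily what `capsh` prints:

```shell
# Capability names ordered by bit number (bit 0 first), as defined in
# linux/capability.h.
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid
setuid setpcap linux_immutable net_bind_service net_broadcast net_admin
net_raw ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace
sys_pacct sys_admin sys_boot sys_nice sys_resource sys_time sys_tty_config
mknod lease audit_write audit_control setfcap mac_override mac_admin syslog
wake_alarm block_suspend audit_read perfmon bpf checkpoint_restore"

# decode_caps HEXMASK: print the capabilities whose bits are set in HEXMASK
# (hex digits without the 0x prefix). Hypothetical helper for illustration.
decode_caps() (
  mask=$((0x$1))
  bit=0
  out=""
  for name in $CAP_NAMES; do
    # Test the bit that belongs to this capability.
    if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
      out="${out:+$out,}cap_$name"
    fi
    bit=$((bit + 1))
  done
  printf '%s\n' "$out"
)

decode_caps 00000000800405fb
```

For this mask the sketch lists eleven capabilities, NET_BIND_SERVICE (bit 10) among them.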
-
We can use podman inspect as well:
podman inspect reversewords-test --format {{.EffectiveCaps}}
-
Stop the container:
podman stop reversewords-test
-
Run our test container with a root UID and get its capabilities:
podman run --rm -it --user 0 --entrypoint /bin/bash --name reversewords-test quay.io/mavazque/reversewords:ubi8
grep Cap /proc/1/status
-
We can see that the thread's permitted and effective capability sets are populated. Let's decode them:
capsh --decode=00000000800405fb
-
Exit the container:
exit
-
Same test but running the container with a nonroot UID:
podman run --rm -it --user 1024 --entrypoint /bin/bash --name reversewords-test quay.io/mavazque/reversewords:ubi8
grep Cap /proc/1/status
-
We can see that the thread's permitted and effective capability sets are cleared. We can exit our container now:
exit
-
We can request extra capabilities, and those will be assigned to the corresponding sets:
podman run --rm -it --user 1024 --cap-add=cap_net_bind_service --entrypoint /bin/bash --name reversewords-test quay.io/mavazque/reversewords:ubi8
grep Cap /proc/1/status
-
Since Podman supports ambient capabilities, you can see how we got the NET_BIND_SERVICE cap into the ambient, permitted and effective sets.
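Reading the hex masks by eye is error-prone, so a small helper can report which sets actually contain a given capability. This is only a sketch: the function name `sets_with_cap` is hypothetical, and it relies on NET_BIND_SERVICE being bit 10 in `linux/capability.h`:

```shell
# sets_with_cap BIT: read /proc/<pid>/status text on stdin and print the
# field names (CapInh, CapPrm, CapEff, CapBnd, CapAmb) whose mask has
# bit number BIT set. Hypothetical helper for illustration.
sets_with_cap() (
  bit=$1
  while read -r key value; do
    case "$key" in
      CapInh:|CapPrm:|CapEff:|CapBnd:|CapAmb:)
        if [ $(( (0x$value >> $bit) & 1 )) -eq 1 ]; then
          printf '%s\n' "${key%:}"
        fi
        ;;
    esac
  done
)

# Possible usage inside the container (10 is the NET_BIND_SERVICE bit):
# sets_with_cap 10 < /proc/1/status
```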
-
We can exit the container now:
exit
-
We can control which port our application listens on by using the APP_PORT environment variable. Let’s try to run our application on a non-privileged port with a non-privileged user:
podman run --rm --user 1024 -e APP_PORT=8080 --name reversewords-test quay.io/mavazque/reversewords:ubi8
-
Stop the container with Ctrl+C and try to bind to port 80 this time:
podman run --rm --user 1024 -e APP_PORT=80 --name reversewords-test quay.io/mavazque/reversewords:ubi8
-
This time it fails. Remember that since we're running as nonroot, the permitted and effective capability sets were cleared, so the NET_BIND_SERVICE capability present in Podman's default capability set is not available.
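The difference between the two runs comes down to the port number. A minimal sketch of that rule follows, assuming the kernel's default threshold of 1024 (on Linux the `net.ipv4.ip_unprivileged_port_start` sysctl can change it); the function name `needs_net_bind_service` is hypothetical:

```shell
# needs_net_bind_service PORT [FIRST_UNPRIVILEGED_PORT]
# Succeeds when a non-root process would need NET_BIND_SERVICE to bind PORT.
# The second argument defaults to 1024, the usual kernel default (assumption:
# the sysctl has not been changed on the host).
needs_net_bind_service() {
  [ "$1" -lt "${2:-1024}" ]
}

for port in 80 8080; do
  if needs_net_bind_service "$port"; then
    echo "$port: NET_BIND_SERVICE required for a non-root process"
  else
    echo "$port: no capability required"
  fi
done
```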
-
We know that the NET_BIND_SERVICE capability allows unprivileged processes to bind to ports under 1024. Let’s assign this capability to the container and see what happens:
podman run --rm --user 1024 -e APP_PORT=80 --cap-add=cap_net_bind_service --name reversewords-test quay.io/mavazque/reversewords:ubi8
-
This time it worked because the NET_BIND_SERVICE cap was added to the ambient, permitted and effective sets.
-
You can stop the container using Ctrl+C.
-
We added the NET_BIND_SERVICE capability to our binary when we built the image:
setcap 'cap_net_bind_service+ep' /usr/bin/reverse-words
-
Let's take a look inside the container:
podman run --rm -it --entrypoint /bin/bash --user 1024 -e APP_PORT=80 --name reversewords-test quay.io/mavazque/reversewords-captest:latest
getcap /usr/bin/reverse-words
-
The capability is added to the effective and permitted file capability sets.
-
Let's review the thread capabilities:
grep Cap /proc/1/status
-
As you can see, the effective and permitted sets are cleared, but the inheritable and bounding sets do have NET_BIND_SERVICE.
-
Let's run our app:
/usr/bin/reverse-words &
-
We were able to bind to port 80. The binary had the file capability required to do that, and the capability was present in the inheritable and bounding sets, so the thread acquired it in its effective set. We can check the effective and permitted sets:
grep Cap /proc/<app_pid>/status
-
We can exit the container now.
exit
-
Does this mean that we can bypass thread capabilities? Let's see:
podman run --rm -it --entrypoint /bin/bash --user 1024 --cap-drop=all -e APP_PORT=80 --name reversewords-test quay.io/mavazque/reversewords-captest:latest
-
Check the container thread capabilities:
grep Cap /proc/1/status
-
All sets are zeroed. Let's try to run our app:
/usr/bin/reverse-words
-
The kernel blocked the execution because the NET_BIND_SERVICE capability could not be acquired.
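Both outcomes follow from the rule the kernel applies during `execve`, documented in capabilities(7): the new permitted set is `(P(inheritable) & F(inheritable)) | (F(permitted) & P(bounding))`, where P is the process and F is the file. The sketch below applies that rule to hex masks. It assumes a non-root process and a file whose effective bit is set (as with `+ep`); the function name `exec_with_file_caps` and the sample masks are illustrative, not captured output:

```shell
# exec_with_file_caps P_INH P_BND F_INH F_PERM
# Arguments are hex masks without the 0x prefix. Prints the new permitted
# mask, or "EPERM" (and fails) when the exec would be refused.
# Hypothetical helper; assumes the file's effective bit is set.
exec_with_file_caps() (
  p_inh=$((0x$1)); p_bnd=$((0x$2)); f_inh=$((0x$3)); f_perm=$((0x$4))
  # Ambient capabilities are cleared when the executed file carries file
  # capabilities, so they do not contribute to the new permitted set here.
  new_perm=$(( (p_inh & f_inh) | (f_perm & p_bnd) ))
  # With the effective bit set, execve fails with EPERM unless every
  # file-permitted capability ended up in the new permitted set.
  if [ $(( f_perm & ~new_perm )) -ne 0 ]; then
    echo "EPERM"
    return 1
  fi
  printf '%016x\n' "$new_perm"
)

# Default bounding set, file has cap_net_bind_service+ep (bit 10 = 0x400):
exec_with_file_caps 00000000800405fb 00000000800405fb 0 400
# All sets dropped with --cap-drop=all:
exec_with_file_caps 0 0 0 400 || true
```

In the first call the file's permitted capability survives the bounding-set mask, so the process gains it. In the second call the bounding set is empty, nothing survives, and the exec is refused.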
-
That answers the question: no. Now we can exit the container:
exit