From: Sung Pae <self[at]sungpae.com>
Date: Mon, 8 Nov 2021 16:59:48 -0600
To: security@docker.com
Subject: Permissive forwarding rule leads to unintentional exposure of containers to external hosts

Hello,
The documentation for "docker run --publish" states:
> Note that ports which are not bound to the host (i.e., -p 80:80 instead of
> -p 127.0.0.1:80:80) will be accessible from the outside. This also applies
> if you configured UFW to block this specific port, as Docker manages his own
> iptables rules.
https://docs.docker.com/engine/reference/commandline/run/#publish-or-expose-port--p---expose
The statement above is accurate, but terribly misleading, since traffic
to the container's published ports from external hosts will still be
forwarded due to an explicit forwarding rule added to the DOCKER chain:
# iptables -nvL DOCKER
Chain DOCKER (2 references)
 pkts bytes target  prot opt in        out      source     destination
    0     0 ACCEPT  tcp  --  !docker0  docker0  0.0.0.0/0  172.17.0.2  tcp dpt:80
An attacker that sends traffic to 172.17.0.2:80 *through* the docker
host will match the rule above and successfully connect to the
container, obviating any security benefit of binding the published port
on the host to 127.0.0.1.
What's worse, users who bind their published ports to 127.0.0.1 operate
under a false sense of security and may not bother taking further
precautions against unintentional exposure.
## Proof of Concept
Here is a simple proof of concept:
1. [VICTIM] Start a postgres container and publish its main port to
127.0.0.1 on the host.
victim@192.168.0.100# docker run -e POSTGRES_PASSWORD=password -p 127.0.0.1:5432:5432 postgres
2. [ATTACKER] Route all packets destined for 172.16.0.0/12 through the
victim's machine.
attacker@192.168.0.200# ip route add 172.16.0.0/12 via 192.168.0.100
3. [ATTACKER] Discover open ports on the victim's internal docker networks.
attacker@192.168.0.200# nmap -p5432 -Pn --open 172.16.0.0/12
Starting Nmap 7.92 ( https://nmap.org ) at 2021-11-05 15:00 CDT
Nmap scan report for 172.17.0.2
Host is up (0.00047s latency).
PORT STATE SERVICE
5432/tcp open postgresql
4. [ATTACKER] Connect to the victim's container.
attacker@192.168.0.200# psql -h 172.17.0.2 -U postgres
Password for user postgres:
## Scope of Exposure
Port publishing in docker and docker-compose is a popular way to expose
applications and databases to developers in a cross-platform development
environment.
Web searches for the pitfalls of "--publish", as well as discussions
with other developers, suggest that Docker users who are aware of the
security implications of port publishing also believe that specifying an
IP address to bind on the host will effectively constrain access to the
service they are attempting to share. This is a reasonable conclusion
that can be drawn from the documentation, but the reality is that simply
publishing a port exposes a container to external machines regardless of
the IP address bound on the host.
GitHub contains tens of thousands of projects that publish container
ports to "127.0.0.1:xxx:xxx":
* https://github.com/search?q=docker+run+%22-p+127.0.0.1%3A%22&type=code
* https://github.com/search?q=docker+run+%22--publish+127.0.0.1%3A%22&type=code
* https://github.com/search?p=5&q=%22127.0.0.1%3A5432%3A5432%22&type=Code
* https://github.com/search?q=%22127.0.0.1%3A15432%3A5432%22&type=code
* https://github.com/search?q=%22127.0.0.1%3A3306%3A3306%22&type=Code
* https://github.com/search?p=5&q=%22127.0.0.1%3A8080%3A80%22&type=Code
* And many more!
Here is a sampling of commit messages that specifically mention the
security rationale behind publishing to "127.0.0.1":
https://github.com/rubyforgood/abalone/commit/764a619babc7ac05fe9fe6edc63e9128a2c86af3
> Forward the "db" service's port to the host's loopback interface, so
> that a developer could choose to use docker-compose only for a container
> to run the database while running all the Ruby processes on their host
> computer. "127.0.0.1:5432:5432" was chosen over "5432:5432" so that the
> PostgreSQL would not be available to all other computers on the host
> computer's network (say, a coffee shop wifi).
https://github.com/MayankTahil/pref/commit/f3056408867a227e9ff6b338c51ef37d605f5dad
> [SECURITY] Limit port export to localhost
>
> It's prevents leak private developed projects vie Eth & Wi-Fi interfaces.
> You now must use `localhost` host or use host mapped directly to 127.0.0.1
https://github.com/open-edge-insights/eii-core/commit/7a85ab8ed818af73a83489554eb5737394a4cf0c
> Docker Security: Port mapping and default security options
>
> Changes:
>
> 1) Provide secuiry options in docker-compose file related to selinux and resticted privilages
> 2) Set HOST_IP as Environment Variable in Compose startup
> 2) Bind all ports to either 127.0.0.1 or Host IP
## Mitigation
While the unintentional exposure of published container ports can be
mitigated by constraining access to containers in the DOCKER-USER chain,
my observation is that most Linux users do not know how to configure
their firewalls and have not added any rules to DOCKER-USER. The few
users that do know how to configure their firewalls are likely to be
unpleasantly surprised that their existing FORWARD rules have been
preceded by Docker's own forwarding setup.
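For reference, a minimal DOCKER-USER rule of this kind might look as
follows (a sketch only; it assumes eth0 is the host's external
interface and that containers live on the default 172.17.0.0/16
bridge network):

    # Drop forwarded packets that arrive on the external interface and
    # target the bridge network, before Docker's ACCEPT rules can match.
    iptables -I DOCKER-USER -i eth0 -d 172.17.0.0/16 -j DROP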
In light of this, an effective mitigation should:
1. Restrict the source addresses and/or interfaces that are allowed to
communicate with the published container port.
For example, "docker run -p 127.0.0.1:5432:5432" creates the
following rule in the DOCKER chain:
Chain DOCKER (2 references)
 pkts bytes target  prot opt in        out      source     destination
    0     0 ACCEPT  tcp  --  !docker0  docker0  0.0.0.0/0  172.17.0.2  tcp dpt:5432
It should, however, restrict the source IP address range to
127.0.0.1/8 and the in-interface to the loopback interface:
Chain DOCKER (2 references)
 pkts bytes target  prot opt in  out      source       destination
    0     0 ACCEPT  tcp  --  lo  docker0  127.0.0.1/8  172.17.0.2  tcp dpt:5432
The values of "127.0.0.1/8" and "lo" can be retrieved from the
interface on which 127.0.0.1 is defined. For instance, if a machine
has an IP address of 192.168.0.100 on a /24 network on eth0 and the
user runs "docker run -p 192.168.0.100:5432:5432", we would expect to
see the following:
Chain DOCKER (2 references)
 pkts bytes target  prot opt in    out      source          destination
    0     0 ACCEPT  tcp  --  eth0  docker0  192.168.0.0/24  172.17.0.2  tcp dpt:5432
2. Default to "127.0.0.1" when a bind address is not supplied to "--publish".
This is a breaking change, but it should have been the default from
the beginning.
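Until such a change is made, users can approximate (2) today by
setting the daemon's default bind address in /etc/docker/daemon.json
(a real dockerd option; note that it only changes the default address
passed to --publish, and does not by itself fix the permissive
forwarding rule described above):

    {
      "ip": "127.0.0.1"
    }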
## Conclusion
Docker port publishing is an *extremely* popular feature, and at
present, virtually all users who run containers with published ports
are exposed to attackers who have noticed the oversight outlined in
this email.
I have not noticed any discussion online of attackers using custom
routes to gain access to containers, but it is an obvious attack, and,
perhaps unfortunately, I have already posted a comment about this
vulnerability in a related GitHub issue:
https://github.com/moby/moby/issues/22054#issuecomment-962202433
Thank you for your attention to this.
Sung Pae
https://github.com/guns
@tomryanx

This is awesome, but doesn't the proposed handling of docker run -p 192.168.0.100:5432:5432 in the Mitigation section go too far?

The established norm when binding to loopback is that the entire outside world cannot reach it.[1]

The established norm when binding to an external interface is that the entire outside world can reach it.[1]

Do I need a lesson in container networking, or does your proposal say that we'd expect only local-net addresses to be able to reach a deliberately-exposed port?

Footnotes

  1. Subject to firewalls etc., of course.

@pier4r commented Jun 23, 2022

Thank you for the information but I am not sure how critical this is.

This seems critical only if the host that runs the docker container (192.168.0.100 in your example) allows traffic from external machines to pass through. It is also only critical if one keeps the default docker network (which I wouldn't suggest) without further management of the DOCKER-USER firewall chain (where one can say "if packets come from disallowed sources, block them").

What is not self-evident to me (and thus makes the entire thing not that critical) is:

  • why should I route traffic from external machines while at the same time running docker containers;
  • why should I use the default docker network for my workload;
  • why shouldn't I set proper iptables rules in DOCKER-USER;

Also, even if I allow external machines through my firewall (because why not), the entire thing is solved by an iptables rule that states "if a packet coming from the external interface wants to reach the following network, drop it" (this can be added in nat prerouting). One may say "but in this way you kill outbound connections!" Nope: if a packet from outside the host wants to reach an internal network on the host (one that is not exposed externally), it can be dropped before any DNAT or other translation rules apply. Although @peterwwillis has a nicer solution for it.
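Roughly, something like this (a sketch only; eth0 stands in for the external interface, and the raw table is used here because it runs before any DNAT):

    # Drop packets entering on the external interface that target the
    # container address range, before any DNAT takes place.
    iptables -t raw -A PREROUTING -i eth0 -d 172.16.0.0/12 -j DROP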

@guns (Author) commented Jun 24, 2022

@tomryanx

> The established norm when binding to an external interface is that the entire outside world can reach it

This is only true for a host on the Internet with a single "external" interface. A machine can join multiple networks, and an application can choose to listen for connections on a subset of those networks that are not reachable from the Internet.

For example, a typical home router has an IP address on the Internet (e.g. 93.184.216.34), and it also has an IP address on the local network it creates (e.g. 192.168.0.1/24). A user that starts an application on this router and binds it to 192.168.0.1:1234 can reasonably expect that only other machines on the 192.168.0.0/24 network will be able to connect to the service.

The conventional way to ask an application to accept connections from all hosts is to specify the special bind address 0.0.0.0. You can search for INADDR_ANY to read up on this topic.
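As a quick illustration (a sketch; netcat flag syntax varies between implementations):

    nc -l 192.168.0.1 1234   # accepts connections only on 192.168.0.1
    nc -l 0.0.0.0 1234       # accepts connections on all addresses (INADDR_ANY)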

@guns (Author) commented Jun 24, 2022

@pier4r

> why should I route traffic from external machines while at the same time running docker containers;

The trouble here is that even if you start with an empty FORWARD chain with the policy set to DROP (i.e. block all forwarding attempts in either direction), dockerd inserts its own rules into the FORWARD chain that explicitly allow external machines to access your containers. This behavior is dangerous because it defeats a previously secure firewall setup.
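This is straightforward to observe (a sketch; it assumes an iptables-based firewall and a systemd host):

    iptables -P FORWARD DROP   # start from a deny-all forwarding policy
    systemctl restart docker   # dockerd re-inserts its chains on startup
    iptables -nvL FORWARD      # jumps to DOCKER-USER and DOCKER now sit at
                               # the top, ahead of any pre-existing rules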

> why should I use the default docker network for my workload;

This issue is unrelated to the internal docker network. In fact, I stumbled on this exposure while working on a container on a custom network.

> why shouldn't I set proper iptables rules in DOCKER-USER;

You should. I have the following iptables rules on my work machine:

-A DOCKER-USER -o br-+    -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound  -j RETURN
-A DOCKER-USER -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound  -j RETURN
-A DOCKER-USER -i br-+                                               -m comment --comment DOCKER-outbound -j RETURN
-A DOCKER-USER -i docker0                                            -m comment --comment DOCKER-outbound -j RETURN
-A DOCKER-USER                                                                                            -j REJECT

https://github.com/guns/haus/blob/master/share/iptables/iptables.script#L151-L158

The problem is that approximately dozens of users worldwide know that this is necessary when using --publish 127.0.0.1:port:port. A default behavior that violates assumptions about binding to 127.0.0.1 and undermines secure firewalls is an issue that requires fixing even if it can be mitigated by the user.

@tomryanx

> @tomryanx
>
> > The established norm when binding to an external interface is that the entire outside world can reach it
>
> This is only true for a host on the Internet with a single "external" interface. A machine can join multiple networks, and an application can choose to listen for connections on a subset of those networks that are not reachable from the Internet.
>
> For example, a typical home router has an IP address on the Internet (e.g. 93.184.216.34), and it also has an IP address on the local network it creates (e.g. 192.168.0.1/24). A user that starts an application on this router and binds it to 192.168.0.1:1234 can reasonably expect that only other machines on the 192.168.0.0/24 network will be able to connect to the service.

I think you are confusing unroutable addresses with what happens when specific addresses/interfaces are bound. Having multiple interfaces or addresses has nothing to do with it.

I suggest you read up on how IP (Internet Protocol) routing works in its simplest form. It might be good to start with RFC1918 because it covers in detail the address spaces of which your example 192.168.0.0/24 is a subset. There is no presumption that only link-local clients are allowed to connect.

In your example, the reason the home network reasonably expects the public internet not to be able to reach a listening server is that RFC1918 addresses are not routable from the public non-1918 internet.

If you had other networks - for example a 192.168.1.0/24 network - on the inside of your external router, and working routes (either explicit or default) between those networks, and you had a listener at 192.168.0.1, you would need a firewall to intercept traffic to prevent it being accessible from 192.168.1.0/24.

I hope this ramble helped ;)

@guns (Author) commented Jun 24, 2022

@tomryanx

> It might be good to start with RFC1918 because it covers in detail the address spaces of which your example 192.168.0.0/24 is a subset. There is no presumption that only link-local clients are allowed to connect.

I chose to illustrate with a router because of your comment about the "entire outside world", but it appears this has confused the discussion. My point is that a machine can belong to two different networks whose traffic can be isolated from each other, and therefore binding to one IP address on one network versus binding to all IP addresses on all networks has some utility. It just happens that in my example packets bound for private addresses are not allowed on public networks.

> If you had other networks - for example a 192.168.1.0/24 network - on the inside of your external router, and working routes (either explicit or default) between those networks, and you had a listener at 192.168.0.1, you would need a firewall to intercept traffic to prevent it being accessible from 192.168.1.0/24.

Yes, you're correct. In this example you'd need a firewall rule to restrict access to hosts in 192.168.0.0/24.

So maybe you can clarify your statement:

> The established norm when binding to an external interface is that the entire outside world can reach it

If my machine joins two networks and I properly isolate traffic from each, my expectation when binding to one IP address is that only traffic from that associated network will reach the listening socket. Is this untrue?

This discussion relates to --publish in that restricting --publish 192.168.0.1:1234:1234 to the interface and network to which that IP address belongs mirrors the equivalent setup required to bind and isolate a listening socket on 192.168.0.1:1234.

You asked if this is going too far, and I believe it is not, because a user that wants to accept traffic from all hosts can use --publish 0.0.0.0:1234:1234.

@pier4r commented Jun 24, 2022

@guns

> The trouble here is that even if you start with an empty FORWARD chain with the policy set to DROP (i.e. block all forwarding attempts in either direction), dockerd inserts its own rules into the FORWARD chain that explicitly allow external machines to access your containers. This behavior is dangerous because it defeats a previously secure firewall setup.

Ok, then I'll need to test this. I can see how it could happen, but if I am not mistaken the Docker-added rules don't come early enough. I'll have to check.

@peterwwillis commented Jun 24, 2022

> The problem is that approximately dozens of users worldwide know that this is necessary when using --publish 127.0.0.1:port:port. A default behavior that violates assumptions about binding to 127.0.0.1 and undermines secure firewalls is an issue that requires fixing even if it can be mitigated by the user.

I agree. I consider myself an advanced user, and even I didn't think docker would expose services that I'd assumed were bound to the lo interface but are in fact accessible remotely by default. And it's insidious, because if you just port-scan your own machine you won't see it either.

I might try to open a PR for their documentation page on iptables to clarify the behavior and suggest a stopgap fix.

@randomstuff

> -A DOCKER-USER -o br-+    -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound  -j RETURN
> -A DOCKER-USER -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound  -j RETURN
> -A DOCKER-USER -i br-+                                               -m comment --comment DOCKER-outbound -j RETURN
> -A DOCKER-USER -i docker0                                            -m comment --comment DOCKER-outbound -j RETURN
> -A DOCKER-USER                                                                                            -j REJECT

Is that enough? Could an attacker on the local network mess with conntrack (by sending forged SYN, SYN+ACK, and ACK packets)? Is it not necessary to also add something like:

iptables -t raw -A PREROUTING -m rpfilter --invert -j DROP
