Strategies for Defeating NAT Configuration Issues: STUN/TURN/ICE
(I decided to write this down to have it all in one place, in a way that at least I can understand. I hope it helps others who had the same questions I did when they first came across the STUN/TURN/ICE acronyms and found it all confusing, not knowing which was needed first or what problem each one solved.)
NAT devices basically keep tables that map external IP:PORT pairs to internal IP:PORT pairs. As IP packets come in and go out, they rewrite the packets so that the IP:PORTs are changed (initially with the goal of sharing a limited number of public addresses, but lately also to restrict direct communication with machines inside restricted network environments).
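To make the table idea concrete, here is a minimal sketch of such a mapping in Python. The `NatTable` class, the starting port 40000 and the example addresses are all made up for illustration; real NAT devices also track protocols, timeouts and filtering rules that are left out here.

```python
from itertools import count


class NatTable:
    """Toy NAT: maps internal (ip, port) pairs to ports on one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = count(first_port)
        self._out = {}  # internal (ip, port) -> external port
        self._in = {}   # external port -> internal (ip, port)

    def outbound(self, src, dst):
        """Rewrite the source of a packet leaving the internal network."""
        if src not in self._out:
            port = next(self._next_port)
            self._out[src] = port
            self._in[port] = src
        return (self.public_ip, self._out[src]), dst

    def inbound(self, src, dst):
        """Rewrite the destination of a packet arriving from outside.

        Returns None when no mapping exists, i.e. the packet is dropped.
        """
        internal = self._in.get(dst[1]) if dst[0] == self.public_ip else None
        if internal is None:
            return None
        return src, internal


nat = NatTable("203.0.113.7")
server = ("198.51.100.1", 3478)
print(nat.outbound(("192.168.1.10", 5000), server))   # source becomes 203.0.113.7:40000
print(nat.inbound(server, ("203.0.113.7", 40000)))    # delivered to 192.168.1.10:5000
print(nat.inbound(server, ("203.0.113.7", 40001)))    # None: no mapping, dropped
```

The last call shows why an outside node cannot simply open a connection to a NATed node: until the inside node has sent something, there is no table entry to translate the incoming packet.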
This works fine for many standard protocols, but not so well when a custom protocol carries IP:PORT information inside its payload, because the NAT device won't look inside the packet. In many cases we don't want anybody looking inside the packet anyway: when the data is encrypted, there is simply no way for the NAT, as advanced as it may be, to do anything about it.
So one of the simplest techniques for connecting 2 nodes that are not visible to each other on the internet (they can't accept incoming connections due to firewall/router policies) consists of contacting a third node which doesn't have those issues (let's call it a "Public Node") and which lets both nodes know about each other.
The goal is to find out which IP and UDP port the firewall has allocated to each node for its communication outside the internal network.
A set of similar techniques, and a protocol to implement them, has been standardized as STUN (Session Traversal Utilities for NAT) and formalized in RFC 5389.
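As a rough idea of what this looks like on the wire, the sketch below builds a STUN Binding Request and parses the XOR-MAPPED-ADDRESS attribute of a response (IPv4 only). The message layout and constants follow RFC 5389; the function names are my own, and `fake_response` is a stand-in for a real server so the example runs without a network. Against a real server you would send `request` over a UDP socket (STUN normally listens on port 3478) and feed the reply to `parse_binding_response`.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding Request with no attributes."""
    tid = transaction_id or os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + tid, tid


def parse_binding_response(data, transaction_id):
    """Return the (ip, port) from XOR-MAPPED-ADDRESS, or None if not found."""
    if len(data) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", data[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    if data[8:20] != transaction_id:
        return None
    pos, end = 20, 20 + length
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        is_ipv4 = len(value) >= 8 and value[1] == 0x01
        if attr_type == XOR_MAPPED_ADDRESS and is_ipv4:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def fake_response(transaction_id, ip, port):
    """Build the reply a server would send (used here instead of a network)."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01, xport, xaddr)
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + transaction_id + attr


request, tid = build_binding_request()
reply = fake_response(tid, "203.0.113.7", 40000)
print(len(request), parse_binding_response(reply, tid))
```

The address in the reply is whatever source IP:PORT the server saw on the request, which is exactly the NAT mapping the client wants to learn.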
I'd like to make a simple implementation based on STUN, just to make sure this works in most cases. It seems it doesn't help 100% of the time, and there are other techniques that may be needed after you've successfully contacted STUN servers and learned about possible peers. For instance, the port your IP has been assigned may only stay reachable for a short interval (say 10 seconds), so nodes must constantly re-punch so that the STUN server can keep an up-to-date table and let everyone talk (as far as I understand).
In some cases we may be faced with a dreaded Symmetric NAT device. This kind of device defeats STUN approaches by making it impossible for STUN clients to learn a public port that peers can actually use: it allocates a different, often random, external port for each destination the source contacts, so the port the STUN server observes is not the port a peer would need to reach.
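A toy comparison may help here. It assumes the usual definition of a symmetric NAT (a separate external port per internal socket and destination) and contrasts it with a NAT that reuses one port regardless of destination. The class names and addresses are invented for the example.

```python
from itertools import count


class EndpointIndependentNat:
    """Reuses one external port per internal socket, whatever the destination."""

    def __init__(self, first_port=40000):
        self._ports = count(first_port)
        self._map = {}

    def external_port(self, src, dst):
        if src not in self._map:
            self._map[src] = next(self._ports)
        return self._map[src]


class SymmetricNat(EndpointIndependentNat):
    """Allocates a separate external port per (internal socket, destination)."""

    def external_port(self, src, dst):
        key = (src, dst)
        if key not in self._map:
            self._map[key] = next(self._ports)
        return self._map[key]


def stun_result_is_usable(nat):
    """True if the port the STUN server saw is also the one used towards a peer."""
    me = ("192.168.1.10", 5000)
    seen_by_stun = nat.external_port(me, ("198.51.100.1", 3478))
    seen_by_peer = nat.external_port(me, ("192.0.2.44", 6000))
    return seen_by_stun == seen_by_peer


print(stun_result_is_usable(EndpointIndependentNat()))  # True
print(stun_result_is_usable(SymmetricNat()))            # False
```

With the symmetric variant, the address the STUN server reports is only valid for talking to the STUN server itself, so handing it to a peer is useless.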
The only solution here is to have the STUN server hand out one of its own ports as if it were the other peer's, and just sit there relaying packets between both. This is costly in bandwidth, memory and CPU. This technique/protocol has been standardized as TURN (Traversal Using Relays around NAT).
Not only is this technique (which works in most cases) expensive, it also adds latency, so it was necessary to develop a technology that combines the benefits of both STUN and TURN.
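The relaying idea can be sketched without any sockets. The `Relay` class below is hypothetical and much simpler than real TURN (which adds authentication, permissions, lifetimes and its own message framing); it only shows the bookkeeping: a client reserves a port on the public node, and whatever arrives on that port is forwarded to the client.

```python
class Relay:
    """Toy relay: each client gets a relay port; data sent there is forwarded."""

    def __init__(self, public_ip, first_port=50000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._alloc = {}        # relay port -> client address as seen by the relay
        self.bytes_relayed = 0  # relaying costs the public node bandwidth

    def allocate(self, client_addr):
        """Reserve a relay address for client_addr and return it."""
        port = self._next_port
        self._next_port += 1
        self._alloc[port] = client_addr
        return (self.public_ip, port)

    def forward(self, relay_port, payload):
        """Return (destination, payload) for data arriving on relay_port, or None."""
        client = self._alloc.get(relay_port)
        if client is None:
            return None
        self.bytes_relayed += len(payload)
        return client, payload


relay = Relay("198.51.100.1")
alice = ("203.0.113.7", 40000)             # Alice's address as the relay sees it
alice_relay_addr = relay.allocate(alice)   # Alice gives this address to Bob
print(alice_relay_addr)                    # ('198.51.100.1', 50000)
print(relay.forward(alice_relay_addr[1], b"hello from bob"))
print(relay.bytes_relayed)                 # 14
```

Every byte between the two peers passes through the relay, which is where the bandwidth, memory and CPU cost comes from.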
The IETF MMUSIC working group came to the rescue: in October 2003 it submitted an approach that basically tries both STUN and TURN at once, gathering several IP:PORT tuples as candidates for establishing communication and probing them until data can start flowing. It's also known to work with TCP (used for interactive sessions such as whiteboard sharing apps, for example), and implementations seem to be very robust, scaling to several hundred users.
This approach is specified in RFC 5245 as a standard called ICE (Interactive Connectivity Establishment).
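The "gather candidates, then probe them" idea can be sketched as follows. The priority formula and the type preference values are the ones recommended in RFC 5245; everything else (the `Candidate` class, `first_working`, the `probe` callback and the addresses) is a simplification of mine. Real ICE pairs local candidates with remote ones and runs STUN connectivity checks on each pair, which is skipped here.

```python
from dataclasses import dataclass

# Type preferences recommended by RFC 5245 (higher is tried first).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


@dataclass(frozen=True)
class Candidate:
    kind: str   # "host", "srflx" (learned via STUN) or "relay" (from TURN)
    ip: str
    port: int
    local_pref: int = 65535
    component: int = 1

    @property
    def priority(self):
        # priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)
        return ((TYPE_PREFERENCE[self.kind] << 24)
                + (self.local_pref << 8)
                + (256 - self.component))


def first_working(candidates, probe):
    """Try candidates from highest to lowest priority; return the first that answers."""
    for cand in sorted(candidates, key=lambda c: c.priority, reverse=True):
        if probe(cand):
            return cand
    return None


candidates = [
    Candidate("relay", "198.51.100.1", 50000),
    Candidate("host", "192.168.1.10", 5000),
    Candidate("srflx", "203.0.113.7", 40000),
]
# Hypothetical probe: pretend the private host address is unreachable.
chosen = first_working(candidates, lambda c: c.kind != "host")
print(chosen.kind, chosen.priority)  # srflx 1694498815
```

The ordering encodes the cost argument from earlier: direct addresses first, the STUN-discovered address next, and the expensive relay only when nothing else answers.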
However, it seems a lot of what ICE covers has to do with VoIP, so I think for now we should be fine just implementing STUN and TURN on OpenBazaar servents.
Most public nodes would by default want to act as STUN servers, so that NATed nodes can learn each other's IP:PORTs and then establish connectivity on their own. As a last resort they'd act as TURN servers, relaying traffic for others up to a point that doesn't compromise their own performance; beyond that point they should refer nodes to other public nodes that might be able to relay traffic for them.
In fact, I think that if we're smart we could use several TURN servers to connect nodes, basically following the path of least resistance: think of it as an electric grid, with packets abstracted as electrons flowing through it.
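A possible shape for that policy, purely as a sketch: the `PublicNode` class, its `max_relays` limit and the return values are assumptions of mine and not part of any existing OpenBazaar code. Capacity is modelled as a simple count of active relay sessions.

```python
class PublicNode:
    """Toy public node: reflects addresses, relays up to a limit, then redirects."""

    def __init__(self, name, max_relays, fallbacks=()):
        self.name = name
        self.max_relays = max_relays
        self.fallbacks = list(fallbacks)  # other public nodes to refer callers to
        self.active_relays = 0

    def reflect(self, observed_addr):
        """STUN-like role: tell the caller which address we saw it come from."""
        return observed_addr

    def request_relay(self):
        """TURN-like role: relay if there is capacity, otherwise redirect.

        Returns ("relay", node), ("redirect", node) or ("refuse", None).
        """
        if self.active_relays < self.max_relays:
            self.active_relays += 1
            return "relay", self
        for other in self.fallbacks:
            if other.active_relays < other.max_relays:
                return "redirect", other
        return "refuse", None


node_b = PublicNode("B", max_relays=5)
node_a = PublicNode("A", max_relays=1, fallbacks=[node_b])
print(node_a.request_relay()[0])      # relay
action, target = node_a.request_relay()
print(action, target.name)            # redirect B
```

A redirected node would then repeat the relay request against the node it was referred to.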