jamii/Mist.md

## Mist.md

      
    Raw
  

              Mist.md
            
          
    About

Building decentralised services is an exercise in accidental complexity. Connectivity is the first hurdle. Users move around, go on- and off-line and lurk behind NAT. For centralised systems we have ICE. For decentralised systems there is no standard solution.
Mist will take an abstract address representing a user, locate their machine, punch a hole and return a local UDP proxy to talk to that user. It will be cross-platform, provide a simple api and be easy to use with existing UDP programs.
As a bonus, the implementation also provides a p2p pubsub system.
The design is based heavily on Jeremy Miller's telehash protocol (although simplified and streamlined) and draws ideas from libswift. It is also based on lessons learned from my telehash implementation.
Technical overview

Mist consists of a UDP multiplexer, a Kademlia overlay and a simple pubsub protocol.
The UDP multiplexer (based on a similar design proposed for libswift) allows running multiple udp protocols on a single port for easy NAT traversal and help protect against UDP spoofing (via a handshake to agree channel numbers). It is also responsible for sending heartbeat messages to keep punched holes open.
Kademlia is a p2p routing overlay used, among other things, for decentralised bittorrent trackers. Mist replaces Kademlia's STORE(key, value) and FIND_VALUE(key) commands with PUBLISH(key, value) and SUBSCRIBE(key), where both keys and values are arbitrary strings of bytes. The Kademlia overlay handles routing published messages to matching subscribers.
NAT traversal

UDP hole punching requires the use of a third party to which both peers can already connect. This third party informs each peer of the others public address so that they can initiate a connection. In a centralised service this would be a publicly addressable server.
In the Kademlia overlay peers are always introduced to each other by a mutual connection. In this way chains of connectivity are built between random peers. When we want to connect to a specific peer we can use the pubsub system to find them as follows:
ALICE          BOB
-----          ---

SUBSCRIBE("alice@example.com")

               SUBSCRIBE("bob@example.com")
           
               PUBLISH("alice@example.com", "this is bob@example.com")
         <----
RECEIVE("this is bob@example.com", ip:port (Bob's address as seen by the forwarding node))

PUBLISH("bob@example.com", "this is alice@example.com")
         ---->
               RECEIVE("this is bob@example.com", ip:port (Alice's address as seen by the forwarding node)))

Alice and Bob now know each others public addresses which allows them to perform UDP hole punching.
Note that this handshake is not even remotely secure. The forwarding node is free to report whatever address they like. However due to fanout in the pubsub system all 20 nodes responsible for that key would have to conspire. In addition, Alice and Bob can use public key authentication after punching their holes to ensure that they have obtained the correct address.
FAQ

Why the JVM?


High level - no memory management means less opportunity for security problems.


Useful libraries eg UPnP, ICE, PGP


Portability (Windows, Linux, Mac, browsers, mobile devices)


Why clojure?


Easy concurrency


Sane polymorphism


Observation without cooperation (debugging erlang can be incredibly frustrating because it requires inserting debugging logic into every process)


Sheer joy


Differences from telehash:


The _ring/_line protocol which telehash uses to protect against udp spoofing is replaced by the channel mechanism in the udp multiplexer


Instead of json messages, payloads will be plain byte strings. This reduces complexity and makes it easier for legacy protocols to be carried by the pubsub system (eg directly publishing OSC commands to enable multiplayer Overtone).


We omit the tap mechanism, which allows filtering messages according to a simple dsl, and just match on the pubsub key. Again this reduces complexity and allows messages to be plain byte strings.