natevw/socket_hack.md

## socket_hack.md

      
TCP multiplexing protocol

A high-level approach to multiplexing multiple TCP connections over a single socket. Probably ignores a lot of queueing theory.

Background

The CC3000 WiFi chip may be especially unreliable if more than one socket is in use. So the idea is to maintain a single connection with a proxy server, which will handle the actual simultaneous outbound connections on a client's behalf. (Inbound connections are a lower implementation priority.)

Terminology


- device: the client which wants to work around its onboard network limitations
- target: a remote server with which a device would like to communicate
- channel: a generic name for what is more or less a TCP socket, used for any purpose
- connection: a duplex channel between a device and a single target
- tunnel: the duplex channel over which connections are to be multiplexed
- proxy: the server which manages connections on behalf of a device

A proxy communicates with targets on behalf of a device, multiplexing the resulting connections over a single tunnel.

    device <==tunnel==> proxy <--> target1
                              <--> target2

Connections appear above as -- lines, represented as == when multiplexed together.

Goals


- handle multiple connections better than raw CC3K
- support some level of per-connection backpressure
- ease and speed of implementation
- mask tunnel reconnects/interruptions/handoffs?
- accept incoming connections?

Protocol Sketch

The protocol is probably message-based: simply `[message-size message-payload]` at the framing level. `message-payload` would be of the form `command command-payload`, where `command` comes from an enumerated list of available requests/responses/messages. We could (if desirable) simply ignore unknown commands.

Most `command-payload` values (besides those of tunnel-level commands) would take the form `connection-id [payload]`. `connection-id` is a numeric socket descriptor, and the presence/length of `payload` is determined by the underlying command's semantics. (We could simply reserve 0x0000 or 0xFFFF for messages pertaining to the tunnel itself if that is more convenient. Perhaps the id should then come before the command, enabling a bit more of a "routing" strategy internal to the parser…)
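To make the framing concrete, here is a minimal Python sketch. The field widths are assumptions, since nothing above fixes them: a 2-byte big-endian message-size that counts only the message-payload, a 1-byte command, and a 2-byte connection-id placed after the command. The names `encode_message` and `FrameDecoder` are made up for illustration.

```python
import struct
from typing import List, Tuple

# Assumed wire layout (the sketch above does not fix any field widths):
#   message-size  : 2 bytes, big-endian, counts the message-payload only
#   command       : 1 byte
#   connection-id : 2 bytes, big-endian
#   payload       : whatever remains of the message-payload
HEADER = struct.Struct("!H")   # message-size
PREFIX = struct.Struct("!BH")  # command, connection-id


def encode_message(command: int, connection_id: int, payload: bytes = b"") -> bytes:
    """Build one [message-size message-payload] frame."""
    body = PREFIX.pack(command, connection_id) + payload
    return HEADER.pack(len(body)) + body


class FrameDecoder:
    """Reassembles frames from a byte stream that may split them anywhere."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[Tuple[int, int, bytes]]:
        """Add received bytes; return each (command, connection_id, payload) now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (size,) = HEADER.unpack_from(self._buffer, 0)
            end = HEADER.size + size
            if len(self._buffer) < end:
                break  # the rest of this frame has not arrived yet
            if size < PREFIX.size:
                raise ValueError("frame too short to hold command and connection-id")
            command, connection_id = PREFIX.unpack_from(self._buffer, HEADER.size)
            payload = bytes(self._buffer[HEADER.size + PREFIX.size:end])
            messages.append((command, connection_id, payload))
            del self._buffer[:end]
        return messages
```

Because the tunnel is itself a TCP stream, a single receive may hold half a frame or several frames, which is why the decoder keeps a buffer between calls to `feed`.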
Commands

Tunnel-level:

- hello: payload is a session cookie or something similarly suitable for plaintext transmission, to help mitigate proxy abuse
- scram: bad credentials (or have connection-id for tunnel and emit same error as below? or simply close tunnel's channel?)

All of the following commands carry a connection-id, followed in many cases by a payload:

- connect: payload is a network-order 32-bit address followed by a 16-bit port (usually sent by the device; the proxy could send it for incoming connections at some point too). TBD: pass a hostname so the proxy can do the DNS lookup
- accept: no payload? some form of "our side's" bound addr+port? (usually sent by the proxy once the connection is set up, but with future incoming connections in mind)
- read: payload is the number of bytes the sender is prepared to handle (both device and proxy will send this!)
- data: payload is a chunk of received data, which may be sent at any time the total amount requested via read is larger than the amount of data sent so far
- done: no payload. The sender has no more data and uses this to signal eof/close of their end
- error: payload is an error code. All errors are fatal, i.e. any future messages pertaining to this connection-id should be ignored by both parties
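The command list above could be written down roughly as follows. This is only a sketch: the numeric codes are placeholders, the 32-bit width of the read count is an assumption, and `pack_connect`, `unpack_connect` and `parse_command` are hypothetical helper names. Only the connect payload layout (network-order 32-bit address, then 16-bit port) comes from the description itself.

```python
import enum
import ipaddress
import struct


class Command(enum.IntEnum):
    # Placeholder codes; the notes only name the commands.
    HELLO = 0
    SCRAM = 1
    CONNECT = 2
    ACCEPT = 3
    READ = 4
    DATA = 5
    DONE = 6
    ERROR = 7


CONNECT_PAYLOAD = struct.Struct("!4sH")  # 32-bit IPv4 address + 16-bit port, network order
READ_PAYLOAD = struct.Struct("!I")       # assumed 32-bit byte count


def pack_connect(host: str, port: int) -> bytes:
    """Encode a connect payload from a dotted-quad IPv4 address and a port."""
    return CONNECT_PAYLOAD.pack(ipaddress.IPv4Address(host).packed, port)


def unpack_connect(payload: bytes):
    """Decode a connect payload back into (dotted-quad address, port)."""
    raw_addr, port = CONNECT_PAYLOAD.unpack(payload)
    return str(ipaddress.IPv4Address(raw_addr)), port


def parse_command(code: int):
    """Return the Command for a code, or None so unknown commands can be ignored."""
    try:
        return Command(code)
    except ValueError:
        return None
```

Returning `None` for an unrecognized code is one way to get the "simply ignore unknown commands" behaviour mentioned earlier without tearing down the tunnel.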

Note that read/data/done are intentionally symmetric between device and proxy. While it is more likely that the device will be exerting the most backpressure, it is possible that in some cases the proxy will not want data from the device either.
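The read/data bookkeeping amounts to a per-connection byte credit. Below is a minimal sketch of the sending side, assuming each read(N) adds N to the credit and data may only consume what has been granted; `SendWindow` is a hypothetical name. Since the commands are symmetric, device and proxy would each keep one of these per connection.

```python
class SendWindow:
    """Sender-side bookkeeping for one connection under the read/data scheme."""

    def __init__(self) -> None:
        self.credit = 0             # bytes the peer has requested but not yet been sent
        self.pending = bytearray()  # bytes waiting for the peer to grant credit

    def on_read(self, count: int) -> bytes:
        """Peer sent read(count): grow the credit and release any queued bytes."""
        self.credit += count
        return self._drain()

    def write(self, data: bytes) -> bytes:
        """Queue outgoing bytes; return the part that may go out as data right now."""
        self.pending += data
        return self._drain()

    def _drain(self) -> bytes:
        # Never send more than the peer has asked for.
        allowed = min(self.credit, len(self.pending))
        chunk = bytes(self.pending[:allowed])
        del self.pending[:allowed]
        self.credit -= allowed
        return chunk
```

For example, after `write(b"hello world")` with no credit nothing is released; a following `on_read(5)` releases `b"hello"` and leaves the remainder queued until more credit arrives.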
[v2 "incoming connections" idea: bind_port (TCP proxy) vs bind_host (HTTP proxy) options.]

Questions

- Prior art: is there already such a protocol we should be using? (VPNs do this stuff, right?) If we are inventing one, should it be patterned after the node.js API or after the POSIX API? (See next question.)
- Backpressure: does the proxy send data to the device only on read (streams v2)? Does the device tell the proxy to pause/resume connections (streams v1)? Does the device periodically update the proxy on per-connection buffer status (streams vINTERNALS)? Current answer: the device periodically issues read(N), which gives the proxy permission to send chunks of up to N bytes total at any time in the future
- Reconnect: the idea here is that if the tunnel channel explodes, the device could (within a reasonable window) open a new one and keep going. This may not be too hard to support; the main concern would be making sure device and proxy agree on the current state of all connections (including what data was/wasn't sent and received, which would not be so trivial after all…!!). So is this actually needed for reasonable CC3K reliability, or would it just be tilting at a windmill TCP didn't bother to challenge?