ZeroVM Networking Broker
  • Simple, modern networking broker
  • Built-in support in libzrt
  • HTTP server
    • nginx
    • FastCGI
  • libzrt
  • 1 HTTP frontend + 1 daemon backend
  • factor some object-query functionality into this
  • need good multiplexing file/stream transfer
  • no discovery needed
    • in the ZeroCloud case, we already know the cluster topology
  • two main functions (sketched below):
    • register
      • register(channel_id)
      • unregister(channel_id)
    • transfer
      • send(ip:port, channel_id_src, channel_id_dest, data, size)
      • recv(ip:port, channel_id_src, channel_id_dest, size) -> data
    • channel_ids are just opaque identifiers, or a "label" for messages
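
A minimal in-memory sketch of this interface in Python. All names here are illustrative, not the actual ZeroVM API; the network hop to ip:port is elided and sketched further below:

```python
import queue

class Broker:
    """Illustrative register/transfer interface. channel_ids are opaque
    labels; a real broker would forward send/recv to the broker at `addr`."""

    def __init__(self):
        self._channels = {}  # channel_id -> one-slot message queue

    def register(self, channel_id):
        self._channels[channel_id] = queue.Queue(maxsize=1)

    def unregister(self, channel_id):
        self._channels.pop(channel_id, None)

    def send(self, addr, channel_id_src, channel_id_dest, data, size):
        # Blocks while the previous message is unconsumed (not a queue).
        self._channels[channel_id_dest].put(data[:size])

    def recv(self, addr, channel_id_src, channel_id_dest, size):
        # Blocks until a matching send arrives (see "block receivers" below).
        return self._channels[channel_id_dest].get()[:size]
```
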
  • NOT a queue
    • only contains 0..1 messages at a given time
  • block receivers until message is available
    • use case: MapReduce
      • Mapper should not be able to fill memory with all mapped data if the reducer is not ready to consume it
  • Nothing should happen until somebody wants to consume something (see the one-slot channel sketch below)
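
A sketch of these one-slot, blocking semantics using a Python condition variable (class and field names are illustrative):

```python
import threading

class OneSlotChannel:
    """Not a queue: holds at most one message at a time. recv() blocks
    until a message is available, and send() blocks while the slot is
    full, so a mapper cannot fill memory faster than the reducer reads."""

    def __init__(self):
        self._cond = threading.Condition()
        self._msg = None
        self._full = False

    def send(self, msg):
        with self._cond:
            while self._full:            # producer waits for the consumer
                self._cond.wait()
            self._msg, self._full = msg, True
            self._cond.notify_all()

    def recv(self):
        with self._cond:
            while not self._full:        # consumer waits for a message
                self._cond.wait()
            msg, self._msg, self._full = self._msg, None, False
            self._cond.notify_all()
            return msg
```
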
  • Channels are UNIdirectional
    • we are limited to this because of pipe semantics
    • CSP supports bidirectional channels
      • but it's not deterministic
      • "select" operator
        • "read from any channel in the list"
        • data = select(list)
        • not supported now by ZeroVM, but could be
        • recv_any([ip_list], [channel_list]) (sketched below)
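
A polling sketch of such a recv_any, treating channels as Python one-slot queues. A real broker would use readiness notification rather than polling; all names here are illustrative:

```python
import queue
import time

def recv_any(channels, poll_interval=0.01):
    """Return the first message available on any channel in the list.
    Like CSP's "select", the choice is nondeterministic when several
    channels are ready at once."""
    while True:
        for ch in channels:              # channels: list of queue.Queue(1)
            try:
                return ch.get_nowait()
            except queue.Empty:
                continue
        time.sleep(poll_interval)        # nothing ready; back off briefly
```
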
  • broker-to-broker communication
    • HTTP: PUT/GET /broker/channel_id (sketched below)
    • register lets brokers learn about each other
  • proxy-query sends list of brokers and channels in each execution request
    • proxy knows it because it always has the complete cluster topology
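
A sketch of the broker-to-broker transfer using Python's standard library and the PUT/GET /broker/channel_id scheme above. The URL layout and helper names are assumptions, not an established API:

```python
import http.client

def broker_put(addr, channel_id, data):
    """Push `data` for channel_id to the remote broker at addr=(ip, port)."""
    conn = http.client.HTTPConnection(addr[0], addr[1])
    conn.request('PUT', '/broker/%s' % channel_id, body=data)
    return conn.getresponse().status

def broker_get(addr, channel_id):
    """Pull the pending message for channel_id from the remote broker."""
    conn = http.client.HTTPConnection(addr[0], addr[1])
    conn.request('GET', '/broker/%s' % channel_id)
    return conn.getresponse().read()
```
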
  • broker also passes the list of other brokers ip:port tuples to the ZeroVM process/thread
  • Interface for broker <--> ZeroVM communication
    • register
    • unregister
    • send
    • recv
  • No routing
    • ip:port for each remote end of channel is known on job start
    • the broker connects there directly, or reuses an existing connection to that ip:port (see the connection-cache sketch below)
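
Since the full topology is known at job start, "routing" reduces to a table lookup plus connection reuse. A sketch, where the shape of the channel table is an assumption:

```python
import socket

class ConnectionCache:
    """One reused TCP connection per remote broker; no routing table."""

    def __init__(self, channel_map):
        self._channel_map = channel_map  # {channel_id: (ip, port)}, fixed at job start
        self._conns = {}                 # {(ip, port): open socket}

    def connection_for(self, channel_id):
        addr = self._channel_map[channel_id]   # direct lookup, no routing
        if addr not in self._conns:
            self._conns[addr] = socket.create_connection(addr)
        return self._conns[addr]
```
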
  • Might need multiple redundant connections
    • Not for each ZeroVM instance, though
      • (This would be a good way to run out of file descriptors.)
    • For QoS
    • Possibly one connection for each message-size class, e.g.:
      • 64 bytes
      • 128 bytes
      • 512 bytes
      • 1024 bytes
      • needed so that long data transfers don't stall short ones
      • Or, just "< 1024 bytes" and ">= 1024 bytes"
        • 2 of each connection
        • round-robin between them
        • this is the strategy browsers typically use (multiple parallel HTTP connections) to speed up page loads (sketched below)
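
A sketch of the two-class variant: two connections per size class, round-robin within each class. The 1024-byte boundary and the make_conn factory are assumptions:

```python
import itertools

SMALL = 1024  # bytes; boundary between "short" and "long" messages

class SizeClassedConnections:
    """Keep long transfers from stalling short ones by segregating
    them onto separate connections, two per class, round-robin."""

    def __init__(self, make_conn):
        # make_conn() is assumed to open a fresh connection to the remote broker
        self._pools = {
            'small': itertools.cycle([make_conn(), make_conn()]),
            'large': itertools.cycle([make_conn(), make_conn()]),
        }

    def pick(self, size):
        return next(self._pools['small' if size < SMALL else 'large'])
```
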
  • Push vs. Pull
    • We can decide based on message size
    • Push:
      • good for latency
      • application calls write(fd, data)
      • fd translated into channel_id
      • channel_id translated to remote IP
      • broker issues send
      • when send comes to remote broker:
        • reads channel_id
        • translates it into ZeroVM process/thread ID
      • if recv was issued, the send is accepted
      • if not, tell the other party (sender) to wait/buffer
    • Pull:
      • good for long transfers
      • only need a buffer with size "small_message" (see message sizes above)
      • accepts first send() unconditionally
        • otherwise deadlocks would be common
      • broker: I got a send()
      • broker: Did I get a recv() for that?
      • If not, buffer it.
      • If the buffer is full, block the sender (see the pull-side sketch below).
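
A sketch of the pull-side rules above: the first send() on a channel is accepted unconditionally into a one-slot, small_message-sized buffer; once that buffer is occupied, further sends block until a recv() drains it. Names are illustrative:

```python
import queue

SMALL_MESSAGE = 1024  # one buffer of this size per channel is all we need

class PullSideBroker:
    def __init__(self):
        self._buffers = {}  # channel_id -> one-slot buffer

    def handle_send(self, channel_id, data):
        buf = self._buffers.setdefault(channel_id, queue.Queue(maxsize=1))
        buf.put(data)     # first send is buffered; later sends block the sender

    def handle_recv(self, channel_id):
        buf = self._buffers.setdefault(channel_id, queue.Queue(maxsize=1))
        return buf.get()  # blocks until a send arrives
```
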

Other notes:

  • ZeroVM must support sending channels over channels. This would fix David Holland's inter-instance communication problem without breaking determinism. (See the toy sketch below.)
  • Have a look at process calculi
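
A toy illustration of sending a channel over a channel, with channel ids as opaque strings (the labels are hypothetical): passing a channel_id as an ordinary message lets the receiver use a channel it did not know at job start, while every step remains a plain, deterministic send/recv.

```python
import queue

control = queue.Queue(maxsize=1)        # a one-slot control channel
data_channel_id = "job42/mapper0/out"   # hypothetical opaque channel label

# Send the channel itself (its id), not the data it carries.
control.put(("grant", data_channel_id))

# The receiver can now register() the granted channel with its broker
# and start receiving on it.
kind, granted = control.get()
```
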