L1<>L2 architecture

Conventions

  • Naming
    • L1: L1 machine
    • l1: Individual l1 worker process
  • The overview graph lists all involved components. In later graphs, irrelevant components are omitted for readability; this doesn't mean they have been removed

Problems to solve

  • L2->l1 routing
  • l1->L2 routing

Overview

flowchart TB
  subgraph L1
    master
    subgraph workers
      l1
      l1'
    end
  end
  subgraph L2s
    L2
    L2'
  end
  L2s --> |register|workers
  workers --> |request CID|L2s
  L2s --> |send CAR|workers
  client --> |request CID|workers
  workers --> |send CAR|client

sequenceDiagram
  participant Client
  participant l1s
  participant L2s
  L2s->>l1s: Register
  Client->>l1s: Request CID
  l1s->>L2s: Request CID
  L2s->>l1s: Send CAR
  l1s->>Client: Send CAR

Problem: L2->l1 routing

The l1 worker that receives the client request needs to be the one that also receives the CAR file from the L2. If L2s make new HTTP requests to deliver their responses, the Node.js cluster module assigns those requests to a random l1 worker process.

We run into this problem because L2s, which likely live in home networks, aren't directly reachable from the outside and can't act as servers. Therefore the L2 is the only one that can open new connections to the L1. This means that the L1 can't act as an HTTP client, for example requesting data from the L2. The L1 can request data, but unless we figure out how the L2 can reply directly, the reply will hit the L1's load balancer and get assigned to a random l1, which didn't receive the original HTTP request from the client.
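
For illustration, a minimal sketch of the setup that causes this, assuming the usual one-worker-per-core cluster pattern (port and handler are illustrative):

import cluster from 'node:cluster'
import http from 'node:http'
import os from 'node:os'

if (cluster.isPrimary) {
  // One l1 worker per CPU core; the primary distributes incoming
  // connections across the workers (round robin on Linux)
  for (let i = 0; i < os.cpus().length; i++) cluster.fork()
} else {
  http.createServer((req, res) => {
    // Both the client's `GET /ipfs/CID` and an L2's `POST /data/CID`
    // land here, but possibly in different workers
    res.end(`handled by worker ${cluster.worker.id}\n`)
  }).listen(3000)
}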

flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
  end
  L2 --> |register|l1
  l1 --> |request CID|L2
  L2 --> |send CAR|l1'
  client --> |request CID|l1
  l1 --> |send CAR|client
  
  linkStyle 2 stroke:red
  linkStyle 4 stroke:red

Proposal: Workers share files

Worker processes share files with each other

  • significantly more complexity
  • more worker load
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
  end
  L2 --> |register|l1
  l1 --> |request CID|L2
  L2 --> |send CAR|l1'
  l1' --> |share CAR|l1
  client --> |request CID|l1
  l1 --> |send CAR|client
  
  linkStyle 3 stroke:yellow
  linkStyle 5 stroke:green

sequenceDiagram
  participant Client
  participant l1
  participant l1'
  participant L2
  L2->>l1: Register
  Client->>l1: Request CID
  l1->>L2: Request CID
  L2->>l1': Send CAR
  l1'->>l1: Share CAR
  l1->>Client: Send CAR
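
A rough sketch of how the sharing could work, using cluster IPC through the primary process (message shapes, worker ids and chunk handling are assumptions; serializing chunks for IPC is part of the extra complexity and worker load):

import cluster from 'node:cluster'

if (cluster.isPrimary) {
  const workers = [cluster.fork(), cluster.fork()]
  // The primary relays "share CAR" messages between workers
  for (const worker of workers) {
    worker.on('message', message => {
      if (message.type === 'share-car') {
        workers.find(w => w.id === message.targetWorkerId)?.send(message)
      }
    })
  }
} else {
  process.on('message', message => {
    if (message.type === 'share-car') {
      // This worker owns the client response for message.requestId:
      // write the (serialized) chunk to it here
    }
  })
  // A worker that received a CAR chunk from an L2 but doesn't own the
  // matching client request forwards it, e.g.:
  // process.send({ type: 'share-car', targetWorkerId, requestId, chunk })
}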

Proposal: Workers expose ports

Workers get their own port, so that L2 can reach the right worker (not behind load balancer).

  • needs one extra port per cpu core
  • complex dynamic nginx configuration
  • complex dynamic docker configuration
  • complex dynamic firewall configuration
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
  end
  L2 --> |register|l1
  l1 --> |request CID with port|L2
  L2 --> |send CAR to worker port|l1
  client --> |request CID|l1
  l1 --> |send CAR|client
  
  linkStyle 2 stroke:yellow
  linkStyle 1 stroke:yellow
  linkStyle 4 stroke:green
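
A minimal sketch of this proposal, with each worker listening on its own port derived from the worker id (the base port and how the port is communicated to the L2 are assumptions):

import cluster from 'node:cluster'
import http from 'node:http'
import os from 'node:os'

const BASE_PORT = 3000

if (cluster.isPrimary) {
  for (let i = 0; i < os.cpus().length; i++) cluster.fork()
} else {
  // Each worker gets its own port, which nginx, docker and the firewall
  // all need to know about
  const port = BASE_PORT + cluster.worker.id
  http.createServer((req, res) => {
    // The worker would include `port` in its "request CID" message,
    // so the L2 can send the CAR back to this exact worker
    res.end(`l1 worker ${cluster.worker.id} listening on ${port}\n`)
  }).listen(port)
}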

Proposal: Nginx sticky routing

Make nginx use sticky routing

  • doesn't work with free nginx, because the incoming client request and the L2's request come from different sources, so stickiness can't be applied per client
flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
  end
  L2 -->|register|l1
  l1 --> |request CID|L2
  L2 --> |send CAR|l1
  client --> |request CID|l1
  l1 --> |send CAR|client

TODO

  • If we use the hash routing strategy then we can match L1s and L2s by CID URL path segment.

Next strategies to try

Single TCP Connection

If there is a single TCP connection between l1 and L2, then there's no load balancing between the two and the l1 that asks L2 is also the one that gets the response.

  • libp2p multiplexer
    • I couldn't get the JS version to run unfortunately
  • binary websockets
    • Worked in contrived example
    • CAR responses will be chunked and prefixed by fixed-length request ID
    • This is the current state of the L1<>L2 integration PR, and seems to work
  • reverse http
    • protocol implications (same issue with http/1 and http/2)

Custom routing

If we replace the round robin routing in the Node.js cluster module with our own logic, we can ensure that an L2 response always arrives at the right l1, by inspecting the request URL.

  • custom nodejs cluster implementation
    • likely quite complex

Solution to L2->l1 routing

Use binary WebSockets, which allow for a persistent connection between L2 and l1, bypassing any load balancing.

Protocol

L2 registration

0{L2ID} 

0 + arbitrary length l2 id (utf-8)

L1 request

{requestId[36]}{cid}

36 bytes request Id (utf-8) + cid (utf-8)

L2 response

1{requestId[36]}{chunk?}

1 + 36 bytes request Id (utf-8) + chunk?

If the chunk is empty that means the CAR file has been served completely.
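
A small sketch of encoding and decoding these frames (the layouts are the ones described above; the function names are illustrative):

const REQUEST_ID_LENGTH = 36

// L2 registration: 0 + arbitrary length l2 id (utf-8)
const encodeRegistration = l2Id =>
  Buffer.concat([Buffer.from('0'), Buffer.from(l2Id, 'utf-8')])

// L1 request: 36 bytes request id (utf-8) + cid (utf-8)
const encodeRequest = (requestId, cid) =>
  Buffer.concat([Buffer.from(requestId, 'utf-8'), Buffer.from(cid, 'utf-8')])

// L2 response: 1 + 36 bytes request id (utf-8) + chunk
// (an empty chunk signals that the CAR has been served completely)
const encodeResponse = (requestId, chunk = Buffer.alloc(0)) =>
  Buffer.concat([Buffer.from('1'), Buffer.from(requestId, 'utf-8'), chunk])

const decodeResponse = frame => ({
  requestId: frame.subarray(1, 1 + REQUEST_ID_LENGTH).toString('utf-8'),
  chunk: frame.subarray(1 + REQUEST_ID_LENGTH)
})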

TODO

  • Use protocol buffers

Websocket performance

Unfortunately WebSockets are significantly slower than raw HTTP:

$ time node bench.mjs ws
node bench.mjs ws  5.96s user 0.66s system 104% cpu 6.329 total
$ time node bench.mjs http
node bench.mjs http  0.33s user 0.47s system 64% cpu 1.251 total

import WebSocket, { WebSocketServer } from 'ws'
import http from 'node:http'

// bench.mjs: send ~1 GiB in 1 MiB chunks, either over a WebSocket or as an
// HTTP response body, to compare transport overhead
const chunk1MB = Buffer.alloc(1024 * 1024).fill(1)

if (process.argv[2] === 'ws') {
  const wss = new WebSocketServer({ port: 3000 })
  wss.on('connection', ws => {
    ws.on('message', d => {
      if (d.length === 0) {
        process.exit()
      }
    })
  })
  const ws = new WebSocket('ws://127.0.0.1:3000')
  ws.on('open', () => {
    let todo = 1024
    const send = () => {
      if (--todo > 0) {
        ws.send(chunk1MB)
        setTimeout(send, 0)
      } else {
        ws.send(Buffer.alloc(0))
      }
    }
    send()
  })
} else {
  const server = http.createServer((req, res) => {
    let todo = 1024
    const send = () => {
      if (--todo > 0) {
        res.write(chunk1MB)
        setTimeout(send, 0)
      } else {
        res.end()
      }
    }
    send()
  })
  server.listen(3000)
  const req = http.get('http://127.0.0.1:3000', res => {
    res.on('data', () => {})
    res.on('end', () => {
      process.exit()
    })
  })
}

Reverse HTTP

For reference, my attempt at a reverse http implementation (ptth):

import http from 'http'
import stream from 'stream'

const server = http.createServer((req, res) => {
  console.log('l1 received request')
  console.log('l1 tries turning into the client now')
  console.log('l1 request cid')
  req.socket.on('data', d => console.log('l1 receive', d.toString()))
  const cidReq = http.request({
    createConnection: () => req.socket,
    // createConnection: () => stream.Duplex.from({ writable: res, readable: req }),
    path: '/cid/CID'
  }, cidRes => {
    console.log('l1 received cid response')
  })
  cidReq.end()
})

server.listen(3000, () => {
  console.log('l1 make request')
  const req = http.request('http://localhost:3000', {
    method: 'POST'
  }, res => {
    console.log('l2 received response')
  })
  req.on('connect', () => {
    console.log('l2 established connection')
    console.log('l2 tries turning into the server now')
  })
  req.end()
})

$ node http.mjs
l1 make request
l1 received request
l1 tries turning into the client now
l1 request cid
node:events:505
      throw er; // Unhandled 'error' event
      ^

Error: Parse Error: Expected HTTP/
    at Socket.socketOnData (node:_http_client:494:22)
    at Socket.emit (node:events:527:28)
    at addChunk (node:internal/streams/readable:315:12)
    at readableAddChunk (node:internal/streams/readable:289:9)
    at Socket.Readable.push (node:internal/streams/readable:228:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23)
Emitted 'error' event on ClientRequest instance at:
    at Socket.socketOnData (node:_http_client:503:9)
    at Socket.emit (node:events:527:28)
    [... lines matching original stack trace ...]
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23) {
  bytesParsed: 0,
  code: 'HPE_INVALID_CONSTANT',
  reason: 'Expected HTTP/',
  rawPacket: Buffer(64) [Uint8Array] [
     71,  69,  84,  32,  47,  99, 105, 100,  47,  67,  73,  68,
     32,  72,  84,  84,  80,  47,  49,  46,  49,  13,  10,  72,
    111, 115, 116,  58,  32, 108, 111,  99,  97, 108, 104, 111,
    115, 116,  58,  56,  48,  13,  10,  67, 111, 110, 110, 101,
     99, 116, 105, 111, 110,  58,  32,  99, 108, 111, 115, 101,
     13,  10,  13,  10
  ]
}

Suggestions by Miro:

Regarding the Reverse HTTP code snippet - I think the client (L2) needs to start with an HTTP/1.1 Upgrade request. The server (L1) responds with HTTP status 101 to confirm the upgrade. Only after this initial exchange are you in control of the TCP connection and can start setting up reverse HTTP.
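
A sketch of what that could look like in Node.js, reusing the roles from the snippet above (the ptth/1.0 protocol name and the hand-written response on the L2 side are assumptions):

import http from 'node:http'

// L1: accepts the Upgrade request from the L2 and confirms it with 101
const server = http.createServer()
server.on('upgrade', (req, socket) => {
  socket.write(
    'HTTP/1.1 101 Switching Protocols\r\n' +
    'Upgrade: ptth/1.0\r\n' +
    'Connection: Upgrade\r\n\r\n'
  )
  // After the 101 exchange the raw socket is ours: the L1 turns into the
  // HTTP client and requests a CID over it
  const cidReq = http.request({
    createConnection: () => socket,
    path: '/cid/CID'
  }, cidRes => {
    console.log('l1 received cid response', cidRes.statusCode)
    cidRes.resume()
    cidRes.on('end', () => server.close())
  })
  cidReq.end()
})

server.listen(3000, () => {
  // L2: opens the connection and asks to upgrade
  const req = http.request('http://127.0.0.1:3000', {
    headers: { Connection: 'Upgrade', Upgrade: 'ptth/1.0' }
  })
  req.on('upgrade', (res, socket, head) => {
    // After 101 the L2 acts as the server on this socket: read the L1's
    // request and reply with a hand-written HTTP response (stand-in for a CAR)
    const reply = () =>
      socket.end('HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nCAR')
    if (head.length > 0) {
      console.log('l2 received request\n' + head.toString())
      reply()
    } else {
      socket.once('data', d => {
        console.log('l2 received request\n' + d.toString())
        reply()
      })
    }
  })
  req.end()
})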

Problem: l1->L2 routing

Now that we have solved the load balancing aspect of L2->l1 routing, another problem arises:

Each L2 is connected to one l1. This means that l1s don't always have a working connection to the L2 that should be selected by the consistent hashing algorithm, and have no choice but to request the CID from the wrong L2, which will then also hydrate itself.

flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|l1
  L2' --> |register|l1'
  l1 --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|l1
  
  linkStyle 2 stroke:red
  linkStyle 4 stroke:red
  linkStyle 5 stroke:red

Proposal: Single proxy

Add a single proxy process that sits in between all l1s and their connected L2s. This proxy can forward requests, selecting the right L2s to contact, and also forward responses to the right l1.

flowchart TB
  subgraph L1
    l1
    l1'
    proxy
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|proxy
  L2' --> |register|proxy
  l1 --> |request CID|proxy
  proxy --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|proxy
  proxy --> |send CAR|l1

Proposal: Hash routing

Route L1/L2 requests by CID path segment (GET /ipfs/CID & POST /data/CID), this way connecting the right l1s with L2s. This requires us to ditch the node cluster approach and let nginx load balance between them.

Downside: l1s need to synchronize with each other, so they all know of all connected L2s and can ask the l1 connected to the right L2 to request the CID.

flowchart TB
  subgraph L1
    l1
    l1'
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|l1
  L2' --> |register|l1
  l1 --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|l1

map $uri $myvar {
    default $uri;
    # pattern "Method1" + "/" + GUID + "/" + target parameter + "/" + HASH;
    "~*/Method1/(.*)/(.*)/(.*)$" $2;
    # pattern "Method2" + "/" + GUID + "/" + target parameter;
    "~*/Method2/(.*)/(.*)$" $2;
}


upstream backend {
    hash $myvar consistent;
    server s1:80;
    server s2:80;
}

Proposal: Single Node.js process

If we switch from one Node.js process per CPU core to a single Node.js HTTP process, then we don't need a routing layer for l1s at all.

This means incoming HTTP requests don't need to be routed to the correct l1, and L2 responses don't need to be routed to the correct l1 either. This also removes the need for persistent connections between l1s and L2s, so we can go back to the NDJSON + HTTP POST protocol.

Downside: If we need to distribute workload across multiple CPU cores, we need to look into strategies like worker pools with shared array buffers, on top of which streaming data verification (and more work) is implemented. For now, this shouldn't be a concern though.

Protocol

L2 registers, L1 requests data
GET https://L1_URL/register/L2_ID
{"requestId":"REQUEST_ID","cid":"CID"}\n
{"requestId":"REQUEST_ID","cid":"CID"}\n
{"requestId":"REQUEST_ID","cid":"CID"}\n
\n                                      <- heartbeat
...

"requestId" is created by the client connecting to the L1, passed down to the L2. It will be attached by L1 and L2 to log lines created as part of the request response cycle, for ease of debugging.

L2 sends data
POST https://L1_URL/data/CID
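
A minimal sketch of the L2 side of this protocol (L1_URL, L2_ID and the stand-in CAR stream are assumptions):

import http from 'node:http'
import { createInterface } from 'node:readline'
import { Readable } from 'node:stream'

const L1_URL = 'http://127.0.0.1:3000'
const L2_ID = 'example-l2-id'

// Stand-in for producing the CAR bytes for a CID
const carStreamFor = cid => Readable.from([Buffer.from(`CAR for ${cid}`)])

// Register: a long-lived GET whose response body is a stream of
// newline-delimited JSON requests (an empty line is a heartbeat)
http.get(`${L1_URL}/register/${L2_ID}`, res => {
  const lines = createInterface({ input: res })
  lines.on('line', line => {
    if (line === '') return // heartbeat
    const { requestId, cid } = JSON.parse(line)
    console.log(`request ${requestId}: cid ${cid}`)

    // Send data: POST the CAR back to the L1, addressed by CID
    const post = http.request(`${L1_URL}/data/${cid}`, { method: 'POST' },
      postRes => postRes.resume())
    carStreamFor(cid).pipe(post)
  })
})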

Architecture

flowchart TB
  subgraph L1
    l1
  end
  subgraph L2s
    L2
    L2'
  end
  L2 --> |register|l1
  L2' --> |register|l1
  l1 --> |request CID|L2
  client --> |request CID|l1
  l1 --> |send CAR|client
  L2 --> |send CAR|l1

Potential improvements

Report L2 health through bidirectional ndjson protocol
POST https://L1_URL/register/L2_ID
> {"ok":true,"cpu":0.8,"memory":0.2}\n
< {"requestId":"REQUEST_ID","cid":"CID"}\n
< {"requestId":"REQUEST_ID","cid":"CID"}\n
< {"requestId":"REQUEST_ID","cid":"CID"}\n

@juliangruber commented Aug 16, 2022

  • document Miro's UPGRADE suggestion

@juliangruber commented Aug 16, 2022

  • document potential bidirectional ndjson protocol for reporting l2 health
