ambientlight/rfc_offline_first_backend_arch.md

RFC: Organic backend architecture for offline-first applications via Logux

Goal

Lay the groundwork for a distributed backend architecture that supports offline use cases, in particular synchronizing client applications that have been offline for days or weeks. The architecture should also support easy-to-replicate offline servers that act as ad-hoc (edge) synchronization endpoints and offer the full range of business-specific functionality, so that clients remain unaware of missing internet connectivity between the ad-hoc server and the cloud-hosted server.
Non-goal

This RFC is not meant to describe any application-specific business logic (a separate RFC will cover that), such as the features provided by the solution hosted on this backend, entity-relationship diagrams modeled in the databases, etc.
Basis

The main area of exploration and experimentation is Logux, which conceptually can be viewed as a Redux action stream synchronized over a keep-alive WebSocket connection. One assumption is that Logux can be a convenient protocol for working with data structures that avoid merge conflicts by design, known as CRDTs (conflict-free replicated data types), originally described in https://hal.inria.fr/inria-00609399v1/document . CRDTs have not yet made their way into the Logux open-source tooling, so this is where the experimentation should start.
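To make the CRDT idea concrete, here is a minimal sketch of a last-writer-wins map driven by an action stream. The action shape (`field/set`, `time`, `nodeId`) and the function names are assumptions made for illustration; they are not part of the Logux API. The point is only that replaying the same actions in any order, or more than once, yields the same state.

```typescript
// Hypothetical action shape for illustration; not the Logux API.
interface SetFieldAction {
  type: "field/set";
  key: string;
  value: string;
  time: number;   // timestamp assigned by the node that created the action
  nodeId: string; // tie-breaker when two timestamps are equal
}

interface Entry {
  value: string;
  time: number;
  nodeId: string;
}

type LwwMap = Map<string, Entry>;

// Total order over writes: later time wins, nodeId breaks ties.
function isNewer(
  a: { time: number; nodeId: string },
  b: { time: number; nodeId: string }
): boolean {
  return a.time > b.time || (a.time === b.time && a.nodeId > b.nodeId);
}

// Applying an action keeps the newest write per key, so the result does not
// depend on delivery order and re-applying an action changes nothing.
function applyAction(state: LwwMap, action: SetFieldAction): LwwMap {
  const current = state.get(action.key);
  if (current !== undefined && !isNewer(action, current)) {
    return state;
  }
  const next = new Map(state);
  next.set(action.key, {
    value: action.value,
    time: action.time,
    nodeId: action.nodeId,
  });
  return next;
}

function replay(actions: SetFieldAction[]): LwwMap {
  return actions.reduce<LwwMap>(
    (state, action) => applyAction(state, action),
    new Map<string, Entry>()
  );
}
```

Last-writer-wins is the simplest choice and silently drops the losing write; counters, sets, or text would need other CRDT designs.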
Another topic to explore is the scalability potential of such Logux-centered solutions. The existing server implementation comes in two flavours: as a WebSocket server and as a reverse proxy that wraps a REST endpoint. (My feeling is that communication over WebSocket may not always be required as long as live-collaboration scenarios are out of the question, although special care may be needed if any logic tied to pinging clients is used for timestamp syncing.) Experimenting with pull-based clients that do not require WebSockets could make such solutions easier to scale. For often-offline clients whose business logic does not rely on live collaboration, syncing can be triggered by APN / FCM / service-worker push notifications, with the resulting pulls hidden behind a "You have new updates" ribbon on the client, since all of these notifications need to be visible to end users. As there is no RESTful HTTP client implementation of a Logux-like protocol, it would be worth experimenting to see whether Logux fits such REST-based communication.
Another point to explore is the possibility of such a system having large and complex object graphs that are hard to reason about on the client side, because they require a lot of explicit CRUD-like handling (a huge number of state transitions; you may have seen redux-dev-tools freeze because actions were dispatched too frequently). Even if Logux clients remove the boilerplate in the same way that various GraphQL clients do when bridging APIs with state updates, that does not solve the case where huge action chains accumulate on the clients and then need to be synced all at once. (Imagine a few thousand clients that were offline for weeks suddenly coming online at the same time: the system needs to either scale out instantly for short-term peaks or slow clients down intentionally.)
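A pull-based client could look roughly like the sketch below: the client remembers a cursor and, when a push notification or the ribbon triggers a sync, pulls pages of actions newer than that cursor. Everything here is an assumption for illustration (`ActionLog`, `PullClient`, the `seq` cursor, and the imagined `GET /actions?since=<seq>&limit=<n>` endpoint that `pull` stands in for); it is an in-memory model, not a Logux client.

```typescript
interface LoggedAction {
  seq: number; // server-assigned, strictly increasing sequence number
  type: string;
  payload: unknown;
}

interface Page {
  actions: LoggedAction[];
  hasMore: boolean;
}

// In-memory stand-in for the server-side action log.
class ActionLog {
  private entries: LoggedAction[] = [];

  append(type: string, payload: unknown): LoggedAction {
    const entry: LoggedAction = { seq: this.entries.length + 1, type, payload };
    this.entries.push(entry);
    return entry;
  }

  // Models what a hypothetical GET /actions?since=<seq>&limit=<n> would return.
  pull(since: number, limit: number): Page {
    const newer = this.entries.filter((entry) => entry.seq > since);
    return { actions: newer.slice(0, limit), hasMore: newer.length > limit };
  }
}

class PullClient {
  cursor = 0; // seq of the last action this client has received
  received: LoggedAction[] = [];
  private server: ActionLog;
  private pageSize: number;

  constructor(server: ActionLog, pageSize: number) {
    this.server = server;
    this.pageSize = pageSize;
  }

  // Pulls pages until the server reports nothing newer; returns the number
  // of actions received during this sync.
  sync(): number {
    let pulled = 0;
    let hasMore = true;
    while (hasMore) {
      const page = this.server.pull(this.cursor, this.pageSize);
      for (const action of page.actions) {
        this.received.push(action);
        this.cursor = action.seq;
      }
      pulled += page.actions.length;
      // Stop on an empty page so a misbehaving server cannot loop us forever.
      hasMore = page.hasMore && page.actions.length > 0;
    }
    return pulled;
  }
}
```

Because the page size is bounded, the server side of such a protocol stays stateless per request, which is the property that would make it easier to scale than long-lived WebSocket connections.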
The last point concerns Logux-centric persistence. I am not making any assumptions here because my knowledge of this area is superficial, but the store can be anything: SQL / NoSQL, a triple store, a property graph, or anything else we can imagine. My first wild idea builds on the concept of server nodes that share the same partitioning as the databases but run in micro-container / micro-VM environments such as AWS Lambda. Hypothetically, each lambda holds an in-memory SQLite database that is initialized heuristically with a partial view of the whole datastore. As clients communicate with those lambdas, they accumulate more data, just as caches do, and then sync this data among themselves and with the central data store through a centralized pub/sub broker. There are many unanswered questions here, such as how to partition all this. Also, assuming in-memory queries are acceptable, we will still incur extra costs on query misses in those lambdas, because they need to fall through to the central nodes. Note that this can be done more traditionally with container-based architectures on top of Kubernetes, where more compute-intensive nodes store heavier and larger partitions, but the point is to downscale it into micro-VMs such as lambdas.
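The fall-through behaviour can be modelled in a few lines. The sketch below replaces in-memory SQLite with a plain `Map` and the pub/sub broker with an in-process callback list; `CentralStore`, `Broker`, and `EdgeNode` are hypothetical names, and nothing here touches AWS Lambda. It only shows the read path (hit locally, otherwise fall through and remember the row) and the write path (update locally, publish to everyone else).

```typescript
interface Row {
  id: string;
  data: string;
}

// Stand-in for the central data store; counts reads so misses are visible.
class CentralStore {
  reads = 0;
  private rows = new Map<string, Row>();

  put(row: Row): void {
    this.rows.set(row.id, row);
  }

  get(id: string): Row | undefined {
    this.reads += 1;
    return this.rows.get(id);
  }
}

// Stand-in for the centralized pub/sub broker (in-process, synchronous).
class Broker {
  private subscribers: Array<(row: Row) => void> = [];

  subscribe(handler: (row: Row) => void): void {
    this.subscribers.push(handler);
  }

  publish(row: Row): void {
    for (const handler of this.subscribers) handler(row);
  }
}

// Stand-in for one lambda holding a partial, in-memory view of the data.
class EdgeNode {
  misses = 0;
  private cache = new Map<string, Row>();
  private central: CentralStore;
  private broker: Broker;

  constructor(central: CentralStore, broker: Broker, warmRows: Row[]) {
    this.central = central;
    this.broker = broker;
    for (const row of warmRows) this.cache.set(row.id, row);
    // Refresh rows this node already holds; ignore the rest (partial view).
    broker.subscribe((row) => {
      if (this.cache.has(row.id)) this.cache.set(row.id, row);
    });
  }

  // Read path: serve from memory, fall through to the central store on a miss.
  get(id: string): Row | undefined {
    const hit = this.cache.get(id);
    if (hit !== undefined) return hit;
    this.misses += 1;
    const row = this.central.get(id);
    if (row !== undefined) this.cache.set(id, row);
    return row;
  }

  // Write path: update the local copy and let the broker fan the change out.
  write(row: Row): void {
    this.cache.set(row.id, row);
    this.broker.publish(row);
  }
}
```

To complete the wiring, the central store subscribes to the broker as well, for example `broker.subscribe((row) => central.put(row))`. The sketch ignores conflicting writes from two nodes, eviction, and cold starts, which are exactly the open questions raised above.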
TODOS


1. Implement a few CRDTs and test them with the Logux server to see how it looks and feels.
2. Hack around the Logux core, or just make a naive implementation of the protocol, and see how it goes without WebSockets.
3. Simulate absurdly long action chains on the clients and then sync many of them at the same time. If we split those into chunks, lambdas will scale out instantly; if some form of item 4 is in place, database nodes will scale out as well.
4. Experiment with this wild idea of synchronizing in-memory database nodes across lambdas, with fall-through to the central DB node.
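For the simulation of many clients syncing at once, two small helpers may be a useful starting point: one splits an accumulated action chain into fixed-size chunks, the other computes a randomized exponential backoff delay so that clients coming online together do not all upload at the same instant. Both function names and all parameters are assumptions for illustration, and the "full jitter" backoff is a swapped-in, commonly used technique rather than anything Logux prescribes.

```typescript
// Splits a backlog into chunks of at most `size` items, preserving order.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// "Full jitter" backoff: a random delay between 0 and
// min(capMs, baseMs * 2^attempt). The random source is injectable so the
// function can be tested deterministically.
function jitteredDelayMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

A client would upload one chunk per request and wait `jitteredDelayMs(attempt, ...)` before retrying when the server signals overload, which is one way to "slow down clients intentionally".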

Disclaimer

You might feel I am overengineering here, but that is the purpose of this first exploration around Logux.