duijf/REVIEW.md

## REVIEW.md

      
    Raw
  

              REVIEW.md
            
          
    Review

Was asked for a review of the mergeful. These are my findings.
Approach and scope

I don't think I'll look into the correctness of the code too much; I'll
probably make more general observations, but who knows. What follows is an
unfiltered list of my thoughts as they occurred to me.
Findings


The code seems to expose a pure domain layer for the synchronization problem
and not make any assumptions about networking or other such things.
Serialization of types is left up to the user. Generic instances are
exposed. That's nice.
Unused dependencies? Aeson, containers, mtl, time. Plans to offer some basic
serialization after all?
Everything seems to be exported: always good :) Making code completely
private should never happen (although an Internal module can be nice)
Merge strategies would need some expansion. The only available strategy is
to discard problems. Add a combinator (a -> a -> a) to merge two values?
Should probably also have a monadic variant?
Keeping the clientState during a merge seems like a bad default. That
would block clients as long as the item isn't removed from the server.
This means that a single client can lock others out. The opposite would lead
to data loss though. Maybe there is no good default.
Maybe found a bug: The case where two clients send a delete at the same time
currently ends up in a desync for one client. Couldn't you keep the server
empty in mergeful/src/Data/Mergeful/Item.hs:L292 and return
ItemSyncResponseSuccesfullyDeleted?
The protocol seems to break down when a client talks with multiple
servers/sources of truth. Maybe it's a good idea to add an identifier to the
server that the client fetches on start and subsequently sends with every
request? That would allow the protocol to detect this.
MergeMismatch should probably include more diagnostic information for
error messages and such.
If you want to take this to production: the protocol needs a version to
allow for changes/transitions. The server should probably also be able to
tell the client that their protocol version is too old.

Summary

I like it; especially the fact that you kept the types simple, pure, and didn't
go all in on type level state machine fanciness. Should be a cool showcase of
domain modelling in Haskell.
Would be fun to see if it can generalize over collections of things, or to
extend this to support P2P sync+gossip. I guess a simple version of that could
be achieved with a static list of members with a fixed size vector clock.
Dynamic membership would require interval tree clocks.