Was asked for a review of the mergeful
. These are my findings.
I don't think I'll look into the correctness of the code too much; I'll probably make more general observations, but who knows. What follows is an unfiltered list of my thoughts as they occurred to me.
- The code seems to expose a pure domain layer for the synchronization problem and not make any assumptions about networking or other such things.
- Serialization of types is left up to the user. Generic instances are exposed. That's nice.
- Unused dependencies? Aeson, containers, mtl, time. Plans to offer some basic serialization after all?
- Everything seems to be exported: always good :) Making code completely
private should never happen (although an
Internal
module can be nice) - Merge strategies would need some expansion. The only available strategy is
to discard problems. Add a combinator
(a -> a -> a)
to merge two values? Should probably also have a monadic variant? - Keeping the
clientState
during a merge seems like a bad default. That would block clients as long as the item isn't removed from the server. This means that a single client can lock others out. The opposite would lead to data loss though. Maybe there is no good default. - Maybe found a bug: The case where two clients send a delete at the same time
currently ends up in a desync for one client. Couldn't you keep the server
empty in
mergeful/src/Data/Mergeful/Item.hs:L292
and returnItemSyncResponseSuccesfullyDeleted
? - The protocol seems to break down when a client talks with multiple servers/sources of truth. Maybe it's a good idea to add an identifier to the server that the client fetches on start and subsequently sends with every request? That would allow the protocol to detect this.
MergeMismatch
should probably include more diagnostic information for error messages and such.- If you want to take this to production: the protocol needs a version to allow for changes/transitions. The server should probably also be able to tell the client that their protocol version is too old.
I like it; especially the fact that you kept the types simple, pure, and didn't go all in on type level state machine fanciness. Should be a cool showcase of domain modelling in Haskell.
Would be fun to see if it can generalize over collections of things, or to extend this to support P2P sync+gossip. I guess a simple version of that could be achieved with a static list of members with a fixed size vector clock. Dynamic membership would require interval tree clocks.