-
-
Save nandilugio/0f1113880afe8074e5745e576b4ff32d to your computer and use it in GitHub Desktop.
## Sync framework ############################################################ | |
# To be added to models that can be part of a synced group of entities | |
# AR implementation. Other could be used if API is preserved. | |
module Syncable | |
attr_accessor :sync_group_id, :syncer | |
after_update do | |
SyncGroup.find(sync_group_id).touch! if changed.include?(@_ssynced_attrs) | |
end | |
def find_by_sync_group_id(id) | |
end | |
def self.synced_with(syncer_class) | |
syncer = syncer_class | |
end | |
def self.synced_attrs(attrs) | |
@_synced_attrs = attrs | |
end | |
# Registers the synced group of local instances mapping to a NS entity instance | |
# AR implementation. Other could be used if API is preserved. | |
class SyncGroup < ActiveRecord::Base | |
attr_accessor :sync_group_id, :syncer, :touched_at, :synced_at | |
def touch! | |
touched_at = Time.now | |
end | |
def mark_synced! | |
synced_at = Time.now | |
end | |
end | |
# Base for integrations | |
class Syncer | |
# "From now on, sync these instances as a X on the remote" | |
def sync(*attrs) | |
thing = find(*attrs) | |
group = group_dependencies(thing) | |
ns_thing = translate(thing) | |
create!(ns_thing, group) | |
end | |
# Sync all instances that have been updated since last sync | |
def update_all_pending! | |
group_ids = SyncGroup | |
.where(syncer: self.class) | |
.where("touched_at > synced_at") | |
.pluck(:id) | |
group_ids.each do |group_id| | |
update_group!(group_id) | |
end | |
end | |
private | |
def group_dependencies(thing) | |
group = new_group | |
sync_group_members(thing).each do |syncable_member| | |
syncable_member.sync_group_id = group.id | |
end | |
group | |
end | |
def new_group | |
SyncGroup.create!(syncer: self.class) | |
end | |
def update_group!(sync_group_id) | |
thing = find_by_sync_group_id(sync_group_id) | |
ns_thing = translate(thing) | |
update!(ns_thing) | |
end | |
# The following operations (CRUD) can be refactored to be done asynchronously: | |
# - Sidekiq: doesn't ensure order: pass thing.id and translate to ns_thing in | |
# worker to ensure last version is synced? probably not enough if we need to | |
# keep order of events. | |
# - Implement a queue: low throughput, postgres is enough. A sidekiq worker | |
# (unique? self-enqueuing?) can consume it to avoid a single-point-of-failure. | |
def create!(ns_thing) | |
create_remote(ns_thing) | |
group.mark_synced! | |
end | |
def update!(ns_thing) | |
update_remote(ns_thing) | |
group.mark_synced! | |
end | |
end | |
## Local model ################################################################# | |
class Thing | |
has_one :thing_dep | |
include Syncable | |
synced_with NetSuiteSomethingSyncer # See below (*) | |
synced_attrs :x, :y | |
end | |
class ThingDep | |
belongs_to :thing | |
include Syncable | |
synced_with NetSuiteSomethingSyncer # See below (*) | |
synced_attrs :x, :y | |
end | |
# (*) Thing and its deps map to a Something entity in NS | |
## Netsuite integration ######################################################## | |
# The following encapsulate NetSuite specific logic: | |
# - Network communication | |
# - Finding info on local model | |
# - Translation of that info to NetSuite's corresponding model | |
# Only needed if CRUD operations are done the same way for all NS objects | |
class NetSuiteSyncer < Syncer | |
def create_remote(ns_thing) | |
ns_thing.add | |
end | |
def update_remote(ns_thing) | |
ns_thing.update | |
end | |
end | |
# Thing and its deps map to a Something entity in NS | |
class NetSuiteSomethingSyncer < NetSuiteSyncer | |
def find(thing) | |
thing.includes(ThingDep) | |
end | |
def find_by_sync_group_id(id) | |
Thing.find_by_sync_group_id(id) | |
end | |
def translate(thing) | |
# Could use a NetSuiteSomethingTraslator | |
NetSuiteGem::Entity::Something.new({x: thing.x, y: thing.y + 1}) | |
end | |
def sync_group_members(thing) | |
[thing, thing.thing_dep] # Syncables | |
end | |
end | |
## Business logic (controllers,etc.) ########################################### | |
class ThingCreationService | |
def create(*attrs) | |
thing = Thing.new(*attrs) | |
thing.thing_dep = ThingDep.new(*attrs) | |
thing.save! | |
mail_user(thing) | |
NetSuiteSomethingSyncer.new.sync(thing) | |
end | |
end | |
# NetSuiteSomethingSyncer#sync registers instance for syncing and does NS create | |
# We could easily modify the proposal to defer the NS create | |
thing = ThingCreationService.create(...) | |
# Marks group as dirty | |
thing.thing_dep.update_attributes!(x: 345) | |
# This could be a recurrent worker, a self-enqueuing one, we could enqueue | |
# on-update, etc. | |
class NetSuiteSyncWorker | |
def perform | |
# Performs NS update | |
NetSuiteSomethingSyncer.new.update_all_pending! | |
end | |
end |
I was definitely missing background to fully grasp your proposal xD. Glad we concur on all this, my example in pseudo-code tried to clarify all the concepts. Some things:
- Regarding my point 3) before, even if we have entities in netsuite like
SalesOrder
that maps to multiple models in our app (because it includesBillingInfo
etc.), can't we see this as composition? Meaning, in this example, we have one "primary" entity in our app that maps toSalesOrder
, which isOrder
. And in the mapping function of that entity there are multiple fields that we gather from other models. Same as in my pseudo-code example of User, here we would have:
def netsuite_representation(order)
{
ref: order.ref,
billing_country: order.billing_info.country,
# etc...
}
end
This makes the sync group complexity unnecessary? Or are there more use cases?
Also, we have entities that do not map in conceptual meaning
Maybe then the solution is to change our app as well, introducing FiscalIdentity
again! Then both models (ours and netsuite's) will match more closely and this extra complexities in syncing will dissapear.
- About the syncing strategy
True. Do you feel this proposal fulfills those needs?
Yes, no problem with that. I see you use 2 timestamps instead of one boolean to control the syncing status, implementation detail, it's the same. But use meaningful names "last_netsuite_synced_at" instead of "synced_at" xD.
-
About 5). Yes, it's true we don't need md5 hashes if we use specific observers only on the adequate fields. However, purely speaking you're assuming that a change in the input of the ETL function will result in a different output, which could not be always true. But in practice I would definitely not implement any hashing strategy to avoid extra syncing if we use specific-observers.
-
I will elaborate a bit on my old work of "dr_manhattan" to see it we can use it here.
Regarding my point 3) before, even if we have entities in netsuite like SalesOrder that maps to multiple models in our app (because it includes BillingInfo etc.), can't we see this as composition? ...
This makes the sync group complexity unnecessary? Or are there more use cases?
Yes, the transformation or #translate
function will do exactly that, but in order to get the touched_at
timestamp updated when that BillingInfo
dependency changes, we need to keep track of those.
In fact there is something I don't 100% like there but see no magic we can do to solve cleanly: The developer has the responsibility to be explicit about the dependencies that can change the output of the transformation function so there are 2 things that have to be manually synchronized while developing: the dependencies declared in #sync_group_members
and the ones used in #translate
.
I've also just noticed that the inclusion of Syncable
already offers the info #sync_group_members
returns, so specifying that by hand is probably redundant. Maybe some magic there is not that difficult to implement but need to think about this a bit. Anyway, these are implementation details only. We'll solve these if we agree on developing this.
Maybe then the solution is to change our app as well, introducing FiscalIdentity again! Then both models (ours and netsuite's) will match more closely and this extra complexities in syncing will dissapear.
Yes, in fact @Laura is bringing it back from the dead 👻 It's named FiscalCustomer
now :p. See #4403.
But use meaningful names "last_netsuite_synced_at" instead of "synced_at" xD.
Nah man, this solution transcends NetSuite ;p heheh. In fact, it could be used for SalesForce or whatever.
I will elaborate a bit on my old work of "dr_manhattan" to see it we can use it here.
Looking forward to see it!
Aja! ladies first @Laura:
Totally true. That was the original idea. Code now reflects that
Also true, forgot! Updated the code :)
Good point. If we happen to need this, a multi-group version of the
Syncable
module should be easy to do. 👍Yup. That's discussed in the "About asynchronous implementation and queues" section of the 1st comment.
Yes, that's why I've left that out of the proposal for now. If the sync operation is only one-directional (local -> remote) and no state/event sequence is to be maintained, the proposed solution above should suffice. I've enclosed the update in a "worker" for clarity above.
Not sure what you mean here. Are you referring to the initialization of the netsuite entity instance so it retrieves the defaults from the remote end? if so, I guess that's a problem that we could attack on the "translation" function.
It can perform that initialization before creating the instance. We could even try to cache that call for defaults or even manually (and perhaps dangerously) configure those same defaults locally, but assuming syncing is going to be done asynchronously, we can simply fail and retry on that request.
Anyway, that's a problem particular to NetSuite and should not affect the structure of the proposal here.
Now @Roger:
also
Yes, this is an important point. That transformation function / ETL corresponds to the
#translate
method above. That's what I've meant with "not to have this map living in our database in any way". We need the data to be what it is (our model), and there's no need to persist the result of the transformation since it can always be obtained. Unless of course, we want to maintain the event sequence, in which my proposal is to persist that transformation output in a command added to a queue.Agree, yes +1
Well, @Laura please confirm this please but if I'm not mistaken
SalesOrder
maps to anOrder
and things like it'sBillingInfo
,ShippingInfo
, etc. If any of those deps change, theSalesOrder
should get the update. That's why we need aSyncGroup
(orSyncMap
? ;p).LineItem
s would probably fall in that same category.Not sure if we already have an example of the inverse (one local to many remotes) but I guess that could be easily managed by the implementation of the CRUD methods for that particular syncer.
Also, we have entities that do not map in conceptual meaning. A
Customer
is not defined exactly the same in all scopes. Marketing wants a customer entity that groups all possible orders and contacts a particular client organization has. That's probably what we want in SalesForce and the DW (well, we want everything on the DW ;p). But NetSuite needs all "Fiscal Identities" (dejavú!) billed by us. @Laura is refactoring our model to have both matches but I wonder if that couldn't be also abstracted as an ETL function.True. Do you feel this proposal fulfills those needs?
If we'd like to have an idempotent sync process and be able to resync from any state, calculating hashes for every instance can be very expensive, and that cost will increase linearly with the contents of the DB. Relying on the timestamps of the group and the command queue we get those benefits without that cost.
(see above replies)
(above)
True true!! renamed the var to
touched_at
. Some implementation around that was missing. Thx @Laura!"Replaced", or "implemented by"? ;p
Yes. I think we're on the same page. The details @Laura pointed out where probably misleading enough for you to feel we were not but please re-read the updated code above with this in mind and comment again if you still feel this proposal is not aligned with your ideas.
NOTE: Also trimmed some corners on the 1st comment. Search for "EDIT: " there :)