elliot42/gist:831b93883c9b33c1c53ddb855fab41cc

## gistfile1.md

      
Managing side effects and coupling

Any system that's not a pure calculation has effects: one thing
causes another thing to happen, in causal temporal sequence.
There are four broad categories of how to do this:

Synchronously and tightly coupled
Synchronously and loosely coupled
Asynchronously and tightly coupled
Asynchronously and loosely coupled

Synchronous, tightly coupled effects

This is calling something directly, so that foo directly causes
bar by calling it.
def foo()
  puts "0"
  bar()
end

def bar()
  puts "1"
end
This is very simple but tightly coupled code:

foo necessarily needs to know about bar
bar needs to happen directly after foo;
foo cannot return until bar completes.

Code coupling is an obvious and well-known problem: in many cases
foo should not know about or include any references to the concept of
bar.  But code often becomes tightly coupled like this anyway,
because foo needs to cause bar, and the easiest way is to
hardcode the call.
A more complex problem that occurs in synchronous, tightly coupled
side-effecting code is that you end up with deeply connected call
chains.
A deeply connected call chain looks like this:
def foo()
  puts "0"
  bar()
end

def bar()
  puts "1"
  baz()
end

def baz()
  puts "2"
end
This adds complexity: to follow one action, a reader has to jump
through several layers of code.  Remember in this example that bar
and baz are not part of foo; they're just other separate things that
are supposed to happen after foo.  But foo itself cannot stand alone
and return until its whole chain of effects completes.
This becomes especially significant when you consider failures and
interruptions.  Exceptions in bar and baz can break foo.
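To make the failure case concrete, here is a minimal sketch of the same chain in which baz raises. The error message and the call_foo helper are hypothetical, added only for illustration.

```ruby
# Hypothetical failure in the last link of the chain.
def baz
  raise "baz failed"
end

def bar
  baz
end

def foo
  steps = ["0"]           # foo's own work succeeds...
  bar                     # ...but the chained effect raises,
  steps << "foo returned" # so this line is never reached
  steps
end

# Hypothetical helper that reports what a caller of foo observes.
def call_foo
  foo
  "foo completed"
rescue RuntimeError => e
  "foo interrupted: #{e.message}"
end

puts call_foo
```

Even though foo's own work finished, its caller only sees the exception raised two levels down.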
The fundamental vulnerability in this situation is that foo needs
to directly reference bar, which can be both conceptually wrong
and systemically a source of complexity and bugs.  Synchronous
decoupling mechanisms can solve part of this problem.
Synchronous, loosely coupled effects

Suppose you want to have foo happen, and then have bar happen.
There are at least two ways to do this in a decoupled manner, i.e.
where foo and then bar happen, but without foo knowing about bar.
They both boil down to the caller triggering the secondary bar effect
through a layer of indirection that isolates foo from bar.
Managers

The first way to get foo then bar, without foo knowing about
bar, is to have a higher-level party coordinate their ordering:
def foo()
  puts "0"
end

def bar()
  puts "1"
end

def manager()
  foo()
  bar()
end
This is actually fairly clean, but requires defining a wrapper manager
around everything that you want to compose.  Furthermore, manager becomes an
ill-defined concept because it's just "everything that happens in the
sequence"--imagine the test cases for manager: they're just the combination of
all the cases from foo and bar, which is often a big unrelated mess of
effects.
Also, technically speaking, this is sort of just moving the coupling
into manager, which must know about everything (even though foo does
not).
Observers/callbacks/events

The second major way to get synchronous, decoupled side effects is
with a pattern that is variously called observers, callbacks or events.
class MyCallback
  def call
    puts "1"
  end
end


# style 1: callbacks internal to `foo`
def foo(callbacks)
  puts "0"

  callbacks.each do |callback|
    callback.call()
  end
end

foo([MyCallback.new])


# style 2: callbacks external to `foo`
def foo()
  puts "0"
end

def baz(callbacks)
  foo()
  
  callbacks.each do |callback|
    callback.call()
  end
end

baz([MyCallback.new])
Although these look extremely similar, you can argue that style 1,
where foo calls its effects even though it doesn't know what they
are, is "dependency inversion:" foo knows it has dependencies,
but it calls them in a generic manner without being hard-coupled
to how exactly they work.
Style 2, on the other hand, could arguably be "inversion of control,"
where foo does not even know it has effects; both foo and
the effects are triggered by the "framework" (baz in this case).
In other words, the functions in style 2 do not call each other.
They are all passively called by other code that is in control
(which actually makes this structurally similar to a "Manager"),
and it's this other code taking responsibility for the coordination
that allows the callees themselves to not know about each other.
In effect, all of the decoupling mechanisms above are the same: all of
the callbacks run without foo knowing anything about what they are
or how they work.
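The "events" flavor of this pattern can be sketched as a small registry that maps event names to handlers. This is a minimal illustration, not a real library: the Events class, its subscribe and publish methods, and the :foo_done event name are all assumptions made up for this example.

```ruby
# Hypothetical minimal event registry; not a real library.
class Events
  def initialize
    # Each event name maps to a list of handlers, created on first use.
    @handlers = Hash.new { |hash, name| hash[name] = [] }
  end

  # Register a block to run whenever `name` is published.
  def subscribe(name, &handler)
    @handlers[name] << handler
  end

  # Run every handler for `name` synchronously, in registration order.
  def publish(name, payload = nil)
    @handlers[name].map { |handler| handler.call(payload) }
  end
end

# foo announces that it happened, without knowing who is listening.
def foo(events)
  puts "0"
  events.publish(:foo_done)
end

events = Events.new
events.subscribe(:foo_done) { puts "1" }
foo(events)
```

This is essentially style 1 with named events: foo still triggers its effects itself, but only through the generic publish call.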
Synchronous decoupling gets you very far

As we can see, there are some very straightforward and effective ways to
sequence effects happening one after another, without each of the
effects needing to know about or be defined in terms of each other.
Where does this break down, or where would you ever need anything
different?
The primary answer is in synchronicity.  In all of the synchronous
examples above, the chain of effects starts because some caller knows
to call right now and set off the effects right now, essentially
within the same call stack.
However, not all effects happen in the same known call stack as an
existing caller.  This is where we can shift to asynchronous code,
and then see how that code can be tightly or loosely coupled.
Asynchronous, tightly coupled effects

Asynchronous code is not intrinsically decoupled code.  A simple example
is a background job, such as an ActiveJob job running on a backend like Sidekiq:
class BarJob < ActiveJob::Base
  def perform
    puts "1"
  end
end

def foo
  BarJob.perform_later
end
Despite BarJob happening at a totally separate time, on potentially
a totally separate machine, nevertheless foo had to directly know
about it and call it.
Other types of async message-passing code can also have this coupling,
e.g. an AJAX POST call, despite being completely asynchronous, still
has to know and explicitly depend upon its callee.
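The same coupling shows up without any job framework. As a rough sketch in plain Ruby, with a thread standing in for the background worker (the sleep is just a placeholder for slow work):

```ruby
def bar
  sleep 0.05 # stand-in for slow work
  "1"
end

def foo
  # foo returns immediately, but it still names bar explicitly.
  Thread.new { bar }
end

worker = foo
puts "0"          # foo has already returned at this point
puts worker.value # Thread#value waits for bar and returns its result
```

bar runs at a different time from foo, yet foo cannot be read, tested or changed without reference to bar.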
Asynchronous, loosely coupled effects

Asynchronous code can be decoupled just by keeping a single constraint
in mind: bar happens in a completely separate process at a later time, and
thus needs to be told or discover that foo happened.
This sounds a little general or mysterious, but in practice it has a
simple solution: foo makes a durable record of itself, which bar
will discover at a later time:
def foo
  puts "0"
  FooRecord.create!()
end

# style 1: "reactive"-code receives messages in a real-time
# evented/streaming fashion.  Assume you had some server that
# would receive and then respond to messages POSTed from `foo`
def bar
  receive_message do |foo|
    puts "1"
  end
end


# style 2: "log"-style code handles log entries indefinitely later
# after they were written
# Assume you had some async job that would fire up every once in a while
# and read off the log of what had previously been written.
def bar
  FooRecord.each do |foo|
    puts "1"
  end
end
Note that you can do either style with an external framework
to further decouple bar from how bar is triggered:
def foo
  puts "0"
  FooRecord.create!()
end

def bar
  puts "1"
end

def baz
  FooRecord.each do |foo|
    bar()
  end
end
You'll notice that in the async case, it's not that there's no
coupling between foo and bar.  Rather, instead of foo knowing to
cause bar, bar operates on its own and either waits for evidence
that foo occurred, or periodically looks up evidence about whether
foo has occurred.  Across process boundaries, you do seem to need
these bits of communication/coordination; without any coordination
it's impossible for foo to cause bar, directly or
indirectly.
In a certain sense, asynchronous decoupling looks quite similar to
synchronous decoupling, when the decoupling is done through inversion
of control in both cases.  In both sync decoupled and async decoupled,
some other party/framework handles linking foo and bar, without
them being hard-coded to cause each other.  In the asynchronous case,
additional machinery (the durable log entry) is required to reliably
coordinate across processes, time and machines--the DB or disk storing
the log serves as the coordinator between two otherwise completely
separate processes.
It's possible to make logs in a variety of different ways, from a very
specific SQL-table-as-log, to a generic log platform, etc., but
fundamentally they're all just using storage to durably communicate and
coordinate across time and process boundaries.
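As one possible illustration of "storage as coordinator", here is a sketch that uses a plain file of JSON lines as the log. The file format, the "event" key and the line-count offset are assumptions made for this example; a real system would more likely use a database table or a log platform, as described above.

```ruby
require "json"
require "tempfile"

# foo only appends a record; it knows nothing about any readers.
def foo(log_path)
  File.open(log_path, "a") { |f| f.puts(JSON.generate("event" => "foo")) }
end

# bar handles the records written since `offset` (a line count) and
# returns the new offset so a later run can resume from there.
def bar(log_path, offset)
  lines = File.readlines(log_path)
  lines.drop(offset).each do |line|
    record = JSON.parse(line)
    puts "1 (saw #{record["event"]})"
  end
  lines.length
end

log = Tempfile.new("foo_log")
foo(log.path)
foo(log.path)
offset = bar(log.path, 0)      # handles both records
foo(log.path)
offset = bar(log.path, offset) # handles only the new record
```

Keeping the offset outside bar is what lets it stop and restart without reprocessing old entries; where that offset is stored durably is left out of this sketch.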
Asynchronous vs. synchronous

So if you can decouple both synchronous and asynchronous code, what's
the difference between synchronous and asynchronous?  For better or
worse:


Asynchronous

More general; it can cover both synchronous and
asynchronous use cases
Albeit at a higher machinery cost (although this machinery
can largely be written once and reused--it just consists
of writing a log entry, and iterating through log entries)
Perhaps more naturally resilient in a large distributed system
since cross-process communication will naturally want to be
durable and restart-proof


Synchronous

Simpler, and the default by far
Less machinery
Potentially more vulnerable to crashes if it's not constantly
storing state and recovering from crashes


Finally, perhaps the most central difference is literally that some
code cannot be written synchronously/"proactively", it must be
written asynchronously/reactively, e.g. a server waiting to receive
messages, because the reaction effects cannot occur
straightforwardly in the same call stack as the caller.
Whatever these required cases may be, clearly one would have
to switch over to writing the best async code one could.  In other
cases, perhaps the safer default would be to stick with
synchronous code, unless for example one was in a programming
environment that was pervasively async and concurrent by default--Go
(with CSP-style channels) and Erlang (with actor-style message passing)
are certainly built for first-class idiomatic concurrency as the
standard programming paradigm.  Ruby of course is not.
The cases above should broadly cover the landscape of common
possibilities, so hopefully this makes the decision matrix a little
simpler moving forward.