PabloMansanet/dependencies.md

## dependencies.md

      
    Raw
  

              dependencies.md
            
          
    A common pattern we use at BF is dependency injection through C++ abstract classes. This makes testing easy, but I have been wondering for a while if we can do better in how we implement the underlying principles. The conversation about Rust and mutability has helped me verbalize why I think C++ is limited when it comes to solving this problem.
Why dynamic dispatch?

At BF, we solve the problem of dependency injection with dynamic dispatch, even when the particular injection can be resolved at compile time. Here's a silly (and not great for reasons I'll explain later) example of what we sometimes do:

GpioInterface.hpp

enum class Pin {
   A,
   B
};

class GpioInterface {
public:
   virtual ~GpioInterface() {};
   virtual bool Read(Pin) const = 0;
   virtual void Write(bool, Pin) = 0;
};

Driver.hpp

#include "GpioInterface.hpp"

class Driver
{
public:
   Driver(GpioInterface&);
   void DoThingWithGpio(); // Writes pin A
private:
   GpioInterface& mGpio;
};
Later, Driver will be instantiated with a concrete Gpio or a MockGpio depending on context. This works fine, but  I think it's far from ideal. Here are some of the problems it causes:

Bloated constructors even when defaults are obvious.
Extra layer of indirection with small, but nontrivial runtime cost (vtable + indirection on all method calls).
No inlining (which results in significantly higher performance loss).
Security problems (vtables are exploitable).
Safety problems (we all know what happens when you forget a virtual destructor).
Boilerplate.
IDE hindering (as it can't reason what type you're dealing with beyond the interface).

Why not static dispatch?

I've thought before of using static dispatch (templates) to solve this problem. To expand the example:

Driver.hpp

template <class GPIO>
class Driver {
public:
   Driver(GPIO& gpio):
      mGpio(gpio)
   {}

   void DoThingWithGpio()
   {
      mGpio.Write(true, Pin::A);
   }

private:
   GPIO& mGpio;
};
This solves many of the problems above (and performs a lot better thanks to proper inlining) but has its own issues:

All code for the driver needs to be at header level, which causes visibility problems (exposed internals) and hurts compilation times.
Template instantiation errors. Enough said.
Code size explosion if a templated class with multiple type parameters needs to be instantied with several different type combinations.
Even though it solves vtable indirection and inlining, it doesn't solve all levels of indirection: we are still storing a reference that needs to be followed before calling any mGpio method, with the performance and safety problems it causes.

But, in my opinion, the biggest problem of all is the fact that C++ is (pardon the bad analogy) "duck typed" at the template level. To see what I mean, here's a really simple templated function:
template<class T>
T add(T left, T right)
{
	return left + right;
}
C++, as far as I know, has no way to explicitly state what "contract" a type needs to supply to satisfy a template. The function above seems very simple, but the reasoning the compiler has to make isn't. This won't instantiate for any types that don't implement "operator+". As the implicit contracts become more complicated, using templates for dependency injection would hurt habiltability since it won't be immediately obvious what's required of the types. Cue SFINAE being used as an actual desirable thing and it becomes unreadable. Disgusting, huh?
Note: there is some std::type_traits magic that, together with static_assert, could make these requirements more explicit, but it's limited on what it can do and not enforced at the language level. I'm also aware of C++20 concepts and constraints, but I also think they're limited in ways that I can't really fit here.
A last point to take into account: I've thought for a  long time that we misuse dependency injection when the goal is only testing and there is no need to share the dependency between multiple classes or to resolve types at runtime. The example above could just as easily not take a GPIO on construction and instead have its variants supplied in type form. e.g:

Driver.hpp

template <class GPIO>
class Driver {
public:
   void DoThingWithGpio()
   {
      mGpio.Write(true, A);
   }

private:
   GPIO mGpio;
};
and then, to specify variants in a way that doesn't store references to arbitrary bits of memory everywhere:
// For testing
Driver<MockGpio> mDriver;

// For realz
Driver<Gpio> mDriver;

The risk of the above, of course, is that there would be multiple Gpio object
variants, hopefully but not necessarily behaving when it comes to accessing
the hardware.
Rust

Both approaches, static and dynamic, are possible in Rust, though it requires more work to satisfy the borrow checker. However, they are both also more idiomatic, and the choice of which to use comes down to the requirements of the application (knowledge at runtime vs compile time of the specific type).
The kind of problem we are trying to solve at Bluefruit would best be solved with static dispatch. Here's how the example driver above could look in Rust:

gpio.rs (equivalent to the interface in C++; traits perform roughly the same
role as abstract classes, and I say roughly because there's no vtable or indirection unless you opt in)

pub enum Pin {
    A,
    B
}

pub trait Gpio {
    fn read(&self, pin: Pin) -> bool;
    fn write(&mut self, pin: Pin, value: bool);
}

driver.rs:

struct Driver<G: Gpio> {
    gpio: G,
}

impl<G: Gpio> Driver<G> {
    fn do_thing_with_gpio(&mut self) {
        self.gpio.write(Pin::A, true);
    }

    fn new(gpio: G) -> Self {
        Self { gpio }
    }
}
And then, to instantiate variants (where mock_gpio and gpio are concrete structs
that implement the trait Gpio):
// For testing
let driver = Driver::new(mock_gpio);

// For realz
let driver = Driver::new(gpio);
Don't be confused by the "new"; it has nothing to do with the heap. That's just
conventional name for an explicit constructor that initializes a Driver on the stack.
This solution solves all the bullet points from the C++ ones above. There's no
header/source separation in Rust, so the static dispatch version doesn't
increase compilation time or affect habitability. Everything can be inlined,
testing is as easy as passing a different struct that implements Gpio, and all
contracts to be fulfilled by the generic type are clear (Due to the <G: Gpio>
trait bound) which means that error messages make sense.
If there is a need for dynamic dispatch (e.g. dependency injection is needed at
runtime for whatever reason) it can be done easily as well with trait objects,
but I won't go into how here. The important bit of this post comes from a very
valid question asked by Matthew/Andi in the #rust channel. Paraphasing:
How do we handle injecting the same object in multiple places?

Let's say that a certain driver needs to write to pin A, and another class needs
to write to pin B. The way we've designed the Gpio class, we could solve it by
constructing a single Gpio and injecting it on both places. This is trivial to
do in c++:
Gpio gpio;
DriverA driverA(&gpio); // will only write to channel A
DriverB driverB(&gpio); // Will only write to channel B

However, In rust we'd quickly run into problems:
let gpio = Gpio;
let driver_a = DriverA::new(gpio);
let driver_b = DriverB::new(gpio); // Error! Gpio owned by driver_a
We could try to rewrite our class so it takes a reference to Gpio on
construction, like this:
fn new(gpio: &G) -> Self {
  Self { gpio }
}
But we'd run into pain with lifetime specifiers, and ultimately it wouldn't work
since we'd need a mutable reference for the write call, in which case we could
try this:
fn new(gpio: &mut G) -> Self {
   Self { gpio }
}
Even if we get the lifetime specifiers correct for this one, it wouldn't work
either because there can't be two mutable references to the same object. So what
gives? Is it just impossible to inject a class in two places?
Well... It's not, but Rust will ask you to explain exactly what you mean. For
example, you'll
need to wrap it inside a Mutex (provides synchronized access), a reference counter
(Rc or Arc, depending on your threading needs), a Cell/RefCell (deferring borrow
checking to runtime)... etc. Ultimately, there is a tool for every possible approach,
including raw unsafe pointers to replicate the C++ approach of simply holding a
reference. Please don't do that last one.
The thing is, when you see yourself wondering what combination of
ArcMutexRefcell you need to use, that's the borrow checker's way to tell you
that you're doing something wrong with your design. I have to admit I have been
a bit underhanded with the example I've used, because there is a much better way
to design that GPIO driver in both C++ and Rust. Consider this interface variant:
enum class Pin {
   A,
   B
};

class GpioInterface {
public:
   virtual ~GpioInterface() {};
   virtual bool Read() const = 0;
   virtual void Write(bool)= 0;
   virtual Pin GetPin() const = 0;
};
Here, a Gpio driver corresponds to a single pin and there is no need to share a
class between multiple owners. Doing this, the only need for dependency
injection is testing, and not sharing. Out of habit, in C++ we'd probably use
an interface anyway in order to satisfy our testing needs, but the Rust
equivalent falls into place without any need for injection:
impl<G: Gpio> DriverA<G> {
    fn do_thing_with_gpio(&mut self) {
        self.gpio.write(true);
    }

    fn new() -> Self {
        Self { gpio: G::new(Pin::A) }
    }
}
And to initialize the drivers:
let driver_a = DriverA<Gpio>::new();
let driver_b = DriverB<Gpio>::new();
let driver_a_testing = DriverB<MockGpio>::new();
Note how the knowledge of what channel to use lies in the driver now, and it's
passed down on construction of the Gpio struct, which will be a thin wrapper
around a single pin (it can statically check that it's a singleton over the
hardware resource, but that's a story for another day).
I reckon that most situations where we feel we need multiple injection outside
of testing can be solved in this way, by delineating responsibilities and
ensuring that a single object owns a single resource. Modules that
really require multiple write access (such as a logger on top of a flash storage driver) will be few and far between, and they can either be given their own context
properly communicated via message passing, or wrapped in the appropriate
synchronization/safety primitives.
Since I've started using Rust, I've been disliking the way we use dependency
injection more, and regardless of whether we use Rust or not, I want to
challenge the assumptions that get us to write an interface class for anything
we simply need to mock, regardless of whether it needs to be shared or resolved
at runtime. It is very dangerous to have references to remote parts of the
codebase living in every class we instantiate, and the availability of every
object everywhere thanks to its interface can make us lazy designers. Rust or
not, let's do better! :)