Modern Architectural Concepts

Core Concepts

Dependency Injection

At the core of most modern software architecture is dependency injection. Rather than a class creating its dependencies on-the-fly, all dependencies are injected into it upon construction. This allows implementations to be swapped out at runtime, making it easier to enable features based on a configuration file. It also enables the isolation of code, making it easier to test.

Dependency injection also puts a burden on the developer, making it clear when a class has too many dependencies and, therefore, responsibilities. This often acts as an early warning system that there is a design flaw. If taken seriously, DI can lead to smaller, more cohesive classes that are easier to test.
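
As a minimal sketch, constructor injection looks like the following (the IOrderRepository and IEmailSender types here are hypothetical). The class declares its dependencies up front; a DI container, or a unit test, supplies the implementations.

public class Order
{
    public int Id { get; set; }
    public string CustomerEmail { get; set; }
}

public interface IOrderRepository
{
    Order GetOrder(int orderId);
}

public interface IEmailSender
{
    void Send(string to, string subject, string body);
}

// Dependencies are injected upon construction rather than created on-the-fly,
// so implementations can be swapped at runtime and faked in tests.
public class OrderConfirmationService
{
    private readonly IOrderRepository repository;
    private readonly IEmailSender emailSender;

    public OrderConfirmationService(IOrderRepository repository, IEmailSender emailSender)
    {
        this.repository = repository;
        this.emailSender = emailSender;
    }

    public void ConfirmOrder(int orderId)
    {
        Order order = repository.GetOrder(orderId);
        emailSender.Send(order.CustomerEmail, "Order confirmed", "Thank you for your order.");
    }
}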

Below, you’ll find a section on dependency injection concepts that make modern architectures possible.

Onion Architecture

Historically, software has been broken into layers to make it easier to swap out one layer for another implementation. Realistically, though, systems rarely see their entire data layer replaced or use more than a single relational database. Additional user interfaces, however, do get tacked on from time to time. Due to these different rates of change, more emphasis has been placed on keeping business logic and UI concerns separate, and less on keeping out data layer concerns. That’s why we have MVC, MVVM and a zillion other acronyms for UI separation and practically none for database separation. It is not uncommon for typical business logic to make frequent calls out to the database throughout a process, or for business logic to appear in stored procedures, whereas performing logic in the UI is considered a cardinal sin by most developers.

With the rise of SOA and a plethora of data stores (e.g., NoSQL databases), there’s been a rise in the desire to keep the business logic independent of the data layer. The recent move toward microservice architectures, which call for smaller, reusable libraries, has also driven the need for more independent business logic.

The 3-tiered architecture applies the restriction that upper layers can know about lower layers, but not the other way around. However, it does not go into much detail about how the lower layers should encapsulate their functionality. For that reason, 3-tiered architectures tend to expose too many details about the data layer to the business layer, making them unsuitable for modern architectures.

These days, users are rarely the only parties interacting with software. Now, the same system must respond to requests coming from other systems, via REST APIs or message queues. Systems not only have to send output back to users, but also inform other systems about important changes. In such a system, it becomes a convenient generalization to think of users as just another system. Even the database is just another system that needs to be updated.

Going the other direction, the database acts as a source of information. Another generalization is that the database can be treated like another type of input: it enriches the information coming from the user or other systems. Good design dictates that this extra information be in place before the core business logic executes. Otherwise, the business logic is cluttered with regular calls out to the database. Since this data needs to be in place before calling the business logic, this data enrichment process cannot be below the business layer.

The only natural (although, not obvious) thing to do is put the business logic at the center. The outer layers are responsible for querying the database before calling the business logic. The business logic calls out to services to update other systems. Queries go in, updates go out. Putting business logic in the center is called the Onion Architecture. While a simple concept, achieving it is extremely challenging. It is only possible by means of dependency injection, since the dependencies must be implemented in the outer layers. Below, I list some common, core actors involved in making the Onion Architecture even possible.

Unit of Work

One of the problems with interacting with multiple parties is that you must keep everyone in sync. Imagine an online ordering system that is processing an order. To do so, it must save the order to the database, send a confirmation email, schedule a shipment for pickup, update the inventory system, alert billing, etc. What would happen if an error occurred mid-way through the process? Would the user receive an email confirming a shipment that never happens? Would inventory get out of sync? Would the user get their order for free? Obviously, these types of inconsistencies should not happen.

The Unit of Work pattern partially addresses these concerns. Message queuing solves for the rest. Historically, the Unit of Work pattern applied mostly to databases. Conceptually, a system implements the Unit of Work pattern by keeping track of changes to objects throughout a process. When the process completes, it looks at all the changes and converts them into a corresponding set of INSERT, UPDATE and DELETE commands that it sends to the database.

It is easy to confuse this with a database transaction and its rollback semantics. The difference is that the Unit of Work avoids database interaction until the very end; it often uses a transaction just to execute the update commands. The Unit of Work pattern can dramatically reduce the time in which a transaction is kept alive, since it only needs to be alive at the end of a process, when the commands are executed.

Most ORMs are designed to implement the Unit of Work pattern and take care of avoiding foreign-key constraint violations. Unfortunately, most ORMs do not expose their underlying change-tracking mechanism to be used by other code. This is unfortunate because other code, like logging which fields changed in an entity, could benefit from an out-of-the-box change tracking mechanism. Furthermore, it would be useful to track changes to objects that don’t necessarily correspond to database entities.

Consider a business process that sends an email to a user. Imagine, instead of the code immediately sending an email, it just took a note to send one later. Only after the database changes were saved would it actually send the email. By then, there’s a better guarantee that the process will succeed. Here, the note-taking mechanism is really an extension of the Unit of Work pattern. As business logic progresses, it “records” which messages need to be sent to other systems and the database. Once the business logic is done, the recordings are converted into actual messages (like the SQL commands created by ORMs) and are sent to the corresponding systems.
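
A minimal sketch of that note-taking mechanism might look like the following (the type names are hypothetical): business logic records emails as it runs, and only after the database changes are saved does the caller flush them.

using System.Collections.Generic;

public class PendingEmail
{
    public string To { get; set; }
    public string Subject { get; set; }
    public string Body { get; set; }
}

public interface IEmailSender
{
    void Send(PendingEmail email);
}

public class EmailUnitOfWork
{
    private readonly List<PendingEmail> pending = new List<PendingEmail>();
    private readonly IEmailSender sender;

    public EmailUnitOfWork(IEmailSender sender)
    {
        this.sender = sender;
    }

    // Called throughout the business logic: take a note, don't send.
    public void Record(PendingEmail email)
    {
        pending.Add(email);
    }

    // Called only after the database changes are saved: convert the
    // recordings into real messages, then clear so nothing is sent twice.
    public void SendAll()
    {
        foreach (PendingEmail email in pending)
        {
            sender.Send(email);
        }
        pending.Clear();
    }
}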

In software systems that need to communicate changes to multiple systems, guaranteeing that messages get delivered, and only get sent after the business logic completes, is paramount. Often, such systems communicate using message queues, which nearly guarantee that a message will eventually be delivered. In some environments, database changes and message queuing can reside in a shared transaction. Even email can be put behind a message queue to ensure delivery. Transactions also need to be short-lived to maintain application responsiveness, and the Unit of Work helps by deferring the transaction until after the business logic completes.

Core Actors

Controller

The ASP.NET MVC and WebAPI frameworks expose HTTP endpoints via controllers: classes that hide the details of converting HTTP requests into method calls. Request details are converted to method parameters (primarily identifiers and view models) or are exposed as values in the HttpContext class.

A general guideline is that controllers should perform no logic whatsoever, not even validation or error handling. Instead, controllers should immediately delegate work to classes in the adapters layer, specifically the adapters themselves.
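
As a sketch, a logic-free WebAPI controller might look like this (the IOrderAdapter and OrderViewModel types are hypothetical). The controller only translates the HTTP request into a method call on the adapter and the result into a response.

using System.Web.Http;

public class OrderViewModel
{
    public int Id { get; set; }
}

public interface IOrderAdapter
{
    OrderViewModel GetOrder(int orderId);
}

// No validation, no error handling, no business logic.
public class OrdersController : ApiController
{
    private readonly IOrderAdapter adapter;

    public OrdersController(IOrderAdapter adapter)
    {
        this.adapter = adapter;
    }

    public IHttpActionResult Get(int id)
    {
        return Ok(adapter.GetOrder(id));
    }
}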

Adapter

Adapters take on the role of inspecting user input and deciding what business logic to execute, such as submitting a request or updating a record, as well as what output to return to the UI.

Adapters often work closely with repositories and mappers to convert view models into the values that are passed to the business objects. View models are often composed of other view models or implement interfaces, to maximize the reuse of the mappers (see below).

Adapters are usually responsible for validating that the request makes sense and that all the necessary information is provided. However, it is important that they avoid performing validation that falls more in the realm of business rules. Generally, adapters should limit their validation to ensuring that enough information is provided to execute the user action.

It is easy to accidentally include business logic or mapping logic in an adapter. A good guideline is to ask yourself how much code would be duplicated if the adapter needed to work with a completely different view model. Any code that isn’t view model validation or setting up business objects probably belongs elsewhere.
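
A sketch of an adapter following these guidelines (all of the types are hypothetical): it checks only that the request is complete, delegates defaults and mapping to other classes, and hands the business logic a fully-formed entity.

using System;

public class Order
{
}

public class OrderViewModel
{
}

public interface IOrderBuilder
{
    Order Build();
}

public interface IOrderMapper
{
    OrderViewModel Map(Order entity);
    void Update(Order entity, OrderViewModel viewModel);
}

public interface IOrderProcessor
{
    void Submit(Order order);
}

public class OrderAdapter
{
    private readonly IOrderBuilder builder;
    private readonly IOrderMapper mapper;
    private readonly IOrderProcessor processor;

    public OrderAdapter(IOrderBuilder builder, IOrderMapper mapper, IOrderProcessor processor)
    {
        this.builder = builder;
        this.mapper = mapper;
        this.processor = processor;
    }

    public OrderViewModel Create(OrderViewModel viewModel)
    {
        // Only "was enough information provided?" validation belongs here.
        if (viewModel == null)
        {
            throw new ArgumentNullException(nameof(viewModel));
        }
        Order order = builder.Build();    // defaults come from the builder
        mapper.Update(order, viewModel);  // user input is copied by the mapper
        processor.Submit(order);          // business rules run in the core
        return mapper.Map(order);
    }
}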

Entity

Business objects are only useful when they work with data. This data is usually hydrated from a relational database, but may come from other sources. The business logic should be isolated from data layer concerns, so this may require some gymnastics. Before the era of ORMs, it was nearly impossible to consistently expose relational data as a graph of interrelated objects. With ORMs, things are much easier: it’s to the point where applications can safely work directly with entities coming from the ORM. Entities are simple getter/setter objects corresponding to a row in a database table, with the addition of navigation properties to other entities.
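
For illustration, a typical entity is nothing more than properties matching table columns, plus navigation properties (the names here are hypothetical):

using System;
using System.Collections.Generic;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class OrderItem
{
    public int Id { get; set; }
    public int Quantity { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public DateTime CreatedOn { get; set; }
    public int CustomerId { get; set; }                // foreign key column
    public Customer Customer { get; set; }             // navigation property
    public ICollection<OrderItem> Items { get; set; }  // navigation collection
}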

OOP advocates warned against working directly with entities in the business logic. Instead, they proposed that entities be wrapped by or mapped to actual business objects. Instead of allowing any business object to modify entity properties, all interactions should be hidden behind method calls. In some cases this might be overkill; I would suggest first creating business objects for performing core functionality and only moving toward more granular business objects as needed.

View Models and Service Models

When sending data to other systems or to the user interface, the system will often expose that information using a simple getter/setter object. To avoid recalculating information in the user interface, the backend performs calculations and stores the results as simple primitive values in the view model. Whereas entities are usually minimal and normalized, models tend to be bulkier and denormalized.

Other systems represent information in different formats, so they expose their own API models (or service models). Typically, systems will identify records using different fields, have their own enumerations and use different formats, such as JSON or XML.

Mapper

With all these different representations of data, mappers are needed to convert business entities to view models and service models, and vice versa. My experience has been that they should work with fully-formed entities or business objects and do little interaction with the data layer. Mappers often employ other mappers, as models are often composed of other models.

Below is a common interface for mappers. It is often tempting to reuse the same mapper class to map a single data/business entity to multiple UI and service models. However, I've found a separate class for each external system works best.

public interface IMapper<TFrom, TTo>
{
    // Converts a fully-formed source object (e.g., an entity) into a new
    // destination object (e.g., a view model).
    TTo Map(TFrom source);

    // Copies values from the source (e.g., a view model) onto an existing
    // destination object (e.g., an entity).
    void Update(TFrom destination, TTo source);
}

Notice that this interface supports converting an entity to a view model, but not creating an entity from a view model. Instead, a view model is used to "update" an entity. During a "create" operation, the initial entity is created simply via new or using a builder (see below), then updated.
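
For example, a concrete mapper for a hypothetical Order entity and view model might look like this. Map builds the view model (performing any calculations), while Update copies only the user-editable fields onto an entity that was built or loaded elsewhere.

public class Order
{
    public int Id { get; set; }
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
}

public class OrderViewModel
{
    public int Id { get; set; }
    public int Quantity { get; set; }
    public decimal Total { get; set; }
}

public class OrderViewModelMapper : IMapper<Order, OrderViewModel>
{
    public OrderViewModel Map(Order source)
    {
        return new OrderViewModel
        {
            Id = source.Id,
            Quantity = source.Quantity,
            // Calculations are performed in the backend, not the UI.
            Total = source.Quantity * source.UnitPrice
        };
    }

    public void Update(Order destination, OrderViewModel source)
    {
        // Only user-editable fields are copied; Id and Total are not.
        destination.Quantity = source.Quantity;
    }
}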

It is important to design view models so as not to be recursive. For example, an Order view model might include a list of Order Item view models, each with a property going back to the parent Order. Be careful to avoid this if using auto-mapping tools. Even when mapping manually, it’s very unlikely that the view model needs to be defined going in both directions. Remember, models are typically serialized to JSON or XML.

Notifier

In classic software architectures, the business layer sat on top of the data layer. In complex software, the data layer is often more than a single database. Rather, a single application may interact with a relational database, a handful of 3rd party APIs and more. Each may involve its own protocols, libraries and data models. Following the classic approach, the business logic was directly dependent on every new API that was introduced.

Just as the business logic should not deal with UI concerns, the same is true for data layer concerns. In the classic architecture, the business logic was usually responsible for converting business entities into data layer entities. Often, developers would create mappers that converted business entities into data layer entities to avoid over-exposing the business logic. These mappers were usually called from repositories. So, in classic architectures, mappers existed on both sides of the business logic.

A fundamental shift with the Onion architecture is that the business layer is at the core of the application. Outside the business layer is a services layer, exposed as repository interfaces. This means the mappers that convert business entities into service models are sitting side-by-side with the mappers that convert business entities into view models.

I call the repositories that communicate with other systems on behalf of the business logic Notifiers. Often, they work with mappers to convert the business entities into service models. Once the mapper performs the conversion, the notifier calls into the data layer via another repository. Calling both classes “repositories” or “adapters” would be confusing, which is why the business logic-friendly repository is called a notifier. The Onion Architecture puts these classes side-by-side, making it easy to draw parallels among them.
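
As a sketch (the types are hypothetical), a notifier exposes a business-friendly method, uses a mapper to produce the service model and delegates the technology-specific call to another repository:

public class Order
{
    public int Id { get; set; }
}

public class ShipmentRequest
{
    public string OrderNumber { get; set; }
}

public interface IShipmentRequestMapper
{
    ShipmentRequest Map(Order order);
}

// Technology-specific repository: knows about the shipping API's protocol.
public interface IShippingRepository
{
    void RequestPickup(ShipmentRequest request);
}

// Business-friendly notifier: the business logic sees only this interface
// and its own entities.
public interface IShippingNotifier
{
    void SchedulePickup(Order order);
}

public class ShippingNotifier : IShippingNotifier
{
    private readonly IShipmentRequestMapper mapper;
    private readonly IShippingRepository repository;

    public ShippingNotifier(IShipmentRequestMapper mapper, IShippingRepository repository)
    {
        this.mapper = mapper;
        this.repository = repository;
    }

    public void SchedulePickup(Order order)
    {
        repository.RequestPickup(mapper.Map(order));
    }
}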

Often, it is undesirable to immediately notify an external system, in case an error occurs. Instead, notifications are tracked until a checkpoint occurs, at which point all pending changes are sent out to the external systems at once. Notifiers, therefore, can help to implement the Unit of Work pattern.

Repository

A repository is simply a façade that exposes interactions with the data layer behind a finite set of methods. Repositories called directly by the business logic should hide the underlying technology. Repositories that interact with a service should be defined in terms of the underlying technology.

A common mistake is to define repositories with methods taking complex types, such as delegates, or that return query builders (e.g., IQueryable) that inadvertently leak information about the underlying implementation. When designing repositories that interact with the business logic, a good question to ask is, “would the interface need to change if this were a REST API vs a database?”

Good exception handling also requires that errors occur in the data layer. Query builders are convenient for avoiding large repository classes. However, deferring execution this way results in errors happening in the business logic, which may not be capable of coping with technology-specific exception types.

It might not be entirely obvious, but relying on the Unit of Work pattern implies that the system should somehow track business entities being modified and coordinate changes to other systems automatically. For that reason, very rarely should repositories expose “update” methods. The same is not necessarily true for “insert” and “delete” operations; it depends on the context.
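
A sketch of a business-facing repository, using hypothetical names: a finite set of intention-revealing methods, materialized results rather than query builders, and no "update" method.

using System.Collections.Generic;

public class Order
{
    public int Id { get; set; }
}

public interface IOrderRepository
{
    Order GetOrder(int orderId);

    // Returns materialized results; exposing IQueryable<Order> here would
    // leak the underlying technology and let errors surface in the business
    // logic instead of the data layer.
    IReadOnlyList<Order> GetOrdersForCustomer(int customerId);

    void Add(Order order);

    // Deliberately no Update method: the Unit of Work detects changes to
    // entities and synchronizes them at the end of the process.
}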

Builder

Historically, classes have contained constructors to make sure the class is initialized before it is first used. However, entities are simple classes, often without constructors. ORMs often leave navigation properties for collections null if they are not initialized, rather than setting them to empty collections. This allows us as developers to distinguish between being “uninitialized” and “having no related entities”.

Things are different when you are adding a new entity to the database. Here, we aren’t loading the entity from the database; it’s brand new. In that case, navigation properties for collections should be initialized to empty collections, rather than left null. Here we know there are no related entities, so an empty collection makes sense.

Going beyond navigation properties, many entities have “created on” dates or other properties that should be initialized upon creation. Other classes may be needed to set these properties, and these dependencies won’t be available in a simple entity’s constructor.

The logic for initializing a bare-bones entity is easy to repeat if you’re not careful. A good practice is to encapsulate this code in a Builder class. Builder classes can be as simple as a single method taking whatever arguments are necessary to do an up-front initialization. Builders can also be far more complex, implementing the actual Builder design pattern, where objects are built up and configured across multiple calls.

Builders work very well with mappers. Rather than having separate methods to get view models for create vs update, just pass the output of the builder to the mapper for the create view model. When the user actually submits a new entity, use the builder again and map the user input (view model) to the entity. This avoids duplicating the code that sets default values, putting it all in the builder. This also helps to limit mappers to two methods: one for creating a view model given an entity; another for updating an entity given a view model.
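
As a sketch, a builder centralizes those defaults; the IClock dependency here is hypothetical, injected so that timestamps remain testable:

using System;
using System.Collections.Generic;

public class OrderItem
{
}

public class Order
{
    public DateTime CreatedOn { get; set; }
    public ICollection<OrderItem> Items { get; set; }
}

public interface IClock
{
    DateTime UtcNow { get; }
}

public class OrderBuilder
{
    private readonly IClock clock;

    public OrderBuilder(IClock clock)
    {
        this.clock = clock;
    }

    public Order Build()
    {
        return new Order
        {
            // "Created on" defaults live in exactly one place.
            CreatedOn = clock.UtcNow,
            // Brand-new entity: we know there are no related entities yet,
            // so initialize the collection rather than leaving it null.
            Items = new List<OrderItem>()
        };
    }
}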

Dependency Injection Patterns

This section details ways dependency injection is used to solve challenging problems when building large-scale applications. These patterns rely on DI’s ability to manage the lifetime of objects. Most dependency injection frameworks allow you to configure how long an object stays alive. Common lifetimes include the lifetime of the entire application, the lifetime of an HTTP request or the lifetime of a thread. Otherwise, objects are created anytime they are injected.
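
As an example, here is how lifetimes might be configured with Microsoft.Extensions.DependencyInjection; the registered types are hypothetical, and other containers (Autofac, Ninject, Unity, etc.) expose the same lifetimes under different names.

using System;
using Microsoft.Extensions.DependencyInjection;

public interface IClock { DateTime UtcNow { get; } }
public class SystemClock : IClock { public DateTime UtcNow { get { return DateTime.UtcNow; } } }
public interface ILookupCache { }
public class LookupCache : ILookupCache { }
public interface IOrderAdapter { }
public class OrderAdapter : IOrderAdapter { }

public static class ContainerConfig
{
    public static IServiceProvider Configure()
    {
        var services = new ServiceCollection();
        services.AddSingleton<IClock, SystemClock>();         // lives for the entire application
        services.AddScoped<ILookupCache, LookupCache>();      // lives for one request/scope
        services.AddTransient<IOrderAdapter, OrderAdapter>(); // created anytime it is injected
        return services.BuildServiceProvider();
    }
}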

Recording and Synchronizing with Trackers

The Unit of Work pattern requires state: either tracking changes to entities or recording messages to be sent later. In large systems, business objects are connected to each other by a complex web of method calls. Passing around objects to record changes would be a serious burden. Worse, it exposes state management to classes that otherwise don’t care. So how do you share state among various objects without directly passing it around?

I like to create classes that implement two interfaces: Recorder and Synchronizer; I typically call them Trackers. The Recorder interface is responsible for capturing messages that need to be synchronized later. The Synchronizer loops through any messages captured with the Recorder interface, sends the messages and then clears them out so they don’t accidentally get sent twice. Classes needing to track changes inject the Recorder interface. Top-level code that needs to synchronize those messages injects the Synchronizer interface.

This only works if the Recorder and Synchronizer are implemented with the same underlying Tracker object throughout a lifetime. DI makes it easy to enforce that the same instance gets injected at various injection sites.
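
A sketch of the split, with hypothetical names; the key is that one Tracker class implements both interfaces, so binding both interfaces to the same instance within a lifetime gives recorders and synchronizers a shared state without passing it around.

using System;
using System.Collections.Generic;

public interface IRecorder<TMessage>
{
    void Record(TMessage message);
}

public interface ISynchronizer
{
    void Synchronize();
}

// Business classes inject IRecorder<TMessage>; top-level code injects
// ISynchronizer; DI hands both roles the same Tracker instance.
public class Tracker<TMessage> : IRecorder<TMessage>, ISynchronizer
{
    private readonly List<TMessage> messages = new List<TMessage>();
    private readonly Action<TMessage> send;

    public Tracker(Action<TMessage> send)
    {
        this.send = send;
    }

    public void Record(TMessage message)
    {
        messages.Add(message);
    }

    public void Synchronize()
    {
        foreach (TMessage message in messages)
        {
            send(message);
        }
        // Clear so messages can't accidentally be sent twice.
        messages.Clear();
    }
}

Most containers support this by registering the Tracker once per lifetime and forwarding both interface bindings to that single registration.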

Again, it’s important to note that it’s best not to directly call “update” methods during normal business logic execution. Whenever possible, it’s better to detect changes after the business logic executes and record updates then. Therefore, it’s better to use Recorders as part of change detection rather than calling them directly throughout the business logic. Other times this is impractical and explicit calls to Recorders are necessary.

Limiting Memory Footprint

Tracking changes means storing things in memory. Processes that involve tens of thousands of records can quickly fill up memory. Worse, synchronizing those changes to a database or service can take a long time if they all need to be synchronized at the end. Longer synchronization exposes the code to timeout errors and could result in locking a database for an unacceptably long time.

Most web applications don’t suffer this problem, since typical user requests are short-lived. ETL processes, on the other hand, are commonly long-running and involve a vast amount of data and processing. Most of these processes can benefit from breaking up records into “chunks”. For each chunk, a fresh DI lifetime should be created, creating new Unit of Work objects. This regularly shrinks the memory footprint and keeps the database responsive.
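
A sketch of such chunking, again using Microsoft.Extensions.DependencyInjection as the example container (the Record, IRecordProcessor and ISynchronizer types are hypothetical): each chunk runs in a fresh scope, so the scoped Unit of Work objects are created, flushed and discarded chunk by chunk.

using System;
using System.Collections.Generic;
using Microsoft.Extensions.DependencyInjection;

public class Record { }

public interface IRecordProcessor
{
    void Process(Record record);
}

public interface ISynchronizer
{
    void Synchronize();
}

public static class ChunkedProcessor
{
    public static void ProcessAll(IServiceProvider provider, IEnumerable<Record> records, int chunkSize)
    {
        foreach (List<Record> chunk in Chunk(records, chunkSize))
        {
            // A fresh DI lifetime per chunk: new Unit of Work objects,
            // a bounded memory footprint and short transactions.
            using (IServiceScope scope = provider.CreateScope())
            {
                var processor = scope.ServiceProvider.GetRequiredService<IRecordProcessor>();
                foreach (Record record in chunk)
                {
                    processor.Process(record);
                }
                // Flush this chunk's changes before the scope is discarded.
                scope.ServiceProvider.GetRequiredService<ISynchronizer>().Synchronize();
            }
        }
    }

    private static IEnumerable<List<Record>> Chunk(IEnumerable<Record> source, int size)
    {
        var batch = new List<Record>(size);
        foreach (Record record in source)
        {
            batch.Add(record);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<Record>(size);
            }
        }
        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
}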

Curiously, the same pattern makes it easier to multi-thread applications. Creating a new DI lifetime for each object is bad for performance, as is dedicating an entire thread. Instead, breaking work up into chunks and processing those chunks in parallel often leads to more throughput.

Caching

Caching always sounds like an easy win. There’s typically a handful of database queries that return data that doesn’t change very often. Common candidates are the current user’s information and lookup tables. Unfortunately, caching often leads to more surprises than it’s worth, and deciding on a cache invalidation strategy takes some experience.

Caching within a limited lifetime usually doesn’t pose as much of a problem. A good strategy is to create a read-only repository interface and a caching class that implements it. The caching class takes the actual repository as a dependency. Whenever you desire cached values, you inject the read-only interface. If you need to add new lookup values, inject the editable interface.
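
A sketch of that strategy with hypothetical lookup types: the caching class implements the read-only interface and decorates the real repository, so its DI lifetime determines how long the cached values live.

using System.Collections.Generic;

public class LookupValue
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface IReadOnlyLookupRepository
{
    IReadOnlyList<LookupValue> GetAll();
}

public class CachingLookupRepository : IReadOnlyLookupRepository
{
    private readonly IReadOnlyLookupRepository inner;
    private IReadOnlyList<LookupValue> cache;

    // The real (database-backed) repository is injected as a dependency.
    public CachingLookupRepository(IReadOnlyLookupRepository inner)
    {
        this.inner = inner;
    }

    public IReadOnlyList<LookupValue> GetAll()
    {
        // Populated on first use; the instance's DI lifetime bounds how
        // stale the cached values can become.
        if (cache == null)
        {
            cache = inner.GetAll();
        }
        return cache;
    }
}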

If a cache can be invalidated throughout the DI lifetime, the caching class may need to implement the editable repository interface. For example, if a new lookup value is added during a request, you might need to add it to the cache or simply refresh the cache the next time the repository is called. One thing to point out, however, is that your business logic should not be concerned with saving records to the database. Lookup tables are often searched using identifiers, which may not be generated until the new record is inserted into the database, which basically requires a database hit. Try to first process any new lookup records and then kick off the remainder of the logic. If possible, simply don’t allow creating lookup values as part of other processes.

Model Building Automation

Even though there are libraries available to define mappers, I generally prefer to write them by hand. This can be a lot of work for a large system and it’s easy to mistype. For the most part, implementing mappers is pretty boring: simply copying properties from one object to another. However, you can save yourself a lot of effort by reusing models as much as possible. It is usually beneficial to capture common, related fields and put them in a model.

Reusing other models means relying on other mappers. This can make the dependency list quite long. Worse, in complex systems, view models can be semi-recursive, or two mappers may depend on each other, depending on which methods are called. The DI framework will likely complain about a recursive dependency. A simple solution is to switch to the service locator pattern. Here, every mapper has the same dependency: the service locator.

This pattern can be as simple as asking for the mapper you need by name. Other times, you might find it beneficial to define an IMapper<TEntity, TModel> interface. An abstract base class can be useful for defining helper methods that retrieve the necessary mapper and call its mapping method, given just a source value. A common guideline when implementing this pattern is that a null source should be handled by returning null.
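
A sketch of such a base class, with a hypothetical IServiceLocator interface and the IMapper interface from earlier repeated for completeness; every mapper shares the single locator dependency, which breaks the recursive constructor chains, and the helper returns null for a null source per the guideline above.

public interface IServiceLocator
{
    TService Get<TService>();
}

public interface IMapper<TEntity, TModel>
{
    TModel Map(TEntity source);
    void Update(TEntity destination, TModel source);
}

public abstract class MapperBase<TEntity, TModel> : IMapper<TEntity, TModel>
{
    private readonly IServiceLocator locator;

    protected MapperBase(IServiceLocator locator)
    {
        this.locator = locator;
    }

    public abstract TModel Map(TEntity source);

    public abstract void Update(TEntity destination, TModel source);

    // Retrieves the mapper for a child model and applies it; a null source
    // maps to a null result.
    protected TChildModel MapChild<TChildEntity, TChildModel>(TChildEntity source)
        where TChildEntity : class
        where TChildModel : class
    {
        if (source == null)
        {
            return null;
        }
        var mapper = locator.Get<IMapper<TChildEntity, TChildModel>>();
        return mapper.Map(source);
    }
}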

To handle those “extra” properties that get tacked on occasionally, put them in a separate, wrapping model that houses the extra properties. Another option is to use inheritance. Either way, create a new mapper for the new model. A third option is to always include the extra properties and simply leave them uninitialized, depending on arguments passed to the mapping method. Just remember when using an IMapper interface that there must be a unique DI binding for a given (TEntity, TModel) pair.
