@jaceklaskowski
Last active March 31, 2024 03:06
Rough Notes about CQRS and ES

Once upon a time…

I once took notes (almost sentence by sentence, with little editing) on two architectural design concepts, Command and Query Responsibility Segregation (CQRS) and Event Sourcing (ES), from a presentation by Greg Young and published them as a gist (with the times at which each sentence was heard).

I then found other summaries of the talk, and the gist has been growing ever since. See the revisions to learn what changed and where the changes came from (aka the sources).

It seems inevitable to throw Domain-Driven Design (DDD) into the mix.

The idea of the document is to collect the required information on the topic and, instead of keeping it hidden from the public (and possibly making beginner mistakes without noticing them), share it pro publico bono. Who knows how it may help others?

Sources

Terminology

  • software architecture = how to design, implement and build the application: patterns, practices, methodologies, tools and technologies

Notes

Twitter stream

  • CQRS and ES are just one way to implement a domain model. DDD is about collaboration and shared visions

CQRS Documents by Greg Young

  • every method should either be
    • a command that performs an action
    • a query that returns data to the caller
    • not both
  • methods should return a value only if they are referentially transparent and hence possess no side effects
  • In CQRS, objects are split into two objects: one containing the Commands and one containing the Queries
  • Applying CQRS on the CustomerService would result in two services
    • CustomerWriteService
      • MakeCustomerPreferred(CustomerId)
      • ChangeCustomerLocale(CustomerId,
      • CreateCustomer(Customer)
      • EditCustomerDetails(CustomerDetails)
    • CustomerReadService
      • GetCustomer(CustomerId)
      • GetCustomersWithName(Name)
      • GetPreferredCustomers()
  • two separate services = a read side and a write side or the Command side and the Query side
    • the Command side and the Query side have very different needs
    • the Query side will only contain the methods for getting data
  • Events are a well-known integration pattern and offer the best mechanism for model synchronisation
  • many systems did not store current state
    • especially true in high performance, mission critical, and/or highly secure systems.
  • Domain Event = An event is something that has happened in the past
  • All events should be represented as verbs in the past tense such as CustomerRelocated, CargoShipped, or InventoryLossageRecorded
    • stick with the usage of verbs in the past tense when creating Domain Events
  • Commands have an intent of asking the system to perform an operation
  • Events are a recording of the action that occurred
    • All of these things are in the past tense, they have already happened and cannot be undone.
  • Deleting information?
    • Impossible: you cannot jump into a time machine and say that an event never happened
    • model explicitly as a new transaction
    • There are also architectural benefits to not deleting data. The storage system becomes an additive-only architecture. It is well known that append-only architectures distribute more easily than updating architectures because there are far fewer locks to deal with.
  • the storage of events
  • Sharding = Horizontal Partitioning
    • the same schema will exist in many places and some key within the data will be used to determine in which of the places the data will exist
  • A Rolling Snapshot is a denormalization of the current state of an aggregate at a given point in time.
    • represents the state when all events to that point in time have been replayed
    • a heuristic to prevent the need to load all events for the entire history of an aggregate.
    • The problem that exists is that there may be a very large number of events between the beginning of time and the current point
    • only play the events from that point in time forward in order to load the Aggregate
    • By having the state of that graph at that point in time, replaying all the events prior to that snapshot can be avoided
  • other types of queries, which focus on the how, are becoming more and more popular in business
  • store what the system actually did as opposed to what the current state of data is.
  • it is easiest to build the Event Storage in an existing technology such as an RDBMS
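
The CustomerService split described above can be sketched roughly as follows. This is only an illustration: the method names follow the notes, but the `Customer` fields, the `new_locale` parameter, and the shared in-memory `store` dictionary are assumptions made to keep the example small (a real system would often give the read side its own data store).

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical fields, chosen only for this sketch.
    customer_id: int
    name: str
    locale: str = "en"
    preferred: bool = False


class CustomerWriteService:
    """Command side: every method changes state and returns nothing."""

    def __init__(self, store: dict) -> None:
        self._store = store

    def create_customer(self, customer: Customer) -> None:
        self._store[customer.customer_id] = customer

    def make_customer_preferred(self, customer_id: int) -> None:
        self._store[customer_id].preferred = True

    def change_customer_locale(self, customer_id: int, new_locale: str) -> None:
        # new_locale is an assumed parameter name.
        self._store[customer_id].locale = new_locale


class CustomerReadService:
    """Query side: every method returns data and has no side effects."""

    def __init__(self, store: dict) -> None:
        self._store = store

    def get_customer(self, customer_id: int) -> Customer:
        return self._store[customer_id]

    def get_customers_with_name(self, name: str) -> list:
        return [c for c in self._store.values() if c.name == name]

    def get_preferred_customers(self) -> list:
        return [c for c in self._store.values() if c.preferred]


store = {}
writes = CustomerWriteService(store)
reads = CustomerReadService(store)

writes.create_customer(Customer(1, "Greg"))
writes.make_customer_preferred(1)
preferred_names = [c.name for c in reads.get_preferred_customers()]
```

Because the two services share nothing but the data, each side can later be optimised or scaled separately, which is the point the notes make about their very different needs.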

Using an RDBMS as event sourcing storage

  • The event store should not need to know about the specific fields or properties of events
  • Serialize everything down to a blob and store it that way
  • the table "Events" stores the related data as a CLOB (e.g. JSON or XML)
  • one generic "Events" table
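
A minimal sketch of such a generic "Events" table, using Python's built-in `sqlite3` module and JSON as the serialized form. The column names (`aggregate_id`, `version`, `event_type`, `data`) and the helper functions are assumptions for illustration, not a schema taken from the source.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Events (
        aggregate_id TEXT    NOT NULL,
        version      INTEGER NOT NULL,
        event_type   TEXT    NOT NULL,
        data         TEXT    NOT NULL,  -- serialized event (JSON here)
        PRIMARY KEY (aggregate_id, version)
    )
    """
)


def append_event(aggregate_id: str, version: int, event_type: str, payload: dict) -> None:
    """Append one event; the store never looks inside the payload."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO Events (aggregate_id, version, event_type, data) "
            "VALUES (?, ?, ?, ?)",
            (aggregate_id, version, event_type, json.dumps(payload)),
        )


def load_events(aggregate_id: str) -> list:
    """Return (event_type, payload) pairs for one aggregate, oldest first."""
    rows = conn.execute(
        "SELECT event_type, data FROM Events WHERE aggregate_id = ? ORDER BY version",
        (aggregate_id,),
    ).fetchall()
    return [(event_type, json.loads(data)) for event_type, data in rows]


append_event("customer-1", 1, "CustomerCreated", {"name": "Greg"})
append_event("customer-1", 2, "CustomerRelocated", {"city": "Vilnius"})
history = load_events("customer-1")
```

The composite primary key is a design choice of this sketch: a second writer trying to append the same version for the same aggregate fails with `sqlite3.IntegrityError`, which gives a simple form of optimistic concurrency.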

CQRS Info

  • Querying a log of events
  • Two separate types of datastores in an ES-savvy application:
    • Command = command model => log database = journal
    • Query = query model => query datastores
  • Event-sourced CQRS - commands reflected in a journal and propagated to query data stores via events
  • Command = increment vs Event = incremented
    • past tense, irrefutable, already happened
      • ConcertCreated
      • TicketsBought
  • delta-based
    • how much done
    • NOT how much left = no derived info
  • event metadata
    • who/when
  • Avro and Protobuf for serialization while saving to journal
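
The journal-plus-query-store idea above might look like this in miniature. The `Event` class, the `journal` list, the `tickets_left` read model, and the `record`/`project` helpers are all hypothetical names; the events carry only deltas plus who/when metadata, and the derived "how much left" figure lives only in the read model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    name: str        # past tense: what happened
    data: dict       # the delta only, no derived values
    who: str         # metadata: who caused it
    when: datetime   # metadata: when it happened


journal = []        # command side: append-only log of events
tickets_left = {}   # query side: derived read model


def project(event: Event) -> None:
    """Update the read model from one event."""
    concert = event.data["concert"]
    if event.name == "ConcertCreated":
        tickets_left[concert] = event.data["capacity"]
    elif event.name == "TicketsBought":
        tickets_left[concert] -= event.data["quantity"]


def record(name: str, data: dict, who: str) -> None:
    """Append the event to the journal, then propagate it to the read model."""
    event = Event(name, data, who, datetime.now(timezone.utc))
    journal.append(event)
    project(event)


record("ConcertCreated", {"concert": "c1", "capacity": 100}, who="admin")
record("TicketsBought", {"concert": "c1", "quantity": 3}, who="alice")
record("TicketsBought", {"concert": "c1", "quantity": 2}, who="bob")
```

Since `tickets_left` is purely derived, it can be thrown away and rebuilt at any time by running `project` over the journal again.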

Greg Young - CQRS and Event Sourcing - Code on the Beach 2014

  • You can use CQRS without Event Sourcing, but with Event Sourcing you must use CQRS
  • audit log and be able to prove its correctness
    • a deterministic system with an audit log
    • common in many regulated industries
    • compare time periods to time periods deterministically
  • your account balance is the first level derivative off of the facts on your account
  • current state is transient = it’s the first-level derivative
    • not about memory
    • I can delete and rebuild it
  • events = facts
  • 8:51 event sourcing is all about storing facts; at any time your structural models of state are first-level derivatives of your facts, and they’re transient
  • 10:28 Structure changes more often than behaviour
  • 10:32 Your use cases of a system tend to be reasonably stable over a long period of time.
  • 11:14 Accountants do not do this unless they worked for Enron
  • 11:21 You do not erase something in the middle of your ledger. This is highly illegal.
  • 11:26 If you took a class in accounting you've probably been told that accountants don’t use pencils - they use pens. The same can be said about event sourcing.
  • 11:36 When we talk about event sourcing you can never ever update an event. And you can never delete an event.
  • 12:17 If I can take all of your data and put it on a microSD card, you don’t have big data
  • 12:23 if I can put all of your data on a microSD card, you do not need to worry ever about deleting any data (provided your system has been running for more than a year)
  • 12:55 If your data is not going to grow faster than Moore’s law, then you don’t have to worry about the amount of data you have. Your data will just continually get cheaper and cheaper and cheaper for you to store.
  • 13:38 Most of you do not need to worry about deleting events.
  • 15:06 Just like in a ledger you’re not allowed to ever go back and update something. You can only add new things. You can do corrections.
  • 16:24 In most places I find it’s the business that drives the use of event sourcing, not technology reasons
  • 16:32 And the reason why people are using it is that it's the only model that does not lose information
  • 17:40 How did you make the decision to destroy your data? You personally didn't feel it’s valuable? Or maybe you didn’t think about it?
  • 17:50 data is massively valuable
  • 35:29 Rolling snapshots = snapshot the state as it was at the point in time for the projection. With this, you can replay from a snapshot forward.
  • 35:51 Event sourcing is a functional data storage mechanism
  • 36:02 Current state is a left fold of previous behaviours
  • 36:10 A snapshot is a memoization of your left fold
  • Events are immutable
  • 36:59 You can never edit a projection
  • 39:58 You have a choice. You can either wear a fireman hat or you can wear a cowboy hat. It’s a good strategy.
  • 40:45 There are a lot of domains that are naturally event sourced.
  • 41:14 And we wonder why doctors have a hard time understanding CRUD-based systems, when their natural mental model is the appending of facts.
  • 42:00 There are a vast number of business problems that are naturally event sourced.
  • 42:17 Overall, event sourcing is a beautiful transactional model - it’s append-only, immutable.
  • 43:37 Append-only, immutable logs are absolutely brilliant for a lot of things
  • 43:49 So when I event sourced my system, how do I answer the question I wanna see all the users with the first name ‘Greg’. Do I replay the entire event log for every user in the system to figure out at the end of it if the first name is Greg? That’s gonna be spectacular. Basically imagine that every single query you do has to be a table scan of every event your system has ever done. That’d be AWESOME! And this is when we come to CQRS.
  • 44:36 CQRS basically says that you don’t want one system - reading and writing are different and you should make different decisions for reads and for writes. CQRS at its core is probably the dumbest pattern ever imagined.
  • 44:54 CQRS actually comes from CQS - Command and Query Separation which is from Bertrand Meyer
  • 45:05 Object-Oriented Software Construction: I would recommend getting the 2nd edition
    • I warn you as the book seems to be event sourced - it looks like it’s append-only and he just keeps adding to it on every edition.
  • 45:30 What CQS states is very simple - there are two types of methods: the first type has a void return type - it’s called a command. It’s allowed to mutate state. It’s not a pure function. The second type has a non-void return type - it is not allowed to mutate state - it’s called a query.
  • 46:31 It makes it much easier to reason about your code. And the reason it was so important to do this inside Eiffel - which is the language he writes - is because it has something called contracts. And contracts are normally written on top of pure functions. And what happens if I called your contract twice? Should that somehow alter your behaviour? That’d be weird. What if I realized I don’t need to call your contract and now you worked differently because I didn’t call your contract? That’s weird.
  • 47:02 Martin Fowler wrote that CQS is not a principle. It’s instead a rather reasonable suggestion.
  • 47:13 If you know Martin you can imagine him saying that in his British voice.
  • 47:56 CQS becomes much more important when we start talking about things in a distributed system. And the reason it becomes so much more important is because we start talking about things like idempotency. While queries are going to be naturally idempotent, commands are not going to be naturally idempotent.
  • 48:16 CQRS goes one step further than this.
    • 48:19 and btw how many of you have seen something in CQRS that said underneath it “Did you mean CARS?” And this is because back in 2007 or so if you typed CQRS in Google it’d say “Did you mean CARS?”
  • 48:44 CQRS basically says we’re going to have two objects now - we’re going to take the one object filled with commands and queries and make two objects out of it - one for all the commands and one for all the queries. And actually they called it a pattern. How weird is that? It’s such a simple concept. But it’s an enabling pattern.
  • 49:12 Queries tend to have a different perception of your data. Almost always a query is focused on a screen. And what the screen looks like? Does a screen have anything at all to do with managing your transactional invariants? Probably not. If they do have similarities it’s accidental. There’s no causative relationship between this. Queries are screen-related because you want to do one call across the wire to come back because you're perceived as being faster then.
  • 50:00 In most systems, queries are what you need to scale. Most systems I looked at do 1 to 2 orders of magnitude more queries than they do processing of commands. I’ve seen them all the way up to 4-5 orders of magnitude.
  • 51:08 queries are generally what we need to scale.
  • 51:10 What most people are doing today is building up a domain model on top of a database, and when it comes to scaling they talk about scaling everything. We don’t need to do that. We can talk about how to scale our queries without talking about scaling our commands.
  • 51:27 There’s really an interesting property about queries that makes them especially easy to scale. Commands are hard to scale. Queries are very easy to scale because almost all queries can operate with relaxed consistency. Queries can be eventually consistent. When it comes to processing commands I really need my current state in order to do this reliably and it gets very complex if I do it in an eventually consistent way for validation. Queries can almost always be eventually consistent.
  • 52:16 Guess what, you’re already eventually consistent. You just don’t know it.
  • 52:43 So you’re already eventually consistent. You're just not taking advantage of it.
  • 52:53 Queries become super, super easy to scale.
  • 53:05 And BTW, I do this talk a lot in Europe and I always feel bad for people who speak English as a second language because the write side is on the left and very often they get confused.
  • 53:19 On one side we’ve got our domain objects with application services and normal DDD-style stuff. On the other side we’ve just got the thin read layer that goes directly to the database with no ORM or anything crazy like that and just returns DTOs by querying the database.
  • 54:00 (With Read/Query layer) you’re linearly scalable now. You can geographically distribute them. And it’s almost always queries that are the interesting part of this.
  • 54:12 To be fair, if anyone here happens to be working in finance, yes, I know, you get many many more writes than reads. And there are systems like this, but there are far fewer of them.
  • 55:38 When we talk about this, it’s very uncommon that you have a single read model. Normally you have multiple read models for doing different kinds of queries.
    • For UI - document database as it aligns well with what the screen looks like.
    • OLAP querying
    • Full-text indexing with Lucene
    • All are projections
    • You can have as many projections as you want in an event-sourced system.
  • 57:12 Remember that that’s not one read model you’d have - most systems require two/three/four different read models to actually work well.
  • 57:20 And there’s a massive accidental complexity to try to use a single read model.
  • 57:29 State transitions are important concepts of our domains. The result of an operation - what that means - is an important concept - it’s a fact.
  • 57:41 Overall, getters and setters in domain models are code smell. If you start seeing lots of them, start thinking of what you’re doing.
  • 57:55 You cannot under any circumstances have a single model that does everything for you and does it well. It doesn’t exist. There are different types of models, and different models are good at different things.
    • On the slide: A single model cannot be appropriate for reporting, searching, and transactional behaviours
  • 58:34 Just like event sourcing doesn’t work well for everything. You can’t do query of the current state in a purely event sourced system. You need some piece of transient state to be able to query with it. It will help you and save you a lot of accidental complexity.
  • 59:43 My aggregates are fully consistent.
  • 1:02:45 I would not even consider doing a snapshot until I hit about a thousand events or maybe more.
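
The "current state is a left fold" and "a snapshot is a memoization of your left fold" remarks from the talk can be illustrated with `functools.reduce`. The account events and the `apply`/`current_state` functions are hypothetical examples, not code from the talk.

```python
from functools import reduce

# Hypothetical account events as (type, amount) pairs.
events = [("Deposited", 100), ("Withdrawn", 30), ("Deposited", 5), ("Withdrawn", 50)]


def apply(balance: int, event: tuple) -> int:
    """Apply one event to the state and return the new state."""
    kind, amount = event
    return balance + amount if kind == "Deposited" else balance - amount


def current_state(history: list, snapshot: tuple = (0, 0)) -> int:
    """Left fold over the events that follow the snapshot.

    snapshot is (version, state): the state after the first `version` events.
    The default snapshot (0, 0) means "replay everything from the start".
    """
    version, state = snapshot
    return reduce(apply, history[version:], state)


full_replay = current_state(events)

# Memoize the fold after the first two events, then replay only the rest.
snapshot = (2, current_state(events[:2]))
from_snapshot = current_state(events, snapshot)
```

Both routes end in the same state; the snapshot only saves the work of folding the events before it, which is why the talk treats it as an optimisation to add late rather than a design requirement.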

Don’t Create Aggregate Roots

  • aggregate roots - one of the less intuitive parts of domain-driven design
  • inevitable question of validation
    • how do we prevent the invalid entity from being saved?
  • the technical guidance: always get an entity. At least one.
  • add no objects to the session or unit of work explicitly – rather, have some other already persistent domain entity create the new entity and add it to a collection property.
  • We’re beginning to get an inkling that almost every activity that results in the creation of an entity or storing of additional information can be traced to a transition from a previous business state.
  • In any transition, the previous state is the aggregate root.
  • Validation of string lengths, data ranges, etc. is not domain logic and is best handled elsewhere. The same goes for uniqueness.
  • If your service layer is newing up some entity and saving it – that entity isn’t an aggregate root in that use case. As we saw above, in the original creation of the Visitor entity by the Referrer, the visitor class wasn’t the aggregate root. Yet, in the user registration use case, the Visitor entity was the aggregate root.
  • Aggregate roots aren’t a structural property of the domain model.
  • don’t go saving entities in your service layer – let the domain model manage its own state.
  • The domain model needs no references to repositories, services, units of work, or anything else to manage its state.
    • with this, you’ll also be able to harness the technique of fetching strategies to get the best performance out of your domain model by representing your use cases as interfaces on the domain model like IRegisterUsers (implemented by Visitor) and IBringVisitors (implemented by Referrer).
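
A rough sketch of the Referrer/Visitor idea above, where new entities are always created by an already existing entity rather than by the service layer. The class fields and the method names `bring_visitor` and `register` are assumptions; the source only names the roles (IBringVisitors, IRegisterUsers).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class User:
    email: str


@dataclass
class Visitor:
    ip: str
    user: User | None = None

    def register(self, email: str) -> User:
        """User registration use case: here the Visitor is the aggregate root."""
        if self.user is not None:
            raise ValueError("visitor is already registered")
        self.user = User(email)
        return self.user


@dataclass
class Referrer:
    url: str
    visitors: list[Visitor] = field(default_factory=list)

    def bring_visitor(self, ip: str) -> Visitor:
        """Visitor creation use case: here the Referrer is the aggregate root."""
        visitor = Visitor(ip)
        self.visitors.append(visitor)
        return visitor


# The service layer only asks existing entities to make the transition.
referrer = Referrer("https://example.com")
visitor = referrer.bring_visitor("203.0.113.7")
user = visitor.register("someone@example.com")
```

Note how the same `Visitor` class is a plain member of the aggregate in one use case and the root in the other, which is the sense in which aggregate roots are not a structural property of the model.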

Aggregate Root – How to Build One for CQRS and Event Sourcing

  • An invariant describes something that must be true within your design, at all times. The only exception is during a transition to a new state.
    • For example, an employee cannot take more annual leave than they have. This invariant could involve an employee object and a list of holiday objects. The sum of the days in the holiday objects must not exceed the total days holiday allocated.
  • Invariants help us to discover our Bounded Context.
  • A bounded context groups together a model that may have one or many objects. These objects will have invariants within and between them.
    • In the holiday example, the bounded context could include a representation of an employee and their leave record.
  • The job of enforcing the invariants within a bounded context is that of the Aggregate Root which is also an Entity.
    • to control and encapsulate access to its members in such a way as to protect its invariants.
    • An Aggregate Root is an Entity and will therefore have an Id.
    • Aggregates should have little or no dependencies on outside services.
  • An entity is an object that differs by ID.
    • Take for example, a telephone company. Within that context, a ‘line’ identified by a telephone number would be an entity.
    • The context is important.
    • Within a CRM application, the equivalent of a ‘line’ would not be an entity. The ‘line’ may be a ‘Value Object’ attached to a customer who is in turn an entity.
  • Imagine how much simpler a class is to design and reason about if it is purely doing its thing (i.e. Single Responsibility).
  • No extra clutter of persistence or similar distractions. At some point, however, the outside system will need to know the aggregate’s state. This is where GetUncommittedChanges comes into use (the changes are cleared afterwards by MarkChangesAsCommitted).
  • Once an aggregate has completed a command, it will store up the changes in the _changes list. These changes are in the form of an Event or description of what has just happened. On completion of the command, the outside system would then request any uncommitted changes. That list would then be saved. On success, the MarkChangesAsCommitted method is called and the state transition is complete.
  • Processing Command and Generating Events
    • On receipt of a command, the first task of the aggregate is to ensure it can run the command.
      • In our holiday example above, we would do the following to ensure the invariant remained true. First we would sum the total days holiday the employee has already taken. We would then add the days requested in the command to the total. The final step would check the total did not exceed the maximum allocated days. In other words, before it can generate a state change event, it must ensure that all invariants would still hold true.
      • BookHoliday (command) + HolidayBooked (event)
    • The First Phase is not for User Validation
      • Tasks like required fields, data ranges and string lengths are not domain logic.
      • User input validation should happen before the system generates a command.
      • Once you have determined the command can be run, you generate an event message. This will contain the data that describes the change.
        • In the holiday request example, we might create a HolidayRequested event. At a minimum, it would contain the aggregate id, the version it was applied at, and then context specific data. In this case the dates from and to (assuming you can only take whole days). At this point the state of the aggregate has not changed.
    • The Second Phase – Applying the Change
      • Now that we have an event, the aggregate can apply the state transition to itself.
        • To do this it would call the ApplyChange method (these two methods are defined within the aggregate root).
      • You will notice that it passes the ApplyChange through to a private method. This method has a parameter indicating that it is a new event. This indicator allows us to differentiate between historical events and new events. We capture new events in the _changes list in order for an outside service to grab them and save to disk.
  • Loading From History
    • Before any processing of commands, the aggregate must be brought up to date. The external service retrieves the list of previous events.
      • Given the aggregate is an entity it is a trivial ‘get by id’ or ‘SELECT * FROM X WHERE ID = …’ database query.
      • The outside service would then pass the events to the LoadFromHistory method, which in turn would apply each event to itself. Key here is that these are not new events and should not get added to the _changes list. This is only possible because all commands are pre-checked. On success the aggregate generates the events. You can therefore be confident that the events, when played back in order, will return the aggregate to its current state.
  • A small and relatively simple abstract class is at the heart of CQRS and event sourcing.
    • It provides a clean mechanism for protecting invariants and encapsulating functionality.
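
The aggregate root described above can be sketched in Python as follows. The method names mirror the ones in the notes (GetUncommittedChanges, MarkChangesAsCommitted, LoadFromHistory, ApplyChange) in snake_case; the `EmployeeLeave` aggregate, the `_on_<EventName>` dispatch convention, and storing a day count instead of from/to dates are assumptions made for brevity.

```python
from abc import ABC
from dataclasses import dataclass


@dataclass(frozen=True)
class HolidayBooked:
    aggregate_id: str
    version: int
    days: int  # simplification: the notes suggest from/to dates instead


class AggregateRoot(ABC):
    def __init__(self, aggregate_id: str) -> None:
        self.id = aggregate_id
        self.version = 0
        self._changes = []  # new events not yet saved by the outside service

    def get_uncommitted_changes(self) -> list:
        return list(self._changes)

    def mark_changes_as_committed(self) -> None:
        self._changes.clear()

    def load_from_history(self, history) -> None:
        for event in history:
            self._apply(event, is_new=False)  # historical: not recorded again

    def apply_change(self, event) -> None:
        self._apply(event, is_new=True)

    def _apply(self, event, is_new: bool) -> None:
        # Dispatch to a handler named after the event class (assumed convention).
        getattr(self, f"_on_{type(event).__name__}")(event)
        self.version += 1
        if is_new:
            self._changes.append(event)


class EmployeeLeave(AggregateRoot):
    def __init__(self, aggregate_id: str, allocated_days: int) -> None:
        super().__init__(aggregate_id)
        self._allocated = allocated_days
        self._taken = 0

    def book_holiday(self, days: int) -> None:
        # Phase 1: check the invariant before generating any event.
        if self._taken + days > self._allocated:
            raise ValueError("holiday would exceed the allocated days")
        # Phase 2: apply the state transition through the event.
        self.apply_change(HolidayBooked(self.id, self.version + 1, days))

    def _on_HolidayBooked(self, event: HolidayBooked) -> None:
        self._taken += event.days


leave = EmployeeLeave("emp-1", allocated_days=25)
leave.book_holiday(10)
saved = leave.get_uncommitted_changes()  # the outside service persists these
leave.mark_changes_as_committed()

reloaded = EmployeeLeave("emp-1", allocated_days=25)
reloaded.load_from_history(saved)  # rebuilds state without new changes
```

State is changed only inside the event handler, never in `book_holiday` itself, so replaying saved events through `load_from_history` takes exactly the same path as the original command did.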
 
