THEORY: Distributed Transactions and why you should avoid them (2-Phase Commit, Saga Pattern, TCC, Idempotency, etc.)

Distributed Transactions and why you should avoid them

  1. Modern technologies don't support them (RabbitMQ, Kafka, etc.);
  2. They are a form of synchronous inter-process communication, which reduces availability;
  3. All participants of the distributed transaction need to be available for the distributed commit, which again reduces availability.

Implementing business transactions that span multiple services is not straightforward. Distributed transactions are best avoided because of the CAP theorem. Moreover, many modern (NoSQL) databases don’t support them. The best solution is to use the Saga Pattern.

[...]

One of the most well-known patterns for distributed transactions is called Saga. The first paper about it was published back in 1987, and it has been a popular solution ever since.

There are a couple of different ways to implement a saga transaction, but the two most popular are:

  • Events/Choreography: when there is no central coordination, each service produces and listens to other services' events and decides whether an action should be taken;
  • Command/Orchestration: when a coordinator service is responsible for centralizing the saga's decision-making and sequencing of the business logic (see the orchestration sketch after this list).
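
To make the orchestration variant concrete, here is a minimal, hypothetical sketch in plain Java: each step carries an action and a compensating action, and when a step fails the orchestrator undoes the already-completed steps in reverse order. The `SagaOrchestrator`/`SagaStep` types and the step names are illustrative assumptions, not taken from any specific framework.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical, minimal saga orchestrator: every step has an action and a
// compensating action; if any step fails, the already-completed steps are
// compensated in reverse order (eventual consistency instead of 2PC).
public class SagaOrchestrator {

    record SagaStep(String name, Runnable action, Runnable compensation) {}

    public static void execute(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action().run();
                completed.push(step);
            } catch (Exception e) {
                // undo only the steps that actually completed, newest first
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                throw new IllegalStateException("Saga aborted at step " + step.name(), e);
            }
        }
    }

    public static void main(String[] args) {
        try {
            execute(List.of(
                new SagaStep("create-order",   () -> System.out.println("order created"),
                                               () -> System.out.println("order cancelled")),
                new SagaStep("reserve-stock",  () -> System.out.println("stock reserved"),
                                               () -> System.out.println("stock released")),
                new SagaStep("charge-payment", () -> { throw new RuntimeException("payment declined"); },
                                               () -> System.out.println("payment refunded"))
            ));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // saga rolled back via compensations
        }
    }
}
```

In the choreography variant there would be no such coordinator: each service would react to the previous service's event and publish its own success or failure event.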
@iseki0 commented Nov 12, 2023

But, as you said, Saga is also a distributed transaction pattern. So... what does "avoid distributed transactions" mean?

@rponte (Author) commented Nov 17, 2023

This article has a very didactic explanation of how to implement the different types of delivery semantics, as we can see below:

Offset Manager

Each message in Kafka is associated with an offset - an integer number denoting its position in the current partition. By storing this number, we essentially provide a checkpoint for our consumer. If it fails and comes back, it knows from where to continue. As such, it is vital for implementing various processing guarantees in Kafka:

  • For at-most-once, we need to save $offset + 1 before processing $offset. If our consumer fails before successfully processing $offset and restarts, it will continue from $offset + 1 and not reprocess $offset.
  • For at-least-once, we need to successfully process $offset before saving $offset + 1. If our consumer fails before saving $offset + 1 and restarts, it will continue from and reprocess $offset.
  • For exactly-once using external transactional storage, we need to process $offset and save $offset + 1 within one transaction and roll back if anything goes wrong.
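
A minimal sketch of the at-least-once variant with the standard Java KafkaConsumer, assuming a local broker, a "payments" topic and a "payments-consumer" group (all illustrative names): auto-commit is disabled, the records are processed first and the offset is committed afterwards; swapping those two steps would turn it into at-most-once.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical at-least-once consumer loop: the offset is committed only AFTER
// the records have been processed, so a crash between processing and committing
// leads to reprocessing, never to message loss.
public class AtLeastOnceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "payments-consumer");          // assumed consumer group
        props.put("enable.auto.commit", "false");            // we commit offsets ourselves
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("payments"));          // assumed topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record);       // 1) process $offset first...
                }
                consumer.commitSync();     // 2) ...then save $offset + 1
                // For at-most-once, commitSync() would run right after poll(),
                // and only then would the records be processed.
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}
```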

@rponte (Author) commented Jan 19, 2024

@rponte (Author) commented Jan 24, 2024

My retweet, when it finally clicked:

Excellent question from @tomazfernandes_ 👏🏻👏🏻

A robust, fault-tolerant and cheap-to-implement solution that is retriable, idempotent and embraces eventual consistency is precisely the Outbox Pattern 🤤🤤

Long live RDBMS ACID + that background job running every 1 minute 🥳
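
A rough sketch of that idea with plain JDBC, under assumed table names (orders, outbox) and an assumed publishToBroker stand-in: the business row and the outgoing event are written in one local ACID transaction, and a background job later relays any unpublished events, giving at-least-once delivery to the broker.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

// Hypothetical Outbox Pattern sketch: table names, columns and the publisher
// are assumptions for illustration only.
public class OutboxExample {

    record OutboxEvent(long id, String type, String payload) {}

    // Step 1: the business write and the outbox insert share one ACID transaction.
    static void placeOrder(Connection conn, long orderId, String payload) throws Exception {
        conn.setAutoCommit(false);
        try (PreparedStatement order = conn.prepareStatement(
                 "INSERT INTO orders (id, status) VALUES (?, 'PLACED')");
             PreparedStatement outbox = conn.prepareStatement(
                 "INSERT INTO outbox (event_type, payload, published) VALUES ('OrderPlaced', ?, FALSE)")) {
            order.setLong(1, orderId);
            order.executeUpdate();
            outbox.setString(1, payload);
            outbox.executeUpdate();
            conn.commit();                 // both rows are written, or neither is
        } catch (Exception e) {
            conn.rollback();
            throw e;
        }
    }

    // Step 2: a background job (e.g. scheduled every minute) relays pending events.
    static void relayOutbox(Connection conn) throws Exception {
        conn.setAutoCommit(true);          // each UPDATE below commits on its own
        List<OutboxEvent> pending = new ArrayList<>();
        try (PreparedStatement select = conn.prepareStatement(
                 "SELECT id, event_type, payload FROM outbox WHERE published = FALSE ORDER BY id");
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                pending.add(new OutboxEvent(rs.getLong("id"), rs.getString("event_type"),
                                            rs.getString("payload")));
            }
        }
        for (OutboxEvent event : pending) {
            publishToBroker(event.type(), event.payload());  // may repeat on crash: at-least-once
            try (PreparedStatement mark = conn.prepareStatement(
                     "UPDATE outbox SET published = TRUE WHERE id = ?")) {
                mark.setLong(1, event.id());
                mark.executeUpdate();
            }
        }
    }

    static void publishToBroker(String type, String payload) {
        System.out.println("publishing " + type + ": " + payload); // stand-in for Kafka/RabbitMQ
    }
}
```

Because the relay may publish the same event more than once, consumers are expected to be idempotent, which is exactly why the pattern pairs well with eventual consistency.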

@rponte (Author) commented Apr 3, 2024

Tweet by @DominikTornow

The ingredients of failure tolerance & transparency are more basic than you would think:

1⃣ Guarantee retries via persisting the invocation
2⃣ Guarantee correctness of retries via idempotent steps

Outlined in Fault Tolerance via Idempotence
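
A small, hypothetical illustration of those two ingredients: the result of each invocation is recorded under an idempotency key (an in-memory map standing in here for a durable store), so when a persisted invocation is retried with the same key the step becomes a no-op instead of a duplicated side effect. The `IdempotentStep`/`chargeCard` names are assumptions for the sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of "guarantee retries + guarantee correctness of retries":
// the invocation result is stored per idempotency key, so re-executions of the
// same key return the first result instead of repeating the side effect.
public class IdempotentStep {

    // In production this would be a durable table keyed by idempotency key;
    // a ConcurrentHashMap is only a stand-in to keep the sketch runnable.
    private final Map<String, String> processed = new ConcurrentHashMap<>();

    public String chargeCard(String idempotencyKey, long amountInCents) {
        return processed.computeIfAbsent(idempotencyKey, key -> {
            System.out.println("charging " + amountInCents + " cents (key=" + key + ")");
            return "charge-" + key;   // pretend charge id returned by the payment provider
        });
    }

    public static void main(String[] args) {
        IdempotentStep step = new IdempotentStep();
        String first = step.chargeCard("order-42", 1999);  // performs the charge
        String retry = step.chargeCard("order-42", 1999);  // retry: no second charge
        System.out.println(first.equals(retry));           // true
    }
}
```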

