Event-driven notes

Dependency inversion

Customer -[Address changed]-> Billing

When this is passed as an event, the Customer system does not need to know about the Billing system. The direction of the dependency has been reversed.

https://www.infoq.com/presentations/event-driven-benefits-pitfalls/
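A minimal sketch of the reversed dependency (hypothetical names; the topic "address-changed" is made up): the Customer side only publishes the event, and Billing subscribes without Customer ever referencing it.

```java
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CustomerService {
    private final Producer<String, String> producer;

    public CustomerService(Producer<String, String> producer) {
        this.producer = producer;
    }

    public void changeAddress(String customerId, String newAddress) {
        // ... update the customer record ...
        // Publish the fact. Billing (or any other interested system) consumes
        // the "address-changed" topic; Customer has no reference to Billing.
        producer.send(new ProducerRecord<>("address-changed", customerId, newAddress));
    }
}
```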

Topics, consumers, etc.

A topic can have multiple producers and consumers. Kafka provides at-least-once messaging guarantees.

A consumer is typically part of a Consumer Group.

each consumer in the group will receive messages from a different subset of the partitions in the topic. [...] If we add more consumers to a single group with a single topic than we have partitions, some of the consumers will be idle and get no messages at all.

I.e., there is only one consumer per partition (per consumer group). When a consumer becomes a bottleneck, the number of topic partitions must be increased before adding more consumers can scale things out.

One consumer per thread is the rule. To run multiple consumers in the same group in one application, you will need to run each in its own thread.
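A sketch of the one-consumer-per-thread rule, assuming a topic "orders" with at least 3 partitions (both names are made up). KafkaConsumer is not thread safe, so each thread gets its own instance; because all three share a group.id, Kafka divides the partitions among them.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ConsumerGroupDemo {
    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            new Thread(() -> {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");
                props.put("group.id", "billing"); // same group => partitions are shared out
                props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
                props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
                // Each thread owns its own KafkaConsumer instance.
                try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                    consumer.subscribe(List.of("orders"));
                    while (true) {
                        for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500)))
                            System.out.printf("%s got partition=%d offset=%d%n",
                                    Thread.currentThread().getName(), record.partition(), record.offset());
                    }
                }
            }).start();
        }
    }
}
```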

one of Kafka’s unique characteristics is that it does not track acknowledgments from consumers the way many JMS queues do. Instead, it allows consumers to use Kafka to track their position (offset) in each partition.

Kafka can store a consumer's latest committed offset for a partition, but you don't have to store the offset in Kafka.

Most developers exercise more control over the time at which offsets are committed—both to eliminate the possibility of missing messages and to reduce the number of messages duplicated during rebalancing.

enable.auto.commit=true commits automatically every auto.commit.interval.ms (5 seconds by default). The "normal" commitSync()/commitAsync() commits the offset of the latest message returned by the last poll(). To commit at any precise point, the developer must pass explicit offsets to commitSync()/commitAsync().
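A sketch of committing at a precise point with explicit offsets (assumes enable.auto.commit=false and a hypothetical process() step). Note that the committed offset is the position of the next message to read, hence the +1:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.Map;

public class PreciseCommit {
    static void pollOnce(KafkaConsumer<String, String> consumer) {
        for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
            process(record);
            // Commit exactly this record: committed offset = next offset to read.
            consumer.commitSync(Map.of(
                    new TopicPartition(record.topic(), record.partition()),
                    new OffsetAndMetadata(record.offset() + 1)));
        }
    }

    static void process(ConsumerRecord<String, String> record) { /* hypothetical processing */ }
}
```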

If we store the offset outside Kafka (e.g., in a DB), we can achieve exactly-once processing instead of potential duplicates. How? By storing the offset in the DB for each message processed (in the same transaction as the result of the processing), and when a consumer starts (or is rebalanced), it seek()s to the DB offset. (If the consumer produces Kafka messages instead of writing to a DB, the same can be achieved with Kafka transactions: the producer's sendOffsetsToTransaction() commits the consumer offsets, via the __consumer_offsets topic, atomically with the produced messages.)

https://www.oreilly.com/library/view/kafka-the-definitive/9781491936153/ch04.html
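A sketch of the DB-offset pattern described above, with hypothetical helpers offsetFromDb()/saveToDb() standing in for real DB access: the processing result and the next offset go in one DB transaction, and a ConsumerRebalanceListener seek()s back to the stored offset on (re)assignment.

```java
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.Collection;
import java.util.List;

public class DbOffsetConsumer {
    void run(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(List.of("orders"), new ConsumerRebalanceListener() {
            @Override public void onPartitionsRevoked(Collection<TopicPartition> partitions) { }
            @Override public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // Ignore Kafka's committed offset; resume from what the DB says.
                for (TopicPartition tp : partitions)
                    consumer.seek(tp, offsetFromDb(tp));
            }
        });
        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                // One atomic DB transaction: the processing result AND the next offset.
                saveToDb(result(record), record.offset() + 1);
            }
        }
    }

    // Hypothetical helpers standing in for real DB access:
    long offsetFromDb(TopicPartition tp) { return 0L; }
    String result(ConsumerRecord<String, String> record) { return record.value(); }
    void saveToDb(String result, long nextOffset) { }
}
```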

Events with the same event key (e.g., a customer or vehicle ID) are written to the same partition, and Kafka guarantees that any consumer of a given topic-partition will always read that partition's events in exactly the same order as they were written.

https://kafka.apache.org/documentation/#intro_concepts_and_terms
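A sketch of keyed writes (the topic "vehicle-positions" is made up): both records carry the key "vehicle-42", so they hash to the same partition, and consumers of that partition read them in write order.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class KeyedProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Same key "vehicle-42" => same partition => delivered in write order.
            producer.send(new ProducerRecord<>("vehicle-positions", "vehicle-42", "59.91,10.75"));
            producer.send(new ProducerRecord<>("vehicle-positions", "vehicle-42", "59.92,10.76"));
        }
    }
}
```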
