- The following file gives a summary and notes organized by section based on Leslie Lamport's paper related to distributed systems.
- Document: Time, Clocks and the Ordering of Events in a Distributed System
- Supplemental document 1
- Provides more details related to necessary math prerequisites in context to logical clocks.
- Supplemental video 1
- Provides a verbal explanation with examples.
- Supplemental video 2
- An interview with Leslie Lamport providing comments about the document of interest.
This white paper was written by Leslie Lamport and published in 1978 within the journal, "Communications of the ACM" by the Association for Computing Machinery. The paper highlights a common concern within distributed systems related to the difficulty of anticipating the ordering of events between processes within the system. The paper suggests a solution by proposing an abstract mechanism known as a logical clock that seeks to accurately discern a total ordering by leveraging the "happened-before" relation. Furthermore, it demonstrates how the mechanism can be used to implement the bakery algorithm to enforce mutual exclusion within the system.
A distributed system using the logical clock design will assert that every process within the system contains their own local logical clock. A process will assign it's name as well as a monotonically increasing integer to every event that occurs within it which is known as a Lamport timestamp. These timestamps can be used to uniquely identify events and derive a total ordering that is consistent with their actual ordering. The total ordering of events in a distributed system is important because it allows the behavior of the system to be understood and analyzed. By knowing the order in which events occurred, it is possible to identify patterns and trends, and to detect potential problems or issues. The paper showcases this by introducing the bakery algorithm which is responsible for delegating access to a shared resource.
- A distributed system consists of a collection of distinct processes which are spatially separated, and which communicate with one another by exchanging messages.
- A system is distributed if the message transmission delay is not negligible compared to the time between events in a single process.
- It is sometimes impossible to say that one of two events occurred first.
- This section vaguely suggests a "happened before" relation and mentions that it can be used within an algorithm to manage the sequence of events.
-
If a system is to meet a specification correctly, then that specification must be given in terms of events observable within the system.
-
Priori probability is a likelihood of occurrence that can be deduced logically.
-
This section loosely defines a process to be a set of events.
-
The symbol
$\rightarrow$ is notation used to denote the "happened before" relation. -
The sequence of these events are often modeled using space-time diagrams
- The x-axis represents space.
- The y-axis represents time.
- A solid vertical line denotes a process.
- A point on a process denotes an event.
- Wavy lines between points denote messages.
-
The definition of the "happened before" relation asserts that the following three conditions must be satisfied:
-
If
$p_{1}$ and$p_{2}$ are events in the same process, and$p_{1}$ comes before$p_{2}$ , then$p_{1} \rightarrow p_{2}$ . -
If
$q_{5}$ is the sender of a message by one process and$p_{4}$ is the recipient of the same message by another process, then$q_{5} \rightarrow p_{4}$ . -
If
$a \rightarrow b$ and$b \rightarrow c$ then$a \rightarrow c$ .
-
-
If event
$a$ happens before event$b$ then it is possible for$a$ to casually affect$b$ . -
Two distinct events
$a$ and$b$ are said to be concurrent if$a \cancel{\rightarrow} b$ and$b \cancel{\rightarrow} a$ - Concurrent events are not required to occur at the same time in order to be considered concurrent; they're simply events that can occur in any order, without any causal relationship between them.
$p_{3}$ and$q_{3}$ are concurrent because they're in different processes and cannot affect each other.
- Concurrent events are not required to occur at the same time in order to be considered concurrent; they're simply events that can occur in any order, without any causal relationship between them.
-
The aforementioned characteristics imply that
$\rightarrow$ is a irreflexive partial ordering on the set of all events in the system. -
The paper appears to only be considering messages that are sent and does not include message that could be sent.
-
A logical clock is just an abstract concept whose primary purpose is to assign numbers of an incremental nature to an event.
-
A distributed system consists of multiple processes all of which have their own local logical clock.
-
The numbers assigned to events can loosely be perceived as a timestamp for when the event occurred.
-
We can conclude that an event with a lower timestamp on the same process occurred earlier than an event with a higher timestamp.
- This isn't always the case for two events that occurred on different processes due to concurrent events.
-
The clock condition asserts that an event's timestamp must satisfy the following conditions that are loosely related to the happen-before relation:
- If
$a$ and$b$ are events in process$P_{i}$ , and$a$ comes before$b$ , then$C_{i}(a) < C_{i}(b)$ . - If
$a$ is the sender of a message by process$P_{i}$ and$b$ is the receiptient of that message by process$P_{k}$ , then$C_{i}(a) < C_{k}(b)$ .
- If
-
Observe that the converse of our clock condition cannot always hold: If
$C_{i}(a) < C_{i}(b)$ then$a$ happens before$b$ .- This is because
$a$ and$b$ could be concurrent.
- This is because
-
This section attempts to build a relation between the clock condition that governs our logical clocks and temporal time. The dashed lines are used to create a coordinate plane that represent units of physical time. Observe that
$p_{2}$ happens before$p_{3}$ this implies there must be a unit of time inbetween them represnted by the dashed line.- Observe that there is at least 1 line between every event with the same process.
- Observe that there is at least 1 line between any two events sending / receiving a message.
-
The paper suggests that a 3D space-time diagram will better illustrate the temporal relationship between events. This makes it easier to observe the passage of time and the relations between events of different processes.
-
The section describes a methodology consisting of two properties applied to the system of clocks used to satisfy the clock condition.
- Each process
$P_{i}$ increments$C_{i}$ between any two successive events.- This satisfy clock condition 1
- Each message requires the sender to attach their timestamp. The recipient must then compare that timestamp with their own then take the max and increment that value by 1.
- This satisfy clock condition 2
- Each process
- The beginning of this section is just describing how we can create a total ordering of the events.
- Essentially, an event happens before another event if its timestamp integer is lower. If the timestamps are the same then we settle a tie breaker based on names attached to the timestamp.
- The document mentions that our ordering is non-unique because of our tie breaking condition. If we use a different system of clocks with different names it'll yield a different total ordering.
- Essentially, an event happens before another event if its timestamp integer is lower. If the timestamps are the same then we settle a tie breaker based on names attached to the timestamp.
- The total ordering of events in a distributed system is important because it allows the behavior of the system to be understood and analyzed. By knowing the order in which events occurred, it is possible to identify patterns and trends, and to detect potential problems or issues. The paper demonstrates this by using the total ordering to implement the bakery algorithm which enforced concurrency within the system.