# Understanding Causality and Order: Distributed Systems

### Time, Clocks and Order

In the following post, we will understand and examine various methods used to understand time and event order in a distributed system.

## Introduction

Distributed systems are powerful and they provide solutions to complex problems by clubbing multiple computers together. One of the biggest problems in distributed systems which we gain over a single machine system is establishing a correct timeline of events.

This is needed for understanding the ordering and causality of events within system. We need this to do multiple operations. Let us say there are two operations received INSERT and UPDATE in a distributed database. Unless we have a correct order, the operation will throw the error because you can’t update something which hasn’t been inserted yet.

There are two ways to record time in distributed systems -

Clock Synchronisation (Physical Clocks)

Logical Clocks

## Clock Synchronisation

Physical Clocks are tied to the notion of real time. It can be used to order events, find time difference between two events, etc.

In this system, each node has a local clock used by it to timestamp events at the node. The local clocks of different nodes may vary but they need to be synchronized. Read about Clock Synchronisation problem is you are more interested in this topic. A common approach for pair-wise synchronisation between computers is through the use of the client-server model i.e let clients contact a time server, i.e Network Transfer Protocol. Berkeley’s algorithm is another way to solve this problem.

We won’t go into too much detail in this section for this article.

## Logical Clocks

An approximation of time based on Physical Clocks may be good enough if events infrequently. However, in most distributed systems this is not a good enough guarantee. Therefore, we have to look to a virtual way of expressing time between machines, so we can keep the ability to place events into an accurate timeline. This is the notion of **Logical Clocks**.

**Logical Clocks **refer to implementing a protocol on all machines within your distributed system, so that the machines are able to maintain consistent ordering of events within some virtual timespan.

### Lamport’s Clocks

A breakthrough paper *Time, Clocks, and the Ordering of Events in a Distributed System*, Leslie Lamport defined the **“happens-before”** relationship “**→” **as follows:

If a and b are events in the same process, and a comes before b, then a→b.

If a is the sending of a message by one process and b is the receipt of the same message by another process, then a→b.

If a→b and b→c, then a→c.

If events x, y occur in different processes that do not exchange messages, neither x → y nor y → x is true, x and y are said to be **concurrent**.

A **Limitation** of Lamport Clock(or Scalar Logical Clock) is the following - x → y implies C(x) < C(y) but C(x) < C(y) doesn’t imply x → y!

Also, In order to become causally consistent we need a way of representing local time and global time separately. This is where **Vector Clocks** come in.

### Vector Clocks

Vector Clocks, a more advanced version of Lamport’s Logical Clock. They are designed to overcome scalar clock problem.

For each process Pᵢ of the system, the algorithm maintains a vector VCᵢ with the following attributes:

Cᵢ[j]: local logical clock at Pᵢ, or the number of events that have occurred before the current timestamp.

If Cᵢ[j] = k, Pᵢ knows that k events have occurred at Pj.

The algorithm of Vector Clocks then goes as follows:

Before executing an event, Pᵢ records a new event happens at itself by executing

**Cᵢ[i] <- Cᵢ[i] + 1.**When Pᵢ sends a message m to Pj, it sets timestamp t(m) equal to Cᵢ after having executed the previous step.

When message

**m**is received, process**Pj**update each**k**of its own vector clock:**Cj [k] ← max { Cj [k], t(m)[k]}**. Then it continues executing the first step and delivers the message to the application.

By these steps, when a process Pj receives a message m from process P**ᵢ **with timestamp t(m), it knows the number of events that have occurred at P**ᵢ **casually preceded the sending of m. Furthermore, Pj also knows about the events that have been known to P**ᵢ **about other processes before sending m.

#### Problems of Vector Clock

Message size increases since each message needs to be tagged with the vector

Size can be reduced in some cases by only sending values that have changed

Can also send only a scaler to keep track of direct dependencies only, with indirect dependencies computed when needed Tradeoff between message size and time

When it comes to the concept of time in distributed system, the primary goal is to achieve the **correct order of the events**. It can be done either in chronological order with Physical Clocks or in logical order with Lamport’s Logical Clocks and Vector Clocks along the execution timeline.

References -

https://lamport.azurewebsites.net/pubs/time-clocks.pdf

https://levelup.gitconnected.com/distributed-systems-physical-logical-and-vector-clocks-7ca989f5f780

https://medium.com/distributed-knowledge/time-synchronization-in-distributed-systems-a21808928bc8

https://profile.iiita.ac.in/bibhas.ghoshal/lecture_slides/distributed/Clock.pdf

Please feel free to reach out to me if you need help with building large scale distributed systems, deep learning or data engineering solutions at your organisation, I would be more than happy to help.

Read more of my work -

Also check out my Youtube Channel -

I hope that you enjoyed this read, if so then please share this article with your friends, let’s build a solid community. I will be back soon with another well thought/researched article delivered straight in your inbox. See you!