Engineering

CRDT by a simple example

Alexey Kuznetsov ·

Senior front-end developer with strong math background from Vilnius, Lithuania. Passionate about vue.js, reactivity, browser performance.

Workload management for engineering

This is a short abstract about what CRDT is and how to solve concurrency issues in the existing collaboration software by applying CRDT principles.

The Issue

Multiple clients are working with the same data, the changes should be asynchronously propagated between instances, and if some client has a connection blip, the state should be automatically synced back after the connection is restored.

The solution

CRDT stands for Conflict-free/Commutative Replicated Data Type. It’s the answer to the problem of synchronizing data in distributed environments. There’s a well-developed theory coming from the 1970s. There are many papers authored by Marc Shapiro and other scientists about CRDT, order theory, partially ordered sets, Lamport timestamps, and others. https://crdt.tech and https://github.com/alangibson/awesome-crdt might be a good starting point. https://github.com/yjs/yjs is probably the best implementation of CRDT. 

We are not going into the theory in this abstract. Instead, we take an existing system and check what is not CRDT there and how to make it CRDT.

CRDT principles

  1. Every client has a state that is updated independently, locally, without coordinating with other clients. Every update is an event that goes to other clients asynchronously. 
  2. Any inconsistencies are resolved automatically.
  3. Although replicas may have a different state at any particular time, it’s guaranteed that they will have an equal state when all the events are applied everywhere.
  4. A view of CRDT can be only out-of-date but never incorrect.

A trivial example of CRDT system is a counter. Imagine the state is just a number, and allowed operations are increasing/decreasing by 1. Multiple clients start with the same number, they generate ±1 events, these events are commutative (it does not matter in which order they go), hence when all the events are received on every client, the resulting state will match everywhere. 

Event-driven updates

The principle is familiar to those working with modern front-end frameworks. React/Redux/Mobx and Vue/Vuex/Pinia have a state that is updated by a well-defined set of mutations. Every mutation might be considered as an event that is pushed to other clients. Hence, we define a growing-only array of events, every event defines a mutation and has a unique timestamp. In practice, a proper timestamp is a join of:

  • a global timestamp millisecond
  • growing-only sequential number of an event on the current client
  • a random hash

This guarantees that all the timestamps are unique, and events from different clients might be combined into the strictly ordered set. 

Todo list

Two clients are working on the same todo list which is an array of entries like {title: string; isDone: string;}. They can create, update, and delete any entry at any index. Suppose, a create event has the following structure: {index: number; title: string}. It’s not CRDT:

Initial state:
[first, second, third]
Initial state:
[first, second, third]
Insert {next} at index 1Insert {another} at index 2
Local state before synchronization:
[first, next, second, third]
Local state before synchronization:
[first, second, another, third]
Receiving events from other clients:
Insert {another} at index 2
Receiving events from other clients:
Insert {next} at index 1
Result:
[first, next, another, second, third]
Result:
[first, next, second, another, third]

The root cause of the issue is that CRDT can not maintain global invariants, and array indexing is a global invariant. Another simpler example of a global invariant is an accounting system with a requirement for a balance to be positive. Then, if an initial balance is 100, and one client spends 70 while another client spends 40, both transactions are valid, but the resulting balance becomes negative after synchronization.

An obvious fix for the todo list is to assign a unique ID to every element. Then, inserting a new entry should specify the previous entry ID instead of the index. Another requirement of CRDT is that we should not hard delete anything. Imagine one client adds a new todo after some entry while another client deletes it – how to calculate the index of where to put the new todo?

Finally, the last-write-wins principle is taking place. If two clients update the same todo title at the same time, there should be a deterministic way to know the result. We order events by the timestamp and take the last title.

Clone & rename

Suppose, there’s an ability to clone an existing todo entry, and the clone command has just a single argument, {todoIdToClone}. This is not CRDT:

Initial state:
[{id: 123, title: "Initial title"}]
Initial state:
[{id: 123, title: "Initial title"}]
Clone 123 entryRename 123 entry to “Changed title”
Local state before synchronization:
[{id: 123, title: "Initial title"}, { id: 456, title: "Initial title"}]
Local state before synchronization:
[{id: 123, title: "Changed title"}]
Receiving events from other clients:
Rename 123 entry to “Changed title”
Receiving events from other clients:
Clone 123 entry
Result:
[{id: 123, title: "Changed title"}, { id: 456, title: "Initial title"}]
Result:
[{id: 123, title: "Changed title"}, { id: 789, title: "Changed title"}]

The reason is clone command is not CRDT compatible. It uses the current state which is a global invariant. Instead, it should be converted into create command which has all the data and does not refer to the current state:

Initial state:
[{id: 123, title: "Initial title"}]
Initial state:
[{id: 123, title: "Initial title"}]
Clone 123 entry:
Run create command with the following data: {id: 456, title: "Initial title", placeAfter: 123}
Rename 123 entry to “Changed title”
Local state before synchronization:
[{id: 123, title: "Initial title"}, { id: 456, title: "Initial title"}]
Local state before synchronization:
[{id: 123, title: "Changed title"}]
Receiving events from other clients:
Rename 123 entry to “Changed title”
Receiving events from other clients:
Clone 123 entry:
Run create command with the following data: {id: 456, title: "Initial title", placeAfter: 123}
Result:
[{id: 123, title: "Changed title"}, { id: 456, title: "Initial title"}]
Result:
[{id: 123, title: "Changed title"}, { id: 456, title: "Initial title"}]

Convert to parent & move to another board

In our system, todos are distributed along different boards and might have children. There are two commands:

  • {command: move-to-another-board, arguments: {todoId, newBoardId}}
  • {command: convert-to-parent, arguments: {todoId}}

What if one client moves an item with the child to another board, while another client converts the child to parent? The order of commands matters, and this is a clear sign something is not CRDT here:

Initial state:
[{id: 123, title: "Root", board: 1}, {id: 456, title: "Child", board: 1, parent: 123}]
Initial state:
[{id: 123, title: "Root", board: 1}, {id: 456, title: "Child", board: 1, parent: 123}]
Move 123 to board 2Convert 456 to parent
Local state before synchronization:
[{id: 123, title: "Root", board: 2}, {id: 456, title: "Child", board: 2, parent: 123}]
Local state before synchronization:
[{id: 123, title: "Root", board: 1}, {id: 456, title: "Child", board: 1}]
Receiving events from other clients:
Convert 456 to parent
Receiving events from other clients:
Move 123 to board 2
Result:
[{id: 123, title: "Root", board: 2}, {id: 456, title: "Child", board: 2}]
Result:
[{id: 123, title: "Root", board: 2}, {id: 456, title: "Child", board: 1}]

In this case, the issue is an excess of the data. If a child has 123 parent, it automatically means its board matches the parent’s board. A minimal structure could have just a single-parent property that can be another item or board. Then, we have just a single change-parent command. Since we live in SQL world, a simpler would be to supply convert-to-parent command with the boardId:

Initial state:
[{id: 123, title: "Root", board: 1}, {id: 456, title: "Child", board: 1, parent: 123}]
Initial state:
[{id: 123, title: "Root", board: 1}, {id: 456, title: "Child", board: 1, parent: 123}]
Move 123 to board 2Convert 456 to parent: set board to 1 and unset parent
Local state before synchronization:
[{id: 123, title: "Root", board: 2}, {id: 456, title: "Child", board: 2, parent: 123}]
Local state before synchronization:
[{id: 123, title: "Root", board: 1}, {id: 456, title: "Child", board: 1}]
Receiving events from other clients:
Convert 456 to parent: set board to 1 and unset parent
Receiving events from other clients:
Move 123 to board 2
Result:
[{id: 123, title: "Root", board: 2}, {id: 456, title: "Child", board: 1}]
Result:
[{id: 123, title: "Root", board: 2}, {id: 456, title: "Child", board: 1}]
Teamhood uses cookies, to personalize content, ads and analyze traffic. By continuing to browse or pressing "Accept" you agree to our Cookie Policy.