Consistency: A Mis-explained Concept in Distributed Systems

What exactly is consistency in a distributed system? When can I determine that a system is consistent? 4 out of 5 nodes in my system are up, can I call it consistent?
Typical answers on the internet will tell you something like,
In a strongly consistent system, all nodes in the system agree on the order in which operations occurred
[strong consistency is] where all the replicas read the same value for a given data item at the same point in time
As far as phase and domain synchronous are concerned, [consistency] guarantees that different nodes of the distributed system make use of the simultaneous copy of the data
A consistent distributed system should behave as if it was a single node.
The problem with these definitions is that it is (1) misplaced and (2) incomplete. It is misplaced because you do not need to know how consistency is actually achieved internally, before knowing what it is. It is incomplete because it does not tell you how the system is supposed to behave.
On top of that, it confuses people because it brings in additional unexplained concepts of nodes, replicas, copies of data and agreement. These concepts are only relevant to understanding the difficulty of achieving consistency, not the criteria itself.
You may also find self referential definitions like the last one. It defers the problem to a single node system without describing how a single node should behave and the reader is again left to his own imagination.
In this article, I will not use these terms to explain what consistency means. This sets the stage for the next section.
What it is not
Consistency is not an internal property, but an external one. Yep!
You do not need to learn about replicas, nodes, multiple copies of data, and agreement as these are all internal descriptions of the system. Clients do not need to - crack open the system and find out the number of nodes or read the algorithm being used to sustain agreement - before declaring a system as consistent.
Instead, a handful of external queries treating the system as a black box will reveal the consistency level of the system with ease. In other words, consistency is a declarative property, not an imperative one. So let us avoid peeking into the system to understand the term.
Additionally, terms like serialisability, linearizability, eventual consistency and causal consistency, are helpful but not necessary to understand consistency.
What it is
Moving the needle from the internal to the external, you would realise that both the purpose of and the problem in distributed systems lies in simultaneous queries, in fact lots and lots of them bombarding the system at the same time.
So the question is, how is a system supposed to behave when there are simultaneous queries?
Let us concretely define how a correct system is supposed to behave.
A consistent system should give right answers or perform actions, in the order of the questions asked or actions requested, without interfering with other people's questions or actions.
This gives rise to the three conditions integrity, ordering, and isolation, i.e. right answers, ordered, and without interference. It turns out these three conditions are sufficient to ensure a system is purely consistent.
- Integrity, with regards to the response. The response needs to be correct, not just in terms of a single value but across correlated values in the entire dataset.
For example in a SQL database, not only should a row INSERT be executed correctly, but a COUNT query following that should also have an incremented value. This is because a row is not a self contained piece of information but tied together with counts, indexes, and many other correlated values which may be stored or calculated. This is where the concept of predicates and predicate-locks come into picture.
Note: It may sound trivial but you will be surprised to find that there is hardly any well distributed database that fulfills this condition to the fullest. In addition, there are limits to this concept. Integrity may also extend to application specific conditions or even across databases, which is generally out of scope of many database solutions. It is usually left for the application to handle such criterias. - Ordering, from the external perspective, should be maintained. If query B is run after query A, query B's result should take effect going forward. For example,
- WRITE 1, WRITE 2, READ should yield 2 and not 1
- WRITE 1, READ, WRITE 2 should yield 1 and not 2
- Isolation, from other simultaneous queries. This is completely different from ordering. The only criteria being that there should be some possible sequential execution of all those queries when run mutually exclusive of one another, which will reveal the same set of responses towards to client.
All of these correctness conditions are very difficult to achieve and have their own challenges.
For example, maintaining ordering is extremely tricky due to perfect time synchronisation across machines being theoretically and practically impossible.
This is why you will hardly find any distributed databases providing pure consistency, although claims will be plenty. Which condition do databases compromise on the most? In reality - all of them.
Compromises on consistency
There are various compromises databases make to trade for performance and simplicity, across all the three conditions we discussed earlier. I have listed resources for each of them for you to dive into if you are interested.
Bonus Article: Consistency Levels - Jepsen.io
Integrity Levels: Consistency is generally broken down into Single-Object and Multi-Object (Predicate Level) across this dimension. Almost all databases, including key-value stores, document stores and graph databases will provide single-object consistency, but multi-object is rarer.
Even the term Linearizable primarily only refers to single-object ordering, and not across multiple objects. Strict Serializability is the term used to refer to the broader form of pure consistency which encompasses the predicate level multi-object scenario.
Bonus Article: Predicate Locking Features in InnoDB
Ordering Levels: Consistency levels across this dimension can be broken down into Monotonic Writes, Monotonic Reads, PRAM, Causal, Sequential and Linearisable. If you are interested you can read about them in the following article.

Bonus Article: Time and Ordering
Isolation Levels: Consistency is broken down into these levels - Dirty Reads, Non-Repeatable Reads, Repeatable Reads, Phantom Reads, Serialisable. You may want to read the following article to dive deeper into each.

Bonus Article: Deep Dive into Isolation Levels
Most databases by default will make a lot of these compromises for performance.

Conclusion
Now that you have an understanding of how to determine consistency, you should go ahead and explore the challenges with each of these criteria.
Have a Good Day!