Eventual Consistency
Once in a while, an idea emerges that runs contrary to the way we have grown accustomed to doing things. Eventual consistency is such an idea: it goes against the way we used to build datastores, with SQL and ACID transactions.
So what’s wrong with that?
Too many generals
Information always flows as messages using a medium – not transactions. An atomic transaction spanning multiple systems is an abstraction that doesn’t really exist in real life: it is impossible to guarantee the atomicity of a distributed transaction, as proved by the Two Generals’ Problem. The more generals – or partitions of a distributed system – the worse the problem. Not knowing the entire truth is thus a fact of life that each partition must deal with, so we need to ensure that the knowledge of all partitions converges as they send messages to each other. This is eventual consistency.
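To make the idea concrete, here is a minimal sketch (not taken from any particular system) of two partitions converging by exchanging their state as messages. It assumes conflicts are resolved with a simple last-write-wins rule on timestamps; real systems use more careful conflict resolution, and all names here are illustrative.

```python
# Minimal sketch: two replicas that converge by exchanging their state.
# Assumption: conflicts are resolved with last-write-wins on a timestamp.

class Replica:
    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Keep the value with the highest timestamp seen so far.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def merge(self, message):
        # "message" is another replica's state; apply it entry by entry.
        for key, (ts, value) in message.items():
            self.write(key, value, ts)


a, b = Replica(), Replica()
a.write("balance", 100, timestamp=1)  # update seen only by partition A
b.write("balance", 80, timestamp=2)   # concurrent update seen only by B

# Until messages flow, the partitions disagree (the inconsistent window).
assert a.store["balance"] != b.store["balance"]

a.merge(b.store)
b.merge(a.store)

# After exchanging messages, both partitions hold the same value.
assert a.store["balance"] == b.store["balance"] == (2, 80)
```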
Eventual Consistency in Real Life
Finance has always embraced eventual consistency. Money transfer is – contrary to popular belief – not done by transactions, but is an elaborate process where the money is first withdrawn from one account and, after some time, deposited into another. Meanwhile the system is inconsistent to an outside observer, because the money is “in movement”. When the money eventually arrives, the system is consistent again.
If I mail a cheque while I have enough money in my account, but the funds are insufficient by the time the receiver tries to cash it, a conflict arises because balance information travels slowly and is inconsistent. The system as a whole still works because we have processes for resolving such conflicts. In this case the cheque bounces.
It turns out that in every case of distributed updates to the same object predating computerisation, eventual consistency is the norm and conflict resolution processes exist. Medical records are another example: information about the same patient arrives from many sources, out of order, and is eventually reconciled.
Early computerisation changed this by building monolithic systems to model the single truth about something – and when the system was not available, users had to wait or fall back on the old, robust processes that worked without computers.
Physical limitations
Consider the physics of sending information. Disregarding exotic theories involving multiple timelines, information cannot move faster than light. If it could, you would get a paradox in which you could make a phone call into the past with a so-called tachyonic antitelephone.
In a globally connected world, the location and travel time of information impose very real limitations on the availability of up-to-date data. Connections drop, systems go down, and sometimes entire data centers go offline for extended periods of time, coming back up with a restored, older version of the data. When building highly distributed systems, these limitations cannot sanely be abstracted away inside an underlying distributed database. The ugly head of reality will either cause distributed transactions to crumble write availability, or put cracks in the flawed illusion that all users always see up-to-date data.
Dealing with eventual consistency
It is the gut instinct of every programmer to avoid complexity through abstraction. Since the old abstractions of transactions and a perfectly up-to-date database don’t scale to distributed systems with high write availability, we need eventual consistency as a theoretical basis, along with new abstractions and techniques for handling it. Luckily this has been a subject of academic research for decades, and recently the NoSQL movement has pioneered numerous small- and large-scale eventually consistent systems, yielding many real-life experiences. These will be the topic of my next blog entry.
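As a small taste of what such techniques can look like (the post itself does not name one, so this is only an example), here is a minimal sketch of a grow-only counter, a simple state-based CRDT. It assumes each replica only ever increments its own slot; because merging takes the element-wise maximum, replicas converge no matter how their messages are ordered or duplicated.

```python
# Minimal sketch of a grow-only counter (a simple state-based CRDT).
# Assumption: every replica has a unique id and only increments its own slot.

class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> count

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other):
        # Element-wise maximum is commutative, associative and idempotent,
        # so the order in which merge messages arrive does not matter.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

    def value(self):
        return sum(self.counts.values())


x, y = GCounter("x"), GCounter("y")
x.increment(3)   # three increments applied on replica x
y.increment(2)   # two increments applied concurrently on replica y

x.merge(y)
y.merge(x)
assert x.value() == y.value() == 5  # both replicas converge on the same total
```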
Rune Skou Larsen NoSQL evangelist @Trifork