Most software engineers are familiar with the CAP theorem, but I feel many of us have not heard of the PACELC theorem, which in my view is an extension of the CAP theorem.
In the CAP theorem, we have to choose between consistency and availability when a network partition happens. Let’s say we have 2 nodes of a database and the user updates one record, and the update is applied on only one of the nodes, say node 1. Because of the network partition, node 1 cannot send the updated data to node 2. Now if we choose availability, the application keeps reading from both nodes, which hurts consistency because node 2 returns stale data. If we choose consistency, we need to stop serving from node 2, which hurts availability because the system is no longer 100 percent available.
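The two-node scenario above can be sketched as a toy simulation. This is only an illustration, not how any real database works: the `Node` class, the `read` function, and the `prefer` policy flag are hypothetical names made up for this example.

```python
class Node:
    """A toy replica that stores records in a plain dict."""

    def __init__(self, name):
        self.name = name
        self.data = {}


def read(node, key, partitioned, prefer):
    """Serve a read from `node` under a hypothetical policy.

    prefer="availability": always answer, even if the value may be stale.
    prefer="consistency": refuse to answer while the node is cut off.
    """
    if partitioned and prefer == "consistency":
        raise RuntimeError(f"{node.name} is unavailable during the partition")
    return node.data.get(key)


node1, node2 = Node("node1"), Node("node2")
node1.data["balance"] = 100
node2.data["balance"] = 100  # both replicas start in sync

# A partition occurs, so the update reaches node1 only.
node1.data["balance"] = 150

# Choosing availability: node2 still answers, but with old data.
stale = read(node2, "balance", partitioned=True, prefer="availability")
print(stale)  # 100

# Choosing consistency: node2 refuses to serve the read.
try:
    read(node2, "balance", partitioned=True, prefer="consistency")
    refused = False
except RuntimeError:
    refused = True
print(refused)  # True
```

With availability, the caller gets an answer that may be wrong. With consistency, the caller gets no answer from node 2 until the partition heals.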
Now that we know the CAP theorem, what choice does a distributed system have when there is no network partition? Is it consistent, available, or both? The CAP theorem does not address this. This is where the PACELC theorem comes into the picture.
Fun fact: Think about how to pronounce PACELC. It’s “pass-elk” (according to the original paper).
Now that you know how to pronounce it, let’s see what the theorem says.
The PACELC theorem states that in a system:
- If there is a partition (‘P’), we have to choose between availability and consistency (i.e., ‘A’ and ‘C’);
- Else (‘E’), when the system is running normally in the absence of partitions, we have to choose between latency (‘L’) and consistency (‘C’).
Let’s dig a bit deeper. Maybe we will find something interesting.
Okay, so the first half (i.e. PAC) is the same as CAP, just spelled backward. Easy, right?
Now that you know half of the theorem, it’s better to know the 2nd half as well. The 2nd half says that if there is no partition, we have to choose between latency and consistency.
Let’s focus on ELC (the 2nd half of the theorem). Let’s say we have 3 nodes of a database. Since there is no network partition, we now have 2 options.
Option 1 (Low latency): When a write request comes to node 1, it sends the response immediately after writing the data to node 1, and eventually node 1 passes the data on to node 2 and node 3. In this scenario, we chose low latency, so we have to settle for eventual consistency.
Option 2 (Strong consistency): When a write request comes to node 1, it sends the response only after the data is written to all the nodes. In this scenario, we chose strong consistency, which leads to higher latency.
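To make the two options concrete, here is a minimal sketch of a 3-node cluster. Everything in it is an assumption for illustration: the `Cluster` class, its method names, and the latency constants are hypothetical, latency is counted in made-up milliseconds rather than measured, and replicas are assumed to be written one after another.

```python
REPLICATION_DELAY_MS = 20  # hypothetical cost of copying a write to one replica
LOCAL_WRITE_MS = 1         # hypothetical cost of writing to the local node


class Cluster:
    def __init__(self, size=3):
        self.nodes = [{} for _ in range(size)]
        self.pending = []  # writes not yet copied to the replicas

    def write_async(self, key, value):
        """Option 1: acknowledge after the local write, replicate later."""
        self.nodes[0][key] = value
        self.pending.append((key, value))
        return LOCAL_WRITE_MS

    def write_sync(self, key, value):
        """Option 2: acknowledge only after every node has the write."""
        for node in self.nodes:
            node[key] = value
        return LOCAL_WRITE_MS + REPLICATION_DELAY_MS * (len(self.nodes) - 1)

    def replicate_pending(self):
        """Background step that makes Option 1 eventually consistent."""
        for key, value in self.pending:
            for node in self.nodes[1:]:
                node[key] = value
        self.pending.clear()

    def read(self, node_index, key):
        return self.nodes[node_index].get(key)


cluster = Cluster()

fast = cluster.write_async("x", 1)   # returns right away
before = cluster.read(2, "x")        # node 3 has not seen the write yet
cluster.replicate_pending()
after = cluster.read(2, "x")         # now node 3 has caught up

slow = cluster.write_sync("x", 2)    # waits for all nodes
print(fast, before, after, slow, cluster.read(2, "x"))  # 1 None 1 41 2
```

The asynchronous write is acknowledged in 1 ms but leaves a window where node 3 returns nothing for the key. The synchronous write takes 41 ms in this toy model, and every node can serve the new value as soon as it returns.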
Conclusion
The tradeoffs involved in building distributed database systems are very complex, and neither CAP nor PACELC can explain them all. Nonetheless, incorporating the consistency/latency tradeoff into modern DDBS design considerations is important enough to warrant bringing the tradeoff closer to the forefront of architectural discussions.