1. In a distributed computer system you can only support two of the following guarantees:
1. Consistency - Every read receives the most recent write or an error
2. Availability - Every request receives a response without guarantee that it contains the most recent version of the information
3. Partition Tolerance - The system continues to operate despite arbitrary partitioning due to network failures
2. Networks aren't reliable so you'll need to support partition tolerance. You'll need to make a software tradeoff between consistency and availability.
3. CP - consistency and partition tolerance
Waiting for a response from the partitioned node might result in a timeout error. CP is a good choice if your business needs require atomic reads and writes.
4. AP - availability and partition tolerance
Responses return the most readily available version of the data available on any node which might not be the latest. Writes might take some time to propagate when the partition is resolved.
5. AP is a good choice if the business needs to allow for eventual consistency or when the system needs to continue working despite external errors.
6. Note: Close to two months ago I wrote a blog post explaining the CAP Theorem. Since publishing I've come to realize that my thinking on the subject was quite outdated and is no longer applicable to the real world. I've attempted to make up for that in this post.
7. In today's technical landscape we are witnessing a strong and increasing desire to scale systems out when additional resources (compute storage etc.) are needed to successfully complete workloads in a reasonable time frame. This is accomplished through adding additional commodity hardware to a system to handle the increased load. As a result of this scaling strategy an additional penalty of complexity is incurred in the system. This is where the CAP theorem comes into play.
8. The CAP Theorem states that in a distributed system (a collection of interconnected nodes that share data.) you can only have two out of the following three guarantees across a write/read pair: Consistency Availability and Partition Tolerance - one of them must be sacrificed. However as you will see below you don't have as many options here as you might think.
1. Consistency - A read is guaranteed to return the most recent write for a given client.
2. Availability - A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout).
3. Partition Tolerance - The system will continue to function when network partitions occur.
9. Before moving further we need to set one thing straight. Object Oriented Programming != Network Programming! There are assumptions that we take for granted when building applications that share memory which break down as soon as nodes are split across space and time.
10. One such fallacy of distributed computing is that networks are reliable. They aren't. Networks and parts of networks go down frequently and unexpectedly. Network failures happen to your system and you don't get to choose when they occur.
11. Given that networks aren't completely reliable you must tolerate partitions in a distributed system period. Fortunately though you get to choose what to do when a partition does occur. According to the CAP theorem this means we are left with two options: Consistency and Availability.
12. CP - Consistency/Partition Tolerance - Wait for a response from the partitioned node which could result in a timeout error. The system can also choose to return an error depending on the scenario you desire. Choose Consistency over Availability when your business requirements dictate atomic reads and writes.
13. AP - Availability/Partition Tolerance - Return the most recent version of the data you have which could be stale. This system state will also accept writes that can be processed later when the partition is resolved. Choose Availability over Consistency when your business requirements allow for some flexibility around when the data in the system synchronizes. Availability is also a compelling option when the system needs to continue to function in spite of external errors (shopping carts etc.)
14. The decision between Consistency and Availability is a software trade off. You can choose what to do in the face of a network partition - the control is in your hands. Network outages both temporary and permanent are a fact of life and occur whether you want them to or not - this exists outside of your software.
15. Building distributed systems provide many advantages but also adds complexity. Understanding the trade-offs available to you in the face of network errors and choosing the right path is vital to the success of your application. Failing to get this right from the beginning could doom your application to failure before your first deployment.
16. The CAP FAQ
Version 1.0 May 9th 2013
By: Henry Robinson / [email protected] / @henryr
http://the-paper-trail.org/
17. 0. What is this document?
No subject appears to be more controversial to distributed systems engineers than the oft-quoted oft-misunderstood CAP theorem. The purpose of this FAQ is to explain what is known about CAP so as to help those new to the theorem get up to speed quickly and to settle some common misconceptions or points of disagreement.
18. Of course there's every possibility I've made superficial or completely thorough mistakes here. Corrections and comments are welcome: let me have them.
19. There are some questions I still intend to answer. For example
1. What's the relationship between CAP and performance?
2. What does CAP mean to me as an engineer?
3. What's the relationship between CAP and ACID?
Please suggest more.
20. 1. Where did the CAP Theorem come from?
Dr. Eric Brewer gave a keynote speech at the Principles of Distributed Computing conference in 2000 called 'Towards Robust Distributed Systems' [1]. In it he posed his 'CAP Theorem' - at the time unproven - which illustrated the tensions between being correct and being always available in distributed systems.
21. Two years later Seth Gilbert and Professor Nancy Lynch - researchers in distributed systems at MIT - formalised and proved the conjecture in their paper “Brewer's conjecture and the feasibility of consistent available partition-tolerant web services” [2].
22. [1] http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf [2] https://users.ece.cmu.edu/~adrian/731-sp04/readings/GL-cap.pdf
23. 2. What does the CAP Theorem actually say?
The CAP Theorem (henceforth 'CAP') says that it is impossible to build an implementation of read-write storage in an asynchronous network that satisfies all of the following three properties:
1. Availability - will a request made to the data store always eventually complete?
2. Consistency - will all executions of reads and writes seen by all nodes be atomic or linearizably consistent?
3. Partition tolerance - the network is allowed to drop any messages.
24. The next few items define any unfamiliar terms.
25. More informally the CAP theorem tells us that we can't build a database that both responds to every request and returns the results that you would expect every time. It's an impossibility result - it tells us that something we might want to do is actually provably out of reach. It's important now because it is directly applicable to the many many distributed systems which have been and are being built in the last few years but it is not a death knell: it does not mean that we cannot build useful systems while working within these constraints.
26. The devil is in the details however. Before you start crying 'yes but what about...' make sure you understand the following about exactly what the CAP theorem does and does not allow.
27. 3. What is 'read-write storage'?
CAP specifically concerns itself with a theoretical construct called a register. A register is a data structure with two operations:
1. set(X) sets the value of the register to X
2. get() returns the last value set in the register
28. A key-value store can be modelled as a collection of registers. Even though registers appear very simple they capture the essence of what many distributed systems want to do - write data and read it back.
29. 4. What does atomic (or linearizable) mean?
Atomic or linearizable consistency is a guarantee about what values it's ok to return when a client performs get() operations. The idea is that the register appears to all clients as though it ran on just one machine and responded to operations in the order they arrive.
30. Consider an execution consisting the total set of operations performed by all clients potentially concurrently. The results of those operations must under atomic consistency be equivalent to a single serial (in order one after the other) execution of all operations.
31. This guarantee is very strong. It rules out amongst other guarantees eventual consistency which allows a delay before a write becomes visible. So under EC you might have:
1. set(10) set(5) get() = 10
But this execution is invalid under atomic consistency.
32. Atomic consistency also ensures that external communication about the value of a register is respected. That is if I read X and tell you about it you can go to the register and read X for yourself. It's possible under slightly weaker guarantees (serializability for example) for that not to be true. In the following we write A: to mean that client A executes the following operation.
33. B:set(5) A:set(10) A:get() = 10 B:get() = 10
This is an atomic history. But the following is not:
34. B:set(5) A:set(10) A:get() = 10 B:get() = 5
even though it is equivalent to the following serial history:
35. B:set(5) B:get() = 5 A:set(10) A:get() = 10
In the second example if A tells B about the value of the register (10) after it does its get() B will falsely believe that some third-party has written 5 between A:get() and B:get(). If external communication isn't allowed B cannot know about A:set and so sees a consistent view of the register state; it's as if B:get really did happen before A:set.
36. Wikipedia [1] has more information. Maurice Herlihy's original paper from 1990 is available at [2].
37. [1] http://en.wikipedia.org/wiki/Linearizability [2] http://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf
38. 5. What does asynchronous mean?
An asynchronous network is one in which there is no bound on how long messages may take to be delivered by the network or processed by a machine. The important consequence of this property is that there's no way to distinguish between a machine that has failed and one whose messages are getting delayed.
39. 6. What does available mean?
A data store is available if and only if all get and set requests eventually return a response that's part of their specification. This does not permit error responses since a system could be trivially available by always returning an error.
40. There is no requirement for a fixed time bound on the response so the system can take as long as it likes to process a request. But the system must eventually respond.
41. Notice how this is both a strong and a weak requirement. It's strong because 100% of the requests must return a response (there's no 'degree of availability' here) but weak because the response can take an unbounded (but finite) amount of time.
42. 7. What is a partition?
A partition is when the network fails to deliver some messages to one or more nodes by losing them (not by delaying them - eventual delivery is not a partition).
43. The term is sometimes used to refer to a period during which no messages are delivered between two sets of nodes. This is a more restrictive failure model. We'll call these kinds of partitions total partitions.
44. The proof of CAP relied on a total partition. In practice these are arguably the most likely since all messages may flow through one component; if that fails then message loss is usually total between two nodes.
45. 8. Why is CAP true?
The basic idea is that if a client writes to one side of a partition any reads that go to the other side of that partition can't possibly know about the most recent write. Now you're faced with a choice: do you respond to the reads with potentially stale information or do you wait (potentially forever) to hear from the other side of the partition and compromise availability?
46. This is a proof by construction - we demonstrate a single situation where a system cannot be consistent and available. One reason that CAP gets some press is that this constructed scenario is not completely unrealistic. It is not uncommon for a total partition to occur if networking equipment should fail.
47. 9. When does a system have to give up C or A?
CAP only guarantees that there is some circumstance in which a system must give up either C or A. Let's call that circumstance a critical condition. The theorem doesn't say anything about how likely that critical condition is. Both C and A are strong guarantees: they hold only if 100% of operations meet their requirements. A single inconsistent read or unavailable write invalidates either C or A. But until that critical condition is met a system can be happily consistent and available and not contravene any known laws.
48. Since most distributed systems are long running and may see millions of requests in their lifetime CAP tells us to be cautious: there's a good chance that you'll realistically hit one of these critical conditions and it's prudent to understand how your system will fail to meet either C or A.
49. 10. Why do some people get annoyed when I characterise my system as CA?
Brewer's keynote the Gilbert paper and many other treatments places C A and P on an equal footing as desirable properties of an implementation and effectively say 'choose two!'. However this is often considered to be a misleading presentation since you cannot build - or choose! - 'partition tolerance': your system either might experience partitions or it won't.
50. CAP is better understood as describing the tradeoffs you have to make when you are building a system that may suffer partitions. In practice this is every distributed system: there is no 100% reliable network. So (at least in the distributed context) there is no realistic CA system. You will potentially suffer partitions therefore you must at some point compromise C or A.
51. Therefore it's arguably more instructive to rewrite the theorem as the following:
1. Possibility of