An efficient way to cache in distributed systems.
This is the fourth article of my Distributed Systems series, the previous one can be found here: CAP Theorem.
A way to cache in distributed systems.
The different databases responsible for remembering the values are sort of in a graph, in a circle, tied to each other, they need to know their position, and the previous and next position.
Whenever we come and ask for a value, we get a hash that connects us to one of the databases, so when we come again and ask for the value, the databases see our hash and know which one is responsible for our value.
What if one of the databases isn't available for some reason, this could lead to an issue. We can make sure that the previous and next database in the graph, also stores the hash and its value, this way if one of the databases isn't available, we can still give the user the value that they are mapped to via their hash.
Another issue is, what if two databases are down, and then the user decides to update their value, the database that is available will know of the update, but the databases that were down won't. This leads to the problem of the other databases becoming stale.
This would be called eventual consistency. We will be consistent with the values, but it could happen that we return the stale value.
Solution to eventual consistency
The solution for eventual consistency here, though the drawback here is that the latency increases due to more work, is for the databases to communicate with each other, and come to an agreement which write by the user (value) was the latest one, and use that value.
We could then ensure a strong consistency, though, we wouldn't want to do this in every place, only if it is a MUST that certain values of the system must be up-to-date every single second the user were to ask for that value, because the latency increases, leading to a worse user experience, but again, it depends on your use-case.