That is half 2 of the Crash Course in Caching sequence.
A distributed cache shops incessantly accessed information in reminiscence throughout a number of nodes. The cached information is partitioned throughout many nodes, with every node solely storing a portion of the cached information. The nodes retailer information as key-value pairs, the place every key’s deterministically assigned to a particular partition or shard. When a consumer requests information, the cache system retrieves the info from the suitable node, lowering the load on the backing storage.
There are completely different sharding methods, together with modulus, range-based and constant hashing.
Modulus sharding includes assigning a key to a shard based mostly on the hash worth of the important thing modulo the whole variety of shards. Though this technique is easy, it can lead to many cache misses when the variety of shards is elevated or decreased. It’s because many of the keys will likely be redistributed to completely different shards when the pool is resized.
Vary-based sharding assigns keys to particular shards based mostly on predefined key ranges. With this strategy, the system can divide the important thing house into particular ranges after which map every vary to a selected shard. Vary-based sharding might be helpful for sure enterprise situations the place information is of course grouped or partitioned in particular ranges, reminiscent of geolocation-based information or information associated to particular buyer segments.
Nevertheless, this strategy may also be difficult to scale as a result of the variety of shards is predefined and can’t be simply modified. Altering the variety of shards requires redefining the important thing ranges and remapping the info.
Constant hashing is…