Link to paper Characterizing Load Imbalance in Real-World Networked Caches
Modern web services nowadays heavily rely on the in-memory caches to reduce the request latency for users and reduce the load on the back-end storage servers. To avoid overloading a single cache server, data is partitioned into multiple segments (“shard”) and stored across different cache servers. Related objects tend to be mapped into the same shard; and ideally, each cache server should see similar request rates. However, in the real-world workload, we often observe the load imbalance among the cache servers. Potential causes of load imbalance include: hashing schemes, skewed access popularity, etc. Using real workload of Facebook’s TAO cache system, this paper tries to answer the following questions:
- What is the state of load imbalance?
- What contributes to load imbalance?
- How effective are current techniques?
- How to improve?
The authors find that significant load imbalance observed in TAO. Possible causes:
- Content popularity (ranking top 100 shards, and top 20 objects per server)
- Hot objects: (Interestingly) very hot hot object alone are not a major cause of load imbalance of cache server.
- Hot shards: Popular shards have much more traffics than the hot object
Next, the presenter evaluates the current techniques used to mitigate the load imbalance (consistent-hashing, hot-content replication) and find room for improvement. 3 approaches for consistent hashing: consistent hashing for in-memory networked caches implemented in libketama, TAO’s hashing, and baseline “perfect hashing”. 3 approaches for replication: None (no replication), TAO’s replication, optimal replication. The combination of these (hashing,replication) are evaluated using Facebook’s TAO workload. Metric used: max/avg, where max,avg are volume of requests of most loaded/average loaded servers). Some key results:
- standard hashing (implemented in libketama): 53% higher load on the most loaded server compared to the average. Only achieve 41% higher load when implemented with optimal replication.
- TAO’s hashing in general performs better than standard hashing, but still worse than when optimal hashing or replication is implemented in TAO
- The streaming algorithm outperforms other replication approaches, but still has room for further improvement.
- Characterize how hashing and replication affect load imbalance?
- Can streaming algorithm replicate content before its popularity surges?
- Can we predict the popularity spikes and prevent hotspots.