Simon Kassing (ETH Zürich), Asaf Valadarsky, Gal Shahaf, and Michael Schapira (Hebrew University of Jerusalem), and Ankit Singla (ETH Zürich)
Datacenter traffic has been observed to be skewed: there are hotspots. All-to-all non-blocking connectivity fabrics are expensive. In Simon Kassing's example, a k=4 fat tree with a core switch removed is at 75% capacity, but only 50% of demand can be sent at full line rate. Past work has proposed dynamic interconnects, in which the datacenter topology is adapted on the fly to meet demand. However, some dynamic networks pose many engineering challenges. Kassing raises foundational questions for flexible networks:
- Rigorous benchmarks? (Fat trees are inflexible)
- What is the utility of dynamic links?
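The 75% figure in the k=4 example is consistent with standard fat-tree arithmetic. The sketch below assumes the usual k-ary fat-tree construction (k pods, (k/2)^2 core switches, k^3/4 servers) and assumes that "capacity" means the fraction of core switches still in place. The function names are illustrative and do not come from the talk.

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a standard k-ary fat tree (k must be even)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 per pod
        "aggregation_switches": k * half,  # k/2 per pod
        "core_switches": half * half,      # (k/2)^2
        "servers": k * half * half,        # k^3 / 4
    }


def remaining_core_fraction(k: int, removed: int) -> float:
    """Fraction of core switches left after removing `removed` of them.

    Assumption: core capacity scales linearly with the number of core switches.
    """
    core = fat_tree_sizes(k)["core_switches"]
    if not 0 <= removed <= core:
        raise ValueError("removed must be between 0 and the core switch count")
    return (core - removed) / core


if __name__ == "__main__":
    print(fat_tree_sizes(4))
    print(remaining_core_fraction(4, 1))  # 3 of 4 core switches remain: 0.75
```

This only accounts for the remaining capacity. It does not model why only 50% of demand can be sent at full line rate, which depends on how traffic maps onto the remaining links.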
Kassing proposes that expander-based datacenters are static, yet flexible. In particular, he points to Xpander (shown in the figure above). Kassing introduces the throughput-proportionality metric.
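These notes do not reproduce the definition of the metric. As a hedged sketch, assume it is defined as follows: a network that delivers throughput alpha per server when all servers are active is throughput-proportional if, when only a fraction x of the servers are active, each active server gets min(alpha / x, 1). Both this definition and the function name below are assumptions, not taken from the talk.

```python
def tp_ideal_throughput(alpha: float, x: float) -> float:
    """Per-server throughput of an ideal throughput-proportional network.

    alpha: per-server throughput (fraction of line rate) with all servers active.
    x:     fraction of servers that are active.
    Assumed definition: min(alpha / x, 1), i.e. capacity is spread evenly
    over the active servers, capped at line rate.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not 0 < x <= 1:
        raise ValueError("x must be in (0, 1]")
    return min(alpha / x, 1.0)


if __name__ == "__main__":
    for x in (1.0, 0.75, 0.5, 0.25):
        print(x, tp_ideal_throughput(0.75, x))
```

Under this assumed definition, a network built at 75% capacity would serve up to 75% of its servers at full line rate. That ideal is what the fat-tree example falls short of.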
The key experimental takeaway is that Xpander achieves performance comparable to non-blocking fabrics at a lower cost (two-thirds or less). Kassing argues that Xpander has introduced a new benchmark that no proposal to date meets.
This work critically examines current trends in datacenter network topologies and makes a case for static datacenter networks, giving rise to a new topology benchmark.
Q (comment): Great open-sourcing / reproducibility. (note: code can be found here)
Q: How could a reconfigurable network be used? (i.e. why not expander graphs?)
A: Dynamic networks work better than expander graphs at the higher end of server traffic demand, or when the comparison is against expanders using only ECMP.
Q: Where is the cost of every topology?
A: In the paper.