Layer 9: HotNets'13: Towards Minimal-Delay Deadline-Driven Data Center TCP

Friday, November 22, 2013

HotNets'13: Towards Minimal-Delay Deadline-Driven Data Center TCP

Li Chen, Shuihai Hu, Kai Chen (HKUST), Haitao Wu (Microsoft Research Asia), Danny H.K. Tsang (HKUST).

Paper: http://conferences.sigcomm.org/hotnets/2013/papers/hotnets-final92.pdf

Data center workloads have flows with diverse deadlines, as recognized by earlier work such as D3, D2TCP, PDQ, and pFabric. Some existing approaches are ad hoc: for instance, DCTCP maintains shallow queues, which only indirectly affects flow completion times. Others require intrusive hardware changes (e.g., D3 and pFabric). This paper presents MCP, an end-to-end congestion control algorithm that determines the "right" rates to meet flow deadlines while remaining readily deployable.

The key idea in the paper is to formulate the problem explicitly as a stochastic network optimization problem and to derive an end-to-end scheme from it using a standard technique called the "drift-plus-penalty method." See the paper for more details.
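
To give a feel for the drift-plus-penalty technique, here is a minimal Python sketch on a toy single-queue problem. It is a generic textbook-style illustration and not the paper's MCP algorithm: the rate set, the quadratic penalty, the arrival process, and the weight `v` are all assumptions made up for the example. In each time slot the controller picks the service rate that minimizes `v * penalty(rate) - backlog * rate`, which trades the penalty off against queue growth.

```python
import random


def drift_plus_penalty_step(backlog, rates, penalty, v):
    """Pick the rate that minimizes v * penalty(rate) - backlog * rate."""
    return min(rates, key=lambda r: v * penalty(r) - backlog * r)


def simulate(slots=5000, v=10.0, seed=0):
    """Run the toy controller; return (average penalty, maximum backlog)."""
    rng = random.Random(seed)
    rates = [0, 1, 2, 3]            # hypothetical per-slot service options
    penalty = lambda r: r * r       # hypothetical convex cost of serving at rate r
    backlog = 0.0
    total_penalty = 0.0
    max_backlog = 0.0
    for _ in range(slots):
        arrivals = rng.choice([0, 1, 2])   # assumed arrivals, mean 1 per slot
        rate = drift_plus_penalty_step(backlog, rates, penalty, v)
        total_penalty += penalty(rate)
        # Standard queue update: serve first, then add the new arrivals.
        backlog = max(backlog - rate, 0.0) + arrivals
        max_backlog = max(max_backlog, backlog)
    return total_penalty / slots, max_backlog


if __name__ == "__main__":
    avg_penalty, max_backlog = simulate()
    print(f"average penalty: {avg_penalty:.3f}, max backlog: {max_backlog:.0f}")
```

In this toy setting, a larger `v` pushes the average penalty closer to its optimum at the cost of a larger standing backlog, which is the usual trade-off of the method.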

What I think is interesting about the paper is that the authors formulated the problem explicitly and derived the end-to-end window update algorithm to achieve the optimal rates needed to meet as many flow deadlines as possible.

Q: You adapt both the rate over time and the starting rate. Have you quantified the benefit of each modification? How much of the benefit is just due to the right starting rate?
A: A flow has to start at the expected rate, or the dynamics would take a long time to converge. If flows just stick to the starting rate, the network will be unstable.
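
The notes do not define "expected rate." As a rough sketch, assume it means the remaining flow size divided by the time left until the deadline, the desired-rate notion used by deadline-aware schemes such as D3. The function names, the 1460-byte segment size, and the conversion to an initial window are assumptions for illustration only, not details taken from the paper.

```python
import math


def expected_rate_bps(remaining_bytes, seconds_to_deadline):
    """Rate (bits/s) needed to finish the remaining bytes exactly at the deadline."""
    if seconds_to_deadline <= 0:
        raise ValueError("deadline has already passed")
    return remaining_bytes * 8 / seconds_to_deadline


def initial_window_packets(rate_bps, rtt_seconds, mss_bytes=1460):
    """Window (in packets) that sustains rate_bps over one RTT, at least 1."""
    bytes_per_rtt = rate_bps * rtt_seconds / 8
    return max(1, math.ceil(bytes_per_rtt / mss_bytes))


if __name__ == "__main__":
    # Hypothetical flow: 1 MB left, 10 ms to the deadline, 100 microsecond RTT.
    rate = expected_rate_bps(1_000_000, 0.010)
    print(f"expected rate: {rate / 1e6:.0f} Mbps")
    print(f"initial window: {initial_window_packets(rate, 100e-6)} packets")
```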

Q: Is the objective to maximize the number of deadlines met?
A: No, it is to minimize per-packet delays.

Q: Recent work looks at flow-level metrics (mean FCT, etc.). What impact do these metrics have on application performance (e.g., MapReduce)?
A: Most work in this area is on flow-level performance, so we used the same. For specific applications, I think there is room for improvement.

Q: In your graphs, what is "optimal"? Why is it different from the line labeled "throughput"?
A: Optimal is computed centrally using per-hop information. It is hard for a stochastic program to always operate at the optimal point -- we just proved convergence to optimality in the stochastic sense.
