Friday, November 22, 2013

HotNets'13: Towards Minimal-Delay Deadline-Driven Data Center TCP

Li Chen, Shuihai Hu, Kai Chen (HKUST), Haitao Wu (Microsoft Research Asia), Danny H.K. Tsang (HKUST).


Data center workloads have flows with diverse deadlines, as recognized by earlier work such as D3, D2TCP, PDQ, and pFabric.  Some of these approaches are ad hoc: for instance, DCTCP maintains shallow queues, which indirectly affects flow completion times.  Others require intrusive hardware changes (e.g., D3 and pFabric).  This paper presents MCP, an end-to-end congestion control algorithm that determines the "right" rates to meet flow deadlines while remaining readily deployable.

The key idea in the paper is to formulate the problem explicitly as a stochastic network optimization problem and derive an end-to-end scheme using a standard technique called the "drift-plus-penalty method."  See the paper for more details.
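
For readers unfamiliar with drift-plus-penalty, here is a minimal sketch of the generic textbook technique on a toy single-flow problem.  It is not MCP's formulation or window update rule.  The quadratic cost, the function name, and all parameters are assumptions made purely for illustration.  A virtual queue z tracks how far the sender has fallen behind a required average rate, and in each slot the sender picks the rate x that minimizes V * cost(x) - z * x.

```python
def drift_plus_penalty(required_rate, max_rate, V, steps):
    """Toy drift-plus-penalty loop (illustrative only, not MCP).

    Goal: minimize the time-average of a hypothetical cost x**2 subject to
    the time-average rate being at least required_rate.
    """
    z = 0.0          # virtual queue: accumulated rate deficit
    rates = []
    for _ in range(steps):
        # Per-slot decision: minimize V * x**2 - z * x over [0, max_rate].
        # The unconstrained minimizer is z / (2V); clip it to the feasible range.
        x = min(max(z / (2.0 * V), 0.0), max_rate)
        rates.append(x)
        # Virtual queue update: grows when we send below the required rate.
        z = max(z + required_rate - x, 0.0)
    return rates, z


if __name__ == "__main__":
    rates, z = drift_plus_penalty(required_rate=2.0, max_rate=10.0, V=5.0, steps=500)
    print("final rate:", rates[-1], "final virtual queue:", z)
```

In this toy setting the rate starts at zero and climbs toward the required rate as the virtual queue builds up.  The parameter V sets the usual trade-off in this method: a larger V weights the penalty more heavily but lets the virtual queue grow larger before the constraint is enforced.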

What I think is interesting about the paper is that the authors formulated the problem explicitly and derived the end-to-end window update algorithm to achieve the optimal rates needed to meet as many flow deadlines as possible.

Q: You are adapting both the rate over time and the starting rate.  Have you quantified the benefit of each modification?  How much of the benefit is just due to the right starting rate?
A: It has to start at the expected rate, or the dynamics would take a long time to converge.  If flows just stick to the starting rate, the network will be unstable.

Q: Is the objective to maximize the number of deadlines met?
A: No, it is to minimize per-packet delays.

Q: Recent works look at flow-level metrics (mean FCT, etc.).  What impact do these metrics have on application performance (e.g. MapReduce)?
A: Most work in this area is on flow-level performance, so we used the same.  For specific applications, I think there is room for improvement.

Q: In your graphs, what is "optimal"?  Why is it different from the line labeled "throughput"?
A: Optimal is computed centrally using per-hop information.  It is hard for a stochastic program to always operate at the optimal point -- we just proved convergence to optimality in the stochastic sense.