Presenter: Keqiang He (University of Wisconsin-Madison)
Coauthors: Eric Rozner (IBM Research)
Kanak Agarwal (IBM)
Yu (Jason) Gu (IBM)
Wes Felter (IBM Research)
John Carter (IBM)
Aditya Akella (University of Wisconsin-Madison)
Congestion is common in datacenter networks, and the 99.9th-percentile latency can be orders of magnitude higher than the median, mainly because of queueing. Unfortunately, administrators cannot control VM TCP stacks in multi-tenant datacenters, so outdated or misconfigured TCP stacks may be running inside VMs. This leads to two problems: large queueing latency and TCP unfairness (e.g., ECN vs. non-ECN flows). Further, VMs running different congestion control algorithms can be unfair to one another.
AC/DC TCP (Administrator Control over Data Center TCP) implements TCP congestion control in the vSwitch and ensures that VM TCP stacks cannot adversely impact the network. AC/DC TCP runs in the data plane of the vSwitch. In particular, it requires no changes to VMs or hardware, provides low latency by using state-of-the-art congestion control algorithms, improves TCP fairness, and enforces per-flow differentiation via congestion control.
The design of AC/DC TCP is as follows:
- Obtaining congestion control state: per-flow connection tracking in the vSwitch
- DCTCP congestion control in the vSwitch: ECN marking is enabled on all packets, and ECN feedback is piggybacked on existing TCP ACKs
- Enforcing congestion control: the receive window (RWND) in ACKs is reused to bound the sender's window, and excess packets from non-conforming flows are dropped
- Per-flow differentiation via congestion control
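The enforcement idea above can be sketched as follows. This is a minimal, illustrative model (not the OVS implementation): the vSwitch keeps per-flow state, runs a DCTCP-style window computation from ECN feedback, and clamps the RWND field in ACKs to the computed window. Class and method names, the initial window, and the simplified once-per-ACK alpha update are assumptions for illustration.

```python
class VSwitchFlowState:
    """Per-flow congestion control state kept in the vSwitch (illustrative)."""

    def __init__(self, mss=1460, g=1.0 / 16):
        self.mss = mss
        self.g = g            # DCTCP EWMA gain
        self.alpha = 0.0      # estimated fraction of ECN-marked bytes
        self.cwnd = 10 * mss  # vSwitch-computed congestion window (bytes)

    def on_ack(self, acked_bytes, ecn_marked_bytes):
        # DCTCP-style update: estimate the fraction of marked bytes,
        # then cut the window in proportion to alpha on marks
        # (per-window bookkeeping is omitted for brevity).
        frac = ecn_marked_bytes / max(acked_bytes, 1)
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        if ecn_marked_bytes > 0:
            self.cwnd = max(int(self.cwnd * (1 - self.alpha / 2)), self.mss)
        else:
            # Additive increase when no congestion is signaled.
            self.cwnd += self.mss * acked_bytes // self.cwnd

    def enforce_rwnd(self, advertised_rwnd):
        # Enforcement: rewrite the ACK's receive window to
        # min(advertised RWND, computed cwnd), so even an unmodified
        # VM TCP stack cannot send faster than the vSwitch allows.
        return min(advertised_rwnd, self.cwnd)
```

Because the RWND field already exists in every ACK, this enforcement needs no new protocol machinery; senders that ignore the reduced window are handled by dropping their excess packets.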
AC/DC TCP is implemented in OVS (Open vSwitch), and uses hardware offloads for performance, such as TCP segmentation offload and checksum offloading.
For evaluation, 17 servers and 6 switches were used, and AC/DC TCP was compared against CUBIC and DCTCP. The results demonstrate the following:
- Running DCTCP on top of AC/DC closely tracks the window size of native DCTCP.
- AC/DC has convergence properties comparable to DCTCP's and better than CUBIC's
- AC/DC improves fairness when VMs use different congestion control algorithms
- Less than 1% CPU overhead. Each connection uses 320 bytes to maintain congestion control state.
- In an incast scenario, AC/DC tracks the performance of DCTCP closely.
- Flow completion time: AC/DC performs as well as DCTCP and 36%–76% better than CUBIC.
Q: What led both papers to come up with the same or similar solution?
A: These are independent works, and this is the most natural solution.
Q: Some applications want to have control over congestion control. Is there a mechanism that allows the VMs to take over?
A: Within datacenters, the throughput of CUBIC and DCTCP is quite similar, so our solution will not impact application performance; it is close to optimal for throughput.
Q: Can someone game the system by observing the changes you are making, and gain more share from the network?
A: Non-conforming packets will be dropped, so such a sender would not gain much.
Q: In the incast experiment, why was RTT measured instead of throughput?
A: Flow sizes are small in many incast scenarios. We haven't considered throughput.