Thursday, August 20, 2015

TIMELY: RTT-based Congestion Control for the Datacenter


This paper was presented in the "Congestion Control and Transport Protocols" session at SIGCOMM 2015, London. The authors of the paper are: Radhika Mittal (University of California, Berkeley), Vinh The Lam (Google, Inc.), Nandita Dukkipati (Google, Inc.), Emily Blem (Google, Inc.), Hassan Wassel (Google, Inc.), Monia Ghobadi (Microsoft), Amin Vahdat (Google, Inc.), Yaogong Wang (Google, Inc.), David Wetherall (Google, Inc.), David Zats (Google, Inc.).

Radhika Mittal presented the paper. The links to the paper and the public review are here: Paper, Public review.

The authors claim that theirs is the first RTT-based congestion control scheme for data center networks (DCNs). Data centers typically require a high-throughput, low-latency network and are less tolerant of packet loss. Traditional loss-based transport protocols are not suitable for the data center environment; state-of-the-art data center transport protocols, such as DCTCP and other ECN-based schemes, rely on support from the network switches (through packet markings) to indicate the onset of congestion.

RTT, although a direct indicator of latency, is usually dismissed as a congestion signal because of noise in measurements; the noise becomes prominent at microsecond latency levels. The authors argue that this noise can be avoided by computing RTT from NIC timestamps; they show experimentally that there is a strong correlation between NIC-measured RTT and queue length in the switches. The reader is referred to the paper for the details of how timestamps are obtained from the NIC.

Using the computed RTT, the authors propose TIMELY, an RTT-gradient-based AIMD rate control algorithm that increases the sending rate by a constant if the gradient, d(RTT)/dt, is non-positive and decreases the rate multiplicatively if d(RTT)/dt > 0. They adopt a rate-based approach, as opposed to window-based control, because it is well suited to the widespread use of NIC support. To address the jittery nature of RTT-gradient measurements caused by traffic bursts, safeguards are provided so that multiplicative decrease does not kick in when the absolute RTT is very low; a similar safeguard prevents rate increase at high RTT values.
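The rule described above can be sketched in a few lines of Python. This is only an illustration of the summary in this post, not the algorithm as specified in the paper: the class name, the parameter names (additive_step, decrease_factor, rtt_low, rtt_high) and all of their default values are made up for the example, the gradient is approximated by the difference between consecutive RTT samples, and the choice to always increase below the low threshold and always decrease above the high threshold is an assumption about how the safeguards behave.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RttGradientController:
    """Simplified RTT-gradient AIMD rate controller (illustrative only)."""

    rate: float                    # current sending rate, e.g. in Mbps
    additive_step: float = 10.0    # hypothetical additive increase, same unit as rate
    decrease_factor: float = 0.8   # hypothetical multiplicative decrease factor
    rtt_low: float = 50e-6         # hypothetical low-RTT guard threshold (seconds)
    rtt_high: float = 500e-6       # hypothetical high-RTT guard threshold (seconds)
    prev_rtt: Optional[float] = None

    def update(self, rtt: float) -> float:
        """Consume one RTT sample (seconds) and return the new sending rate."""
        if self.prev_rtt is None:
            # The first sample only initializes the gradient estimate.
            self.prev_rtt = rtt
            return self.rate

        # Sample-to-sample difference stands in for d(RTT)/dt.
        gradient = rtt - self.prev_rtt
        self.prev_rtt = rtt

        if rtt < self.rtt_low:
            # Safeguard: never decrease while the absolute RTT is very low.
            self.rate += self.additive_step
        elif rtt > self.rtt_high:
            # Safeguard: never increase while the absolute RTT is high.
            self.rate *= self.decrease_factor
        elif gradient <= 0:
            # RTT flat or falling: additive increase.
            self.rate += self.additive_step
        else:
            # RTT rising: multiplicative decrease.
            self.rate *= self.decrease_factor
        return self.rate


if __name__ == "__main__":
    ctrl = RttGradientController(rate=100.0)
    for sample in (100e-6, 120e-6, 110e-6, 40e-6, 600e-6):
        print(f"rtt={sample * 1e6:.0f}us -> rate={ctrl.update(sample):.1f}")
```

A real implementation would presumably smooth the gradient over several samples before acting on it; the sketch reacts to every sample, which is exactly the jitter sensitivity that the two thresholds are meant to contain.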

The evaluation compares TIMELY with a kernel-stack implementation of DCTCP and with priority flow control (PFC). Small-scale experiments show that, for roughly the same throughput, TIMELY provides an order of magnitude lower RTT. Large-scale experiments (hundreds of machines in a Clos cluster) also show that TIMELY is consistently able to meet low-latency requirements.

Q & A session

1) How do you account for one way congestion?
ACKs are prioritized to ensure that the RTT is not affected by reverse-path congestion.

2) [Jitu Padhye, MSR] Would a tighter correlation between ECN marking and queue length be observed if hardware pacing were used?
Yes.

3) How do you deal with route flapping and multipath routing or ECMP?
Left for future work.