Wednesday, December 14, 2016

CoNEXT 2016 Session 5: Datacenter 1

Enabling ECN over Generic Packet Scheduling

Wei Bai (Hong Kong University of Science and Technology), Kai Chen (Hong Kong University of Science and Technology), Li Chen (Hong Kong University of Science and Technology), Changhoon Kim (Barefoot Networks), and Haitao Wu (Microsoft)

In a data center, different applications have different requirements (e.g., low latency vs. high bandwidth). To serve both, most data centers combine packet scheduling with ECN, and switching chips are beginning to support a wider variety of scheduling schemes.

Goal: Enable ECN for arbitrary packet schedulers.

With multiple queues, each queue needs its own ECN marking threshold (the current practice of static thresholds can lead to high latency). This means we need to estimate queue drain rate for each class of traffic, which is hard.
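
To see why the drain rate matters, here is a minimal sketch. It assumes the common rule of thumb that a queue-length marking threshold scales with drain rate × RTT × lambda; the function name and the numbers are illustrative and are not taken from the talk.

```python
def queue_length_threshold_bytes(drain_rate_bps, rtt_s, lam=1.0):
    """Queue-length ECN threshold in bytes, assuming the rule of thumb
    threshold = lambda * drain_rate * RTT (an assumption, not from the talk)."""
    return lam * drain_rate_bps * rtt_s / 8.0


# Illustrative numbers: a 10 Gbps port and a 100 microsecond RTT.
LINK_RATE_BPS = 10e9
RTT_S = 100e-6

# A queue that drains at the full link rate.
full_rate_threshold = queue_length_threshold_bytes(LINK_RATE_BPS, RTT_S)

# The same queue when the scheduler only gives it half of the link.
half_rate_threshold = queue_length_threshold_bytes(LINK_RATE_BPS / 2, RTT_S)

print(full_rate_threshold, half_rate_threshold)
```

Under this assumption the right threshold halves when the queue's share of the link halves, so a static per-queue threshold sized for the full link rate lets the queue build up more than it should.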

Instead, this paper presents Time-based Congestion Notification (TCN). Rather than marking packets when the queue length is above a threshold, TCN marks packets that spend longer than a threshold time in the queue.
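
The time-based idea can be sketched as a queue that timestamps packets on enqueue and checks the time spent in the queue on dequeue. This is a simplified illustration, not the authors' implementation; the class names, fields, and the threshold value are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    flow_id: int
    enqueue_time: float = 0.0
    ecn_marked: bool = False


class SojournTimeMarkingQueue:
    """FIFO queue that ECN-marks packets by time in queue, not queue length."""

    def __init__(self, threshold_s):
        self.threshold_s = threshold_s  # illustrative time threshold, in seconds
        self._packets = deque()

    def enqueue(self, pkt, now):
        pkt.enqueue_time = now  # remember when the packet arrived
        self._packets.append(pkt)

    def dequeue(self, now):
        if not self._packets:
            return None
        pkt = self._packets.popleft()
        # Mark if the packet waited longer than the threshold.
        if now - pkt.enqueue_time > self.threshold_s:
            pkt.ecn_marked = True
        return pkt


queue = SojournTimeMarkingQueue(threshold_s=100e-6)
queue.enqueue(Packet(flow_id=1), now=0.0)
queue.enqueue(Packet(flow_id=2), now=0.0)
fast = queue.dequeue(now=50e-6)   # waited 50 microseconds
slow = queue.dequeue(now=200e-6)  # waited 200 microseconds
print(fast.ecn_marked, slow.ecn_marked)
```

The marking decision needs no estimate of the queue's drain rate: however the scheduler shares the link, a packet is marked only if it actually waited too long.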

Q: Can you actually measure real-time RTT?
A: Papers like PingMesh have shown that most DC RTTs are 10s of microseconds.

Q: Does the lambda parameter depend on which transport protocol you use? 
A: In the data center, you’ll likely use a single ECN-based transport, so the lambda can be fixed and known in advance.


Xpander: Towards Optimal-Performance Datacenters

Asaf Valadarsky (The Hebrew University of Jerusalem), Gal Shahaf (The Hebrew University of Jerusalem), Michael Dinitz (Johns Hopkins University), and Michael Schapira (The Hebrew University of Jerusalem)

When designing a data center network, you need to decide on: 1) topology, 2) routing, and 3) transport. These choices impact performance and deployability.

To achieve BOTH good performance and deployability, we need “expander data centers.”

A graph is an expander graph if it has good *edge expansion*, meaning the capacity across any possible cut in the network is large.
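
As a concrete illustration, the sketch below computes edge expansion by brute force for tiny graphs, using the standard definition: the minimum, over all node sets S containing at most half the nodes, of the number of edges leaving S divided by |S|. It assumes unit-capacity links and is only practical for very small graphs; the function name and example graphs are illustrative.

```python
from itertools import combinations


def edge_expansion(nodes, edges):
    """Brute-force edge expansion of an undirected graph with unit-capacity links."""
    nodes = list(nodes)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, size):
            inside = set(subset)
            # Count edges with exactly one endpoint inside the subset.
            cut = sum(1 for u, v in edges if (u in inside) != (v in inside))
            best = min(best, cut / size)
    return best


# A 6-node ring: any contiguous half can be cut off by removing just 2 links.
ring_nodes = range(6)
ring_edges = [(i, (i + 1) % 6) for i in range(6)]

# A 4-node complete graph: every cut crosses many links.
k4_nodes = range(4)
k4_edges = list(combinations(range(4), 2))

print(edge_expansion(ring_nodes, ring_edges))
print(edge_expansion(k4_nodes, k4_edges))
```

The ring scores lower than the complete graph because it has a sparse cut, which is exactly the kind of bottleneck an expander topology avoids.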

They present Xpander, an expander-graph DC topology (good performance) that’s designed with operational concerns in mind (good deployability).

Xpander achieves near-optimal performance while using fewer and shorter cables than a fat tree network.

Q: What happens to latency in the presence of congestion in Xpander?
A: We have not evaluated that, but we’re working on it now.

Q: Recently there have been proposals for flexible data center topologies. Can they borrow ideas from you?
A: We’re working on this now, and I think we’ll have some good results soon.


Network Scheduling Aware Task Placement in Datacenters

Ali Munir (Michigan State University), Ting He (IBM), Ramya Raghavendra (IBM), Franck Le (IBM), and Alex X Liu (Michigan State University)

Many data center applications deal with lots of data, and shuffling data among distributed applications causes lots of network traffic.

Current approaches are task-aware network scheduling (scheduling related flows with similar priorities) and network-aware task scheduling (assigning related jobs to nearby nodes). The problem with these techniques is that they independently make decisions that impact one another, rather than sharing information and cooperating.

The authors present NEAT: Network scheduling aware task placement, which does both together. NEAT is a task scheduler that makes decisions based on network state and scheduling policies, in addition to the usual server loads and location of data on storage servers. The high-level intuition is to use knowledge about network scheduling policies to place tasks in such a way that minimizes the number of flows sharing each link.
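
The intuition of keeping flows from piling onto the same links can be sketched with a simple greedy rule. This is not NEAT's actual algorithm; it ignores server load, flow sizes, and the network scheduling policy. The function name, host names, and link names are hypothetical.

```python
from collections import Counter


def place_task(candidate_paths, link_flow_counts):
    """Pick the host whose path from the data has the least-shared bottleneck link.

    candidate_paths: dict mapping host name -> list of link names on its path.
    link_flow_counts: Counter of flows currently on each link (updated in place).
    """

    def bottleneck(host):
        # Largest number of flows already on any link of this host's path.
        return max((link_flow_counts[link] for link in candidate_paths[host]), default=0)

    # Iterate hosts in sorted order so ties are broken deterministically.
    best_host = min(sorted(candidate_paths), key=bottleneck)
    for link in candidate_paths[best_host]:
        link_flow_counts[link] += 1  # the new task's flow now uses these links
    return best_host


paths = {"hostA": ["link1", "link2"], "hostB": ["link3"]}
flows = Counter({"link1": 2, "link3": 1})

first = place_task(paths, flows)   # hostB: its bottleneck carries 1 flow, hostA's carries 2
second = place_task(paths, flows)  # both bottlenecks now carry 2 flows; the tie goes to hostA
print(first, second)
```

A scheduler that knew nothing about the network could place both tasks on the same host and force their flows to share links that are already busy.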

Q: Have you considered interactions with load balancers?
A: We’re doing this right now.

Q: You have to maintain network state, and there are many short flows in data centers. Does this cause any limitations?
A: In our results, we saw that even when we made approximations about network state, the results are still good.