Authors: Mosharaf Chowdhury (UC Berkeley), Ion Stoica (UC Berkeley)
Public review: http://conferences.sigcomm.org/sigcomm/2015/pdf/reviews/188pr.pdf
Motivation: Based on a month-long trace of 320,000 FB jobs, it was found that on average about 25% of the runtime is spent in intermediate communications. As SSD-based jobs become more common, the network will become the bottleneck.
Flow–based solutions are either classified as per-flow fairness or flow completion time approaches. A coflow is a communication abstraction for data-parallel applications to express their performance goals:
- Minimize completion times
- Meet deadlines or
- Perform fair allocation.
Consider LAS (Least-Attained Service), which prioritizes the flow that has sent the least amount of data. Coflow-Aware LAS (CLAS) prioritizes the coflow that has sent the least total number of bytes. The challenges of CLAS, which are also shared by LAS, include:
- It can lead to starvation
- It is suboptimal for similar size coflows, since it reduces to fair sharing.
Discretized coflow-aware LAS (D-CLAS) uses the following:
- Priority discretization (Change priority when total number of bytes sent exceeds predefined thresholds).
- Scheduling policies (FIFO within the same queue and prioritization across queue).
- Weighted sharing across queues (guarantees starvation avoidance).
Aalo is a scheduler for DCLAS. It has a coordinator and a number of workers. Aalo is non-blocking, which means when a new coflow arrives at an output port, one puts its flow(s) in lowest priority queue and schedule them immediately. No need to syn all flows of a coflow as in Varys. Workers are sent information about active coflows periodically. The coordinator computes the total number of bytes sent and relays this information back to the workers.
Aalo is evaluated with a 3000 machine trace-driven simulation matched against a 100-machine EC2 deployment. Results show that Aalo is on par with clairvoyant approaches for EC2. Aalo generally outperforms Varys for job completion time. With regard to scalability, the results show that the faster Aalo jobs can coordinate, the better Aalo performs.
In summary: Aalo efficiently schedules coflows without complete information. It makes coflows practical in the presence of failures and DAGs. There is improved performance over flow-based approaches. It provides a simple, non-blocking API. The code is open-sourced at https://github.com/coflow
Q: Is there any benefit to fixing the priorities using a different approach from what you used in the paper?
A: Short answer is that is future work. Longer answer: You can do something as smart as that.
Q: I am trying to understand how to take these results in the context of Kay’s results from NSDI'15. What happens if I try your approach in Spark, and how will the results line-up?
A: All of the results depend on the type of work.
Q: This work seems similar to PeakFabric. Could you comment on the difference between the two?
A: In this dimension it is about the coflows, which makes the problem more challenging.
Q: What about coflow dependencies?
A: That is future work. This is a great question.
Q: What was the coflow distribution? Where there some which were large and others which were small?
A: In general there are differences between the sizes of the coflows. I don’t have the details on how they were different.