Authors: Alok Kumar (Google), Sushant Jain (Google), Uday Naik (Google), Anand Raghuraman (Premise Data Corp), Nikhil Kasinadhuni (Google), Enrique Cauich Zermeno (Google), C. Stephen Gunn (Google), Jing Ai (Google), Björn Carlin (Google), Mihai Amarandei-Stavila (Google), Mathieu Robin (Google), Aspi Siganporia (Google), Stephen Stuart (Google), Amin Vahdat (Google)
Presented by: Alok Kumar from Google
Public Review: http://conferences.sigcomm.org/sigcomm/2015/pdf/reviews/175pr.pdf
The BwE system has been deployed inside Google for the last 5 years.
Google's internal network traffic is enormous: if Google were considered an ISP, it would be the second largest in the world.
B4 is Google's internal WAN. It spans multiple continents (North America, Europe, Asia).
Such a giant-scale network has many inefficiencies in bandwidth allocation.
The goal of the project: a centralized bandwidth allocation algorithm at the scale of Google's WAN
(allowing for flexible allocation policies; enforcement at hosts), to minimize inefficiencies.
System before BwE: Thousands of competing users were classified into a few classes; differentiation was only coarse
(e.g. Search vs. Gmail) rather than user by user.
There was no good solution for non-critical applications such as backups, where latency is irrelevant but high throughput is necessary.
What problem does it solve?
Visibility into users
Sharing of WAN bandwidth based on configured policies -> users can specify/buy requirements
System Architecture: The global enforcer takes policies and the network model as input and computes an allocation -> sends it to several cluster enforcers -> each sends to multiple job enforcers -> each sends to host enforcers
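The four-level hierarchy can be pictured as a tree in which demand is summed upward and limits are pushed downward. The sketch below is only an illustration of that shape: the `Enforcer` class, its fields, and the proportional-to-demand split are assumptions made for this example, not the mechanism described in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Enforcer:
    """Hypothetical node in the global -> cluster -> job -> host tree."""
    name: str
    level: str                   # "global", "cluster", "job" or "host"
    demand_gbps: float = 0.0     # only meaningful on leaves (hosts)
    children: list = field(default_factory=list)
    limit_gbps: float = 0.0      # filled in by push_down()

    def total_demand(self) -> float:
        # Leaves report their own demand; inner nodes sum their subtree.
        if not self.children:
            return self.demand_gbps
        return sum(c.total_demand() for c in self.children)

    def push_down(self, limit: float) -> None:
        # Never hand out more than the subtree asks for.
        total = self.total_demand()
        self.limit_gbps = min(limit, total)
        for c in self.children:
            # ASSUMPTION: split proportionally to each child's demand.
            share = self.limit_gbps * c.total_demand() / total if total > 0 else 0.0
            c.push_down(share)


h1 = Enforcer("h1", "host", demand_gbps=4.0)
h2 = Enforcer("h2", "host", demand_gbps=4.0)
h3 = Enforcer("h3", "host", demand_gbps=8.0)
j1 = Enforcer("j1", "job", children=[h1, h2])
j2 = Enforcer("j2", "job", children=[h3])
c1 = Enforcer("c1", "cluster", children=[j1, j2])
root = Enforcer("global", "global", children=[c1])

root.push_down(8.0)  # 8 Gbps available for 16 Gbps of total demand
for node in (c1, j1, j2, h1, h2, h3):
    print(node.name, node.limit_gbps)
```

With 16 Gbps of demand and 8 Gbps to hand out, every node in this toy tree ends up with half of what it asked for.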
Policies have the following form:
Guaranteed Bandwidth (with weight) + Best Effort (with weight)
For example: Gmail: 10 Gbps guaranteed + 20 Gbps best effort with w=2 and 50 Gbps with w=1
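One way to write such a policy down is as a guaranteed amount followed by an ordered list of best-effort tiers. The sketch below encodes the Gmail example under two assumptions that the notes do not confirm: the tiers are additive (so the policy tops out at 80 Gbps), and they are filled in the order listed. The names `Tier`, `Policy` and `fill` are made up for this illustration. The weights are stored but unused here, since they only matter when several users compete.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    gbps: float    # size of this best-effort tier
    weight: float  # relative weight when competing with other users


@dataclass
class Policy:
    user: str
    guaranteed_gbps: float
    best_effort: list  # list of Tier, ASSUMED to be filled in order

    def fill(self, available_gbps: float):
        """Split the bandwidth available to this user across its tiers."""
        left = max(0.0, available_gbps)
        guaranteed = min(left, self.guaranteed_gbps)
        left -= guaranteed
        tiers = []
        for t in self.best_effort:
            got = min(left, t.gbps)
            tiers.append(got)
            left -= got
        return guaranteed, tiers


gmail = Policy("Gmail", 10.0, [Tier(20.0, 2.0), Tier(50.0, 1.0)])
print(gmail.fill(25.0))  # (10.0, [15.0, 0.0])
```

With 25 Gbps available, the guarantee is met in full and the remaining 15 Gbps land in the first best-effort tier.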
Algorithm: path selection (traffic engineering) & bandwidth allocation are optimized independently, rather than doing joint optimization. This is suboptimal, but scales better.
(1) Traffic Engineering (TE): runs less frequently (so things scale): determines paths -> input to MPFA
(2) MultiPath Fair Allocation (MPFA): Can handle arbitrarily complex networks, flowgroups can take multiple paths, network can have bottlenecks
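MPFA itself handles multiple paths and multiple bottlenecks and is not reproduced here. As a much simpler stand-in, the sketch below shows weighted max-min fair allocation on a single bottleneck link, which conveys the basic idea of weight-proportional sharing with demand caps. The function name and the dict-based interface are assumptions for this illustration.

```python
def weighted_max_min(capacity, demands, weights):
    """Weighted max-min fair shares on ONE link (not MPFA).

    demands, weights: dicts mapping a flow-group name to Gbps / weight.
    """
    alloc = {name: 0.0 for name in demands}
    active = {n for n in demands if demands[n] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[n] for n in active)
        # Bandwidth per unit of weight if the rest were split right now.
        level = remaining / total_w
        # Flow groups whose demand fits within their weighted share.
        satisfied = {n for n in active if demands[n] <= level * weights[n]}
        if not satisfied:
            # Everyone is bottlenecked: split what is left by weight.
            for n in active:
                alloc[n] = level * weights[n]
            break
        # Grant satisfied groups their full demand, then redistribute.
        for n in satisfied:
            alloc[n] = demands[n]
            remaining -= demands[n]
        active -= satisfied
    return alloc


shares = weighted_max_min(
    capacity=30.0,
    demands={"a": 5.0, "b": 40.0, "c": 40.0},
    weights={"a": 1.0, "b": 2.0, "c": 1.0},
)
print(shares)
```

In this run, `a` is small enough to be satisfied outright, and the remaining 25 Gbps are split 2:1 between `b` and `c`.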
Failure Handling: Redundancy at each layer
Future Work: Deadline based scheduling, Joint BwE-TE optimization
* is a single place for specifying bandwidth policies
* enables efficient use of network resources
[Unfortunately very incomplete due to acoustic problems]
Q1: Do applications need to specify requirements?
A: [hard to understand acoustically] We have both; services buy bandwidth
Q2: Do you oversubscribe to get good utilization?
A: On certain levels
A: Policies are smoothed over time (5 minutes)
Q4: Assumption that all hosts offer roughly the same amount of traffic; how much imbalance is there in practice?
Q5: What about delay? Do you take it into account?
A: TE does somewhat (tries to assign shortest paths)