Friday, November 22, 2013

Hotnets '13: Managing the Network with Merlin

Authors: Robert Soulé, Shrutarshi Basu, Robert Kleinberg, Emin Gün Sirer, Nate Foster (Cornell University)

Presented by: Robert Soulé

Enforcing network management policies is complicated using current techniques. A lot of different diverse machines have to be configured in multiple ways to enforce even relatively simple policies. When these rules need to be modified, it is even harder to reason about the effect of these changed rules upon the different devices and upon the network as a whole. Further, current techniques assume that networks belong to single administrative authorities and do not allow delegation of authority.

The paper addresses the above challenges. First, in order to make policies easier to express and easier to implement, the paper offers an abstraction for global networking policies using a high-level declarative language, which is then mapped to a constraint problem. Second, the paper introduces a framework where authority can be delegated to "tenants". To make sure that tenants are actually obeying their constraints, the paper also introduces a technique for verification.

Policies are expressed using a combination of logical predicates, regular expressions and bandwidth constraints, which can allow the administrator to specify the intended behavior of the network at a high level of abstraction. A compiler then evaluates these policies and generates the code. The problem is then mapped into a constraint problem using NFA, which is solved using linear programming.

To evaluate Merlin, the authors ran a TCP trace and a UDP trace in two network configurations - one with Merlin and another without. They noted that the TCP trace performed much better in the configuration with Merlin (which was configured to grant higher priority to TCP). The authors also evaluated Merlin on a testbed with 140 nodes to check how well the system scales. They noted that it took 6 seconds to generate and run the linear programming model, which they think is fairly low.

Q) What happens when there are failures?
A) One possible approach is to model the failures as input to the constraint problem.

Q) How scalable is Merlin? Constraint solving may not scale well with a higher number of devices and with different types of devices?
A) We looked at networks with 140 nodes. We think it's fairly good to look at numbers of this scale. Google B4 was doing the same, but they used only a dozen classes of traffic, whereas we handle about a hundred classes of traffic on 140 different nodes. Scaling to even larger sizes may be possible by coming up with engineering approaches to further reduce the time to solve the linear programming problem. Alternately, perhaps we can adopt appoximation solutions which could result in much faster run times at a small accuracy tradeoff

Q) What does the NFA abstraction buy you?
A) We are language researchers and want to program in abstractions. We are getting powerful abstractions such as verification of policies that are farmed out to tenants, something that nobody has done before.