Authors: Ratul Mahajan (Microsoft Research), Roger Wattenhofer (ETH).
They argue that a problem with SDN may be in the area of transitions between current states and future states during rule changes. When the SDN controller wants to change some rules and sends some updates into the network, which is asynchronous, the state does not change atomically and may result in things like network loops during the transition.
One approach is to try to make the state transition atomic by propagating version numbers and don't tell the nodes to start using a version until all nodes have received it. This slows down updates when some nodes don't aren't affected by the rule change. The way of assuming all nodes rules are dependent on one another gives stronger packet coherence, but if we instead targeted only the nodes that have dependencies on one another we can try to achieve minimal SDN updates.
It is possible to find a minimal set of nodes that need to be involved for a given set of properties for various kinds of consistency, but the difficulty is not just how to compute new rules, but how to get from the current configuration to the new configuration gracefully.
Suppose you have a set of new rules and run it through an update plan generator that creates an update DAG that defines when and where various rule updates need to be applied. If the graph is cyclic the cycle has to be broken and then an optimizer can reduce the graph into a smaller set of operations. At this point, the executor can then apply the updates.
Q: There have been a couple of papers looking at program synthesis. You can view these as instantiations of your update planner. Can you say more about what is in the planner and how it works?
A: We only have it for some kind of properties. In this paper we only look at loop freedom. It doesn't work in general for any property that you define. For us we want to look at specific properties and understand them better.
Q: Work has been done in this area for distributed networks (pre-SDN). Is there a formal version of how to map the distributed approaches to the central approach.
A: We don't know yet. At the surface they look different because you can exactly control which node you are sending an update to, which isn't true in the distributed case. It is possible those algorithms can be used in some p
Q: Is there a small set of properties that are broadly useful enough or do you have to fall back to something like version numbers for the general case?
A: I wouldn't characterize the version numbers as a general solution, because they don't work universally. I would choose the fastest thing you can do that has the correctness property.
Observation (from audience): Depending on the application people go with different data consistently problems. Maybe all you care about is looping.