Friday, December 23, 2016

CoNEXT 2016 Keynote: Enabling Software-Defined Network Security for Next-Generation Networks

Prof. Vyas Sekar, SIGCOMM's 2016 Rising Star Awardee, gave a remarkable talk about his research on Software-Defined Security. His talk was a reflection both on the work that he and his colleagues have been doing over the years and on the process of doing it. With that in mind, he speaks both about the technical aspects of his work and about how he picked and solved the fundamental research problems that gave rise to each project.

(Edit: You can find his slides here.)

His initial considerations:
The main motivation for his recent work on network security is that the traditional ways of providing network security are not keeping up with the pace of innovation in attacks. Currently, there is an asymmetry between attacks, which are growing in sophistication and scale (e.g., they may change and become polymorphic, or use botnets), and the tools used to defend against them. Also, techniques often rely on the assumption that attacks always come from the outside, but the notion that there are only "good guys inside" does not hold. In particular, operators say current solutions are hopelessly impractical for handling new types of attacks, mainly because they may incur high cost, management complexity, and user frustration (i.e., the fundamental tussle between security and usability).

In that sense, he believes that Software-Defined Security is the right way forward for network security. The fundamental idea that has driven his research in recent years is applying the capabilities provided by SDN and NFV to solve existing security problems in networks.

He argues that the best chance to keep up with the innovation in attacks is to break away from the perimeter- and hardware-centric mentality (dedicated boxes at fixed choke points) and to adopt a more software-driven vision for network security solutions. The ultimate goal is the ability to implement a flexible portfolio of defensive applications that are not constrained in location, capacity, or functionality.

Mainly, he argues, SDN and NFV provide three key capabilities. First, agility to change and customize the security profile of the network as threats evolve. Second, flexibility to place security appliances anywhere (as often the network topology gets in the way of acquiring the right context for each appliance). Third, performance elasticity, in order to scale security appliances up and down as needed.

Although promising, he says, the use of SDN and NFV in this context comes with a set of challenges. First, the data plane needs more flexibility to provide the right context to security applications. Second, the defense system requires an orchestration scheme that provides service chaining in an efficient, optimal, and scalable way. Third, the defense applications themselves should be both agile to a changing environment and robust to adversarial evasion. Finally, there is a call for a new set of test and verification tools in order to ensure correctness.

Before going through his work, Prof. Sekar takes a step back and gives an overview of how his research came to be. In his own words, "It is tempting to think that this high-level vision existed all along, and that 5 years ago we had this complete vision of the work, that we had all figured out, that it came in a dream and so on." In reality, he says, research is often non-linear and chaotic and comes about in a more bottom-up and organic fashion. Most of the time there is a disconnect between the way research is presented, which is top-down, and the way it actually came about.

According to him, there are two main ways to do good research. One is working through the challenging pain points that researchers and practitioners face while working on tools in the lab or running an operational infrastructure. The other is the non-trivial role of chance conversations with students, professors, network operators, etc., which lead to interesting research problems. This second way is quite important but often gets dismissed.

He then goes through each step he took along the way to realizing this software-defined security vision. At each step he gives credit to his co-authors and explains how his chance conversations with some of them were one of the main reasons the work exists today.

His work:
Prof. Sekar starts with SIMPLE [1]. SIMPLE is a policy enforcement layer for service chaining (e.g., making a given flow pass through a firewall, an IDS, and a proxy in a particular order). The fundamental question to be answered was: can we use something like SDN to steer traffic through legacy middleboxes (which sit at fixed locations)?

The idea for this came from previous work in which he developed a novel middlebox architecture. At the time, many researchers were puzzled about how to make the network steer traffic through middleboxes.

He and his colleagues identified three major problems when steering traffic through middleboxes. First, the same packet may end up in the same switch more than once (e.g., if a given middlebox is connected to the network through a single link) so the switch may not know what to do with it. In the general case, this creates a loop. Second, to avoid overload a network operator may want to balance the load across several instances of the same middlebox. However, generating an optimal set of flows which guarantee load balancing without consuming all space in TCAMs is an intractable problem. Third, some middleboxes modify packets and this makes the correlation between incoming and outgoing traffic very challenging. The solutions developed under the SIMPLE architecture resulted in new primitives for the data plane, better approximations for making good use of TCAM space and techniques for correlating flows.
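The first problem can be made concrete with a toy sketch. This is only an illustration under stated assumptions, not SIMPLE's actual data-plane mechanism: the switch, the flow name `flowA`, the two-element chain, and the idea of carrying a "segments completed" counter as a tag are all hypothetical. It shows why a table keyed on the flow alone cannot distinguish a fresh packet from one returning from a middlebox, and how adding a little per-packet state removes the ambiguity.

```python
# Hypothetical setup: one switch, with a firewall and an IDS each attached
# through a single link, so a packet of the same flow visits the switch
# three times while traversing the chain firewall -> ids -> egress.

# Naive table keyed only on the flow: a single entry per flow.
naive_table = {"flowA": "firewall"}

# Table keyed on (flow, number of chain segments already completed).
# The counter is an assumed tag; it could equally be inferred from the in-port.
tagged_table = {
    ("flowA", 0): "firewall",
    ("flowA", 1): "ids",
    ("flowA", 2): "egress",
}


def forward_naive(flow, max_visits=5):
    """Return the next hops chosen on successive visits to the switch."""
    hops = []
    for _ in range(max_visits):
        nxt = naive_table[flow]
        hops.append(nxt)
        if nxt == "egress":
            break
    return hops


def forward_tagged(flow):
    """Same walk, but the switch also matches on the completed-segment tag."""
    hops, tag = [], 0
    while True:
        nxt = tagged_table[(flow, tag)]
        hops.append(nxt)
        if nxt == "egress":
            return hops
        tag += 1  # packet returns from the middlebox with one more segment done
```

With the naive table the packet is sent to the firewall on every visit and never leaves; with the extra match field it follows the chain and exits. The price is more table entries per flow, which is where the TCAM-space concern comes in.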

One of the most important lessons that came from SIMPLE [1] was the understanding that the SDN data plane required richer southbound APIs, which later resulted in FlowTags [2]. In particular, FlowTags extends the middleboxes themselves so that they supply context to the network data plane. This effectively solves the problem of correlating flows that are modified by a middlebox, which greatly simplifies service composition and enables new verification and network diagnosis methods. Moreover, one of the main strengths of the proposed solution is its simplicity and efficiency, as FlowTags requires little modification to current software and incurs very small overhead.
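As a rough illustration of a middlebox supplying context, consider a NAT that rewrites the source address. The sketch below is an assumption-laden toy, not the FlowTags API: the `Packet` type, its `tag` field, the `TaggingNat` class, and `original_source` are all made up for the example. The point is only that a tag attached by the middlebox lets a downstream element recover information that the rewrite destroyed.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    tag: Optional[int] = None  # hypothetical tag field carried in the header


class TaggingNat:
    """Toy NAT: rewrites the source and records the original under a tag."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.tag_table = {}  # tag -> original (pre-NAT) source
        self._tag_of = {}    # original source -> tag

    def process(self, pkt):
        if pkt.src not in self._tag_of:
            tag = len(self._tag_of) + 1
            self._tag_of[pkt.src] = tag
            self.tag_table[tag] = pkt.src
        return replace(pkt, src=self.public_ip, tag=self._tag_of[pkt.src])


def original_source(pkt, tag_table):
    """Downstream element: recover the pre-NAT source from the tag, if any."""
    return tag_table.get(pkt.tag, pkt.src)


nat = TaggingNat("203.0.113.1")
out = nat.process(Packet(src="10.0.0.5", dst="198.51.100.7"))
```

After the NAT, `out.src` is the public address, yet `original_source(out, nat.tag_table)` still yields `10.0.0.5`, so a per-host policy further along the chain can still be applied.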

FlowTags’ [2] improved southbound API widened the scope for new types of applications. While developing new applications, however, students often had to reinvent some form of resource allocation optimization. This required them to use and debug low-level optimization tools such as Gurobi or CPLEX, which got in the way of solving the actual networking problems. This difficulty eventually led to SOL [3].

SOL [3] simplifies the development of applications by offering a simpler API to interface with low-level optimization tools and solve common optimization problems. To do this, SOL has to be general (in the sense that many different applications would benefit from it) and efficient (such that the time to compute solutions would be comparable to custom algorithms developed for each specific application).

The key insight in SOL [3] to achieve both generality and efficiency came from the observation that most researchers rewrite network optimization problems in terms of path constraints (as opposed to edge constraints). The reasoning behind this is that near-optimal solutions can be achieved with a small subset of all possible paths. Thus, SOL works by combining offline path preprocessing with simple, online path-selection algorithms.
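The split between offline path preprocessing and online path selection can be sketched in a few lines. This is not SOL's API: the function names, the tiny four-node topology, and the choice of "k shortest paths" as the preprocessing rule are assumptions for illustration, and a simple greedy rule stands in for the optimization solver that a real system would call.

```python
def simple_paths(graph, src, dst, path=()):
    """Offline step: yield every loop-free path from src to dst."""
    path = path + (src,)
    if src == dst:
        yield path
        return
    for nxt in graph[src]:
        if nxt not in path:
            yield from simple_paths(graph, nxt, dst, path)


def shortest_k(graph, src, dst, k):
    """Keep only the k shortest candidate paths (by hop count)."""
    return sorted(simple_paths(graph, src, dst), key=len)[:k]


def links(path):
    return list(zip(path, path[1:]))


def place(demands, candidates, capacity):
    """Online step: route each demand on the candidate path whose most
    loaded link would end up least utilized (a greedy stand-in for a solver)."""
    load, chosen = {}, []
    for src, dst, volume in demands:
        def worst(path):
            return max((load.get(l, 0) + volume) / capacity for l in links(path))
        best = min(candidates[(src, dst)], key=worst)
        for l in links(best):
            load[l] = load.get(l, 0) + volume
        chosen.append(best)
    return chosen, load


# Hypothetical topology: two disjoint two-hop paths from A to D.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
CANDIDATES = {("A", "D"): shortest_k(GRAPH, "A", "D", k=2)}
chosen, load = place([("A", "D", 6), ("A", "D", 6)], CANDIDATES, capacity=10)
```

Here the two demands are spread over the two candidate paths, so no link carries more than 6 units. The online step only ever looks at the precomputed candidates, which is what keeps it cheap.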

With a richer southbound API and new optimization primitives comes the question of what can actually be done with them. The next step in his research was BUZZ [4], a system that gives assurances on whether policies are implemented correctly or not. Although several works had already been developed in the area, BUZZ was the first to consider stateful, context-dependent policies. BUZZ’s insight also came from students’ difficulties in implementing SDN applications. In particular, the FlowTags architecture itself turned out to be very difficult to debug.

BUZZ [4] works by building a model of the correct network and testing the behavior of the network traffic against this model. Its development presented two main challenges. First, how to model the network and its functions. Second, how to solve the exploration problem of testing the traffic on the model, which is already hard in stateless cases.

The insight in BUZZ [4] was to use the structure of the policies written on the middleboxes as a way to shrink the exploration space of the problem. For example, it is both easier and more realistic to think of rules at the granularity of TCP sessions rather than individual packets. The verification itself is performed using efficient symbolic execution techniques, an idea that originated from talking with colleagues who work on formal model verification.
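A minimal sketch of the model-versus-implementation idea at session granularity follows. Everything in it is assumed for illustration: the three session-level events, the stateful-firewall policy, and the seeded bug are invented, and plain exhaustive enumeration stands in for the symbolic execution that BUZZ relies on.

```python
from itertools import product

# Session-level events for a single remote host (hypothetical abstraction).
EVENTS = ["open", "close", "inbound"]


def model(trace):
    """Reference policy: inbound traffic is allowed only while a session is open."""
    is_open, out = False, []
    for ev in trace:
        if ev == "open":
            is_open = True
        elif ev == "close":
            is_open = False
        else:
            out.append("allow" if is_open else "drop")
    return out


def buggy_firewall(trace):
    """Implementation under test; it never clears state on close (seeded bug)."""
    seen_open, out = False, []
    for ev in trace:
        if ev == "open":
            seen_open = True
        elif ev == "inbound":
            out.append("allow" if seen_open else "drop")
    return out


def find_violation(impl, max_len=3):
    """Return the first shortest trace on which impl and the model disagree."""
    for n in range(1, max_len + 1):
        for trace in product(EVENTS, repeat=n):
            if impl(trace) != model(trace):
                return trace
    return None
```

`find_violation(buggy_firewall)` returns `("open", "close", "inbound")`: the bug only shows up after a specific sequence of events, which is exactly the kind of context-dependent behavior that stateless checks miss. Working with three session events instead of raw packets is what keeps the search space small enough to enumerate here.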

The use of BUZZ [4] showed that many recent systems, including their own, had bugs in the handling of context-dependent policies. Some of these systems were Kinetic, OpenNF, FlowTags, and PGA.

SIMPLE [1], FlowTags [2], SOL [3], and BUZZ [4] are building blocks for new security applications. One of the first was Bohatei [5], a system that provides the flexibility to handle changing or evolving patterns in DDoS attacks (both in magnitude and location).

Bohatei [5] relies on geographically distributed data centers that can be used to instantiate defenses on-demand and a predictor that detects attacks and provides the duration and volume of each attack. Bohatei decides the type and quantity of defense appliances and routes traffic through these appliances so that it can be scrubbed clean before reaching the customer.
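The sizing decision can be illustrated with a very small sketch. It is an assumption-heavy simplification rather than Bohatei's actual resource manager: the function name, the per-instance capacity, the notion of free "slots" per data center, and the first-fit placement order are all hypothetical.

```python
import math


def plan_defense(predicted_gbps, instance_capacity_gbps, datacenter_slots):
    """Return {datacenter: instances}, sized to the predicted attack volume.

    datacenter_slots maps each data center to its free instance slots and is
    assumed to be ordered by preference (e.g., proximity to the attack ingress).
    """
    needed = math.ceil(predicted_gbps / instance_capacity_gbps)
    plan = {}
    for dc, slots in datacenter_slots.items():
        if needed == 0:
            break
        use = min(slots, needed)
        if use:
            plan[dc] = use
            needed -= use
    if needed:
        raise RuntimeError("not enough capacity for the predicted attack")
    return plan


plan = plan_defense(25, 10, {"dc-east": 2, "dc-west": 4})
```

A predicted 25 Gbps attack with 10 Gbps per instance needs three instances, so two land in `dc-east` and one spills over to `dc-west`. When the prediction changes, rerunning the plan scales the defense up or down, which is the elasticity that fixed hardware appliances lack.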

The insight behind Bohatei [5] came from an NTT engineer who was a visiting student at the time. The problem was that their network was facing large-scale DDoS attacks and the defense appliances they were buying were getting overloaded. The proposed solution was to use SDN and NFV to build elastic defense appliances.

The next security application he presents is PSI [6], a system that provides custom security defenses for each individual user or device. The insight behind PSI came from a conversation with security specialist Michael Collins, from RedJack, who argued that the network gets in the way of enterprise security defenses. This happens for four main reasons. First, current defenses can easily be bypassed since they are positioned at fixed choke points. Second, defenses generate a lot of false positives and negatives, since tools lack context (they don’t know where the traffic is coming from). Third, they lack isolation and so have to provide a general set of policies for all cases. Fourth, they lack the agility to change policies in response to a dynamic environment.

In PSI [6], each security defense has its own context and is isolated from the others. Each defense is composed of a set of micro security appliances (e.g., micro-firewall, micro-IDS, micro-proxy, etc.) connected in a particular order and with a custom set of policies. These micro appliances run inside an enterprise cluster and raise security alerts to a PSI controller, which can dynamically change defenses as needed.
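The per-device isolation and the alert-driven reconfiguration can be pictured with a toy controller. This is a sketch under assumptions and not PSI's controller interface: the class name, the device names, the `suspicious_dns` alert, and the rule that reacts to it are all invented for illustration.

```python
class PsiControllerSketch:
    """Toy controller keeping one ordered chain of micro-appliances per device."""

    def __init__(self, default_chain):
        self.default_chain = list(default_chain)
        self.chains = {}  # device -> ordered list of micro-appliance names

    def chain_for(self, device):
        # Each device gets its own copy, so changes never leak across devices.
        return self.chains.setdefault(device, list(self.default_chain))

    def on_alert(self, device, alert):
        """Tighten only the alerting device's chain (hypothetical rule)."""
        chain = self.chain_for(device)
        if alert == "suspicious_dns" and "micro-proxy" not in chain:
            chain.append("micro-proxy")


ctl = PsiControllerSketch(["micro-firewall", "micro-IDS"])
ctl.on_alert("camera-1", "suspicious_dns")
```

After the alert, `camera-1` has a proxy appended to its chain while any other device, say `laptop-7`, keeps the default chain. That contrast is the isolation and agility that a single shared choke-point appliance cannot offer.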

His final remarks:
Prof. Sekar concludes by highlighting some of the work that is still left to be done and by mentioning what makes for successful research. As future work, he mentions that researchers have to think about how to provide security defenses in the data plane; how to reason about adversarial evasion; how to approach new domains, such as IoT security; and how to create new abstraction and orchestration layers for an ensemble of new security applications in a way that helps reason about the composition of broader security policies.

His two recommendations for a successful researcher are as follows. First, although the notion of a top-down approach to research is appealing, one must not overlook an organic, bottom-up approach, in particular looking into pain points and interesting opportunities as a way to find new research directions. Second, one should leave the comfort zone and talk to different people. Such interactions often lead people to talk about their particular problems and may result in interesting research collaborations.

The questions that were posed:
Q1: Can you share some of the early pushback (hard criticism) that you got and the process of getting through that pushback?

Vyas Sekar: The work has received a lot of pushback. As an example, when we first started looking at middleboxes, since there was already a lot of literature about it, most people’s opinion was that the work was going to be massacred. But as the work developed, it lined up with the current understanding of the community. We may also have had luck. So one of the ways to get through is to get lucky.

The other option is being persistent. Even when your paper is not accepted right away, the techniques you have developed may be of interest to the community. People may not have got it at the time, but it may remain valid five or six years from now. I have seen a lot of persistent people getting very interesting ideas through. So another way of dealing with pushback is being stubborn.

Q2: Can't a controller also be used to obscure attacks?

Vyas Sekar: This is a recurring and valid question and should be taken seriously. We and many others have solved the scalability aspects of it. As for other aspects, they can often be tackled through other techniques such as resilience, penetration testing, etc. So, although it is a valid concern, it should not be a fundamental limiting factor that keeps us from the potential security benefits achieved by programmability.

Q3: How do you evaluate these operational issues in a university setting?

Vyas Sekar: In our case, all of the evaluated components are real software artifacts and thus can be tested in a university. In our community, in recent years, we have had a lot of open-source systems that are reasonably close to what is used in production.

Getting data is actually a very hard problem. This is where you may benefit from establishing contacts and relying on the knowledge of others. Some may have interesting insights into problems or behaviors, others may have datasets that can be evaluated.

Many of these systems can be built and evaluated without actually going through a deployment and there is value in doing that. Just the fact of building a testbed or an emulation platform will highlight several scalability and testing issues that need to be solved.

I cannot say how large the gap is between the open-source software and hardware that we have and an operational system. One thing I can say is that it runs much slower. There is probably a huge gap between what the industry does and what we do on the operational side, but in terms of the techniques that are used they are very close.

Q4: What network security issues can IoT bring, and how can they be handled?

Vyas Sekar: There are three aspects to any security problem: (i) an enforcement mechanism that applies a security policy; (ii) a policy abstraction that translates into what that policy is supposed to be; and (iii) a learning process to know what is correct and what is not.

IoT changes all three of them. First, in terms of enforcement, current techniques cannot be applied on constrained devices with unfixable flaws. Second, policies need to consider several new types of behavior (cyber-physical interactions, device-to-device communication, implicit dependencies across devices, etc.). Finally, the semantics of the interactions change considerably, which makes learning much more difficult.

References to some of his work:
[1] SIMPLE-fying Middlebox Policy Enforcement Using SDN
Zafar Qazi, Cheng-Chun Tu, Luis Chiang, Rui Miao, Vyas Sekar, Minlan Yu
in SIGCOMM 2013

[2] Enforcing Network-Wide Policies in the Presence of Dynamic Middlebox Actions using FlowTags
Seyed Fayazbakhsh, Vyas Sekar, Minlan Yu, Jeff Mogul
in NSDI 2014

[3] Simplifying Software-Defined Network Optimization Applications Using SOL
Victor Heorhiadi, Michael K Reiter, Vyas Sekar
in NSDI 2016

[4] BUZZ: Testing Context-Dependent Policies in Stateful Networks
Seyed K Fayaz, Tianlong Yu, Yoshiaki Tobioka, Sagar Chaki, Vyas Sekar
in NSDI 2016

[5] Bohatei: Flexible and Elastic DDoS Defense
Seyed K Fayaz, Yoshiaki Tobioka, Vyas Sekar, Michael Bailey
in USENIX Security 2015

[6] PSI: Precise Security Instrumentation for Enterprise Networks
Tianlong Yu, Seyed K Fayaz, Michael Collins, Vyas Sekar, Srinivasan Seshan
to appear in NDSS 2017