Tuesday, August 23, 2016

Packet Transactions: High-Level Programming for Line-Rate Switches

Anirudh Sivaraman (MIT CSAIL), Alvin Cheung (University of Washington, Seattle), Mihai Budiu (VMWare Research), Changhoon Kim (Barefoot Networks), Mohammad Alizadeh (MIT CSAIL), Hari Balakrishnan (MIT CSAIL), George Varghese (Microsoft Research), Nick McKeown (Stanford University), Steve Licking (Barefoot Networks)

Presenter: Anirudh Sivaraman 

What does a programmable switch means? The ability to express data plane algorithms like Active Queue Management, Congestion Control etc. Line rate means the switch must forward the packet at the highest possible rate (1-10 Gbps with 10-100s of ports).  Practical programmable chips at line rate (for e.g. Flexpipe, Xpliant) are available in the market nowadays. However, the hardware is dedicated to stateless tasks which hinders data plane algorithms which require state at the switch. Also the language of programming the switch is low-level (match-action pipeline), and is difficult to translate these algorithms. The development of a stateful instruction set for switches is also required. 

Towards programming data plane algorithms at line rates, the authors make the following contributions: a high-level abstraction called packet transactions to describe the data plane algorithm; a low-level abstraction called atoms which helps in the development of the stateful instruction set, and a compiler to translate packet transactions to atoms. 

Packet transactions are a block of imperative code (in a language called Domino) acting on packets one at a time serially, thus simplifying the development of data plane algorithms (do not have to worry about concurrency issues, can be developed agnostic to switch hardware), and an example transaction to implement a count function was presented. However, under the hood, the execution is pipelined (and thus packets can be acted upon concurrently). The authors define 'atom', a action unit which can read and modify the packet, and also maintains a local state. Thus a switch model is multiple pipeline of atoms. The atom constitutes the switch instruction set. An example atom can read the packet and store a packet field in memory. While stateless operations can be pipelined with ease, many stateful operations require atomic operations in hardware which needs to be performed in a clock cycle to provide line rates. 

The compilation of packet transaction with a atom given as input provides the pipeline of atoms in the switch which implements the transaction at line rate (or rejects the input if there is no solution which can run at line rate). Thus, switch designers can run multiple iterations of the compiler to find an atom expressive enough for the data plane algorithm (the authors provided a list of atoms in order of expressibility). The author then presented a demo for finding the atom structure for two algorithms: learning bloom filters and heavy hitters using different atoms. For evaluation, the authors present the expressiveness of packet transactions which have a comparable LOC count as software routers. They evaluate the number of atoms required to implement different data plane algorithms and find they can implement most of them with < 100 atoms which requires 1% additional area. Thus, in conclusion, the authors present different abstractions and a compiler to implement data plane algorithms in switches at line rates. 

Q: Are packet transactions a good abstraction as it requires strong serializability? 
A: Weaker concurrency models were considered, but it is easier to implement algorithms with transactions. However, a concrete analysis of different models is a future arena of work. 

Q: P4 is used to make OpenFlow more expressible? How does this system compare with P4?
A: A P4 program could be generated from packet transactions, which is an interesting proposition. 

Q: The Domino language does not support loops which may be required for some algorithms? 
A: Only loops with constant number of iterations which can be unrolled at compile-time can be implemented by the system for now (this can be provided as syntactic sugar). However, dynamic loop iterations cannot be performed.