Authors: Gábor Rétvári, János Tapolcai, Attila Korösi, András Majdán, Zalán Heszberger
Presenter: Gábor Rétvári
Background and motivation:
IP forwarding state is ever increasing and is an attractive problem that the authors use to evangelize the idea of compressed data structures, which can be used to better design IP forwarding tables. IP forwarding table compression is well studied, but the abstractions especially, compressed data structures are very interesting to study.
Key goal: Introducing compressed data structures to the networking community !
Simple encoding strings are not memory efficient, but searchable and accessible.
Huffman coding, on the other hand, uses symbol frequency and variable length coding => there is no fast access to symbols and are not search friendly. A possible solution is to build index, but that could be expensive.
The authors employ wavelet trees, which comprises of
- Indexing + Huffman encoding
- stores bitmap at each level
This allows searching by traversing the branches of the tree. "Succinct bitstring indexes" allow access or rank queries in O(1), eg. finding 3rd symbol in the bit-string. Also, they stores only the bitmaps at each level and each symbol is stored with its Huffman code => size is similar to Huffman, but faster access !
Compressed data structures:
- Compression need not sacrifice fast access
- No space time tradeoff
- Stores information in a bounded space and allows fast in-place search/access as well
IP forwarding information base (FIB)
- Fundamental data structure
- large in size
A good use case for compressed data structures !
- Trie structures are excellent for FIBs
- Prefix tree forms, but are expensive
Compression of FIBs are currently done using pointer machine schemes
- IP FIBs are very regular - results in a entropy bound that is smaller than the theoretical limit !
- FIBs can be encoded with just 2-6 bits per prefix - smaller memory as well.
- allows 100,000 of updates per sec
Compressed data structures are feasible for problems in the networking domain.
Predictable, compression does not mean slow!
Might sometimes be faster than conventional approaches !!
Q: Other applications?
A: FIB is just a use case, many other like MPLS or BGPs can potentially use this - anything that requires large memory can benefit.
Q: This approach adds sensitivity to change when compression is performed. What is the author's comment on that ?
A: Understanding regularity is key to see if redundancy is inherent to the problem or not. If so, then compression can be very useful. For e.g., in IP-FIB, the entropy does not change significantly.
Q: The applicability of this solution is context based - what are the challenges?
A: It is open whether compression can be effective or not for a problem. Not enough data at moment to measure effectiveness as of now.
PS. There was a question on the entropy of look-ups that wasn't quite clear.