Layer 9: December 2013

Friday, December 13, 2013

Federated Flow-Based Approach for Privacy Preserving Connectivity Tracking

Presenter: Mentari Djatmiko

Authors: Mentari Djatmiko (NICTA & UNSW), Dominik Schatzmann (ETH Zurich), Xenofontas Dimitropoulos (ETH Zurich), Arik Friedman (NICTA), Roksana Boreli (NICTA & UNSW)

The paper is motivated by Internet outages, which have significant financial and reputational impact. Prior work relies on passive control-plane measurements using BGP data (which suffer from false positives), active measurements (which face a tradeoff between overhead and detection granularity), or passive data-plane measurements (which avoid the aforementioned shortcomings but raise privacy concerns).

The proposed scheme relies on passive data-plane measurements and aims to alleviate the privacy concerns. The authors propose using secure multi-party computation (MPC), a cryptographic technique that lets the participating domains compute connectivity jointly without revealing their individual inputs to one another. (Slide malfunction during presentation)
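To make the MPC idea concrete, here is a minimal sketch of additive secret sharing, one simple MPC building block. It is not necessarily the protocol the authors use; the modulus, the function names, and the "one bit per domain" input are all assumptions for illustration. Each domain holds a private bit (1 if it saw working flows to some remote destination), and only the total count is revealed.

```python
import secrets

# Assumed modulus; any value larger than the largest possible sum works.
PRIME = 2_147_483_647

def make_shares(value, n_parties):
    """Split value into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def aggregate(private_inputs):
    """Return the sum of all inputs; no party sees another party's raw input."""
    n = len(private_inputs)
    # Party i deals one share of its input to every party j.
    dealt = [make_shares(v, n) for v in private_inputs]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [sum(dealt[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partial_sums) % PRIME
```

For example, `aggregate([1, 0, 1, 1])` tells the group that three of four domains can reach the destination, without saying which ones.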

The authors present a case study for evaluation.

Q: You focus on outages (which is a binary performance problem). Can you use this scheme for fine-grained performance evaluation?

A: Yes, that is possible future work.

Q: Does the solution work in real-time? Does it scale for the whole Internet?

A: We have only conducted small-scale evaluations so far. It may be challenging to scale it to a large number of domains.

Q: What information are you trying to protect? Are there privacy concerns for connectivity information?

A: Yes, it can be sensitive. For example, access to pornography is likely something users want to keep private.

CoDef: Collaborative Defense Against Large-Scale Link-Flooding Attacks

Presenter: Min Suk Kang

Authors: Soo Bum Lee (Carnegie Mellon University), Min Suk Kang (Carnegie Mellon University), Virgil D. Gligor (Carnegie Mellon University)

Traditional DDoS attacks target specific endpoints or servers. However, in recent years we have seen several attacks geared towards specific links, instead of a large number of hosts. Traditional flow filtering schemes are susceptible to these attacks because attack flows (which are typically low-rate, have diverse source/destination addresses, and are protocol conforming) are often indistinguishable from benign flows.

The proposed scheme (called CoDef) relies on collaboration among ASes. Attack source and target ASes are generally motivated to collaborate to curb this attack. CoDef uses collaborative rerouting, in which the target AS asks neighboring ASes to reroute traffic via other paths, essentially dispersing attack and benign traffic. If the attacker is aware of the reroute and chooses to re-launch the attack by creating new flows, the attacker will be identified.

After collaborative rerouting, CoDef uses collaborative rate-control and path pinning (which were not discussed during the presentation). The evaluation was conducted using topology data from CAIDA. CoDef does not require changes to BGP or OSPF.

Q: Can CoDef identify the attack source inside the attack AS?

A: No, CoDef would notify the AS owner/ISP.

Q: What is the cost of the routing changes employed by CoDef? Someone could abuse the system with false collaborative rerouting advertisements; how does CoDef handle that?

A: We envision that CoDef will be a premium service. The cost of the service would deter abuse.

RiskRoute: A Framework for Mitigating Network Outage Threats

Presenter: Ramakrishnan Durairajan

Authors: Brian Eriksson (Technicolor Research), Ramakrishnan Durairajan (University of Wisconsin - Madison), Paul Barford (University of Wisconsin - Madison)

Network outages happen for a wide variety of reasons, for example censorship, cable cuts, or natural disasters. The paper presents a framework for proactively mitigating network outages due to natural disasters, in particular weather-related network outages. The key idea is that weather-related events follow predictable geographical and temporal patterns; therefore, they can be predicted before they occur.

The authors propose a new metric called "bit-risk miles" that quantifies the sensitivity of a path to weather-related disasters and makes it possible to study the tradeoff between shorter paths and outage risk. The framework forecasts the outage probability at PoP locations and selects a path from a set of possible paths.
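The tradeoff between path length and outage risk can be sketched as follows. The scoring formula here is an assumption made for illustration, not the paper's definition of bit-risk miles; the function names, the weight `lam`, and the PoP names in the usage note are all hypothetical.

```python
def bit_risk_miles(path, risk, lam=1.0):
    """Score a path: each hop's mileage is inflated by the outage risk of its PoP.

    path: list of (pop, miles_to_reach_pop) hops.
    risk: forecast outage probability per PoP (missing PoPs count as 0).
    lam:  assumed weight; 0 gives plain shortest-path, larger is more risk averse.
    """
    return sum(miles * (1.0 + lam * risk.get(pop, 0.0)) for pop, miles in path)

def pick_path(paths, risk, lam=1.0):
    """Select the candidate path with the lowest score."""
    return min(paths, key=lambda p: bit_risk_miles(p, risk, lam))
```

With a short path through a PoP that has a 0.9 outage forecast and a longer path that avoids it, `lam=0` picks the short path, while a large `lam` switches to the longer, safer one.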

The evaluation is conducted using FEMA/NOAA weather data and real-world routing data from 16 regional networks. The results show that the selected routes differ significantly from shortest-path routes and are more risk averse. The framework also guides new intra-domain routes and new peering relationships for inter-domain routing. The presentation concluded with a video demo for hurricanes Irene and Katrina.

Q: In your evaluation, do you also take into account traffic volume?

A: No, we only accounted for link count.

Q: Does the framework work in real-time?

A: Yes, it does.

Q: Can you implement your framework in real routers? What changes would they require?

A: It will require some changes. Detailed analysis is left as future work.

On the Benefits of Applying Experimental Design to Improve Multipath TCP

Presenter: Christoph Paasch
Authors: Christoph Paasch (UCLouvain), Ramin Khalili (T-Labs/TU-Berlin), Olivier Bonaventure (UCLouvain)

Although MPTCP has been around for a long time, evaluating its performance in practical scenarios has not received much attention. The authors therefore investigate the performance gains of using Multipath TCP. They study the effect of different environmental parameters on MPTCP performance, quantify the gains from using MPTCP, and highlight the cases in which MPTCP does not achieve the expected performance. In addition, they propose some solutions and evaluate their effect. In conclusion, the authors built an evaluation environment that can be adopted to measure the performance of multipath TCP schemes.

Q: What makes the aggregation bandwidth equal to 1?
A: When MPTCP achieves a throughput equal to the sum of the throughputs that can be achieved using each interface separately.
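
One normalization that is consistent with this answer is shown below. The notes do not give the formula, so treat it as an assumption: g is the MPTCP goodput, C_i is the throughput achievable over interface i alone, and C_max is the largest C_i.

```latex
\[
\text{Ben} =
\begin{cases}
\dfrac{g - C_{\max}}{\sum_{i} C_i - C_{\max}}, & g \ge C_{\max},\\[2ex]
\dfrac{g - C_{\max}}{C_{\max}}, & g < C_{\max}.
\end{cases}
\]
```

Under this definition the value is 1 when g equals the sum of the per-interface throughputs, 0 when MPTCP only matches the best single interface, and negative when it does worse than that.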

DomainFlow: Practical Flow Management Method using Multiple Flow Tables in Commodity Switches

Presenter: Yukihiro Nakagawa
Authors: Yukihiro Nakagawa (Fujitsu Laboratories Ltd.), Kazuki Hyoudou (Fujitsu Laboratories Ltd.), Chunghan Lee (Fujitsu Laboratories Ltd.), Shinji Kobayashi (Fujitsu Laboratories Ltd.), Osamu Shiraki (Fujitsu Laboratories Ltd.), Takeshi Shimizu (Fujitsu Laboratories Ltd.)

The demand for bandwidth in data centers is increasing dramatically, so a scalable, high-bandwidth network is essential. Much prior work tries to enhance the physical layer of the switches in order to improve their performance. Unfortunately, this introduces a lot of overhead in terms of controlling the switches. Therefore, the authors propose DomainFlow, a practical flow management method based on OpenFlow concepts and switches. They apply network virtualization approaches so that administrators/customers can easily control the system. One of the main gains of this virtualization is the ability to use the multiple paths between the source and destination seamlessly. The authors prototyped their system and measured its performance, efficiency and controllability.

Q: Do you have to choose between WRT and ANT, or can you use them together?

A: It is possible but needs some modifications to the system.

Thursday, December 12, 2013

DOMINO: Relative Scheduling in Enterprise Wireless LANs

Presenter: Wenjie Zhou
Authors: Wenjie Zhou (Ohio State University), Dong Li (Ohio State University), Kannan Srinivasan (Ohio State University), Prasun Sinha (Ohio State University)

This work mainly focuses on solving the channel access challenges in enterprise wireless networks. Although the current Wi-Fi Distributed Coordination Function (DCF) is simple and robust, it suffers from the hidden terminal problem as well as efficiency issues. Other work in the literature has tried to solve these issues but has major problems of its own, such as inefficiency, lack of robustness, or an inability to use the channel to its maximum. Therefore, the authors developed DOMINO, a centralized channel access mechanism. DOMINO is able to detect hidden terminals and avoid the hidden terminal problem efficiently. In addition, DOMINO achieves relatively high throughput without requiring highly accurate clock synchronization, which it does by using a relative scheduling approach. The authors implemented their scheme on USRPs to evaluate its performance, and they evaluated it further using simulation.

Q: Is this Wireless g or n compatible?
A: Wireless g.

Q: What is the overhead of relative scheduling?
A: Sending the signature at the end of the transmission.

Is There a Case for Mobile Phone Content Pre-staging

Presenter: Alessandro Finamore
Authors: Alessandro Finamore (Politecnico di Torino), Marco Mellia (Politecnico di Torino), Zafar Gilani (Universitat Politecnica de Catalunya), Kostantina Papagiannaki (Telefonica Research), Yan Grunenberger (Telefonica Research), Vijay Erramilli (Telefonica Research)

The authors propose a novel technique to implement content pre-staging (caching) in mobile networks, specifically by pushing content onto the user devices.
A new component is introduced: the content bundler. It is installed on the ISP side, where it classifies the traffic and identifies the most popular items in order to bundle the set of content that will be pushed to the mobile terminals.
They study a 1-day trace (HTTP log) from a big metropolitan city to evaluate the performance of the scheme. It turns out that popularity is a good trigger for pre-staging.
Different strategies are proposed to bundle content. Among them, the popularity-based one is the most practical, achieving a 7% saving in the volume of data transferred as well as tangible benefits for users.
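
A popularity-based bundler can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the log format (URL and size pairs), the byte budget, and the function name are hypothetical.

```python
from collections import Counter

def build_bundle(requests, max_bytes):
    """Pick the most requested objects that fit in a push budget.

    requests:  iterable of (url, size_bytes) entries from an HTTP log.
    max_bytes: assumed per-device budget for pre-staged content.
    Returns the URLs to push, most popular first.
    """
    hits = Counter()
    sizes = {}
    for url, size in requests:
        hits[url] += 1
        sizes[url] = size
    bundle, used = [], 0
    for url, _count in hits.most_common():
        # Greedy fill: skip objects that would exceed the budget.
        if used + sizes[url] <= max_bytes:
            bundle.append(url)
            used += sizes[url]
    return bundle
```

Re-running the function on each hour of the log, instead of the whole day, gives the hour-by-hour bundling discussed in the Q&A.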

Q: It seems to be a macro network optimization; is it targeted at big events?
A: Yes, but not only. For a big event, I would say that if we push the bundler to the BST level we may achieve suboptimal performance.

Q: It seems that you determine the bundle on a 24h basis; what about doing it hour by hour?
A: This is exactly what we have done.

Staying Online While Mobile: The Hidden Costs

Authors: Andrius Aucinas, Narseo Vallina-Rodriguez (University of Cambridge), Yan Grunenberger, Vijay Erramilli (Telefonica Research), Konstantina Papagiannaki (Telefonica Research), Jon Crowcroft (University of Cambridge), David Wetherall (University of Washington)

Presenter: Andrius Aucinas

Motivation and introduction:
The authors study the energy and network costs of mobile applications that require a continuous online connection (e.g. WhatsApp, Facebook, Skype). They find that idle online presence drains the phone battery nine times faster, which can be explained by the high frequency of TCP keep-alive messages as well as the cross-layer interaction of TCP and cellular network protocols. The authors propose solving this problem with a two-way push notification system, in which a network-aware sender sends messages at low frequency and low volume.

Measurements and Results:
The authors develop a tool, which they call Rilanalyzer, and use it to perform energy measurements on mobile devices. They show that applications that require online presence force the RRC state machine to stay in a high energy consumption mode; the short messages used to keep TCP connections alive have a very high hidden energy cost. This happens because mobile platforms' push APIs are not sufficient for all applications and there is no common mechanism that lets applications maintain online presence, so application developers prefer to use long-lived TCP connections.

Why you should read this paper:

The authors show how apps such as WhatsApp and Skype have a great impact on the battery life of mobile devices. They also identify the reasons behind the large hidden cost of these apps and propose novel ways to address them.

Questions & Answers:

Q1: How do you compute the impact of background applications on battery life?

Answer: We look at how much time the application keeps the RRC state machine in the high-energy state.

Q2: Is there something else that can be done, apart from using the push mechanism?

Answer: We must look at what the applications need in order to find more ways to improve their energy consumption.

RFID Shakables: Pairing Radio-Frequency Identification Tags with the Help of Gesture Recognition

Presenter: Giorgio Corbellini
Authors: Lito Kriara (Disney Research Zurich, Switzerland), Matthew Alsup (Disney Research Pittsburgh, PA), Giorgio Corbellini (Disney Research Zurich, Switzerland), Matthew Trotter (Disney Research Pittsburgh, PA), Joshua D. Griffin (Disney Research Pittsburgh, PA), Stefan Mangold (Disney Research Zurich, Switzerland)

In IoT there is a problem of associating and disassociating devices, which requires pairing.
In the proposed scenario, detecting a gesture made with the objects triggers their pairing; the objects in this case are toys.
The authors build a prototype that recognizes different gestures, namely circles and lines, which correspond to association and disassociation respectively.
The results, in terms of payoff and correlation, demonstrate the validity of this solution, which can ease the procedures of association and disassociation between RFID devices.

Q: Is this specific to RFID? Can it be applied to NFC?
A: It applies to every kind of communication.

Q: Have you considered other applications?
A: We have not, but this can be applied to everything related to IoT.

Towards a SPDY’ier Mobile Web?

Authors: Jeffrey Erman, Vijay Gopalakrishnan, Rittwik Jana, K.K. Ramakrishnan (AT&T Labs – Research)
Presenter: Jeffrey Erman

Motivation and introduction:
The authors compare the performance of HTTP and SPDY, an open networking protocol developed primarily at Google for transporting web content. After a 4-month measurement-driven analysis, they conclude that SPDY does not outperform HTTP over cellular networks. According to the authors, this is explained by the adverse interaction between TCP and the RRC state machine.

Background:
Since HTTP connections are typically short and exchange small objects, TCP does not have sufficient time to utilize the full network capacity. SPDY tries to solve this problem by opening one connection per domain; multiple data streams are multiplexed over this single TCP connection for efficiency.

Measurements and Results:
The authors measure the Page Load Time (PLT) for 20 popular websites, for both HTTP and SPDY, and show that for the majority of the websites there is no significant difference between SPDY and HTTP. This holds only for 3G/4G; over Wi-Fi, SPDY is on average 56% faster than HTTP. They show that over 3G/4G, HTTP achieves much higher throughput because it uses more TCP connections, and multiple TCP connections cope better than a single one with the high fluctuation and multiple retransmissions that result from how TCP and RRC interact.

Why you should read this paper:
SPDY is a promising new protocol for improving mobile browsing, and this paper presents an in-depth comparison with the existing web-browsing protocol (HTTP). The authors also show, in great detail, where most of the delay in web browsing comes from.

Q&A:
Q1: SPDY does a lot of content prioritization in order to reduce the perceived delay. Did you try to compare the user's perceived delay between HTTP and SPDY?
A: No, but we found that SPDY does not request all objects at once; it goes through multiple rounds.

Q2: How can we design websites so that they are more compatible with SPDY?
A: We need to take a fresh look at it. Objects that have some dependency should come together, instead of in multiple rounds.

Q3: What changes would improve the delay of mobile pages? Would re-designing web browsers improve delay?
A: Changes in web pages and browsers would help, but changes in the cross-layer interaction between TCP and RRC would bring the most benefit.

Socket Intents: Leveraging Application Awareness for Multi-Access Connectivity

Presenter: Philipp S. Schmidt
Authors: Philipp S. Schmidt (TU Berlin / Telekom Innovation Laboratories), Theresa Enghardt (TU Berlin / Telekom Innovation Laboratories), Ramin Khalili (TU Berlin / Telekom Innovation Laboratories), Anja Feldmann (TU Berlin / Telekom Innovation Laboratories)

Many hosts (e.g., smartphones) have access to multiple networks with very different characteristics. The question is: how do you choose an interface for a particular application?

Currently, applications can only express the destination at the socket API; in reality, applications know more relevant information than the socket API lets them express. The authors present "socket intents," which allow the application to specify attributes like flow category, file size, sensitivity to timeliness, bitrate, duration, and resilience.

The socket API communicates these attributes to a policy module, which returns the appropriate interface to use for the connection.
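
The interaction between intents and a policy module might look like the sketch below. The class names, attribute names, category labels, and the 1 MB threshold are all hypothetical, and the policy shown is just one simple possibility, not the authors' API or policy.

```python
from dataclasses import dataclass

@dataclass
class Intent:
    category: str = "query"   # assumed labels: "query", "bulk", "stream"
    file_size: int = 0        # expected transfer size in bytes, 0 if unknown

@dataclass
class Interface:
    name: str
    bandwidth_mbps: float
    rtt_ms: float

def choose_interface(intent, interfaces):
    """Toy policy: large or bulk transfers want bandwidth, everything else wants low RTT."""
    if intent.category == "bulk" or intent.file_size > 1_000_000:
        return max(interfaces, key=lambda i: i.bandwidth_mbps)
    return min(interfaces, key=lambda i: i.rtt_ms)
```

A real policy would also have to weigh cost and current load, which is exactly the open question raised in the Q&A.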

Q: Cost is important to users --- how do you factor the cost of using a particular interface into your decision?
A: That's an important question --- it's difficult to quantify how much a faster download is "worth." We're still considering this.

Q: Have you thought at all about determining flow attributes automatically in case the developer was too lazy to provide them?
A: We have not, but it would be an interesting future direction.

Low latency via redundancy

Presenter: Ashish Vulimiri
Authors: Ashish Vulimiri (UIUC), Brighten Godfrey (UIUC), Radhika Mittal (UC Berkeley), Justine Sherry (UC Berkeley), Sylvia Ratnasamy (UC Berkeley), Scott Shenker (UC Berkeley and ICSI)

The authors explore the use of redundancy to lower latency. The idea is to send multiple requests and use the first response; this can tame tail latency by avoiding failed/slow servers.
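
The "send copies, use the first response" idea can be sketched with asyncio. This is a generic illustration rather than the authors' code; `first_response`, `make_request`, and `fake_lookup` are hypothetical names, and the fake lookup simply sleeps to simulate server latency.

```python
import asyncio

async def first_response(make_request, servers):
    """Issue one copy of the request per server and return the first successful reply."""
    tasks = [asyncio.create_task(make_request(s)) for s in servers]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except Exception:
                continue  # this copy failed; wait for the next one to finish
        raise RuntimeError("all copies failed")
    finally:
        # Stop the copies that are still outstanding.
        for task in tasks:
            task.cancel()

async def fake_lookup(server):
    """Stand-in for a DNS query: server is a (delay_seconds, answer) pair."""
    delay, answer = server
    await asyncio.sleep(delay)
    return answer
```

Calling `asyncio.run(first_response(fake_lookup, [(0.2, "slow"), (0.01, "fast")]))` returns the reply from the faster server; the extra load from the redundant copy is the cost discussed later in the talk.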

They perform case studies:

1) DNS. Experiments on PlanetLab. Clients issue multiple copies (to different servers) of each DNS query. 99th percentile response time improves by 0.44 seconds when contacting just 2(?) servers. Improvement is larger in the tail than the mean.

2) Distributed K-V store. Experiments on EC2 and Emulab. 2 copies --> improves mean by 1.5x, 99th percentile by 2.2x on Emulab. Even bigger improvement on EC2!

3) Memcached on Emulab: redundancy didn't help because variance in response time was very small to begin with.

Next the authors step back and propose a model for distributed applications to help decide whether or not redundancy will help a particular system. (You need to balance two factors: is the decrease in variability worth the increase in load?)

Another interesting result: in data centers where switches can support QoS, marking redundant packets as low priority completely eliminates the negative effects of the increased load.

Q: Redundancy only helps if system is over-provisioned/under-utilized. What real-world systems are over-provisioned?
A: Many data center applications, e.g., K-V stores, see bursty workloads. During bursts, you can throttle back the redundancy.

Q: You've used redundancy for DNS --- could you use it for web browsing?
A: It's hard to say without a root cause analysis of latency in web browsing. If the latency is caused by full buffers in the network, this would only work if you had, e.g., geographically distributed web servers.

Q: What if you get different answers? What if response quality is more important than latency?
A: Yes, we're working on quantifying this. This could certainly be the case for DNS. Informally, what we've seen suggests that this is only a problem for a small number of web sites.

Q: How could the client decide automatically how many copies of a request to send?
A: We don't have an answer at this point. If the server could tell the client its current load, that could help. It's difficult to do with only client-side measurements --- it's an interesting problem.

Q: Are you concerned about duplicate requests increasing system variability?
A: It depends very much on the specifics of the system --- I don't think there's a general answer, at least not one I could give you at this point.

CoNEXT'13: Inferring Multilateral Peering

Authors: Vasileios Giotsas, Shi Zhou, Matthew Luckie, KC claffy

Presented By: Vasileios Giotsas

An accurate Internet Autonomous System (AS) topology is necessary for a wide range of research problems. AS topology data sources include BGP data, traceroute data, and Internet Routing Registries. The available topology datasets are known to be incomplete: there can be hidden links (backup / regional transit links) or invisible links (peering links etc.). The discovery of peering links at IXPs is key to obtaining complete AS connectivity. (An IXP provides a physical infrastructure to facilitate the establishment of peering interconnections.)

This paper addresses the AS topology incompleteness problem described above. It proposes and implements a new approach to infer multilateral peering links using only public data sources. Using new techniques to mine IXP route server data with a mapping of BGP community values, this work inferred 206K p2p links from 13 large European IXPs, four times more p2p links than are directly observable in public BGP data. The approach uses only existing BGP data sources and requires only a few active queries of Looking Glass servers, which facilitates reproducibility of the results.

Limitations: Where the method can’t be applied.

- IXPs without Route Servers.

- Route Servers that do not use BGP Communities for advertisement.

- Route Servers that strip out BGP Communities before propagating advertisements.

The entire study is based on European IXPs, but they plan to soon extend it over N. America and Asia-Pacific regions as well.

Q: How hard is it to get peering data?

A: The data has to be collected in a distributed manner. It is labour intensive and requires data cleaning. These IXPs are some of the world’s largest.

Q: How static/dynamic were the links?

A: There might be very frequent updates to the topology. The study is one snapshot carried out over several days. If a single link leaves (is taken out, etc.), all paths change. The dynamics would be even greater for the bilateral case (N(N-1)/2 connections) and might be a bit smaller for multilateral peering.
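
To put the bilateral figure in perspective, here is a small worked comparison. It assumes a full bilateral mesh among N IXP members on one side and, on the other, a single BGP session from each member to one route server; N = 100 is an arbitrary example value.

```latex
\[
\text{bilateral sessions} = \binom{N}{2} = \frac{N(N-1)}{2},
\qquad
\text{sessions via a route server} = N
\]
\[
N = 100:\quad \frac{100 \cdot 99}{2} = 4950 \ \text{versus}\ 100
\]
```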

Understanding Tradeoffs in Incremental Deployment of New Network Architectures

Presenter: Matthew K. Mukerjee
Authors: Matthew K. Mukerjee (Carnegie Mellon University), Dongsu Han (KAIST), Srini Seshan (Carnegie Mellon University), Peter Steenkiste (Carnegie Mellon University)

Deploying a new network architecture is difficult; after more than a decade of effort, we still haven't made the transition from IPv4 to IPv6.

The authors explore what it means to incrementally deploy a network architecture (that is, a new layer 3 protocol). They provide a framework for talking about incremental deployability by breaking the task into four parts:

Picking an egress router from the source network.
Picking an ingress router into the destination network.
Getting to the egress router.
Getting to the ingress router.

Using this framework, they compare 2 IPv6 deployment techniques (static tunnels and address mapping). Then they introduce two new deployment techniques made possible by recent innovations from the networking research community.

In summary: if incremental deployment is so easy ("just make tunnels!"), then why aren't we using IPv6 already? To realistically deploy a new layer 3 protocol, we must think carefully about how to do it, and the authors provide a framework for doing that.

Q: Are you concerned about n^2 at controllers?

A: Not really; if it's a problem, they can push the state to clients and then forget it.

Wednesday, December 11, 2013

CoNEXT'13: An Automated System for Emulated Network Experimentation

Authors: Simon Knight, Hung X Nguyen, Iain Phillips, Olaf Maennel, Randy Bush, Nickolas Falkner, Matthew Roughan

Presented By: Simon Knight

It is time consuming, tedious and error prone to set up and configure large-scale test networks. This paper presents a system to facilitate emulation by providing translation from a high-level network design into a concrete set of configurations that are automatically deployed into one of several emulation platforms.

The system allows a user to specify the network topology at a high level, in a standard graph exchange format. This allows the use of a GUI-based graph editor, such as yEd. In these graphs, nodes correspond to routers, with edges indicating their connectivity. Node and edge attributes allow hostnames, AS numbers, and link weights to be specified. From these, IGP and BGP topologies can be inferred.
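
As a rough illustration of how protocol topologies can be inferred from such an annotated graph, the sketch below uses one plausible rule: links inside an AS carry the IGP, and links between ASes carry eBGP sessions. The rule, the function name, and the data layout are assumptions for illustration; the notes do not spell out the system's actual design rules.

```python
def split_topologies(asn, edges):
    """Derive IGP and eBGP adjacencies from a physical graph.

    asn:   {router_name: AS number} taken from node attributes.
    edges: iterable of (router_a, router_b) physical links.
    Returns (igp_links, ebgp_links).
    """
    igp_links, ebgp_links = [], []
    for a, b in edges:
        if asn[a] == asn[b]:
            igp_links.append((a, b))    # same AS: run the IGP on this link
        else:
            ebgp_links.append((a, b))   # different ASes: eBGP session
    return igp_links, ebgp_links
```

For a three-router graph with r1 and r2 in AS 1 and r3 in AS 2, the r1-r2 link ends up in the IGP topology and the r2-r3 link in the eBGP topology.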

- Reduces configuration burden.

- Reduces the expense and inconvenience of real hardware

- Use of abstraction, graphs, and templates provides manageability and scalability

- Decouples network-level design from device-level configuration state.

There was an interesting demo video showing the entire flow, from drawing a network topology in the yEd editor to having the emulated network up and running. The system is built in Python. The code is available on GitHub, and all relevant details and links are here: autonetkit.org

Q: Can you take a Cisco image and run it in the VM environment?

A: You can generate IOS config files and run them.

Q: There can be a lot of constraints, limited number of ports, number of rules you handle etc. How do you take care of that?

A: You design the network, do sanity checks for the constraints before you run it.

Minimizing Network Complexity Through Integrated Top-Down Design

Presenter: Xin Sun
Authors: Xin Sun (Florida International University), Geoffrey Xie (Naval Postgraduate School)

The authors present a novel integrated design methodology to minimize network complexity.

E.g.: Reachability control: You want only research and engineering teams to have access to a server, but not the sales team. If subnets are based on physical location, your filter needs to have one entry per host. Easy to make mistakes, difficult to maintain. If subnets are assigned based on team, the filter needs only 3 entries: one per team. BUT: how you assign subnets *also* impacts how complicated it is to configure your VLAN.
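
The difference in filter size can be made concrete with a small sketch. The host list, the rule format, and the function name are hypothetical; the point is only that per-host rules grow with the number of hosts while per-team rules grow with the number of teams.

```python
def build_filter(hosts, allowed_teams, subnet_by):
    """Return the filter entries needed to protect the server.

    hosts:         list of (host_name, team) pairs.
    allowed_teams: set of teams that may reach the server.
    subnet_by:     "team" if each team has its own subnet, else "location".
    """
    if subnet_by == "team":
        # One entry per team subnet.
        teams = sorted({team for _name, team in hosts})
        return [("permit" if t in allowed_teams else "deny", f"subnet:{t}") for t in teams]
    # Location-based subnets mix teams, so every host needs its own entry.
    return [("permit" if team in allowed_teams else "deny", f"host:{name}")
            for name, team in hosts]
```

With six hosts spread over three teams, the team-based plan needs 3 entries and the location-based plan needs 6, and the gap widens as hosts are added.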

Takeaway from this example: if you don't consider all aspects of your network at once (e.g., subnets, VLANs, and filters), a design that simplifies one might make the others unnecessarily complex.

As a proof of concept, the authors present an algorithm for designing a local network that minimizes the number of filter rules and VLAN trunk ports subject to constraints (e.g., max num VLANs, max num rules, reachability policy must be enforced correctly).

To evaluate their work, they compare the network designed by their algorithm to the two naive design heuristics: group hosts by location or group hosts by team.

Q: Does the order in which you optimize aspects of your networks matter?
A: In our example, the stages are independent of one another. In the general case,

Q: How does your algorithm interact w/ routing changes?
A: We assume filters are placed at the edge, so we could ignore routing. If filters are in the middle, the routing could also be a factor in the design process. Future work.

Silent TCP Connection Closure for Cellular Networks

Presenter: Feng Qian

Authors: Feng Qian (AT&T Labs - Research), Subhabrata Sen (AT&T Labs - Research), Oliver Spatscheck (AT&T Labs - Research)

Application-layer behavior often has unexpected consequences for the radio.
For example, closing a connection at the transport layer can waste energy because of the UMTS state machine.
There is a tail of energy waste due to this state machine.
The power consumption was measured with a power meter under different conditions.
TCP connection usage is studied, and a large power waste is observed due to the tradeoff between early closure and the need to reuse TCP connections.
The authors propose STC, silent TCP connection closure. Both sides exchange closure timer information when the connection is set up and negotiate on it, so they can close the connection without exchanging any packets! In this way the long tail of energy waste is avoided!
Of course, this scheme requires changes to TCP and an API for upper layers. It can be deployed incrementally using compatible proxies.
A trace-driven evaluation is presented, showing an 11.3% radio energy saving and a 6% signalling load reduction.
A real Android implementation is work in progress.
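
The silent closure described above can be sketched as a toy simulation. The negotiation rule (take the smaller of the two proposed timers) and all names are assumptions for illustration; the notes do not say how the two sides actually agree on the timer.

```python
def negotiate_timeout(client_timeout_s, server_timeout_s):
    """Assumed rule: both sides settle on the smaller proposed idle timer."""
    return min(client_timeout_s, server_timeout_s)

class Endpoint:
    """One side of a connection that closes itself after the agreed idle time."""

    def __init__(self, agreed_timeout_s):
        self.timeout = agreed_timeout_s
        self.last_activity = 0.0
        self.open = True
        self.close_packets_sent = 0   # stays 0: closure needs no FIN exchange

    def on_data(self, now):
        self.last_activity = now

    def tick(self, now):
        if self.open and now - self.last_activity >= self.timeout:
            self.open = False         # silent close, nothing goes on the air
```

Because both endpoints run the same timer from the same last data exchange, they reach the closed state at about the same time without waking the radio.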

Q: What about HTTP 1.0?
A: If you use HTTP 1.0, there is no need for our proposal.

Q: How does the proxy know the timeout on the server?
A: It can notify it.

Q: What are the implications of using SPDY?
A: STC is applicable to SPDY as well.

Q: What about fast dormancy?
A: It was disabled because of performance issues.

Q: Could the protection period cause packet loss?
A: No, it is specifically designed to avoid that, and we do not expect losses.

Q: Why is Google forcing the connection to stay alive for 5s?
A: To reuse the TCP connection and carry multiple objects, reducing latency.

Q: Security?
A: I do not see any security problem with our solution.

Capturing Mobile Experience in the Wild: A Tale of Two Apps

Presenter: Ashish Patro
Authors: Ashish Patro (University of Wisconsin Madison), Shravan Rayanchu (University of Wisconsin Madison), Michael Griepentrog (University of Wisconsin Madison), Yadi Ma (University of Wisconsin Madison), Suman Banerjee (University of Wisconsin Madison)

The authors study the deployment and usage of mobile applications in order to understand the factors that impact the application experience. How can a developer capture that across all users?

The authors propose to use the application itself as a vantage point. Developers can include a toolkit (including a library) that provides an API to collect contextual measurements, device and user information, and network state.

They study 2 applications: an MMORPG named Parallel Kingdom (PK), and StudyBlue (SB).
For example, the impact of the device on user interactivity (user actions/time) is studied, showing that interactivity increases with screen size. Moreover, battery consumption was higher on cellular than on Wi-Fi, and it also affects session length, which was shorter with higher battery consumption.

Furthermore, the impact of network performance is studied with the toolkit, in terms of latency and cellular usage.
For example, interactivity is sensitive to latency and decreases as latency increases.

Other measurements are shown along with their impact on application usage, including revenue! This new toolkit can give great insight to developers! Check it out.

Q: Is it possible to colocate the toolkit server with application server?
A: Yes, we can colocate.

Q: How does this toolkit impact the application in terms of energy, latency, etc.?
A: We benchmarked that; you can find it in the paper. However, we are not really adding anything that harms application performance.

Q: Different platforms isolate different resources. How do you try to access them?
A: We did not have access to fine-grained/low-level measurements.

3GOL: Power-boosting ADSL using 3G OnLoading

Presenter: Vijay Erramilli
Authors: Claudio Rossi (Politecnico di Torino), Narseo Vallina-Rodriguez (Univ. of Cambridge), Vijay Erramilli (Telefonica Research), Yan Grunenberger (Telefonica Research), Laszlo Gyarmati (Telefonica Research), Nikolaos Laoutaris (Telefonica Research), Rade Stanojevic (Telefonica Research), Konstantina Papagiannaki (Telefonica Research), Pablo Rodriguez (Telefonica Research)

The authors propose a very innovative idea: selectively onload a part of the traffic from the wired network onto the wireless network in order to overcome slow ADSL connections at home.
Yes, you understood correctly! The authors propose to onload rather than offload data to the cellular network!

Of course, there are many challenges to be faced, and the authors demonstrate the feasibility of the proposed solution by measuring the capacity available on the cellular network, by trace-driven analysis, and by studying the problem of data caps on cellular data connections.

A fully application-level prototype is implemented and evaluated in the wild in a residential environment with two bandwidth-hungry applications: VoD and picture upload. Up to a 4x speedup is obtained for downlink and up to a 6x speedup for uplink, which demonstrates the great potential of this proposal, which is fully over-the-top. The solution is also compared with MPTCP, achieving better results.
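To get a feel for where such speedups come from and how a data cap limits them, here is a minimal back-of-the-envelope sketch. It is not the authors' scheduler: the function name, the proportional split, and all the numbers are assumptions made purely for illustration.

```python
def split_transfer(size_mb, adsl_mbps, cell_mbps, cap_left_mb):
    """Split one transfer across ADSL and 3G (hypothetical helper).

    Assumes both paths have a positive, constant bandwidth and run in parallel.
    Returns (MB sent over 3G, MB sent over ADSL, speedup vs. ADSL alone).
    """
    # Ideal share for 3G is proportional to its bandwidth, so both paths
    # would finish at the same time.
    ideal_cell_mb = size_mb * cell_mbps / (adsl_mbps + cell_mbps)
    # The remaining data-cap budget bounds what may be onloaded.
    cell_mb = min(ideal_cell_mb, cap_left_mb)
    adsl_mb = size_mb - cell_mb
    # Completion time is set by the slower of the two parallel paths.
    time_s = max(adsl_mb * 8 / adsl_mbps, cell_mb * 8 / cell_mbps)
    baseline_s = size_mb * 8 / adsl_mbps
    return cell_mb, adsl_mb, baseline_s / time_s


# Made-up example: 100 MB transfer, 2 Mbps ADSL, 6 Mbps of spare 3G capacity.
print(split_transfer(100, 2, 6, cap_left_mb=1000))  # cap not binding
print(split_transfer(100, 2, 6, cap_left_mb=25))    # cap limits the onloaded share
```

With a generous cap the speedup approaches (ADSL + 3G) / ADSL; once the cap binds, the ADSL leg dominates and the gain shrinks quickly.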

However, a network integrated solution is needed to apply this proposal at scale without harming the existing traffic on the cellular network.

This very novel and challenging idea is worth a read!

Robust Assessment of Changes in Cellular Networks

Presenter: Ajay Mahimkar
Authors: Ajay Mahimkar (AT&T Labs - Research), Zihui Ge (AT&T Labs - Research), Jennifer Yates (AT&T Labs - Research), Chris Hristov (AT&T Mobility Services), Vincent Cordaro (AT&T Mobility Services), Shane Smith (AT&T Mobility Services), Jing Xu (AT&T Mobility Services), Mark Stockert (AT&T Mobility Services)

The changes in question are software upgrades, configuration changes, hardware deployments... The question is how those affect user perception of service quality: accessibility, retainability, throughput, minutes of usage, etc. But no lab can fully replicate the scale, complexity and diversity of operational networks! Measuring these effects also requires taking into account external factors: seasonal changes (leaves on trees!), weather (worst when it coincides with configuration changes), traffic pattern changes, other network events such as outages...

Idea: Litmus - compare performance between study and control group:

study group - network elements where change is implemented
control group - network elements without the change

The talk discusses the methodology for selecting the groups. Spatial regression over the study group and a sampled control group is used to learn the coefficients, and the differences before and after a change are compared. Domain knowledge is used to select the control group: it should be subject to the same external factors and share similar properties with the study group (geographic distance, topological structure of the cellular network, etc.).
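The study/control comparison can be sketched roughly as follows. This is a heavily simplified illustration, not Litmus itself: it assumes a single control series (for example, the average KPI of the control group) and a plain one-variable least-squares fit, and the function names and KPI values are invented.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b * x (x must not be constant)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def estimate_change_impact(study_pre, control_pre, study_post, control_post):
    """Average gap between the observed and the predicted study-group KPI.

    The relationship between study and control group is learned before the
    change, then used to predict what the study group would have looked like
    after the change if nothing had been changed.
    """
    a, b = fit_line(control_pre, study_pre)
    predicted_post = [a + b * c for c in control_post]
    return mean([s - p for s, p in zip(study_post, predicted_post)])


# Made-up KPI samples: an external event degrades both groups after the change.
control_pre, study_pre = [10, 12, 11, 13], [20, 24, 22, 26]
control_post, study_post = [6, 7, 6, 7], [14, 16, 14, 16]
print(estimate_change_impact(study_pre, control_pre, study_post, control_post))
```

In this toy data a study-group-only comparison would see the KPI drop from 23 to 15 on average and blame the change, while the control-adjusted estimate is positive because the control group dropped even more.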

Evaluation: Litmus outperforms study-group-only analysis because of its robustness to external factors. It also outperforms Difference in Differences analysis. Some operational experience: a self-optimizing network feature doing automated load balancing, neighbor discovery, etc. How did it perform during Hurricane Sandy? Both study and control groups were impacted by Sandy (everything went down!), but the study group did better than the control group (faster recovery), so the feature was rolled out network-wide.

Q: There is a way to do A/B testing in offline manner if you log enough data without actually running experiments - do you think it might be applicable?
A: Definitely would be, but the question is how to select the control group - e.g. completely random selection might not work

Q: How do you identify external factors?
A: It is very hard, but with this analysis you don't need to know which external factors are present; their impact is discounted automatically. Identifying the external factors themselves is not feasible without additional information.

SoftCell: Scalable and Flexible Cellular Core Network Architecture

Presenter: Xin Jin
Authors: Xin Jin (Princeton University), Li Erran Li (Bell Labs), Laurent Vanbever (Princeton University), Jennifer Rexford (Princeton University)

Cellular core networks are not flexible, and most of the functionality is implemented at the packet data network gateway (content filtering, app identification, firewalls, etc.). Combining functionality of different components from different vendors is not feasible.

The main question: can we make cellular networks like data center networks? Yes, with SoftCell.

SoftCell runs on commodity hardware and acts as a controller for all the different components. Challenge: scalable support of fine-grained service policies, e.g. combinations of various filters, firewalls, QoS guarantees. Packet classification has to work with millions of flows. To scale it up, packets are classified at the access edge (the base stations) and the result is encoded in the source port / destination port, because classification at the gateway edge does not scale. The classification is then also piggybacked on the destination address/port.
For traffic steering using the policies, forwarding rules are aggregated across multiple dimensions: policy ID, base station ID and user equipment ID. Forwarding is then simple, based on just the tags.
Control plane load can also be a problem, so a Local Agent is used at each base station.
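A toy sketch of the tagging idea, to make the aggregation concrete. The bit widths, the field layout, the rule table and the path names are all assumptions for illustration and are not taken from the paper.

```python
# Assumed layout: a 16-bit port carries the policy tag plus a per-UE flow slot,
# and the low address bits carry the base station ID plus the UE ID.
POLICY_BITS, FLOW_BITS = 6, 10
BS_BITS, UE_BITS = 12, 12


def encode(policy_tag, bs_id, ue_id, flow_slot):
    """Done once at the access edge: pack the classification into the header."""
    assert policy_tag < (1 << POLICY_BITS) and flow_slot < (1 << FLOW_BITS)
    assert bs_id < (1 << BS_BITS) and ue_id < (1 << UE_BITS)
    addr = (bs_id << UE_BITS) | ue_id
    port = (policy_tag << FLOW_BITS) | flow_slot
    return addr, port


def decode(addr, port):
    """Recover (policy_tag, bs_id, ue_id, flow_slot) from the header fields."""
    return (port >> FLOW_BITS,
            addr >> UE_BITS,
            addr & ((1 << UE_BITS) - 1),
            port & ((1 << FLOW_BITS) - 1))


# Hypothetical core-switch table: one rule per (policy tag, base station),
# shared by every UE and flow behind that base station.
RULES = {(3, 17): "firewall-then-transcoder"}


def forward(addr, port):
    policy_tag, bs_id, _ue_id, _flow = decode(addr, port)
    return RULES.get((policy_tag, bs_id), "default-path")


addr, port = encode(policy_tag=3, bs_id=17, ue_id=5, flow_slot=9)
print(decode(addr, port), forward(addr, port))
```

The point of the layout is that core switches never look at individual flows: a single entry keyed on the tag and the base station prefix covers all the UEs behind it.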

Using LTE workload characteristics, each base station handles a few hundred UE arrivals and handoffs, which is sufficiently low for the approach to scale. Commodity switches also have enough memory for policy storage, because there are generally not many policies.

Overall, the SoftCell architecture runs sufficiently well on commodity hardware for existing workloads.

Q: what is fundamental to cellular networks with respect to traffic steering? What is different from
A: Dominant traffic pattern is North-South, between clients and gateways and few gateway switches.

Q: In cellular networks user traffic is tunnelled using GTP, so there is already aggregation - why can you not just use GTP headers?
A: For different policies you want to route to different middleboxes

CoNEXT'13: Explicit Multipath Congestion Control for Data Center Networks

Presenter: Enhuan Dong

Authors: Yu Cao, Mingwei Xu, Xiaoming Fu, Enhuan Dong

There are mainly two types of application demands for high-performance communication in DCNs, and they have conflicting requirements on link buffer occupancy. Two related prior works exist, but both can be improved: LIA does not take into account the tradeoff between throughput and latency, and DCTCP cannot fully utilize the paths in DCNs.

In order to balance throughput with latency in DCNs, the authors developed XMP as a congestion control scheme for MPTCP. They propose the Buffer Occupancy Suppression (BOS) algorithm, which employs the ECN mechanism to control link buffer occupancy. Next, they find the utility function of BOS and then “multi-path-lize” it in the context of MPTCP. Finally, they construct the Traffic Shifting (TraSh) algorithm to couple subflows so as to move the traffic of an MPTCP flow from its more congested paths to less congested ones.

They ran experiments using personal computers. These experiments show that XMP can guarantee fairness and shift traffic. Simulations also show good performance for XMP.

CoNEXT'13: Aspen Trees: Balancing Data Center Fault Tolerance, Scalability and Cost

Speaker: Meg Walraed-Sullivan
Authors: Amin Vahdat, Keith Marzullo

Fault recovery in Data Centers is the topic of this paper.

A single failure can disconnect a large part of the network.

When a switch fails, it takes time for the network to propagate the new updates and reconnect all parts of the network.

Failures happen very frequently (80%), and they are impactful and far-reaching.

They have lasting effects: 10 sec to recover.

One solution is to add extra links, but where can we find extra ports for them?

1. Increase the number of ports and add a double link for each connection, but this solution is expensive.

2. Another way is to build a bigger network: add more switches at the top level and add one more layer of switches. However, more switches make paths longer.

3. Ports can be freed by removing some of the switches' links, making room for more redundant links. This scheme has a scalability problem because it loses some hosts.

Aspen trees:

Multi-rooted tree with extra links at one or more levels (e.g. VL2)

The contribution of this paper is to find the tradeoffs between fault tolerance, scalability and network size.
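To make the scalability side of the tradeoff concrete, here is a rough sketch. The fat-tree host count is the standard formula for a multi-rooted tree built from k-port switches; the assumption that c-fold redundant links at a level divide the host count by c is a simplification for illustration, and the function names are invented.

```python
from math import prod


def fat_tree_hosts(k, levels=3):
    """Hosts supported by a standard fat tree built from k-port switches."""
    return k ** levels // 2 ** (levels - 1)


def hosts_with_redundancy(k, levels, link_copies):
    """Host count when some levels use duplicated links.

    link_copies holds one entry per level that has redundancy, e.g. [2] for
    doubled links at a single level. Assumption: each c-fold duplication
    divides the number of supported hosts by c.
    """
    return fat_tree_hosts(k, levels) // prod(link_copies)


# Example with 16-port switches and 3 levels.
print(fat_tree_hosts(16))                 # no redundancy
print(hosts_with_redundancy(16, 3, [2]))  # doubled links at one level
```

Keeping the host count while adding redundancy then means adding a level to the tree, which is where the extra switches and longer paths come from.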

The evaluated results are compared against OSPF.

Q: There is a lot of work on fault tolerance, so what's new?

A: What's new is a slightly different point in the design space. They add a little bit of latency instead of adding hardware. There are some topologies designed for supercomputers that are like Aspen trees or a subset of them, e.g. link-splitting Aspen trees that double the number of links at all levels. The goal here is to see what the tradeoffs are and what the math is, so that designers don't need to guess the cost of the fault tolerance they want.