Saturday, September 7, 2013

SIGCOMM'13: Expressive Privacy Control with Pseudonyms

Authors: Seungyeop Han, Vincent Liu, Qifan Pu, Simon Peter,
Thomas Anderson, Arvind Krishnamurthy, David Wetherall

The authors have designed a cross-layer architecture that provides users with a pseudonym abstraction. A pseudonym represents a set of activities that the user is comfortable having linked. Each pseudonym gives the illusion of a single machine. The authors are able to provide pseudonyms without modification to the browser, operating system, or network. Note, however, that IP address separation across pseudonyms works only when the destination server uses IPv6 addresses; cookie separation works even with IPv4 servers.

The number of pseudonyms the system can support is limited by the number of IP addresses that can be assigned concurrently to a network interface without performance degradation. For example, the Linux operating system enforces a configurable default limit of 4096 addresses. Each privacy policy results in a different number of generated pseudonyms.
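To make the last point concrete, the following is a minimal sketch of how the number of pseudonyms can vary with the policy. The policy names (`single`, `per_tab`, `per_host`, `per_request`), the trace format, and the `MAX_ADDRESSES` constant are illustrative assumptions, not the policies or code from the paper.

```python
from urllib.parse import urlsplit

# Address limit quoted in the text; treated here as a fixed cap (assumption).
MAX_ADDRESSES = 4096


def count_pseudonyms(policy, trace):
    """Count the distinct pseudonyms a hypothetical policy would create.

    trace is a list of (tab_id, url) pairs, one per request.
    """
    keys = set()
    for index, (tab_id, url) in enumerate(trace):
        host = urlsplit(url).hostname or ""
        if policy == "single":          # one identity for everything
            key = "all"
        elif policy == "per_tab":       # one identity per browser tab
            key = ("tab", tab_id)
        elif policy == "per_host":      # one identity per destination host
            key = ("host", host)
        elif policy == "per_request":   # a fresh identity for every request
            key = ("request", index)
        else:
            raise ValueError(f"unknown policy: {policy}")
        keys.add(key)
    return len(keys)


def fits_address_limit(policy, trace, limit=MAX_ADDRESSES):
    """True if the policy needs no more pseudonyms than available addresses."""
    return count_pseudonyms(policy, trace) <= limit


trace = [
    (1, "http://news.example/a"),
    (1, "http://news.example/b"),
    (2, "http://shop.example/cart"),
    (2, "http://news.example/c"),
]
counts = {p: count_pseudonyms(p, trace)
          for p in ("single", "per_tab", "per_host", "per_request")}
```

A finer-grained policy links fewer activities together but consumes addresses faster, which is where the address limit starts to matter.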

In summary, this paper presents an abstraction called a pseudonym, with which each device, and therefore each user, can control and use many indistinguishable identities. The pseudonym abstraction gives users control over which activities can be linked at remote services and which cannot. The authors have designed a cross-layer architecture that exploits the ample IPv6 address space and provides application-layer mechanisms for management. The design lets users choose expressive policies for controlling the privacy/functionality tradeoff on the web. The proposed prototype system consists of a browser extension and a gateway proxy.
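The idea of pairing each pseudonym with its own IPv6 address and its own cookie state can be sketched as follows. This is only an illustration under stated assumptions: the `PseudonymManager` class, the sequential address assignment, and the documentation prefix `2001:db8::/64` are hypothetical and do not reflect the authors' browser extension or gateway proxy.

```python
import ipaddress
from http.cookies import SimpleCookie


class PseudonymManager:
    """Toy model: each pseudonym owns one IPv6 address and one cookie jar."""

    def __init__(self, prefix, limit=4096):
        self.network = ipaddress.IPv6Network(prefix)
        self.limit = limit          # cap on concurrently assigned addresses
        self.pseudonyms = {}        # name -> {"address": ..., "cookies": ...}

    def get(self, name):
        """Return the pseudonym called name, creating it on first use."""
        if name not in self.pseudonyms:
            if len(self.pseudonyms) >= self.limit:
                raise RuntimeError("address limit reached")
            # Sequential assignment is a simplification for readability;
            # a real system would want addresses that are not guessable.
            index = len(self.pseudonyms) + 1
            self.pseudonyms[name] = {
                "address": self.network[index],
                "cookies": SimpleCookie(),
            }
        return self.pseudonyms[name]


manager = PseudonymManager("2001:db8::/64")
work = manager.get("work")
shopping = manager.get("shopping")

# A cookie set under one pseudonym is invisible to the other.
work["cookies"]["sid"] = "abc123"
```

Because the two pseudonyms share neither an address nor a cookie jar, a remote service has no address or cookie in common with which to link them in this model.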

Sunday, September 1, 2013

SIGCOMM'13: Mosaic: Quantifying Privacy Leakage in Mobile Networks

Authors: Ning Xia, Han Hee Song, Yong Liao, Marios Iliofotou, Antonio Nucci, Zhi-Li Zhang and Aleksandar Kuzmanovic.

     For a growing number of users, online social networking (OSN) sites such as Facebook and Twitter have become an integral part of their online activities. This paper calls attention to privacy leakage in mobile network data, and in particular to the danger posed by a third party that does not simply crawl data directly from OSN sites but gathers the digital footprints users leave in cyberspace. GPS and other location information in mobile cellular data make it possible to tie users’ cyber activities to their presence in the physical world. The confluence of smartphones and OSNs makes the ability to glean personal information from mobile data a far more potent threat to user privacy than attacks on each individual service. The leakage stems from shortcomings in the design of certain OSNs, as well as from fundamental limitations of the current Web and Internet from a user privacy perspective, such as the cookie mechanism used with the stateless HTTP protocol.
     They refer to this problem as constructing a MOSAIC of a user from their online digital footprints, and correspondingly refer to the gathered footprint pieces as TESSERAE.
     To study this problem, they have developed the Tessellation methodology. Through Tessellation, they show how user identity information such as OSN IDs and device tracking cookies can be extracted from the traffic. Furthermore, they describe how the remaining traffic, which leaks no identity information, can be attributed to the known user identities.
     They claim that Tessellation can attribute 50% of traffic to its owners with only 5% error. Optionally, the coverage can be increased to 80% with just a 2% increase in the error rate. Using this methodology, they were able to create mosaics for more than 16,000 users and classify their personal information into 59 categories, including user demographics, locations, affiliations, social activities, and interests. Finally, they suggest possible countermeasures to safeguard against this alarming leakage of private information.

====================== Q/A ======================

Q. Where do they obtain OSN user identifiers and information?
A: Because of weak design, many OSN sites “leak” their user identifiers, which allows Tessellation to attribute traffic to real users. HTTP headers are inspected for URL, cookie, and payload information that reveals user login and session key information.
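As a rough illustration of this kind of extraction, the sketch below pulls identifier-like values out of a request URL and a Cookie header. The key names in `ID_KEYS`, the function name, and the example host are hypothetical; the actual identifiers differ per OSN and are not listed in this summary.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit

# Hypothetical parameter names that might carry a user or session identifier.
ID_KEYS = {"user_id", "uid", "session_key"}


def extract_identifiers(url, cookie_header):
    """Collect identifier-like values from a URL query string and cookies."""
    found = {}

    # Query-string parameters, e.g. ?uid=42
    for key, values in parse_qs(urlsplit(url).query).items():
        if key in ID_KEYS:
            found[key] = values[0]

    # Cookie header, e.g. "session_key=abc; theme=dark"
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    for key, morsel in cookies.items():
        if key in ID_KEYS:
            found.setdefault(key, morsel.value)

    return found


ids = extract_identifiers(
    "http://osn.example/profile?uid=42&lang=en",
    "session_key=abc; theme=dark",
)
```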

Q. How is coverage computed? What are the types of coverage?
A: There are two types of coverage: (a) session-level coverage and (b) user-level coverage. Session-level coverage is the number of sessions that are given a prediction (i.e., the sum of sessions in all Ts) divided by the total number of sessions. User-level coverage is the number of ground truth users for whom Tessellation identified all or a subset of their sessions, divided by the total number of ground truth users.
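The two definitions can be written out as a short sketch. The data layout (dictionaries keyed by session ID, with `None` meaning no prediction) is an assumption made for illustration, as is the reading that a user counts as covered when at least one of their sessions is attributed to them correctly.

```python
def session_coverage(predictions):
    """Fraction of sessions that received any prediction.

    predictions maps session_id -> predicted user, or None if no prediction.
    """
    if not predictions:
        return 0.0
    predicted = sum(1 for user in predictions.values() if user is not None)
    return predicted / len(predictions)


def user_coverage(predictions, ground_truth):
    """Fraction of ground truth users with at least one identified session.

    ground_truth maps session_id -> true user for sessions with known owners.
    """
    users = set(ground_truth.values())
    if not users:
        return 0.0
    covered = {
        true_user
        for session_id, true_user in ground_truth.items()
        if predictions.get(session_id) == true_user
    }
    return len(covered) / len(users)


predictions = {"s1": "alice", "s2": None, "s3": "bob", "s4": "alice"}
ground_truth = {"s1": "alice", "s2": "bob", "s3": "carol", "s4": "alice"}

s_cov = session_coverage(predictions)             # 3 of 4 sessions predicted
u_cov = user_coverage(predictions, ground_truth)  # only alice is identified
```

In this toy data, session s3 receives a prediction, so it counts toward session-level coverage, but the prediction is wrong, so it contributes nothing to user-level coverage.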