Wednesday, August 23, 2017

Paper 3: Pretzel: Email encryption and provider-supplied functions are compatible

Paper 3: "Pretzel: Email encryption and provider-supplied functions are compatible" by Trinabh Gupta, Henrique Fingler, Lorenzo Alvisi, and Michael Walfish.

Trinabh Gupta presented Pretzel, which proposes an end-to-end encryption mechanism for email that does not compromise essential functionality such as spam filtering.

Gupta et al. claim that email today is not encrypted end-to-end (e2e), that is, between clients; instead, intermediate servers handle emails in plaintext. This has been accepted because it lets providers offer well-run services. The presenter says that e2e encryption breaks the business model (extracting user interests, targeted advertising). On the other hand, if mail servers can access emails, then hackers can access them too.

Pretzel establishes end-to-end encryption without compromising the benefits of such services. Its main design objectives are e2e encryption, enabling basic services, and achieving low resource costs. Pretzel proposes a secure two-party computation (2PC) solution that protects both the user's and the server's content (the email content and the filter, respectively), Gupta says.

The authors gave the example of two entities sharing salary information: both users give their salaries to a black box that performs the exchange.

However, existing 2PC solutions such as Yao's 2PC are very costly, mainly because of the problem size (about 1 million rows and probabilities to compute). Pretzel reduces this cost by 100x.

They test two provider functions, spam filtering and topic extraction, both of which are implemented as linear classifiers (extract words, add properties, compare probabilities). Pretzel performs this classification privately.
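To make the three classifier steps concrete, here is a minimal, non-private sketch of a linear classifier, assuming a Naive Bayes style model in which per-word log-probabilities are summed per class and the totals are compared. The model values, the `UNSEEN` fallback, and the function names are hypothetical and chosen only for illustration; this is not the classifier from the talk, and it does none of the private computation Pretzel adds on top.

```python
import math
import re

# Hypothetical toy model: a prior and per-word probabilities for each class.
MODEL = {
    "spam": {
        "prior": math.log(0.4),
        "words": {"free": math.log(0.05), "money": math.log(0.04),
                  "meeting": math.log(0.001), "project": math.log(0.002)},
    },
    "ham": {
        "prior": math.log(0.6),
        "words": {"free": math.log(0.002), "money": math.log(0.003),
                  "meeting": math.log(0.03), "project": math.log(0.02)},
    },
}
# Assumed fallback log-probability for words missing from the model.
UNSEEN = math.log(1e-4)


def extract_words(email: str) -> list[str]:
    """Step 1: extract lowercase word tokens from the email body."""
    return re.findall(r"[a-z]+", email.lower())


def class_scores(email: str) -> dict[str, float]:
    """Step 2: add up the log-probabilities of the words for each class."""
    words = extract_words(email)
    return {
        label: params["prior"]
        + sum(params["words"].get(w, UNSEEN) for w in words)
        for label, params in MODEL.items()
    }


def classify(email: str) -> str:
    """Step 3: compare the per-class totals and pick the largest."""
    scores = class_scores(email)
    return max(scores, key=scores.get)


print(classify("Free money now"))
print(classify("Meeting agenda for the project"))
```

Because the score is a sum of per-word terms, the computation is linear in the word features, which is the property the summary refers to when it calls these classifiers linear.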

To reduce cost, Pretzel adapts packing, which lowers client storage cost: it concatenates probabilities before encrypting them. It also implements decomposed classification.
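The packing idea can be illustrated with plain integers. The sketch below assumes probabilities are quantized to small non-negative integers and concatenated into fixed-width slots of one large integer, so that a single value (and, in a real system, a single ciphertext under an additively homomorphic scheme) carries several of them. The slot width and function names are assumptions, and no encryption is performed here.

```python
SLOT_BITS = 16  # assumed slot width; must leave headroom so sums do not overflow


def pack(values: list[int], slot_bits: int = SLOT_BITS) -> int:
    """Concatenate small non-negative integers into one large integer."""
    packed = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << slot_bits):
            raise ValueError(f"value {v} does not fit in {slot_bits} bits")
        packed |= v << (i * slot_bits)
    return packed


def unpack(packed: int, count: int, slot_bits: int = SLOT_BITS) -> list[int]:
    """Split a packed integer back into its `count` slots."""
    mask = (1 << slot_bits) - 1
    return [(packed >> (i * slot_bits)) & mask for i in range(count)]


a = pack([3, 10, 7])
b = pack([1, 2, 5])
# Adding packed integers adds slot by slot, as long as no slot overflows.
print(unpack(a + b, 3))  # [4, 12, 12]
```

The slot-wise addition is what makes packing compatible with schemes that only support adding ciphertexts: one addition updates every packed value at once, provided each slot is wide enough to hold the largest possible sum.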

The authors compare Pretzel to a non-private system and to Yao+GLLM, measuring resource costs. They show that, compared to Yao, Pretzel uses 100x less CPU time at the server and 100x less network traffic, at the cost of 3x more storage at the client side.

Q&A session (2 questions)

Q1: If I organize a protest, I don't want the email provider to know about my topic. How can your model preserve privacy, or what's the middle ground?
A1: Clients can state that they do not allow topic extraction. Client preferences can be set at the end hosts. The author said that their solution is much better than existing ones but doesn't fix all issues.

Q2: In your work, you had to change the classifier. How different is your classifier from those of existing services, and what's the impact or tradeoff on performance?
A2: We could implement a better classifier using neural networks, but performance remains similar (80 to 90% similar).