Tuesday, August 18, 2015

How to Bid the Cloud

Authors: Liang Zheng, Carlee Joe-Wong (Princeton University), Chee Wei Tan (City University of Hong Kong & National University of Singapore), Mung Chiang (Princeton University), Xinyu Wang (City University of Hong Kong)

Paper (pdf)
Public review (pdf)

This paper studies the auction-based spot pricing of Amazon's Elastic Compute Cloud (EC2). Users place bids (above a price set by Amazon) to have computation performed on EC2; jobs that are currently running can be outbid and interrupted. Hence, EC2 spot instances are ideal for jobs that can be suspended and whose results are not needed immediately. The goal of the paper is twofold: to understand how the cloud provider sets its prices, and to determine which prices users should bid.

The authors present two types of bidding strategies: one-time bids and persistent bids. A one-time bid is submitted once and exits the system as soon as it falls below the current spot price. The risk with one-time bids is that the job may be interrupted before it completes. A persistent bid is resubmitted in each time period until the job finishes or the user terminates it manually, which results in longer waiting and completion times. In contrast, one-time bids provide better control over completion times. The authors test these strategies by bidding for MapReduce jobs. They propose placing a single one-time bid for the master node, to avoid interruptions, and using persistent requests for the slave nodes.
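The difference between the two request types can be illustrated with a small simulation. This is a minimal sketch, not the authors' model: the function names, the slot-based price trace, and the rule that a running job pays the spot price (rather than its bid) in each slot are assumptions made for illustration, and recovery overhead after an interruption is ignored.

```python
def simulate_one_time(prices, bid, work_slots):
    """One-time request: runs until done or until it is first outbid.

    Returns (finished, slots_of_work_done, cost).
    Assumes work_slots >= 1 and that a running job pays the spot price.
    """
    done = 0
    cost = 0.0
    for price in prices:
        if bid < price:
            break  # outbid: the request leaves the system for good
        done += 1
        cost += price
        if done == work_slots:
            return True, done, cost
    return False, done, cost


def simulate_persistent(prices, bid, work_slots):
    """Persistent request: idles while outbid and resumes afterwards.

    Returns (finished, elapsed_slots, cost).
    Recovery overhead after an interruption is ignored (assumption).
    """
    done = 0
    cost = 0.0
    for elapsed, price in enumerate(prices, start=1):
        if bid >= price:
            done += 1
            cost += price
            if done == work_slots:
                return True, elapsed, cost
    return False, len(prices), cost


# Hypothetical price trace in dollars per slot.
trace = [0.03, 0.04, 0.06, 0.03, 0.03]
one_time = simulate_one_time(trace, bid=0.05, work_slots=3)
persistent = simulate_persistent(trace, bid=0.05, work_slots=3)
```

On this made-up trace, the one-time request is cut off in the third slot without finishing, while the persistent request finishes but needs four slots of wall-clock time for three slots of work.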

Q&A

Q: You talk about both a cloud provider price model and a user price model. What if you cannot control the provider's price model? What if you have no knowledge of how Amazon sets its prices? Will your model still work?
A: We tried to model how the cloud provider sets its price in order to get a more precise prediction of the spot price. If we do not have any knowledge of how Amazon sets prices, then we can still use the probability distribution of the spot prices offered by Amazon to predict future prices. The result may not be as accurate, but it is still doable.

Q: Two questions. Do you provide any estimation or prediction models to the user who is currently bidding? In other words, apart from the current minimum bid, what other information is publicly available? Also, in your MapReduce model you can reduce your cost by taking some slaves online, but isn't this saving offset by the increased need for storage when the host goes offline?
A: To answer your second question, for MapReduce jobs, if a slave node goes offline, I think the slave node will upload some results to the master node. But when it drops off, there may be some results it has not yet uploaded to the master node. This is included in the recovery time. If another slave node takes over for the slave node that fails, then we model this as a recovery/overhead time in our work. The other slave node will need more time to complete. Can you repeat your first question?

Q: Say I am a bidder, and I want to bid for some containers. I do know the current minimum bid price, but what other information is available to me? Is there an estimation model? For example, what is the probability that my job is killed at some time for a given bid? Do you provide information on the probability of a job being killed if I bid some price? Say, if I bid 500 dollars I have a 60% probability of being killed, and if I bid 800 dollars I have a 30% probability of being killed.
A: Yes, this information can be calculated by our model. The public information provided to users is the spot instance prices of the past two months. From this, we can calculate the PDF of the spot price. Then we can calculate the probability by considering the total running time of the job.
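The calculation described in this answer can be sketched as follows. This is an illustrative approximation rather than the authors' model: it builds an empirical distribution from a price history and assumes that the spot price in each time slot is drawn independently from that distribution. The function names and the sample history are hypothetical.

```python
from bisect import bisect_right


def prob_bid_accepted(price_history, bid):
    """Empirical P(spot price <= bid) from a list of past spot prices."""
    ordered = sorted(price_history)
    return bisect_right(ordered, bid) / len(ordered)


def prob_interrupted(price_history, bid, run_slots):
    """P(job is outbid at least once during run_slots consecutive slots).

    Assumes prices are independent across slots (a simplifying assumption).
    """
    p = prob_bid_accepted(price_history, bid)
    return 1.0 - p ** run_slots


# Hypothetical price history in dollars per slot.
history = [0.03, 0.03, 0.04, 0.05, 0.08]
p_accept = prob_bid_accepted(history, bid=0.05)
p_killed = prob_interrupted(history, bid=0.05, run_slots=2)
```

A higher bid raises the per-slot acceptance probability, and a longer running time raises the chance of at least one interruption, which matches the trade-off the questioner describes.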

Q: There's an assumption that the computation results don't vary in value. For example, the computation results are not going to be useful if the conference is already over. Do you have any thoughts on what effect the value of computation over time would have on the model? Often, the busiest times to bid are more expensive because the results of the computation are more valuable at that time. A real-life bidding strategy would have to take into account the effects of running the job later, during the off hours. Do you have any thoughts on how that would affect the bidding strategy?
A: Actually we did not find any correlation between the time and the spot price, because people from all over the world can bid at any time. But we do think that if a lot of people use our bidding strategies then this would affect the spot price. It should make the market efficient. We want to improve this in future work.

Q: How will things change if you consider competitors? What if customers had the choice to select between different providers?
A: This is a good question, which we have not considered. If there are multiple cloud providers, then the user has more options to choose from by comparing the prices they offer, and the spot price distribution may be more variable. This will complicate the model. I think that is an interesting point to study in the future.