projects:caaws [2019/12/18 23:54]
projects:caaws [2019/12/18 23:54] (current)
 +~~NOCACHE~~
 +~~NOTOC~~ ​
 +====== Data Replication For AWS Spot Market ======
 +
 +This is a framework for geo-diverse data services hosted on EC2 spot instances. Spot instances implement
 +market-driven pricing for spare resources within Amazon's data centers. On average, they can be 78% cheaper than
 +instances provided with fixed, on-demand pricing. However, it is challenging to serve data from spot instances
 +because their prices can change every hour, and instances hosting critical data can be suspended with little
 +warning when spot prices change. We studied a trace of spot prices provided by Amazon and observed that prices
 +change more than 300 times per month. Further, the relative cost of spot instances in different regions changes
 +even more frequently. Naively migrating data to sites with low cost would incur prohibitive bandwidth costs.
 +Consistent hashing, a widely used approach for data replication, would also incur significant migration costs, and
 +it is not tailored to geo-diverse settings where latency-aware placement is needed.
 +
 +Our cost-aware data replication framework uses online data replication to reduce migration costs and make sound
 +decisions under price volatility. The key insight is that price volatility and non-uniform access rates magnify the
 +cost of poor replication policies on popular data. If we target these heavy hitters by predicting them and carefully
 +allocating resources, we can significantly reduce the total cost. We have implemented our framework using novel
 +intra- and inter-region data management policies. When considering data replication across regions, the framework
 +forecasts the price at each site and replicates data to the sites that together yield a low cost. Such replication
 +decisions are made online, i.e., when data is created (after a short profiling period), and thus avoid the overhead
 +of moving data frequently in response to changing prices. Within a region, the framework manages data replication
 +to meet the dynamic workload. We built a RAID 0+1 style scheme to spawn new spot instances for workload peaks.
 +That is, we maintain a service mirror on an on-demand instance in case the spot bid fails. We have evaluated our
 +framework at a small scale, using up to 10 spot instances and 1 on-demand instance. The results show that, compared
 +to consistent hashing, our approach reduces cost by 80% while increasing response time by less than 5%. All queries
 +were served with no failures reported.
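The inter-region step described above (forecast the price at each site, then replicate to the sites that together yield a low cost) can be illustrated with a small sketch. This is not the framework's actual policy: the moving-average forecast, the latency bound, the function names `forecast_price` and `choose_replica_sites`, and all prices and latencies below are assumptions made purely for illustration.

```python
from statistics import mean


def forecast_price(history, window=3):
    """Hypothetical forecast: mean of the last `window` observed spot prices."""
    return mean(history[-window:])


def choose_replica_sites(price_history, latency_ms, replicas, max_latency_ms):
    """Pick the `replicas` cheapest sites (by forecast) that meet a latency bound.

    Assumes the total cost is the sum of per-site forecast prices, so taking
    the cheapest eligible sites minimizes the combined cost.
    """
    eligible = [s for s in price_history if latency_ms[s] <= max_latency_ms]
    if len(eligible) < replicas:
        raise ValueError("not enough sites meet the latency bound")
    # Sort by forecast price; break ties by site name for a stable result.
    ranked = sorted(eligible, key=lambda s: (forecast_price(price_history[s]), s))
    return ranked[:replicas]


# Made-up hourly spot prices (USD) and client latencies per region.
price_history = {
    "us-east-1": [0.030, 0.032, 0.031],
    "us-west-2": [0.050, 0.020, 0.020],
    "eu-west-1": [0.025, 0.026, 0.027],
    "ap-southeast-1": [0.010, 0.011, 0.012],
}
latency_ms = {
    "us-east-1": 20,
    "us-west-2": 70,
    "eu-west-1": 90,
    "ap-southeast-1": 220,
}

if __name__ == "__main__":
    print(choose_replica_sites(price_history, latency_ms,
                               replicas=2, max_latency_ms=100))
```

In this toy input the cheapest region is excluded because it violates the assumed latency bound, which is the kind of trade-off latency-aware placement has to make. The decision is taken once, when the data is created, rather than re-evaluated on every price change.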
 +
 +
 +For more information, please check [[http://pacs.ece.ohio-state.edu/proposal/aws_proposal.pdf|AWS Proposal]] or email [[http://www2.ece.ohio-state.edu/~xuz/|Zichen Xu]].
  
