Extending MapReduce across Clouds with BStream
Batch processing frameworks like Hadoop MapReduce are difficult to scale across multiple clouds because of the latency of inter-cloud data transfer and the synchronization overhead of the shuffle phase. As a result, MapReduce cannot guarantee performance under variable load surges without over-provisioning the internal cloud (IC). We propose BStream, a cloud bursting framework that couples stream processing in the external cloud (EC) with Hadoop in the IC to realize inter-cloud MapReduce. Stream processing in the EC pipelines the uploading, processing, and downloading of data to minimize network latency. We use this framework to guarantee a service-level objective (SLO) of meeting job deadlines. BStream uses an analytical model to minimize EC usage and bursts only when necessary. We propose several checkpointing strategies that overlap output transfer with input transfer and processing while reducing the computation needed to merge the results from the EC and IC, which further reduces job completion time. We experimentally compare BStream with related work and illustrate the benefits of stream processing and checkpointing in the EC. Lastly, we characterize the operational regime of BStream.
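To make the "burst only when necessary" idea concrete, the following is a minimal sketch of a deadline-driven bursting decision. It is not BStream's analytical model, which the description above does not specify. The names `Job`, `ic_only_time`, and `burst_size_gb`, the constant processing rates, and the assumption that the EC's pipelined throughput equals the slower of its upload and processing rates are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    input_gb: float    # total input size in GB
    deadline_s: float  # SLO: the job must finish within this many seconds


def ic_only_time(job: Job, ic_gb_per_s: float) -> float:
    """Estimated completion time if the whole job runs in the internal cloud."""
    return job.input_gb / ic_gb_per_s


def burst_size_gb(job: Job, ic_gb_per_s: float,
                  ec_gb_per_s: float, uplink_gb_per_s: float) -> Optional[float]:
    """Smallest share of the input (GB) to send to the EC to meet the deadline.

    Returns 0.0 when the IC alone meets the deadline, and None when even
    bursting cannot meet it under this simplified (assumed) model.
    """
    if ic_only_time(job, ic_gb_per_s) <= job.deadline_s:
        return 0.0  # no bursting needed

    # The IC can process at most this much input before the deadline.
    ic_capacity = ic_gb_per_s * job.deadline_s
    needed = job.input_gb - ic_capacity

    # Assumption: upload and processing are pipelined in the EC, so its
    # effective throughput is bounded by the slower of the two stages.
    ec_effective = min(ec_gb_per_s, uplink_gb_per_s)
    if needed > ec_effective * job.deadline_s:
        return None  # deadline infeasible even with bursting

    return needed


if __name__ == "__main__":
    job = Job(input_gb=100.0, deadline_s=150.0)
    print(burst_size_gb(job, ic_gb_per_s=0.5, ec_gb_per_s=1.0, uplink_gb_per_s=0.25))
```

The sketch offloads only the input that the IC cannot finish before the deadline, which mirrors the stated goal of minimizing EC usage. It ignores output download, shuffle, and merge costs, all of which a realistic model would need to account for.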