Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Twister4Azure : Iterative MapReduce for Azure Cloud<br />ThilinaGunarathne, Judy Qiu, Geoffrey Fox<br />{tgunarat, xqiu,gc...
MapReduceRolesfor Azure<br />Familiar MapReduce programming model<br />Built using highly-available and scalable Azure clo...
MapReduceRolesfor Azure<br />
Twister for Azure<br /><ul><li>Merge Step
In-Memory Caching of static data
Cache aware hybrid scheduling using Queues as well as using a bulletin board (special table) </li></li></ul><li>Twister fo...
Performance – Kmeans Clustering<br />Performance with/without data caching. <br />Speedup gained using data cache<br />Inc...
Performance Comparisons<br />BLAST Sequence Search<br />Smith Watermann Sequence Alignment<br />Cap3 Sequence Assembly<br />
Upcoming SlideShare
Loading in …5
×

Twister4Azure - Iterative MapReduce for Azure Cloud

3,374 views

Published on

Twister4Azure is an iterative MapReduce framework, which support development and execution of Iterative MapReduce and traditional MapReduce application in Microsoft Azure cloud.

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

Twister4Azure - Iterative MapReduce for Azure Cloud

  1. 1. Twister4Azure : Iterative MapReduce for Azure Cloud<br />ThilinaGunarathne, Judy Qiu, Geoffrey Fox<br />{tgunarat, xqiu,gcf}@indiana.edu<br />CCA 2011<br />April 12 – 13, 2011<br />
  2. 2. MapReduceRolesfor Azure<br />Familiar MapReduce programming model<br />Built using highly-available and scalable Azure cloud services<br />Co-exist with eventual consistency & high latency of cloud services<br />Decentralized control<br />No single point of failure.<br />Supports dynamically scaling up and down of the compute resources.<br />MapReduce fault tolerance<br />
  3. 3. MapReduceRolesfor Azure<br />
  4. 4. Twister for Azure<br /><ul><li>Merge Step
  5. 5. In-Memory Caching of static data
  6. 6. Cache aware hybrid scheduling using Queues as well as using a bulletin board (special table) </li></li></ul><li>Twister for Azure<br />
  7. 7. Performance – Kmeans Clustering<br />Performance with/without data caching. <br />Speedup gained using data cache<br />Increasing number of iterations<br />Scaling speedup<br />
  8. 8. Performance Comparisons<br />BLAST Sequence Search<br />Smith Watermann Sequence Alignment<br />Cap3 Sequence Assembly<br />
  9. 9. Conclusion<br />Enables users to easily and efficiently perform large scale iterative data analysis and scientific computations on Azure cloud. <br />Utilizes a novel hybrid scheduling mechanism to provide the caching of static data across iterations. <br />Utilize cloud infrastructure services effectively to deliver robust and efficient applications.<br />http://salsahpc.indiana.edu/twister4azure<br />

×